
2010 Seventh Web Information Systems and Applications Conference

Protecting Location Privacy using Cloaking Subgraphs on Road Network


Jiao Xue¹, Xiangyu Liu¹, Xiaochun Yang¹,², Bin Wang¹,²
¹ School of Information Science and Engineering, Northeastern University, Liaoning 110819, China
² Key Laboratory of Medical Image Computing (Northeastern University), Ministry of Education
{yangxc, binwang}@mail.neu.edu.cn
Abstract: Mobile users traveling over roads often issue KNN queries from their mobile terminals based on their current locations (e.g., "Where is the nearest gas station?"). However, exact location information transmitted to an insecure server can easily allow the mobile user to be tracked. It is therefore important to protect mobile users' location privacy while providing location-based services. People traveling over roads always follow a road network. We observe that two cloaking subgraph structures, which we name cloaking cycle and cloaking tree, can be used to protect mobile users' location privacy effectively in a road network environment. Based on these two subgraph structures, we propose a novel location privacy preserving approach using cloaking cycle and forest, which can effectively protect mobile users' location privacy while efficiently providing exact location-based services.

Key words: location privacy; location-based services; road network; subgraph cloaking
Figure 1. A road network model

I. INTRODUCTION

In recent years, as the significance of location-based services (LBS) has been recognized, demand for LBS has grown explosively. Present LBS include location-based place or friend finders ("Where is the nearest Bank of China to my current location?"), location-based navigation ("How should I get to Northeastern University from Middle Street?"), location-based traffic reports ("How many cars are on the freeway?"), etc. To obtain LBS, a mobile user needs to send a query and his exact location to an LBS provider. To our knowledge, the server of an LBS provider is usually insecure. The mobile user's location information is easily stolen by attackers, who may be the LBS providers themselves. An attacker can re-identify the mobile user through public information such as voter registration lists and yellow pages, and further learn the mobile user's private matters. There are news reports that some people tracked their girlfriends with GPS and that some companies monitored their employees using GPS-equipped cell phones. Many cases have shown that people using LBS in a mobile environment are easily tracked, and that others can learn what they are doing and where they will go. People's location privacy is thus seriously threatened.

How to protect people's location privacy has attracted great concern from experts and scholars, and many solutions have been proposed. Most works suppose that mobile users are in a free space and move without any restrictions. However, in daily life, when we walk or travel by vehicle, we always follow a road network (e.g., Fig. 1 shows a road network model). Previous location privacy protection techniques designed for free space are therefore not suitable for a road network environment. Providing LBS while protecting mobile users' location privacy in a road network environment thus poses a brand new challenge.

In this paper, we tackle this problem and make the following contributions. First, we define two cloaking subgraph structures, cloaking cycle and cloaking tree. Second, based on these cloaking subgraphs, we propose a novel location privacy preserving approach using cloaking cycle and forest, which can protect mobile users' location privacy effectively while providing efficient LBS.

978-0-7695-4193-8/10 $26.00 © 2010 IEEE. DOI 10.1109/WISA.2010.38

II. PROBLEM DEFINITION

In this paper, we use an undirected graph G = (V, E) to represent a road network. For example, Fig. 1 shows a road network model. We use d(n) to denote the degree of a node n. Specifically, nodes with d(n) = 1 can be seen as road ends, like n1, n17 and n20. Nodes with d(n) = 2 can be seen as road bends, like n2 and n16. Nodes with $d(n) \ge 3$ can be seen as road junctions, like n10, n11 and n12. We use squares to represent mobile users and dots to represent points of interest, such as stores, gas stations and banks.

We adopt a three-tier [1] location anonymization architecture; that is to say, there is a trusted middleware between the mobile client (mobile users) and the LBS server. The middleware is usually called the location anonymizer. It receives and anonymizes mobile users' location information, submits anonymous locations to the LBS server and refines the candidate results returned by the LBS server. Our main work is to design a location privacy preserving approach for the location anonymizer.

Each kind of privacy preserving method needs to define a suitable anonymity model. So far, two location anonymity models have been proposed for protecting the location privacy of mobile users. One is called location k-anonymity [2] and the other is called segment l-diversity [3]. In this paper, we combine these two models to define a mobile user's location privacy.

Definition 1: (privacy profile) We use (k, l, lmax) to represent a mobile user's location privacy requirement, i.e., the privacy profile. k denotes that, besides the mobile user, at least k-1 other mobile users are contained in an anonymous location. l denotes that, besides the segment on which the mobile user is located, at least l-1 other segments¹ are contained in an anonymous location. lmax denotes that, besides the segment on which the mobile user is located, at most lmax-1 other segments are contained in an anonymous location. The privacy profile (k, l, lmax) of a user u is written pp(u) = (k, l, lmax) for short. For example, if pp(u1) = (5, 5, 8) in Fig. 1, then u1 wants the location anonymizer to find an anonymous location that contains at least 4 other mobile users, at least 4 other segments and at most 7 other segments.

Next, we give a formal definition of the problem of protecting location privacy in a road network environment. Suppose a mobile user u has pp(u) = (k, l, lmax). The location anonymizer must find a set of segments S, containing $l'$ segments and $k'$ mobile users, that not only contains the location of u but also satisfies $k' \ge k$ and $l \le l' \le l_{max}$, such that attackers cannot associate u with any one of the segments with high probability (> 0.5).

III. THE CLOAKING CYCLE AND FOREST METHOD
Figure 2. Finding smallest cycles for u1: panels (a) and (b).

A cycle is a special kind of subgraph in an undirected graph because of its symmetry and closure. Because of these properties, when there is at least one mobile user in each segment of a cycle, an attacker cannot easily associate a mobile user with one of the segments with high probability. This gives us the idea of using a cycle as a mobile user's anonymous location. However, not every cycle can be used as the anonymous location; it needs to satisfy some conditions.

Definition 2: (cloaking cycle) A cycle in which the numbers of mobile users and segments satisfy the privacy profile of a mobile user, and in which at least two segments contain mobile users.

How do we find a cloaking cycle for a mobile user? We solve the problem in three steps: finding smallest cycles, choosing the optimal cloaking cycle and enlarging smallest cycles.

We find smallest cycles using breadth-first search. Fig. 2(a) shows an example of searching for smallest cycles for u1, in which we assign n11 and n8 as the start node and stop node of a cycle; they are represented by concentric circles. We first mark n11, n8 and the edge between them as visited; all other nodes and edges of the graph are marked as unvisited. The search begins at n11: the adjacent unvisited nodes n10, n15 and n12 and the edges n11n10, n11n15 and n11n12 are visited in turn, then the adjacent unvisited nodes and edges of n10, n15 and n12 are visited in turn. When the visited node is n8, a smallest cycle has been found. We then recover the edges of the cycle recursively. Thick solid lines in Fig. 2(a) represent the recovered edges. During the search, we found two smallest cycles for u1, (n11, n10, n7, n8, n11) and (n11, n12, n9, n8, n11), which are represented by dotted lines in Fig. 2(b).

The smallest cycles found in the above step may be cloaking cycles. However, to reduce the cost of query processing at the server side, we want a cloaking cycle whose numbers of mobile users and segments are closest to the mobile user's privacy profile. So we need to choose an optimal one from the found cloaking cycles. We use (1) to choose the optimal cloaking cycle, i.e., the one with the highest score. Here $\omega_k$ and $\omega_l$ are weighting factors between 0 and 1 that satisfy $\omega_k + \omega_l = 1$; k and l come from the mobile user's privacy profile; and $k'$ and $l'$ denote the number of mobile users and the number of segments contained in a cycle, respectively. $S_{cycle}$ describes the closeness between a cycle and a mobile user's privacy profile. In theory, the number of segments has the larger influence on the query processing cost, so we give the closeness between l and $l'$ the higher weight, i.e., a higher $\omega_l$. In our experiment, we set $\omega_k = 0.4$ and $\omega_l = 0.6$.

$$S_{cycle} = \omega_k \cdot \frac{k}{k'} + \omega_l \cdot \frac{l}{l'} \qquad (1)$$
¹ A segment is a sequence of consecutive edges in which all interior nodes have d(n) = 2 and the two end nodes satisfy $d(n) \ge 3$ or d(n) = 1.
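The breadth-first search for smallest cycles can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `smallest_cycles` and the edge list are assumptions, and the edge list is only a hypothetical fragment of the road network around u1, chosen to agree with the example in the text. The user's own edge (start, stop) is skipped, which plays the role of marking it as visited, and all shortest paths from the start node to the stop node are then unwound recursively.

```python
from collections import deque

# Hypothetical fragment of the road network around u1 (an assumption,
# not the full topology of Fig. 1).
EDGES = [
    ("n11", "n8"), ("n11", "n10"), ("n11", "n15"), ("n11", "n12"),
    ("n10", "n7"), ("n7", "n8"), ("n7", "n6"),
    ("n12", "n9"), ("n9", "n8"), ("n15", "n14"),
]


def build_adjacency(edges):
    """Build an undirected adjacency list from a list of node pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    return adj


def smallest_cycles(adj, start, stop):
    """Return every shortest cycle that contains the edge (start, stop).

    Each cycle is a node list that begins and ends at `start`.
    """
    dist = {start: 0}
    parents = {start: []}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == stop:
            continue  # no need to search beyond the stop node
        for nxt in adj[node]:
            if {node, nxt} == {start, stop}:
                continue  # the user's own edge counts as already visited
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                parents[nxt] = [node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                parents[nxt].append(node)  # another equally short route
    if stop not in dist:
        return []  # the edge lies on no cycle

    def unwind(node):
        # Recursively rebuild all shortest paths from start to node.
        if node == start:
            return [[start]]
        return [path + [node] for p in parents[node] for path in unwind(p)]

    return [path + [start] for path in unwind(stop)]


ADJ = build_adjacency(EDGES)
for cycle in smallest_cycles(ADJ, "n11", "n8"):
    print(cycle)
```

On this fragment the sketch yields the two cycles named in the text, (n11, n10, n7, n8, n11) and (n11, n12, n9, n8, n11).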

Algorithm 1 describes the procedure of choosing the optimal cloaking cycle. It may also happen that the found smallest cycles are not cloaking cycles. If the number of mobile users and/or the number of segments of a smallest cycle is smaller than required by the privacy profile of a mobile user, we can enlarge the smallest cycle. The enlarging process is like growing a bubble on the original smallest cycle. Specifically, we first check whether the number of nodes with $d(n) \ge 3$ in a smallest cycle is greater than 1. If so, we choose start-stop node pairs and search for the enlarged part of an enlarged cycle based on each start-stop node pair. If not, the smallest cycle cannot be enlarged. For the enlarged cycles, we again need to choose the optimal cloaking cycle from among them.
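The selection step combines the privacy profile of Definition 1, the condition of Definition 2 and the score in (1). The following Python sketch shows one way this could look; the class and function names, the weight names `w_k` and `w_l`, and the three candidate cycles are all hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PrivacyProfile:
    k: int      # minimum number of mobile users
    l: int      # minimum number of segments
    l_max: int  # maximum number of segments


@dataclass(frozen=True)
class Cycle:
    name: str
    users: int              # number of mobile users in the cycle (k')
    segments: int           # number of segments in the cycle (l')
    occupied_segments: int  # segments that contain at least one user (c.n)


def satisfies(profile: PrivacyProfile, users: int, segments: int) -> bool:
    """Check k' >= k and l <= l' <= l_max."""
    return users >= profile.k and profile.l <= segments <= profile.l_max


def score(profile: PrivacyProfile, cycle: Cycle,
          w_k: float = 0.4, w_l: float = 0.6) -> float:
    """Score of equation (1); 0 when the profile is not satisfied."""
    if not satisfies(profile, cycle.users, cycle.segments):
        return 0.0
    return w_k * profile.k / cycle.users + w_l * profile.l / cycle.segments


def choose_optimal_cycle(profile: PrivacyProfile, cycles: Sequence[Cycle],
                         w_k: float = 0.4, w_l: float = 0.6) -> Optional[Cycle]:
    """Return the highest-scoring cycle with at least two occupied segments."""
    scored = [(score(profile, c, w_k, w_l), c)
              for c in cycles if c.occupied_segments >= 2]
    if not scored:
        return None
    return max(scored, key=lambda pair: pair[0])[1]


# Hypothetical candidates for a user with pp(u) = (5, 5, 8).
profile = PrivacyProfile(k=5, l=5, l_max=8)
candidates = [
    Cycle("A", users=6, segments=5, occupied_segments=3),
    Cycle("B", users=5, segments=8, occupied_segments=4),
    Cycle("C", users=9, segments=6, occupied_segments=1),
]
best = choose_optimal_cycle(profile, candidates)
print(best.name if best else None)
```

With the weights 0.4 and 0.6, cycle A scores higher than B because its segment count matches l exactly, and C is skipped because only one of its segments contains mobile users.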


Algorithm 1. Choosing the optimal cloaking cycle
Input: all found smallest cycles, pp(u) = (k, l, lmax), ω_k, ω_l
Output: an optimal cloaking cycle
    for each cycle c
        get c.n of cycle c (the number of segments having mobile users);
        if c.n ≥ 2
            get c.k of cycle c;
            get c.l of cycle c;
            if (c.k, c.l) satisfy pp(u)
                c.score = ω_k · k / c.k + ω_l · l / c.l;
            else
                c.score = 0;
            end if
        end if
    end for
    if exist cycles that have been scored
        return the cycle with the highest score;
    else
        return null;
    end if

In a static road network, a cloaking cycle cannot always be found for a mobile user. However, by observing the structural characteristics of the road network, we discover another cloaking subgraph. It can protect the location privacy of a mobile user when the location anonymizer cannot find a cloaking cycle for him.

Definition 3: (tree edge) An edge of an undirected graph that is not contained in any cycle. For example, in Fig. 1, edges n18n17, n18n19 and n19n20 are tree edges.

Definition 4: (border tree) A free tree of an undirected graph that consists only of tree edges. For example, in Fig. 1, the tree consisting of (n19n18, n19n20, n19n21) is a border tree.

Definition 5: (relative maximum border tree) A border tree of an undirected graph to which no further edge can be added without it ceasing to be a border tree. For example, in Fig. 1, the border tree consisting of (n18n17, n18n15, n19n18, n19n20, n19n21) is a relative maximum border tree.

All edges of a relative maximum border tree belong to the same relative maximum border tree. So if there is at least one mobile user in each segment of a relative maximum border tree and we use all edges of the tree as the anonymous set of segments for a mobile user, then an attacker cannot identify the segment on which the mobile user is located, even with a replay attack [3]. This observation leads to the definition of a cloaking tree.
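Definitions 3 to 5 can be made concrete with a short Python sketch. A tree edge, being an edge on no cycle, is what graph theory calls a bridge, so the sketch finds bridges with a standard depth-first low-link computation and then collects the connected group of bridges that contains the user's edge. This is a swapped-in textbook technique rather than the authors' procedure; the function names and the edge list (a hypothetical fragment around u2, assuming no parallel edges) are assumptions.

```python
from collections import defaultdict, deque

# Hypothetical fragment of the road network around u2 (an assumption,
# not the full topology of Fig. 1).
EDGES = [
    ("n15", "n18"), ("n18", "n17"), ("n18", "n19"),
    ("n19", "n20"), ("n19", "n21"),
    ("n15", "n14"), ("n14", "n11"), ("n11", "n15"),
    ("n15", "n16"), ("n16", "n11"),
]


def build_graph(edges):
    """Build an undirected adjacency map from a list of node pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def find_tree_edges(adj):
    """Return all tree edges (bridges) as a set of frozensets."""
    disc, low, bridges = {}, {}, set()
    counter = 0

    def dfs(node, parent):
        nonlocal counter
        disc[node] = low[node] = counter
        counter += 1
        for nxt in adj[node]:
            if nxt == parent:
                continue
            if nxt in disc:
                low[node] = min(low[node], disc[nxt])
            else:
                dfs(nxt, node)
                low[node] = min(low[node], low[nxt])
                if low[nxt] > disc[node]:
                    # nxt's subtree cannot reach node or above: a bridge
                    bridges.add(frozenset((node, nxt)))

    for node in list(adj):
        if node not in disc:
            dfs(node, None)
    return bridges


def relative_maximum_border_tree(adj, tree_edges, start_edge):
    """Collect every tree edge reachable from start_edge via tree edges."""
    a, b = start_edge
    if frozenset((a, b)) not in tree_edges:
        return set()
    visited = {a, b}
    result = {frozenset((a, b))}
    queue = deque([a, b])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            edge = frozenset((node, nxt))
            if edge in tree_edges and nxt not in visited:
                visited.add(nxt)
                result.add(edge)
                queue.append(nxt)
    return result


adj = build_graph(EDGES)
tree_edges = find_tree_edges(adj)
border_tree = relative_maximum_border_tree(adj, tree_edges, ("n15", "n18"))
print(sorted(sorted(edge) for edge in border_tree))
```

On this fragment the edges from n15 to n14, n16 and n11 lie on cycles, so the search does not extend past n15, and the resulting tree spans n15, n17, n18, n19, n20 and n21.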
Definition 6: (cloaking tree) A relative maximum border tree in which the numbers of mobile users and segments satisfy the privacy profile of a mobile user, and in which at least two segments contain mobile users.

How do we find a cloaking tree for a mobile user? We solve this problem in three steps: finding a relative maximum border tree, evaluating the relative maximum border tree and choosing a cloaking forest.

Figure 3. Finding a relative maximum border tree for u2: panels (a) and (b).

Finding a relative maximum border tree is a process of gradually collecting tree edges during a breadth-first search. Fig. 3(a) gives an example of searching for the relative maximum border tree for u2. We first mark nodes n15 and n18 as visited, and the search begins at n18. During the search, we check whether the edge between a visited node and an adjacent unvisited node is a tree edge. If so, we save the edge in the tree and mark the adjacent unvisited node as visited, which means the search continues along this path. For example, because edge n18n19 is a tree edge, it is saved and n19 is marked as visited so that edges n19n20 and n19n21 can be checked later. If an edge is not a tree edge, the search stops along that path. For example, because edges n15n14, n15n16 and n15n11 are not tree edges, the path (n18, n15) stops at n15. In the end, we find the relative maximum border tree (n18, n15, n17, n19, n20, n21) for u2. The dotted lines in Fig. 3(b) represent the relative maximum border tree of u2 on the road network model.

After obtaining a relative maximum border tree, we evaluate it to determine whether it is a cloaking tree. If so, we use it as the anonymous location of the mobile user. If not, and the number of mobile users and/or the number of segments of the relative maximum border tree is smaller than required by the privacy profile of the mobile user, we need to find a cloaking forest for the mobile user. A cloaking forest is composed of several relative maximum border trees. Specifically, we pre-store relative maximum border trees that have lower query processing cost. When a cloaking forest is needed for a mobile user, we load the information of these pre-stored trees and select those that together best satisfy the mobile user's privacy profile to form a cloaking forest.

IV. EVALUATION

The dataset we used comes from a real road network, that of the city of Oldenburg. It contains 6,105 nodes and 7,029 edges. We used 10,000 moving objects generated by the Network-based Generator of Moving Objects of T. Brinkhoff [4]. We also generated points of interest, including stores and gas stations, which are uniformly distributed over the road network. A mobile user's query request contains four parameters: k, l, lmax and knnp. Here k, l and lmax form the mobile user's privacy profile, and knnp denotes the number of nearest points of interest that the mobile user requests. Across all query requests, we assume that each of the four parameters follows a normal distribution. Their default mean values are 5, 5, 20 and 5 respectively, and their default deviation values are all equal to 1.

Figure 4. Anonymous success rate, relative anonymity level and query execution time w.r.t. the increase of k: panels (a) to (d).

Figure 5. Anonymous success rate, relative anonymity level and query execution time w.r.t. the increase of l: panels (a) to (d).
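The query workload used in the experiments, with the four parameters k, l, lmax and knnp drawn from normal distributions, could be generated along the following lines. This is a sketch under stated assumptions: the paper does not say how samples are turned into integers or how invalid draws are handled, so the rounding, the lower bound of 1 and the adjustment that keeps lmax at or above l are all assumptions, as are the function name `generate_query` and the seed.

```python
import random


def generate_query(rng, means=(5, 5, 20, 5), deviation=1.0):
    """Draw one query request (k, l, lmax, knnp) from normal distributions.

    Rounding to integers, clamping to at least 1 and forcing
    l_max >= l are assumptions made for this sketch.
    """
    k, l, l_max, knnp = (
        max(1, round(rng.gauss(mean, deviation))) for mean in means
    )
    l_max = max(l_max, l)
    return {"k": k, "l": l, "l_max": l_max, "knnp": knnp}


rng = random.Random(2010)  # fixed seed so that a run is repeatable
queries = [generate_query(rng) for _ in range(1000)]
print(queries[0])
```

Varying the first or second entry of `means` corresponds to the settings in which the mean value of k or of l is changed while the other parameters keep their defaults.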

In the experiment, we evaluated the performance of our algorithm in terms of the anonymous success rate, the relative anonymity level and the query execution time. Generally, the higher the anonymous success rate, the better a location privacy algorithm performs. Although a higher relative anonymity level provides stronger protection for a mobile user, it increases the query processing cost at the server side. Thus, to balance anonymity against quality of service, a smaller relative anonymity level is preferred. The query execution time reflects the query execution cost at the server side, so an algorithm with lower query execution time is preferable.

In Fig. 4, we varied the mean value of k and let the other parameters take their default values. As the mean value of k increases, the anonymous success rate of the algorithm decreases slightly in Fig. 4(a) but stays above 90%. From Fig. 4(b) and Fig. 4(c), we can see that the relative anonymity level of k decreases swiftly and the relative anonymity level of l increases only slightly, which means our algorithm can provide better protection and service quality. Fig. 4(d) shows the variation of the average query execution time, which varies between 4 ms and 4.6 ms. We can also see that increasing the parameter k has no significant effect on the query execution time.

In Fig. 5, we varied the mean value of l and let the other parameters take their default values. The anonymous success rate gets lower and lower once the mean value of l is greater than 9. This means that a larger value of l easily leads to anonymization failure and, compared with the anonymous success rate in Fig. 4(a), that the value of l has a larger influence on the performance of the algorithm than the value of k. From Fig. 5(b), we can see that the relative anonymity level of k increases dramatically. This is because the larger the value of l, the more mobile users are contained in an anonymous location, while the value of k remains unchanged; thus the relative anonymity level of k gets bigger and bigger. Meanwhile, the relative anonymity level of l decreases swiftly in Fig. 5(c), and its values are all smaller than 1.4 when the mean value of l is greater than 6. So the algorithm can find better anonymous locations for mobile users when the mean value of l is not too large. From Fig. 5(d), we can see that the query execution time gradually increases as l increases. This means that the query execution cost at the server side grows with l. To sum up, our algorithm performs well when the mean value of l is smaller than 10.

V. CONCLUSION

Previous work on location privacy paid little attention to road networks. In our research, we observe and mine the potential structural features of the road network and define two cloaking subgraph structures, cloaking cycle and cloaking tree. Based on these two cloaking subgraphs, we propose a novel location privacy preserving approach. An extensive experimental study on a real dataset shows that our method can effectively protect location privacy and provide LBS efficiently for mobile users.

ACKNOWLEDGEMENT

This work is partially supported by the National Natural Science Foundation of China (No. 60973018, No. 60973020), and the Center Education Fundamental, Scientific, Research (No. N090504004, N090104001).

REFERENCES
[1] G. Ghinita. Understanding the Privacy-Efficiency Trade-off in Location Based Queries. In SPRINGL, 2008.
[2] M. Gruteser, D. Grunwald. Anonymous Usage of Location-Based Services Through Spatial and Temporal Cloaking. In MobiSys, 2003.
[3] T. Wang, L. Liu. Privacy-Aware Services over Road Networks. In VLDB, 2009.
[4] T. Brinkhoff. A Framework for Generating Network-Based Moving Objects. GeoInformatica, 2002.

