by
ARUNRAJ GANAPATHY (15CO110)
MOHAMMED AMEEN (15CO131)
SATISH AVADHOOT MHETRE (15CO242)
We hereby declare that the Major Project-I Proposal Report entitled "Using
Check-ins from Geo-Social Data to Determine Safe Locations during Natural
Calamities", which is being submitted to the National Institute of Technology
Karnataka, Surathkal in partial fulfilment of the requirements for the award of the
Degree of BACHELOR OF TECHNOLOGY in Computer Science and Engineering,
is a bona fide report of the work carried out by us. The material contained
in this report has not been submitted to any University or Institution for the award
of any degree.
This is to certify that the Major Project-I Proposal Report entitled "Using
Check-ins from Geo-Social Data to Determine Safe Locations during Natural
Calamities", submitted by ARUNRAJ GANAPATHY (Register Number:
15CO110), MOHAMMED AMEEN (Register Number: 15CO131) and SATISH
AVADHOOT MHETRE (Register Number: 15CO242) as the record of the work
carried out by them, is accepted as the Major Project-I Proposal Report submission
in partial fulfilment of the requirements for the award of the degree of Bachelor of
Technology.
Dr M Venkatesan
Guide
Chairman - DUGC
Acknowledgment
We would like to thank Dr. M Venkatesan for giving us the opportunity to work with
him on the major project. This is a great chance for learning and professional
development for us. His guidance, from preliminary knowledge of the field to helping
us select the proposal, was valuable.
We would also like to extend our gratitude to one another for each one's valuable
inputs.
Spatial clustering deals with the unsupervised grouping of places or locations into
clusters and finds important applications in urban planning and marketing. However,
current spatial clustering models disregard information about the people who visit
the clustered places and the times at which they visit them.
In our project, we will develop an algorithm to cluster places based not only on
their locations but also on their semantics. Our model considers spatio-temporal
information and the social relationships between users who visit the clustered places.
Specifically, two places are considered similar if they are spatially close and visited
by people of similar communities.
With this information we can determine whether a location is safe during natural
calamities, notify the people at that location and take the necessary evacuation actions.
Contents
1 Introduction
2 Literature Survey
2.2 Objectives
Bibliography
Chapter 1
Introduction
Social networks have gained popularity in recent years with the advent of sites such as
Twitter, Instagram and Facebook. Every day, millions of people actively participate on
these platforms, and the numbers are growing rapidly. These networks are a rich
source of data because users populate them with personal information.
1.1 Issues and Challenges
During the course of our project development, we expect to face the following challenges:
4. Most features are zero for most samples, i.e. the object-feature matrix is sparse.
This property strongly affects the measurements of similarity and the compu-
tational complexity.
5. Outliers may have significant importance. Finding these outliers is highly non-
trivial, and removing them is not necessarily desirable.
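The sparsity mentioned in item 4 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the objects are places and the features are users, so that a cell is non-zero when a user has checked in at a place. The function names and the sample check-ins are hypothetical and not part of the proposal.

```python
def place_user_sets(checkins):
    """Sparse place-by-user matrix: store only the non-zero entries of each row."""
    rows = {}
    for user, place in checkins:
        rows.setdefault(place, set()).add(user)
    return rows


def density(rows):
    """Fraction of non-zero cells in the implied place-by-user matrix."""
    users = set().union(*rows.values()) if rows else set()
    cells = len(rows) * len(users)
    return sum(len(r) for r in rows.values()) / cells if cells else 0.0


# Hypothetical (user, place) check-ins, for illustration only.
checkins = [("u1", "cafe"), ("u2", "cafe"), ("u3", "park"), ("u4", "mall")]
rows = place_user_sets(checkins)
print(density(rows))  # 4 non-zero cells out of 3 places x 4 users
```

Storing each row as a set keeps memory proportional to the number of check-ins rather than to the full number of place-user pairs, which is one common way to handle such matrices.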
The Introduction section explains what a Geo-Social Network is and how the data
it generates can be harnessed for our project. It also presents the basic idea behind
the project, the need for it, and the issues and challenges involved.
The Literature Survey section reviews existing, already implemented methods that
are similar to our use case. It is the stepping stone that will help us
during our development process.
Chapter 2
Literature Survey
There are various models that have been proposed for clustering geo-social network
data. The most prominent ones have three components:
1. The social network is an undirected graph $G = (U, E)$, where $U$ is the set of
users and each edge $(u_i, u_j) \in E$ indicates that the users $u_i, u_j \in U$ are friends.
2. $P$ is the set of all places visited by users, in the form of <latitude, longitude>
GPS points.
3. $CK$ is the set of check-ins. A check-in in $CK$ is a triplet $\langle u_i, p_k, t_r \rangle$
indicating that user $u_i$ visited place $p_k$ at time $t_r$.
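The three components above can be sketched as simple Python structures. This is a minimal illustration, not the representation used in the surveyed models: the names `CheckIn`, `build_friend_graph` and `visitors_by_place`, the integer timestamp, and the sample data are all assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CheckIn:
    """One element of CK: user u_i visited place p_k at time t_r."""
    user: str
    place: Tuple[float, float]  # (latitude, longitude)
    time: int                   # timestamp; the unit is an assumption


def build_friend_graph(edges):
    """Store the undirected graph G = (U, E) as a map from user to set of friends."""
    friends = defaultdict(set)
    for u, v in edges:
        friends[u].add(v)
        friends[v].add(u)  # undirected: record the edge in both directions
    return dict(friends)


def visitors_by_place(checkins):
    """Map each place p in P to the set of users who checked in there."""
    visitors = defaultdict(set)
    for c in checkins:
        visitors[c.place].add(c.user)
    return dict(visitors)


# Hypothetical sample data, for illustration only.
edges = [("u1", "u2"), ("u2", "u3")]
checkins = [
    CheckIn("u1", (13.01, 74.79), 1),
    CheckIn("u2", (13.01, 74.79), 2),
    CheckIn("u3", (12.97, 77.59), 3),
]
friends = build_friend_graph(edges)
visitors = visitors_by_place(checkins)
```

The per-place visitor sets produced here correspond to the user sets that a social distance between two places would take as input.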
DCPGS Model
For two places $p_i$ and $p_j$, $E(p_i, p_j)$ is the Euclidean distance and
$D_{gs}(p_i, p_j) = f(D_s(p_i, p_j), E(p_i, p_j))$ is the geo-social distance, defined
as a function of the social distance $D_s(p_i, p_j)$ and $E(p_i, p_j)$. Parameter $\epsilon$
is the geo-social distance threshold, while $\tau$ and $maxD$ are two sanity constraints
for the social and the spatial distances between places, respectively.
Since the geo-social distance $D_{gs}(p_i, p_j)$ is a function of a spatial and a social
distance, $\tau$ and $maxD$ constrain these individual distances to avoid the following two
cases that negatively affect the quality of geo-social clusters:
1. The geo-social distance between two places $p_i$ and $p_j$ could be less than $\epsilon$
if they are extremely close to each other in space but have no social connection at all.
This may lead to putting places that are spatially close, but have no
social relationship, into the same cluster.
2. The geo-social distance between two places $p_i$ and $p_j$ could be less than $\epsilon$ if
they have a very small social distance but are extremely far from each other
spatially. This may lead to putting places with small social distances, but large
spatial distances, into the same cluster.
Constraints τ and maxD are defined for quality control and can be set by experts
or according to the analyst’s experience.
The social distance $D_s(p_i, p_j)$ takes as inputs the sets of users $U_{p_i}$ and $U_{p_j}$ who
have visited $p_i$ and $p_j$, respectively, and returns a value between 0 and 1. The
Euclidean distance $E(p_i, p_j)$ is likewise normalized by converting it into a spatial distance
$$D_p(p_i, p_j) = \frac{E(p_i, p_j)}{maxD}$$
so that any place $p_j$ in the geo-social neighborhood of $p_i$ has spatial distance
no larger than 1.
Finally, $D_{gs}(p_i, p_j)$ is defined as a weighted sum of $D_s(p_i, p_j)$ and $D_p(p_i, p_j)$:
$$D_{gs}(p_i, p_j) = \omega \cdot D_s(p_i, p_j) + (1 - \omega) \cdot D_p(p_i, p_j), \quad \omega \in [0, 1]$$
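The weighted sum and the two sanity constraints can be combined into a single neighborhood test, sketched below under stated assumptions. The text does not define the social distance, so a Jaccard distance over the visitor sets is used here only as a placeholder with values in [0, 1]; it should not be read as the DCPGS definition. The function names, the inclusive comparisons, and the use of plain coordinate pairs for places are also assumptions.

```python
import math


def euclidean(p, q):
    """Euclidean distance E(p_i, p_j) between two coordinate pairs."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def jaccard_social_distance(users_i, users_j):
    """Placeholder D_s in [0, 1]: Jaccard distance of the visitor sets (an assumption)."""
    union = users_i | users_j
    if not union:
        return 1.0  # no visitors at all: treat the places as socially unrelated
    return 1.0 - len(users_i & users_j) / len(union)


def geo_social_distance(d_s, e, max_d, omega):
    """D_gs = omega * D_s + (1 - omega) * D_p, with D_p = E / maxD."""
    if not 0.0 <= omega <= 1.0:
        raise ValueError("omega must lie in [0, 1]")
    return omega * d_s + (1.0 - omega) * (e / max_d)


def is_geo_social_neighbor(p_i, p_j, users_i, users_j, eps, tau, max_d, omega):
    """True if p_j passes both sanity constraints and the eps threshold."""
    d_s = jaccard_social_distance(users_i, users_j)
    e = euclidean(p_i, p_j)
    if d_s > tau or e > max_d:  # sanity constraints on each component distance
        return False
    return geo_social_distance(d_s, e, max_d, omega) <= eps
```

With `eps=0.6`, `tau=0.7`, `max_d=10.0` and `omega=0.5`, two nearly coincident places with disjoint visitor sets have a weighted distance of about 0.5, which is below `eps`, yet they are rejected because the placeholder social distance of 1.0 exceeds `tau`. This is the first of the two cases the sanity constraints are meant to rule out.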
2.1 Problem Statement
To determine the safe locations during natural calamities by using spatio-temporal
clustering of Geo-Social Network Data.
2.2 Objectives
To identify safe locations from social media check-ins, the following objectives
have to be met:
4. Identifying the clusters which are safe based on the check-ins provided by the
users.
Bibliography
[1] Wu, D., Shi, J. and Mamoulis, N., 2018. Density-Based Place Clustering Using
Geo-Social Network Data. IEEE Transactions on Knowledge and Data Engineer-
ing, 30(5), pp.838-851.
[2] Srivastava, S., Pande, S. and Ranu, S., 2015, November. Geo-social clustering
of places from check-in data. In Data Mining (ICDM), 2015 IEEE International
Conference on (pp. 985-990). IEEE.
[3] Mishra, N., Schreiber, R., Stanton, I. and Tarjan, R.E., 2007, December. Cluster-
ing social networks. In International Workshop on Algorithms and Models for the
Web-Graph (pp. 56-67). Springer, Berlin, Heidelberg.
[4] Wu, D., Mamoulis, N. and Shi, J., 2015. Clustering in geo-social networks. Bulletin
of the IEEE Computer Society Technical Committee on Data Engineering.