Associative rule based Spatio-temporal Data Mining
using directions for handling skewness in data
Dissertation
Submitted in partial fulfilment of requirements for the degree of Master of Technology in
Geoinformatics and Natural Resources Engineering
by
Arun Madan
Roll No. 10331007
Under the guidance of:
Prof. Mrs. P. Venkatachalam
Centre of Studies in Resources Engineering
INDIAN INSTITUTE OF TECHNOLOGY BOMBAY
2012
Dissertation Approval Sheet
This thesis entitled "Associative rule based Spatio-temporal Data Mining using directions for handling skewness in data" by Arun Madan (Roll No: 10331007) is approved for the degree of Master of Technology.
Date : _________________________ Place : _________________________
Declaration
I declare that this written submission represents my ideas in my own words and where others' ideas or words have been included, I have adequately cited and referenced the original sources. I also declare that I have adhered to all principles of academic honesty and integrity and have not misrepresented or fabricated or falsified any idea/data/fact/source in my submission. I understand that any violation of the above will be cause for disciplinary action by the Institute and can also evoke penal action from the sources which have thus not been properly cited or from whom proper permission has not been taken when needed.
Arun Madan
(Roll No. 10331007)
Date:
Abstract
Geospatial data have grown rapidly over the last decade, and considerable emphasis is now placed on acquiring knowledge from these data. Spatio-temporal data mining is the process of discovering interesting and previously unknown, but potentially useful, patterns in large spatio-temporal databases. Discovering such patterns is important in application domains such as meteorology, epidemiology, and public safety (crime). Research has proceeded mainly on three fronts: mining unordered sequences, totally ordered sequences, and partially ordered sequences. Of these, this work concentrates on mining totally ordered sequences. The data are generally assumed to be uniform in nature, and skewness in the data is not considered. To overcome this limitation, direction has been taken into consideration. Experiments on climate data collected from NOAA showed that skewness in the data is handled better when directions are used. Computation time is another major factor because of the sheer volume of data. It is reduced by using a breadth-first expand algorithm instead of the depth-first expand that the method otherwise uses. Experiments on simulated data showed that computation time was reduced by 30%.