04/28/2015
FINAL REPORT
WEB MINING
Contents
Description
Questions
Data Dictionary
Sample Data
Outlier Detection
Normalized Data
Association Rules
Performance Models
  Naïve Bayes
  Neural Network
  SVM
  Logistic Regression
  KNN
Evaluation Models
  Decision Tree
  Naïve Bayes
  Neural Net
  SVM
  Logistic Regression
  K Nearest Neighbor
Answers
References
Description:
Questions:
Data Dictionary
Attribute          Description                                                   Type             Range
Pregnancies        Number of pregnancies                                         Numeric(17)      0-17
PG Concentration   Plasma glucose at 2 hours in an oral glucose tolerance test   Numeric(199)     0-199
Diastolic BP       Diastolic blood pressure (mm Hg)                              Numeric(122)     0-122
Tri Fold Thick     Triceps skin fold thickness (mm)                              Numeric(52)      0-52
Serums Ins         2-hour serum insulin (mu U/ml)                                Numeric(846)     0-846
BMI                Body mass index (weight in kg / (height in m)^2)              Decimal(53.2)    0-53.2
DP Function        Diabetes pedigree function                                    Decimal(1.353)   0.088-1.353
Age                Age (years)                                                   Numeric(66)      21-66
Diagnosis          Is the patient Sick or Healthy?                               Varchar(7)       Healthy or Sick
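The ranges in the data dictionary can be turned into a simple validation check. The following is a minimal Python sketch, not the tooling used in this report: the function name `validate_record` and the example record are hypothetical, and the Tri Fold Thick range (0-52) is an assumption based on matching the remaining type entry in the dictionary to that attribute.

```python
# Inclusive ranges taken from the data dictionary.
# Assumption: the 0-52 range belongs to "Tri Fold Thick".
RANGES = {
    "Pregnancies": (0, 17),
    "PG Concentration": (0, 199),
    "Diastolic BP": (0, 122),
    "Tri Fold Thick": (0, 52),
    "Serums Ins": (0, 846),
    "BMI": (0.0, 53.2),
    "DP Function": (0.088, 1.353),
    "Age": (21, 66),
}
DIAGNOSIS_VALUES = {"Healthy", "Sick"}


def validate_record(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for name, (low, high) in RANGES.items():
        if name not in record:
            problems.append(f"{name}: missing")
            continue
        value = record[name]
        if not (low <= value <= high):
            problems.append(f"{name}: {value} outside [{low}, {high}]")
    if record.get("Diagnosis") not in DIAGNOSIS_VALUES:
        problems.append(f"Diagnosis: unexpected value {record.get('Diagnosis')!r}")
    return problems


# Hypothetical record for illustration only; not taken from the report's data.
example = {
    "Pregnancies": 2, "PG Concentration": 120, "Diastolic BP": 70,
    "Tri Fold Thick": 30, "Serums Ins": 100, "BMI": 28.5,
    "DP Function": 0.5, "Age": 35, "Diagnosis": "Healthy",
}
print(validate_record(example))
```

A record that falls outside any documented range, or that lacks an attribute, yields one message per problem, which makes it easy to flag rows for review before modeling.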
Sample Data:
Outlier Detection:
Normalized Data:
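The report does not state which normalization method was applied. As an illustration only, the sketch below assumes min-max scaling, which maps each attribute onto [0, 1] using its documented minimum and maximum; the function name `min_max_normalize` and the sample value are hypothetical.

```python
def min_max_normalize(value, low, high):
    """Scale value from the range [low, high] onto [0, 1]."""
    if high == low:
        raise ValueError("low and high must differ")
    return (value - low) / (high - low)


# Age is documented in the data dictionary with the range 21-66.
AGE_MIN, AGE_MAX = 21, 66

# Hypothetical age value, used only to show the calculation.
print(min_max_normalize(43.5, AGE_MIN, AGE_MAX))
```

Scaling attributes to a common range keeps wide-ranged attributes such as serum insulin (0-846) from dominating distance-based models such as KNN.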
Association Rules:
Performance Models:
Decision Tree:
Naïve Bayes:
Neural Network:
SVM:
Logistic Regression:
KNN:
Performance Summary:
Model Name            Accuracy (%)
Decision Tree         73.18
Naïve Bayes           76.17
Neural Network        79.95
SVM                   77.73
Logistic Regression   76.43
KNN                   100
Evaluation Models:
Decision Tree
Naïve Bayes
Neural Net:
SVM
Logistic Regression
K Nearest Neighbor
Evaluation Summary:
Ashwin Kumar Pitchai (akp73) & Suchait Mattoo (sm925)
Model Name            Accuracy (%)
Decision Tree         71.86
Naïve Bayes           75.51
Neural Net            74.74
Logistic Regression   75.65
SVM                   76.95
K Nearest Neighbor    68.24
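The two accuracy tables can be compared model by model. The short Python sketch below copies the figures from the Performance Summary and the Evaluation Summary and computes the difference for each model. It assumes that "Regression" in the Evaluation Summary refers to Logistic Regression and that "Neural Net" and "K Nearest Neighbor" correspond to "Neural Network" and "KNN"; the variable names are illustrative.

```python
# Accuracy (%) as listed in the Performance Summary.
performance = {
    "Decision Tree": 73.18,
    "Naive Bayes": 76.17,
    "Neural Network": 79.95,
    "SVM": 77.73,
    "Logistic Regression": 76.43,
    "KNN": 100.0,
}

# Accuracy (%) as listed in the Evaluation Summary.
# Assumption: "Regression" there means Logistic Regression.
evaluation = {
    "Decision Tree": 71.86,
    "Naive Bayes": 75.51,
    "Neural Network": 74.74,
    "Logistic Regression": 75.65,
    "SVM": 76.95,
    "KNN": 68.24,
}

# Drop in accuracy from the performance figure to the evaluation figure.
gaps = {m: round(performance[m] - evaluation[m], 2) for m in performance}

# Print models from the largest drop to the smallest.
for model, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model:<20} {performance[model]:>6.2f} {evaluation[model]:>6.2f} {gap:>6.2f}")

best_eval = max(evaluation, key=evaluation.get)
print("Highest evaluation accuracy:", best_eval)
```

With these figures, KNN shows by far the largest drop (31.76 points), while SVM has the highest evaluation accuracy (76.95%).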
Answers: