
Prediction of Chances - Diabetic Retinopathy using Data Mining Classification Techniques

Abstract
Diabetic retinopathy, the most common diabetic eye disease, is caused by complications that occur
when blood vessels in the retina weaken or are damaged. It results in loss of vision if it is not
detected early. Several data mining techniques serve different purposes depending on the modeling
objective. The outcomes of various data mining classification techniques were compared using the
RapidMiner tool. We used Naive Bayes and Support Vector Machine to predict diabetic retinopathy at
an early stage and found the Naive Bayes method to be 83.37% accurate. Performance was also
measured by sensitivity and specificity. The methodology also shows that data mining helps to
retrieve useful correlations even from attributes which are not direct indicators of the class we
are trying to predict.

The commonest cause of blindness among the working class is Diabetic Retinopathy, which often
leads to complete loss of vision [1]. The World Health Organization (WHO) has estimated that
Diabetic Retinopathy is responsible for 4.8% of the 37 million cases of blindness throughout the
world. People with diabetes are susceptible to impairment of other vital organs such as the heart,
kidneys and eyes [2]. At the initial stage of Diabetic Retinopathy, there will be some changes in
vision that can be noticed. Over time, however, Diabetic Retinopathy can worsen and cause vision
loss. Image analysis tools can be used for automated detection of the various features and stages
of Diabetic Retinopathy, so that cases can be referred to a specialist for intervention. Such
tools will therefore be useful for effective screening of Diabetic Retinopathy patients [3]. The
high prevalence of retinopathy cases found worldwide is due to delay in diagnosis, since the
disease is asymptomatic [4]. Therefore, a prediction technique has been conceived so that early
precautions or controls can be implemented.
Lanord Stanley et al. [5] devised a method to diagnose diabetics in the Indian community with the
help of four simple questions, viz. age, abdominal obesity, physical activity and family history,
along with one measurement for waist circumference.
Diabetes Data Analysis and Prediction Model Discovery Using RapidMiner [6] analyzes a Pima
Indians diabetes dataset containing information about patients with and without diabetes. This
work focuses on data preprocessing, including attribute identification and selection, outlier
removal, data normalization and numerical discretization, visual data analysis, hidden
relationships discovery, and the construction of a diabetes prediction model.
The IHDPS [7] prototype predicts the possibility of patients getting heart disease from the
Cleveland heart disease database using the data mining techniques decision trees, naive Bayes and
neural networks with 9 medical attributes. The results show that the most effective model to
predict patients with heart disease is naive Bayes (86.12%), followed by neural network and
decision trees. Furthermore, it can incorporate other data mining techniques such as time series,
clustering and association rules.
In Empirical Study on the Performance of Integrated Hybrid Prediction Model on the Medical
Datasets [8], a system has been proposed to improve the diagnostic accuracy of diabetic disease by
selecting informative features of the Pima Indians Diabetes dataset. The proposed hybrid
prediction model combines two different functionalities of data mining, clustering and
classification, with an F-score selection approach to identify the optimal feature subset of the
Pima Indians Diabetes dataset. The proposed model was validated using four parameters, namely the
accuracy of the classifier, area under the ROC curve, sensitivity and specificity.
Two traditional classification methods (logistic regression and Fisher linear discriminant
analysis) and four machine learning classifiers (neural networks, support vector machines, fuzzy
c-means, and random forests) were compared [9] to classify persons with and without diabetes.
In recent years there have been many studies on automatic diagnosis of diabetes, diabetic
retinopathy, heart disease etc. In [10], a method has been proposed for automated detection and
classification of vascular abnormalities using several techniques such as scale and orientation
selective Gabor filter banks. In [11], the Kaplan-Meier method was used to generate univariate
survival curves to identify patients who were at a higher risk for retinopathy, and the results
showed that duration of diabetes, systolic blood pressure, glycosylated haemoglobin, albuminuria,
gender and diabetes therapy were significantly associated with the occurrence of retinopathy.
A study [12] was made to evaluate the efficiency of three plant components, viz. cinnamaldehyde,
cinnamic acid and cinnamyl alcohol, in inhibiting Aldose Reductase (AR), an enzyme associated with
retinopathy in both type 1 and type 2 diabetic patients.
A product [13] made from whole leaf concentrate of Stevia was found to reduce hyperglycaemia in
type 2 diabetic women.
In [14], it was suggested that increased awareness and treatment of diabetes should begin with
prevention.
According to [15], data mining applications can be developed to evaluate the effectiveness of
medical treatments.

1. Methods
A data mining technique was used to predict the chances of diabetic retinopathy. Under the data
exploration mode, almost all attribute selection modules applicable to the data were explored in
order to collect an optimal subset of attributes. RapidMiner was chosen as the data mining tool
due to its learning operators and operator framework, which allows forming nearly arbitrary
processes.
Though the Cleveland Clinic Foundation Heart Disease dataset is available, for the sake of
determining the accuracy rate in the Indian region we collected 300 clinical records from Dr.
Seshaiah Diabetes Centre, Chennai, Tamil Nadu. The clinical dataset specification provides
concise, unambiguous definitions for items related to diabetes.
Typically, cross validation is used to generate a set of training and validation folds, and we
compared the expected error on the validation folds after training on the training folds. Cross
validation was carried out by using part of the data to train the model and the rest of the
dataset to test the accuracy of the trained model. In this case, we divided the dataset into 10
parts, with training and testing data for each part. The proposed architecture is given in
Figure 1. The attribute data view of each record is shown in Table 1.
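
The 10-part split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not RapidMiner's implementation: the function names, the fixed random seed, and the use of plain shuffled (non-stratified) sampling are all hypothetical choices made for the example.

```python
import random

def k_fold_indices(n_records, k=10, seed=0):
    """Split record indices 0..n_records-1 into k shuffled, near-equal folds."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary, assumed value
    # Fold i takes every k-th shuffled index starting at position i.
    return [indices[i::k] for i in range(k)]

def cross_validation_splits(n_records, k=10, seed=0):
    """Yield one (train_indices, test_indices) pair per fold."""
    folds = k_fold_indices(n_records, k, seed)
    for i, test in enumerate(folds):
        # Every fold except fold i is used for training.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 300 records, as in the dataset collected for this study, split into 10 parts.
splits = list(cross_validation_splits(300, k=10))
```

With 300 records and 10 folds, each round trains on 270 records and tests on the remaining 30, and every record is used for testing exactly once.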

Table 1. Diabetic Attributes used in our Experimentation


Attribute Role  Attribute Name     Attribute Type  Description
Regular         Sex                Binomial        Sex of the patient. Values: Male, Female
Regular         Age                Integer         Age of the patient
Regular         Family / Heredity  Polynomial      Indicates whether the patient's parents were affected by diabetes. Values: Father, Mother, Both
Regular         Weight             Numeric         Weight of the patient
Regular         BP                 Polynomial      Blood pressure of the patient
Regular         Fasting            Integer         Fasting Blood Sugar
Regular         PP                 Integer         Postprandial Blood Glucose
Regular         A1C                Numeric         Glycosylated Hemoglobin Test
Regular         LDL                Integer         Low Density Lipoprotein
Regular         VLDL               Integer         Very Low Density Lipoprotein
Label           Vulnerability      Binomial        Indicates the vulnerability of the patient to Retinopathy. Values: High, Low

2. Data Mining Classification Techniques for Predicting Diseases

Using data mining techniques, we found a model that describes and differentiates data classes,
which in turn helps to accurately predict a class label that is unknown. We also used a
regression method, which helps to analyze the current and past states of the attributes and to
predict future ones. Many researchers in the past have used data mining techniques in the
diagnosis of various diseases.

3.1 Naive Bayes Method


The Naive Bayes method is based on conditional probabilities: given the probability of another
event that has already occurred, the probability of an event occurring is found using Bayes'
theorem [16]. If A is referred to as the prior event and B as the dependent event, Bayes' theorem
can be given as

Prob(B given A) = Prob(A and B) / Prob(A)
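
To make the formula concrete, the following sketch estimates Prob(B given A) from counts and then uses the same idea in a very small Naive Bayes classifier for categorical attributes. Everything in it is an assumption made for illustration: the six toy records are invented (they are not taken from the 300 clinical records), the function names are hypothetical, and add-one smoothing is a common choice that the paper does not mention.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: ((heredity, bp_level), vulnerability label).
# Invented for illustration only; not part of the study's dataset.
RECORDS = [
    (("Both", "High"), "High"),
    (("Father", "High"), "High"),
    (("Both", "Normal"), "High"),
    (("Mother", "Normal"), "Low"),
    (("Father", "Normal"), "Low"),
    (("Mother", "High"), "Low"),
]

def conditional_probability(records, event_b, event_a):
    """Prob(B given A) = Prob(A and B) / Prob(A), estimated from counts."""
    n = len(records)
    p_a = sum(1 for r in records if event_a(r)) / n
    p_a_and_b = sum(1 for r in records if event_a(r) and event_b(r)) / n
    return p_a_and_b / p_a

def train_naive_bayes(records):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(label for _, label in records)
    value_counts = defaultdict(Counter)   # (attribute index, label) -> value counts
    distinct_values = defaultdict(set)    # attribute index -> values seen in training
    for features, label in records:
        for i, value in enumerate(features):
            value_counts[(i, label)][value] += 1
            distinct_values[i].add(value)
    return class_counts, value_counts, distinct_values

def predict(model, features):
    """Return the label with the highest score: prior times per-attribute likelihoods."""
    class_counts, value_counts, distinct_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior Prob(label)
        for i, value in enumerate(features):
            # Add-one smoothed Prob(value given label); attributes are treated
            # as independent of each other, which is the "naive" assumption.
            seen = value_counts[(i, label)][value]
            score *= (seen + 1) / (count + len(distinct_values[i]))
        scores[label] = score
    return max(scores, key=scores.get)

# Prob(vulnerability is High given blood pressure level is High) in the toy data.
p_high_given_bp = conditional_probability(
    RECORDS,
    event_b=lambda r: r[1] == "High",
    event_a=lambda r: r[0][1] == "High",
)
model = train_naive_bayes(RECORDS)
```

In the toy data, three records have a high blood pressure level and two of those are labelled High, so `p_high_given_bp` is 2/3. Calling `predict(model, ("Both", "High"))` then scores both labels and returns the more probable one.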

The Naive Bayes performance screen and the plots for each attribute are shown in Figures 2 to 5.
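
Accuracy, sensitivity and specificity, the measures this paper reports, can be derived from the actual and predicted class labels. The sketch below shows one way to compute them; the function names and the sample labels are hypothetical, "High" is assumed to be the positive class, and the code is not RapidMiner's implementation.

```python
def confusion_counts(actual, predicted, positive="High"):
    """Count true/false positives and negatives for a two-class problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def performance(actual, predicted, positive="High"):
    """Return accuracy, sensitivity and specificity.

    Assumes both classes occur in `actual`; otherwise a division by zero occurs.
    """
    tp, fp, tn, fn = confusion_counts(actual, predicted, positive)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # share of actual positives that were found
        "specificity": tn / (tn + fp),   # share of actual negatives that were found
    }

# Hypothetical labels, for illustration only.
actual = ["High", "High", "High", "Low", "Low"]
predicted = ["High", "High", "Low", "Low", "High"]
metrics = performance(actual, predicted)
```

For these sample labels the counts are 2 true positives, 1 false negative, 1 true negative and 1 false positive, giving an accuracy of 0.6, a sensitivity of 2/3 and a specificity of 0.5.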

Figure 2. Naive Bayes performance screen

Figure 3. Bayes age attribute plot.


Figure 4. Bayes heredity attribute plot.

Figure 5. Bayes LP total cholesterol attribute plot.
