
Coronary Heart Risk Study

https://www.kaggle.com/neisha/heart-disease-prediction-using-logistic-regression

https://rpubs.com/Jun_Pan43/448397

https://rstudio-pubs-static.s3.amazonaws.com/305586_38890b8784194c92a7a4a71d0744c19f.html

https://github.com/bdanalytics/BioLINCC-Framingham/blob/master/Fram_X_all.md

Objective:
The dataset provides the risk factors associated with heart disease for ~4200 patients and records
whether each patient is at risk of coronary heart disease (CHD) in the next 10 years.

Based on the dataset provided:


1. Create a segmentation of the patients based on the demographic, behavioural and health
data, and analyse the risk propensity of heart disease for each segment
2. Predict the probability of a patient developing coronary heart disease in the next 10 years
3. Identify the most important factors that influence heart disease
4. Come up with recommendations for
   a. Preventing / reducing the chances of developing heart disease
   b. Extrapolated applications of the model you build and its findings
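Objectives 2 and 3 can be illustrated with a minimal sketch, assuming a logistic regression model is chosen (the brief itself does not prescribe a method). The intercept and weights below are hypothetical values made up for illustration and are not fitted to the dataset; only the column names age, sysBP and cigsPerDay come from the dataset description.

```python
import math

# Hypothetical coefficients for illustration only; NOT fitted to the dataset.
EXAMPLE_INTERCEPT = -8.0
EXAMPLE_WEIGHTS = {"age": 0.06, "sysBP": 0.02, "cigsPerDay": 0.02}


def sigmoid(z: float) -> float:
    """Map a real-valued score to a probability in (0, 1), avoiding overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def chd_probability(patient: dict, weights: dict, intercept: float) -> float:
    """Modelled 10-year CHD probability for one patient (objective 2)."""
    score = intercept + sum(w * patient[name] for name, w in weights.items())
    return sigmoid(score)


def odds_ratios(weights: dict) -> dict:
    """exp(coefficient): multiplicative change in the odds of CHD per
    one-unit increase in a factor, a common starting point for objective 3."""
    return {name: math.exp(w) for name, w in weights.items()}


# A made-up patient record, used only to exercise the functions.
patient = {"age": 55, "sysBP": 140.0, "cigsPerDay": 20}
p = chd_probability(patient, EXAMPLE_WEIGHTS, EXAMPLE_INTERCEPT)
ratios = odds_ratios(EXAMPLE_WEIGHTS)
print(f"Estimated 10-year CHD probability: {p:.3f}")
```

Odds ratios are only comparable across factors measured on the same scale, so a real analysis would likely standardise the continuous columns before ranking them.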

About the dataset


The dataset contains 4187 rows and 16 columns of data.

Column headings
• Demographic
  o sex: male or female (Nominal)
  o age: age of the patient (Continuous; although the recorded ages have been truncated
    to whole numbers, the concept of age is continuous)
• Behavioural
  o currentSmoker: whether or not the patient is a current smoker (Nominal)
  o cigsPerDay: the average number of cigarettes the patient smoked per day (can be
    considered continuous, since one can smoke any number of cigarettes, even half a
    cigarette)
• Medical (history)
  o BPMeds: whether or not the patient was on blood pressure medication (Nominal)
  o prevalentStroke: whether or not the patient had previously had a stroke (Nominal)
  o prevalentHyp: whether or not the patient was hypertensive (Nominal)
  o diabetes: whether or not the patient had diabetes (Nominal)
• Medical (current)
  o totChol: total cholesterol level (Continuous)
  o sysBP: systolic blood pressure (Continuous)
  o diaBP: diastolic blood pressure (Continuous)
  o BMI: Body Mass Index (Continuous)
  o heartRate: heart rate (Continuous; in medical research, variables such as heart rate,
    though in fact discrete, are treated as continuous because of the large number of
    possible values)
  o glucose: glucose level (Continuous)
• Target variable (to be predicted)
  o 10-year risk of coronary heart disease (CHD) (Binary: “1” means “Yes”, “0” means “No”)
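The column description above can be turned into a small schema for loading the data. The following is a minimal sketch using only the Python standard library and covering only the columns listed above. Three things are assumptions: the target header name TenYearCHD (the description does not give it), the treatment of empty or "NA" cells as missing, and the 0/1 coding of the yes/no columns. The two sample rows are made up for illustration and are not taken from the dataset.

```python
import csv
import io

# Column groups as listed in the description above.
NOMINAL = ["sex", "currentSmoker", "BPMeds", "prevalentStroke",
           "prevalentHyp", "diabetes"]
CONTINUOUS = ["age", "cigsPerDay", "totChol", "sysBP", "diaBP",
              "BMI", "heartRate", "glucose"]
TARGET = "TenYearCHD"   # assumed header name for the 10-year CHD flag
MISSING = {"", "NA"}    # assumed markers for missing values


def parse_row(raw: dict) -> dict:
    """Convert one CSV row of strings into typed values."""
    parsed = {name: raw[name].strip() for name in NOMINAL}
    for name in CONTINUOUS:
        value = raw[name].strip()
        parsed[name] = None if value in MISSING else float(value)
    parsed[TARGET] = int(raw[TARGET])  # 1 = "Yes", 0 = "No"
    return parsed


# Made-up sample rows, used only to exercise parse_row.
SAMPLE_CSV = """sex,age,currentSmoker,cigsPerDay,BPMeds,prevalentStroke,prevalentHyp,diabetes,totChol,sysBP,diaBP,BMI,heartRate,glucose,TenYearCHD
male,40,1,10,0,0,0,0,200,120,80,25.0,75,80,0
female,60,0,0,1,0,1,0,250,150,95,30.0,70,NA,1
"""

rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
print(len(rows), "rows parsed;", len(rows[0]), "columns each")
```

Keeping the nominal columns as strings and the continuous columns as floats mirrors the Nominal / Continuous labels in the list, which makes it easier to pick a suitable encoding or scaling step for each group later.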
