First Phase-Zeroth Review

Coronary Heart Disease Predictive Decision Scheme

Using Big Data and RNN



● The main scope of our project is to build a predictive system that detects
coronary heart disease in a patient and thereby raises the patient's
awareness of the condition.
● This is intended to help reduce the death rate caused by coronary heart
disease.
What is BIG DATA?
● Big data is a term that describes the large volume of data – both structured
and unstructured – that inundates a business on a day-to-day basis.
● Big Data is a broad term for data sets so large or complex that they are
difficult to process using traditional data processing systems.
● Challenges include analysis, capture, curation, search, sharing, storage,
transfer, visualization, and information privacy.

Beginning of the Digital age:
● Data sets grow rapidly - in part because they are increasingly gathered by
cheap and numerous information-sensing Internet of things devices such as
mobile devices, aerial (remote sensing), software logs, cameras,
microphones, Radio-Frequency Identification (RFID) readers and wireless
sensor networks.

Why Big Data?
● Big Data generates value from the storage and processing of very large
quantities of digital information that cannot be analyzed with traditional
computing techniques.
● Big data analytics helps organizations harness their data and use it to identify
new opportunities.
● The following are the advantages of big data:
1. Faster, better decision making.
2. New products and services.
3. Cost Reduction.
● Data quality is important in any context. With the increasing complexity of big
data, however, has come greater attention to the importance of ensuring data
quality within complex data sets and analytics operations.

Characteristics of Big Data:
The characteristics of big data can be explained by the 5 V’s:

Volume, Velocity, Veracity, Variety, Visibility

The Volume Dimension:
❏ The name 'Big Data' itself refers to an enormous size. The size of data plays
a very crucial role in determining the value that can be drawn from it.
Whether a particular data set can actually be considered Big Data also
depends on its volume. Hence, 'Volume' is one characteristic that needs to
be considered when dealing with 'Big Data'.

The Variety Dimension:
❏ Variety refers to heterogeneous sources and the nature of data, both
structured and unstructured. Nowadays, data in the form of emails, photos,
videos, monitoring devices, PDFs, audio, etc. is also considered in
analysis applications.

The Velocity Dimension:
❏ The term 'velocity' refers to the speed at which data is generated.
❏ Big Data velocity deals with the speed at which data flows in from sources
such as business processes, application logs, networks, social media sites,
sensors, and mobile devices. The flow of data is massive and continuous.

The Veracity Dimension:
❏ Veracity refers to the quality and trustworthiness of data. Data comes in
different kinds of formats, from structured, numeric data in traditional
databases to unstructured text documents, email, video, audio, and stock and
financial transactions, so its accuracy and consistency can vary.

The Visibility Dimension:
❏ This dimension refers to a customer's ability to see and track their
experience or order through the operations process.
❏ Example: High-visibility operations include courier companies, where you can
track your package online, and retail stores, where you pick up the goods and
purchase them over the counter.

Types of Data in Big Data
Big data can be found in three forms:

1. Structured

2. Semi-Structured

3. Unstructured

Structured Data:
❏ Any data that can be stored, accessed and processed in a fixed format
is termed 'structured' data.

Semi-Structured Data:
❏ Semi-structured data is a form of structured data that does not conform to
the formal structure of data models associated with relational databases.

Unstructured Data:
❏ Unstructured data (or unstructured information) is information that either does
not have a pre-defined data model or is not organized in a pre-defined
manner. Unstructured information is typically text-heavy, but may contain data
such as dates, numbers, and facts as well.
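
The three forms of data described above can be illustrated with a small Python sketch. The records, field names, and note text are made-up placeholders chosen for illustration only; they are not data from the project.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields (hypothetical values).
structured_csv = "patient_id,age,cholesterol\n1,54,239\n2,61,286\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))

# Semi-structured: self-describing keys, but records need not share one schema.
semi_structured = json.loads(
    '[{"patient_id": 1, "vitals": {"bp": "130/85"}},'
    ' {"patient_id": 2, "allergies": ["penicillin"]}]'
)

# Unstructured: free text with no predefined data model (hypothetical note).
unstructured_note = "Patient reports chest pain on exertion since March."


def field_names(records):
    """Return the set of top-level keys seen across all records."""
    names = set()
    for record in records:
        names.update(record.keys())
    return names


print(field_names(rows))             # the same three columns for every row
print(field_names(semi_structured))  # keys differ from record to record
print(len(unstructured_note.split()), "words of free text")
```

The structured rows always expose the same columns, the semi-structured records carry their own keys, and the unstructured note has no fields to enumerate at all.
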

Big Data Analytics:
● Big data analytics is the process of examining large and varied data sets --
i.e., big data -- to uncover hidden patterns, unknown correlations, market
trends, customer preferences and other useful information that can help
organizations make more-informed business decisions.
● Big Data Analytics examines large data sets containing a variety of data
types. Companies and enterprises that implement Big Data Analytics often
reap several business benefits, including more effective marketing
campaigns, the discovery of new revenue opportunities, improved customer
service delivery, more efficient operations, and competitive advantages.
● Big Data Analytics gives analytics professionals, such as data scientists
and predictive modelers, the ability to analyze Big Data from multiple and
varied sources, including transactional data and other structured data.

The Big Data Processing Categories:
❏ The following are the different categories of Big Data Processing:
i. Transactional RDBMS (relational database management systems).
ii. Analytic Platforms.
iii. Hadoop Distributions.
iv. NoSQL Databases.

Applications of Big Data:
The following are the applications of Big Data:

1. Procurement Weather.
2. Product development
3. Manufacturing sector
4. Marketing field
5. Human Resources
6. Finance sector
7. HealthCare
8. Media and Entertainment

Tools used in Big Data:
● Big data tools are mainly used to perform specific, individual tasks,
depending on their type and function.
● The following are some commonly used Big Data technologies:
1. Apache Hadoop (framework)
2. Microsoft HDInsight
3. NoSQL
4. Hive
5. Sqoop
6. PolyBase
7. Big data in Excel
8. Presto

What are Predictive Models?
❏ Predictive modeling is a process that uses data mining and probability to
forecast outcomes.
❏ Each model is made up of a number of predictors, which are variables that
are likely to influence future results.
❏ Once data has been collected for relevant predictors, a statistical model is
formulated.
❏ Predictive analytics is an area of statistics that deals with extracting
information from data and using it to predict trends and behavior patterns.
❏ Often the unknown event of interest is in the future, but predictive analytics
can be applied to any type of unknown, whether it be in the past, present or
future.

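
The idea of predictors feeding a statistical model can be sketched with a tiny logistic regression written in plain Python. This is only an illustration under stated assumptions: the two predictors (scaled age and scaled cholesterol), the six training rows, and the learning rate are invented for the example and do not come from the project's data or its actual model.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, lr=0.1, epochs=2000):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n_features = len(features[0])
    n_samples = len(features)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss with respect to the score
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def predict_proba(weights, bias, x):
    """Return the modeled probability of the positive outcome for one row."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical predictors scaled to [0, 1]: (age, cholesterol). Values invented.
X = [(0.2, 0.1), (0.3, 0.2), (0.4, 0.3), (0.6, 0.7), (0.8, 0.8), (0.9, 0.9)]
y = [0, 0, 0, 1, 1, 1]  # 1 = disease present in this toy data set

weights, bias = train_logistic(X, y)
low = predict_proba(weights, bias, (0.25, 0.15))
high = predict_proba(weights, bias, (0.85, 0.85))
print(f"low-risk example: {low:.2f}, high-risk example: {high:.2f}")
```

Each predictor receives a weight during fitting, and the fitted model then turns a new patient's predictor values into a probability of the outcome.
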
Neural networks:
● Neural networks are used to effectively process e-health data, which is stored in
large volumes.
● Neural networks are a set of algorithms, modeled loosely after the human brain, that
are designed to recognize patterns.
● They interpret sensory data through a kind of machine perception, labeling or clustering
raw input.
● They are not explicitly programmed to perform a specific task; they learn from
examples instead.
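
The title of this review names an RNN (recurrent neural network), although the slides do not describe its architecture. As a rough sketch only, the forward pass of a simple single-layer recurrent network over a sequence of patient visits could look like the following. The weights are arbitrary untrained numbers, and the two per-visit features are hypothetical; nothing here is the project's actual model.

```python
import math


def rnn_forward(sequence, w_in, w_rec, b_h, w_out, b_out):
    """Run a single-layer recurrent network over a sequence.

    At each step: hidden = tanh(w_in @ x_t + w_rec @ hidden + b_h).
    After the last step the hidden state is mapped to a probability.
    """
    hidden = [0.0] * len(b_h)
    for x_t in sequence:
        new_hidden = []
        for i in range(len(hidden)):
            z = b_h[i]
            z += sum(w_in[i][j] * x_t[j] for j in range(len(x_t)))
            z += sum(w_rec[i][k] * hidden[k] for k in range(len(hidden)))
            new_hidden.append(math.tanh(z))
        hidden = new_hidden  # the state carries information to the next step
    logit = b_out + sum(w * h for w, h in zip(w_out, hidden))
    return 1.0 / (1.0 + math.exp(-logit)), hidden


# Arbitrary, untrained parameters for a network with 2 inputs and 2 hidden units.
W_IN = [[0.5, -0.3], [0.2, 0.8]]
W_REC = [[0.1, 0.4], [-0.2, 0.3]]
B_H = [0.0, 0.1]
W_OUT = [1.0, -1.0]
B_OUT = 0.0

# Hypothetical sequence of three visits, each with two scaled measurements.
visits = [(0.2, 0.4), (0.5, 0.5), (0.9, 0.7)]

prob, hidden = rnn_forward(visits, W_IN, W_REC, B_H, W_OUT, B_OUT)
print(f"predicted probability: {prob:.3f}")
```

In practice the weights would be learned from labeled examples rather than written by hand, which is the sense in which such a network is trained instead of programmed.
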

What is E-Health?
● E-Health is an emerging field in the intersection of medical informatics, public
health and business, referring to health services and information delivered or
enhanced through the Internet and related technologies.
● It refers to practices that are supported by electronic processes and
● It comprises both structured and unstructured data.
● Structured data includes a patient's name, date of birth, or a blood-test result.
● Unstructured data includes emails, audio recordings, or physician notes
about a patient.
● It can also include health applications and links on mobile phones,
