
A Quick Guide to Artificial Intelligence (AI)

AI defined in a nutshell for corporate enterprise


What is artificial intelligence (AI)?

Artificial intelligence (AI) is, at its core, the science of simulating human intelligence
by machines. One definition is the branch of computer science that deals with the
recreation of the human thought process. The focus is on making computers human-
like, not making computers human. The goals of artificial intelligence usually fall under one of three categories: to build systems that think the same way humans do; to complete a job successfully without necessarily recreating human thought; or to use human reasoning as a model but not as the ultimate goal.

With the advent of the internet of things (IoT), the interconnection via the Internet of
computing devices in everyday objects, AI is poised to play a large role. Artificial
intelligence plays a growing role in IoT, with some IoT platform software offering
integrated AI capabilities.

Several sub-specialties make up the whole of AI. Although many of these subsections are used interchangeably with artificial intelligence, each of them has unique properties that contribute to the topic.

Machine Learning vs. AI

Artificial intelligence and machine learning (ML) are terms that are often used interchangeably in data science, though they are not the same thing. ML is a subset of artificial intelligence built on the idea that data scientists should give machines data and allow them to learn on their own. ML often uses neural networks, computer systems modeled after how the human brain processes information. A neural network is an algorithm designed to recognize patterns, calculate the probability of a certain outcome occurring, and “learn” through errors and successes using a feedback loop. Neural networks are a valuable tool, especially for neuroscience research. Deep learning, which stacks many neural network layers, can establish correlations between two things and learn to associate them with each other. Given enough data to work with, it can predict what will happen next.

There are two frameworks of ML: supervised learning and unsupervised learning. In
supervised learning, the learning algorithm starts with a set of training examples that
have already been correctly labeled. The algorithm learns the correct relationships
from these examples and applies these learned associations to new, unlabeled data it
is exposed to. In unsupervised learning, the algorithm starts with unlabeled data. It is only concerned with inputs, not outputs. You can use unsupervised learning to group similar data points into clusters and learn which data points share similarities. In unsupervised learning, the computer teaches itself, whereas in supervised learning, the computer is taught by the data. With the introduction of big data, neural networks are more important and useful than ever for learning from these large datasets. Deep learning is usually linked to artificial neural networks (ANN), variations that stack multiple neural networks to achieve a higher level of perception. Deep learning is being used in the medical field to accurately diagnose more than 50 eye diseases.
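The contrast between the two frameworks can be sketched with a toy example. The points, labels, and distance threshold below are invented for illustration: a nearest-centroid rule stands in for supervised learning (learning from labeled examples), and a naive distance-based grouping stands in for unsupervised clustering (finding structure without labels).

```python
from statistics import mean

# Supervised: learn one centroid per label from labeled examples,
# then assign new points to the nearest centroid.
labeled = [((1.0, 1.2), "small"), ((0.8, 1.0), "small"),
           ((8.9, 9.1), "large"), ((9.2, 8.7), "large")]

def centroids(examples):
    by_label = {}
    for point, label in examples:
        by_label.setdefault(label, []).append(point)
    return {lbl: (mean(p[0] for p in pts), mean(p[1] for p in pts))
            for lbl, pts in by_label.items()}

def classify(point, cents):
    dist = lambda a, b: (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return min(cents, key=lambda lbl: dist(point, cents[lbl]))

cents = centroids(labeled)
print(classify((9.0, 9.0), cents))   # lands in the "large" group

# Unsupervised: the same points without labels, grouped by distance
# to the first member of each discovered cluster.
unlabeled = [p for p, _ in labeled]

def cluster(points, radius=4.0):
    groups = []
    for p in points:
        for g in groups:
            if (p[0] - g[0][0]) ** 2 + (p[1] - g[0][1]) ** 2 < radius ** 2:
                g.append(p)
                break
        else:
            groups.append([p])
    return groups

print(len(cluster(unlabeled)))       # two groups emerge without any labels
```

The supervised half needs the correct labels up front; the unsupervised half discovers the same two groups from the geometry of the data alone.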

Predictive analytics is composed of several statistical techniques, including ML, to estimate future outcomes. It helps to analyze future events based on the outcomes of similar events in the past. Predictive analytics and ML go hand in hand because predictive models often include an ML algorithm. Neural networks are one of the most widely used predictive models.
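A minimal sketch of the idea, with invented sales figures: fit a one-variable linear model to past observations by ordinary least squares, then apply the learned relationship to a future input. Real predictive models are far richer, but the learn-from-history, predict-the-future loop is the same.

```python
# Fit y = a + b*x by ordinary least squares, then predict a future value.
# Past observations: (month, units sold) -- invented data.
history = [(1, 110.0), (2, 118.0), (3, 131.0), (4, 142.0), (5, 149.0)]

def fit_line(points):
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Slope is covariance over variance of x; intercept follows.
    b = sum((x - mx) * (y - my) for x, y in points) / \
        sum((x - mx) ** 2 for x, _ in points)
    a = my - b * mx
    return a, b

a, b = fit_line(history)
forecast = a + b * 6          # predicted outcome for month 6
print(round(forecast, 1))
```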

Natural Language Processing

Natural language processing (NLP) began as a combination of artificial intelligence and linguistics. It is a field that focuses on “computer understanding and manipulation of human language.” NLP is a way for computers to analyze and extract meaning from
human language so that they can perform tasks like translation, sentiment analysis,
and speech recognition, among others. Each of these topics deals with textual data in
a different way. One such task is machine translation, where a computer
automatically converts one natural language into another while preserving the
meaning. It is difficult even by artificial intelligence standards, as it requires
knowledge of word order, sense, pronouns, tense, and idioms, which vary widely
across languages. In machine translation, the computer scans words that are already
translated by humans to look for patterns. Like machine learning, NLP has progressed
leaps and bounds by using neural network models that allow it to learn pattern
recognition. Services like Google Translate use statistical machine translation
techniques. There is still a long way to go until a computer can be considered
completely fluent in a given language, though.

Classification and clustering are two different ways that ML creates pattern
recognition. Classification is assigning things to a specific label, while clustering is
grouping similar things together. You can apply either of these approaches to NLP.
Text classification aims to assign a document or fragment of text to one or more
categories to make it easier to sort through. Text classification is a technique used in
spam detection and sentiment analysis, where affect is assigned to a given set of text being analyzed. Successful text classification, or document classification, occurs when
an algorithm takes text input and reliably predicts what custom category that text
falls into. Document clustering is a technique that clusters, or groups, similar
documents into categories to allow structure within a collection of documents. The
algorithm can do this even without understanding or being fluent in the language of
the text input because it learns statistical associations between inputs and the
categories. It is able to perform information extraction from a chunk of text.

Question answering works in a similar way. A question answering system answers questions posed in natural language. This practice is often used in customer service chatbots that can answer the most frequent or basic questions before escalating the query to a real human, if needed. These are different from bots, which are automated
programs that crawl the internet looking for a specific type of information. The
highest form of a question answering algorithm would pass the Turing test, a test to
see if a machine’s text-based chat capabilities can fool a human into thinking they are
talking to another human. A machine using text generation could arguably pass the
Turing test. Text generation is the ability of a machine to generate coherent, human-
like dialogue. Ethical concerns exist for AI text generation because they are so similar
to human text.

A major area of speech in AI is speech to text, which is the process of converting
audio and voice into written text. It can assist users who are visually or physically
impaired and can promote safety with hands-free operation. Speech to text tasks use
machine learning algorithms that learn from large data sets of human voice samples.
Data sets train speech to text systems to meet production-quality standards. Speech to text has value for businesses because it can aid in video or phone call transcription.
Text to speech converts written text into audio that sounds like natural speech. These
technologies can be used to assist individuals who have speech disabilities. Amazon’s
Polly is an example of a technology that uses deep learning to synthesize speech that
sounds human for e-learning, telephony and content creation applications.

Speech recognition is a task where speech is received by a system through a microphone and checked against a vocabulary bank for pattern recognition. When a word or phrase is recognized, the system responds with the associated verbal response or performs a specific task. You can see examples of speech recognition in Apple’s Siri, Amazon’s
Alexa, Microsoft’s Cortana and Google’s Google Assistant. These products need to be
able to recognize the speech input from a user and assign the correct speech output
or action. Even more advanced are attempts to create speech from brainwaves for
those who lack or have lost the ability to speak.

Expert Systems
An expert system uses a knowledge base about its application domain and an
inference engine to solve problems that would normally require human intelligence.
Examples of expert systems include financial management, corporate planning, credit
authorization, computer installation design and airline scheduling. Expert systems
have potential value in IoT applications. For example, an expert system in traffic
management can aid with the design of smart cities by acting as a “human operator”
for relaying traffic feedback to the appropriate routes.
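The knowledge-base-plus-inference-engine split can be sketched as a tiny forward-chaining rule engine. The traffic-management rules below are hypothetical examples, not from any real system: each rule fires when all of its conditions are present in working memory, adding new facts until nothing more can be concluded.

```python
# Knowledge base: (conditions, conclusion) pairs -- hypothetical rules.
rules = [
    ({"heavy_congestion", "rush_hour"}, "reroute_advised"),
    ({"accident_reported"}, "lane_closed"),
    ({"lane_closed", "heavy_congestion"}, "dispatch_officer"),
]

def infer(facts, rules):
    # Forward chaining: keep firing rules until no new facts appear.
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

result = infer({"heavy_congestion", "accident_reported", "rush_hour"}, rules)
print(sorted(result))
```

Note how the third rule only fires after the second has added `lane_closed`: conclusions chain into new premises, which is what lets an inference engine reach results no single rule states directly.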

A limitation of expert systems is that they lack the common sense that humans have, such as awareness of the limits of their own skills and of how their recommendations fit into the larger picture. They lack the self-awareness that humans have. Expert systems are
not substitutes for decision makers because they do not have human capabilities, but
they can drastically reduce the human work required to solve a problem.

Planning, scheduling and optimization

AI planning is the task of determining the course of action for a system to reach its
goals in the most optimal way possible. It is choosing a sequence of actions that have
a high likelihood of transforming the state of the world in a step-wise fashion to
achieve its goal. When this task is successful, it allows for task automation. These
solutions are often complex. In dynamic environments with constant change, they require frequent trial-and-error iteration to fine-tune. Scheduling is the creation of schedules, or temporal assignments of activities to resources, while taking the necessary goals and constraints into account.

Where planning is determining an algorithm, scheduling is determining the order and timing of the actions generated by the algorithm. These are typically executed by intelligent agents, autonomous robots and unmanned vehicles. Done successfully, they can solve planning and scheduling problems for organizations in a cost-efficient manner compared with hiring more staff, which increases overhead costs. Optimization can be achieved by using one of the most popular ML and deep learning optimization strategies: gradient descent. It is used to train a machine learning model by changing its parameters in an iterative fashion to drive a given loss function toward its local minimum.
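Gradient descent can be shown in a few lines. This sketch fits a one-parameter model y = w·x to invented data by repeatedly nudging w against the gradient of the mean squared error; the learning rate and iteration count are arbitrary illustrative choices.

```python
# Gradient descent on a simple loss: mean squared error of a
# one-parameter model y = w * x.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]   # roughly y = 2x (invented)

def loss_gradient(w):
    # d/dw of mean((w*x - y)^2) = mean(2 * (w*x - y) * x)
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)

w, learning_rate = 0.0, 0.05
for _ in range(200):
    w -= learning_rate * loss_gradient(w)   # step downhill on the loss

print(round(w, 2))   # settles near 2, the slope that minimizes the loss
```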

Artificial intelligence is at one end of the spectrum of intelligent automation, while
robotic process automation (RPA), the science of software robots that mimic human
actions, is at the other. One is concerned with replicating how humans think and
learn, while the other is concerned with replicating how humans do things. Robotics
develops complex sensorimotor functions that give machines the ability to adapt to
their environment. Robots can sense the environment using computer vision.

Robots are used in the global manufacturing sector in assembly, packaging and customer service, and are sold as open-source robotics platforms that users can teach custom tasks. Collaborative robots—or cobots—are robots that are designed to physically interact with humans in a shared workspace. They can be valuable to organizations that wish to eliminate human participation in dirty, dull and/or dangerous tasks.

The main idea of robotics is to make robots as autonomous as possible through learning. Despite not achieving human-like intelligence, there are still many successful examples of robots executing autonomous tasks, such as swimming, carrying boxes, and picking up objects and putting them down. Some robots can learn decision making by making an association between an action and a desired result. Kismet, a robot at M.I.T.’s Artificial Intelligence Lab, is learning to recognize both body language and voice and how to respond appropriately.

Computer vision
Computer vision is defined as computers obtaining a high-level understanding from digital images or videos—in other words, image recognition. It is a fundamental component of many IoT applications, including household monitoring systems, drones, and car cameras and sensors. When computer vision is coupled with deep learning, it combines the best of both worlds: optimized performance paired with accuracy and versatility. Deep learning allows IoT developers greater accuracy in object recognition.

Machine vision takes computer vision one step further by combining computer vision
algorithms with image capture systems to better guide robot reasoning. An example of computer vision is a computer being able to “see” a unique set of stripes on a UPC, scan it, and recognize it as a unique identifier. Optical character recognition (OCR)
uses image recognition of letters to decipher paper printed records and/or
handwriting despite a multitude of different fonts and handwriting variations across
people. Another example is how Apple’s Face ID allows your iPhone to recognize only your face to unlock the screen. A machine can use image recognition to interpret input it receives through computer vision and categorize what that input is. With training, its computer vision can learn to recognize input in different states, much as humans do. Computer vision can also enable machine-assisted moderation of images.

A Quick Guide to Natural Language
Processing (NLP)
Tags: natural language processing machine learning unstructured data

AIIA Editorial Team


Natural Language Processing (NLP) is an area of research and application that explores
how computers can be used to understand and manipulate natural language text or
speech. NLP is a sub-field of Artificial Intelligence (AI) that is focused on enabling
computers to understand and process human languages, and to get computers closer
to a human-level understanding of language. Computers don’t yet have the same
intuitive understanding of natural language that humans do. There is a big difference
between the way humans communicate with one another, and the way we “talk” with
computers. When writing programs, we have to use very careful syntax and structure,
but when talking with other people, we take a lot of liberties. We make short sentences. We make longer sentences, we layer in extra meaning, we use puns and sarcasm. We find multiple ways to say the same thing.

That being said, recent advances in Machine Learning (ML) have enabled computers to
do quite a lot of useful things with natural or human language. Deep learning has enabled us to write programs that perform tasks like language translation, semantic understanding, and text summarization (e.g. in Apple’s Siri, Amazon’s Alexa, Google Home, etc.). As AI becomes ubiquitous by finding its way into more and more of our
devices and tasks, it becomes critically important for us to be able to communicate
with computers in the language we’re familiar with. We can always ask programmers
to write more programs, but we can’t ask consumers to learn to write code just to ask
Siri for the weather. Consumers have to be able to speak to computers in their
“natural” language.

The Necessities & Challenges of NLP
A lot of information in the world is unstructured, i.e., raw text in English or another human language. As long as computers have been around, programmers have been trying to write programs that understand languages like English. Soon after the first appearance of ‘electronic calculators’, research began on using computers as aids for translating natural languages. The beginning may be dated to a letter in March 1947 from Warren Weaver of the Rockefeller Foundation to cyberneticist Norbert Wiener.
Two years later, Weaver wrote a memorandum (July 1949), putting forward various
proposals, based on the wartime successes in code breaking, the developments by
Claude Shannon in information theory and speculations about universal principles
underlying natural languages. The reason behind this continual interest in NLP is
pretty obvious — humans have been writing things down for thousands of years and it
would be really helpful if a computer could read and understand all that data.

Human language is one of the most diverse and complex parts of our nature, considering that humans speak over 6,500 different languages around the world. It has been estimated that almost 80% of available big data from text messages, tweets, Facebook posts, etc. is in unstructured or natural language form. The process of reading and understanding English is very complex, and that is not even considering that English doesn’t follow logical and consistent rules. Many things go into truly understanding what a piece of text means in the real world. For example, what do you think the following piece of text means? “I was on fire last night and completely destroyed the other team!” We humans will get the layered meaning behind this statement quite easily and almost intuitively. Our evolved brains will effortlessly make a connection between the phrases ‘other team’ and ‘was on fire’ to conclude that the speaker is most probably talking about some kind of sport that she excelled at the night before. This is not the case for (regular) computers, which use structured data (strict algorithms) as a means of making sense of things. To a computer, the sentence “I was on fire last night and completely destroyed the other team” would literally mean that the speaker was lit with fire, like actual fire, and that she literally destroyed other people by burning them!

NLP with Machine Learning

Computers using regular non-AI algorithms are not very capable of processing natural language. Even where they can, it is not efficient, because a new algorithm would have to be written for any new document or set of words. This is where machine learning comes to the rescue. Thanks to recent developments in ML, we can do some really clever things to quickly extract and understand information from natural language: instead of hand-coding rules for every case, we give a model rules it can apply to understand any new phrase or document.

For example, take a look at the following paragraph taken from Wikipedia:

“Vancouver is a coastal seaport city in western Canada, located in the Lower

Mainland region of British Columbia. As the most populous city in the province, the
2016 census recorded 631,486 people in the city, up from 603,502 in 2011. The
Greater Vancouver area had a population of 2,463,431 in 2016, making it the third-
largest metropolitan area in Canada. Vancouver has the highest population density in
Canada with over 5,400 people per square kilometer, which makes it the fifth-most
densely populated city with over 250,000 residents in North America behind New
York City, Guadalajara, San Francisco, and Mexico City, according to the 2011 census.”

This paragraph contains a lot of useful information about a city in Canada and, most importantly, this information is not in computer code; it is in unstructured, natural form. ML can be implemented to make sense of this paragraph and extract useful information from it.
Doing anything complicated in machine learning usually means building a pipeline.
The idea is to break up your problem into very small pieces and then use ML to solve
each smaller piece separately. Then by chaining together several ML models that feed
into each other, you can do very complicated things, like understanding the nuances
of human language.

The basic steps that any ML model follows in order to build an NLP pipeline are the following:
 Step 1: Sentence Segmentation

The first thing the ML model does is break the given paragraph into separate sentences. This is quite intuitive in the sense that even human beings tend to do the same thing: they try to process the meaning from one line to the next.
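A naive segmenter illustrates this step: split wherever a sentence-ending mark is followed by whitespace. Real segmenters must also handle abbreviations like “Dr.”; this sketch deliberately ignores that complication.

```python
import re

# Naive sentence segmentation: split on ., ! or ? followed by whitespace.
def split_sentences(paragraph):
    parts = re.split(r'(?<=[.!?])\s+', paragraph.strip())
    return [p for p in parts if p]

text = ("Vancouver is a coastal seaport city in western Canada. "
        "As the most populous city in the province, it keeps growing. "
        "It is a major hub.")
print(split_sentences(text))   # three sentences
```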

 Step 2: Word Tokenization

After separating the different sentences, the next step is to extract the words
from each sentence one by one. The algorithm for tokenization can be as
simple as identifying a word every time a ‘space’ is noticed. In the first
sentence of the given paragraph, the tokenized words will be the following:
“Vancouver”, “is”, “a”, “coastal”, “seaport”, “city”, “in”, “Canada”, “,”, “located”, “in”, “the”, “Lower”, “Mainland”, “Region”, “of”, “British”, “Columbia”.
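A slightly sturdier tokenizer than splitting on spaces treats punctuation marks as tokens of their own, which is why the comma appears in the list above. A minimal sketch:

```python
import re

# Tokenize into words and standalone punctuation tokens.
def tokenize(sentence):
    return re.findall(r"\w+|[^\w\s]", sentence)

print(tokenize("Vancouver is a coastal seaport city, located in Canada."))
```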

 Step 3: ‘Parts of Speech’ Prediction
As the name suggests, this step involves identifying whether a word is a noun, verb, or adjective, i.e., its part of speech. Identifying the part of speech of a word helps the ML model understand the role it plays in the sentence. It’s important to highlight that the ML model does not actually understand the word’s meaning in the context of the sentence like a human being would. The model first has to be fed a lot of data, i.e., millions of English sentences along with the correct tag for each word’s meaning and part of speech. This, in essence, is the main characteristic of AI and deep learning. In the first sentence, our ML model will identify the word “Vancouver” as a proper noun by applying the basic rules of the English language.
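A toy tagger makes the mechanics concrete. The tiny hand-made lexicon below is a hypothetical stand-in for the millions of tagged sentences a real model learns from; unknown capitalized words fall back to a proper-noun guess.

```python
# Hypothetical mini-lexicon standing in for learned tag statistics.
LEXICON = {"is": "VERB", "a": "DET", "the": "DET", "in": "ADP",
           "coastal": "ADJ", "city": "NOUN", "seaport": "NOUN"}

def tag(tokens):
    tagged = []
    for tok in tokens:
        if tok in LEXICON:
            tagged.append((tok, LEXICON[tok]))
        elif tok[0].isupper():
            tagged.append((tok, "PROPN"))   # capitalized -> proper noun guess
        else:
            tagged.append((tok, "NOUN"))    # default guess
    return tagged

print(tag(["Vancouver", "is", "a", "coastal", "city"]))
```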

 Step 4: Text Lemmatization

This step teaches the ML model to figure out the most basic form, or lemma, of each word in a sentence. For example, the words (or more appropriately strings) “horse” and “horses” might be processed by an ML model as words with two completely different meanings. But in reality, this is not the case: a human being will not consider “horse” and “horses” to be two different words.
 Step 5: ‘Stop Words’ Identification

Next, the importance of each word in the sentence is identified. English has a lot of filler words that appear very frequently, like “and”, “the”, and “a”. When doing statistics on text, these words introduce a lot of noise since they appear way more frequently than other words. Some NLP pipelines will flag them as stop words, i.e., words that you might want to filter out before doing any statistical analysis.
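Filtering stop words before counting term frequencies can be sketched in a few lines. The stop list here is a small illustrative subset; real lists are much longer.

```python
from collections import Counter

# Illustrative subset of an English stop-word list.
STOP_WORDS = {"a", "an", "and", "the", "is", "in", "of", "to"}

def content_words(tokens):
    # Keep only the words that carry content.
    return [t for t in tokens if t.lower() not in STOP_WORDS]

tokens = "the city is in the west of the province".split()
print(Counter(content_words(tokens)).most_common())
```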

 Step 6: Dependency Parsing

In this step the model uses the grammatical rules of the English language to figure out how the words relate to one another. For example, in the first sentence of our paragraph, the ML model will identify the word “is” as the root connecting the proper noun “Vancouver” and the noun “city”, hence extracting the simple meaning: Vancouver is a city!
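A drastically simplified stand-in for a dependency parser: in a (proper noun, “is”, noun) pattern, treat the verb as the root linking subject and complement. Real parsers build full grammatical trees; this sketch only recovers that one triple.

```python
# Extract a (subject, "is", complement) triple from POS-tagged tokens.
def root_triple(tagged):
    for i, (word, pos) in enumerate(tagged):
        if word == "is" and i > 0 and i + 1 < len(tagged):
            subject = next((w for w, p in tagged[:i] if p == "PROPN"), None)
            complement = next((w for w, p in tagged[i + 1:] if p == "NOUN"), None)
            if subject and complement:
                return (subject, "is", complement)
    return None

tagged = [("Vancouver", "PROPN"), ("is", "VERB"), ("a", "DET"),
          ("coastal", "ADJ"), ("city", "NOUN")]
print(root_triple(tagged))   # -> ('Vancouver', 'is', 'city')
```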

 Step 7: Entity Analysis

Entity analysis will go through the text and identify all of the important words
or “entities” in the text. When we say “important”, what we really mean is words that have some kind of real-world semantic meaning or significance. The
ML model will categorize the words in each sentence as one of the following:
Place, Organization, Person, Date, Location, Events, Sum of money etc.
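A rough entity spotter can be sketched with a regular expression that treats runs of capitalized words as candidate entities. Real entity recognizers are learned models that also assign types (Person, Place, Date, and so on); this heuristic only finds the candidate spans.

```python
import re

# Runs of capitalized words become candidate named entities.
def candidate_entities(text):
    return re.findall(r"(?:[A-Z][a-z]+)(?:\s+[A-Z][a-z]+)*", text)

print(candidate_entities(
    "Vancouver is the most populous city in British Columbia."))
```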

 Step 8: Pronouns Parsing

This is one of the toughest steps for the ML model to carry out. This step
requires the ML model to keep track of the pronouns with respect to the
context of the sentence. Simply put, we want our ML model to understand that
the word “it” in a sentence is referring to let’s say the place “Vancouver”. In
English we use words (pronouns) such as “it”, “he” or “she” as substitutes for
names of people and places. Humans can understand the meaning simply from
the context of the sentence, but computers can’t. Hence a ML model needs to
be fed a lot of data along with the correct tags for it to learn how to identify
the effect of the pronouns in a sentence.
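A naive resolver makes the difficulty visible: link each pronoun to the most recently mentioned candidate entity (here, simply the last capitalized non-pronoun word). Real coreference systems learn this from tagged data; this heuristic often fails, which is exactly why the step is hard.

```python
PRONOUNS = {"it", "he", "she", "they"}

# Map each pronoun's token index to (pronoun, most recent entity).
def resolve(tokens):
    last_entity, links = None, {}
    for i, tok in enumerate(tokens):
        if tok[0].isupper() and tok.lower() not in PRONOUNS:
            last_entity = tok
        elif tok.lower() in PRONOUNS and last_entity:
            links[i] = (tok, last_entity)
    return links

tokens = "Vancouver is a coastal city and it has a busy seaport".split()
print(resolve(tokens))   # "it" links back to "Vancouver"
```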

Using NLP to Analyze Human Sentiments

One of the most exciting applications of NLP is known as ‘sentiment analysis’. It is the process by which computers (using NLP) not only understand the literal meaning of words but also extract the emotions behind them. Sentiment analysis, also known as ‘opinion mining’, is the automated process of understanding an opinion about a given subject from written or spoken language.

Besides identifying the opinion, these systems extract the following three attributes of an opinion:

 Polarity: if the speaker expresses a positive or negative opinion,

 Subject: the thing that is being talked about,
 Opinion holder: the person, or entity that expresses the opinion.

As a simple example let’s consider giving the following sentence to an NLP model:
“How the hell could you do this to me?”

If we were to use the basic model of NLP, then a computer will have no problem
identifying that the sentence is in the form of a question. But that is clearly not
enough information about the sentence. Any human being can easily make a
connection between the words ‘hell’, ‘to me’ and the ‘?’ mark at the end of the
sentence to realize that the person uttering these words is either not happy or not satisfied, or maybe even furious. This is where sentiment analysis comes in and
enables the NLP model to truly understand both the literal and emotional message
behind a phrase.
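The simplest form of this is a lexicon-based polarity scorer: sum small positive and negative weights for known opinion words. The lexicon below is a tiny hypothetical sample; real sentiment systems learn far richer signals, including negation and context.

```python
# Hypothetical mini-lexicon of opinion words and their polarity weights.
LEXICON = {"love": 2, "great": 1, "happy": 1,
           "hell": -2, "terrible": -2, "furious": -2}

def polarity(text):
    # Strip trailing punctuation, look each word up, and sum the weights.
    score = sum(LEXICON.get(w.strip("?!.,").lower(), 0)
                for w in text.split())
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(polarity("How the hell could you do this to me?"))
print(polarity("I love this great product"))
```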

Currently, sentiment analysis is a topic of great interest and development since it has many practical applications, such as social media monitoring, product analytics, market research and analysis, brand monitoring, and workforce analytics. Effective
sentiment analysis means understanding people better and more accurately than ever
before, with far-reaching implications for marketing, research, politics and security.

Sentiment analysis has moved beyond merely an interesting, high-tech whim, and will
soon become an indispensable tool for all companies of the modern age. Ultimately,
sentiment analysis enables us to collect new insights, better understand our
customers, and empower our own teams more effectively so that they do better and
more productive work.

A Quick Guide to Deep Learning
Tags: Deep Learning Artificial Intelligence neural networks

AIIA Editorial Team


What is Deep Learning?

Deep learning is an approach to machine learning (ML), which falls under artificial intelligence (AI), and is most commonly used to label complex data. Machine learning can use neural networks, computer systems modeled after the human brain, to process information. A neural network is an algorithm designed to recognize patterns, calculate the probability of a certain outcome occurring, and “learn” through errors and successes using a feedback loop.

The algorithms that are utilized in deep learning are best suited for cases where the
outputs will always be the same. For example, deep learning algorithms can be
trained to classify a cat at a high rate of success, because a cat is a static output. The
strength in deep learning lies in its ability to extract features from data to classify.
Approaching predictive analytics with artificial neural networks is less successful
due in part to the changing seas of prediction. Predictive analytics takes historical
data and garners insight from that data to predict future outcomes. This is why deep
learning is best deployed in computer vision, speech recognition, medical
diagnostics, and other categorization applications.

The concept of deep learning—albeit not the term, which came later—originated in
the 1950s when artificial neural networks took shape. These algorithms were simple
and populated by hand. As computing power grew, the field of big data began to feed
these deep learning algorithms with larger and larger datasets, leading to faster and
more accurate results.

Cloud Computing
The immense amount of data, or big data, that goes into deep learning has outgrown personal computer or even local server capacities. Cloud computing remotely connects multiple servers. Advancements in artificial intelligence have stalled several times in history; these periods are known as AI winters. Lack of processing power was one of the main causes of AI winters. The power of cloud computing helped kickstart the next round of artificial intelligence advancements. Some predict a long AI spring.

What is an Artificial Neural Network (ANN)?

Deep learning relies on artificial neural networks (ANN) to receive, process, and
deliver data. An artificial neural network is a framework modeled after the human
brain. This framework consists of inputs, outputs, and hidden layers. Artificial
neural networks have artificial neurons, or nodes, that weigh input data and
categorize aspects of that data, connect to other nodes, and feed it to the next
hidden layer until an output is achieved.
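As a rough illustration of that flow, here is a minimal forward pass in Python; the layer sizes and weight values are invented for the sketch, not taken from any trained model:

```python
import math

def sigmoid(x):
    # squashes a weighted sum into the 0..1 range
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_weights, output_weights):
    # each hidden node computes a weighted sum of all inputs
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)))
              for ws in hidden_weights]
    # the output node weighs the hidden activations the same way
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))

# toy network: 2 inputs -> 2 hidden nodes -> 1 output
hidden_weights = [[0.5, -0.2], [0.3, 0.8]]
output_weights = [1.0, -1.0]
score = forward([1.0, 0.5], hidden_weights, output_weights)
```

Training amounts to finding weight values that make such scores match known answers; the structure of the pass itself stays the same.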

 Input Layer - The input layer in an artificial neural network is simply
the information provided to the network. If these inputs are labeled, the
learning model is considered supervised. If the inputs are unlabeled, the
machine learning algorithm must categorize them through pattern analysis, which is
called unsupervised learning.

Preprocessing input data is a critical step not to be undervalued. When training a
deep network, accuracy is sacrificed if the input data isn’t first processed. If an image
of a pig is fed into the input layer when training the algorithm to distinguish between
cats and dogs, the neural network becomes polluted with unexpected data in the
supervised learning model. Similarly, if hairless cats aren’t accounted for in the input
data, the cat classification algorithm is missing an input that could help it correctly
classify hairless cats in the future.

 Hidden Layers - It is in the hidden layers that all the “thinking” happens.
Hidden layers can be complex, plentiful, and, thanks to modern-day
computing power, able to process high amounts of data. Each connection
between nodes in a hidden layer carries a numerical weight. The heavier the
weight, the stronger the connection to the next node. For example, if an ANN
is classifying cats, a node representing a curly tail garners a lower weight than
a node representing a long, fuzzy tail.

 Output Layer - The output layer in an ANN is the conclusion of the data. The
algorithm has now taken the input data, weighed and dispersed it to nodes
throughout the hidden layers, and disseminated the information into an output.

There are several different types of artificial neural networks, each with specific use
cases. Additionally, ANNs can work on top of one another and feed into one another,
correcting mistakes and reweighing outputs on a large scale and to a granular level.

Feedforward Neural Network

The feedforward neural network is the simplest neural network, in which data only
flows through connected nodes in one direction. There are no loops, dead ends, or
backflow. The feedforward neural network is the foundation for other neural
network frameworks. Due to their simplistic nature, feedforward neural networks
(FNNs) work fine for general classification of datasets. While the science behind
the FNN is still applicable, with current computing and data analytics power it is
now commonplace to layer FNNs with other neural networks or to add additional
features into the algorithm.

Convolutional Neural Network

Just as neural networks are loosely modeled after the brain, convolutional neural
networks are inspired by the visual cortex. A CNN sees images as RGB pixels
represented as numbers, and processes width, height, and depth. Hidden layers
examine features and create a feature map which is then passed on to the next
hidden layer. These hidden layers eventually merge with their newfound knowledge
to create an output, or conclusion.

CNNs require large amounts of data in order to avoid overfitting. Overfitting occurs
when a deep learning model fits its training inputs too closely and fails to
generalize to new data. The goal is to provide a CNN with enough data to allow it to
uncover, recognize, and apply general patterns. For example, if just a few cat and
dog images are fed into a CNN, overfitting can result in a narrow notion of cat
versus dog. With enough data, a CNN can discriminate between a cat and a dog based
on unique and highly specific features.
Convolutional neural networks are responsible for image recognition, video analysis,
and even drug discovery through the creation of 3D molecule and treatment models.
Where humans have difficulty with granular classifications, such as dog breeds, CNNs
excel. However, CNNs struggle to process small or thin images in a way humans do not.

Recurrent Neural Network

As its name implies, a recurrent neural network (RNN) is able to take output data
and loop it back into the ANN as input data. Because of this memory, RNNs are able
to see the bigger picture and make data generalizations in a more predictable way.

Long short-term memory networks (LSTMs) are a type of RNN whose memory is
efficiently directed and looped throughout the hidden layers. Simply put, an LSTM
opens and closes gates between nodes to let relevant information from previous
nodes in and filter out unnecessary noise, allowing an RNN to extend its memory
over longer periods of time.
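The looping described above can be sketched as a bare-bones recurrence in Python; the weights are invented for illustration, and there is no LSTM gating here, only the basic feedback that gives an RNN its memory:

```python
import math

def rnn_step(x, h, w_in=0.5, w_rec=0.9):
    # the new hidden state mixes the current input with the
    # previous hidden state -- this feedback loop is the "memory"
    return math.tanh(w_in * x + w_rec * h)

h = 0.0
for x in [1.0, 0.0, 0.0, 0.0]:
    h = rnn_step(x, h)
# h still carries a trace of the first input several steps later
```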

Language modeling and prediction is a use case of RNNs. The input is usually a
sequence of words, and the output is a predicted sequence of words that matches the
style of the input. For example, RNNs fed large sets of Shakespeare’s works as input
have produced convincing Shakespeare-style text.

Speech recognition is another use of RNNs, with audio as the input. Just as the
human brain learns and memorizes rules, such as that the letter q is nearly always
followed by the letter u, an RNN working on speech recognition can learn syntax
and use this knowledge to recognize and predict speech.

Activation Functions
In order to process information quickly and accurately, an artificial neural network
needs to decide whether a particular feature of its input is relevant or irrelevant.
Activation functions work as on/off switches for a particular node, deciding whether
to weigh it and connect it to another node or to ignore the node completely.
Continuing with the cat/dog example, an activation function turns off the node that
categorizes inputs as “four legged.” Since both cats and dogs have four legs,
continuing to progress that node through the ANN layers dilutes the processing power
of the ANN.
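That on/off behavior can be sketched with the common ReLU activation; the feature names and signal values below are hypothetical, invented only to mirror the cat/dog example:

```python
def relu(x):
    # negative (irrelevant) signals are switched off; positive ones pass through
    return max(0.0, x)

# hypothetical node signals after weighting: "four legged" carries no
# discriminating information between cats and dogs, so its weighted
# signal ends up negative and ReLU silences it
signals = {"four_legged": -0.8, "retractable_claws": 1.3, "barks": 0.6}
activated = {name: relu(s) for name, s in signals.items()}
```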

Backprop Algorithm
To account for errors discovered within an ANN, the backward propagation of errors,
or backprop, algorithm was born. This is a supervised learning model in which
incorrect outputs are corrected and the corrections pushed back into the previous
layers. For example, if an ANN weighed a dog’s tail too heavily when working to
classify cats, the algorithm adjusts the weight accordingly until the correct output
is achieved.
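The correction step can be sketched for a single weight with a squared-error gradient update; the numbers are made up, and real backprop applies the same idea through every layer via the chain rule:

```python
def update_weight(weight, x, target, learning_rate=0.1):
    # forward pass: prediction from a single weighted input
    prediction = weight * x
    error = prediction - target
    # gradient of squared error 0.5 * error**2 with respect to the weight
    gradient = error * x
    # push the weight in the direction that reduces the error
    return weight - learning_rate * gradient

w = 2.0  # an overweighted "dog tail" feature
for _ in range(50):
    w = update_weight(w, x=1.0, target=0.5)
# after repeated corrections, w settles near the target-producing value 0.5
```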

The Future of Deep Learning
Deep learning is evolving exponentially. Smart healthcare, education, smart cities,
IoT, and autonomous vehicles are all fields that the application of deep learning
could change as we know them.

The Internet of Things (IoT) connects the world around us to the internet through
sensors, communication devices, and the cloud. Autonomous cars, medical
innovations like ingestible sensors, and advanced robotics are slowly becoming a
reality in part due to deep learning. If artificial neural networks are modeled after
the human neural network, it can be said that sensor technology will become the
eyes and ears that feed inputs to those networks.

Deep learning is highly complex, both conceptually and mathematically. Technological
advancements and software as a service (SaaS) have opened up the world of deep
learning to businesses and laypeople alike. As computing power and cloud computing
advance, the only limit to the power of deep learning, and artificial intelligence in
general, will be the imagination of the human mind.

A Quick Guide to Machine Learning (ML)
Machine Learning explained, whether you're a beginner
or a corporate enterprise practitioner

Tags: Machine Learning Artificial Intelligence Big Data

Seth Adler

What is machine learning?

Artificial intelligence (AI) and machine learning (ML) are terms that are often used
interchangeably in data science, though they aren’t the same thing. Machine
learning is a subset of AI built on the idea that data scientists should give machines
data and allow them to learn on their own. Machine learning uses neural networks,
computer systems modeled after how the human brain processes information. A neural
network is an algorithm designed to recognize patterns, calculate the probability of
a certain outcome occurring, and “learn” through errors and successes using a
feedback loop.

Neural networks are a valuable tool, especially for neuroscience research. Deep
learning, a term often used interchangeably with neural networks, can establish
correlations between two things and learn to associate them with each other. Given
enough data to work with, it can predict what will happen next.

Supervised and unsupervised learning

There are two frameworks of ML: supervised learning and unsupervised learning. In
supervised learning, the learning algorithm starts with a set of training examples that
have already been correctly labeled. The algorithm learns the correct relationships
from these examples and applies these learned associations to new, unlabeled data it
is exposed to. In unsupervised learning, the algorithm starts with unlabeled data. It is
only concerned with inputs, not outputs. You can use unsupervised learning to group
similar data points into clusters and learn which data points have similarities. In
unsupervised learning, the computer teaches itself, whereas in supervised learning,
the computer is taught by labeled data. With the rise of Big Data, neural networks
are more important and useful than ever for learning from these large datasets.
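The practical difference shows up in what the algorithm receives. A toy sketch (all feature values invented) contrasts a labeled dataset with an unlabeled one, plus the kind of nearest-point grouping an unsupervised method might start from:

```python
# supervised: every example arrives with its correct label
labeled = [([1.2, 0.7], "cat"), ([0.3, 2.1], "dog")]

# unsupervised: only inputs; the algorithm must find structure itself,
# e.g. by grouping points around whichever existing point is nearest
unlabeled = [[1.1, 0.8], [0.2, 2.0], [1.3, 0.6]]

def nearest(point, candidates):
    # squared Euclidean distance is enough for picking the closest point
    return min(candidates,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(point, c)))
```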

Deep learning is usually linked to artificial neural networks (ANN), variations of
which stack multiple neural networks to achieve a higher level of perception. Deep
learning is being used in the medical field to accurately diagnose more than 50 eye
diseases.
Uses of machine learning

Predictive analytics is composed of several statistical techniques, including machine
learning, to estimate future outcomes. It helps analyze future events based on
outcomes from similar events in the past. Predictive analytics and machine
learning go hand in hand because the predictive models used often include a machine
learning algorithm. Neural networks are one of the most widely used predictive
modeling techniques.

Cognitive computing is the blanket term used for receiving data, analyzing it, and
building actionable insights from that data, much like the human brain would. Big
Data, cloud computing, and machine learning all fall under cognitive computing.

Because business is all about solving the same problem with different targets,
products, or services, creating one flexible machine learning model that is able to
repeat tasks is imperative.

Machine learning tools
Computational learning theory (COLT) makes predictions based on past data. This is
applicable in today’s machine learning environment because it helps the user define
useful data and avoid irrelevant data, which speeds up the machine learning process
and decreases the chance of incorrect outputs. Data isn’t just computed; with
COLT, patterns are recognized and rules are developed, such as how many training
examples are necessary or how much time a problem will take to solve.

Pattern recognition is a tool used by machine learning to define and refine algorithms.
It can operate with tangible patterns through computer vision, which receives inputs
from the visual world and reacts to them. As it relies completely on data, pattern
recognition is utilized to present data and theoretical predictions which are then built
upon by other branches of machine learning. Pattern recognition in technology gained
momentum in the 1970s and opened the door to heuristic and decision tree methods.

Pattern recognition is only viable in the context of machine learning if it can identify
patterns quickly and accurately and is able to correctly identify a partially covered
image or an object from several angles. Such applications for this type of pattern
recognition include autonomous vehicles and cancer screening, wherein the patterns
are detected by pattern recognition technology but acted upon by the broader scope
of human or artificial intelligence. As so, the term pattern recognition is being used
less often and instead falls under the broader scope of machine learning and deep

Cluster analytics, or clustering, is a mechanism of machine learning that groups data
sets with similar characteristics together. Clustering can utilize multiple different
algorithms and parameters when going on its fact-finding mission, which often leads
to different data sets. These algorithms are useful when grouping the same data set
into different clusters. These clusters then act as data sets in other machine learning
capacities, such as computer vision, natural language processing, and data mining.
While cluster learning can be helpful in identifying target groups, its power goes
beyond simple surveying. It currently operates as a predictive tool in cybersecurity, as
it is able to cluster and identify malicious URLs and spam keywords.

Clustering falls under unsupervised learning and can also be used on its own for data
distribution insights, such as how certain demographics poll politically. This data is
then fed into other algorithms, such as marketing algorithms that target political
demographics. Clustering is currently being leveraged in this way by deep learning
applications.

Metaheuristic algorithms
Because ML is only beneficial if the time it takes to compute delivers a return on
investment, metaheuristic algorithms were developed to cut down on the
computational time of an algorithm. While precision is sometimes sacrificed, a
general answer computed in a short time frame is sufficient for certain use cases. A
heuristic is a machine learning shortcut that arrives at approximations in the case of
undefinable exact solutions or for the sake of time management, prioritizing speed
over perfection. In a heuristic, branch paths are weighted so time isn’t spent
traveling down every branch repeatedly for the sake of generating new data and
coming to precise solutions. Instead, a heuristic works on preset conditions, such as a
time limit or an estimation based on a smaller dataset. For example, a heuristic could
be defined to count the blue crayons in a crayon box. Its estimation would include sky
blue, royal blue, and cerulean, which are perfectly acceptable parameters, although
not exact.
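The crayon example can be written as a literal heuristic in Python; the word list and crayon names are invented for illustration:

```python
# heuristic rule: anything whose name mentions a known blue-ish word counts
BLUE_WORDS = ("blue", "cerulean", "navy", "azure")

def count_blueish(crayons):
    # a fast approximation -- no color science, just a name check
    return sum(any(word in name.lower() for word in BLUE_WORDS)
               for name in crayons)

box = ["Sky Blue", "Royal Blue", "Cerulean", "Scarlet", "Forest Green"]
```

The rule misses blue-ish names outside its word list and would miscount a hypothetical "Navy Bean" crayon, which is exactly the speed-for-precision trade the text describes.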

Automated machine learning

Historically, machine learning was a time-consuming and expensive process reserved
for the large corporations and organizations that could afford data scientists,
mathematicians, and engineers. As it has evolved, systems such as autonomic
computing and automated machine learning have driven down the complexity and
costs of machine learning processes through third-party software.

No longer is it necessary for data scientists to create complicated algorithms on a
case-by-case basis for the execution of machine learning. Much like how HTML can be
coded through simpler block-based tools, increasing its accessibility and usability to
the layperson, automated machine learning provides the building blocks of machine
learning with model presets. The user then plugs in the appropriate data categories.
Those categories get automatically populated, and the model is able to build on itself
and act accordingly and in real time with adaptive algorithms. Adaptive algorithms
further enhance this process by including new data into output calculations. The
algorithms shift and refine themselves based on new input. For example, Google Maps
darkens in a tunnel or at night when its computer vision receives data that the
environment is dark. Because of their ability to process data as it comes in and give
less weight to old or irrelevant data, adaptive algorithms are also being used by
automated stock trading software.

Reinforcement learning
Reinforcement learning is a technique applied to automated machine learning (AML)
and is a cousin of unsupervised learning and supervised learning. Where unsupervised
learning provides output based on undefined inputs and supervised learning utilizes
labeled data sets, reinforcement learning repeats processes, abandoning paths that
lead to negative reinforcement and refining paths that lead to positive reinforcement.

In other words, an agent can practice and experiment toward the goal of a certain
outcome, constantly refining and optimizing its technique at a phenomenal speed.

The actor, action, and environment in reinforcement learning are defined, but the
optimal path is not. Reinforcement learning combined with deep learning is how
machine learning and artificial intelligence programs learn to beat human chess
pros. Real-world applications include targeted ads that increase click-through rates.

IT issues have grown in complexity with the advancement of technology. Because of
the utilization of complicated machine learning and artificial intelligence
principles, IT departments are at risk of performance slowdowns and human errors
that go undetected in a system’s hardware and software. With reinforcement learning
as its base, autonomic computing solves these dilemmas by using technology to
manage itself.

In autonomic computing, machine learning and reinforcement learning are used to
enable "self-" systems. Examples of self-systems include self-protected systems,
which automate cybersecurity, and self-healing systems, which can perform actions
such as downloading patches, deleting malware, or defragmenting a hard drive.
Autonomic computing is designed to operate like the human nervous system, running
in the background to monitor, fix, and maintain the programs and operating systems
that technology depends on.

Action model learning

Action model learning is a type of reinforcement learning where previously existing
algorithmic models are utilized. Where reinforcement learning runs the race quickly,
learning from successes and failures through trial and error, action model learning is
more “thoughtful” in that it can reason from new knowledge and predictive analytics,
allowing it to take educated shortcuts to the finish line.

Predictive analytics uses historical data to predict the future. Pattern recognition
reorganizes data based on like characteristics to enhance the accuracy of predictive
analytics. Using an eCommerce example, predictive analytics observes umbrellas
being purchased during the rainy season. Action model learning can take this
knowledge and apply it to online advertising by populating ads for umbrellas based
on the weather forecast. Manually customizing ads in this way is time-consuming
and, at the scale of the eCommerce world, nearly impossible.


The scope and definition of machine learning are constantly evolving with technology.
As new applications and resources are developed to deploy the power of machine
learning, its accessibility and utilization in the broader population continues to be
observed, assessed, and refined.

A Quick Guide to Predictive Analytics
Predictive analytics defined, explained and simplified

Tags: AI ML Predictive Analytics

Seth Adler

What is Predictive Analytics?

In its simplest form, predictive analytics takes historical data and garners insight
from that data to predict future outcomes. Because of the massive computing power
available today, Big Data has unlocked the power of predictive analytics in new and
actionable ways. Human-built algorithms no longer need to be populated by hand.
Data mining, artificial intelligence, and machine learning do those laborious, time-
consuming tasks at a scale and cost humans never could. Now, human capital channels
into the business intelligence (BI) field, creating a model that best suits an
organization and adapts to its goals. Predictive analytics is being deployed across all
industries, from healthcare and banking to astrology and dating, with incredible
results.
How Machine Learning (ML) Enhances Predictive Analytics
While the two terms are often used interchangeably, predictive analytics is an area of
study that has been around since the 1940s. Machine learning automates the process
and allows the predictive analytic outputs to change and evolve with the access to
new internal and external data. Machine learning can do predictive analytics, but
machine learning is not predictive analytics. Other uses of machine learning outside
of predictive analytics include natural language processing and facial recognition.
Machine learning encompasses unsupervised and supervised learning models.

 Unsupervised learning provides the ML algorithm with no predefined data. In
unsupervised learning, the unstructured data is examined for patterns, then
grouped accordingly.

 Supervised learning provides an output goal to the ML algorithm. The training
dataset is labeled.

Data Mining
Data mining discovers hidden patterns within massive amounts of data, which are then
leveraged by predictive analytics. A dataset is a collection of the mined data, and Big
Data is the term coined to describe massive datasets. Data is collected from
everywhere today. In 2017, 2.5 exabytes (EB) of data was created daily. One billion
gigabytes (GB) make up an exabyte. Data comes from sensors, social media, club
cards, transactional data, and self-reported data, to name a few.

Breast cancer detection is a good example of leveraged data mining and Big Data.
Large swaths of breast cancer scans are collected through Big Data processes, and
patterns are unearthed through data mining. These patterns inform doctors or
artificial intelligence (AI), which can then non-invasively spot cancer earlier and
more accurately than a single scan alone. Mined data is also used in the business
world to spot customer trends and recognize fraudulent activities. Fierce competition
can be overcome through the correct predictive analytics strategy, and it starts with
data mining and Big Data.

Jamie Campbell, marketing lead at the financial services company Bud, shares their
approach to handling customer data (Source: The AIIA Network Podcast).

Because the amount of data available is so vast, data preparation, including data
cleansing, is a vital task to perform before data can be effectively plugged into a
predictive analytics algorithm.

Data Classification
Collected data alone is typically not actionable until it gets put into a data
matrix. A data matrix is a collection of data organized into rows and columns; think
Excel spreadsheets. The characteristics are usually stored in columns, and the
records are stored in rows. Data matrices help clean up data, removing outliers or
irrelevant data points, and sort and organize information into functional data, or
data classifications. From data classifications, data models can then be formed.
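A minimal sketch of that cleanup step, with invented rows and an invented outlier threshold:

```python
# a tiny data matrix: each dict is a row, each key a column
rows = [
    {"age": 34, "purchases": 12},
    {"age": 29, "purchases": 9},
    {"age": 31, "purchases": 480},  # likely a data-entry error
]

def drop_outliers(rows, column, limit):
    # keep only rows whose value in the given column is within the limit
    return [r for r in rows if r[column] <= limit]

clean = drop_outliers(rows, "purchases", limit=100)
```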

Data Modeling
After mining and sorting data, a mathematical procedure called an algorithm can apply
that data to models. Data modeling turns historical data into insights and predictions
by mapping past action and simulating future action. Data modeling takes some
creativity and experimentation. Once the model is built, its values can be changed to
help spot obvious or obscure trends and predictors. Data clusters and decision trees
are two common types of data models.

Data Clusters

Clustering is a machine learning technique that creates data models by grouping
data into sets with like characteristics. Data clusters offer one modeling avenue for
predictive analytics, predicting the future behavior or outcomes of a particular
cluster. There are different ways to cluster data, each with its own specific use cases.

 K-means clustering - A K-means clustering algorithm starts by randomly
assigning dataset points as cluster representatives. From there, other dataset
points are pulled into the cluster of the representative they are closest to.
New centers are calculated from the new data points, and the process is repeated
until the clusters remain the same. K-means algorithms use the unsupervised
learning method and are commonly deployed with large datasets. While K-means
clustering algorithms are the simplest form of data clustering, they are also
sensitive to outliers and produce only a general set of clusters.
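The assign-and-recalculate loop can be sketched in one dimension (toy points, two clusters); a real K-means implementation handles many dimensions and smarter initialization:

```python
def kmeans_1d(points, centers, rounds=10):
    for _ in range(rounds):
        # assignment step: each point joins its nearest center
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[idx].append(p)
        # update step: each center moves to the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
centers = kmeans_1d(points, centers=[0.0, 5.0])
# the centers settle near the two obvious groups around 1 and 9
```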

 Nearest neighbor clustering - K-nearest neighbor clustering (KNN) is a
supervised learning method that is commonly used for classification. The target
attribute is known beforehand, and clusters are built around those attributes
using a clustering model similar to k-means clustering.
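A minimal KNN classifier over toy labeled points (k = 3, coordinates invented for illustration) looks like this:

```python
from collections import Counter

def knn_predict(point, labeled_points, k=3):
    # sort the labeled examples by squared distance to the query point
    by_distance = sorted(labeled_points,
                         key=lambda item: sum((a - b) ** 2
                                              for a, b in zip(point, item[0])))
    # majority vote among the k closest neighbors
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

training = [([1.0, 1.0], "cat"), ([1.1, 0.9], "cat"),
            ([0.9, 1.2], "cat"), ([5.0, 5.0], "dog"),
            ([5.2, 4.8], "dog")]
```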

 Biologically inspired clustering - Just as machine learning is modeled after
human neural networks, clustering can also be modeled after nature. In data
analytics, “bird flocking” and “ant colonizing” algorithms are commonly used
to cluster data in an organic way. Essentially, these methods cluster groups
based on what keeps each data point away from each other, what keeps a data
point moving congruently with another, and which data points move together.
These algorithms are powerful predictive analytics tools, because with enough
data points, one person’s actions, such as their buying habits, can help predict
what their dataset peer group’s buying habits may be, allowing for a holistic
target marketing approach.

Decision Trees

A decision tree is a directed supervised learning model. Decision trees are less
susceptible to outliers than clustering, and they are simple to comprehend. The
model is tree-like
visually and structurally in that it starts at the root and branches out into leaves
based on varying factors. A decision tree looks like a flow chart and uses a rule-based
tactic in predictive learning. Observations are represented as branches, and outcomes
are represented as leaves or nodes. Decision trees are outcome-specific and use data
to reach particular conclusions, whereas clustering algorithms have no defined
direction and cluster accordingly.

 Classification Tree – In a classification tree, data points are already recognized
and defined. New data is then plugged into the algorithm and fed down the
tree until it reaches a definitive leaf. Animal classification is a descriptive
example. The limbs of the tree categorize by characteristics such as: does it
breathe air; does it lay eggs; does it have fur; et cetera. In this way, a
classification tree can take an unknown data point and come to certain
conclusions about it.
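Those questions map naturally onto nested rules; a hand-written sketch with illustrative questions and labels (a real classification tree learns its splits from data):

```python
def classify_animal(breathes_air, lays_eggs, has_fur):
    # each branch asks one question; the leaves are the conclusions
    if not breathes_air:
        return "fish"
    if lays_eggs:
        return "bird or reptile"
    if has_fur:
        return "mammal"
    return "unknown"
```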

By combining classification trees, an ensemble is born. Ensembles are supervised
learning algorithms made up of overlapping yet differentiated classification trees.
The power of ensembles lies in their ability to weed out outliers, decrease biases,
and develop a more robust predictive model.

 Regression Tree – A regression tree has no categories, because the variables
are continuous. A regression tree uses variables as inputs and numbers as outputs.
For example, based on age, weight, and sex, a regression tree can predict
mortality rates for patients with heart disease.

Classification and Regression Tree (CART) is the blanket term for decision tree
learning. Ultimately, a classification tree produces an A/B outcome, as in, yes it is a
mammal or no it is not; and a regression tree offers a continuous and numerical data
outcome, such as the forecasting of housing market prices.

Business Intelligence (BI) in Predictive Analytics

At its core, predictive analytics is a powerful computing tool. It is the practice of
business intelligence, however, that prioritizes data, takes aim at desired outcomes,
and designs solutions around the pattern recognition and forecasting that predictive
analytics provides. In this way, predictive analytics doesn’t replace a human’s
capacity to plan, strategize, and creatively execute solutions. The term business
intelligence covers not just the technologies used in predictive analytics and other
ML business applications, but also the strategies and best practices those
capabilities make possible.

BI software and platforms offer frontend services like dashboards, reports, and visual
tools in the form of graphs and charts. Such offerings allow a business to scale its
data insights companywide, leading to greater collaboration, the elimination of silos,
and a more agile workplace.

Use Cases for Predictive Analytics

The uses for predictive analytics are ever increasing. The field is no longer confined
by limited computing, processing, and storage capacity. Big Data, the cloud, and an
increase in processing power have allowed industries across the board to leverage the
power of predictive analytics and machine learning. In addition, third-party software
vendors now provide these services inexpensively, making them affordable to small
businesses and industries that don’t have access to their own mathematicians and
data scientists.

 Healthcare – Predictive analytics in healthcare has the power to take current
science and healthcare processes and expand them on a scale humans alone
cannot. Symptom calculators, genetic screening, and early intervention and
disease prevention all benefit from predictive analytics.

As science better understands the human genome, Big Data and predictive analytics
take this new dataset and offer medical professionals the ability to customize
healthcare options for patients based on their predicted future risks and needs.

 Retail – In brick-and-mortar retail, predictive analytics assists with inventory
management, pricing, and revenue forecasting. New and returning customers benefit
from the customized offerings predictive analytics allows, and retailers
decrease customer churn and increase customer loyalty as a result.

 eCommerce – Thanks largely to predictive analytics, eCommerce has exploded
in recent years. It is predictive analytics that is behind Netflix’s
recommendation engine, Amazon’s product suggestions, and Facebook’s
networking features.

Personalized ads on social media platforms are likewise made possible by predictive
analytics. Done correctly, both the customer and the e-tailer benefit: the customer
discovers new products and services organically and noninvasively, and the brand
strategically spends its marketing dollars in a highly specific and targeted way.
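A very simple flavor of such recommendation engines is co-occurrence counting: "customers who bought X also bought Y." The sketch below is a toy illustration (the order data and item names are invented; real engines like Netflix's or Amazon's are vastly more sophisticated): it counts how often item pairs appear in the same order and suggests the most frequent partner.

```python
from collections import Counter
from itertools import combinations

# Toy purchase history: each order is a set of items bought together.
orders = [
    {"laptop", "mouse"},
    {"laptop", "mouse", "bag"},
    {"laptop", "bag"},
    {"mouse", "pad"},
]

# Count how often each pair of items co-occurs across orders.
pair_counts = Counter()
for order in orders:
    for a, b in combinations(sorted(order), 2):
        pair_counts[(a, b)] += 1

def recommend(item: str) -> str:
    """Suggest the item most frequently bought alongside `item`."""
    partners = Counter()
    for (a, b), n in pair_counts.items():
        if a == item:
            partners[b] += n
        elif b == item:
            partners[a] += n
    return partners.most_common(1)[0][0]

print(recommend("laptop"))  # the item most often bought with a laptop
```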

 Banking – Financial institutions use predictive analytics to detect and prevent

fraud, to score credit, and to approve loans. Instantly signing up for a credit
card or loan online is possible because of predictive analytics.

 Cybersecurity – Predictive analytics can help discover data breaches far earlier
than human analysts can. It can also forecast the when and where of potential
cybercrimes, allowing organizations to concentrate their fortification and
detection efforts. While there is still a long way to go in this arena, with the
explosion of cybercrime, predictive analytics is shaping up to be a powerful
cybersecurity tool.
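One building block of such breach detection is statistical anomaly detection: learn a baseline of normal activity, then flag values that deviate sharply from it. The sketch below is a deliberately minimal illustration (the baseline numbers are invented; production systems model many signals, not one): it flags any value more than three standard deviations from the historical mean.

```python
import statistics

# Illustrative baseline of normal activity, e.g. daily login counts.
baseline = [102, 98, 110, 95, 105, 99, 101]

mean = statistics.mean(baseline)
stdev = statistics.stdev(baseline)

def is_anomalous(value: float, threshold: float = 3.0) -> bool:
    """Flag values more than `threshold` standard deviations from normal."""
    return abs(value - mean) / stdev > threshold

print(is_anomalous(104))  # typical day -> False
print(is_anomalous(480))  # suspicious spike -> True
```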

Organizations are relying on predictive analytics software and technologies like
never before. To stay competitive in today’s fast-moving landscape, competency in
predictive analytics is becoming a necessity rather than a nice-to-have across all
public and private sectors. Using current and past knowledge to predict the future
has limitless potential.
