
Introduction to Big Data

BS (CS) 6th
Lecture # 1

Dr. Syed Attique Shah (Ph.D.)


Assistant Professor
Department of Information Technology,
Faculty of Information & Communication Technologies (FICT), 
BUITEMS, Quetta, Pakistan

About Me

Dr. Syed Attique Shah


PhD
Department of Geographical Information Technologies,
Informatics Institute,
Istanbul Technical University, Turkey.

Assistant Professor/ Chairperson


Department of Information Technology,
Faculty of Information & Communication
Technologies (FICT), 
BUITEMS, Quetta, Pakistan.
DATA: SOME INTERESTING FACTS
• In 2020, there will be around 40 trillion gigabytes of data (40 zettabytes).
(Source: Dell EMC)

• 90% of all data has been created in the last two years.
(Source: IBM)

• In 2012, only 0.5% of all data was analyzed.
(Source: The Guardian)

• 97.2% of organizations are investing in big data and AI.


(Source: New Vantage)
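
The equivalence "40 trillion gigabytes = 40 zettabytes" quoted above can be checked with a few lines of arithmetic. The sketch below assumes decimal (SI) units, where 1 GB = 10^9 bytes and 1 ZB = 10^21 bytes; the function name is illustrative and not part of the lecture.

```python
# Decimal (SI) storage units, expressed in bytes (assumption: power-of-ten units).
GIGABYTE = 10**9
ZETTABYTE = 10**21


def gigabytes_to_zettabytes(gigabytes: int) -> float:
    """Convert a size in gigabytes to zettabytes using SI units."""
    return gigabytes * GIGABYTE / ZETTABYTE


forty_trillion_gb = 40 * 10**12
print(gigabytes_to_zettabytes(forty_trillion_gb))  # 40.0
```

One zettabyte is therefore one trillion gigabytes, which is why the two figures on the slide describe the same quantity.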
DATA: SOME INTERESTING FACTS (cont.)
• The big data and analytics market was worth $49 billion in 2019.
(Source: Wikibon)

• In 2020, the big data market is expected to grow by 14%.
(Source: Statista)

• Job listings for data science and analytics will reach around 2.7 million by 2020.
(Source: Forbes)
HOW IS BIG DATA DIFFERENT?
• Typically an entirely new source of data (e.g. Social Media, Mobile devices)

• Automatically generated by a machine (e.g. Sensor embedded in an engine)

• Not designed to be friendly (e.g. Text streams/ Unstructured data)

• May not have much value unless an effective analytical technique is used
TACKLING NEW BIG DATA
DATA is the “NEW OIL”
But then again:
• Mostly unstructured data (approx. 80%+ of all data)
• Lots of machine-generated data
• Key-value pairs instead of data tables
• NoSQL vs. RDBMS
• Storage and computing on commodity hardware
• Distributed storage and computing
• Lots of open-source solutions
• Complex data pre-processing (parsing, ETL, etc.)
• New analytics technologies (Hadoop/MR, 2005)
• New visualization techniques
• Cloud-based analytics vs. local analytics

Source: Andrei Khurshudov, PhD Alchemy IoT, Colorado, USA
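
Two items in the list above, key-value pairs and Hadoop/MapReduce, can be illustrated together. The following is a minimal single-machine sketch of the MapReduce idea in plain Python, not Hadoop's actual API: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are assumptions made for illustration. In a real cluster, the map and reduce steps would run in parallel on many commodity machines.

```python
from collections import defaultdict


def map_phase(line):
    """Emit one (word, 1) key-value pair per word in the line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce sequentially over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


lines = ["big data is big", "data is the new oil"]
counts = word_count(lines)
print(counts)
```

Note that the data never takes the form of a table with fixed columns: every intermediate result is a key-value pair, which is what makes the work easy to split across machines.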


BIG DATA ANALYTICS (BDA) VS. TRADITIONAL ANALYTICS

Source: Andrei Khurshudov, PhD Alchemy IoT, Colorado, USA


WHAT DOES BDA OFFER?

• Perfect consumer of data and a generator of decisions/commands/processed data


• Highly dependent on fast, reliable, robust distributed infrastructure
• Abundance of cheap distributed storage and computing makes progress faster
• Fully suitable for working with the IoT ecosystem and users
• Main future direction: intelligent, autonomous, self-learning algorithms
• Over time, as analytics becomes more autonomous, so will decision-making in various applications
A GENERAL BIG DATA FRAMEWORK
A GENERAL BIG DATA FRAMEWORK (TOOLS OPTIONS)
WHAT DOES THE FUTURE HOLD FOR ICT?
• Data, Data and more (Data) ...
  Equipped with enhanced data storage and processing infrastructure

• Enhanced global network of connected sensors, devices, and machines (IoT) ...
  The ever-scalable Internet

• Smart, autonomous, self-learning data-processing and decision-making algorithms (Analytics) ...

• Integration of technologies (Solutions) ...


WHERE DOES THE INTERNET OF THINGS (IOT) STAND?
Sensors + Connectivity + People and Processes

Sensors: We are giving our world a digital nervous system. Location data using GPS sensors. Eyes and ears using cameras and microphones, along with sensory organs that can measure everything from temperature to pressure changes.

Connectivity: These inputs are digitized and placed onto networks.

People and Processes: These networked inputs can then be combined into bi-directional systems that integrate data, people, processes and systems for better decision making.

Source: https://www.postscapes.com/what-exactly-is-the-internet-of-things-infographic/
IOT APPLICATIONS (To name a few)

All these applications will generate and consume a lot of data

Source: https://www.postscapes.com/what-exactly-is-the-internet-of-things-infographic/
WHAT DOES IOT OFFER?

• Huge new source of data and a big consumer of decisions/commands/processed data

• Today, relatively little IoT data is collected, and little of what is collected (less than 5%) is analyzed

• Main future direction: “to Connect Everything to Everything and to Everyone”

• Highly dependent on fast & smart analytics

• People will become less involved in decision-making over time

• Up to 40% of IoT would rely on local analytics (Edge/Fog IoT)


GENERAL BIG DATA ANALYTICS VS. IOT ANALYTICS
The main possible differentiator:

• Hybrid Analytics (Edge + Cloud)

• Edge analytics deals with high-velocity data processed near the sensors (because cloud-based analytics is too slow and too expensive for this task)

• Cloud-based analytics complements “Edge” analytics

• Both analytics systems work together to deliver the best possible value
Source: Andrei Khurshudov, PhD Alchemy IoT, Colorado, USA
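
The hybrid (Edge + Cloud) split described above can be sketched as two small functions. This is an illustrative example under assumed conditions, not a method taken from the cited source: the function names, the summary fields, the temperature-like readings, and the threshold of 100.0 are all hypothetical. The edge step reduces a window of raw readings to a compact summary near the sensor, and the cloud step combines summaries from several devices.

```python
from statistics import mean


def edge_summarize(readings, threshold):
    """Run near the sensor: reduce a window of raw readings to a small summary.

    Only the summary and the readings above the threshold are forwarded,
    so the raw high-velocity stream does not have to leave the edge device.
    """
    return {
        "count": len(readings),
        "mean": mean(readings),
        "max": max(readings),
        "anomalies": [r for r in readings if r > threshold],
    }


def cloud_aggregate(summaries):
    """Run in the cloud: combine the summaries sent by many edge devices."""
    total = sum(s["count"] for s in summaries)
    weighted_mean = sum(s["mean"] * s["count"] for s in summaries) / total
    return {
        "devices": len(summaries),
        "readings": total,
        "overall_mean": weighted_mean,
        "anomaly_count": sum(len(s["anomalies"]) for s in summaries),
    }


# Hypothetical sensor windows from two engines.
engine_a = edge_summarize([70.0, 72.0, 110.0, 68.0], threshold=100.0)
engine_b = edge_summarize([65.0, 66.0, 67.0, 66.0], threshold=100.0)
report = cloud_aggregate([engine_a, engine_b])
print(report)
```

In this arrangement the cloud receives a handful of numbers per device instead of every reading, which reflects the cost and latency argument made for edge analytics on the slide.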
THE INTEGRATION OF IOT AND BDA

• IoT is a giant, fast-growing data-generating platform


• BDA is a giant, fast-growing data-processing engine

• IoT is nearly helpless without fast and powerful analytics that enable the best decisions in real time
• BDA strengths become more pronounced in the presence of a lot of fast, diverse, noisy, multi-dimensional data, which is what IoT is generating

• Growth and success in one area will promote growth and success in another area as well
CONCLUSION
• Data has become an unprecedented driver of innovation
• In the era of the Internet of Things and mobility, a huge volume of data is becoming available at high velocity, and there is huge scope for efficient data analytics systems
• Big Data frameworks are an efficient way to handle IoT data as we move to real-time use cases
• IoT is on its way to adopting fully automated, autonomous, self-learning Big Data Analytics
• Big Data and IoT are changing the way data is leveraged and can lead to a paradigm shift in various function-specific systems
