
Large Scale Distributed Computing

Lecture 01: Introduction


Dr. Muhammad Abid,
DCIS, PIEAS

GPU Computing, PIEAS


Large Scale Distributed Computing

The name of this subject is not appropriate.

It should have been named Big Data Analytics.



What is data science - A Definition

Data science is the science that uses computer science, statistics, machine learning, visualization, and human-computer interaction to collect, clean, integrate, analyze, visualize, and interact with data in order to create data products.



Data science - Motivation

Hal Varian (Google's Chief Economist) says: "The ability to take data, to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it, that's going to be a hugely important skill in the next decades, not only at the professional level but even at the educational level for elementary school kids, for high school kids, for college kids. Because now we really do have essentially free and ubiquitous data. So the complementary scarce factor is the ability to understand that data and extract value from it."

Data science - Motivation

"#1 catalyst for economic growth," says McKinsey (advisor to the world's most influential businesses and institutions).
The New York Times states: "This hot new field promises to revolutionize industries from business to government, health care to academia."
Harvard Business Review calls data scientist the sexiest job of the 21st century.
Fortune states: "Hot new gig in tech".

Goal of Data Science?

Turn data into data products.

Data product: a facility, product, or service that uses data in smart ways to provide value, e.g. prediction.



Data Products - Google

Web Search
Google Ads
News Recommendation Engine
Google Maps
Play Store
In fact, almost all the products and services Google provides are now driven by data science.

Data Products - Netflix

Movie Recommendations



Data Products - Facebook/LinkedIn

People you may know
Applications you may like
Jobs/Events you might be interested in
Classifier for bad users and bad content
With high accuracy, Facebook can guess whether you are single or married

Data Products - Twitter

Text Analysis - Spam Filter / Similarity Search
User Sentiment/Satisfaction/Feedback
News Breakout
Trends and Topics
200 million users as of 2011, generating over 200 million tweets and handling over 1.6 billion search queries per day.



What is a Data Scientist?

A data scientist should be good at data analysis, math, and statistics, but should also be able to write code that handles huge amounts of data and use the extracted information to build data products.



Data Scientists Practice



How to become a Data Scientist?



How Will You Be Evaluated?

Sessional 01: 15 marks
Coding Assignments: 15 marks
Assignments + Readings: 10 marks
Quizzes + Class participation: 10 marks
Final Exam: 50 marks



Course Contents

Introduction to Big Data
Big Data: Why and Where
Characteristics of Big Data and Dimensions of Scalability
Foundations for Big Data Systems and Programming
Big Data Integration and Processing
Machine Learning with Big Data
NoSQL Databases
