Brochure Big Data

University of California, Riverside, Extension USA & FORE School of Management

CERTIFICATION COURSE IN BIG DATA ANALYTICS FOR BUSINESS & MANAGEMENT
A Mahindra Group Initiative

PROGRAM COVERAGE
• Data Mining and Data Analytics
  o Machine Learning algorithms using R and Python
  o Hadoop and Kafka eco-systems
  o NoSQL & Graph Databases
  o Deep Learning, NLP and AI
• Business Analytics (Capstone, Python oriented)
• Web Analytics

• Be able to clean, transform and visualize the dataset to gain deeper insights and make it ready for analysis.
• Be able to select a subset of appropriate machine learning algorithms that could be applied to get the desired predictive results.
• Gain sufficient proficiency in the tools necessary to implement algorithms.
• Put to use relevant tools and techniques to get a reasonable predictive accuracy.
• Apply the knowledge of Deep Learning to a wide array of disciplines such as health, process control, navigation and others.
• Be able to install, set up, configure and experiment with a complete Hadoop and Kafka ecosystem on their own.
• Be able to install and configure the variety of NoSQL databases, be sufficiently familiar with them, and decide independently which one to use, when and how.
• Be able to install and configure important analytics and storage systems such as the Hadoop ecosystem, NoSQL databases and others.

WHO SHOULD ATTEND
Specifically, the course will be useful to:
• EXECUTIVES: Ambitious executives (from private/public sectors) looking forward to sharpening their skills in making sense of data in order to innovate and add more value to their organization and to society.
• ACADEMICIANS: Lecturers and professors seeking to extend the horizon of their knowledge by deepening their research skills.
• DATA SCIENTISTS/DEVELOPERS: Techniques taught to them will have applications in a broad array of disciplines.
• STUDENTS/RESEARCH SCHOLARS: Second-year students currently enrolled in Engg./PGDM/MBA or any graduate or post-graduate program who have had an introductory course in statistics. These students can look forward to better placement opportunities with an added skill set.

ELIGIBILITY
Graduate in any discipline

CERTIFICATE ISSUED BY
University of California, Riverside, Extension USA and FORE School of Management, New Delhi

PEDAGOGY
• The Data Analytics program is project-based, not purely theory-based.
• Learning, with questions and answers, happens in real time: Live Virtual Interactive Learning.
• Algorithms are first explained conceptually, avoiding mathematics, and are then implemented as projects with real data from industry.
• Datasets for implementation are made available in advance, along with a copy of the code to be executed. The code is numbered and copiously commented so that, long after the lecture has finished, students can go back through the code and comments and refresh their knowledge.
• During the lecture, code is explained and executed line by line. Students also execute it at their end. Consequently, results are available at our end as well as on each student's laptop. The whole experience is as if everyone is sitting together in a lab.

PROGRAM SCHEDULE
Duration: 150 hours
Frequency: Twice a week
Class Schedule: Saturday-Sunday
Timings: 10:30 am - 1:30 pm

PROGRAM FEE
₹55,000/- + GST

Installment          Date                           Amount
Registration Fees    At the time of registration    ₹10,000/-*
Admission Fee        20th March 2018                ₹25,000/-
1st Installment      25th May 2018                  ₹20,000/-

Books and Material Fee: ₹5,500 + GST per participant, payable to 'FORE School of Management, New Delhi'
* For any request for refund of registration fees on account of a valid reason, made prior to the closure of registrations or 10 working days before the date of course commencement, whichever is earlier, the amount paid shall be refunded with a deduction of ₹5,000 + applicable taxes.

PROGRAM CONTENT

INTRODUCTORY BUSINESS STATISTICS
• Measures of Central Tendency and Dispersion
• Probability Theory (Different Approaches, Rules of Probability, Bayes' Theorem)
• Random Variables and Probability Distributions; Discrete Probability Distributions: Binomial and Poisson Distribution
• Continuous Probability Distributions: Normal Distribution
• Correlation and Regression Analysis: Simple & Multiple Regression
• Concept of Hypothesis Testing, Type I & Type II Errors, Power of the Test, Hypothesis Testing of Mean and Proportion, Two Sample Tests, Tests for Difference in Means and Proportions
• Chi-Square Goodness-of-Fit Test, Test of Independence

DATA MINING AND DATA ANALYTICS
Module 2.1: Machine Learning Algorithms (using R and Python*)
• Developing familiarity with R; Data structures; Summarizing data; Data exploration and transformation; Integrating datasets; Data & dates wrangling
• Data Visualization and story-telling; Developing relationships between various features and plotting distributions
• Data Mining: Measures of Proximity; Cluster Analysis; Curse of Dimensionality
• K-means clustering and Model based clustering
• Text clustering and Agglomeration clustering
• Evaluation of clusters; Cluster Validation; Clustering tendency
• Classification Analysis: Decision tree Induction; Cross-validation, parameter tuning & grid search
• Techniques of Dimensionality Reduction: PCA and SVD (Singular Value Decomposition)
• Neural Network
• Random Forest and Regression Trees; Determining feature importance with Boruta
• Gradient Boosting Technique for Machine Learning & grid search of its parameters
• Evaluating Classification: ROC, AUC, Precision, Recall, Specificity, Sensitivity; kappa metric; Overfitting; Bias-variance trade-off; L1 & L2 regularization
• Ensemble modeling: A review of a variety of techniques; Balancing datasets
• eXtreme Gradient Boosting (XGBoost)
• LightGBM: Light Gradient Boosting Machine

Module 2.2: Hadoop and Kafka Eco System; Processing streaming data and analysis
• Introduction to Hadoop and its ecosystem; Hadoop file storage formats
• Linux and Hadoop shell commands
• Hadoop streaming
• Hive on Tez and Hadoop
• Pig on Tez and Hadoop
• PySpark and SparkSQL: Data storage and extraction with SQL; Executing ML algorithms (including grid search) using MLlib and ML libraries
• Recommender Engine using Mahout on Hadoop
• Installation of Hadoop ecosystem
• Apache Kafka: Stream data processing

Module 2.3: NoSQL and Graph Databases
• Introduction to NoSQL Databases and CAP theorem; Comparison with RDBMS
• Redis in-memory data structure store
• MongoDB Document Database
• HBase column family database on Hadoop
• Neo4j Graph Database

Module 2.4: Deep learning, NLP & AI
• Autoencoders and anomaly detection
• Deep Learning with Convolution Neural Network
• Using very Deep Convolution networks and Data Augmentation
• Transfer Learning
• Generative-Adversarial Networks (GAN)
• Recurrent Neural Networks & LSTM
• Natural Language Processing & Word2Vec transformation

BUSINESS ANALYTICS CAPSTONE (PYTHON ORIENTED)
• Introduction to Python; Using IPython; Basic data types and data structures in Python and pandas; Loops and Conditionals in Python
• Exploring data with pandas: Quick Start
• NumPy: Arrays; Basic array operations; Comparison operators and value testing for arrays; Array item selection and manipulation
• Data Visualization in Python; Data Visualization using t-distributed stochastic neighbor embedding (t-SNE)
• k-means clustering with scikit-learn
• Decision trees classifier
• Ensemble Modeling
• Logistic Regression (along with Dimensionality Reduction, PCA)
• Support Vector Machines
• Introduction to Keras on TensorFlow

WEB ANALYTICS
• Basics of Web analytics
• Analytic techniques and tools: Google Trends, Google Website Optimizer, Google Analytics, Google Tag Manager
• Data Analysis and Data Visualization

STUDENTS EXERCISES/PROJECTS
• Data Visualization and story-telling
• K-means clustering
• Model based clustering
• Dimensionality reduction and t-SNE visualization
• Decision trees Induction
• K-Nearest Neighbour
• Neural Network
• Naïve Bayes Modeling
• Random Forest
• Feature plotting
• eXtreme Gradient Boosting (XGBoost)
• Support Vector Machines
• Regression trees
• Apache Pig Exercises
• Analyse data on Spark/PySpark
• MongoDB Exercises
• Deep-Learning: Autoencoder
• Deep Learning

FACULTY PROFILE
Prof. Ashok Kumar Harnal, Professor in IT Area at FORE School of Management: Graduated from IIT Delhi; M.Phil, MA (Economics).
Expert in Big Data and Data Analytics, on both the technology side and the analytics side. Has extensively taught faculty and students
on the subject of big data technology and analytics. Has participated in various machine learning problems in areas of business and
marketing.
Prof. Kemal Oflus, Professor at UCR: Capstone Project Faculty covering the Python module; ex-rocket scientist. Highly motivated and
versatile data scientist with fifteen-plus years of proven analytics performance. Skilled at building effective and productive working
relationships with customers, team members and executive management. Excellent time management, negotiation, interpersonal and
presentation skills. A talent for analyzing problems, developing simplified procedures, and finding innovative solutions that improve
operating efficiency and lower costs for customers. Successful in bringing methods that have long been used in engineering and scientific
communities to business customers and decision makers.
Prof. Hitesh Arora is a Professor in the area of Quantitative Techniques/Operations Management at FORE School of Management,
New Delhi. A graduate in Mathematics and a post-graduate in Operational Research from the University of Delhi, he earned his
Doctorate in Mathematical Programming from the Department of Operational Research, University of Delhi. He has qualified the National
Eligibility Test (NET), conducted jointly by CSIR & UGC, for Lectureship with Junior Research Fellowship (JRF) in Mathematical
Sciences. He started his teaching career at the University of Delhi and taught subjects such as Optimization, Queuing Theory, Inventory
Management and Statistics, besides guiding students in their project work. As an actuarial consultant, his work involved Data
Modeling and Reserving for Personal and Commercial Lines of different UK-based insurance companies. He has over seventeen
years of experience in academics and industry.
Prof. Rakhi Tripathi, Associate Professor at FORE: PhD (IIT Delhi) and MS (Computer Science) from Bowie State University (University
of Maryland System). She has more than 10 years of research experience. She has worked on prestigious projects on Computer
Networks and E-government at IIT Delhi. She has also presented and published several research papers in reputed national and
international journals, conferences and books. Her current areas of interest include Computer Networks, E-government,
Cloud computing, Mobile computing, Digital strategy and Social Media.
Prof. Dhanya Jothimani: PhD (Financial Analytics), thesis submitted, IIT Delhi; M.Tech (Industrial Engineering and Management),
IIT Kharagpur; B.Tech (Production Engineering), NIT Trichy. During her doctoral programme at IIT Delhi, Dhanya presented her
research work at well-reputed conferences including the INFORMS Annual Meet, the Annual Meeting of the Decision Sciences Institute (DSI) and
the Annual Conference of the Midwest Association for Information Systems (MWAIS). She was sponsored by the Department of Science and
Technology (DST) to present her research work at the INFORMS Annual Meet 2016 in Nashville, Tennessee, USA. She has delivered a few
lectures on the R language and multi-criteria decision-making tools to postgraduate and doctoral students at IIT Delhi.
ABOUT EDUCATION LANES
Education Lanes is Tech Mahindra Growth Factories’ initiative that offers certificate
programs from premier institutes on a virtual platform. Education Lanes offers a
comprehensive direct-to-device education suite with real-time interactive and participative
virtual classroom sessions.
For queries, call us at: 9975806184, 9811243210
Email: info@educationlanes.com | www.educationlanes.com
