Topics: Predictive Analytics, Big Data, Data Science, Analytics, Data Visualization
Course Numbers:
38740, 38741
Description:
Data and Information analytics extends analysis (descriptive and predictive models that obtain
knowledge from data) by using insight from analyses to recommend action or to guide and
communicate decision-making. Analytics is therefore concerned less with individual analyses or
analysis steps than with an entire methodology. The world at large is confronted with increasingly
large and complex sets of structured and unstructured information from sensors, instruments, and
computer simulations; data is also "hidden" in websites, application servers, social networks,
and mobile devices. For the nation, assimilating information across disparate domains (e.g.,
intelligence, economics, science) has the potential to provide improved capabilities for decision
makers. In commerce and industry, analytics-driven enterprises are becoming mainstream. Yet there
is a shortfall in the key skills and education needed to meet this growing demand. Traditional
enterprises are moving toward analytics-driven approaches for core business functions. In government
and corporations, cybersecurity problems are prevalent. The investment in advanced analytics
capabilities could potentially be leveraged more broadly today and may be greater than any prior
government investment in computing. Emphasis is now placed on disruptive data and information sources
on the Web and Internet: using Web Science and informatics to explore social networks, platform
competition, the "long tail," and the economic or resource impacts of the search for new findings.
Key topics include advanced statistical computing theory, multivariate analysis, the application of
material from computer science courses such as data mining and machine learning, and change
detection by uncovering unexpected patterns in data.
phone: x4862
Syllabus/Calendar
Refer to the Reading/Assignment/Reference list for each week (see below).
Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die (online)
(RECOMMENDED)
Big Data Analytics: Turning Big Data into Big Money (online)
Big Data Analytics: From Strategic Planning to Enterprise Integration with Tools, Techniques,
Week 1 (Jan. 19): Introduction to Course, Case Studies, and Preview of Course Material (Week 1
Thursday slides); Introduction/refresher on basic statistics (Week 1 Thursday slides)
Week 2 (Jan. 23/26): Starting with Data and Information Resources, Role of Hypothesis,
Synthesis and Model Choices (Week 2 video recording: TBD); R/RStudio bootcamp (Week 2
Thursday slides)
Week 3 (Jan. 30/Feb. 2): Data filtering, hypothesis exploration, visual analysis, model
(Lab) Assignment 2
Analytic Methods, Types of Data Mining for Analytics (Week 4 Monday slides); lab (Week 4 Thursday slides)
Assignment 3
Week 5 (Feb. 13/16): Weighted kNN, Clustering, early decision trees and Bayesian Inference
(Week 5 Monday slides); Exercises for linear regression, kNN and K-means, lab (Week 5 Thursday slides)
Assignment 4
Assignment 5
Week 6 (Feb. 21/23): More Clustering and Bayesian Inference (Week 6 Tuesday slides) (lab), lab
Week 7 (Feb. 27/Mar. 2): Interpreting Regression, Classification, Clustering and Bayesian
Inference; lab for Regression, Classification, Clustering and Bayesian Inference (Week 6 Thursday slides)
Assignment 6
Week 8 (Mar. 6/9): Decision trees, cross-validation (Week 8 Monday lab slides); Lab for decision
Week 9 (Mar. 20/24): Dimension reduction and scaling, Support Vector Machines (Week 9
Monday slides); Lab for DR, MDS, SVM (Week 9 Thursday lab)
Week 10 (Mar. 27/30): Factor Analysis (Week 10 Monday slides); SVM, Dimension Reduction,
Assignment 7
Week 11 (Apr. 3/6): Interpreting PCA, MDS, DR, and FA; Boosting, Bootstrapping, Bagging
(Week 11 Monday slides); Boosting, Bootstrapping, Bagging lab (Week 11 Thursday slides)
Monday slides; Lab: Cross-validation, Regression (local methods), and continue project and assignment
Week 13 (Apr. 17/20): Local Regression (continued), Mixed Models, Optimizing, Iterating (Fisher
Linear Discriminant) (Week 13 Monday slides); Open Lab and continue project and assignment work -
http://www.slideshare.net/lsakoda/case-studies-utilizing-real-time-data-...
http://www.marketquotient.com/case-studies.html
http://www.ibm.com/analytics/us/en/case-studies/
http://www.r-tutor.com/r-introduction/data-frame
http://www.r-tutor.com/r-introduction/
http://en.wikipedia.org/wiki/Degrees_of_freedom_(statistics)
http://en.wikipedia.org/wiki/Regression_analysis
http://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
http://varianceexplained.org/r/kmeans-free-lunch/
http://en.wikipedia.org/wiki/K-means_clustering
http://www.stanford.edu/group/wonglab/RSVMpage/R-SVM.html RSVM
http://data-informed.com/focus-predictive-analytics/
Course goals:
Introduce students to relevant methods to recognize and apply quantitative algorithms,
To develop students' strategic thinking skills, combined with a solid technical foundation in
Develop ability to apply critical and analytical methods to formulate and solve science,
statistical and data-mining techniques in context, to develop data-analytic thinking, and to illustrate that
By the end of the course, students can effectively communicate analytic findings to non-specialists.
Students to demonstrate strategic thinking skills, combined with a solid technical foundation in
develop data-analytic thinking, and to illustrate that proper application is as much an art as it is a science.
[graduate level]
Students must develop and demonstrate a working knowledge of decision making under uncertainty and be
able to build optimization models that incorporate random parameters: static stochastic optimization,
two-stage optimization with recourse, chance-constrained optimization, and sequential decision making.