Feb 23, 2017

© All Rights Reserved

Basic introduction to ML

About this course: Machine learning is the science of getting computers to act without being
explicitly programmed. In the past decade, machine learning has given us self-driving cars,
practical speech recognition, effective web search, and a vastly improved understanding of the
human genome. Machine learning is so pervasive today that you probably use it dozens of times
a day without knowing it. Many researchers also think it is the best way to make progress
towards human-level AI. In this class, you will learn about the most effective machine learning
techniques, and gain practice implementing them and getting them to work for yourself. More
importantly, you'll not only learn the theoretical underpinnings of learning, but also gain
the practical know-how needed to quickly and powerfully apply these techniques to new
problems. Finally, you'll learn about some of Silicon Valley's best practices in innovation as it
pertains to machine learning and AI. This course provides a broad introduction to machine
learning, data mining, and statistical pattern recognition. Topics include: (i) Supervised learning
(parametric/non-parametric algorithms, support vector machines, kernels, neural networks). (ii)
Unsupervised learning (clustering, dimensionality reduction, recommender systems, deep
learning). (iii) Best practices in machine learning (bias/variance theory; innovation process in
machine learning and AI). The course will also draw from numerous case studies and
applications, so that you'll also learn how to apply learning algorithms to building smart robots
(perception, control), text understanding (web search, anti-spam), computer vision, medical
informatics, audio, database mining, and other areas.

Syllabus

WEEK 1

Introduction

Welcome to Machine Learning! In this module, we introduce the core idea of teaching a
computer to learn concepts using data without being explicitly programmed. The Course Wiki
is under construction. Please visit the resources tab for the most complete and up-...

4 videos, 10 readings

Graded: Introduction

Linear Regression with One Variable

Linear regression predicts a real-valued output based on an input value. We discuss the

application of linear regression to housing price prediction, present the notion of a cost function,

and introduce the gradient descent method for learning.

7 videos, 8 readings

Graded: Linear Regression with One Variable
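
The cost function and gradient descent mentioned in this module can be sketched in a few lines of plain Python. This is a minimal illustration, not the course's own code: the function names, the learning rate `alpha`, the step count, and the toy size/price data are all assumptions chosen for the example.

```python
def cost(theta0, theta1, xs, ys):
    """Squared-error cost J = (1 / 2m) * sum((h(x) - y)^2) for h(x) = theta0 + theta1 * x."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha=0.05, steps=5000):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(steps):
        # Prediction errors under the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the cost with respect to each parameter.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update of both parameters.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


# Hypothetical toy data (not from the course): house size versus price,
# generated from the line y = 1 + 2x so the fit is easy to check.
sizes = [1.0, 2.0, 3.0, 4.0]
prices = [3.0, 5.0, 7.0, 9.0]
theta0, theta1 = gradient_descent(sizes, prices)
```

On this toy data the fitted parameters approach the intercept and slope used to generate it, and the cost approaches zero. A learning rate that is too large would make the updates diverge instead.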

Linear Algebra Review

This optional module provides a refresher on linear algebra concepts. Basic understanding of

linear algebra is necessary for the rest of the course, especially as we begin to cover models with

multiple variables.

6 videos, 1 reading

WEEK 2

Linear Regression with Multiple Variables

What if your input has more than one value? In this module, we show how linear regression can

be extended to accommodate multiple input features. We also discuss best practices for

implementing linear regression.

8 videos, 16 readings

Graded: Linear Regression with Multiple Variables

Octave/Matlab Tutorial

This course includes programming assignments designed to help you understand how to

implement the learning algorithms in practice. To complete the programming assignments, you

will need to use Octave or MATLAB. This module introduces Octave/Matlab and shows yo...

6 videos, 1 reading

Graded: Octave/Matlab Tutorial

WEEK 3

Logistic Regression

Logistic regression is a method for classifying data into discrete outcomes. For example, we

might use logistic regression to classify an email as spam or not spam. In this module, we

introduce the notion of classification, the cost function for logistic regr...

7 videos, 8 readings

Graded: Logistic Regression
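
As a rough sketch of the classification idea described above, the snippet below computes a spam probability with the sigmoid function and scores it with the usual logistic (log loss) cost. The weights, bias, and feature values are hypothetical and only serve the illustration.

```python
import math


def sigmoid(z):
    """Map any real number to (0, 1); written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, features):
    """Estimated probability that the example belongs to class 1 (e.g. spam)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def log_loss(p, y):
    """Logistic regression cost for one example with label y in {0, 1}."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


# Hypothetical model: two features, e.g. counts of two suspicious words.
weights, bias = [2.0, -1.0], -0.5
p_spam = predict_proba(weights, bias, [1.0, 0.0])
label = 1 if p_spam >= 0.5 else 0  # threshold the probability to classify
```

The 0.5 threshold is a common default rather than a requirement. A different threshold trades false positives against false negatives.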

Regularization

Machine learning models need to generalize well to new examples that the model has not seen in

practice. In this module, we introduce regularization, which helps prevent models from

overfitting the training data.

4 videos, 5 readings

Graded: Regularization

WEEK 4

Neural Networks: Representation

A neural network is a model inspired by how the brain works. It is widely used today in many
applications: when your phone interprets and understands your voice commands, it is likely that a
neural network is helping to understand your speech; when you cash a ch...

7 videos, 6 readings

Graded: Neural Networks: Representation

WEEK 5

Neural Networks: Learning

In this module, we introduce the backpropagation algorithm that is used to help learn parameters

for a neural network. At the end of this module, you will be implementing your own neural

network for digit recognition.

8 videos, 8 readings

Graded: Neural Networks: Learning

WEEK 6

Advice for Applying Machine Learning

Applying machine learning in practice is not always straightforward. In this module, we share

best practices for applying machine learning in practice, and discuss the best ways to evaluate

performance of the learned models.

7 videos, 2 readings

Graded: Advice for Applying Machine Learning

Machine Learning System Design

To optimize a machine learning algorithm, you'll need to first understand where the biggest
improvements can be made. In this module, we discuss how to understand the performance of a
machine learning system with multiple parts, and also how to deal with skewe...

5 videos, 1 reading

Graded: Machine Learning System Design

WEEK 7

Support Vector Machines

Support vector machines, or SVMs, are machine learning algorithms for classification. We
introduce the idea and intuitions behind SVMs and discuss how to use them in practice.

6 videos, 1 reading

Graded: Support Vector Machines

WEEK 8

Unsupervised Learning

We use unsupervised learning to build models that help us understand our data better. We discuss
the k-Means algorithm for clustering, which enables us to learn groupings of unlabeled data points.

5 videos, 1 reading

Graded: Unsupervised Learning
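
The k-Means idea described in this module alternates between assigning points to their nearest centroid and moving each centroid to the mean of its points. The sketch below uses plain Python on 2-D points. The function name, the fixed iteration count, the toy points, and the hand-picked starting centroids are assumptions for illustration; real implementations usually pick starting centroids at random and stop when assignments no longer change.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on 2-D points with caller-supplied initial centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical unlabeled data forming two visually separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, [(0, 0), (10, 10)])
```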

Dimensionality Reduction

In this module, we introduce Principal Components Analysis, and show how it can be used for

data compression to speed up learning algorithms as well as for visualizations of complex

datasets.

7 videos, 1 reading

Graded: Principal Component Analysis

WEEK 9

Anomaly Detection

Given a large number of data points, we may sometimes want to figure out which ones vary

significantly from the average. For example, in manufacturing, we may want to detect defects or

anomalies. We show how a dataset can be modeled using a Gaussian distributi...

8 videos, 1 reading

Graded: Anomaly Detection
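
One simple way to make the Gaussian modeling idea concrete is to fit a mean and variance to a single feature and flag values whose density falls below a threshold. The sketch below assumes a one-dimensional feature, hypothetical measurements, and an arbitrary threshold `epsilon`. The module's own treatment is truncated in this listing and may differ.

```python
import math


def fit_gaussian(values):
    """Estimate the mean and variance of a single feature."""
    m = len(values)
    mu = sum(values) / m
    var = sum((v - mu) ** 2 for v in values) / m
    return mu, var


def density(x, mu, var):
    """Gaussian probability density at x."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def is_anomaly(x, mu, var, epsilon):
    """Flag x as anomalous when its density falls below the threshold epsilon."""
    return density(x, mu, var) < epsilon


# Hypothetical measurements of a manufactured part (e.g. a diameter in mm).
normal_parts = [9.8, 10.1, 10.0, 9.9, 10.2]
mu, var = fit_gaussian(normal_parts)
epsilon = 0.01  # assumed threshold; in practice it is tuned on labeled examples
```

With these values, a part measuring 12.0 would be flagged while one measuring 10.05 would not.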

Recommender Systems

When you buy a product online, most websites automatically recommend other products that you

may like. Recommender systems look at patterns of activities between different users and

different products to produce these recommendations. In this module, we introd...

6 videos, 1 reading

Graded: Recommender Systems

WEEK 10

Large Scale Machine Learning

Machine learning works best when there is an abundance of data to leverage for training. In this

module, we discuss how to apply the machine learning algorithms with large datasets.

6 videos, 1 reading

Graded: Large Scale Machine Learning

WEEK 11

Application Example: Photo OCR

Identifying and recognizing objects, words, and digits in an image is a challenging task. We

discuss how a pipeline can be built to tackle this problem and how to analyze and improve the

performance of such a system.

5 videos, 1 reading

Graded: Application: Photo OCR

Lesson 1

Welcome to Machine Learning

Lesson 2

Naive Bayes

Splitting data between training sets and testing sets with scikit learn.

distributions.

Lesson 3

Support Vector Machines

Identify how to choose the right kernel for your SVM and learn about RBF and

Linear Kernels.

Lesson 4

Decision Trees

Learn the formulas for entropy and information gain and how to calculate

them.

Implement a mini project where you identify the authors in a body of emails

using a decision tree in Python.
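
The entropy and information gain formulas mentioned in this lesson can be written directly from their standard definitions: entropy is -sum(p_i * log2(p_i)) over the class proportions, and information gain is the parent's entropy minus the size-weighted entropy of the children. The sketch below is an illustration with hypothetical author labels, not the lesson's own code.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the child splits."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical labels: which of two authors wrote each email.
parent = ["sara", "sara", "chris", "chris"]
perfect_split = [["sara", "sara"], ["chris", "chris"]]
useless_split = [["sara", "chris"], ["sara", "chris"]]
gain_perfect = information_gain(parent, perfect_split)
gain_useless = information_gain(parent, useless_split)
```

A decision tree learner would prefer the split with the higher gain, since it leaves purer child nodes.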

Lesson 5

Choose your own Algorithm

Decide how to pick the right Machine Learning Algorithm among K-Means,

Adaboost, and Decision Trees.

Lesson 6

Datasets and Questions

Apply your Machine Learning knowledge by looking for patterns in the Enron

Email Dataset.

Lesson 7

Regressions

learning.

Understand different error metrics such as SSE, and R Squared in the context

of Linear Regressions.
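
For reference, the two error metrics named above can be computed as follows. SSE is the sum of squared residuals, and R squared is one minus the ratio of SSE to the total sum of squares around the mean. The data values are hypothetical.

```python
def sse(actual, predicted):
    """Sum of squared errors between observed and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


def r_squared(actual, predicted):
    """Share of the variance in `actual` explained by the predictions."""
    mean = sum(actual) / len(actual)
    total = sum((a - mean) ** 2 for a in actual)  # total sum of squares
    return 1.0 - sse(actual, predicted) / total


# Hypothetical observations and regression predictions.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 6.5, 9.5]
error = sse(actual, predicted)
score = r_squared(actual, predicted)
```

SSE grows with the number of data points, so it is hard to compare across datasets of different sizes. R squared is normalized by the total variance, which makes it easier to compare.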

Lesson 8

Outliers

Apply your learning in a mini project where you remove the residuals on a

real dataset and reimplement your regressor.

Apply your same understanding of outliers and residuals on the Enron Email

Corpus.

Lesson 9

Clustering

Learning.

Implement K-Means in Python and Scikit Learn to find the center of clusters.

Apply your knowledge on the Enron Finance Data to find clusters in a real

dataset.

Lesson 10

Feature Scaling

algorithms.

Requirements

Basic python

Description

This course covers the fundamental concepts of machine learning, focusing on neural networks,
SVMs and decision trees. These topics are attracting a lot of attention because these learning
algorithms can be used in fields ranging from software engineering to investment banking.
Learning algorithms can recognize patterns that help detect cancer, for example, or we can
construct algorithms that make good guesses about stock price movements in the market.

In each section we will discuss the theoretical background of these algorithms, and then we
will implement them together.

The first chapter is about regression, a very easy yet powerful and widely used machine
learning technique. We will then talk about Naive Bayes classification and tree-based algorithms such
as decision trees and random forests. These are more sophisticated algorithms that sometimes work
and sometimes do not. The last chapters will be about SVMs and neural networks, the most important
approaches in machine learning.

This course is meant for newcomers who are not familiar with machine learning, or students
looking for a quick refresher.

1. Introduction

a. Introduction

b. Introduction to machine learning

2. Regression

a. Linear regression introduction

b. Linear regression example

c. Logistic regression introduction

d. Cross validation

e. Logistic regression example I - sigmoid function

f. Logistic regression example II

g. Logistic regression example III - credit scoring

a. K-nearest neighbor introduction

b. K-nearest neighbor introduction - normalize data

c. K-nearest neighbor example I

d. K-nearest neighbor example II

a. Naive Bayes introduction

b. Naive Bayes example I

c. Naive Bayes example II - text clustering

a. Support vector machine introduction

b. Support vector machine example I

c. Support vector machine example II - character recognition

a. Decision trees introduction

b. Decision trees example I

c. Decision trees example II - iris data

d. Pruning and bagging

e. Random forests introduction

f. Boosting

g. Random forests example I

h. Random forests example II - enhance decision trees

7. Clustering

a. Principal component analysis introduction

b. Principal component analysis example

c. K-means clustering introduction

d. K-means clustering example

e. DBSCAN introduction

f. Hierarchical clustering introduction

g. Hierarchical clustering example

8. Neural Networks

a. Neural network introduction

b. Feedforward neural networks

c. Training a neural network

d. Error calculation

e. Gradients calculation

f. Backpropagation

g. Applications of neural networks

h. Deep learning

i. Neural network example I - XOR problem

j. Neural network example II - face recognition

9. Face Detection

a. Face detection introduction

b. Installing OpenCV

c. CascadeClassifier

d. CascadeClassifier parameters

e. Tuning the parameters

10. Outro

a. Final words

a. Source code & CSV files

b. Data

c. Slides

d. Coupon codes - get any of my courses for a discounted price

Course Description

CS 403/725 provides a broad introduction to machine learning and various fields of application.

The course is designed in a way to build up from root level.

Topics include:

kernels, neural networks and deep learning)

The course will discuss the application of machine learning in devanagari script

recognition which is a developing field in the machine learning community.

Identification

Description

Course Content :

Supervised learning: decision trees, nearest neighbor classifiers, generative classifiers like naive
Bayes, linear discriminant analysis, loss regularization framework for classification, support
vector machines

distortion theory.

References

3. Selected papers.

Home Page

http://www.cse.iitb.ac.in/~sunita/cs725

Prerequisites

N/A

About The Course

This course provides a concise introduction to the fundamental concepts in machine learning and

popular machine learning algorithms. We will cover the standard and most popular supervised

learning algorithms including linear regression, logistic regression, decision trees, k-nearest

neighbour, an introduction to Bayesian learning and the naive Bayes algorithm, support vector

machines and kernels and neural networks with an introduction to Deep Learning. We will also

cover the basic clustering algorithms. Feature reduction methods will also be discussed. We will

introduce the basics of computational learning theory. In the course we will discuss various

issues related to the application of machine learning algorithms. We will discuss hypothesis

space, overfitting, bias and variance, tradeoffs between representational power and learnability,

evaluation strategies and cross-validation. The course will be accompanied by hands-on problem

solving with programming in Python and some tutorial sessions.

Intended Audience

Elective course

UG or PG

BE/ME/MS/MSc/PhD

Pre-requisites

Basic programming skills (in Python), algorithm design, basics of probability & statistics

course

Data science companies and many other industries value machine learning skills.

Course Instructor

Sudeshna Sarkar is a Professor and currently the Head in the Department of Computer Science

and Engineering at IIT Kharagpur. She completed her B.Tech. in 1989 from IIT Kharagpur, MS

from University of California, Berkeley, and PhD from IIT Kharagpur in 1995. She served

briefly in the faculty of IIT Guwahati and at IIT Kanpur before joining IIT Kharagpur in 1998.

Her research interests are in Machine Learning, Natural Language Processing, Data and Text

Mining.

The Teaching Assistants of this course are Anirban Santara and Ayan Das, both of whom are

PhD students in Computer Science & Engineering Department, IIT Kharagpur. They will take

active part in the course especially in running demonstration and programming classes as well as

tutorial classes.

Course layout

Week 1:

Introduction: Basic definitions, types of learning, hypothesis space and inductive

bias, evaluation, cross-validation

Week 2:

Linear regression, Decision trees, overfitting

Week 3:

Instance based learning, Feature reduction, Collaborative filtering based

recommendation

Week 4:

Probability and Bayes learning

Week 5:

Logistic Regression, Support Vector Machine, Kernel function and Kernel SVM

Week 6:

Neural network: Perceptron, multilayer network, backpropagation, introduction to

deep neural network

Week 7:

Computational learning theory, PAC learning model, Sample complexity, VC

Dimension, Ensemble learning

Week 8:

Clustering: k-means, adaptive hierarchical clustering, Gaussian mixture model

suggested reading

2. Introduction to Machine Learning Edition 2, by Ethem Alpaydin

Course url: https://onlinecourses.nptel.ac.in/noc16_cs18

Course duration : 08 weeks

Start date and end date of course: 18 July 2016 - 09 September 2016

Dates of exams : 18 September 2016 & 25 September 2016

Time of exam : 2pm - 5pm

Final List of exam cities will be available in exam registration form.

Exam registration url - Will be announced shortly

Exam Fee: The online registration form has to be filled in and the certification exam
fee of approximately Rs 1000 (non-Programming) / 1250 (Programming) needs to
be paid.

certificate

E-Certificate will be given to those who register and write the exam. Certificate will have your

name, photograph and the score in the final exam. It will have the logos of NPTEL and IIT

Kharagpur.

It will be e-verifiable at nptel.ac.in/noc.

With the increased availability of data from varied sources there has been increasing

attention paid to the various data driven disciplines such as analytics and machine

learning. In this course we intend to introduce some of the basic concepts of machine

learning from a mathematically well motivated perspective. We will cover the different

learning paradigms and some of the more popular algorithms and architectures used in

each of these paradigms.

INTENDED AUDIENCE

This is an elective course. Intended for senior UG/PG students. BE/ME/MS/PhD

PRE-REQUISITES

We will assume that the students know programming for some of the assignments. If the
students have done introductory courses on probability theory and linear algebra, it
would be helpful. We will review some of the basic topics in the first two weeks as well.

Any company in the data analytics/data science/big data domain would value this

course.

COURSE INSTRUCTOR

has nearly two decades of research experience in machine learning and specifically

reinforcement learning. Currently his research interests are centered on learning from

and through interactions and span the areas of data mining, social network analysis,

and reinforcement learning.

COURSE LAYOUT

Week 2: Linear Regression and Feature Selection

Week 3: Linear Classification

Week 4: Support Vector Machines and Artificial Neural Networks

Week 5: Bayesian Learning and Decision Trees

Week 6: Evaluation Measures

Week 7: Hypothesis Testing

Week 8: Ensemble Methods

Week 9: Clustering

Week 10: Graphical Models

Week 11: Learning Theory and Expectation Maximization

Week 12: Introduction to Reinforcement Learning

Certification Exam

The exam is optional. Exams will be on 24 April 2016 and 30 April 2016.

Time: 2pm-5pm

Registration url: Announcements will be made when the registration form is open for

registrations.

The online registration form has to be filled and the certification exam fee of approximately Rs

1000 needs to be paid.

Certificate

Certificate will be given to those who register and write the exam. Certificate will have your

name, photograph and the score in the final exam.

It will have the logos of NPTEL and IIT Madras. It will be e-verifiable at nptel.ac.in/noc.

SUGGESTED READING

2. Christopher Bishop. Pattern Recognition and Machine Learning. 2e.

IIT Madras

CS5011: Introduction to Machine Learning

Sr. No. | Date | Lecture Contents | Reference
 | 2011 | | Learning by Tom Mitchell
 | 2011 | | Learning by Tom Mitchell
 | 2011 | representations | Learning by Tom Mitchell
 | 2011 | | Machine Learning by Tom Mitchell
 | 2011 | selection through cross validation | to Machine Learning by Ethem Alpaydin
 | 2011 | fitting and over-fitting concepts | to Machine Learning by Ethem Alpaydin
7 | Aug 11, 2011 | Q&A on over and under-fitting, bias-variance, Data: types of features, data normalization | Chapter 2 from Principles of Data Mining by David Hand et al.
 | 2011 | example |
 | 2011 | distance | Data Mining by David Hand et al.
 | 2011 | distance, distance metric, Jaccard coefficient, missing values, feature transformations | of Data Mining by David Hand et al.
 | 2011 | Mahalanobis distance, dealing with uncertainty | Data Mining by David Hand et al.
 | 2011 | theory and example using binomial distribution | Data Mining by David Hand et al.
 | 2011 | univariate Gaussian, generative vs discriminative models | Data Mining by David Hand et al.
 | 2011 | bivariate Gaussian distribution, sufficient statistics | Data Mining by David Hand et al.
 | 2011 | | Recognition and Machine Learning by Christopher M. Bishop

Course Data :

Syllabus:

Basic Maths : Probability, Linear Algebra, Convex Optimization

Background: Statistical Decision Theory, Bayesian Learning (ML, MAP, Bayes

estimates, Conjugate priors)

Regression : Linear Regression, Ridge Regression, Lasso

Dimensionality Reduction : Principal Component Analysis, Partial Least Squares

Classification : Linear Classification, Logistic Regression, Linear Discriminant

Analysis, Quadratic Discriminant Analysis, Perceptron, Support Vector Machines +

Kernels, Artificial Neural Networks + BackPropagation, Decision Trees, Bayes

Optimal Classifier, Naive Bayes.

Evaluation measures : Hypothesis testing, Ensemble Methods, Bagging, Adaboost,
Gradient Boosting, Clustering, K-means, K-medoids, Density-based, Hierarchical,
Spectral

Miscellaneous topics: Expectation Maximization, GMMs, Learning theory, Intro to
Reinforcement Learning

Graphical Models: Bayesian Networks.

Machine Learning

CS771A

Autumn 2016

Instructor: Piyush Rai: (office: KD-319, email: piyush AT cse DOT iitk DOT ac DOT in)

Office Hours: Tuesday 12-1pm (or by appointment)

Q/A Forum: Piazza (please register)

Class Location: L-16 (lecture hall complex)

Timings: WF 6:00-7:30pm

Background and Course Description

Machine Learning is the discipline of designing algorithms that allow machines (e.g.,

a computer) to learn patterns and concepts from data without being explicitly

programmed. This course will be an introduction to the design (and some analysis)

of Machine Learning algorithms, with a modern outlook, focusing on the recent

advances, and examples of real-world applications of Machine Learning algorithms.

This is supposed to be the first ("intro") course in Machine Learning. No prior

exposure to Machine Learning will be assumed. At the same time, please be aware

that this is NOT a course about toolkits/software/APIs used in applications of

Machine Learning, but rather on the principles and foundations of Machine Learning

algorithms, delving deeper to understand what goes on "under the hood", and how

Machine Learning problems are formulated and solved.

Pre-requisites

MSO201A/equivalent, CS210/ESO211/ESO207A; Ability to program in

MATLAB/Octave. In some cases, pre-requisites may be waived (will need instructor's

consent).

Grading

There will be 4 homework assignments (total 40%) which may include a

programming component, a mid-term (20%), a final-exam (20%), and a course

project (20%)

Reference materials

There will not be any dedicated textbook for this course. In lieu of that, we will have

lecture slides/notes, monographs, tutorials, and papers for the topics that will be

covered in this course. Some recommended, although not required, reference books

are listed below (in no particular order):

Learning, Springer, 2009 (freely available online)

Hal Daumé III, A Course in Machine Learning, 2015 (in preparation; most

chapters freely available online)

2007.

From Theory to Algorithms, Cambridge University Press, 2014

Schedule (Tentative)

Date | Topics | Readings/References | Deadlines | Slides/Notes
July 28 | Course Logistics and Introduction to Machine Learning | Linear Algebra review, Probability review, Matrix Cookbook, MATLAB review, [JM15], [LBH15] | | slides

Supervised Learning
Aug 3 | Learning by Computing Distances: Distance from Means and Nearest Neighbors | Distance from Means, CIML Chapter 2 | | slides
Aug 5 | Learning by Asking Questions: Decision Tree based Classification and Regression | Book Chapter, Info Theory notes, DT - visual illustration | | slides
Aug 10 | Learning as Optimization, Linear Regression | useful resources on optimization for ML | | slides
Aug 12 | Modeling: Probabilistic Linear Regression | Murphy (MLAPP): Chapter 7 (sections 7.1-7.5) | | slides
Aug 17 | Modeling: Logistic and Softmax Regression | Murphy (MLAPP): Chapter 8 (sections 8.1-8.3) | | slides
Aug 19 | Stochastic Optimization, Perceptron | Murphy (MLAPP): Chapter 8 (section 8.5) | | slides
Aug 24 | Margin Hyperplanes: Support Vector Machines | SVM, Optional: Advanced Intro to SVM, SVM Solvers | | slides
Aug 26 | Nonlinear Learning with Kernels | 9.4), Murphy (MLAPP): Chapter 14 (up to section 14.4.3) | | slides

Unsupervised Learning
Aug 31 | Data Clustering, K-means and Kernel K-means | Bishop (PRML): Section 9.1. Optional reading: Data clustering: 50 years beyond k-means | HW 1 Due | slides
Sept 2 | Linear Dimensionality Reduction: Principal Component Analysis | Bishop (PRML): Section 12.1. Optional reading: PCA tutorial paper | | slides
Sept 7 | Nonlinear Dimensionality Reduction via Kernel PCA | Optional reading: Kernel PCA | | slides
Sept 21 | Matrix Factorization and Matrix Completion | Factorization for Recommender Systems, Scalable MF | | slides
Sept 23 | Introduction to Generative Models | | | slides
Sept 26 | Clustering: GMM and Intro to EM | Bishop (PRML): Section 9.2 and 9.3 (up to 9.3.2) | | slides (notes)
Sept 28 | Expectation Maximization and Generative Models for Dim. Reduction | Bishop (PRML): Section 9.3 (up to 9.3.2) and 9.4 | | slides
Oct 5 | Dim. Reduction: Probabilistic PCA and Factor Analysis | Bishop (PRML): Section 12.2 (up to 12.2.2). Optional reading: Mixtures of PPCA | HW 2 Due | slides

Assorted Topics
Oct 19 | Practical Issues: Model/Feature Selection, Evaluating and Debugging ML Algorithms | On Evaluation and Model Selection | | slides
Oct 24 | Introduction to Learning Theory | Mitchell ML Chapter 7 (sections 7.1-7.3.1, section 7.4 (up to 7.4.2)) | | slides
Oct 26 | Ensemble Methods: Bagging and Boosting | Brief Intro to Boosting, Explaining AdaBoost | | slides
Oct 28 | Learning | Optional: A (somewhat old but recommended) survey on SSL | |
Nov 2 | Feedforward Neural Nets and CNN | Neural Networks, Convolutional Neural Nets | HW 3 Due | slides
Nov 4 | Models for Sequence Data (RNN and LSTM) and Autoencoders | Optional Readings: RNN and LSTM, Understanding LSTMs, RNN and LSTM Review | | slides
Nov 5 | Imbalanced Data | | | slides
Nov 9 | Online Learning (Adversarial Model and Experts) | Optional Reading: Foundations of ML (Chapter 7) | | slides
Nov 11 | and Conclusions | | | slides

Useful Links

- Machine Learning Summer Schools

- Scikit-Learn: Machine Learning in Python

- Awesome Machine Learning (a comprehensive list of various Machine Learning
libraries and software)

IISc Bangalore

E0 270 (3:1) Machine Learning

shivani / Chiranjib Bhattacharyya / Indrajit Bhattacharya

support vector machines, VC-dimension. Regression: linear least squares regression, support

vector regression. Additional learning problems: multiclass classification, ordinal regression,

ranking. Ensemble methods: boosting. Probabilistic models: classification, regression, mixture

models (unconditional and conditional), parameter estimation, EM algorithm. Beyond IID,

directed graphical models: hidden Markov models, Bayesian networks. Beyond IID, undirected

graphical models: Markov random fields, conditional random fields. Learning and inference in

Bayesian networks and MRFs: parameter estimation, exact inference (variable elimination, belief

propagation), approximate inference (loopy belief propagation, sampling). Additional topics:

semi-supervised learning, active learning, structured prediction.

References:

Bishop. C M, Pattern Recognition and Machine Learning. Springer, 2006.

Edition, 2000.

Mining, Inference and Prediction. Springer, 2nd Edition, 2009.

Current literature.

Prerequisites

Probability and Statistics (or equivalent course elsewhere). Some background in linear

algebra and optimization will be helpful.

IIT Delhi

CSL341: Fundamentals of Machine Learning

General Information

Instructor: Parag Singla (email: parags AT cse.iitd.ac.in)

Teaching Assistants

Name Email

Happy Mittal csz138233 AT cse.iitd.ac.in

Announcements

[Thu Oct 31]: Assignment 2, New Due Date: Monday Nov 4 (11:50 pm).

[Mon Sep 30]: Assignment 2 is out! Due Date: Thursday Oct 31 (11:50 pm).

[Fri Sep 27]: Assignment submission instructions have been updated (See

below).

[Wed Sep 25]: Assignment 1 has been updated. New Due Date: Sunday Sep

29 (11:50 pm).

[Wed Sep 4]: The venue for the class on Thursday Sep 5 will be Bharti 101

(instead of WS 101).

[Sat Aug 10]: Assignment 1 is out! Due Date: Sunday Sep 15 (11:50 pm).

Course Content

Week | Topic | Book Chapters | Supplementary Notes
1 | Introduction | Duda, Chapter 1 |
2,3 | Linear and Logistic Regression, Gaussian Discriminant Analysis | Bishop, Chapter 3.1, 4 | lin-log-reg.pdf, gda.pdf
4,5 | Support Vector Machines | Bishop, Chapter 7.1 | svm.pdf
6 | Neural Networks | Mitchell, Chapter 4 | nnets.pdf nnets-hw.pdf
7 | Decision Trees | Mitchell, Chapter 3 | dtrees.pdf
 | Conjugate Prior | Chapter 6 | model.pdf
1 | Models, EM | | em.pdf
12 | PCA | | pca.pdf
13 | Learning Theory, Model Selection | Mitchell, Chapter 7 | theory.pdf model.pdf
14 | Application of ML to CrowdSourcing and NLP | | crowd-ml.pdf nlp-ml.pdf

Additional Reading

Quinlan)

Classifier

Review Material

Topic Notes

Probability prob.pdf

References

Springer, 2006.

2. Pattern Classification. Richard Duda, Peter Hart and David Stork. Second

Edition, Wiley-Interscience, 2000.

Assignment Submission Instructions

1. You are free to discuss the problems with other students in the class. You

should include the names of the people you had a significant discussion with

in your submission.

discussion notes or the code someone else would have written.

readability.

5. [Updated October 31, 2013]: Create a separate directory for each of the

questions named by the question number. For instance, for question 1, all

your submissions files (code/graphs/write-up) should be put in the directory

named Q1 (and so on for other questions). Put all the Question sub-

directories in a single top level directory. This directory should be named as

"yourentrynumber_firstname_lastname". For example, if your entry number is

"2009anz7535" and your name is "Nilesh Pathak", your submission directory

should be named as "2009anz7535_nilesh_pathak". You should zip your

directory and name the resulting file as

"yourentrynumber_firstname_lastname.zip" e.g. in the above example it will

be "2009anz7535_nilesh_pathak.zip". This single zip file should be submitted

online.

assignment. More severe penalties may follow.

7. Late Policy: You will lose 20% for each late day in submission. Maximum of 2

days late submissions are allowed.
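
The naming convention in item 5 above can be expressed as a small helper. This is only an illustrative sketch: the function name and signature are hypothetical, and lowercasing the first and last name is inferred from the "Nilesh Pathak" example in the instructions.

```python
def submission_names(entry_number, first_name, last_name):
    """Return (directory name, zip file name) following the stated convention."""
    # Names are lowercased to match the example in the instructions;
    # the entry number is used exactly as given.
    base = "_".join([entry_number, first_name.lower(), last_name.lower()])
    return base, base + ".zip"


def question_directory(base, question_number):
    """Sub-directory for one question inside the top-level submission directory."""
    return f"{base}/Q{question_number}"


directory, archive = submission_names("2009anz7535", "Nilesh", "Pathak")
first_question = question_directory(directory, 1)
```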

Assignments

Datasets:

o Problem 1: q1_data.zip

o Problem 2: q2_data.zip

o Problem 3: q3_data.zip

o Original Version

Datasets

IIT Kharagpur

Machine Learning (CS60050)

Instructor: Sourangshu Bhattacharya

12:30)

Classroom: CSE-108

Website: http://cse.iitkgp.ac.in/~sourangshu/cs60050.html

Syllabus:

Basic Principles: Introduction, The concept learning task. General-to-specific ordering of

hypotheses. Version spaces. Inductive bias. Experimental Evaluation: Over-fitting, Cross-

Validation.

Supervised Learning: Decision Tree Learning. Instance-Based Learning: k-Nearest neighbor

algorithm, Support Vector Machines, Ensemble learning: boosting, bagging. Artificial Neural

Networks: Linear threshold units, Perceptrons, Multilayer networks and back-propagation.

Probabilistic Models: Maximum Likelihood Estimation, MAP, Bayes Classifiers Naive Bayes.

Bayes optimal classifiers. Minimum description length principle. Bayesian Networks, Inference

in Bayesian Networks, Bayes Net Structure Learning.

Unsupervised Learning: K-means and Hierarchical Clustering, Gaussian Mixture Models, EM

algorithm, Hidden Markov Models.

Computational Learning Theory: probably approximately correct (PAC) learning. Sample

complexity. Computational complexity of training. Vapnik - Chervonenkis dimension,

Reinforcement Learning.

Textbooks:

2006.

Richard O. Duda, Peter E. Hart, David G. Stork. Pattern Classification. John

Wiley & Sons, 2006.
