Brief Introduction
Outline of lecture
Course information
Survey
Course Information
Reference books
Grading
Programming language
Matlab
Tutorials
http://www.math.ufl.edu/help/matlab-tutorial/
http://www.math.mtu.edu/~msgocken/intro/node1.htm
R language
Or other languages
Face recognition
Bioinformatics
Semi-supervised learning
Data mining is
Different focuses
Clustering
Intra-cluster distances are minimized
Inter-cluster distances are maximized
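The two clustering criteria above can be made concrete with a small sketch. The points and the two labelings below are hypothetical toy data, and the helper name `mean_intra_inter` is made up for illustration; the sketch only compares average pairwise distances and is not a clustering algorithm.

```python
import math
from itertools import combinations

def mean_intra_inter(points, labels):
    """Return (mean intra-cluster distance, mean inter-cluster distance)."""
    intra, inter = [], []
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])
        # Pairs with the same label count as intra-cluster, others as inter-cluster.
        (intra if labels[i] == labels[j] else inter).append(d)
    return sum(intra) / len(intra), sum(inter) / len(inter)

# Hypothetical toy data: two well-separated groups in the plane.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
good = [0, 0, 0, 1, 1, 1]   # labeling that follows the two groups
bad = [0, 1, 0, 1, 0, 1]    # labeling that mixes the two groups

intra_good, inter_good = mean_intra_inter(points, good)
intra_bad, inter_bad = mean_intra_inter(points, bad)
print(intra_good, inter_good)
print(intra_bad, inter_bad)
```

A labeling that follows the natural groups gives a small intra-cluster average and a large inter-cluster average; the mixed labeling pulls the two averages toward each other.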
Applications of Cluster Analysis
Understanding
Clustering precipitation in Australia
Classification: Definition
Classification Example
Attribute types: Refund (categorical), Marital Status (categorical), Taxable Income (continuous), Cheat (class)

Training Set
Refund  Marital Status  Taxable Income  Cheat
Yes     Single          125K            No
No      Married         100K            No
No      Single          70K             No
Yes     Married         120K            No
No      Divorced        95K             Yes
No      Married         60K             No
Yes     Divorced        220K            No
No      Single          85K             Yes
No      Married         75K             No
No      Single          90K             Yes

Test Set
Refund  Marital Status  Taxable Income  Cheat
No      Single          75K             ?
Yes     Married         50K             ?
No      Married         150K            ?
Yes     Divorced        90K             ?
No      Single          40K             ?
No      Married         80K             ?

Training Set -> Learn Classifier -> Model
Test Set -> Model
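The training set / classifier / test set workflow of this example can be sketched in a few lines. The records are those of the example table; the choice of a 1-nearest-neighbor rule, the mixed distance, and the names `learn_classifier`, `distance`, and `INCOME_SCALE` are assumptions made for illustration, not the method used in the lecture.

```python
# Records from the example table: (refund, marital_status, income_in_K, cheat).
TRAIN = [
    ("Yes", "Single", 125, "No"),
    ("No", "Married", 100, "No"),
    ("No", "Single", 70, "No"),
    ("Yes", "Married", 120, "No"),
    ("No", "Divorced", 95, "Yes"),
    ("No", "Married", 60, "No"),
    ("Yes", "Divorced", 220, "No"),
    ("No", "Single", 85, "Yes"),
    ("No", "Married", 75, "No"),
    ("No", "Single", 90, "Yes"),
]
# Test records have no class label: (refund, marital_status, income_in_K).
TEST = [
    ("No", "Single", 75),
    ("Yes", "Married", 50),
    ("No", "Married", 150),
    ("Yes", "Divorced", 90),
    ("No", "Single", 40),
    ("No", "Married", 80),
]

INCOME_SCALE = 100.0  # assumed scale so income is comparable to a category mismatch

def distance(a, b):
    """Hypothetical mixed distance: category mismatches plus scaled income gap."""
    categorical = (a[0] != b[0]) + (a[1] != b[1])
    continuous = abs(a[2] - b[2]) / INCOME_SCALE
    return categorical + continuous

def learn_classifier(train):
    """'Learn' a 1-nearest-neighbor model: it simply remembers the training set."""
    def model(record):
        nearest = min(train, key=lambda row: distance(row, record))
        return nearest[3]  # class label of the closest training record
    return model

model = learn_classifier(TRAIN)
predictions = [model(record) for record in TEST]
train_accuracy = sum(model(row[:3]) == row[3] for row in TRAIN) / len(TRAIN)
print(predictions, train_accuracy)
```

A 1-nearest-neighbor model always scores perfectly on its own training set, which is why accuracy has to be estimated on records the model has not seen.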
Classification: Application
Fraud Detection
Character Recognition
Given a digit representation, what is its class?
Other applications
Face recognition
Protein function prediction
Cancer detection
Document categorization
Data representation
Original Space
Feature Space
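One common reading of the original space / feature space distinction is that a mapping can make classes easier to separate. The sketch below assumes that reading; the mapping `feature_map`, the toy points, and the threshold are all hypothetical and chosen only so that points inside and outside the unit circle become separable by a straight line.

```python
def feature_map(x1, x2):
    """Hypothetical mapping from the original space to a feature space."""
    return (x1 * x1, x2 * x2)

# Toy data in the original space: no straight line separates the two classes.
inner = [(0.5, 0.0), (0.0, -0.5), (0.3, 0.3)]   # inside the unit circle
outer = [(2.0, 0.0), (0.0, -2.0), (1.5, 1.5)]   # outside the unit circle

def linear_rule(z):
    """A linear decision rule in the feature space: z1 + z2 > 1."""
    return "outer" if z[0] + z[1] > 1.0 else "inner"

inner_labels = [linear_rule(feature_map(*p)) for p in inner]
outer_labels = [linear_rule(feature_map(*p)) for p in outer]
print(inner_labels, outer_labels)
```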
Data integration
mRNA expression data
hydrophobicity data
protein-protein interaction data
sequence data (gene, protein)
Genome-wide data
Curse of dimensionality
Strategies
Feature reduction
Feature selection
Kernel learning
Model selection
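One symptom of the curse of dimensionality is that distances between random points become nearly indistinguishable as the dimension grows. The sketch below illustrates this under stated assumptions: points are drawn uniformly from the unit cube, and the function name `relative_contrast`, the sample size, and the seed are arbitrary choices for illustration.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from one random query point."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    return (max(dists) - min(dists)) / min(dists)

contrast_2d = relative_contrast(2)
contrast_1000d = relative_contrast(1000)
print(contrast_2d, contrast_1000d)
```

In low dimensions the nearest neighbor is much closer than the farthest point; in high dimensions the two are almost equally far away, which is one motivation for the feature reduction and feature selection strategies listed above.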
Computer vision, information retrieval, image processing, bioinformatics, text mining, web mining, etc.
Course schedule
Weeks 1-6:
Introduction
Data Types
Classification
Evaluation
Preprocessing
Week 7: Midterm Exam
Weeks 8-11:
Clustering
Semi-supervised Learning
Advanced Topics
Weeks 12-14: Presentations
Week 15: Final Exam
Survey
What topics are you most interested in learning about from this course?