
Machine Learning

Math Essentials

Jeff Howbert, Introduction to Machine Learning, Winter 2012


Areas of math essential to machine learning

Machine learning is part of both statistics and computer science.
– Probability
– Statistical inference
– Validation
– Estimates of error, confidence intervals

Linear algebra
– Hugely useful for compact representation of linear transformations on data
– Dimensionality reduction techniques

Optimization theory



Why worry about the math?

There are lots of easy-to-use machine learning packages out there.
After this course, you will know how to apply several of the most general-purpose algorithms.

HOWEVER

To get really useful results, you need good mathematical intuitions about certain general machine learning principles, as well as the inner workings of the individual algorithms.



Why worry about the math?

These intuitions will allow you to:
– Choose the right algorithm(s) for the problem
– Make good choices on parameter settings, validation strategies
– Recognize over- or underfitting
– Troubleshoot poor / ambiguous results
– Put appropriate bounds of confidence / uncertainty on results
– Do a better job of coding algorithms or incorporating them into more complex analysis pipelines
Notation

a ∈ A      set membership: a is a member of set A
|B|        cardinality: number of items in set B
|| v ||    norm: length of vector v
∑          summation
∫          integral
ℝ          the set of real numbers
ℝ^n        real number space of dimension n
             n = 2 : plane or 2-space
             n = 3 : 3-(dimensional) space
             n > 3 : n-space or hyperspace



Notation

x, y, z, u, v   vector (bold, lower case)
A, B, X         matrix (bold, upper case)
y = f( x )      function (map): assigns unique value in range of y to each value in domain of x
dy / dx         derivative of y with respect to single variable x
y = f( x )      function on multiple variables, i.e. a vector of variables; function in n-space
∂y / ∂x_i       partial derivative of y with respect to element i of vector x
Linear algebra applications

1) Operations on or between vectors and matrices
2) Coordinate transformations
3) Dimensionality reduction
4) Linear regression
5) Solution of linear systems of equations
6) Many others

Applications 1) – 4) are directly relevant to this course. Today we’ll start with 1).



Why vectors and matrices?

Most common form of data organization for machine learning is a 2D array, where
– rows represent samples (records, items, datapoints)
– columns represent attributes (features, variables)

Refund   Marital Status   Taxable Income   Cheat
Yes      Single           125K             No
No       Married          100K             No
No       Single           70K              No
Yes      Married          120K             No
No       Divorced         95K              Yes
No       Married          60K              No
Yes      Divorced         220K             No
No       Single           85K              Yes
No       Married          75K              No
No       Single           90K              Yes

Natural to think of each sample as a vector of attributes, and the whole array as a matrix. (See the sketch below.)
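A minimal sketch of this layout in Python with NumPy (the numeric encoding of the categorical columns below is an illustrative assumption, not part of the original slides):

```python
import numpy as np

# Each row is one sample (record), each column one attribute (feature).
# Categorical attributes are encoded numerically purely for illustration:
# Refund: Yes=1 / No=0; Marital Status: Single=0, Married=1, Divorced=2.
X = np.array([
    [1, 0, 125],   # Yes, Single,  125K
    [0, 1, 100],   # No,  Married, 100K
    [0, 0,  70],   # No,  Single,   70K
])

print(X.shape)   # (3, 3): 3 samples (rows) x 3 attributes (columns)
print(X[0])      # one sample as a vector of attributes
print(X[:, 2])   # one attribute (Taxable Income) across all samples
```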



Vectors

Definition: an n-tuple of values (usually real numbers).
– n referred to as the dimension of the vector
– n can be any positive integer, from 1 to infinity
Can be written in column form or row form
– Column form is conventional
– Vector elements referenced by subscript

  x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}        x^T = ( x_1 \; \cdots \; x_n )

  (superscript T means "transpose")
Vectors

Can think of a vector as:


– a point in space or
– a directed line segment with a magnitude and
direction



Vector arithmetic

Addition of two vectors
– add corresponding elements
    z = x + y = ( x_1 + y_1 \; \cdots \; x_n + y_n )^T
– result is a vector

Scalar multiplication of a vector
– multiply each element by scalar
    y = a x = ( a x_1 \; \cdots \; a x_n )^T
– result is a vector

(Both operations are sketched in code below.)
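A minimal NumPy sketch of both operations:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

# Vector addition: corresponding elements are added; result is a vector.
z = x + y        # array([5., 7., 9.])

# Scalar multiplication: every element is scaled; result is a vector.
a = 2.0
w = a * x        # array([2., 4., 6.])

print(z, w)
```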



Vector arithmetic

Dot product of two vectors
– multiply corresponding elements, then add products
    a = x ⋅ y = ∑_{i=1}^{n} x_i y_i
– result is a scalar

Dot product alternative form
    a = x ⋅ y = || x || ⋅ || y || ⋅ cos(θ)
(diagram: vectors x and y with angle θ between them)
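A short NumPy sketch checking that the two forms agree:

```python
import numpy as np

x = np.array([3.0, 0.0])
y = np.array([2.0, 2.0])

# Elementwise form: multiply corresponding elements, then add the products.
a1 = np.dot(x, y)                                    # 6.0

# Alternative form: ||x|| ||y|| cos(theta).
cos_theta = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
a2 = np.linalg.norm(x) * np.linalg.norm(y) * cos_theta

print(np.isclose(a1, a2))                            # True
```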



Matrices

Definition: an m x n two-dimensional array of values (usually real numbers).
– m rows
– n columns

  A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}

Matrix referenced by two-element subscript
– first element in subscript is row
– second element in subscript is column
– example: A_{24} or a_{24} is element in second row, fourth column of A
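A tiny NumPy sketch of shape and subscript indexing (note that NumPy indices are 0-based, so the math notation a_{24} corresponds to A[1, 3]):

```python
import numpy as np

A = np.array([[2, 7, -1, 0, 3],
              [4, 6, -3, 1, 8]])   # m = 2 rows, n = 5 columns

print(A.shape)   # (2, 5)
print(A[1, 3])   # second row, fourth column (math notation a_24): 1
```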
Matrices

A vector can be regarded as a special case of a matrix, where one of the matrix dimensions = 1.
Matrix transpose (denoted ^T)
– swap columns and rows: row 1 becomes column 1, etc.
– m x n matrix becomes n x m matrix
– example:

  A = \begin{pmatrix} 2 & 7 & -1 & 0 & 3 \\ 4 & 6 & -3 & 1 & 8 \end{pmatrix}        A^T = \begin{pmatrix} 2 & 4 \\ 7 & 6 \\ -1 & -3 \\ 0 & 1 \\ 3 & 8 \end{pmatrix}
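The same example as a NumPy sketch:

```python
import numpy as np

A = np.array([[2, 7, -1, 0, 3],
              [4, 6, -3, 1, 8]])

# Transpose: rows become columns; the 2 x 5 matrix becomes 5 x 2.
print(A.T)
print(A.shape, A.T.shape)   # (2, 5) (5, 2)
```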
Matrix arithmetic

Addition of two matrices
– matrices must be same size
– add corresponding elements: c_{ij} = a_{ij} + b_{ij}
– result is a matrix of same size

  C = A + B = \begin{pmatrix} a_{11} + b_{11} & \cdots & a_{1n} + b_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} + b_{m1} & \cdots & a_{mn} + b_{mn} \end{pmatrix}

Scalar multiplication of a matrix
– multiply each element by scalar: b_{ij} = d \cdot a_{ij}
– result is a matrix of same size

  B = d \cdot A = \begin{pmatrix} d \cdot a_{11} & \cdots & d \cdot a_{1n} \\ \vdots & \ddots & \vdots \\ d \cdot a_{m1} & \cdots & d \cdot a_{mn} \end{pmatrix}
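Both operations as a NumPy sketch; NumPy applies them elementwise:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])

# Matrix addition: same-size matrices, corresponding elements added.
C = A + B        # [[ 6.,  8.], [10., 12.]]

# Scalar multiplication: each element multiplied by the scalar.
d = 3.0
D = d * A        # [[ 3.,  6.], [ 9., 12.]]

print(C, D, sep="\n")
```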



Matrix arithmetic

Matrix-matrix multiplication
– vector-matrix multiplication just a special case

TO THE BOARD!!

Multiplication is associative
  A ⋅ ( B ⋅ C ) = ( A ⋅ B ) ⋅ C
Multiplication is not commutative
  A ⋅ B ≠ B ⋅ A (generally)
Transposition rule:
  ( A ⋅ B )^T = B^T ⋅ A^T
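A quick numerical check of all three properties (a sketch using random matrices):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 4))
C = rng.standard_normal((4, 2))
D = rng.standard_normal((4, 4))

# Associative: A(BC) = (AB)C.
print(np.allclose(A @ (B @ C), (A @ B) @ C))   # True

# Not commutative (generally): BD != DB even for square matrices.
print(np.allclose(B @ D, D @ B))               # False

# Transposition rule: (AB)^T = B^T A^T.
print(np.allclose((A @ B).T, B.T @ A.T))       # True
```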



Matrix arithmetic

RULE: In any chain of matrix multiplications, the


column dimension of one matrix in the chain must
match the row dimension of the following matrix
in the chain.
Examples, with A 3 x 5, B 5 x 5, C 3 x 1:
Right:
  A ⋅ B ⋅ A^T    C^T ⋅ A ⋅ B    A^T ⋅ A ⋅ B    C ⋅ C^T ⋅ A
Wrong:
  A ⋅ B ⋅ A    C ⋅ A ⋅ B    A ⋅ A^T ⋅ B    C^T ⋅ C ⋅ A
(See the sketch below.)
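A NumPy sketch of the rule; a "wrong" chain raises a shape error:

```python
import numpy as np

A = np.ones((3, 5))
B = np.ones((5, 5))
C = np.ones((3, 1))

print((A @ B @ A.T).shape)   # (3, 3): 3x5 . 5x5 . 5x3
print((C.T @ A @ B).shape)   # (1, 5): 1x3 . 3x5 . 5x5

try:
    A @ B @ A                # (3x5 . 5x5) gives 3x5; 3x5 . 3x5 mismatches
except ValueError as e:
    print("shape mismatch:", e)
```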
Vector projection

Orthogonal projection of y onto x
– Can take place in any space of dimensionality ≥ 2
– Unit vector in direction of x is x / || x ||
– Length of projection of y in direction of x is || y || ⋅ cos(θ)
– Orthogonal projection of y onto x is the vector
    proj_x( y ) = x ⋅ || y || ⋅ cos(θ) / || x || = [ ( x ⋅ y ) / || x ||^2 ] x
  (using dot product alternative form)
(diagram: y and x with angle θ between them; proj_x( y ) lies along x)
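A minimal NumPy sketch of the projection formula:

```python
import numpy as np

x = np.array([3.0, 0.0])
y = np.array([2.0, 2.0])

# proj_x(y) = [ (x . y) / ||x||^2 ] x
proj = (np.dot(x, y) / np.dot(x, x)) * x
print(proj)                                    # [2. 0.]

# The residual y - proj is orthogonal to x, as a projection should be.
print(np.isclose(np.dot(y - proj, x), 0.0))    # True
```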



Optimization theory topics

Maximum likelihood
Expectation maximization
Gradient descent

