
Lecture 3: Analytic Geometry

Yi, Yung (이융)

Mathematics for Machine Learning


https://yung-web.github.io/home/courses/mathml.html
KAIST EE

April 8, 2021

Roadmap

(1) Norms
(2) Inner Products
(3) Lengths and Distances
(4) Angles and Orthogonality
(5) Orthonormal Basis
(6) Orthogonal Complement
(7) Inner Product of Functions
(8) Orthogonal Projections
(9) Rotations



Norm

• A notion of the length of vectors


• Definition. A norm on a vector space V is a function ‖·‖ : V → R such that for all x, y ∈ V and all λ ∈ R the following hold:
◦ Absolutely homogeneous: ‖λx‖ = |λ| ‖x‖
◦ Triangle inequality: ‖x + y‖ ≤ ‖x‖ + ‖y‖
◦ Positive definite: ‖x‖ ≥ 0, and ‖x‖ = 0 ⇐⇒ x = 0



Example for V = Rⁿ

• Manhattan norm (also called ℓ₁ norm). For x = [x₁, · · · , xₙ]ᵀ ∈ Rⁿ,

  ‖x‖₁ := Σ_{i=1}^{n} |xᵢ|

• Euclidean norm (also called ℓ₂ norm). For x ∈ Rⁿ,

  ‖x‖₂ := ( Σ_{i=1}^{n} xᵢ² )^{1/2} = √(xᵀx)
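A quick numerical check of both norms (a minimal NumPy sketch, ours rather than part of the original slides):

```python
import numpy as np

x = np.array([1.0, -2.0, 2.0])

l1 = np.sum(np.abs(x))        # Manhattan norm: sum of absolute entries
l2 = np.sqrt(x @ x)           # Euclidean norm: sqrt(x^T x)

print(l1, l2)                      # 5.0 3.0
print(np.linalg.norm(x, ord=1))    # 5.0, same via NumPy
print(np.linalg.norm(x, ord=2))    # 3.0
```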





Motivation

• We need to talk about the length of a vector and the angle or distance between two vectors, where the vectors are defined in abstract vector spaces.
• To this end, we define the notion of inner product in an abstract manner.
◦ Dot product: a kind of inner product in the vector space Rⁿ: xᵀy = Σ_{i=1}^{n} xᵢyᵢ

• Question. How can we generalize this and do a similar thing in some other vector
spaces?



Formal Definition

• An inner product is a mapping ⟨·, ·⟩ : V × V → R that satisfies the following conditions for all vectors u, v, w ∈ V and all scalars λ ∈ R:
1. ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩
2. ⟨λv, w⟩ = λ⟨v, w⟩
3. ⟨v, w⟩ = ⟨w, v⟩
4. ⟨v, v⟩ ≥ 0, with equality iff v = 0

• The pair (V, ⟨·, ·⟩) is called an inner product space.



Example

• Example. V = Rⁿ and the dot product ⟨x, y⟩ := xᵀy

• Example. V = R² and ⟨x, y⟩ := x₁y₁ − (x₁y₂ + x₂y₁) + 2x₂y₂

• Example. V = {real-valued continuous functions on [a, b]}, ⟨u, v⟩ := ∫_a^b u(x)v(x) dx
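The second example can be rewritten as ⟨x, y⟩ = xᵀAy for a symmetric, positive definite A, which anticipates the next slides. A small sketch (our own restatement, not from the slides):

```python
import numpy as np

# <x, y> = x1*y1 - (x1*y2 + x2*y1) + 2*x2*y2, rewritten as x^T A y
A = np.array([[1.0, -1.0],
              [-1.0, 2.0]])

def inner(x, y):
    return x @ A @ y

x, y = np.array([1.0, 1.0]), np.array([2.0, 3.0])
print(inner(x, y))              # 2 - (3 + 2) + 6 = 3.0
print(np.linalg.eigvalsh(A))    # both eigenvalues > 0: A is positive definite
```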



Symmetric, Positive Definite Matrix

• A symmetric square matrix A ∈ Rⁿˣⁿ that satisfies the following is called symmetric, positive definite (or just positive definite):

  ∀x ∈ Rⁿ \ {0} : xᵀAx > 0.

  If only ≥ holds above, then A is called symmetric, positive semidefinite.

 
• A₁ = [9 6; 6 5] is positive definite.

• A₂ = [9 6; 6 3] is not positive definite.
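A sketch checking these two matrices numerically (using the fact that a symmetric matrix is positive definite iff all its eigenvalues are positive):

```python
import numpy as np

A1 = np.array([[9.0, 6.0], [6.0, 5.0]])
A2 = np.array([[9.0, 6.0], [6.0, 3.0]])

print(np.linalg.eigvalsh(A1))   # all > 0  -> positive definite
print(np.linalg.eigvalsh(A2))   # one < 0  -> not positive definite

# Explicit witness for A2: the quadratic form goes negative.
x = np.array([1.0, -2.0])
print(x @ A2 @ x)               # 9 - 12 - 12 + 12 = -3.0
```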



Inner Product and Positive Definite Matrix (1)

• Consider an n-dimensional vector space V with an inner product ⟨·, ·⟩ and an ordered basis B = (b₁, . . . , bₙ) of V.

• Any x, y ∈ V can be represented as x = Σ_{i=1}^{n} ψᵢbᵢ and y = Σ_{j=1}^{n} λⱼbⱼ for some ψᵢ and λⱼ, i, j = 1, . . . , n. Then

  ⟨x, y⟩ = ⟨ Σ_{i=1}^{n} ψᵢbᵢ , Σ_{j=1}^{n} λⱼbⱼ ⟩ = Σ_{i=1}^{n} Σ_{j=1}^{n} ψᵢ ⟨bᵢ, bⱼ⟩ λⱼ = x̂ᵀAŷ,

  where Aᵢⱼ = ⟨bᵢ, bⱼ⟩ and x̂, ŷ are the coordinates of x and y w.r.t. B.



Inner Product and Positive Definite Matrix (2)

• Then, if ∀x̂ ∈ Rⁿ \ {0} : x̂ᵀAx̂ > 0 (i.e., A is symmetric, positive definite), x̂ᵀAŷ legitimately defines an inner product (w.r.t. B).

• Properties
◦ The kernel of A is only {0}: xᵀAx > 0 for all x ≠ 0 implies Ax ≠ 0 if x ≠ 0.

◦ The diagonal elements aᵢᵢ of A are all positive, because aᵢᵢ = eᵢᵀAeᵢ > 0.





Length

• An inner product naturally induces a norm by defining

  ‖x‖ := √⟨x, x⟩

• Not every norm is induced by an inner product (the Manhattan norm is one example)

• Cauchy–Schwarz inequality. For the norm induced by the inner product,

  |⟨x, y⟩| ≤ ‖x‖ ‖y‖
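A quick empirical check of the induced norm and Cauchy–Schwarz, reusing the positive definite A from the earlier example (a sketch, not a proof):

```python
import numpy as np

A = np.array([[1.0, -1.0], [-1.0, 2.0]])    # symmetric, positive definite
inner = lambda x, y: x @ A @ y
norm = lambda x: np.sqrt(inner(x, x))       # the induced norm

rng = np.random.default_rng(0)
for _ in range(1000):
    x, y = rng.normal(size=2), rng.normal(size=2)
    assert abs(inner(x, y)) <= norm(x) * norm(y) + 1e-12
print("Cauchy-Schwarz held on 1000 random pairs")
```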



Distance

• Now, we can introduce a notion of distance using a norm:

  Distance. d(x, y) := ‖x − y‖ = √⟨x − y, x − y⟩

• If the dot product is used as the inner product in Rⁿ, this is the Euclidean distance.
• Note. The distance between two vectors does NOT necessarily require the notion of a norm. A norm is just sufficient.
• Generally, any function d satisfying the following is a suitable notion of distance, called a metric:
◦ Positive definite. d(x, y) ≥ 0 for all x, y, and d(x, y) = 0 ⇐⇒ x = y
◦ Symmetric. d(x, y) = d(y, x)
◦ Triangle inequality. d(x, z) ≤ d(x, y) + d(y, z)



Angle, Orthogonal, and Orthonormal

• Using the Cauchy–Schwarz inequality,

  −1 ≤ ⟨x, y⟩ / (‖x‖ ‖y‖) ≤ 1

• Then, there exists a unique ω ∈ [0, π] with

  cos ω = ⟨x, y⟩ / (‖x‖ ‖y‖)

• We define ω as the angle between x and y.
• Definition. If ⟨x, y⟩ = 0, in other words their angle is π/2, we say that they are orthogonal, denoted by x ⊥ y. Additionally, if ‖x‖ = ‖y‖ = 1, they are orthonormal.



Example

• Orthogonality is defined by a given inner product. Thus, different inner products


may lead to different results about orthogonality.
   
• Example. Consider two vectors x = [1, 1]ᵀ and y = [−1, 1]ᵀ.

• Using the dot product as the inner product, they are orthogonal: xᵀy = 0.

• However, using ⟨x, y⟩ = xᵀ [2 0; 0 1] y, they are not orthogonal:

  cos ω = ⟨x, y⟩ / (‖x‖ ‖y‖) = −1/3 ⇒ ω ≈ 1.91 rad ≈ 109.5°
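The same computation in NumPy (a small sketch of the two inner products):

```python
import numpy as np

x = np.array([1.0, 1.0])
y = np.array([-1.0, 1.0])

print(x @ y)                       # 0.0: orthogonal w.r.t. the dot product

A = np.array([[2.0, 0.0],
              [0.0, 1.0]])
inner = lambda u, v: u @ A @ v
cos_w = inner(x, y) / np.sqrt(inner(x, x) * inner(y, y))
print(cos_w)                                             # -0.333...
print(np.arccos(cos_w), np.degrees(np.arccos(cos_w)))    # ~1.91 rad, ~109.47 deg
```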



Orthogonal Matrix

• Definition. A square matrix A ∈ Rⁿˣⁿ is an orthogonal matrix iff its columns (equivalently, its rows) are orthonormal, so that

  AAᵀ = I = AᵀA, implying A⁻¹ = Aᵀ.

◦ We can use A⁻¹ = Aᵀ as the definition of orthogonal matrices.
◦ Fact 1. A, B orthogonal ⇒ AB orthogonal
◦ Fact 2. A orthogonal ⇒ det(A) = ±1
• The linear mapping Φ(x) = Ax with an orthogonal matrix A preserves length and angle (for the dot product):

  ‖Φ(x)‖² = ‖Ax‖² = (Ax)ᵀ(Ax) = xᵀAᵀAx = xᵀx = ‖x‖²

  cos ω = (Ax)ᵀ(Ay) / (‖Ax‖ ‖Ay‖) = xᵀAᵀAy / √(xᵀAᵀAx · yᵀAᵀAy) = xᵀy / (‖x‖ ‖y‖)
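A numerical illustration with a random orthogonal matrix obtained from a QR decomposition (a sketch; any orthogonal Q would do):

```python
import numpy as np

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # Q has orthonormal columns

x, y = rng.normal(size=3), rng.normal(size=3)
cos = lambda u, v: u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

print(np.allclose(Q.T @ Q, np.eye(3)))                        # True
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))   # length preserved
print(np.isclose(cos(Q @ x, Q @ y), cos(x, y)))               # angle preserved
print(np.isclose(abs(np.linalg.det(Q)), 1.0))                 # det = +-1
```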





Orthonormal Basis

• A basis that is orthonormal, i.e., the basis vectors are all orthogonal to each other and each has length 1.
• The standard basis {e₁, . . . , eₙ} of Rⁿ is orthonormal.
• Question. How can we obtain an orthonormal basis?

1. Use Gaussian elimination to find a basis for the vector space spanned by a set of vectors, as shown below.
◦ Given a set {b₁, . . . , bₙ} of non-orthogonal, unnormalized basis vectors, apply Gaussian elimination to the augmented matrix (BBᵀ | B), where B is the matrix whose columns are the bᵢ.

2. Constructive way: Gram-Schmidt process (we will cover this later)



Orthogonal Complement (1)

• Consider a D-dimensional vector space V and an M-dimensional subspace U ⊆ V. The orthogonal complement U⊥ is the (D − M)-dimensional subspace of V that contains all vectors in V orthogonal to every vector in U.

• U ∩ U⊥ = {0}

• Any vector x ∈ V can be uniquely decomposed into

  x = Σ_{m=1}^{M} λₘbₘ + Σ_{j=1}^{D−M} ψⱼb⊥ⱼ,  λₘ, ψⱼ ∈ R,

  where (b₁, . . . , b_M) and (b⊥₁, . . . , b⊥_{D−M}) are bases of U and U⊥, respectively.



Orthogonal Complement (2)

• If U is a hyperplane (M = D − 1), the vector w with ‖w‖ = 1 that is orthogonal to U is a basis of U⊥.
• Such a w is called the normal vector of U.
• For a linear mapping represented by a matrix A ∈ Rᵐˣⁿ, the solution space of Ax = 0 is row(A)⊥, where row(A) is the row space of A (i.e., the span of its row vectors). In other words, ker(A) = row(A)⊥.
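A sketch illustrating ker(A) = row(A)⊥ numerically (assuming SciPy is available for null_space):

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # rank 2, so ker(A) is 1-dimensional in R^3

N = null_space(A)                 # orthonormal basis of ker(A), shape (3, 1)
print(np.allclose(A @ N, 0))      # True: null-space vectors solve Ax = 0,
                                  # i.e., they are orthogonal to every row of A
```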



Inner Product of Functions

• Recall: for V = {real-valued continuous functions on [a, b]}, the following is a proper inner product:

  ⟨u, v⟩ := ∫_a^b u(x)v(x) dx

• Example. Choose u(x) = sin(x) and v(x) = cos(x), where we select a = −π and b = π. Then, since f(x) = u(x)v(x) is odd (i.e., f(−x) = −f(x)),

  ∫_{−π}^{π} u(x)v(x) dx = 0.

• Thus, u and v are orthogonal.

• Similarly, {1, cos(x), cos(2x), cos(3x), . . .} is orthogonal over [−π, π].
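A numerical check using SciPy's quad (a sketch; quad returns the integral value and an error estimate):

```python
import numpy as np
from scipy.integrate import quad

# <u, v> := integral of u(x) v(x) over [-pi, pi]
inner = lambda u, v: quad(lambda x: u(x) * v(x), -np.pi, np.pi)[0]

print(inner(np.sin, np.cos))                   # ~0: orthogonal
print(inner(np.cos, lambda x: np.cos(2 * x)))  # ~0: orthogonal
print(inner(np.cos, np.cos))                   # ~pi: cos is not orthogonal to itself
```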





Projection: Motivation
• Big data is high dimensional.
• However, most of the information is often contained in only a few dimensions.
• Projection: a process of reducing the dimension while (hopefully) not losing much information¹
• Example. Projection of a 2D dataset onto a 1D subspace.

¹In L10, we will formally study this with the topic of PCA (Principal Component Analysis).
Projection onto Lines (1D Subspaces)

• Consider a 1D subspace U ⊆ Rⁿ spanned by a basis vector b.

• For x ∈ Rⁿ, what is its projection πU(x) onto U (assuming the dot product)?

  Writing πU(x) = λb, the orthogonality condition ⟨x − πU(x), b⟩ = 0 becomes ⟨x − λb, b⟩ = 0

  ⇒ λ = ⟨b, x⟩ / ‖b‖² = bᵀx / ‖b‖², and πU(x) = λb = (bᵀx / ‖b‖²) b

• Projection matrix Pπ ∈ Rⁿˣⁿ with πU(x) = Pπx:

  πU(x) = λb = b λ = (bbᵀ / ‖b‖²) x,  so  Pπ = bbᵀ / ‖b‖²



Inner Product and Projection
• We project x onto b, and let πb(x) be the projected vector.
• Question. How can we understand the inner product ⟨x, b⟩ from the projection perspective?

  ⟨x, b⟩ = ‖πb(x)‖ × ‖b‖   (when the angle between x and b is acute; with a minus sign otherwise)

• In other words, up to sign, the inner product of x and b is the product of (the length of the projection of x onto b) × (the length of b).



Example

• For b = [1, 2, 2]ᵀ:

  Pπ = bbᵀ / ‖b‖² = (1/9) [1 2 2; 2 4 4; 2 4 4]

• For x = [1, 1, 1]ᵀ:

  πU(x) = Pπx = (1/9) [5, 10, 10]ᵀ ∈ span[ [1, 2, 2]ᵀ ]
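Verifying this example numerically (sketch):

```python
import numpy as np

b = np.array([1.0, 2.0, 2.0])
P = np.outer(b, b) / (b @ b)     # P = b b^T / ||b||^2

x = np.array([1.0, 1.0, 1.0])
print(P @ x)                      # [5/9, 10/9, 10/9]
print(np.allclose(P @ P, P))      # True: projection matrices are idempotent
```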



Projection onto General Subspaces

• Rⁿ → 1-dim subspace, with a basis vector b:

  λ = bᵀx / bᵀb,  πU(x) = b bᵀx / bᵀb,  Pπ = bbᵀ / bᵀb

• Rⁿ → m-dim subspace (m < n), with a basis matrix B = [b₁, · · · , bₘ] ∈ Rⁿˣᵐ:

  λ = (BᵀB)⁻¹Bᵀx,  πU(x) = B(BᵀB)⁻¹Bᵀx,  Pπ = B(BᵀB)⁻¹Bᵀ

• λ ∈ R¹ and λ ∈ Rᵐ are the coordinates of the projections w.r.t. the respective bases.

• (BᵀB)⁻¹Bᵀ is called the pseudo-inverse of B (this requires B to have full column rank).
• The derivation is analogous to the case of 1D lines (see p. 71).



Example: Projection onto 2D Subspace

• U = span[ [1, 1, 1]ᵀ, [0, 1, 2]ᵀ ] ⊂ R³ and x = [6, 0, 0]ᵀ. Check that { [1, 1, 1]ᵀ, [0, 1, 2]ᵀ } is a basis.

• Let B = [1 0; 1 1; 1 2]. Then BᵀB = [3 3; 3 5].

• We can see that Pπ = B(BᵀB)⁻¹Bᵀ = (1/6) [5 2 −1; 2 2 2; −1 2 5], and

  πU(x) = Pπx = (1/6) [5 2 −1; 2 2 2; −1 2 5] [6, 0, 0]ᵀ = [5, 2, −1]ᵀ
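Verifying the example numerically (a sketch; the lstsq route is our own addition, shown as a numerically friendlier alternative to forming (BᵀB)⁻¹ explicitly):

```python
import numpy as np

B = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
x = np.array([6.0, 0.0, 0.0])

# Projection onto the column space of B: P = B (B^T B)^{-1} B^T
P = B @ np.linalg.inv(B.T @ B) @ B.T
print(P @ x)                                  # [5, 2, -1]

# Equivalent route via least squares:
lam, *_ = np.linalg.lstsq(B, x, rcond=None)   # coordinates lambda in the subspace
print(B @ lam)                                # [5, 2, -1]
```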



Gram-Schmidt Orthogonalization Method (G-S method)
• Constructively transform any basis (b₁, . . . , bₙ) of an n-dimensional vector space V into an orthogonal/orthonormal basis (u₁, . . . , uₙ) of V
• Iteratively construct as follows:

  u₁ := b₁
  uₖ := bₖ − π_{span[u₁, . . . , u_{k−1}]}(bₖ),  k = 2, . . . , n  (∗)



Example: G-S method

   
• A basis (b₁, b₂) of R², with b₁ = [2, 0]ᵀ and b₂ = [1, 1]ᵀ

• u₁ = b₁ = [2, 0]ᵀ, and

  u₂ = b₂ − π_{span[u₁]}(b₂) = b₂ − (u₁u₁ᵀ / ‖u₁‖²) b₂ = [1, 1]ᵀ − [1 0; 0 0] [1, 1]ᵀ = [0, 1]ᵀ

• u₁ and u₂ are orthogonal. If we want them to be orthonormal, normalization would do the job.
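A minimal Gram-Schmidt sketch reproducing this example (our own implementation, not the lecture's code; classical G-S without pivoting or rank checks):

```python
import numpy as np

def gram_schmidt(B):
    """Columns of B -> orthogonal columns spanning the same space."""
    U = np.zeros_like(B, dtype=float)
    for k in range(B.shape[1]):
        u = B[:, k].copy()
        for j in range(k):                        # subtract projections onto u_1..u_{k-1}
            uj = U[:, j]
            u -= (uj @ B[:, k]) / (uj @ uj) * uj
        U[:, k] = u
    return U

B = np.array([[2.0, 1.0],
              [0.0, 1.0]])                        # columns b1, b2 from the example
U = gram_schmidt(B)
print(U)                                          # columns [2, 0] and [0, 1]
print(U / np.linalg.norm(U, axis=0))              # normalize for an orthonormal basis
```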



Projection onto Affine Subspaces

• Affine subspace: L = x₀ + U
• Affine subspaces are not vector spaces (they do not contain 0 unless x₀ ∈ U)
• Idea: (i) shift x by −x₀ so that the problem reduces to U, (ii) do the projection in U, (iii) shift back to L:

  πL(x) = x₀ + πU(x − x₀)
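A toy sketch of the shift-project-shift recipe (the numbers here are our own, purely for illustration):

```python
import numpy as np

# Project x onto the affine line L = x0 + span{b}
x0 = np.array([1.0, 1.0])
b = np.array([1.0, 0.0])
x = np.array([3.0, 4.0])

P = np.outer(b, b) / (b @ b)      # projector onto U = span{b}
pi_L = x0 + P @ (x - x0)          # shift, project, shift back
print(pi_L)                        # [3, 1]: the closest point on the line y = 1
```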





Rotation

• Length and angle preservation: two properties of linear mappings with orthogonal
matrices. Let’s look at some of their special cases.
• A linear mapping that rotates the given coordinate system by an angle θ.
• Basis change:

  e₁ = [1, 0]ᵀ → [cos θ, sin θ]ᵀ and e₂ = [0, 1]ᵀ → [−sin θ, cos θ]ᵀ

• Rotation matrix:

  R(θ) = [cos θ  −sin θ; sin θ  cos θ]

• Properties
◦ Preserves distance: ‖x − y‖ = ‖R(θ)x − R(θ)y‖
◦ Preserves angle
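A small NumPy illustration of R(θ) and its properties (sketch):

```python
import numpy as np

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

R = rotation(np.pi / 4)                             # rotate by 45 degrees
x, y = np.array([1.0, 0.0]), np.array([0.0, 1.0])

print(R @ x)                                        # [cos 45, sin 45]
print(np.isclose(np.linalg.norm(R @ x - R @ y),     # distance preserved
                 np.linalg.norm(x - y)))            # True
print(np.allclose(R.T @ R, np.eye(2)))              # R is orthogonal
```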



Questions?



Review Questions

1)

