Convex Optimization
Based on the blog: http://machinelearningcoban.com
Last update:
July 4, 2017
Contents
0 Preface
18 Duality
51 Linear Algebra Review
Index
0 Preface
Hello everyone,
The language in this document is close to the language of the blog and is tightly linked to the other posts that I reference. Readers who have not read the blog should still be able to follow, because this material is an overview of Convex Optimization.
Copyright:
All content in this document, including the source code and illustrations (except quoted material), is copyrighted by me, Vũ Hữu Tiệp.
I would very much like the knowledge I write about in this blog to reach many people. However, I do not support any form of copying without attribution. Any source that reposts an article must state the blog name (Machine Learning cơ bản), the author's name (Vũ Hữu Tiệp), and include a link to the original article. Articles that quote more than 25% of the full text of any post on this blog are not permitted, except with the author's consent.
For anything related to copying, reposting, or using the articles, as well as discussion or collaboration, please contact me at: vuhuutiep@gmail.com.
The content of this blog is completely free. I do not use any advertising service either, because I do not want to disturb you while you read. However, if you find the blog and this document useful and would like to support them, you can buy me a coffee by clicking the "Buy me a coffee" button at the top of the blog's left column, choosing whichever kind of coffee you like to drink.
Best regards,
Vũ Hữu Tiệp
www.machinelearningcoban.com
Note:
Optimization problems generally have no universal solution method, and some have no known solution at all. Most solution methods cannot prove that the point they find is globally optimal, that is, that it truly minimizes or maximizes the function. Instead, the solutions found are usually local optima, i.e. extreme points. In many cases, locally optimal solutions still give good results.
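16.2.1 Definition
The notion of convex sets is probably familiar to Vietnamese students, since we have all heard of convex polygons. Put simply, "convex" means bulging or protruding outward. In mathematics, flat is also considered convex.
A few real-world examples:
The line segment joining two such points may contain a middle part that does not belong to the set under consideration. (A square without its boundary is still a convex set, but be careful with a partial boundary such as the one in this example.)
An arbitrary curve is not a convex set either, because the line segment joining two of its points clearly does not lie on the curve.
Below are some commonly encountered examples of convex sets.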
16.2.2 Examples
Hyperplanes and halfspaces
If
$$a^T x_1 = a^T x_2 = b$$
then for any $0 \le \theta \le 1$:
$$a^T x_\theta = a^T(\theta x_1 + (1 - \theta) x_2) = \theta b + (1 - \theta) b = b$$
Norm balls
Euclidean balls (a disk in the plane, a ball in three-dimensional space) are sets of points of the form:
$$B(x_c, r) = \{x \mid \|x - x_c\|_2 \le r\} = \{x_c + ru \mid \|u\|_2 \le 1\}$$
Hence $x_\theta \in B(x_c, r)$.
Figure 16.3 illustrates the set of points with coordinates $(x, y)$ in two-dimensional space that satisfy:
$$C_p = \{(x, y) \mid (|x|^p + |y|^p)^{1/p} \le 1\}$$
[Figure 16.3: Shapes of the sets bounded by pseudo-norms (top row, $p = 1/8, 1/4, 1/2, 2/3, 4/5$; for $p < 1$ the sets are nonconvex) and by norms (bottom row, $p = 1, 4/3, 2, 4, \infty$; for $p \ge 1$ the sets are convex). Both axes range from $-1$ to $1$.]
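The claim in the caption can be spot-checked numerically. The sketch below is only an illustration, not a proof: it takes the two points $(1, 0)$ and $(0, 1)$, which belong to $C_p$ for every $p > 0$, and tests whether their midpoint also belongs to $C_p$. The helper names (`p_measure`, `in_unit_set`, `midpoint`) are made up for this example.

```python
def p_measure(point, p):
    """Return (|x|^p + |y|^p)^(1/p) for a 2-D point."""
    x, y = point
    return (abs(x) ** p + abs(y) ** p) ** (1.0 / p)


def in_unit_set(point, p, tol=1e-12):
    """Membership test for C_p = {(x, y) : (|x|^p + |y|^p)^(1/p) <= 1}."""
    return p_measure(point, p) <= 1.0 + tol


def midpoint(u, v):
    """Convex combination of u and v with theta = 1/2."""
    return ((u[0] + v[0]) / 2.0, (u[1] + v[1]) / 2.0)


a, b = (1.0, 0.0), (0.0, 1.0)   # both lie in C_p for every p > 0
m = midpoint(a, b)

# For p < 1 the midpoint falls outside C_p, so C_p cannot be convex.
results = {p: in_unit_set(m, p) for p in (0.25, 0.5, 1.0, 2.0, 4.0)}
for p, inside in results.items():
    print(f"p = {p}: midpoint measure = {p_measure(m, p):.4f}, inside C_p: {inside}")
```

A single failing midpoint is enough to show nonconvexity for $p < 1$; for $p \ge 1$ a passing midpoint is merely consistent with convexity, which follows from the triangle inequality of the norm.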
Ellipsoids
Ellipsoids (ellipses in higher-dimensional spaces) are also convex sets. In fact, ellipsoids are closely related to the Mahalanobis distance. This distance is itself a norm, so the convexity of ellipsoids can be proved using Definition 2.
$$x^T A^{-1} x \ge 0, \quad \forall x \in \mathbb{R}^n$$ (16.3)
Incidentally, the Mahalanobis distance is related to the distance from a point to a probability distribution.
This is easy to see in Figure 16.5 (left). The intersection of any two of the three convex sets, or of all three, is a convex set.
Proving this using Definition 2 is not hard either. If $x_1, x_2$ belong to the intersection of the convex sets, that is, to every one of the given sets, then $\theta x_1 + (1 - \theta) x_2$ also belongs to every one of them, and therefore to their intersection.
Figure 16.5: Left: the intersection of convex sets is a convex set. Right: the intersection of hyperplanes and halfspaces is a convex set, called a polyhedron (plural: polyhedra).
$$A^T x \preceq b, \quad C^T x = d$$
$$x = \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_k x_k, \quad \text{with } \theta_1 + \theta_2 + \dots + \theta_k = 1$$
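16.3.1 Definition
A function $f$ is convex if its domain is a convex set and
$$f(\theta x + (1 - \theta) y) \le \theta f(x) + (1 - \theta) f(y)$$
for all $x, y \in \text{dom} f$, $0 \le \theta \le 1$.
[Figure: examples of convex functions of one variable: $y = ax + b$, $y = |x|$, $y = x^2$, $y = e^x$, and $y = x^{-1}$ for $x > 0$.]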
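16.3.3 Examples
Functions of one variable
The function $y = e^{ax}$ for any $a \in \mathbb{R}$.
[Figure: plots of $y = ax + b$; $y = 2\sqrt{x}$; $y = 2\log(x)$; and the piecewise function $y = 3 + x$ if $x < 0$, $y = 3 - x^2$ if $x \ge 0$.]
The function $y = x^a$ on the positive reals with $0 \le a \le 1$.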
Affine functions
$$f(X) = \text{trace}(A^T X) + b$$
Quadratic forms
$$f(x) = x^T A x + b^T x + c$$
Norms
Figure 16.10 illustrates two examples, norm 1 (left) and norm 2 (right), in two dimensions (the third dimension in the figure is the value of the function).
While we are here, I also give two examples of functions that are not convex (and not concave either). The first function, $f(x, y) = x^2 - y^2$, is a hyperbolic function; the second is $f(x, y) = \frac{1}{10}(x^2 + 2y^2 - 2\sin(xy))$.
Contours describe surfaces in three-dimensional space by projecting them down to two dimensions. In two dimensions, points on the same curve correspond to points at which the function takes the same value.
[Figure: (a) Norm 1, (b) Norm 2.]
In the bottom row, the curves are not closed. The left plot corresponds to the linear function $f(x, y) = x + y$, which is a convex function. The middle plot also shows a convex function (you can prove this after computing the second derivative, which I discuss below), but its level sets are open curves. This function contains a log, so its domain is the first quadrant, where both coordinates are positive (note that the set of points with positive coordinates is also a convex set). Combined with the axes Ox and Oy, these open curves form the boundaries of convex sets. The last plot shows the contours of a hyperbolic function, which is not convex.
That is, the set of points in the domain of $f$ at which the function takes a value less than or equal to $\alpha$.
Returning to Figure 16.12, top row, the $\alpha$-sublevel sets are the regions enclosed by the level sets.
In the bottom row, on the right, the $\alpha$-sublevel sets are a little harder to picture. For $\alpha > 0$, the level sets are the yellow or red curves. The corresponding $\alpha$-sublevel sets are the pinched-in regions bounded by the curves of the same color. These regions are clearly not convex.
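First-order condition
Figure 16.14: Checking convexity using the first derivative. Left: a convex function, since the tangent at every point lies below the graph of the function; right: a nonconvex function.
Below are examples of a convex function and a nonconvex function.
Second-order condition
Example: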
16.4 Summary
[1] Convex Optimization Boyd and Vandenberghe, Cambridge University Press, 2004.
Problem
Analysis
The total cost (objective function) is $f(x, y, z, t) = 5x + 10y + 15z + 4t$. The constraints, written as mathematical expressions, are:
The publisher problem:
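Problem
A farmer has a total of 10 ha (10 hectares) of farmland. He plans to grow coffee and pepper on this land, with a total planting cost of at most 16T (T = million dong). Planting coffee costs 2T per ha and planting pepper costs 1T per ha. Planting coffee takes 1 day per ha and pepper takes 4 days per ha, while he has only 32 days in total. After deducting all costs (including planting costs), each hectare of coffee yields a profit of 5T and each hectare of pepper yields a profit of 3T. How should he plant to maximize profit? (The numbers may be unrealistic; they were chosen so that the problem has a clean solution.)
Analysis
With $x$ and $y$ the areas (in ha) of coffee and pepper, we have the following optimization problem:
$$\begin{aligned} (x, y) = \arg\max_{x, y} \;& 5x + 3y \\ \text{subject to: } & x + y \le 10 \\ & 2x + y \le 16 \\ & x + 4y \le 32 \\ & x, y \ge 0 \end{aligned}$$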
[Figure: The gray region shows the set of points that satisfy the constraints (the feasible set X), bounded by the lines $x + y = 10$, $2x + y = 16$, and $x + 4y = 32$, with the origin $(0, 0)$ marked. The dashed lines are level sets of the objective function, $5x + 3y = a$ (a constant); the redder the color, the higher the value. The solution is the blue point, where the gray pentagon meets the level set with the highest value.]
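The geometric picture above suggests a simple cross-check for this small problem. The sketch below is not the method used in the text: it relies on the standard fact that a linear objective over a bounded polygon attains its maximum at a vertex, enumerates the intersections of pairs of boundary lines, keeps the feasible ones, and picks the most profitable. The constraint data are taken from the farming problem as stated; the helper names are made up for this example.

```python
from itertools import combinations

# Each constraint is (a, b, c), meaning a*x + b*y <= c.
constraints = [
    (1.0, 1.0, 10.0),   # x + y <= 10   (land, ha)
    (2.0, 1.0, 16.0),   # 2x + y <= 16  (planting cost)
    (1.0, 4.0, 32.0),   # x + 4y <= 32  (planting days)
    (-1.0, 0.0, 0.0),   # x >= 0
    (0.0, -1.0, 0.0),   # y >= 0
]


def profit(x, y):
    return 5.0 * x + 3.0 * y


def intersection(c1, c2):
    """Intersection of two boundary lines (Cramer's rule), or None if parallel."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        return None
    return ((r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det)


def feasible(pt, tol=1e-9):
    return all(a * pt[0] + b * pt[1] <= c + tol for a, b, c in constraints)


vertices = []
for c1, c2 in combinations(constraints, 2):
    pt = intersection(c1, c2)
    if pt is not None and feasible(pt):
        vertices.append(pt)

best = max(vertices, key=lambda pt: profit(*pt))
print("best vertex:", best, "profit:", profit(*best))
```

Vertex enumeration only scales to tiny problems (the number of candidate intersections grows combinatorially), which is one reason dedicated LP solvers are used in practice.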
Problem
A company has to move 400 m³ of sand to a construction site on the other side of a river by renting a barge. Besides the barge's transport cost of 100k per round trip, the company has to build a rectangular box to place on the barge to hold the sand. The box needs no lid; the side faces cost 1T/m² and the bottom costs 2T/m². What dimensions should the box have so that the total transport cost is minimal? For simplicity, assume the sand is filled level with, or below, the top edge of the box, with no heap. Also assume that the barge is infinitely wide and can carry unlimited weight; this assumption makes the problem easier to solve.
Analysis
Let $x, y, z$ be the dimensions of the box.
Barge rental cost: the number of trips required is $\frac{400}{xyz}$ (let us assume for now that this is a natural number; rounding will not change the result significantly, because the cost of one trip is small compared with the cost of building the box). The amount paid for the barge is $0.1 \cdot \frac{400}{xyz} = \frac{40}{xyz}$.
Cost of building the box: the area of the side faces is $2(x + y)z$. The area of the bottom is $xy$. So the total cost of building the box is $2(x + y)z + 2xy = 2(xy + yz + zx)$.
The total cost is $f(x, y, z) = 40x^{-1}y^{-1}z^{-1} + 2(xy + yz + zx)$. The only constraint is that the dimensions of the box must be positive. So we have the following optimization problem:
The transport problem:
Note that this problem can be solved entirely with the Cauchy (AM-GM) inequality, but I still want a solution method for the general problem that can be programmed.
(Solution:
$$f(x, y, z) = \frac{20}{xyz} + \frac{20}{xyz} + 2xy + 2yz + 2zx \ge 5\sqrt[5]{3200}$$
with equality if and only if $x = y = z = \sqrt[5]{10}$. This problem is probably well suited to exams, because the data work out so neatly. Personally, I prefer problems posed this way to ones that ask for the minimum of some dull expression, which leaves many students wondering what inequalities are for!)
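The closed-form answer above is easy to sanity-check numerically. The sketch below evaluates the total cost at $x = y = z = \sqrt[5]{10}$, compares it with the AM-GM bound $5\sqrt[5]{3200}$, and confirms that no point of a coarse grid does better. The grid is an arbitrary choice made for this illustration.

```python
def total_cost(x, y, z):
    """Barge trips plus box material: 40/(xyz) + 2(xy + yz + zx)."""
    return 40.0 / (x * y * z) + 2.0 * (x * y + y * z + z * x)


side = 10.0 ** (1.0 / 5.0)             # claimed optimum x = y = z = 10^(1/5)
bound = 5.0 * 3200.0 ** (1.0 / 5.0)    # AM-GM lower bound 5 * 3200^(1/5)
at_optimum = total_cost(side, side, side)

# Coarse grid of other positive box sizes (0.5 m to 4.0 m in 0.25 m steps).
grid = [0.5 + 0.25 * k for k in range(15)]
smallest_on_grid = min(total_cost(x, y, z) for x in grid for y in grid for z in grid)

print(f"side length      : {side:.8f}")
print(f"cost at optimum  : {at_optimum:.6f}")
print(f"AM-GM bound      : {bound:.6f}")
print(f"best on the grid : {smallest_on_grid:.6f}")
```

The cost at the claimed optimum matches the bound up to floating-point rounding, and every grid point costs at least as much, as the inequality predicts.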
First, we need to understand the concepts behind convex optimization problems and why convexity matters. (Readers can skip to part 4 if they do not want to go through the mathematical concepts and theorems in parts 2 and 3.)
In addition:
If the optimal set is nonempty, we say that problem (17.13) is solvable. Conversely, if the optimal set is empty, we say that the optimal value is not attained (not achieved).
For a function of one variable, a point is a local minimum if the function attains its smallest value there within some neighborhood (with this neighborhood contained in the domain of the function). In one dimension, a neighborhood means that the absolute value of the difference between the two points is smaller than some given value.
17.2.3 A few notes
$$f_i(x) \ge 0 \Leftrightarrow -f_i(x) \le 0$$
17.3 Convex optimization problems
17.3.1 Definition
where $f_0, f_1, \dots, f_m$ are convex functions.
The objective function is a convex function.
A few remarks:
The most important property of a convex optimization problem is that any locally optimal point is also a (globally) optimal point.
This important property can be proved by contradiction as follows. Let $x_0$ be a locally optimal point, that is:
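[Figure: Optimality condition for a differentiable objective function. The dashed colored curves are the level sets. Labels in the figure: the feasible set X, the points $x_0$ and $x$, the direction $-\nabla f_0(x_0)$, and the supporting hyperplane.]
$$\nabla f_0(x_0)^T (x - x_0) \ge 0, \quad \forall x \in \mathcal{X}$$ (17.21)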
CVXOPT is a free Python library that solves many of the problems in the book Convex Optimization listed in the References. The second author of that book, Lieven Vandenberghe, is also a co-author of the library. Installation instructions, documentation, and examples for the library are all available on the CVXOPT website.
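A general LP:
$$\begin{aligned} x = \arg\min_{x} \;& c^T x + d \\ \text{subject to: } & Gx \preceq h \\ & Ax = b \end{aligned}$$ (17.22)
In a standard form LP, the only inequality constraints are that the components of the solution are nonnegative:
$$\begin{aligned} x = \arg\min_{x} \;& c^T x \\ \text{subject to: } & Ax = b \\ & x \succeq 0 \end{aligned}$$ (17.23)
$$\begin{aligned} x = \arg\min_{x, s} \;& c^T x \\ \text{subject to: } & Ax = b \\ & Gx + s = h \\ & s \succeq 0 \end{aligned}$$ (17.24)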
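Next, we write $x$ as the difference of two vectors whose components are all nonnegative: $x = x^+ - x^-$, with $x^+, x^- \succeq 0$. We can then rewrite (17.24) as:
$$\begin{aligned} x = \arg\min_{x^+, x^-, s} \;& c^T x^+ - c^T x^- \\ \text{subject to: } & Ax^+ - Ax^- = b \\ & Gx^+ - Gx^- + s = h \\ & x^+ \succeq 0, \; x^- \succeq 0, \; s \succeq 0 \end{aligned}$$ (17.25)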
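The conversion to (17.25) is mechanical, so it can be written as a small helper. The sketch below uses NumPy and stacks the variables as $z = [x^+, x^-, s]$; the function names `to_standard_form` and `lift` and the toy data are hypothetical, and the constant $d$ of the general LP is dropped because it does not affect the minimizer.

```python
import numpy as np


def to_standard_form(c, G, h, A, b):
    """Build (c_std, A_std, b_std) such that the LP becomes
    minimize c_std^T z  subject to  A_std z = b_std, z >= 0,
    with z = [x_plus, x_minus, s] stacked as in (17.25)."""
    m = G.shape[0]
    p = A.shape[0]
    c_std = np.concatenate([c, -c, np.zeros(m)])
    A_std = np.block([
        [A, -A, np.zeros((p, m))],   # A x+ - A x- = b
        [G, -G, np.eye(m)],          # G x+ - G x- + s = h
    ])
    b_std = np.concatenate([b, h])
    return c_std, A_std, b_std


def lift(x, G, h):
    """Map a feasible x of the general LP to z = [x+, x-, s]."""
    return np.concatenate([np.maximum(x, 0), np.maximum(-x, 0), h - G @ x])


# Hypothetical toy data, chosen only to exercise the construction.
c = np.array([1.0, -2.0])
G = np.array([[1.0, 1.0], [-1.0, 2.0]])
h = np.array([4.0, 6.0])
A = np.array([[1.0, -1.0]])
b = np.array([-3.0])

x = np.array([-1.0, 2.0])            # satisfies G x <= h and A x = b
c_std, A_std, b_std = to_standard_form(c, G, h, A, b)
z = lift(x, G, h)

print("z =", z)
print("equality residual:", A_std @ z - b_std)
print("objective (general, standard):", c @ x, c_std @ z)
```

The lifted point is nonnegative, satisfies the equality constraints of the standard form, and has the same objective value as the original point, which is what the equivalence of the two formulations requires.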
[Figure: the feasible set X of an LP, with labels $x_0$, $x_1$, and $c$.]
For LP, you can find a great deal of material in both Vietnamese (Quy hoạch tuyến tính) and English. Many practical problems can be brought into LP form. A commonly used method for solving these problems is the simplex method. I will not cover such methods here; instead, I will show you how to use the CVXOPT library to solve problems of this form.
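I use the CVXOPT library to solve the farming problem above. Recall the data of that problem:
$$G = \begin{bmatrix} 1 & 1 \\ 2 & 1 \\ 1 & 4 \\ -1 & 0 \\ 0 & -1 \end{bmatrix}, \quad h = \begin{bmatrix} 10 \\ 16 \\ 32 \\ 0 \\ 0 \end{bmatrix}$$
    Solution:
    [ 6.00e+00]
    [ 4.00e+00]
A few notes: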
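[Figure: the feasible set X of a QP, with labels $x_0$ and $\nabla f_0(x_0)$.]
Quadratic Programming:
$$\begin{aligned} x = \arg\min_{x} \;& \frac{1}{2} x^T P x + q^T x + r \\ \text{subject to: } & Gx \preceq h \\ & Ax = b \end{aligned}$$ (17.27)
In words: in QP, we minimize a convex quadratic function over a polyhedron (see Figure 17.4).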
17.5.2 An example of QP
The feasible set in this problem is taken directly from the farming problem, and $u = [10, 10]^T$. The problem can be solved with CVXOPT as follows:
    from cvxopt import matrix, solvers

    # P and q define the objective (1/2) x^T P x + q^T x
    P = matrix([[1., 0.], [0., 1.]])
    q = matrix([-10., -10.])
    # cvxopt.matrix takes a list of columns: each inner list is one column of G
    G = matrix([[1., 2., 1., -1., 0.], [1., 1., 4., 0., -1.]])
    h = matrix([10., 16., 32., 0., 0.])

    solvers.options['show_progress'] = False
    sol = solvers.qp(P, q, G, h)

    print('Solution:')
    print(sol['x'])
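    Solution:
    [ 5.00e+00]
    [ 5.00e+00]
[Figure: the feasible set of the farming problem, with the boundary lines $x + y = 10$ and $2x + y = 16$ and the origin $(0, 0)$ marked.]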
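The reported solution $(5, 5)$ can be checked against the first-order optimality condition (17.21) without any solver. In the sketch below, the gradient $Px + q$ uses the same $P$ and $q$ as the CVXOPT code, and the vertices of the feasible polygon were computed by hand from the constraints of the farming problem. Since $\nabla f_0(x_0)^T(x - x_0)$ is affine in $x$, it is enough to check the vertices.

```python
def grad_objective(x):
    """Gradient P x + q of (1/2) x^T P x + q^T x with P = I, q = (-10, -10)."""
    return (x[0] - 10.0, x[1] - 10.0)


# Vertices of the feasible polygon defined by
# x + y <= 10, 2x + y <= 16, x + 4y <= 32, x >= 0, y >= 0.
vertices = [(0.0, 0.0), (8.0, 0.0), (6.0, 4.0), (8.0 / 3.0, 22.0 / 3.0), (0.0, 8.0)]

x0 = (5.0, 5.0)                      # solution reported by the solver
g = grad_objective(x0)

# Condition (17.21): grad^T (x - x0) >= 0 for every feasible x.
products = [g[0] * (v[0] - x0[0]) + g[1] * (v[1] - x0[1]) for v in vertices]
is_feasible = (x0[0] + x0[1] <= 10 and 2 * x0[0] + x0[1] <= 16
               and x0[0] + 4 * x0[1] <= 32 and min(x0) >= 0)

print("inner products at the vertices:", products)
print("x0 feasible:", is_feasible)
```

The inner products vanish (up to rounding) at the two vertices on the edge $x + y = 10$, which reflects that $x_0$ lies on that edge and that the negative gradient is perpendicular to it.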
A sum of monomials:
$$f(x) = \sum_{k=1}^{K} c_k x_1^{a_{1k}} x_2^{a_{2k}} \dots x_n^{a_{nk}}$$
An optimization problem of the form:
Example:
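$$f(x) = \sum_{k=1}^{K} \exp(a_k^T y + b_k)$$
$$\begin{aligned} y = \arg\min_{y} \;& \sum_{k=1}^{K_0} \exp(a_{0k}^T y + b_{0k}) \\ \text{subject to: } & \sum_{k=1}^{K_i} \exp(a_{ik}^T y + b_{ik}) \le 1, \quad i = 1, \dots, m \\ & \exp(g_j^T y + h_j) = 1, \quad j = 1, \dots, p \end{aligned}$$ (17.32)
with $a_{ik} \in \mathbb{R}^n$, $i = 1, \dots, p$ and $g_i \in \mathbb{R}^n$.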
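Noting that the function $\log \sum_{i=1}^{m} \exp(g_i(x))$ is convex if the $g_i$ are convex functions (I skip the proof), we can rewrite problem (17.32) in convex form by taking the log of the functions, as follows:
GP in convex form:
$$\begin{aligned} \text{minimize}_y \;& \tilde{f}_0(y) = \log\left(\sum_{k=1}^{K_0} \exp(a_{0k}^T y + b_{0k})\right) \\ \text{subject to: } & \tilde{f}_i(y) = \log\left(\sum_{k=1}^{K_i} \exp(a_{ik}^T y + b_{ik})\right) \le 0, \quad i = 1, \dots, m \\ & \tilde{h}_j(y) = g_j^T y + h_j = 0, \quad j = 1, \dots, p \end{aligned}$$ (17.33)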
    Solution:
    [[ 1.58489319]
     [ 1.58489319]
     [ 1.58489319]]

    checking sol^5
    [[ 9.9999998]
     [ 9.9999998]
     [ 9.9999998]]
The solution obtained is exactly $x = y = z = \sqrt[5]{10}$. Readers are encouraged to read the documentation of the function solvers.gp to understand how the problem is set up.
17.7 Summary
[1] Convex Optimization Boyd and Vandenberghe, Cambridge University Press, 2004.
[2] CVXOPT.
Duality
First, we once again start with simple techniques for basic problems. You have probably heard of this one: the method of Lagrange multipliers. It is a method for finding the extreme points of the objective function on the feasible set of the problem.
This is a general problem, not necessarily convex. That is, neither the objective function nor the constraint function needs to be convex.
Note that the second condition is exactly $\nabla_\lambda L(x, \lambda) = 0$, which is also the constraint in problem (18.1).
In many cases, solving the system of equations (18.2) - (18.3) is simpler than directly finding the optimal value of problem (18.1).
Consider the following simple examples.
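18.2.1 Examples
Solution:
$$\nabla_{x, y, \lambda} L(x, y, \lambda) = 0 \Leftrightarrow \begin{cases} 1 + 2\lambda x = 0 \\ 1 + 2\lambda y = 0 \\ x^2 + y^2 = 2 \end{cases}$$ (18.4)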
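$$L(q_1, q_2, \dots, q_n, \lambda) = -\sum_{i=1}^{n} p_i \log(q_i) + \lambda\left(\sum_{i=1}^{n} q_i - 1\right)$$
$$\nabla_{q_1, \dots, q_n, \lambda} L(q_1, \dots, q_n, \lambda) = 0 \Leftrightarrow \begin{cases} -\dfrac{p_i}{q_i} + \lambda = 0, \quad i = 1, \dots, n \\ q_1 + q_2 + \dots + q_n = 1 \end{cases}$$
From the first equation we have $p_i = \lambda q_i$. Therefore $1 = \sum_{i=1}^{n} p_i = \lambda \sum_{i=1}^{n} q_i = \lambda$, so $\lambda = 1$ and $q_i = p_i, \forall i$.
This shows why the cross-entropy function is used to push two probability distributions close to each other.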
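The conclusion $q = p$ can be illustrated with a small experiment. The sketch below fixes a target distribution `p` (a hypothetical choice), draws random distributions `q` from the probability simplex, and checks that none of them gives a smaller cross entropy than $q = p$. Random sampling only illustrates the result; the Lagrange-multiplier argument above is what establishes it.

```python
import math
import random


def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i log(q_i); terms with p_i = 0 contribute nothing."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def random_distribution(n, rng):
    """A random point in the interior of the probability simplex."""
    w = [rng.random() + 1e-3 for _ in range(n)]
    total = sum(w)
    return [wi / total for wi in w]


rng = random.Random(0)               # fixed seed, so runs are repeatable
p = [0.5, 0.3, 0.2]                  # hypothetical target distribution
at_p = cross_entropy(p, p)
others = [cross_entropy(p, random_distribution(len(p), rng)) for _ in range(1000)]

print(f"H(p, p)           = {at_p:.6f}")
print(f"min over random q = {min(others):.6f}")
```

Every sampled `q` yields a cross entropy at least as large as `H(p, p)`, in line with the stationary point found from the Lagrangian.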
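18.3.1 Lagrangian
with domain $\mathcal{D} = \left(\cap_{i=0}^{m} \text{dom} f_i\right) \cap \left(\cap_{j=1}^{p} \text{dom} h_j\right)$. Note that we make no assumption here about the convexity of the objective function or of the constraint functions. The only assumption is $\mathcal{D} \ne \emptyset$ (the empty set).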
18.3.3 A lower bound on the optimal value
18.3.4 Examples
Example 1
Consider the following optimization problem:
[Figure: plots of $f_0(x)$, $f_1(x)$, and $f_0(x) + \lambda f_1(x)$, together with the dual function $g(\lambda)$ and the optimal value $p^*$.]
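Example 2
The dual function is:
$$\begin{aligned} g(\lambda, \nu) &= \inf_{x} L(x, \lambda, \nu) && (18.11) \\ &= -b^T \nu + \inf_{x} (c + A^T \nu - \lambda)^T x && (18.12) \end{aligned}$$
$$g(\lambda, \nu) = \begin{cases} -b^T \nu & \text{if } c + A^T \nu - \lambda = 0 \\ -\infty & \text{otherwise} \end{cases}$$ (18.13)
Note that the implicit condition $g(\lambda, \nu) > -\infty$ can, in many cases, also be written out explicitly. Returning to the example above, the implicit condition can be written as $c + A^T \nu - \lambda = 0$. The left-hand side is an affine function, so adding this constraint still leaves us with a convex problem.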
This simple property is called weak duality. Simple as it is, it is extremely important.
If the equality $p^* = d^*$ holds, the optimal duality gap is zero and we say that strong duality holds. In that case, solving the dual problem gives us the exact optimal value of the primal problem.
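Weak duality can be observed on a tiny standard-form LP. The sketch below uses a hypothetical problem, minimize $x_1 + 2x_2$ subject to $x_1 + x_2 = 1$, $x \succeq 0$, whose optimal value $p^* = 1$ is found by inspection. The dual function follows (18.13): for a given $\nu$, the only $\lambda$ that yields a finite value is $\lambda = c + A^T\nu$, and it is dual feasible only if all its components are nonnegative.

```python
import math

# Hypothetical toy LP in standard form: minimize c^T x  s.t.  a^T x = b, x >= 0.
c = (1.0, 2.0)
a = (1.0, 1.0)       # the single row of A
b = 1.0


def dual_function(nu):
    """g(lambda, nu) as in (18.13), with lambda = c + A^T nu.
    Returns -inf when that lambda has a negative component."""
    lam = [ci + ai * nu for ci, ai in zip(c, a)]
    if min(lam) < 0:
        return -math.inf
    return -b * nu


# Primal optimum by inspection: on x1 + x2 = 1, x >= 0, the cost
# x1 + 2*x2 is smallest at (1, 0).
p_star = 1.0

nus = [-3.0 + 0.01 * k for k in range(601)]      # grid on [-3, 3]
values = [dual_function(nu) for nu in nus]
d_star = max(values)

print("p* =", p_star)
print("best dual value on the grid =", d_star)
```

Every dual value on the grid is a lower bound on $p^*$, and the best one essentially reaches it, which illustrates strong duality for this particular LP.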
Unfortunately, strong duality does not hold often in optimization problems. However, if the primal problem is convex, that is, of the form:
$$f_i(x) < 0, \quad i = 1, 2, \dots, m, \quad Ax = b$$
Note:
Strong duality does not hold often. For convex problems it holds more frequently, but there exist convex problems for which strong duality does not hold.
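$$\begin{aligned} f_0(x^*) &= g(\lambda^*, \nu^*) && (18.16) \\ &= \inf_{x} \left( f_0(x) + \sum_{i=1}^{m} \lambda_i^* f_i(x) + \sum_{j=1}^{p} \nu_j^* h_j(x) \right) && (18.17) \\ &\le f_0(x^*) + \sum_{i=1}^{m} \lambda_i^* f_i(x^*) + \sum_{j=1}^{p} \nu_j^* h_j(x^*) && (18.18) \\ &\le f_0(x^*) && (18.19) \end{aligned}$$
$x^*$ is an optimal point for $g(\lambda^*, \nu^*)$, that is, it attains the infimum in (18.17).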
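More interestingly:
$$\sum_{i=1}^{m} \lambda_i^* f_i(x^*) = 0$$
Since every term of this sum is nonpositive, each term must vanish:
$$\lambda_i^* f_i(x^*) = 0, \quad i = 1, 2, \dots, m$$
$$\lambda_i^* > 0 \Rightarrow f_i(x^*) = 0$$ (18.20)
$$f_i(x^*) < 0 \Rightarrow \lambda_i^* = 0$$ (18.21)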
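$$\begin{aligned} f_i(x^*) &\le 0, \quad i = 1, 2, \dots, m && (18.22) \\ h_j(x^*) &= 0, \quad j = 1, 2, \dots, p && (18.23) \\ \lambda_i^* &\ge 0, \quad i = 1, 2, \dots, m && (18.24) \\ \lambda_i^* f_i(x^*) &= 0, \quad i = 1, 2, \dots, m && (18.25) \\ \nabla f_0(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla f_i(x^*) + \sum_{j=1}^{p} \nu_j^* \nabla h_j(x^*) &= 0 && (18.26) \end{aligned}$$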
For convex problems in which strong duality holds, the KKT conditions above are also sufficient. So for convex problems whose objective and constraint functions are differentiable, any point satisfying the KKT conditions is primal and dual optimal for the primal problem and the dual problem.
Lagrangian:
$$L(x, \nu) = \frac{1}{2} x^T P x + q^T x + r + \nu^T (Ax - b)$$
The KKT conditions for this problem are:
$$Ax = b$$ (18.28)
$$Px + q + A^T \nu = 0$$ (18.29)
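Conditions (18.28) and (18.29) are linear in $(x, \nu)$, so for an equality-constrained QP they can be solved as a single linear system. The sketch below stacks them into one block matrix and calls `numpy.linalg.solve`; the helper name `solve_equality_qp` and the data are hypothetical, and the KKT matrix is assumed to be nonsingular.

```python
import numpy as np


def solve_equality_qp(P, q, A, b):
    """Solve the linear KKT system
        [P  A^T] [x ]   [-q]
        [A   0 ] [nu] = [ b]
    and return (x, nu). Assumes the KKT matrix is nonsingular."""
    n, p = P.shape[0], A.shape[0]
    kkt = np.block([[P, A.T], [A, np.zeros((p, p))]])
    rhs = np.concatenate([-q, b])
    sol = np.linalg.solve(kkt, rhs)
    return sol[:n], sol[n:]


# Hypothetical data: minimize (1/2)||x||^2 - 10*(x1 + x2)  s.t.  x1 + x2 = 10.
P = np.eye(2)
q = np.array([-10.0, -10.0])
A = np.array([[1.0, 1.0]])
b = np.array([10.0])

x, nu = solve_equality_qp(P, q, A, b)
print("x  =", x)
print("nu =", nu)
print("stationarity residual:", P @ x + q + A.T @ nu)
```

For larger or ill-conditioned problems a dedicated QP solver is a better choice; the point here is only that, with equality constraints alone, the KKT conditions reduce to linear algebra.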
18.6 Summary
Assume that all functions are differentiable:
For every $(\lambda, \nu)$, $g(\lambda, \nu) \le p^*$.
18.7 Conclusion
In the three chapters 16, 17, and 18, I gave a brief introduction to convex sets, convex functions, convex problems, and the optimality conditions built through duality. My original intention was to avoid this part because it involves quite a lot of mathematics. However, while preparing the chapter on Support Vector Machine, I realized that I needed to explain the Lagrangian, a technique used very widely in Optimization. Moreover, to explain the Lagrangian, I needed to talk about convex problems. That is why I felt responsible for writing these three chapters.
In the next series of chapters, we return to Machine Learning algorithms, with plenty of examples, figures, and sample code. If you feel a little worn out after these three optimization chapters, do not worry: everything will be fine.
[1] Convex Optimization Boyd and Vandenberghe, Cambridge University Press, 2004.
Linear Algebra Review
51.1 Notes on notation
Given a matrix W, unless stated otherwise, $w_i$ denotes the $i$-th column vector of that matrix. Note the correspondence between the uppercase and lowercase letters.
In one-dimensional space, measuring the distance between two points is very familiar: take the absolute value of the difference between the two values. In two-dimensional space, that is, the plane, we usually use the Euclidean distance to measure the distance between two points. This is what we call, in everyday language, the distance "as the crow flies". Sometimes, to get from one point to another, we humans cannot travel as the crow flies; the distance also depends on the shape of the path connecting the two points.
Measuring the distance between two multi-dimensional data points, that is, two vectors, is essential in Machine Learning. We need to determine which point is closest to another point; we need to assess the accuracy of an estimate; and there are many other examples.
That is why the concept of a norm was introduced. There are many different kinds of norms, as you will see below:
51.2.1 Definition
1. $f(x) \ge 0$. Equality holds $\Leftrightarrow x = 0$.
Note that the Euclidean distance is a norm, usually called norm 2:
$$\|x\|_2 = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2}$$ (51.1)
For any number $p$ not smaller than 1, consider the following function:
$$\|x\|_p = \left(|x_1|^p + |x_2|^p + \dots + |x_n|^p\right)^{1/p}$$ (51.2)
Note that as $p \to 0$, the expression above becomes the number of nonzero elements of $x$. The function (51.2) with $p = 0$ is called pseudo-norm 0. It is not a norm because it does not satisfy conditions 2 and 3 of a norm. This pseudo-norm, usually denoted $\|x\|_0$, is quite important in Machine Learning because in many problems we need a sparsity constraint, meaning that the number of active components of $x$ is small.
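Some commonly used values of $p$:
2. When $p = 1$ we have:
$$\|x\|_1 = |x_1| + |x_2| + \dots + |x_n|$$ (51.3)
which is the sum of the absolute values of the elements of $x$. Norm 1 is often used as an approximation of norm 0 in problems with a "sparse" constraint. Below is an example comparing norm 1 and norm 2 in two-dimensional space:
[Figure: two points $x$ and $y$ and a corner point $z$, with the straight segment labeled $\|x - y\|_2$ and the horizontal leg labeled $|x_1 - y_1|$.]
Norm 2 (blue) is the straight "as the crow flies" line joining the two vectors $x$ and $y$. The norm 1 distance between these two points (red) can be interpreted as the path from $x$ to $y$ in a city whose streets form a chessboard grid. We can only travel along the edges of the grid, not straight across.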
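The two distances can be compared directly in code. The sketch below is a plain-Python illustration with hypothetical points; `norm_p` implements $(\sum_i |x_i|^p)^{1/p}$ and `norm_0` counts the nonzero components.

```python
def norm_p(x, p):
    """(sum_i |x_i|^p)^(1/p); a norm for p >= 1."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)


def norm_0(x):
    """Pseudo-norm 0: the number of nonzero components."""
    return sum(1 for xi in x if xi != 0)


def distance(x, y, p):
    """Distance between x and y induced by the p-norm."""
    return norm_p([xi - yi for xi, yi in zip(x, y)], p)


x, y = (1.0, 1.0), (4.0, 5.0)        # hypothetical points on a city grid
d2 = distance(x, y, 2)               # straight line, as the crow flies
d1 = distance(x, y, 1)               # along the grid
sparsity = norm_0((0, 3, 0, -2))

print("norm-2 distance:", d2)
print("norm-1 distance:", d1)
print("nonzero components of (0, 3, 0, -2):", sparsity)
```

The grid distance is never shorter than the straight-line distance, matching the picture of walking along city blocks instead of cutting across them.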
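$$\nabla_x f(x) \triangleq \begin{bmatrix} \dfrac{\partial f(x)}{\partial x_1} \\ \dfrac{\partial f(x)}{\partial x_2} \\ \vdots \\ \dfrac{\partial f(x)}{\partial x_n} \end{bmatrix} \in \mathbb{R}^n$$ (51.5)
where $\frac{\partial f(x)}{\partial x_i}$ is the derivative of the function with respect to the $i$-th component of the vector $x$. This derivative is taken assuming that all the other variables are constants.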
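The first-order derivative of the function with respect to $x$ is:
$$\nabla f(x) = \begin{bmatrix} \dfrac{\partial f(x)}{\partial x_1} \\ \dfrac{\partial f(x)}{\partial x_2} \end{bmatrix} = \begin{bmatrix} 2x_1 + 2x_2 + \cos(x_1) \\ 2x_1 \end{bmatrix}$$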
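A gradient like the one above is easy to double-check with finite differences. In the sketch below, the function `f` is an assumption: $f(x_1, x_2) = x_1^2 + 2x_1x_2 + \sin(x_1) + 2$ is one function whose gradient matches the expression in the text (any additive constant would do). The test point and step size are arbitrary choices.

```python
import math


def f(x1, x2):
    # Assumed function; its gradient matches the one given in the text.
    return x1 ** 2 + 2 * x1 * x2 + math.sin(x1) + 2


def grad_f(x1, x2):
    """Analytic gradient: (2*x1 + 2*x2 + cos(x1), 2*x1)."""
    return (2 * x1 + 2 * x2 + math.cos(x1), 2 * x1)


def numerical_grad(func, x1, x2, eps=1e-6):
    """Central differences, one coordinate at a time."""
    d1 = (func(x1 + eps, x2) - func(x1 - eps, x2)) / (2 * eps)
    d2 = (func(x1, x2 + eps) - func(x1, x2 - eps)) / (2 * eps)
    return (d1, d2)


point = (0.7, -1.3)                  # arbitrary test point
analytic = grad_f(*point)
numeric = numerical_grad(f, *point)

print("analytic:", analytic)
print("numeric :", numeric)
```

Agreement to several decimal places at a few random points is a cheap way to catch sign or index mistakes in a hand-derived gradient.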
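Suppose we have a function whose input is a real number, $v(x): \mathbb{R} \to \mathbb{R}^n$:
$$v(x) = \begin{bmatrix} v_1(x) \\ v_2(x) \\ \vdots \\ v_n(x) \end{bmatrix}$$ (51.8)
$$\nabla v(x) \triangleq \begin{bmatrix} \dfrac{\partial v_1(x)}{\partial x} & \dfrac{\partial v_2(x)}{\partial x} & \dots & \dfrac{\partial v_n(x)}{\partial x} \end{bmatrix}$$ (51.9)
The second-order derivative of this function has the form:
$$\nabla^2 v(x) \triangleq \begin{bmatrix} \dfrac{\partial^2 v_1(x)}{\partial x^2} & \dfrac{\partial^2 v_2(x)}{\partial x^2} & \dots & \dfrac{\partial^2 v_n(x)}{\partial x^2} \end{bmatrix}$$ (51.10)
An easy rule to remember: if a function $g: \mathbb{R}^m \to \mathbb{R}^n$, then its derivative is a matrix in $\mathbb{R}^{m \times n}$.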
Before computing the derivatives of commonly encountered functions, we need two important properties that closely resemble those of single-variable derivatives taught in high school.
Product rules
$$\nabla\left(f(X)^T g(X)\right) = \left(\nabla f(X)\right) g(X) + \left(\nabla g(X)\right) f(X)$$ (51.14)
Note that with vectors and matrices, we cannot use commutativity.
Chain rules
For composite functions:
$$\nabla_X g(f(X)) = \nabla_X f^T \nabla_f g$$ (51.15)
51.3.4 Derivatives of commonly encountered functions
$f(x) = a^T x$
Suppose $a, x \in \mathbb{R}^n$; we rewrite:
$$f(x) = a^T x = a_1 x_1 + a_2 x_2 + \dots + a_n x_n$$
$f(x) = Ax$
$f(x) = x^T A x$
If $A$ is a symmetric matrix, we have:
$$\nabla x^T A x = 2Ax, \quad \nabla^2 x^T A x = 2A$$ (51.19)
Method 2: use the chain rule. Using $\nabla(Ax - b) = A^T$, $\nabla\|x\|_2^2 = 2x$, and the chain rule (51.15), we obtain the same result.
$f(x) = a^T x x^T b$
By rewriting $f(x) = (a^T x)(x^T b)$, we can use the product rule (51.14) to obtain:
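For vectors:
$f(x)$                      |  $\nabla f(x)$
$a^T x$                     |  $a$
$x^T A x$                   |  $(A + A^T) x$
$x^T x = \|x\|_2^2$         |  $2x$
$\|Ax - b\|_2^2$            |  $2A^T(Ax - b)$
$a^T x^T x b$               |  $2 a^T b x$
$a^T x x^T b$               |  $(ab^T + ba^T) x$

For matrices:
$f(X)$                      |  $\nabla f(X)$
$\|X\|_F^2$                 |  $2X$
$AX$                        |  $A^T$
$\|AX - B\|_F^2$            |  $2A^T(AX - B)$
$\|XA - B\|_F^2$            |  $2(XA - B)A^T$
$a^T X^T X b$               |  $X(ab^T + ba^T)$
$a^T X X^T b$               |  $(ab^T + ba^T) X$
$a^T Y X^T b$               |  $b a^T Y$
$a^T Y^T X b$               |  $Y a b^T$
$a^T X Y^T b$               |  $a b^T Y$
$a^T X^T Y b$               |  $Y b a^T$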
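Several vector entries of the table can be verified numerically. The sketch below uses NumPy with random test data and compares each closed-form gradient with a central-difference approximation. To keep the symbols apart, the least-squares check uses a separate matrix `B` and vector `d` in place of the table's $A$ and $b$; all names and data are hypothetical.

```python
import numpy as np


def numerical_gradient(func, x, eps=1e-6):
    """Central-difference gradient of a scalar function of a vector."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (func(x + step) - func(x - step)) / (2 * eps)
    return grad


rng = np.random.default_rng(0)       # random test data, fixed seed
n, m = 4, 3
A = rng.standard_normal((n, n))      # not symmetric in general
B = rng.standard_normal((m, n))
a = rng.standard_normal(n)
b = rng.standard_normal(n)
d = rng.standard_normal(m)
x = rng.standard_normal(n)

# Each entry: (scalar function, closed-form gradient evaluated at x).
checks = {
    "a^T x": (lambda v: a @ v, a),
    "x^T A x": (lambda v: v @ A @ v, (A + A.T) @ x),
    "||Bx - d||^2": (lambda v: np.sum((B @ v - d) ** 2), 2 * B.T @ (B @ x - d)),
    "a^T x x^T b": (lambda v: (a @ v) * (v @ b),
                    (np.outer(a, b) + np.outer(b, a)) @ x),
}

errors = {name: float(np.max(np.abs(numerical_gradient(func, x) - grad)))
          for name, (func, grad) in checks.items()}
for name, err in errors.items():
    print(f"{name:>14}: max abs error = {err:.2e}")
```

The same pattern extends to the matrix entries by flattening $X$ into a vector before differencing and reshaping the result.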