CHAPTER 1:
2009
INTRODUCTION
Prosody is what gives the human voice its distinctive timbre. The prosody of speech is closely tied to intonation, which is the rise and fall of the voice within a sentence. Vietnamese is a fairly complex language that involves both prosody and intonation. For this reason, research on speech recognition methods has attracted, and continues to attract, considerable investment and attention from scientists. So far, however, the results remain incomplete, because the object being recognized, human speech, is highly complex and variable, and Vietnamese is particularly so.
Many speech recognition methods exist today. The Fujisaki model is widely used in systems for Japanese, the MFGI model (Mixdorff-Fujisaki model of German Intonation) is used for German, and there is the HMM (hidden Markov model), among others. These models in turn apply many different recognition methods, each with its own characteristics and advantages.
The LPC method (linear predictive coding): its weakness is that words with similar pronunciations are frequently confused.
The AMDF method (average magnitude difference function): its advantages are a small number of inputs, a small training network, and low dependence on how a word is pronounced, so its error rate is lower than that of LPC. Its drawbacks are that it does not distinguish tones and is hard to use when words are spoken continuously.
AMDF & LPC: given the complementary strengths and weaknesses of LPC and AMDF, the two methods need to be combined.
The fourth method is MFCC (mel-frequency cepstral coefficients).
Speech recognition is a pattern recognition process whose goal is to classify the input speech signal into a sequence of patterns that were learned beforehand and stored in memory. The patterns are the recognition units, which may be words or phonemes. If these patterns were invariant, speech recognition would reduce to comparing the speech data to be recognized with the learned patterns stored in memory.
Course Project 2
CHAPTER 2:
2.3 Frequencies of sound:
F0 is called the fundamental frequency of a sound. For men, f0 = 150 Hz; for women, f0 = 250 Hz.
Voice type                   Frequency range
Low male voice (bass)        80 - 320 Hz
Middle male voice (baritone) 100 - 400 Hz
High male voice (tenor)      130 - 480 Hz
Low female voice (alto)      160 - 600 Hz
High female voice (soprano)  260 - 1200 Hz
The power of speech also varies with loudness: about 10^-3 mW when whispering, 10 mW when speaking normally, and 10^3 mW when speaking loudly.
[Figure: source-filter model of speech production. Blocks: glottal pulse generator (gain AV), random noise generator (gain AN), filter parameters, vocal-tract filter, speech output.]
The source-filter model of speech production is fairly simple, but fricatives cannot be obtained by filtering through the resonances of the vocal tract in the way voiced or aspirated sounds can. The source-filter model is therefore not accurate for fricatives.
[Figure: the chain of human speech production and perception. Speaker: message formulation, language code, neural process, vocal cords, vocal tract. Sound wave. Listener: inner ear, neural process, language code, message understanding.]
2.8.2 Other phonemes:
In diphthongs, the formants of the spectral representation vary continuously over time. For phonemes of this kind, particular attention must be paid to temporal segmentation during recognition.
Semivowels such as /l/, /r/ and /y/ are relatively hard to characterize. These sounds are not considered vowels, but they are called semivowels because of their vowel-like nature. Their acoustic characteristics are strongly influenced by the context in which they occur.
For nasal sounds, the mouth acts as a resonant cavity that traps acoustic energy at certain natural frequencies. These resonant frequencies of the oral cavity appear [...]
- end of Chapter 2 -
CHAPTER 3:
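[Figure: block diagram of a speech recognition system. Input signal, feature extraction, modeling and classification, search and matching, recognized word; supported by speech data, an acoustic model, a lexical model and a language model.]
[Figure: classification of recognition systems into systems with a small dictionary and systems with a medium or large dictionary.]
[Figure: the stages of speech recognition. Speech signal, spectral feature analysis, sequence of features, pattern classification (acoustic model), sequence of words or phonemes, language processing (language model), recognized words and sentences.]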
Two speech feature extraction methods are widely used in current recognition systems: MFCC (mel-scale frequency cepstral coefficients) and PLP (Perceptual Linear Prediction).
Mel-scale cepstral analysis (MFCC)
This method is built on how the human ear perceives different frequency bands. At low frequencies (below 1000 Hz) the ear's perception is linear; at high frequencies it varies logarithmically. Filter banks that are linear at low frequencies and logarithmic at high frequencies are used to extract the important acoustic features of speech.
A tone at 1 kHz, 40 dB above the hearing threshold, is defined as 1000 mel. An approximate formula relating the mel scale to the linear frequency scale is:
$$\mathrm{mel}(f) = 2595 \log_{10}\left(1 + \frac{f}{700}\right)$$
One way to convert to the mel scale is to use a filter bank (Figure 3.4.2) in which each filter has a triangular frequency response. More than 20 filters are typically used. The frequency range usually runs from 0 to Fs/2, where Fs is the sampling frequency of the speech. A band limited to the range LOFREQ to HIFREQ can also be used to discard frequencies that are not needed for processing. For telephone speech, for example, the band can be limited to LOFREQ = 300 and HIFREQ = 3400.
Figure 3.4.2: Triangular filter banks on the mel frequency scale (axes: frequency, mel frequency)
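The mel formula and the LOFREQ/HIFREQ band limits above can be sketched in a few lines. This is a minimal Python illustration, not the project's Matlab code; the function names `hz_to_mel`, `mel_to_hz` and `filter_centers` are hypothetical, and spacing the filter centers uniformly on the mel scale is an assumption about how the triangular bank is laid out.

```python
import math

def hz_to_mel(f_hz):
    """Approximate mel value of a frequency in Hz (formula from the text)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def filter_centers(lofreq, hifreq, n_filters):
    """Center frequencies (Hz) of n_filters triangular filters.

    Assumption: centers are equally spaced on the mel scale between
    lofreq and hifreq, with the band edges themselves excluded.
    """
    lo, hi = hz_to_mel(lofreq), hz_to_mel(hifreq)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (k + 1)) for k in range(n_filters)]

# Telephone-band example from the text: 300 Hz to 3400 Hz, 20 filters.
centers = filter_centers(300.0, 3400.0, 20)
```

Because the spacing is uniform in mel, the centers crowd together at low frequencies and spread out at high frequencies, which mirrors the perceptual behaviour described above.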
Linear predictive coding (LPC)
The LPC model is used to extract the characteristic parameters of the speech signal. The signal analysis produces a sequence of speech frames, which are then transformed for use in acoustic analysis.
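To minimize the error, the best-fitting set of coefficient values indexed by k must be found.
The PLP method
This method combines the two methods presented above.
3.4.2 Pattern classification:
In this step, the system maps the sequence of feature vectors to the optimal sequence of basic speech units. Four methods are commonly applied: template matching, rule-based systems, hidden Markov models, and neural networks.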
The basic principle of template matching is to store a number of speech templates, each consisting of feature vectors. The speech signal to be recognized is analyzed, and its feature vectors are compared with the stored templates. Because speaking rates differ widely, the DTW (Dynamic Time Warping) technique is applied to stretch or compress the signal along the time axis and so reduce the mismatch with the templates.
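To make the time-stretching idea concrete, here is a minimal Python sketch of the classic DTW recurrence. It is illustrative only: the name `dtw_distance`, the Euclidean local distance and the unconstrained step pattern (match, insertion, deletion) are assumptions, not details given in the report.

```python
import math

def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    a and b are lists of equal-dimension vectors (lists of floats).
    Assumption: Euclidean local distance, no slope or window constraints.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.sqrt(sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])))
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

template = [[0.0], [1.0], [2.0]]
slow = [[0.0], [0.0], [1.0], [1.0], [2.0], [2.0]]   # same word, spoken twice as slowly
other = [[2.0], [1.0], [0.0]]                       # a different pattern
```

Here `dtw_distance(template, slow)` is zero because warping absorbs the change in speaking rate, while `dtw_distance(template, other)` stays positive.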
A rule-based system builds a series of criteria on a decision tree to determine which linguistic unit is present in the speech signal. For large speech recognition systems, this method has difficulty generalizing over the diversity of speech signals. A further problem is that a decision tree can hardly recover once a wrong decision has been made early in the analysis.
Hidden Markov models have been studied extensively in recent years as a powerful tool that has been applied successfully to speech recognition. Most speech recognition systems use hidden Markov models. They are presented in detail in Section 3.6.2.
[Figure: block diagram with the blocks: speech signal, feature extraction, a bank of feature detectors, segmentation and labeling, selection, recognized speech.]
The pattern recognition approach is usually chosen for speech recognition applications for the following reasons:
- The algorithms are easy to use and easy to understand.
- It is robust and adapts to different vocabularies, users, feature sets, pattern comparison algorithms, and decision rules.
- It has proven high performance in practice.
[Figure: block diagram with the blocks: speech, signal analysis, feature extraction, voiced/unvoiced/silence decision, knowledge sources, segmentation, labeling, sound classification, phonetic rules, word verification, lexical access, sentence verification, language model.]
The characteristics of recognition systems based on this method are:
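$$\ln F_0(t) = \ln F_b + \sum_{i=1}^{I} A_{pi}\, G_p(t - T_{0i}) + \sum_{j=1}^{J} A_{aj} \left[ G_a(t - T_{1j}) - G_a(t - T_{2j}) \right] \tag{3.6.1.1}$$
$$G_p(t) = \begin{cases} \alpha^2 t \exp(-\alpha t), & t \ge 0 \\ 0, & t < 0 \end{cases} \tag{3.6.1.2}$$
$$G_a(t) = \begin{cases} \min\left[ 1 - (1 + \beta t) \exp(-\beta t),\ \gamma \right], & t \ge 0 \\ 0, & t < 0 \end{cases} \tag{3.6.1.3}$$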
The parameters of the model are:
Constants: Fb is the baseline value of the fundamental frequency contour. Fb depends on the speaker, not on the individual speech samples. α is the natural angular frequency of the phrase command. β is the natural angular frequency of the accent command. γ is the ceiling level of the accent components.
Arguments: I is the number of phrase commands. J is the number of accent commands. Api is the magnitude of the i-th phrase command. Aaj is the amplitude of the j-th accent command. T0i is the onset time of the i-th phrase command. T1j and T2j are the onset and offset times of the j-th accent (tone) command.
In the model, the F0 contour is considered in the log F0 domain; the purpose of this transformation is to make male and female voices comparable. For (3.6.1.1), the values are α = 2.0/s and β = 20.0/s, with α = 3.0/s in some special cases. According to observations, however, α lies in the interval [1.0; 3.0] and β in the interval [19.5; 20.5].
The parameters Ap, α, β, Aa, T1, T2 and Fb are called the Fujisaki parameters, and the analysis-by-synthesis of the F0 contour using the Fujisaki model is called Fujisaki analysis. The model parameters can be generated automatically in various ways, depending on the language being analyzed.
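The following Python sketch evaluates the log-F0 contour from the phrase and accent components defined above. It is illustrative only: the function names and the example command values are hypothetical, and the ceiling γ = 0.9 is an assumption, since the text gives no value for it.

```python
import math

def gp(t, alpha=2.0):
    """Phrase-control response Gp(t)."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0

def ga(t, beta=20.0, gamma=0.9):
    """Accent-control response Ga(t); gamma = 0.9 is an assumed ceiling."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)

def ln_f0(t, fb, phrase_cmds, accent_cmds, alpha=2.0, beta=20.0, gamma=0.9):
    """ln F0(t) for phrase commands (Ap, T0) and accent commands (Aa, T1, T2)."""
    value = math.log(fb)
    for ap, t0 in phrase_cmds:
        value += ap * gp(t - t0, alpha)
    for aa, t1, t2 in accent_cmds:
        value += aa * (ga(t - t1, beta, gamma) - ga(t - t2, beta, gamma))
    return value

# Hypothetical utterance: one phrase command and one positive tone command.
fb = 120.0
phrase = [(0.5, 0.0)]
accent = [(0.4, 0.3, 0.6)]
contour = [math.exp(ln_f0(0.01 * k, fb, phrase, accent)) for k in range(100)]
```

A negative `Aa` pulls the contour below the phrase component in the same way, which is how a negative tone command would be represented.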
Analysis of Vietnamese tones with the Fujisaki model:
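Analysis of the database shows that the level (ngang), rising (sắc) and broken (ngã) tones are represented by a positive tone command, the falling (huyền) and dipping (hỏi) tones by a negative tone command, and the heavy (nặng) tone needs no tone command.
Tone      Tone command
Ngang     positive
Sắc       positive
Hỏi       negative
Huyền     negative
Ngã       positive
Nặng      none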
If P(s,i,t,j) = P(s+h,i,t+h,j), the system is said to be homogeneous in time.
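Note that
$$\sum_{j=1}^{N} a_{ij} = 1, \quad 1 \le i, j \le N$$
When every state can be reached in a single step from any state, $a_{ij} > 0$ for all i, j. For speech, however, $a_{ij} = 0$ may hold for some pairs (i, j).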
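Note that
$$\sum_{k=1}^{M} b_j(k) = 1 \quad \text{for all } j,$$
where M is the number of distinct observation symbols.
Note that
$$\sum_{i=1}^{N} \pi_i = 1.$$
Step 1: Initialization
$$\alpha_1(i) = \pi_i\, b_i(o_1), \quad 1 \le i \le N$$
Step 2: Induction
$$\alpha_{t+1}(j) = \left[ \sum_{i=1}^{N} \alpha_t(i)\, a_{ij} \right] b_j(o_{t+1}), \quad 1 \le t \le T-1,\ 1 \le j \le N$$
Step 3: Termination
$$P(O \mid \lambda) = \sum_{i=1}^{N} \alpha_T(i)$$
The backward variable is computed analogously.
Step 1: Initialization
$$\beta_T(i) = 1, \quad 1 \le i \le N$$
Step 2: Induction:
$$\beta_t(i) = \sum_{j=1}^{N} a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j), \quad t = T-1, T-2, \dots, 1,\ 1 \le i \le N$$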
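The forward and backward recursions can be checked against each other numerically. The sketch below is a plain-Python illustration for a discrete HMM; the function names and the two-state example model are hypothetical, and indices are 0-based, unlike the 1-based formulas.

```python
def forward(pi, a, b, obs):
    """alpha[t][i] for a discrete HMM; obs is a list of symbol indices."""
    n = len(pi)
    alpha = [[pi[i] * b[i][obs[0]] for i in range(n)]]            # initialization
    for t in range(1, len(obs)):                                  # induction
        prev = alpha[-1]
        alpha.append([sum(prev[i] * a[i][j] for i in range(n)) * b[j][obs[t]]
                      for j in range(n)])
    return alpha

def backward(a, b, obs):
    """beta[t][i] for the same model."""
    n = len(a)
    beta = [[1.0] * n]                                            # beta_T(i) = 1
    for t in range(len(obs) - 2, -1, -1):                         # t = T-1, ..., 1
        nxt = beta[0]
        beta.insert(0, [sum(a[i][j] * b[j][obs[t + 1]] * nxt[j] for j in range(n))
                        for i in range(n)])
    return beta

# Hypothetical two-state, two-symbol model.
pi = [0.6, 0.4]
a = [[0.7, 0.3], [0.4, 0.6]]
b = [[0.5, 0.5], [0.1, 0.9]]
obs = [0, 1, 1]

p_forward = sum(forward(pi, a, b, obs)[-1])                       # termination
beta = backward(a, b, obs)
p_backward = sum(pi[i] * b[i][obs[0]] * beta[0][i] for i in range(len(pi)))
```

Both passes yield the same P(O | λ), which is a convenient sanity check when implementing either one.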
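Step 1: Initialization
$$\delta_1(i) = \pi_i\, b_i(o_1), \quad 1 \le i \le N$$
$$\psi_1(i) = 0$$
Step 2: Recursion
$$\delta_t(j) = \max_{1 \le i \le N} \left[ \delta_{t-1}(i)\, a_{ij} \right] b_j(o_t), \quad 2 \le t \le T,\ 1 \le j \le N$$
$$\psi_t(j) = \arg\max_{1 \le i \le N} \left[ \delta_{t-1}(i)\, a_{ij} \right], \quad 2 \le t \le T,\ 1 \le j \le N$$
Step 3: Termination
$$P^* = \max_{1 \le i \le N} \left[ \delta_T(i) \right]$$
$$q_T^* = \arg\max_{1 \le i \le N} \left[ \delta_T(i) \right]$$
$$q_t^* = \psi_{t+1}(q_{t+1}^*), \quad t = T-1, T-2, \dots, 1$$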
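A compact Python sketch of the Viterbi steps above follows. It is an illustration under stated assumptions: the name `viterbi` and the example model are hypothetical, states are 0-indexed, and probabilities are multiplied directly rather than summed in the log domain, so it suits only short sequences.

```python
def viterbi(pi, a, b, obs):
    """Return (P*, best state path) for a discrete HMM."""
    n = len(pi)
    delta = [pi[i] * b[i][obs[0]] for i in range(n)]      # initialization
    psi = []                                              # back-pointers for t >= 2
    for t in range(1, len(obs)):                          # recursion
        back, new_delta = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: delta[i] * a[i][j])
            back.append(best_i)
            new_delta.append(delta[best_i] * a[best_i][j] * b[j][obs[t]])
        psi.append(back)
        delta = new_delta
    last = max(range(n), key=lambda i: delta[i])          # termination
    path = [last]
    for back in reversed(psi):                            # path backtracking
        path.insert(0, back[path[0]])
    return delta[last], path

# Hypothetical two-state, two-symbol model.
pi = [0.6, 0.4]
a = [[0.7, 0.3], [0.4, 0.6]]
b = [[0.5, 0.5], [0.1, 0.9]]
p_star, path = viterbi(pi, a, b, [0, 1, 1])
```

For real utterances, replacing the products with sums of logarithms avoids numerical underflow; that variant is a common refinement not discussed in the text.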
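e. Parameter estimation:
$$\gamma_t(i) = \frac{P(O, q_t = i \mid \lambda)}{P(O \mid \lambda)} = \frac{\alpha_t(i)\, \beta_t(i)}{\sum_{l=1}^{N} \alpha_T(l)}$$
$$\xi_t(i,j) = P(q_t = i, q_{t+1} = j \mid O, \lambda) = \frac{\alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)}{P(O \mid \lambda)} = \frac{\alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)}{\sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)}$$
$$\gamma_t(i) = \sum_{j=1}^{N} \xi_t(i,j).$$
Thus:
$$\sum_{t=1}^{T-1} \gamma_t(i) = \text{expected number of transitions out of state } i \text{ given } O,$$
$$\sum_{t=1}^{T-1} \xi_t(i,j) = \text{expected number of transitions from state } i \text{ to state } j \text{ given } O.$$
The re-estimation formulas for A, B and π are as follows:
$$\bar{\pi}_i = \gamma_1(i) = \frac{P(O, q_1 = i \mid \lambda)}{P(O \mid \lambda)} = \frac{\alpha_1(i)\, \beta_1(i)}{\sum_{l=1}^{N} \alpha_T(l)} \tag{3.6.2.1}$$
$$\bar{a}_{ij} = \frac{\sum_{t=1}^{T-1} \xi_t(i,j)}{\sum_{t=1}^{T-1} \gamma_t(i)} = \frac{\sum_{t=1}^{T-1} \alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)}{\sum_{t=1}^{T-1} \alpha_t(i)\, \beta_t(i)} \tag{3.6.2.2}$$
$$\bar{b}_j(k) = \frac{\sum_{t=1}^{T} \alpha_t(j)\, \beta_t(j)\, \delta(o_t, v_k)}{\sum_{t=1}^{T} \alpha_t(j)\, \beta_t(j)} \tag{3.6.2.3}$$
where we denote:
$$\delta(o_t, v_k) = \begin{cases} 1, & o_t = v_k \\ 0, & \text{otherwise} \end{cases}$$
The re-estimated parameters satisfy the constraints:
$$\sum_{i=1}^{N} \bar{\pi}_i = 1$$
$$\sum_{j=1}^{N} \bar{a}_{ij} = 1, \quad 1 \le i \le N$$
$$\sum_{k=1}^{M} \bar{b}_j(k) = 1, \quad 1 \le j \le N$$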
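One re-estimation pass following (3.6.2.1) to (3.6.2.3) can be sketched in Python as below. This is a hedged illustration: the function names and example model are hypothetical, a single observation sequence is assumed, and there is no scaling or guard against zero denominators, which a practical implementation would need.

```python
def forward(pi, a, b, obs):
    n = len(pi)
    alpha = [[pi[i] * b[i][obs[0]] for i in range(n)]]
    for t in range(1, len(obs)):
        prev = alpha[-1]
        alpha.append([sum(prev[i] * a[i][j] for i in range(n)) * b[j][obs[t]]
                      for j in range(n)])
    return alpha

def backward(a, b, obs):
    n = len(a)
    beta = [[1.0] * n]
    for t in range(len(obs) - 2, -1, -1):
        nxt = beta[0]
        beta.insert(0, [sum(a[i][j] * b[j][obs[t + 1]] * nxt[j] for j in range(n))
                        for i in range(n)])
    return beta

def reestimate(pi, a, b, obs):
    """One Baum-Welch update of (pi, A, B) from a single sequence."""
    n, m, T = len(pi), len(b[0]), len(obs)
    al, be = forward(pi, a, b, obs), backward(a, b, obs)
    p_obs = sum(al[-1])
    gamma = [[al[t][i] * be[t][i] / p_obs for i in range(n)] for t in range(T)]
    xi = [[[al[t][i] * a[i][j] * b[j][obs[t + 1]] * be[t + 1][j] / p_obs
            for j in range(n)] for i in range(n)] for t in range(T - 1)]
    new_pi = list(gamma[0])
    new_a = [[sum(xi[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1))
              for j in range(n)] for i in range(n)]
    new_b = [[sum(gamma[t][j] for t in range(T) if obs[t] == k) /
              sum(gamma[t][j] for t in range(T))
              for k in range(m)] for j in range(n)]
    return new_pi, new_a, new_b

# Hypothetical two-state, two-symbol model and a short observation sequence.
pi = [0.6, 0.4]
a = [[0.7, 0.3], [0.4, 0.6]]
b = [[0.5, 0.5], [0.1, 0.9]]
obs = [0, 1, 1, 0, 1]
new_pi, new_a, new_b = reestimate(pi, a, b, obs)
```

The updated parameters satisfy the three normalization constraints listed above, and the likelihood of `obs` under the new model is not lower than under the old one.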
The left-right model, or Bakis model: this model is commonly used in speech recognition. It is called left-right because, as time increases, the state index increases, that is, the states progress from left to right. This matches the natural structure of speech, which evolves over time from left to right. The basic property of this model is that the coefficients of the transition matrix satisfy $a_{ij} = 0$ for $j < i$, that is, no transition to a state with a lower index than the current one is allowed. In addition, the initial state probabilities satisfy:
$$\pi_i = \begin{cases} 0, & i \ne 1 \\ 1, & i = 1 \end{cases}$$
because the initial state must be state 1 (and the final state must be state N). The model is further constrained so that it cannot jump too far from one state to another; the constraint has the form:
$$a_{ij} = 0, \quad j > i + \Delta$$
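The structural constraints of the left-right model can be illustrated by building an initial transition matrix. In this Python sketch the name `bakis_transitions` is hypothetical, states are 0-indexed (state 0 plays the role of state 1 in the text), and spreading the probability uniformly over the allowed transitions is only an assumed starting point for training.

```python
def bakis_transitions(n_states, max_jump):
    """Transition matrix with a[i][j] = 0 for j < i and for j > i + max_jump."""
    a = []
    for i in range(n_states):
        allowed = [j for j in range(n_states) if i <= j <= i + max_jump]
        row = [0.0] * n_states
        for j in allowed:
            row[j] = 1.0 / len(allowed)   # assumed uniform initial values
        a.append(row)
    return a

n_states, max_jump = 5, 2
a = bakis_transitions(n_states, max_jump)
pi = [1.0] + [0.0] * (n_states - 1)       # the model must start in the first state
```

Entries that start at zero stay at zero under Baum-Welch re-estimation, so the left-right structure is preserved during training.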
Hidden Markov models fall into three types according to the nature of their observation emission function:
Discrete HMM: the space of spectral features is divided into a finite number of regions by vector quantization (VQ). The centroid of each region is represented by a codeword, which is in effect an index into a codebook. A signal frame is converted to a codeword by finding the vector closest to it in the codebook. The weakness of this model is the quantization error, especially when the codebook is small; conversely, a large codebook increases the amount of computation.
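The codeword lookup described above amounts to a nearest-neighbour search. The Python sketch below is illustrative: the name `quantize`, the squared Euclidean distance and the tiny two-dimensional codebook are assumptions, and a real codebook would be trained from data (for example by clustering), which is not shown.

```python
def quantize(frame, codebook):
    """Index of the codebook vector closest to the given feature frame."""
    def sq_dist(u, v):
        return sum((x - y) ** 2 for x, y in zip(u, v))
    return min(range(len(codebook)), key=lambda k: sq_dist(frame, codebook[k]))

# Hypothetical codebook with three codewords.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
frames = [[0.1, -0.2], [0.9, 1.2], [3.5, 0.2]]
symbols = [quantize(f, codebook) for f in frames]   # observation sequence for a discrete HMM
```

The search cost grows linearly with the codebook size, which is the computational trade-off mentioned in the text.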
Continuous HMM: overcomes the weakness of the model above. In [...]
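The probability computation step usually uses the Viterbi algorithm and requires on the order of V·N²·T operations. With a vocabulary of V = 100 words, 5-state models and T = 40 observations per unknown word, this comes to 100 · 5² · 40 = 10^5 operations in total, which is acceptable for today's computers.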
[Figure: isolated-word recognizer based on HMMs. The speech signal S is vector-quantized into the observation sequence O; the probability P(O|λ_v) is computed for the HMM of each word v = 1, ..., V; the maximum is selected, v* = arg max over 1 ≤ v ≤ V of P(O|λ_v), giving the index of the recognized word.]
Vietnamese is a non-inflecting language. Vietnamese syllables are stable and have a clear structure. In particular, no two syllables are pronounced the same but written differently. This makes it easier to build syllable models for recognition.
Besides these advantages, Vietnamese speech recognition also faces many difficulties:
The basic difficulties in speech recognition are that speech varies over time and that there are large differences between speakers, speaking rates, contexts and acoustic environments.
- end of Chapter 3 -
CHAPTER 4:
NEURAL NETWORKS
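[Figure: the input signals x1, x2, ..., xn are multiplied by the synaptic weights Wk1, Wk2, ..., Wkn, summed with the bias bk in the linear combiner, and passed through the activation function φ(.) to give the output yk.]
Figure 4.1: Model of a neuron in a neural network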
A neuron model has three basic components:
1. A set of synapses, or connections, each characterized by its own weight. That is, a signal xj at the input of synapse j connected to neuron k is multiplied by the synaptic weight wkj. Here k is the index of the neuron at [...]
$$\varphi(x) = \frac{1}{1 + \exp(-x)} \tag{4.1.1}$$
$$y_k = \sum_{i=1}^{n} w_{ki}\, x_i + b \tag{4.1.2}$$
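The neuron model can be sketched in Python as a weighted sum followed by the logistic activation. The names `sigmoid` and `neuron_output` and the example weights are hypothetical; applying the activation to the weighted sum follows the structure of Figure 4.1.

```python
import math

def sigmoid(x):
    """Logistic activation function phi(x)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(weights, inputs, bias):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    u = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(u)

# Hypothetical neuron with two inputs.
y = neuron_output([1.0, -1.0], [2.0, 2.0], 0.0)   # weighted sum is 0, so y = 0.5
```

Because the logistic function maps any real value into the interval (0, 1), the output can be read as a soft decision.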
Artificial neural networks are widely used in engineering: in control engineering they are used to identify, predict and control dynamic systems; in electronics and telecommunications they are used for image processing, image recognition and communications; in power systems they are used to identify, predict and control transformer stations.
Single-layer linear perceptron (SLP)
(4.2.1.1)
Let the training dataset be (xk, yk). Given the training data and an ANN with its weights, training the network means adjusting the weights so that, for each input vector xk, the network produces an output y^k that is as close as possible to the desired result under some criterion. A common choice of criterion is the least-squares criterion.
(4.2.1.2)
(4.2.1.3)
(4.2.1.4)
From equation (4.2.1.1) we have
= xi
(4.2.1.5)
(4.2.1.6)
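Equation (4.2.1.6) gives the change in the network weights after one input value is read; this change is proportional to the difference between the desired output value and the value obtained at the output nodes.
We define the quantity
$$\delta_i = y_i - \hat{y}_i$$
Equation (4.2.1.6) can then be rewritten as
$$\Delta w_{ij} = \eta\, \delta_i\, x_j \tag{4.2.1.7}$$
where η denotes the learning rate.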
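The weight update above, in which each weight changes in proportion to the output error and the corresponding input, can be sketched for a single linear output node. This Python illustration makes several assumptions: the name `train_slp`, the learning rate `eta = 0.1`, the epoch count and the toy data are all hypothetical, and the bias term is omitted for brevity.

```python
def train_slp(samples, n_inputs, eta=0.1, epochs=200):
    """Train one linear output node with the delta rule.

    samples is a list of (input vector, desired output) pairs.
    """
    w = [0.0] * n_inputs
    for _ in range(epochs):
        for x, y in samples:
            y_hat = sum(wj * xj for wj, xj in zip(w, x))    # network output
            delta = y - y_hat                               # output error
            w = [wj + eta * delta * xj for wj, xj in zip(w, x)]
    return w

# Toy data generated by y = 2*x1 - x2.
samples = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
w = train_slp(samples, 2)
```

With consistent data and a small enough learning rate, the weights approach the generating values, here close to [2, -1].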
We can compute
(4.2.1.8)
where we can compute:
(4.2.1.9)
From equations (4.2.1.8) and (4.2.1.9) we have
$$\Delta w_{ij} = \eta\, (y_i - \hat{y}_i)\, \varphi_i'(y_i)\, \hat{h}_j$$
Let:
(4.2.1.10)
Finally, we obtain a weight update formula similar to that of the single-layer perceptron:
(4.2.1.11)
Now consider the case where the weight vjk between input node k and hidden node j is adjusted according to the error function E:
We can compute
(4.2.1.12)
(4.2.1.13)
From equations (4.2.1.8) and (4.2.1.9) we have
Let:
Finally, we obtain the weight update formula for the multilayer perceptron:
(4.2.1.14)
Now consider the case where the weight vjk between input node k and hidden node j is adjusted according to the error function E:
(4.2.1.15)
We can compute
(4.2.1.16)
Equation (4.12) shows that the change in the weight vjk involves all the output nodes of the network.
(4.2.1.17)
where:
(4.2.1.18)
where hj^ is the output value of the j-th hidden node. We next compute:
(4.2.1.19)
From equations (4.2.1.15), (4.2.1.16), (4.2.1.17) and (4.2.1.19) we have
(4.2.1.20)
4.3.3 Adaptivity:
- end of Chapter 4 -
Audio processing functions:
[y fs]=wavread(wavfile)
wavwrite(y,fs,wavfile)
sound(y)
y=wavrecord(n, fs)
Chapter 5: Matlab functions
VoiceBox toolbox
VoiceBox is a Matlab toolbox for speech processing developed by Mike Brookes. VoiceBox requires Matlab version 5 or later. It can be downloaded from
http://webscripts.softpedia.com/script/Scientific-Engineering-Ruby/Signal-Processing/Voicebox34702.html.
VoiceBox functions can be divided into the following groups:
- Audio file handling (reading and writing wav files and some other audio formats)
- Signal spectrum analysis
- LPC analysis
- MFCC computation and spectral-cepstral conversion
- Frequency conversion (mel scale, midi, ...)
- Fourier transform, inverse Fourier transform, real Fourier transform, ...
- Distance (deviation) computation between vectors and sequences of vectors
- Noise removal from speech signals
The most important function is feature extraction from the speech signal, here using the two most common feature types, LPC and MFCC.
The VoiceBox function that computes the MFCCs of a signal is:
melcepst(s,fs,w,nc,p,n,inc,fl,fh)
The function takes many parameters; the important ones are:
s is the speech signal vector (obtained with wavread or wavrecord), and fs is the sampling frequency (default 11025).
p is the number of mel-scale filters.
nc is the number of MFCC coefficients to compute (default 12).
w is a string of further options: if it contains 'e', the log energy is also computed; if it contains 'd', the delta features are also computed.
Even so, the function can be called simply as:
c=melcepst(s,fs);
The call returns a matrix c in which each row holds the 12 MFCC coefficients of one frame.
To append the log energy and the delta data, as other recognition systems do, use the command:
c=melcepst(s,fs,'ed');
Each row of c is then a vector of 26 coefficients for the corresponding frame (12 MFCCs plus the log energy, and their deltas).
NetLab toolbox
NetLab was developed by Ian T. Nabney. We use the NetLab toolbox to build, train and test the MLP neural network for the recognition system in this project. Download link:
http://webscripts.softpedia.com/script/Scientific-Engineering-Ruby/Controls-and-SystemsModeling/Netlab-32705.html
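The command that creates an MLP in NetLab has the following syntax:
net = mlp(inode, hnode, onode, func, anpha);
where:
inode, hnode and onode are the numbers of neurons in the input, hidden and output layers, respectively.
func is the type of activation function of the output layer; it can take values such as 'logistic' or 'softmax'.
anpha bounds the magnitude of the weights (in NetLab this argument sets the weight-decay prior) and is usually set to 0.01.
net is the MLP network created by the function.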
Once created, the MLP can be trained on a given training dataset. The NetLab command for training an MLP has the following syntax:
[net, error] = mlptrain(net, x, t, its)
where:
x and t form the training dataset: x holds the input vectors and t the desired output vectors (targets).
its is the number of training iterations (the number of times the error back-propagation algorithm is run).
net is the neural network.
error is the total error of the last training iteration.
After training, the MLP can be used to compute the output for arbitrary inputs.
The command that computes the MLP output y for an input x is:
y = mlpfwd(net, x)
where:
x is one or more input vectors
y holds the corresponding output vectors.
- end of Chapter 5 -
Speech recognition is not a new field, but it is an extremely complex one. Research on it began worldwide more than 50 years ago, yet the practical results achieved so far are very modest. It will be a long time before a system can be built that understands speech the way humans do. Within the scope of a course project, we build only a small program that recognizes the ten Vietnamese digits using tools available in Matlab. We would very much like to build a Vietnamese recognition system with a larger dictionary that could be used in practice. However, because we are new to this field, our skills and knowledge are still limited, and time and equipment were also constraints, so we could only build a small recognition system. If we have the opportunity to study this field in more depth in the future, we hope to develop this project further so that it can be applied in practice.
6.1 Construction steps
The recognition system for the ten Vietnamese digits is built with the following characteristics:
[...] ten).
The block diagram of the system for recognizing spoken Vietnamese digits with an MLP neural network in the Matlab environment is shown in Figure 6.1. The function of each block is as follows:
Acquisition and preprocessing: in the training phase the speech signal is collected manually: recording software is used to record, denoise and cut the audio into separate words, each saved to its own file (named after the word it contains).
The dataset we built ourselves consists of:
Figure 6.1: Block diagram of the system for recognizing spoken Vietnamese digits with an MLP neural network in the Matlab environment
Acquisition and preprocessing (cutting out the regions that contain no speech) are performed by the following commands:
x = wavrecord(10000,8000);   % sampling rate 8 kHz, records a little over 1 s
x = x';                      % convert x to a row vector
y = endcut(x, 64, 1.5E-3);   % cut out the silent intervals
The function endcut removes silent intervals that contain no audio signal; its algorithm is shown in Figure 6.2. The code is as follows:
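function y = endcut(x, n, es)
% Remove silent frames from x.
% n is the frame length, es is the energy threshold.
if nargin < 3
    es = 2E-3;               % default threshold is 2e-3
end
if nargin < 2
    n = 128;                 % default frame length is 128 samples
end
x = x - mean(x);             % remove the DC offset; x is assumed to be a normalized row vector
y = []; i = 1;
while i <= length(x)-n+1     % visit every complete frame, including the last one
    t = x(i:i+n-1);          % current frame
    e = mean(t.^2);          % mean energy of the frame
    if (e > es)
        y = [y t];           % keep frames whose energy exceeds the threshold
    end
    i = i + n;
end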
[Figure 6.2: flowchart of endcut. Begin; is the frame energy greater than the threshold energy?; if so, save the frame; End.]
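For readers without Matlab, the silence-removal logic of endcut can be sketched in Python. This is an illustrative port under stated assumptions, not the project's code: it works on a plain list of samples, processes every complete frame, and adds an empty-input guard that the Matlab listing does not have.

```python
def endcut(x, n=128, es=2e-3):
    """Drop frames whose mean energy does not exceed the threshold es.

    x is a list of samples, n the frame length in samples.
    Trailing samples that do not fill a whole frame are discarded.
    """
    if not x:
        return []
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]            # remove the DC offset
    kept = []
    for start in range(0, len(centered) - n + 1, n):
        frame = centered[start:start + n]
        energy = sum(v * v for v in frame) / n  # mean energy of the frame
        if energy > es:
            kept.extend(frame)
    return kept

# Synthetic example: silence, a burst of alternating +/-0.5 samples, silence.
signal = [0.0] * 64 + [0.5, -0.5] * 32 + [0.0] * 64
speech = endcut(signal, 64, 1.5e-3)
```

Only the middle frame survives in this example, since its mean energy of 0.25 is far above the threshold while the silent frames have zero energy.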
    p = 8;
end;
if nargin < 2                % default sampling rate = 8 kHz
    fs = 8000;
end;
if ischar(wav)               % if wav is a file name, read the file
    [wav, fs] = wavread(wav);
end;
% normalize so that max(wav) = 1
mx = max(wav);
wav = wav ./ mx;
% compute the p-element MFCC vector, including the energy
mfcc = melcepst(wav,fs,'e',p-1);
- end of Chapter 6 -
Contents of the project:
PART I:
THEORY
Chapter 1: Introduction
Chapter 2: Theory of sound and speech
2.1 Origin of sound
2.2 Characteristic quantities of sound
2.3 Fundamental frequencies of speech
2.4 Mechanism of human speech production
2.5 Source-filter model of speech production
2.6 The human auditory system
2.7 Human speech production and perception
2.8 Speech sounds and their features
2.8.1 Vowels
2.8.2 Other phonemes
Chapter 3: Theory of speech recognition
3.1 Overview of speech recognition
3.2 Basic principles of speech recognition
3.3 Speech recognition systems
3.4 Stages of speech recognition
3.4.1 Analysis of speech features
3.4.2 Pattern classification
3.4.3 Language processing
3.5 Approaches to speech recognition
3.5.1 Acoustic-phonetic approach
3.5.2 Pattern recognition approach
3.5.3 Artificial intelligence approach
3.6 Speech recognition methods
3.6.1 Fujisaki model
3.6.2 Hidden Markov model
3.6.3 Neural network model
3.7 Advantages and difficulties in Vietnamese speech recognition
Chapter 4: Neural networks and their application to speech recognition
4.1 Definition of a neural network
4.2 Neural network architectures
4.3 Characteristics of neural networks
PART II: