
25/3/2015

Reproducing the feature outputs of common programs in Matlab using melfcc.m


Reproducing the feature outputs of common programs using Matlab and melfcc.m
When I decided to implement my own version of warped-frequency cepstral features (such as MFCC) in Matlab, I wanted to be able to
duplicate the output of the common programs used for these features, as well as to be able to invert the outputs of those programs. This
page gives some examples of how cepstra can be calculated by three common programs (HTK's HCopy, feacalc from SPRACHcore, and
mfcc.m from Malcolm Slaney's Auditory Toolbox for Matlab), and how to duplicate the results (or very nearly) using my melfcc.m
routine. This also automatically shows you how to invert cepstra calculated by either path into spectrograms or waveforms using
invmelfcc.m, since its arguments are the same.

HTK MFCC
2013-02-26: For an emulation of HTK's MFCC calculation accurate to the 3rd decimal place, see the modified rastamat code in
calc_mfcc. The main differences were that HTK applies pre-emphasis independently on each window, and also removes the mean on each
window.
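
As a rough illustration of what per-window processing means, the sketch below removes the mean and applies pre-emphasis inside a single frame. The frame values are made up, and the coefficient 0.97 matches PREEMCOEF in the config file further down. The order of the two steps and the (1-k) scaling of the first sample are assumptions about HTK's behaviour, not something taken from calc_mfcc.

```matlab
% Toy frame of samples (made-up values, for illustration only)
frame = [1; 3; 2; 5; 4];
k = 0.97;                          % pre-emphasis coefficient (PREEMCOEF)

% Remove the mean of this frame only
frame = frame - mean(frame);

% Pre-emphasis inside the frame: y(n) = x(n) - k*x(n-1).
% The first sample has no predecessor within the frame; scaling it by (1-k)
% is an assumption about how that edge is treated.
emph = [frame(1) * (1 - k); frame(2:end) - k * frame(1:end-1)];

disp(emph');
```

By contrast, filtering the whole waveform once before framing lets the first sample of each frame see the true previous sample, which is one plausible source of small numerical differences between the two approaches.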
Calculating features in HTK is done via HCopy, which can convert between a wide range of representations including waveform to
cepstra. HCopy takes its options from a config file. Thus, to convert 16 kHz sampled sound files to standard Mel-frequency cepstral
coefficients (MFCCs), you would have a file config.mfcc containing:
SOURCEKIND=WAVEFORM
SOURCEFORMAT=WAVE
SOURCERATE=625
TARGETKIND=MFCC_0
TARGETRATE=100000.0
WINDOWSIZE=250000.0
USEHAMMING=T
PREEMCOEF=0.97
NUMCHANS=20
CEPLIFTER=22
NUMCEPS=12

(The SOURCEFORMAT option specifies that the wave files are in MS WAVE format.) Then to calculate the features, you simply run HCopy from
the Unix command line:
$ HCopy -C config.mfcc sa1.wav sa1mfcc.htk
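
The time-valued options in config.mfcc are given in HTK's units of 100 ns, which makes them a little opaque. The short sketch below converts them to a sampling rate, frame step and window length in more familiar units. The variable names are purely illustrative; 'wintime' is borrowed from the melfcc call in the Auditory Toolbox section, while 'hoptime' is an assumed name for the corresponding frame-step setting.

```matlab
htk_unit   = 100e-9;      % HTK time unit: 100 ns, in seconds
sourcerate = 625;         % SOURCERATE from config.mfcc (sample period)
targetrate = 100000.0;    % TARGETRATE (frame period)
windowsize = 250000.0;    % WINDOWSIZE (analysis window length)

sr      = 1 / (sourcerate * htk_unit);   % sampling rate in Hz
hoptime = targetrate * htk_unit;         % frame step in seconds
wintime = windowsize * htk_unit;         % window length in seconds

fprintf('sr = %g Hz, hop = %g s, window = %g s\n', sr, hoptime, wintime);
```

These work out to 16 kHz, 10 ms and 25 ms. The HTK-style melfcc calls on this page do not set a window or hop explicitly, so presumably its defaults already correspond to these values; that is an inference rather than something stated here.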

We can emulate this processing in Matlab, and compare the results, as below: (Note that the ">>" at the start of each line is an image, so
you can cut and copy multiple lines of text directly into Matlab without having to worry about the prompts).
% Load a speech waveform
[d,sr] = wavread('sa1.wav');
% Calculate HTK-style MFCCs
% (the negative lifterexp selects HTK-style liftering, matching CEPLIFTER)
mfc = melfcc(d, sr, 'lifterexp', -22, 'nbands', 20, ...
    'dcttype', 3, 'maxfreq', 8000, 'fbtype', 'htkmel', 'sumpower', 0);
% Load the features from HCopy and compare:
htkmfc = readhtk('sa1mfcc.htk');
% Reorder and scale to be like melfcc output
htkmfc = 2*htkmfc(:, [13 [1:12]])';
% (melfcc.m is 2x HCopy because it deals in power, not magnitude, spectra)
subplot(311)
imagesc(htkmfc); axis xy; colorbar
title('HTK MFCC');
subplot(312)
imagesc(mfc); axis xy; colorbar
title('melfcc MFCC');
subplot(313)
imagesc(htkmfc - mfc); axis xy; colorbar
title('difference HTK - melfcc');
% Difference occasionally peaks at as much as a few percent (unexplained),
% but is basically negligible

% Invert the HTK features back to waveform, auditory spectrogram,
% regular spectrogram (same args as melfcc())
[dr,aspec,spec] = invmelfcc(htkmfc, sr, 'lifterexp', -22, 'nbands', 20, ...
    'dcttype', 3, 'maxfreq', 8000, 'fbtype', 'htkmel', 'sumpower', 0);
subplot(311)
imagesc(10*log10(spec)); axis xy; colorbar
title('Short-time power spectrum inverted from HTK MFCCs')
subplot(312)
specgram(dr, 512, sr); colorbar
title('Spectrogram of reconstructed (noise-excited) waveform');


subplot(313)
specgram(d, 512, sr); colorbar
title('Original signal spectrogram');
% Spectrograms look pretty close, although noise excitation
% of reconstruction gives it a weird 'whispering crowd' sound
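
The two post-processing steps applied to the HCopy output above can be checked on toy data. The indexing [13 [1:12]] moves the 13th column to the front, which is consistent with the MFCC_0 layout placing c0 after c1..c12. The factor of 2 follows because the log of a power spectrum is twice the log of the magnitude spectrum and the cosine transform is linear. The arrays below are made up and the DCT basis is a plain hand-built one, not the transform used inside melfcc.m.

```matlab
% Made-up stand-in for HCopy output: 2 frames x 13 columns, c1..c12 then c0
htk_out = [ (1:12), 100; (13:24), 200 ];   % last column plays the role of c0

% Move column 13 to the front, then transpose to coefficients x frames
reordered = htk_out(:, [13 1:12])';

% Doubling: log(|X|^2) = 2*log(|X|), and a cosine transform is linear,
% so cepstra computed from power are twice those computed from magnitude.
mag = [1.0; 2.5; 0.7; 3.2; 1.9];           % made-up magnitude spectrum
N = numel(mag);
n = 0:N-1;
C = cos(pi / N * (n') * (n + 0.5));        % hand-built DCT-II basis (unnormalized)
cep_mag = C * log(mag);
cep_pow = C * log(mag.^2);
maxerr = max(abs(cep_pow - 2 * cep_mag));  % should be at rounding-error level
```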

HTK PLP
HTK can also calculate PLP features. It turns out that these are somewhat different from the MFCC features because the cepstra are
calculated by a different algorithm. However, we can still emulate and invert them with different parameters. To calculate PLP features
with HCopy, we need a new config file, config.plp:
SOURCEKIND=WAVEFORM
SOURCEFORMAT=WAVE
SOURCERATE=625
TARGETKIND=PLP_0
TARGETRATE=100000.0
WINDOWSIZE=250000.0
USEHAMMING=T
PREEMCOEF=0.97
NUMCHANS=20
CEPLIFTER=22
NUMCEPS=12
USEPOWER=T
LPCORDER=12

(TARGETKIND is changed, and USEPOWER and LPCORDER are added). Then we calculate the features:
$ HCopy -C config.plp sa1.wav sa1plp.htk

..and compare to the Matlab version:
[d,sr] = wavread('sa1.wav');
% Calculate HTK-style PLPs
plp = melfcc(d, sr, 'lifterexp', -22, 'nbands', 20, ...
    'dcttype', 1, 'maxfreq', 8000, 'fbtype', 'htkmel', ...
    'modelorder', 12, 'usecmp', 1);
% Load the HCopy features
htkplp = readhtk('sa1plp.htk');
% Reorder (no scaling in this case)
htkplp = htkplp(:, [13 [1:12]])';
subplot(311)
imagesc(htkplp); axis xy; colorbar
title('HTK PLP');
subplot(312)
imagesc(plp); axis xy; colorbar
title('melfcc PLP');
subplot(313)
imagesc(htkplp - plp); axis xy; colorbar
title('difference HTK - melfcc');
% Unexplained differences can be up to 20% for higher-order
% cepstra, but essentially the same

% Invert the HTK features back again by mirroring args to melfcc
[dr,aspec,spec] = invmelfcc(htkplp, sr, 'lifterexp', -22, 'nbands', 20, ...
    'dcttype', 1, 'maxfreq', 8000, 'fbtype', 'htkmel', ...
    'modelorder', 12, 'usecmp', 1);
subplot(311)
imagesc(10*log10(spec)); axis xy; colorbar
title('Short-time power spectrum inverted from HTK PLPs')
subplot(312)
specgram(dr, 512, sr); colorbar
title('Spectrogram of reconstructed (noise-excited) waveform');
subplot(313)
specgram(d, 512, sr); colorbar
title('Original signal spectrogram');
% Pretty close
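
A difference image shows where two feature sets disagree, but a figure such as "up to 20% for higher-order cepstra" is easier to pin down with a per-coefficient number. The sketch below computes one possible measure, the RMS difference of each coefficient relative to the RMS of the reference. This particular metric is only a suggestion, and the matrices are made-up stand-ins rather than real features.

```matlab
% Made-up stand-ins for two feature matrices (coefficients x frames)
a = [10 12 11; 1.0 -2.0 1.5; 0.2 -0.1 0.3];     % plays the role of the reference
b = [10 12 11; 1.1 -2.0 1.4; 0.25 -0.1 0.3];    % plays the role of the emulation

% RMS of the difference per coefficient, relative to the RMS of the reference
rms_diff = sqrt(mean((a - b).^2, 2));
rms_ref  = sqrt(mean(a.^2, 2));
rel = rms_diff ./ rms_ref;

disp(rel');
```

With real data, htkplp and plp from the listing above would take the place of a and b, provided both have the same number of frames.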

feacalc MFCC

feacalc is the main feature calculation program from ICSI's SPRACHcore package. It's actually a wrapper around the older rasta, which
was the original C-language implementation of RASTA and PLP feature calculation. feacalc has been expanded to be able to calculate
(its own version of) MFCC features, so to parallel the HTK examples above, we'll start with feacalc's MFCC feature. They can be
calculated with the following command line:
$ feacalc -sr 16000 -nyq 8000 -delta 0 -ras no -plp no \
    -dom cep -com no -frq mel -filt tri -cep 13 -opf htk \
    sa1.wav -o sa1fcmfc.htk

and we duplicate this in Matlab as follows:
[d,sr] = wavread('sa1.wav');
% Calculate feacalc-style MFCCs
% (scale to match normalization of Mel filters)
mfc2 = melfcc(d*5.5289, sr, 'lifterexp', 0.6, 'nbands', 19, ...
    'dcttype', 4, 'maxfreq', 8000, 'fbtype', 'fcmel', 'preemph', 0);
% Load the feacalc features (written in HTK format)
fcmfc = readhtk('sa1fcmfc.htk');
% No need to reorder or scale, just transpose
fcmfc = fcmfc';
subplot(311)
imagesc(fcmfc(2:13,:)); axis xy; colorbar
title('feacalc MFCC');
subplot(312)
imagesc(mfc2(2:13,:)); axis xy; colorbar
title('melfcc MFCC (feacalc-style)');
subplot(313)
imagesc(fcmfc - mfc2); axis xy; colorbar
title('difference feacalc - melfcc');
% Small differences in high-order cepstra due to
% cumulative errors in Mel filter shapes

..and inverting works just the same as above.
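
The factor 5.5289 applied to d in the feacalc-style listing may look odd, since it scales the waveform rather than the filters. Scaling a signal by g multiplies every bin of its power spectrum by g^2, so every log band energy shifts by the same amount, 2*log(g). A fixed waveform gain can therefore stand in for a different filter normalization, provided that normalization is itself a uniform gain across bands (an assumption here). The test signal below is made up.

```matlab
g = 5.5289;                           % gain applied to d in the melfcc call above
x = sin(2*pi*(0:255)/16)';            % made-up test signal
P  = abs(fft(x)).^2;                  % power spectrum of the original
Pg = abs(fft(g*x)).^2;                % power spectrum of the scaled signal

offset   = log(sum(Pg)) - log(sum(P)); % shift in total log power
expected = 2 * log(g);                 % what the argument above predicts
fprintf('offset = %.4f, 2*log(g) = %.4f\n', offset, expected);
```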

feacalc PLP
feacalc was originally designed to calculate PLP (and Rasta) features, so this is its more 'native' invocation:
$ feacalc -sr 16000 -nyq 8000 -delta 0 -ras no -dom cep -plp 12 \
    -opf htk sa1.wav -o sa1fcplp.htk

..which we duplicate in Matlab as follows:
[d,sr] = wavread('sa1.wav');
% Calculate feacalc-style PLPs
plp2 = melfcc(d, sr, 'lifterexp', 0.6, 'nbands', 21, ...
    'dcttype', 1, 'maxfreq', 8000, 'fbtype', 'bark', 'preemph', 0, ...
    'numcep', 13, 'modelorder', 12, 'usecmp', 1);
% Load the feacalc features (written in HTK format)
fcplp = readhtk('sa1fcplp.htk');
% just transpose
fcplp = fcplp';
subplot(311)
imagesc(fcplp(2:13,:)); axis xy; colorbar
title('feacalc PLP');
subplot(312)
imagesc(plp2(2:13,:)); axis xy; colorbar
title('melfcc PLP (feacalc-style)');
subplot(313)
imagesc(fcplp - plp2); axis xy; colorbar
title('difference feacalc - melfcc');
% A few localized differences due to windows etc.

..and once again inverting works just the same as above.
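
The PLP emulation above uses 'fbtype','bark' with 21 bands up to 8000 Hz. As a plausibility check, the sketch below evaluates one commonly used Bark approximation, 6*asinh(f/600), at the Nyquist frequency. Both the formula and the "one band per Bark plus one" rule are assumptions made for illustration; neither is stated on this page.

```matlab
% One common Bark-scale approximation (assumed for this sketch)
hz2bark_approx = @(f) 6 * asinh(f / 600);

nyq      = 8000;                      % Nyquist frequency for 16 kHz audio
nyq_bark = hz2bark_approx(nyq);       % Nyquist expressed in Bark
nb       = ceil(nyq_bark) + 1;        % guess: one band per Bark, plus one

fprintf('Nyquist = %.2f Bark, suggested bands = %d\n', nyq_bark, nb);
```

Under these assumptions the band count comes out at 21, which at least agrees with the 'nbands' value used in the listing.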

Auditory Toolbox mfcc.m
The most popular tool for calculating MFCCs in Matlab is mfcc.m from Malcolm Slaney's Auditory Toolbox. This is what I used for a
long time, until I needed something with more flexibility. That flexibility includes being able to duplicate mfcc.m. Here's how we can
compare them in Matlab.
[d,sr] = wavread('sa1.wav');
% Calculate MFCCs using mfcc.m from the Auditory Toolbox
% (gain should be 2^15 because melfcc scales by that amount,
% but in this case mfcc uses 2x FFT len)
ce = mfcc(d*(2^14), sr);
% Scale them to match (log_10 and power)
ce = log(10)*2*ce;


% Duplicate with melfcc.m
mfc3 = melfcc(d, sr, 'lifterexp', 0, 'minfreq', 133.33, ...
    'maxfreq', 6855.6, 'wintime', 0.016, 'sumpower', 0);
% ..and compare:
subplot(311)
imagesc(ce(2:13,:)); axis xy; colorbar
title('Auditory Toolbox MFCC');
subplot(312)
imagesc(mfc3(2:13,:)); axis xy; colorbar
title('melfcc MFCC (AudToolbox-style)');
subplot(313)
imagesc(ce - mfc3); axis xy; colorbar
title('difference AudTBox - melfcc');
% Small differences mainly due to hanning vs. hamming
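
The scaling ce = log(10)*2*ce used above is annotated "(log_10 and power)", which suggests that mfcc.m works with base-10 logs of magnitude while melfcc.m works with natural logs of power. If that reading is right, the conversion is just a change of log base combined with the square, as the sketch below confirms on made-up magnitudes. Because the cosine transform is linear, the same factor carries over to the cepstra.

```matlab
mag = [0.5 1 2 10];                   % made-up spectral magnitudes

log10_mag = log10(mag);               % base-10 log of magnitude
ln_pow    = log(mag.^2);              % natural log of power

converted = log(10) * 2 * log10_mag;  % same scaling as applied to ce above
maxerr = max(abs(converted - ln_pow)); % should be at rounding-error level
```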

Notes on the differences between different MFCCs
Mel mapping function
Mel filter normalization
DCT used to calculate cepstrum
Number of Mel bands (and hence their width)
Frequency span of Mel bands
Liftering: rasta, htk, none
Details of initial STFT (odd/even, hann/hamm, FFT length, window length)
Mel integration in linear or power domain
Dither and DC removal
Pre-emphasis
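
To make the first item in this list concrete, the sketch below compares two Mel mapping functions that are often quoted: the formula usually attributed to HTK, and a Slaney-style mapping that is linear below 1000 Hz and logarithmic above. Both are written from commonly cited definitions and are assumptions for illustration, not code taken from HTK, the Auditory Toolbox, or melfcc.m. The two scales use different units, so only their shapes are comparable.

```matlab
% HTK-style Mel mapping (widely quoted formula; assumed here)
mel_htk = @(f) 2595 * log10(1 + f / 700);

% Slaney-style mapping: linear below 1000 Hz, logarithmic above (assumed constants)
f_sp    = 200 / 3;                    % Hz per unit in the linear region
brkfrq  = 1000;                       % break frequency in Hz
brkpt   = brkfrq / f_sp;              % scale value at the break
logstep = exp(log(6.4) / 27);         % ratio between successive log-spaced units
mel_slaney = @(f) (f < brkfrq)  .* (f / f_sp) + ...
                  (f >= brkfrq) .* (brkpt + log(max(f, brkfrq) / brkfrq) / log(logstep));

% Rows: frequency in Hz, HTK-style value, Slaney-style value
f = [0 500 1000 4000 8000];
disp([f; mel_htk(f); mel_slaney(f)]);
```

The different curvature of the two mappings moves the band centres, which in turn changes every cepstral coefficient, so matching this choice is a prerequisite for matching any of the programs above.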
Last updated: $Date: 2013/02/26 17:00:16 $
Dan Ellis <dpwe@ee.columbia.edu>

http://www.ee.columbia.edu/ln/rosa/matlab/rastamat/mfccs.html
