By Avinash Nehemiah and Valerie Leung, MathWorks
Computer vision engineers have used machine learning techniques for decades to detect objects of interest in images and to classify or identify categories of objects. They extract features representing points, regions, or objects of interest and then use those features to train a model to classify or learn patterns in the image data.
In traditional machine learning, feature selection is a time-consuming manual process. Feature extraction usually involves processing each image with one or more image processing operations, such as calculating the gradient, to extract the discriminative information from each image.
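
To make the idea of a hand-crafted feature concrete, here is a minimal sketch of a gradient-based descriptor written in base MATLAB. The synthetic image, the 8-bin layout, and every variable name are assumptions made for illustration; they are not taken from the article's downloadable code.

```matlab
% Synthetic 64x64 grayscale image: a bright square on a dark background
I = zeros(64, 64);
I(20:44, 20:44) = 1;

% Horizontal and vertical intensity gradients (finite differences)
[Gx, Gy] = gradient(I);

% Gradient magnitude and orientation (radians, in [-pi, pi]) at each pixel
mag = hypot(Gx, Gy);
ang = atan2(Gy, Gx);

% Map each orientation to one of 8 equal-width bins covering [-pi, pi]
numBins = 8;
binIdx = min(floor((ang(:) + pi) / (2 * pi) * numBins) + 1, numBins);

% Hand-crafted feature: orientation histogram weighted by gradient magnitude
featureVector = accumarray(binIdx, mag(:), [numBins 1])';

% Normalize so the feature does not depend on overall image contrast
featureVector = featureVector / sum(featureVector);
disp(featureVector)
```

A vector like `featureVector` is what a traditional pipeline would pass to a classifier. Choosing the operation, the bin count, and the normalization by hand is the manual work referred to above.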
Enter deep learning. Deep learning algorithms can learn features, representations, and tasks directly from images, text, and sound, eliminating the need for manual feature selection.
Using a simple object detection and recognition example, this article illustrates how easy it is to use MATLAB for deep learning, even without extensive knowledge of advanced computer vision algorithms or neural networks.
The code used in this example is available for download.
Getting Started
The goal in this example is to train an algorithm to detect a pet in a video and correctly label the pet as a cat or a dog. We'll be using a convolutional neural network (CNN), a specific type of deep learning algorithm that can both perform classification and extract features from raw images.
To build the object detection and recognition algorithm in MATLAB, all we need is a pretrained CNN and some dog and cat images. We'll use the CNN to extract discriminative features from the images, and then use a MATLAB app to train a machine learning algorithm to discriminate between cats and dogs.
% Download the pretrained AlexNet model (MatConvNet format)
cnnFullMatFile = '\networks\imagenet-caffe-alex.mat';
websave(cnnFullMatFile, ...
    'http://www.vlfeat.org/matconvnet/models/beta16/imagenet-caffe-alex.mat');
% Load the MatConvNet network into a SeriesNetwork
convnet = helperImportMatConvNet(cnnFullMatFile);
% View the CNN architecture
convnet.Layers
%% Set up image data
dataFolder = '\data\PetImages';
categories = {'Cat', 'Dog'};
imds = imageDatastore(fullfile(dataFolder, categories), ...
    'LabelSource', 'foldernames');
We then select a subset of the data that gives us an equal number of dog and cat images.
tbl = countEachLabel(imds)
%% Use the smallest overlap set
minSetCount = min(tbl{:,2});
% Use the splitEachLabel method to trim the set.
imds = splitEachLabel(imds, minSetCount, 'randomize');
% Notice that each set now has exactly the same number of images.
countEachLabel(imds)
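
The balancing step can also be seen without a datastore. The sketch below repeats the same logic on a plain cell array of labels using only base MATLAB. The label list is hypothetical and stands in for the labels stored in the datastore.

```matlab
% Hypothetical label list: 5 cats and 3 dogs (illustration only)
labels = {'Cat'; 'Cat'; 'Cat'; 'Cat'; 'Cat'; 'Dog'; 'Dog'; 'Dog'};

% Count how many images each category has
[classNames, ~, classIdx] = unique(labels);
counts = accumarray(classIdx, 1);

% The smallest category sets the size of every balanced subset
minSetCount = min(counts);

% Keep a random subset of minSetCount images from every category
keep = [];
for k = 1:numel(classNames)
    members = find(classIdx == k);
    members = members(randperm(numel(members), minSetCount));
    keep = [keep; members]; %#ok<AGROW>
end
balancedLabels = labels(keep);
```

After this runs, `balancedLabels` holds the same number of entries for each category, which is the property the datastore code checks with its final `countEachLabel` call.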
Since the AlexNet network was trained on 227x227 pixel images, we have to resize all our training images to the same resolution. The following code allows us to read and process images from the imageDatastore at the same time.
%% Preprocess images for the CNN
% Set the ImageDatastore ReadFcn
imds.ReadFcn = @(filename) readAndPreprocessImage(filename);
%% Divide data into training and testing sets
[trainingSet, testSet] = splitEachLabel(imds, 0.3, 'randomize');

function Iout = readAndPreprocessImage(filename)
    I = imread(filename);
    % Some images may be grayscale. Replicate the image 3 times to
    % create an RGB image.
    if ismatrix(I)
        I = cat(3, I, I, I);
    end
    % Resize the image as required for the CNN.
    Iout = imresize(I, [227 227]);
end
Figure 1. Workflow for using a pretrained CNN to extract features for a new task.
While each layer of a CNN produces a response to an input image, only a few layers are suitable for image feature extraction. There is no exact formula for identifying these layers. The best approach is to simply try a few different layers and see how well they work.
The layers at the beginning of the network capture basic image features, such as edges and blobs. To see this, we visualize the network filter weights from the first convolutional layer (Figure 2).
% Get the network weights for the first convolutional layer
% (layer 2 of the network; layer 1 is the image input layer)
w1 = convnet.Layers(2).Weights;
% Scale and resize the weights for visualization
w1 = mat2gray(w1);
w1 = imresize(w1, 5);
% Display a montage of network weights. There are 96 individual
% sets of weights in the first layer.
figure
montage(w1)
title('First convolutional layer weights')
Figure 2. Visualization of first layer filter weights.
Notice that the first layer of the network has learned filters for capturing blob and edge features. These "primitive" features are then processed by deeper network layers, which combine the early features to form higher-level image features. These higher-level features are better suited for recognition tasks because they combine all the primitive features into a richer image representation. You can easily extract features from one of the deeper layers using the activations method.
The layer right before the classification layer, fc7, is a good place to start. We extract training features using that layer.
featureLayer = 'fc7';
trainingFeatures = activations(convnet, trainingSet, featureLayer, ...
    'MiniBatchSize', 32, 'OutputAs', 'columns');
The Classification Learner app in Statistics and Machine Learning Toolbox lets us train and compare multiple models interactively (Figure 3).
Figure 3. Classification Learner app.
Alternatively, we could train the classifier in our MATLAB script.
We split the data into two sets, one for training and one for testing. Next, we train a support vector machine (SVM) classifier using the extracted features by calling the fitcsvm function using trainingFeatures as the input or predictors and trainingLabels as the output or response values. We then run 10-fold cross-validation on the trained classifier to determine its validation accuracy, an estimate of how the classifier would perform on new data.
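%% Train a classifier using extracted features
trainingLabels = trainingSet.Labels;
% Train a linear support vector machine (SVM) classifier.
% activations returned one observation per column ('OutputAs','columns'),
% but fitcsvm expects one observation per row, so transpose the features.
svmmdl = fitcsvm(trainingFeatures', trainingLabels);
% Perform cross-validation and check accuracy
cvmdl = crossval(svmmdl, 'KFold', 10);
fprintf('kFold CV accuracy: %2.2f\n', 1 - cvmdl.kfoldLoss)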
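
To show what the reported number means, here is a minimal base-MATLAB sketch of k-fold cross-validation. It is not how crossval is implemented. The data are synthetic, and a nearest-centroid rule stands in for the SVM so that the example needs no toolbox. All names here are assumptions made for illustration.

```matlab
% Synthetic two-class features: 40 observations (rows) x 2 features
t = (1:20)';
X = [sin(t), cos(t); 5 + sin(t), 5 + cos(t)];
y = [ones(20, 1); 2 * ones(20, 1)];

% Assign every observation to one of k folds
k = 5;
n = size(X, 1);
foldId = mod((1:n)' - 1, k) + 1;

numWrong = 0;
for f = 1:k
    % Hold out fold f for testing and train on the remaining folds
    testMask = (foldId == f);
    Xtrain = X(~testMask, :);  ytrain = y(~testMask);
    Xtest  = X(testMask, :);   ytest  = y(testMask);

    % "Training": one centroid per class (stand-in for fitting an SVM)
    c1 = mean(Xtrain(ytrain == 1, :), 1);
    c2 = mean(Xtrain(ytrain == 2, :), 1);

    % Predict the class whose centroid is closer
    d1 = sum(bsxfun(@minus, Xtest, c1).^2, 2);
    d2 = sum(bsxfun(@minus, Xtest, c2).^2, 2);
    pred = 1 + (d2 < d1);

    numWrong = numWrong + sum(pred ~= ytest);
end

% The k-fold loss is the fraction of held-out observations misclassified
cvLoss = numWrong / n;
fprintf('kFold CV accuracy: %2.2f\n', 1 - cvLoss)
```

Because every observation is scored by a model that never saw it during training, one minus the loss approximates accuracy on new data.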
Figure 4. Result of using the trained pet classifier on an image of a cat.
For object detection we will use a technique called optical flow, which uses the motion of pixels in a video from frame to frame. Figure 5 shows a single frame of video with the motion vectors overlaid.
Figure 5. A single frame of video showing the motion vectors overlaid.
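
The article does not say which optical flow algorithm it uses. As an illustration of the principle only, the sketch below estimates a single motion vector between two synthetic frames with a least-squares (Lucas-Kanade style) fit over the whole image, using base MATLAB. The frames, the blob size, and the true motion are all assumptions made for the example.

```matlab
% Two synthetic frames: a smooth blob that moves 0.5 px right and 0.3 px up
[x, y] = meshgrid(1:64, 1:64);
blob = @(cx, cy) exp(-((x - cx).^2 + (y - cy).^2) / (2 * 8^2));
frame1 = blob(30.0, 32.0);
frame2 = blob(30.5, 31.7);

% Spatial gradients of the first frame and the temporal difference
[Ix, Iy] = gradient(frame1);
It = frame2 - frame1;

% Brightness constancy gives Ix*u + Iy*v = -It at every pixel.
% Stack all pixels and solve for one (u, v) in the least-squares sense.
A = [Ix(:), Iy(:)];
b = -It(:);
flow = A \ b;   % flow(1) = u (columns/x), flow(2) = v (rows/y)

fprintf('Estimated motion: u = %.2f, v = %.2f pixels\n', flow(1), flow(2));
```

A practical detector would estimate a separate vector for each pixel or small window, which is what produces a field of motion vectors like the one in Figure 5.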
The next step in the detection process is to separate out pixels that are moving and then use the Image Region Analyzer app to analyze the connected components in the binary image to filter out noise caused by the motion of the camera. The output of the app is a MATLAB function that can locate the pet in the field of view (Figure 6).
Figure 6. Image Region Analyzer app.
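
The function generated by the app is not reproduced in the article. The sketch below is a rough base-MATLAB stand-in for the same two steps: threshold the motion, then discard isolated pixels before locating the pet. A simple 3x3 neighborhood vote replaces true connected-component analysis. The flow map, the threshold, and the vote count are assumed values.

```matlab
% Hypothetical flow-magnitude map: a 20x30 moving region plus one noisy pixel
flowMag = zeros(100, 120);
flowMag(40:59, 50:79) = 2.0;   % the moving pet
flowMag(5, 5) = 1.5;           % isolated noise, e.g. from camera motion

% Step 1: binary image of pixels moving faster than a threshold
motionThreshold = 1.0;         % assumed value; tune for real footage
moving = flowMag > motionThreshold;

% Step 2: crude noise filter. Keep a pixel only if at least 4 pixels in
% its 3x3 neighborhood (itself included) are also moving.
neighborCount = conv2(double(moving), ones(3), 'same');
moving = moving & (neighborCount >= 4);

% Step 3: bounding box [x y width height] around the remaining pixels
[rows, cols] = find(moving);
bbox = [min(cols), min(rows), ...
        max(cols) - min(cols) + 1, max(rows) - min(rows) + 1];
disp(bbox)
```

The resulting `bbox` marks the region that would be cropped from the frame and passed to the CNN for feature extraction.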
We now have all the pieces we need to build a pet detection and recognition system (Figure 7). The system can:
Detect the location of the pet in new images using optical flow
Crop the pet from the image and extract features using a pretrained CNN
Classify the features using the SVM classifier we trained to determine if the pet is a cat or a dog
Figure 7. Accurately classified dog and cat.
In this article we used an existing deep learning network to solve a different task. You can use the same techniques to solve your own image classification problem: for example, classifying types of cars in videos for traffic flow analysis, identifying tumors in mass spectrometry data for cancer research, or identifying individuals by their facial features for security systems.
Article featured in MathWorks News & Notes
Published 2016 - 93019v00
Products Used
MATLAB
Neural Network Toolbox
Statistics and Machine Learning Toolbox
Learn More
Object Recognition: Deep Learning and Machine Learning for Computer Vision (26:57)
Object Detection Example Code (download)
Deep Learning in 11 Lines of MATLAB Code (2:38)