Sie sind auf Seite 1von 5

Technical Articles and Newsletters

Deep Learning for Computer Vision with MATLAB

ByAvinashNehemiahandValerieLeung,MathWorks

Computervisionengineershaveusedmachinelearningtechniquesfordecadestodetectobjectsofinterestinimagesandtoclassifyoridentifycategoriesof
objects.Theyextractfeaturesrepresentingpoints,regions,orobjectsofinterestandthenusethosefeaturestotrainamodeltoclassifyorlearnpatternsinthe
imagedata.

Intraditionalmachinelearning,featureselectionisatimeconsumingmanualprocess.Featureextractionusuallyinvolvesprocessingeachimagewithoneor
moreimageprocessingoperations,suchascalculatinggradienttoextractthediscriminativeinformationfromeachimage.

Enterdeeplearning.Deeplearningalgorithmscanlearnfeatures,representations,andtasksdirectlyfromimages,text,andsound,eliminatingtheneedfor
manualfeatureselection.

Usingasimpleobjectdetectionandrecognitionexample,thisarticleillustrateshoweasyitistouseMATLABfordeeplearning,evenwithoutextensive
knowledgeofadvancedcomputervisionalgorithmsorneuralnetworks.

Thecodeusedinthisexampleisavailablefordownload.

Getting Started
Thegoalinthisexampleistotrainanalgorithmtodetectapetinavideoandcorrectlylabelthepetasacatoradog.Wellbeusingaconvolutionalneural
network(CNN),aspecifictypeofdeeplearningalgorithmthatcanbothperformclassificationandextractfeaturesfromrawimages.

TobuildtheobjectdetectionandrecognitionalgorithminMATLAB,allweneedisapretrainedCNNandsomedogandcatimages.WellusetheCNNtoextract
discriminativefeaturesfromtheimages,andthenuseaMATLABapptotrainamachinelearningalgorithmtodiscriminatebetweencatsanddogs.

Importing a CNN Classifier


WebeginbydownloadingaCNNclassifierpretrainedonImageNet,adatabasecontainingover1.2millionlabeledhighresolutionimagesin1000categories.In
thisexamplewellbeusingtheAlexNetarchitecture.

websave('\networks\imagenetcaffealex.mat',...
'http://www.vlfeat.org/matconvnet/models/beta16/
imagenetcaffealex.mat');

WeimportthenetworkintoMATLABasaSeriesNetworkusingNeuralNetworkToolbox,anddisplaythearchitectureoftheCNN.The SeriesNetwork object


representstheCNN.

%LoadMatConvNetnetworkintoaSeriesNetwork
convnet=helperImportMatConvNet(cnnFullMatFile);


%ViewtheCNNarchitecture
convnet.Layers

Wevestoredtheimagesinseparate cat and dog foldersunderaparentcalled pet_images .TheadvantageofusingthisfolderstructureisthattheMATLAB


imageDatastore wecreatewillbeabletoautomaticallyreadandmanageimagelocationsandclasslabels.( imageDatastore isarepositoryforcollectionsof
datathataretoolargetofitinmemory.)

Weinitializean imageDatastore toaccesstheimagesinMATLAB.

%%Setupimagedata
dataFolder='\data\PetImages';
categories={'Cat','Dog'};
imds=imageDatastore(fullfile(dataFolder,categories),...
'LabelSource','foldernames');

Wethenselectasubsetofthedatathatgivesusanequalnumberofdogandcatimages.

tbl=countEachLabel(imds)

%%Usethesmallestoverlapset
minSetCount=min(tbl{:,2});

%UsesplitEachLabelmethodtotrimtheset.
imds=splitEachLabel(imds,minSetCount,'randomize');

%Noticethateachsetnowhasexactlythesamenumberofimages.
countEachLabel(imds)
Sincethe AlexNet networkwastrainedon227x227pixelimages,wehavetoresizeallourtrainingimagestothesameresolution.Thefollowingcodeallowsus
Technical Articles and Newsletters
toreadandprocessimagesfromthe imageDatastore atthesametime.

%%PreprocessImagesForCNN
%SettheImageDatastoreReadFcn
imds.ReadFcn=@(filename)readAndPreprocessImage(filename);

%%Dividedataintotrainingandtestingsets
[trainingSet,testSet]=splitEachLabel(imds,0.3,'randomize');

Weusethe readAndPreprocessImage functiontoresizetheimagesto227x227pixels.

functionIout=readAndPreprocessImage(filename)

I=imread(filename);

%Someimagesmaybegrayscale.Replicatetheimage3timesto
%createanRGBimage.
ifismatrix(I)
I=cat(3,I,I,I);
end

%ResizetheimageasrequiredfortheCNN.
Iout=imresize(I,[227227]);

end

Performing Feature Extraction


Wewanttousethisnewdatasetwiththepretrained AlexNet CNN.CNNscanlearntoextractgenericfeaturesthatcanbeusedtotrainanewclassifiertosolve
adifferentprobleminourcase,classifyingcatsanddogs(Figure1).

Figure1.WorkflowforusingapretrainedCNNtoextractfeaturesforanewtask.

WepassthetrainingdatathroughtheCNNandusethe activations methodtoextractfeaturesataparticularlayerinthenetwork.Likeotherneuralnetworks,


CNNsareformedusinginterconnectedlayersofnonlinearprocessingelements,orneurons.Inputandoutputlayersconnecttoinputandoutputsignals,and
hiddenlayersprovidenonlinearcomplexitythatgivesaneuralnetworkitscomputationalcapacity.

WhileeachlayerofaCNNproducesaresponsetoaninputimage,onlyafewlayersaresuitableforimagefeatureextraction.Thereisnoexactformulafor
identifyingtheselayers.Thebestapproachistosimplytryafewdifferentlayersandseehowwelltheywork.

Thelayersatthebeginningofthenetworkcapturebasicimagefeatures,suchasedgesandblobs.Toseethis,wevisualizethenetworkfilterweightsfromthe
firstconvolutionallayer(Figure2).

%Getthenetworkweightsforthesecondconvolutionallayer
w1=convnet.Layers(2).Weights;

%Scaleandresizetheweightsforvisualization
w1=mat2gray(w1);
w1=imresize(w1,5);

%Displayamontageofnetworkweights.Thereare96individual
%setsofweightsinthefirstlayer.
figure
montage(w1)
title('Firstconvolutionallayerweights')
Technical Articles and Newsletters

Figure2.Visualizationoffirstlayerfilterweights.

Noticethatthefirstlayerofthenetworkhaslearnedfiltersforcapturingblobandedgefeatures.These"primitive"featuresarethenprocessedbydeepernetwork
layers,whichcombinetheearlyfeaturestoformhigherlevelimagefeatures.Thesehigherlevelfeaturesarebettersuitedforrecognitiontasksbecausethey
combinealltheprimitivefeaturesintoaricherimagerepresentation.Youcaneasilyextractfeaturesfromoneofthedeeperlayersusingthe activations
method.

Thelayerrightbeforetheclassificationlayerfc7isagoodplacetostart.Weextracttrainingfeaturesusingthatlayer.

featureLayer='fc7';
trainingFeatures=activations(convnet,trainingSet,featureLayer,...
'MiniBatchSize',32,'OutputAs','columns');

Training an SVM Classifier Using the Extracted Features


Werenowreadytotraina"shallow"classifierwiththefeaturesextractedinthepreviousstep.Notethattheoriginalnetworkwastrainedtoclassify1000object
categories.Theshallowclassifierwillbetrainedtosolvethespecificdogsvs.catsproblem.

TheClassificationLearnerappinStatisticsandMachineLearningToolboxletsustrainandcomparemultiplemodelsinteractively(Figure3).

Figure3.ClassificationLearnerapp.

Alternatively,wecouldtraintheclassifierinourMATLABscript.
Wesplitthedataintotwosets,onefortrainingandonefortesting.Next,wetrainasupportvectormachine(SVM)classifierusingtheextractedfeaturesby
Technical Articles and Newsletters
callingthe fitcsvm functionusing trainingFeatures astheinputorpredictorsand trainingLabels astheoutputorresponsevalues.Wewillcrossvalidate
theclassifieronthetestdatatodetermineitsvalidationaccuracy,anunbiasedestimateofhowtheclassifierwouldperformonnewdata.

%%Trainaclassifierusingextractedfeatures
trainingLabels=trainingSet.Labels;

%HereItrainalinearsupportvectormachine(SVM)classifier.
svmmdl=fitcsvm(trainingFeatures,trainingLabels);

%Performcrossvalidationandcheckaccuracy
cvmdl=crossval(svmmdl,'KFold',10);
fprintf('kFoldCVaccuracy:%2.2f\n',1cvmdl.kfoldLoss)

Wecannowusethe svmmdl classifiertoclassifyanimageasacatoradog(Figure4).

Figure4.Resultofusingthetrainedpetclassifieronanimageofacat.

Performing Object Detection


Inmostimagesandvideoframes,thereisalotgoingon.Forexample,inadditiontoadog,therecouldbeatree,oraflockofpigeons,oraraccoonchasingthe
dog.Evenareliableimageclassifierwillonlyworkwellifwecanlocatetheobjectofinterest,croptheobject,andthenfeedittotheclassifierinotherwords,if
wecanperformobjectdetection.

Forobjectdetectionwewilluseatechniquecalledopticalflow,whichusesthemotionofpixelsinavideofromframetoframe.Figure5showsasingleframeof
videowiththemotionvectorsoverlaid.

Figure5.Asingleframeofvideoshowingthemotionvectorsoverlaid.

ThenextstepinthedetectionprocessistoseparateoutpixelsthataremovingandthenusetheImageRegionAnalyzerapptoanalyzetheconnected
componentsinthebinaryimagetofilteroutnoisecausedbythemotionofthecamera.TheoutputoftheappisaMATLABfunctionthatcanlocatethepetinthe
fieldofview(Figure6).
Technical Articles and Newsletters

Figure6.ImageRegionAnalyzerapp.

Wenowhaveallthepiecesweneedtobuildapetdetectionandrecognitionsystem(Figure7).Thesystemcan:

Detectthelocationofthepetinnewimagesusingopticalflow
CropthepetfromtheimageandextractfeaturesusingapretrainedCNN
ClassifythefeaturesusingtheSVMclassifierwetrainedtodetermineifthepetisacatoradog

Figure7.Accuratelyclassifieddogandcat.

Inthisarticleweusedanexistingdeeplearningnetworktosolveadifferenttask.Youcanusethesametechniquestosolveyourownimageclassification
problemforexample,classifyingtypesofcarsinvideosfortrafficflowanalysis,identifyingtumorsinmassspectrometrydataforcancerresearch,oridentifying
individualsbytheirfacialfeaturesforsecuritysystems.

ArticlefeaturedinMathWorksNews&Notes

Published201693019v00

Products Used
MATLAB
NeuralNetworkToolbox
StatisticsandMachineLearningToolbox

Learn More
ObjectRecognition:DeepLearningandMachineLearningforComputerVision(26:57)
ObjectDetectionExampleCode(download)
DeepLearningin11LinesofMATLABCode(2:38)

View Articles for Related Capabilities


AlgorithmDevelopment

Das könnte Ihnen auch gefallen