Ristretto: Layers, Benchmarking and Finetuning


This modified Caffe version supports layers with limited numerical precision. The layers in question use reduced word width for layer parameters and layer activations (inputs and outputs). As Ristretto follows the principles of Caffe, users already acquainted with Caffe will understand Ristretto quickly. The main additions of Ristretto are explained below.

Ristretto Layers
Ristretto introduces new layer types with limited numerical precision. These layers can be used through the traditional Caffe net description files (*.prototxt).
An example of a minifloat convolutional layer is given below:

layer {
  name: "conv1"
  type: "ConvolutionRistretto"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 96
    kernel_size: 7
    stride: 2
    weight_filler {
      type: "xavier"
    }
  }
  quantization_param {
    precision: MINIFLOAT
    mant_bits: 10
    exp_bits: 5
  }
}

This layer will use half precision (16-bit floating point) number representation. The convolution kernels and biases, as well as the layer activations, are trimmed to this format.
Notice the three differences to a traditional convolutional layer:
type changes to ConvolutionRistretto
An additional layer parameter is added: quantization_param
This layer parameter contains all the information used for quantization

Ristretto provides limited precision layers at src/caffe/ristretto/layers/.
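Minifloat is only one of the supported quantization strategies. As a second illustration, the sketch below shows how the same layer might look with dynamic fixed point, the precision used in the SqueezeNet commands further down. The quantization_param field names and bit widths here follow the files generated by the Ristretto tool for the SqueezeNet demo; treat them as assumptions and consult your generated quantized.prototxt for the authoritative values.

layer {
  name: "conv1"
  type: "ConvolutionRistretto"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 96
    kernel_size: 7
    stride: 2
  }
  quantization_param {
    precision: DYNAMIC_FIXED_POINT
    bw_params: 8      # bit width of weights and bias
    bw_layer_in: 8    # bit width of the layer input
    bw_layer_out: 8   # bit width of the layer output
    fl_params: 5      # fractional bits of the parameters
    fl_layer_in: 3    # fractional bits of the input
    fl_layer_out: 3   # fractional bits of the output
  }
}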

Blobs
Ristretto allows for accurate simulation of resource-limited hardware accelerators. In order to stay with the Caffe principles, Ristretto reuses floating point blobs for layer parameters and outputs. This means that all numbers with limited precision are actually stored in floating point blobs.
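To make this concrete, here is a minimal Python/NumPy sketch of the idea, not Ristretto's actual code: values are rounded to the nearest number representable in the minifloat format from the example above (10 mantissa bits, 5 exponent bits, i.e. IEEE half precision), yet the blob that stores them remains 32-bit floating point.

import numpy as np

def trim_to_minifloat(blob):
    # Round each value to the nearest half-precision number
    # (mant_bits=10, exp_bits=5 is exactly IEEE float16), but keep
    # the storage type at 32-bit float, as Ristretto's blobs do.
    return blob.astype(np.float16).astype(np.float32)

blob = np.array([0.1, 1.0005, 3.14159], dtype=np.float32)
print(trim_to_minifloat(blob))  # dtype is still float32, values are trimmed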

Scoring
For scoring of quantized networks, Ristretto requires:
The 32-bit FP parameters of the trained network
The network definition with reduced precision layers
The first item is the result of the traditional training in Caffe. Ristretto can test networks using full-precision parameters. The parameters are converted to limited precision on the fly, using the round-nearest scheme by default.
As for the second item, the model description, you will either have to manually change the network description of your Caffe model, or use the Ristretto tool for automatic generation of a Google Protocol Buffer file.
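For the automatic route, the SqueezeNet example drives the Ristretto tool roughly as follows. The paths and flags below are taken from that example and may differ in your setup, so check the tool's usage message for the authoritative set:

# quantize the 32-bit FP SqueezeNet to dynamic fixed point
./build/tools/ristretto quantize \
  --model=models/SqueezeNet/train_val.prototxt \
  --weights=models/SqueezeNet/squeezenet_v1.0.caffemodel \
  --model_quantized=models/SqueezeNet/RistrettoDemo/quantized.prototxt \
  --trimming_mode=dynamic_fixed_point --gpu=0 --iterations=2000 \
  --error_margin=3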
# score the dynamic fixed point SqueezeNet model on the validation set *
./build/tools/caffe test --model=models/SqueezeNet/RistrettoDemo/quantized.prototxt \
  --weights=models/SqueezeNet/RistrettoDemo/squeezenet_finetuned.caffemodel \
  --gpu=0 --iterations=2000

*Before running this, please follow the SqueezeNet Example.

Fine-tuning
In order to improve a condensed network's accuracy, it should always be fine-tuned. In Ristretto, the Caffe command line tool supports fine-tuning of condensed networks. The only difference to traditional training is that the network description file should contain Ristretto layers.


The following items are required for fine-tuning:

The 32-bit FP network parameters
A solver with hyperparameters for training

The network parameters are the result of full-precision training in Caffe.
The solver contains the path to the description file of the limited precision network. This network description is the same one that we used for scoring.
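A minimal solver for this setup could look like the sketch below; the net path matches the SqueezeNet demo, while the hyperparameter values are illustrative assumptions rather than the demo's actual settings:

# solver_finetune.prototxt (sketch; hyperparameter values are assumptions)
net: "models/SqueezeNet/RistrettoDemo/quantized.prototxt"
test_iter: 2000
test_interval: 1000
base_lr: 0.0001          # fine-tuning starts from a low learning rate
lr_policy: "fixed"
max_iter: 10000
snapshot: 5000
snapshot_prefix: "models/SqueezeNet/RistrettoDemo/squeezenet_finetuned"
solver_mode: GPU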
# fine-tune dynamic fixed point SqueezeNet *
./build/tools/caffe train \
  --solver=models/SqueezeNet/RistrettoDemo/solver_finetune.prototxt \
  --weights=models/SqueezeNet/squeezenet_v1.0.caffemodel

*Before running this, please follow the SqueezeNet Example.

Implementation details
During this retraining procedure, the network learns how to classify images with limited word withparameters. Since the network
weights can only have discrete values, the main challenge consists in the weight update. We adopt the idea of previous work
(Courbariaux et al. [1]) which uses full precision shadow weights. Small weight updates are applied to the 32-bit FPweights w,
whereas the discrete weights w are sampled from the full precision weights. The sampling during ne-tuning is done with
stochastic rounding, which was successfully used by Gupta et al. [2] to train networks in 16-bit xed point.
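The following Python sketch illustrates the two ingredients, shadow weights and stochastic rounding, for a fixed point grid with step 2^-4. It is a toy illustration of the idea, not Ristretto's implementation:

import numpy as np

rng = np.random.default_rng(0)

def stochastic_round(x, step):
    # Round x to a multiple of `step`; the fractional remainder gives
    # the probability of rounding up (Gupta et al. [2]).
    scaled = x / step
    floor = np.floor(scaled)
    prob_up = scaled - floor                    # in [0, 1)
    return (floor + (rng.random(x.shape) < prob_up)) * step

# Shadow weights (Courbariaux et al. [1]): updates accumulate in full
# precision; only the sampled copy w' is discrete.
w = np.array([0.126, -0.031], dtype=np.float32)    # 32-bit FP weights
grad = np.array([0.40, -0.20], dtype=np.float32)   # gradient of the loss
w -= 0.01 * grad                                   # small update in FP
w_prime = stochastic_round(w, step=2.0 ** -4)      # discrete weights
print(w, w_prime)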

[1] Matthieu Courbariaux, Yoshua Bengio, and Jean-Pierre David. BinaryConnect: Training deep neural networks with binary weights during propagations. In Advances in Neural Information Processing Systems, 2015.
[2] Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan, and Pritish Narayanan. Deep learning with limited numerical precision. arXiv preprint, 2015.
