This notebook demonstrates how to use the Caffe neural network framework to produce the "dream"
visuals shown in the Google Research blog post.
It'll be interesting to see what imagery people are able to generate using the described technique. If
you post images to Google+, Facebook, or Twitter, be sure to tag them with #deepdream so other
researchers can check them out too.
Dependencies
This notebook is designed to have as few dependencies as possible:
Standard Python scientific stack: NumPy, SciPy, PIL, IPython. These libraries can also be
installed as part of a scientific Python distribution, such as Anaconda or Canopy.
Caffe deep learning framework (installation instructions).
Google protobuf library that is used for Caffe model manipulation.
In [33]:
# imports and basic notebook setup
import sys
sys.path.append("/usr/local/lib/python2.7/dist-packages")
from cStringIO import StringIO
import numpy as np
import scipy.ndimage as nd
import PIL.Image  # import the submodule explicitly; a bare "import PIL" does not load PIL.Image
from IPython.display import clear_output, Image, display
from google.protobuf import text_format
import caffe

def showarray(a, fmt='jpeg'):
    a = np.uint8(np.clip(a, 0, 255))
    f = StringIO()
    PIL.Image.fromarray(a).save(f, fmt)
    display(Image(data=f.getvalue()))

# net_fn and param_fn are not defined anywhere in this copy of the notebook.
# The paths below are assumptions; point them at your own GoogLeNet files.
model_path = '../caffe/models/bvlc_googlenet/'  # assumed location
net_fn = model_path + 'deploy.prototxt'
param_fn = model_path + 'bvlc_googlenet.caffemodel'

# set force_backward so that gradients can be propagated back to the input
model = caffe.io.caffe_pb2.NetParameter()
text_format.Merge(open(net_fn).read(), model)
model.force_backward = True
open('tmp.prototxt', 'w').write(str(model))

net = caffe.Classifier('tmp.prototxt', param_fn,
                       mean = np.float32([104.0, 116.0, 122.0]), # ImageNet mean, training set dependent
                       channel_swap = (2,1,0)) # the reference model has channels in BGR order instead of RGB

# a couple of utility functions for converting to and from Caffe's input image layout
def preprocess(net, img):
    return np.float32(np.rollaxis(img, 2)[::-1]) - net.transformer.mean['data']
def deprocess(net, img):
    return np.dstack((img + net.transformer.mean['data'])[::-1])
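To make the layout conversion concrete, here is a minimal NumPy-only sketch of the same two transforms. It swaps net.transformer.mean['data'] for an explicit MEAN_BGR array of shape (3, 1, 1), which is an assumption made so the example runs without Caffe. The names to_caffe_layout and from_caffe_layout are hypothetical and not part of the notebook.

```python
import numpy as np

# Assumed stand-in for net.transformer.mean['data']: one mean value per
# BGR channel, shaped (3, 1, 1) so it broadcasts over height and width.
MEAN_BGR = np.float32([104.0, 116.0, 122.0]).reshape(3, 1, 1)

def to_caffe_layout(img, mean=MEAN_BGR):
    """H x W x RGB image -> mean-subtracted BGR x H x W array."""
    chw = np.rollaxis(img, 2)             # move the channel axis to the front
    return np.float32(chw[::-1]) - mean   # reverse RGB -> BGR, subtract the mean

def from_caffe_layout(data, mean=MEAN_BGR):
    """Inverse of to_caffe_layout: BGR x H x W -> H x W x RGB."""
    return np.dstack((data + mean)[::-1])  # add the mean, BGR -> RGB, channels last

# A small synthetic "image" with 4 rows, 5 columns and 3 colour channels.
rgb = np.arange(4 * 5 * 3, dtype=np.float32).reshape(4, 5, 3)
blob = to_caffe_layout(rgb)
restored = from_caffe_layout(blob)
```

Here blob has shape (3, 4, 5) with the blue channel first, and restored matches rgb again.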
Producing dreams
Making the "dream" images is very simple. Essentially it is just a gradient ascent process that tries
to maximize the L2 norm of activations of a particular DNN layer. Here are a few simple tricks that
we found useful for getting good images:
offset image by a random jitter
normalize the magnitude of gradient ascent steps
apply ascent across multiple scales (octaves)
First we implement a basic gradient ascent step function, applying the first two tricks:
In [35]:
def make_step(net, step_size=1.5, end='inception_4c/output', jitter=32, clip=True):
    '''Basic gradient ascent step.'''

    src = net.blobs['data'] # input image is stored in Net's 'data' blob
    dst = net.blobs[end]

    ox, oy = np.random.randint(-jitter, jitter+1, 2)
    src.data[0] = np.roll(np.roll(src.data[0], ox, -1), oy, -2) # apply jitter shift

    net.forward(end=end)
    dst.diff[:] = dst.data  # specify the optimization objective
    net.backward(start=end)
    g = src.diff[0]
    # apply normalized ascent step to the input image
    src.data[:] += step_size/np.abs(g).mean() * g

    src.data[0] = np.roll(np.roll(src.data[0], -ox, -1), -oy, -2) # unshift image

    if clip:
        bias = net.transformer.mean['data']
        src.data[:] = np.clip(src.data, -bias, 255-bias)
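Setting dst.diff to dst.data amounts to maximizing half the squared L2 norm of the activations, because the gradient of 0.5 * |a|^2 with respect to a is a itself. The toy sketch below shows the same normalized ascent step without Caffe. The linear "layer" W, the vector x and the helper names are all made up for illustration; a real network is nonlinear, so this only mirrors the update rule.

```python
import numpy as np

rng = np.random.RandomState(0)
W = rng.randn(8, 16)   # hypothetical linear "layer": activations = W.dot(x)
x = rng.randn(16)      # stands in for the input image

def objective(x):
    """Half the squared L2 norm of the activations."""
    a = W.dot(x)
    return 0.5 * np.sum(a ** 2)

def ascent_step(x, step_size=1.5):
    a = W.dot(x)      # forward pass
    g = W.T.dot(a)    # backward pass, seeded with diff = a
    # normalized step: scale by the mean absolute gradient, as in make_step
    return x + step_size / np.abs(g).mean() * g

before = objective(x)
for _ in range(10):
    x = ascent_step(x)
after = objective(x)
```

Dividing by the mean absolute gradient keeps the size of each update roughly constant regardless of how large the raw gradient is, which is the point of the second trick listed above.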
Next we implement an ascent through different scales. We call these scales "octaves".
In [36]:
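A minimal sketch of such a multi-scale loop, written against plain NumPy arrays so that it runs without Caffe. Everything here is an assumption for illustration: octave_ascent and resize_nearest are hypothetical names, the default values are guesses, the nearest-neighbour resize stands in for an interpolating zoom such as scipy.ndimage.zoom, and step is any function that takes an image array and returns an updated one (in the notebook that role is played by make_step acting on the network's data blob).

```python
import numpy as np

def resize_nearest(a, h, w):
    """Nearest-neighbour resize of a (channels, H, W) array to (channels, h, w).

    Assumed stand-in for an interpolating zoom such as scipy.ndimage.zoom.
    """
    rows = np.arange(h) * a.shape[-2] // h
    cols = np.arange(w) * a.shape[-1] // w
    return a[..., rows[:, None], cols]

def octave_ascent(img, step, iter_n=10, octave_n=4, octave_scale=1.4):
    """Run `step` at several scales, carrying the added detail upward."""
    # Build the pyramid: octaves[0] is full size, octaves[-1] the smallest.
    octaves = [img]
    for _ in range(octave_n - 1):
        prev = octaves[-1]
        h = max(1, int(round(prev.shape[-2] / octave_scale)))
        w = max(1, int(round(prev.shape[-1] / octave_scale)))
        octaves.append(resize_nearest(prev, h, w))

    detail = np.zeros_like(octaves[-1])   # changes produced so far
    for base in reversed(octaves):        # smallest scale first
        h, w = base.shape[-2:]
        detail = resize_nearest(detail, h, w)  # upscale detail to this octave
        x = base + detail
        for _ in range(iter_n):
            x = step(x)
        detail = x - base                 # keep only what the steps added
    return x

# Usage with a trivial step that brightens the image by 1 per iteration.
img = np.zeros((3, 64, 48), dtype=np.float32)
out = octave_ascent(img, step=lambda x: x + 1.0, iter_n=2, octave_n=3)
```

Working from the smallest octave upward lets coarse structures form first; only the difference between the stepped image and the resized original is carried to the next scale, so the original image is never blurred by repeated resizing.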
Now we are ready to let the neural network reveal its dreams! Let's take a cloud image as a
starting point:
In [37]:
import PIL.Image  # "from PIL import Image" would shadow IPython.display.Image, which showarray uses
img = np.float32(PIL.Image.open('sky1024px.jpg'))
showarray(img)
Running the next code cell starts the detail generation process. You may see how new patterns start
to form, iteration by iteration, octave by octave.
In [26]:
_=deepdream(net, img)
The complexity of the details generated depends on which layer's activations we try to maximize.
Higher layers produce complex features, while lower ones enhance edges and textures, giving the
image an impressionist feeling:
In [ ]:
_=deepdream(net, img, end='inception_3b/5x5_reduce')
We encourage readers to experiment with layer selection to see how it affects the results. Execute
the next code cell to see the list of different layers. You can modify the make_step function to
make it follow some different objective, say to select a subset of activations to maximize, or to
maximize multiple layers at once. There is a huge design space to explore!
In [ ]:
net.blobs.keys()
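As one example of a different objective, the sketch below builds a gradient seed that keeps only a chosen subset of channels. The helper name channel_objective and the channel indices are hypothetical. Under this assumption, the line dst.diff[:] = dst.data in make_step would become dst.diff[:] = channel_objective(dst.data, [1, 4]).

```python
import numpy as np

def channel_objective(data, channels):
    """Return a gradient seed equal to `data` on the chosen channels, zero elsewhere.

    `data` is a (batch, channels, H, W) activation array.
    """
    diff = np.zeros_like(data)
    diff[:, channels] = data[:, channels]
    return diff

acts = np.ones((1, 6, 2, 2), dtype=np.float32)   # made-up activations
seed = channel_objective(acts, [1, 4])
```

Channels that receive a zero seed contribute nothing to the gradient, so only the selected feature maps are amplified.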
What if we feed the deepdream function its own output, after applying a little zoom to it? It turns
out that this leads to an endless stream of impressions of the things that the network saw during
training. Some patterns fire more often than others, suggestive of basins of attraction.
We will start the process from the same sky image as above, but after some iterations the original
image becomes irrelevant; even random noise can be used as the starting point.
In [ ]:
!mkdir frames
frame = img
frame_i = 0
In [ ]:
h, w = frame.shape[:2]
s = 0.05 # scale coefficient
for i in xrange(100):
    frame = deepdream(net, frame)
    PIL.Image.fromarray(np.uint8(frame)).save("frames/%04d.jpg"%frame_i)
    frame = nd.affine_transform(frame, [1-s,1-s,1], [h*s/2,w*s/2,0], order=1)
    frame_i += 1
Be careful running the code above, it can bring you into very strange realms!
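The zoom in the loop comes from the arguments to nd.affine_transform: for each output pixel o, the value is sampled from the input at position matrix * o + offset. With a diagonal of 1-s and an offset of half the removed margin, the center of the frame maps to itself and every frame shows the central (1-s) fraction of the previous one. The sketch below only checks that coordinate arithmetic; the frame size is an assumed example and source_coords is a hypothetical helper.

```python
import numpy as np

h, w = 576, 1024   # example frame size (assumed for illustration)
s = 0.05           # scale coefficient, as in the loop above

scale = np.array([1 - s, 1 - s])             # diagonal of the transform matrix
offset = np.array([h * s / 2, w * s / 2])    # half of the cropped margin

def source_coords(out_yx):
    """Input position sampled for a given output pixel (row, col)."""
    return scale * np.asarray(out_yx, dtype=float) + offset

center = source_coords([h / 2.0, w / 2.0])   # maps to itself
corner = source_coords([0, 0])               # samples slightly inside the frame
```

The third entries in the notebook's arguments (1 and 0) leave the color axis untouched.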
In [ ]:
Image(filename='frames/0029.jpg')
Let's make it brain-dead simple to launch your very own deepdreaming server (in the cloud, on an
Ubuntu machine, Mac via Docker, and maybe even Windows if you try out Kitematic by Docker)!
Motivation
I decided to create a self-contained Caffe+GoogLeNet+Deepdream Docker image which has
everything you need to generate your own deepdream art. In order to make the Docker image very
portable, it uses the CPU version of Caffe and comes bundled with the GoogLeNet model.
The compilation procedure was done on Docker Hub and for advanced users, the final image can be
pulled down via:
docker pull visionai/clouddream
The Docker image is 2.5GB, but it contains a precompiled version of Caffe, all of the Python
dependencies, as well as the pretrained GoogLeNet model.
For those of you who are new to Docker, I hope you will pick up some valuable engineering skills
and tips along the way. Docker makes it very easy to bundle complex software. If you're somebody
like me who likes a clean Mac OS X on a personal laptop and does the heavy lifting in the cloud,
then read on.
Instructions
We will be monitoring the inputs directory for source images and dumping results into the
outputs directory. Nginx (also inside a Docker container) will be used to serve the resulting files
and a simple AngularJS GUI to render the images in a webpage.
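The monitoring idea can be sketched in a few lines of Python. This is not how clouddream itself is implemented; it is a hypothetical illustration that assumes each result is written to the outputs directory under the same file name as its input, and that only the listed extensions count as images.

```python
import os

IMAGE_EXTS = (".jpg", ".jpeg", ".png")  # assumed set of accepted extensions

def pending_images(inputs_dir, outputs_dir):
    """List input image names that have no same-named file in outputs_dir yet."""
    done = set(os.listdir(outputs_dir))
    return sorted(
        name for name in os.listdir(inputs_dir)
        if name.lower().endswith(IMAGE_EXTS) and name not in done
    )
```

A worker could call pending_images in a loop with a short sleep between passes and process whatever names it returns.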
Prerequisite:
You've launched a cloud instance using a VPS provider like DigitalOcean and this instance has
Docker running. If you don't know about DigitalOcean, then you should give them a try. You can
launch a Docker-ready cloud instance in a few minutes. If you're going to set up a new DigitalOcean
account, consider using my referral link: https://www.digitalocean.com/?refcode=64f90f652091.
You will need an instance with at least 1GB of RAM for processing small output images.
Let's say our cloud instance is at the address 1.2.3.4 and we set it up so that it contains our SSH key
for passwordless log-in.
ssh root@1.2.3.4
git clone https://github.com/VISIONAI/clouddream.git
cd clouddream
./start.sh
You should see three running containers: deepdream-json, deepdream-compute, and deepdream-files:
root@deepdream:~/clouddream# docker ps
CONTAINER ID   IMAGE                 COMMAND                CREATED         STATUS         PORTS                         NAMES
21d495211abf   ubuntu:14.04          "/bin/bash -c 'cd /o   7 minutes ago   Up 7 minutes                                 deepdream-json
7dda17dafa5a   visionai/clouddream   "/bin/bash -c 'cd /o   7 minutes ago   Up 7 minutes                                 deepdream-compute
010427d8c7c2   nginx                 "nginx -g 'daemon of   7 minutes ago   Up 7 minutes   0.0.0.0:80->80/tcp, 443/tcp   deepdream-files
If you want to jump inside the container to debug something, just run:
./enter.sh
cd /opt/deepdream
python deepdream.py
#This will take input.jpg, run deepdream, and write output.jpg
So I simply paste the last three lines (the ones starting with export) right into the terminal.
export DOCKER_TLS_VERIFY=1
export DOCKER_HOST=tcp://192.168.59.103:2376
export DOCKER_CERT_PATH=/Users/tomasz/.boot2docker/certs/boot2docker-vm
And then visit the http://1.2.3.4:8000 URL to see the frames show up as they are being
And instead of showing random N images, you can view the latest images:
http://1.2.3.4/#/?latest
Here is a screenshot of what things should look like when using the 'conv2/3x3' setting:
Additionally, you can browse some more cool images on the deepdream.vision.ai server, which I've
currently configured to run deepdream through some Dali art. When you go to the page, just hit
refresh to see more goodies.
{
    "maxwidth" : 400,
    "layer" : "inception_4c/output"
}
You can change maxwidth to something larger like 1000 if you want big output images for big
input images; remember that you will need more RAM for processing larger images. For
testing, a maxwidth of 200 will give you results much faster. If you change the settings and want to
regenerate outputs for your input images, simply remove the contents of the outputs directory:
rm deepdream/outputs/*
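A rough sketch of what a maxwidth cap implies for the processed image size. It assumes that maxwidth limits the width while preserving the aspect ratio, which is a guess about the behavior rather than something stated above; scaled_size is a hypothetical helper.

```python
import json

# The two settings shown above, parsed as JSON.
settings = json.loads('{"maxwidth": 400, "layer": "inception_4c/output"}')

def scaled_size(width, height, maxwidth):
    """Shrink (width, height) to fit maxwidth, keeping the aspect ratio."""
    if width <= maxwidth:
        return width, height
    ratio = maxwidth / float(width)
    return maxwidth, int(round(height * ratio))

size = scaled_size(1024, 768, settings["maxwidth"])
```

Because both dimensions scale together, the pixel count grows with the square of maxwidth, which is why larger settings need noticeably more RAM and time.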
Possible values for layer are as follows. They come from the tmp.prototxt file which lists the
layers of the GoogLeNet network used in this demo. Note that the ReLU and Dropout layers are not
valid for deepdreaming.
Just a few days ago, the Google Research blog published a post demonstrating a unique, interesting,
and perhaps even disturbing method to visualize what's going on inside the layers of a Convolutional
Neural Network (CNN).
Note: Before you go, I suggest taking a look at the images generated using bat-country. Most of
them came out fantastic, especially the Jurassic Park images.
Their approach works by turning the CNN upside down, inputting an image, and gradually
tweaking the image to what the network thinks a particular object or class looks like.
The results are breathtaking to say the least. Lower levels reveal edge-like regions in the images.
Intermediate layers are able to represent basic shapes and components of objects (doorknob, eye,
nose, etc.). And lastly, the final layers are able to form the complete interpretation (dog, cat, tree,
etc.) and often in a psychedelic, spaced out manner.
Along with their results, Google also published an excellent IPython Notebook allowing you to play
around and create some trippy images of your own.
The IPython Notebook is indeed fantastic. It's fun to play around with. And since it's an IPython
Notebook, it's fairly easy to get started with. But I wanted to take it a step further. Make it
modular. More customizable. More like a Python module that acts and behaves like one. And of
course, it has to be pip-installable (you'll need to bring your own Caffe installation).
That's why I put together bat-country, an easy to use, highly extendible, lightweight Python
module for inceptionism and deep dreaming with Convolutional Neural Networks and Caffe.
Comparatively, my contributions here are honestly pretty minimal. All the real research has been
done by Google. I'm simply taking the IPython Notebook and turning it into a Python module, while
keeping in mind the importance of extensibility, such as custom step functions.
Before we dive into the rest of this post, I would like to take a second and call attention to Justin
Johnson's cnn-vis, a command line tool for generating inceptionism images. His tool is quite
powerful and more like what Google is (probably) using for their own research publications. If
you're looking for a more advanced, complete package, definitely go take a look at cnn-vis. You
also might be interested in Vision.ai co-founder Tomasz Malisiewicz's clouddream Docker image
to quickly get Caffe up and running.
But in the meantime, if you're interested in playing around with a simple, easy to use Python
package, go grab the source from GitHub or install it via pip install bat-country
The rest of this blog post is organized as follows:
A simple example. 3 lines of code to generate your own deep dream/inceptionism images.
Requirements. Libraries and packages required to run bat-country (mostly just Caffe and its
associated dependencies).
What's going on under the hood? The anatomy of bat-country and how to extend it.
Show and tell. If there is any section of this post that you don't want to miss, it's this one. I have
put together a gallery of some really awesome images I generated over the weekend using
bat-country. The results are quite surreal, to say the least.
A simple example.
As I mentioned, one of the goals of bat-country is simplicity. Provided you have already installed
Caffe and bat-country on your system, it only takes 3 lines of Python code to generate a deep
dream/inceptionism image:
bat-country: an extendible, lightweight Python package for deep dreaming with Caffe and
Convolutional Neural Networks
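# imports needed by the three lines below
from batcountry import BatCountry
from PIL import Image
import numpy as np

# we can't stop here...
bc = BatCountry("caffe/models/bvlc_googlenet")
image = bc.dream(np.float32(Image.open("/path/to/image.jpg")))
bc.cleanup()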
After executing this code, you can then take the image returned by the dream method and write it
to file:
result = Image.fromarray(np.uint8(image))
result.save("/path/to/output.jpg")
And that's it! You can view the source code of demo.py on GitHub.
Requirements.
The bat-country package requires Caffe, an open-source CNN implementation from Berkeley, to
be already installed on your system. This section will detail the basic steps to get Caffe set up on
your system. However, an excellent alternative is to use the Docker image provided by Tomasz of
Vision.ai. Using the Docker image will get you up and running quite painlessly. But for those who
would like their own install, keep reading.
$ cd $CAFFE_ROOT
$ ./scripts/download_model_binary.py models/bvlc_googlenet/
def custom_step(net, step_size=1.25, end="inception_4c/output",
    jitter=48, clip=True):
    src = net.blobs["data"]
    dst = net.blobs[end]

    ox, oy = np.random.randint(-jitter, jitter + 1, 2)
    src.data[0] = np.roll(np.roll(src.data[0], ox, -1), oy, -2)

    net.forward(end=end)
    dst.diff[:] = dst.data
    net.backward(start=end)
    g = src.diff[0]

    src.data[:] += step_size / np.abs(g).mean() * g
    src.data[0] = np.roll(np.roll(src.data[0], -ox, -1), -oy, -2)

    if clip:
        bias = net.transformer.mean["data"]
        src.data[:] = np.clip(src.data, -bias, 255 - bias)

image = bc.dream(np.float32(Image.open("image.jpg")),
    step_fn=custom_step)
Again, this is just a demonstration of implementing a custom step function and not meant to be
anything too exciting.
You can also override the default preprocess and deprocess functions by passing in a custom
preprocess_fn and deprocess_fn to dream:
def custom_preprocess(net, img):
    # do something interesting here...
    pass

def custom_deprocess(net, img):
    # do something interesting here...
    pass

image = bc.dream(np.float32(Image.open("image.jpg")),
    preprocess_fn=custom_preprocess,
    deprocess_fn=custom_deprocess)
Finally, bat-country also supports visualizing each octave, iteration, and layer of the network:
for (k, vis) in visualizations:
    outputPath = "{}/{}.jpg".format(args.vis, k)
    result = Image.fromarray(np.uint8(vis))
    result.save(outputPath)
The full demo_vis.py script is available on GitHub.