Published March 14, 2018
After a theoretical introduction, this course will focus on how to make the most
out of the popular applications FakeApp and faceswap; most of the deepfakes you
will find online (such as the ones featuring Nicolas Cage) have been created using
them.
In his 2016 article, Face Swap using OpenCV, author Satya Mallick showed how
to swap faces programmatically, warping and colour-correcting Ted Cruz's face to
fit Donald Trump (below).
When applied correctly, this technique is uncannily good at swapping faces. But it
has a major disadvantage: it only works on pre-existing pictures. It cannot, for
instance, morph Donald Trump's face to match the expression of Ted Cruz.
That changed in late 2017, when a new approach to face swapping appeared on
Reddit. The breakthrough relies on neural networks, computational models that
are loosely inspired by the way real brains process information. This novel
technique allows the generation of so-called deepfakes, which actually morph a
person's face to mimic someone else's features, while preserving the original
facial expression.
When used properly, this technique allows the creation of photorealistic videos at
an incredibly low cost. The finale of Rogue One, for instance, featured a digital
version of Princess Leia: a very expensive scene which required the expertise of
many people. Below, you can see a comparison between the original scene and another
one recreated using Deep Learning.
Creating Deepfakes
At the moment there are two main applications used to create deepfakes: FakeApp and
faceswap. Regardless of which one you use, the process is mostly the same, and
requires three steps: extraction, training and creation.
Extraction
The deep- in deepfakes comes from the fact that this face-swap technology uses Deep
Learning. If you are familiar with the concept, you should know that deep learning
often requires large amounts of data. Without hundreds (if not thousands!) of face
pictures, you will not be able to create a deepfake video.
A way to get around this is to collect a number of video clips which feature the
people you want to face-swap. Extraction refers to the process of extracting all
frames from these video clips, identifying the faces and aligning them.
The alignment is critical, since the neural network that performs the face swap
requires all faces to have the same size (usually 256×256 pixels) and aligned
features. Detecting and aligning faces is a problem that is considered mostly
solved, and most applications handle it very efficiently.
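To get an intuition for what "aligning" means, here is a minimal Python sketch. It assumes a face detector has already located the two eye centres (the coordinates and canonical eye positions below are made up for illustration; real tools use many more landmarks), and computes the rotation, scale and translation that move the eyes onto fixed positions inside a 256×256 frame:

```python
import numpy as np

# A sketch of the alignment step, assuming the two eye centres are known.
# We build the 2x3 similarity transform (rotation + uniform scale +
# translation) that maps them onto fixed "canonical" positions inside a
# 256x256 frame. The target positions below are illustrative, not the
# ones FakeApp or faceswap actually use.

def similarity_transform(left_eye, right_eye,
                         target_left=(88, 110), target_right=(168, 110)):
    """Return a 2x3 affine matrix aligning the two eye points."""
    src = np.array([left_eye, right_eye], dtype=float)
    dst = np.array([target_left, target_right], dtype=float)

    # Rotation + uniform scale mapping the source eye vector onto the
    # target eye vector (2D points treated as complex numbers).
    s = complex(*(dst[1] - dst[0])) / complex(*(src[1] - src[0]))
    a, b = s.real, s.imag
    R = np.array([[a, -b], [b, a]])

    # Translation that moves the (rotated, scaled) left eye into place.
    t = dst[0] - R @ src[0]
    return np.hstack([R, t[:, None]])   # 2x3 matrix, warpAffine-style

M = similarity_transform((100, 120), (180, 140))
# Applying M to the detected eye positions lands them on the targets:
print(M @ np.array([100, 120, 1.0]))   # ~ [88, 110]
```

Once every face has been warped this way, the network always sees eyes (and, with more landmarks, noses and mouths) in the same place, which makes learning far easier.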
Training
Training is a technical term borrowed from Machine Learning. In this case, it
refers to the process which allows a neural network to learn how to convert one
face into another. Although it takes several hours, the training phase needs to be
done only once. Once completed, the network can convert a face from person A into
person B.
This is the most obscure part of the entire process, and I have dedicated two posts
to explaining how it works from a technical point of view: An Introduction to
Neural Networks and Autoencoders and Understanding the Technology Behind DeepFakes.
If you really want to create photorealistic deepfakes, a basic understanding of the
process that generates them is necessary.
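The posts above cover the details, but the core trick is worth sketching here: two autoencoders are trained that share a single encoder, while each person gets their own decoder. The swap happens by encoding one person's face and decoding it with the other person's decoder. The Python below is a structural sketch only (linear layers and random weights stand in for the real convolutional networks; nothing is actually trained):

```python
import numpy as np

# A structural sketch (not real training code) of the shared-encoder /
# two-decoder architecture behind the training phase. Linear maps stand
# in for the real convolutional networks, and sizes are made up.

rng = np.random.default_rng(1)
encoder   = rng.normal(size=(64, 16))   # shared: face -> latent code
decoder_a = rng.normal(size=(16, 64))   # latent code -> person A's face
decoder_b = rng.normal(size=(16, 64))   # latent code -> person B's face

face_a = rng.normal(size=64)            # a (fake) 8x8 face of person A

# During training, A's faces go through encoder + decoder_a, and B's
# through encoder + decoder_b. The swap happens at creation time:
latent = face_a @ encoder               # encode person A's expression...
swapped = latent @ decoder_b            # ...and decode it as person B
print(swapped.shape)                    # a face-sized output
```

Because the encoder is shared, the latent code captures the expression and pose, while each decoder learns how to paint a specific person's features on top of them.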
Creation
Once the training is complete, it is finally time to create a deepfake. Starting
from a video, all frames are extracted and all faces are aligned. Then, each one is
converted using the trained neural network. The final step is to merge the
converted face back into the original frame. While this sounds like an easy task,
it is actually where most face-swap applications go wrong.
The creation process is the only one which does not use any Machine Learning. The
algorithm to stitch a face back onto an image is hard-coded, and lacks the
flexibility to detect mistakes.
Also, each frame is processed independently: there is no temporal correlation
between them, meaning that the final video might have some flickering. This is the
part where more research is needed. If you are using faceswap instead of FakeApp,
have a look at df, which tries to improve the creation process.
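To see why the stitching step is fragile, here is a minimal Python sketch of the kind of hard-coded blend involved. The exact algorithm each application uses varies; this one simply pastes the converted face back with a feathered (soft-edged) mask, using small arrays in place of greyscale images (all values are made up):

```python
import numpy as np

# A sketch of the hard-coded stitching step: the converted face is
# pasted back using a soft (feathered) mask, so its borders fade into
# the original frame instead of showing a hard seam.

frame = np.full((8, 8), 0.2)            # original frame (background)
face  = np.full((4, 4), 0.9)            # converted face patch
mask  = np.ones((4, 4))
mask[0, :] = mask[-1, :] = mask[:, 0] = mask[:, -1] = 0.5  # feathered edge

y, x = 2, 2                             # where the face was detected
region = frame[y:y+4, x:x+4]
# Alpha-blend: mask=1 keeps the new face, mask=0 keeps the frame.
frame[y:y+4, x:x+4] = mask * face + (1 - mask) * region
print(frame[3, 3], frame[2, 2])         # centre is the new face; edge is a blend
```

Since there is no learning involved, the blend has no way to notice that a frame came out wrong; it applies the same recipe to every frame regardless.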
Conclusion
Deep Learning has made photorealistic face-swap not just possible, but also
accessible. This technique is still in its infancy and many more improvements are
expected to happen in the next few years.
In the meantime, places like the FakeApp forum or the fakeapp GitHub page are where
most of the technical discussion around deepfakes is currently taking place. The
community around deepfakes is constantly exploring new approaches, and developers
are often very willing to share their creations. This is the case of user
ZeroCool22, who created a deepfake video of Jimmy Fallon interviewing himself.
It cannot be denied that deepfakes have finally shown the world a practical
application of Deep Learning. However, this very technique has often been used
without the explicit consent of the people involved. While this is unlikely to be
an issue with videos such as the ones shown in this article, the same cannot be
said when it is used to create pornographic content. This is why, before showing
how to create deepfakes, the next lecture in this online course will focus entirely
on the legal and ethical issues of deepfakes.
This course was made possible thanks to the support of patrons on Patreon.
An Introduction to Neural Networks and Autoencoders
You can read all the posts in this series here:
Neural networks are computational systems loosely inspired by the way in which the
brain processes information. Special cells called neurons are connected to each
other in a dense network (below), allowing information to be processed and
transmitted.
In Computer Science, artificial neural networks are made out of thousands of nodes,
connected in a specific fashion. Nodes are typically arranged in layers; the way in
which they are connected determines the type of the network and, ultimately, its
ability to perform one computational task better than another. A traditional
neural network might look like this:
Each node (or artificial neuron) in the input layer contains a numerical value
that encodes the input we want to feed to the network. If we are trying to predict
the weather for tomorrow, the input nodes might contain the pressure, temperature,
humidity and wind speed encoded as numbers in the range \left[-1,+1\right]. These
values are broadcast to the next layer; the interesting part is that each edge
dampens or amplifies the values it transmits. Each node sums all the values it
receives, and outputs a new one based on its own activation function. The result of
the computation can be retrieved from the output layer; in this case, only one
value is produced (for instance, the probability of rain).
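The weather example above can be sketched in a few lines of Python. All the weights here are random placeholders rather than a trained network; the point is just to show the sum-then-activate pattern each node performs:

```python
import numpy as np

# A minimal sketch of the forward pass described above: four inputs
# (pressure, temperature, humidity, wind speed in [-1, +1]) flow through
# one hidden layer to a single output (probability of rain).
# All weights are random, purely for illustration.

def forward(x, w_hidden, w_output):
    # Each hidden node sums its weighted inputs, then applies tanh.
    hidden = np.tanh(w_hidden @ x)
    # The output node does the same; a sigmoid maps it into [0, 1].
    return 1.0 / (1.0 + np.exp(-(w_output @ hidden)))

rng = np.random.default_rng(0)
x = np.array([0.3, -0.5, 0.8, 0.1])   # example inputs in [-1, +1]
w_hidden = rng.normal(size=(3, 4))    # 4 inputs -> 3 hidden nodes
w_output = rng.normal(size=(1, 3))    # 3 hidden nodes -> 1 output

p_rain = forward(x, w_hidden, w_output)
print(p_rain)                         # a single value between 0 and 1
```

Every edge in the diagrams corresponds to one entry of these weight matrices: that is the "dampening or amplifying" mentioned above.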
When images are the input (or output) of a neural network, we typically have three
input nodes for each pixel, initialised with the amount of red, green and blue it
contains. The most effective architecture for image-based applications so far is
the convolutional neural network (CNN), and this is exactly what deepfakes use.
Training a neural network means finding a set of weights for all edges, so that the
output layer produces the desired result. One of the most used techniques to
achieve this is called backpropagation, and it works by re-adjusting the weights
every time the network makes a mistake.
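The "re-adjusting" idea fits in a few lines for a single neuron. This toy Python example (not real backpropagation through many layers, and all numbers made up) nudges the weights against the error until the output matches a target:

```python
import numpy as np

# A toy illustration of weight re-adjustment on a single linear neuron:
# nudge the weights against the gradient of the squared error until the
# output matches the target.
x = np.array([0.5, -0.2, 0.1])   # inputs (made up)
target = 0.7                     # desired output
w = np.zeros(3)                  # start with arbitrary weights

for _ in range(200):
    y = w @ x                    # forward pass
    error = y - target           # how wrong the neuron currently is
    w -= 0.5 * error * x         # adjust each weight to reduce the error

print(abs(w @ x - target))       # residual error: essentially zero
```

Real backpropagation applies the same idea layer by layer, using the chain rule to work out how much each weight contributed to the final mistake.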
The basic idea behind face detection and image generation is that each layer will
represent progressively more complex features. In the case of a face, for instance,
the first layer might detect edges, the second facial features, which the third
layer can then use to detect whole faces (below):
In reality, what each layer responds to is far from being that simple. This is why
Deep Dreams were originally used as a means to investigate how and what
convolutional neural networks learn.
Autoencoders
Neural networks come in all shapes and sizes. And it is exactly their shape and
size that determine the performance of the network at solving a certain problem. An
autoencoder is a special type of neural network whose objective is to match the
input it was provided with. At first glance, autoencoders might seem like nothing
more than a toy example, as they do not appear to solve any real problem.
Let's have a look at the network below, which features two fully connected hidden
layers, with four neurons each.
However, something interesting happens if one of the layers features fewer nodes
(diagram below). In this case, the input values cannot be simply connected to their
respective output nodes. In order to succeed at this task, the autoencoder has to
somehow compress the information provided and then reconstruct it before presenting
it as its final output.
If the training is successful, the autoencoder has learned how to represent the
input values in a different, yet more compact form. The autoencoder can be
decoupled into two separate networks: an encoder and a decoder, both sharing the
layer in the middle. The values \left[Y_0, Y_1\right] are often referred to as the
base vector, and they represent the input image in the so-called latent space.
Autoencoders are naturally lossy, meaning that they will not be able to reconstruct
the input image perfectly. This can be seen in the comparison below, taken from
Building Autoencoders in Keras. The first row shows random images that have been
fed, one by one, to a trained autoencoder. The row just below shows how they have
been reconstructed by the network.
However, because the autoencoder is forced to reconstruct the input image as best
as it can, it has to learn how to identify and represent its most meaningful
features. Because the smaller details are often ignored or lost, an autoencoder can
be used to denoise images (as seen below). This works very well because the noise
does not add any real information, hence the autoencoder is likely to ignore it in
favour of more important features.
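If you want to see the bottleneck idea in code, here is a deliberately tiny Python sketch. Real autoencoders (such as the Keras ones referenced above) use deep, non-linear networks; this one uses linear layers and plain gradient descent on made-up data, just to show the compress-then-reconstruct loop and the fact that reconstruction is lossy but improves with training:

```python
import numpy as np

# A minimal linear autoencoder: 8 input values are squeezed through a
# 2-node bottleneck (the "latent space") and reconstructed. Data and
# sizes are made up for illustration.

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 8))                # 100 fake "images" of 8 pixels
W_enc = rng.normal(scale=0.1, size=(8, 2))   # encoder: 8 -> 2
W_dec = rng.normal(scale=0.1, size=(2, 8))   # decoder: 2 -> 8

def loss(X, W_enc, W_dec):
    latent = X @ W_enc          # compress into the latent space
    X_hat = latent @ W_dec      # reconstruct from the latent code
    return np.mean((X_hat - X) ** 2)

before = loss(X, W_enc, W_dec)
lr = 0.01
for _ in range(500):
    latent = X @ W_enc
    X_hat = latent @ W_dec
    grad = 2 * (X_hat - X) / X.size          # dLoss / dX_hat
    W_dec -= lr * latent.T @ grad            # update decoder weights
    W_enc -= lr * X.T @ (grad @ W_dec.T)     # update encoder weights
after = loss(X, W_enc, W_dec)
print(before, after)   # reconstruction error drops during training
```

Note that the final error never reaches zero: squeezing 8 values through 2 necessarily discards information, which is exactly the lossiness discussed above.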
Conclusion
The next post in this series will explain how autoencoders can be used to
reconstruct faces.
Jon
April 12, 2018
Thanks for the post. It's my first glimpse of what is "under the hood" of neural
networks. Forgive my simplistic interpretation, but to me it looks like a set of
variables (call it an array) are tested against a set of conditions (call it
another array) with the number of possible permutations being of a factorial
enormity. The network then compares each output test and, if it's a good one,
stores it somehow. What I don't understand is how that stored "good result" is used
to better inform or direct the continuing testing.
Alan Zucconi
April 17, 2018
Hi Jon!
The "numbers" that the neural network stores are the "weights", which are
represented by the arrows. Each neuron sums the values of the neurons connected to
its left, multiplied by the values that are stored in the arrows.
So yes, neural networks are, in their most simple variant, just sums and
multiplications. The trick is to find the best set of weights so that the neural
network produces the result we want. While executing a neural network is very easy
and straightforward, finding the right balance for the weights is a very
challenging task. The "standard" algorithm used is called "backpropagation". You
start with random weights, and check how poorly the network performs. Then, you use
this error to "fix" the weights so that the overall network performs slightly
better. If you repeat this millions of times, chances are you'll converge to a good
result.
I would advise having a look at this video, which probably does a better job at
visualising neural networks and showing how backpropagation works.
https://www.youtube.com/watch?v=aircAruvnKk
Jon Fink
April 29, 2018
Thanks for the stripped-down summary and the follow-up references. Your brief
response gave me more insight than the subsequent four hours of videos I trawled
through, learning about the significance of the cosine function and calculus in
improving the weight of each neuron. It's a very apt analogy. Each level of
calculations improves the relative worth of each branch of nodes towards the goal
of a more successful outcome. I use branch in place of the term nodes as you can
clearly see the pathways that lead through each level. The AI approach seems more
efficient than brute-force random permutations. I feel that the science would
benefit from a closer look at cognitive studies. It sort of does, but if the AI is
given more guidance at the earlier stages it may produce even better results. I
don't know how that could be achieved mathematically, it's just a thought.
Despite its success, one of the most recurring criticisms the game has faced is
related to the apparent simplicity of its execution. With Return of the Obra Dinn,
Lucas Pope dispels any doubt with a game that, by itself, is nothing less than an
achievement in technical excellence.
Most of the content from this week's Shader Showcase Saturday comes directly from
the Return of the Obra Dinn [DevLog], which Lucas Pope himself started back in
2014. The aesthetics of the game heavily rely on an effect called dithering,
which allows representing greyscale images with only two colours: black and white.
If you are only interested in a quick way to add dithering to your game, I would
suggest trying Stylizer – Basic. The asset comes with several other effects
available, all designed to add a retro vibe to your scenes.
A significant amount of work was put into making sure that the dithering effect had
temporal consistency between frames. This was achieved by remapping the dithering
effect on a sphere.
Lastly, Return of the Obra Dinn features very crisp lines. That is done thanks to
an edge detection post-processing effect, which will not be discussed today. This
blog will soon dedicate an entire tutorial to the topic, inspired by the
exceptional aesthetics of Sable, which is based on a similar technique.
Dithering
Dithering was pioneered by the printing industry, which traditionally used black
ink on white paper. The simplest technique to render images in black and white is
to use a threshold, so that all pixels darker than a certain value are coloured
black. As seen in the image below, thresholding yields very poor results:
Since dithering was first used, there have been countless implementations, each one
with its specific advantages and limitations. The most used technique nowadays is
Floyd–Steinberg dithering, first introduced by Robert W. Floyd and Louis Steinberg
in 1976. You can see how well it performs below.
Dithering, however, has two main problems. The first one is that it is a non-local
effect. This means that, unlike thresholding, it cannot be executed on a per-pixel
basis. Floyd–Steinberg dithering requires propagating the computation to nearby
pixels; a process that is notoriously tricky to do via shaders.
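The error propagation is easiest to see in code. Here is a short Python sketch of Floyd–Steinberg on a greyscale image in [0, 1] (the image itself is a made-up flat grey, the worst case for plain thresholding):

```python
import numpy as np

# A sketch of Floyd-Steinberg dithering. Each pixel is thresholded, and
# the quantisation error is pushed onto its right and lower neighbours;
# this sequential propagation is exactly what makes the algorithm
# awkward to express as a per-pixel shader.

def floyd_steinberg(image):
    img = image.astype(float).copy()
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = 1.0 if old >= 0.5 else 0.0   # threshold to black/white
            img[y, x] = new
            err = old - new
            # Classic 7/16, 3/16, 5/16, 1/16 error weights.
            if x + 1 < w:               img[y,     x + 1] += err * 7 / 16
            if y + 1 < h and x > 0:     img[y + 1, x - 1] += err * 3 / 16
            if y + 1 < h:               img[y + 1, x    ] += err * 5 / 16
            if y + 1 < h and x + 1 < w: img[y + 1, x + 1] += err * 1 / 16
    return img

# A flat mid-grey: thresholding alone would make it uniformly one
# colour, while dithering produces a black/white mix that averages grey.
grey = np.full((16, 16), 0.5)
out = floyd_steinberg(grey)
print(out.mean())   # close to 0.5, using only black and white pixels
```

Notice the double loop: pixel (y, x) cannot be decided until every pixel before it has been, which is the non-local dependency mentioned above.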
If you need more control over your effects, you can try the extended version of
Stylizer – Basic, unsurprisingly called Stylizer – Extended.
Lookup Texture
A cheaper solution is to use a lookup texture, which can be sampled in world space
based on an object's reflectance (often indicated as NdotL). The result can be
pretty effective, although it is prone to inaccuracies. Ciro Continisio has
implemented this technique in ShaderGraph (below) to simulate a hatching effect.
Another shader experiment in ShaderGraph. I like this one, it's taking shape! Still
loads to do, but I like the direction. pic.twitter.com/ZtmnkTjFtB
It is also possible to sample the texture in UV space, so that the hatching follows
the curvature of the 3D models. The technique is discussed in Real-Time Hatching,
where Emil Praun and his colleagues relied on mip-mapped lookup textures which they
referred to as tonal art maps (below).
Technical Artist Kyle Halladay wrote an interesting tutorial on that very paper,
titled A Pencil Sketch Effect, later improved by Ben Golus.
Ordered Dithering
A more advanced technique to perform real dithering requires a post-processing
effect. Ordered dithering decides if a pixel should be black or white by analysing
its local neighbourhood. Such an operation requires sampling the texture several
times, but is highly parallelisable and works relatively well in a shader. A
specific implementation of ordered dithering uses a matrix, often referred to as a
Bayer matrix, to govern how nearby pixels should be taken into account.
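Here is a minimal Python sketch of ordered dithering with the standard 4×4 Bayer matrix. Unlike Floyd–Steinberg above, every pixel is decided independently by comparing it against a tiled per-position threshold, which is why this variant maps so naturally onto a shader:

```python
import numpy as np

# Ordered dithering with a 4x4 Bayer matrix: each pixel is compared
# against a threshold taken from the tiled matrix, so every pixel can
# be processed independently of its neighbours.

BAYER_4x4 = np.array([[ 0,  8,  2, 10],
                      [12,  4, 14,  6],
                      [ 3, 11,  1,  9],
                      [15,  7, 13,  5]]) / 16.0

def ordered_dither(image):
    h, w = image.shape
    # Tile the 4x4 threshold matrix across the whole image.
    thresholds = np.tile(BAYER_4x4, (h // 4 + 1, w // 4 + 1))[:h, :w]
    return (image > thresholds).astype(float)

grey = np.full((8, 8), 0.5)
out = ordered_dither(grey)
print(out.mean())   # exactly half the pixels end up white
```

In a fragment shader the same lookup is typically done with `screenPos % 4` into a small texture or constant array, one comparison per pixel.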
A variant of this technique has been used in Return of the Obra Dinn. As explained
in his devlog, the game initially adopted Bayer dithering. Lucas Pope later
developed his own variant (referred to as blue noise dithering) because it survived
both screen scaling and video compression much better.
Temporal Dithering
One massive problem with ordered dithering is that each frame is calculated
independently from the previous one. This lack of temporal correlation causes the
dither pattern to move erratically. This has a negative impact not only on the
video compression, but also on the overall playability.
Lucas Pope attempted to overcome these challenges by mapping the dithering effect
on a sphere.
Lucas Pope wrote an exceptionally detailed devlog entry on how such a technique
works, and on all of his earlier attempts. If you are interested, I would strongly
advise reading it.
If you are struggling to replicate those effects, you can try using one of the many
professional assets that are available on the Asset Store. Another great package
for dithering is Graphics Adapter Pro, which offers a variety of post-processing
effects to emulate retro video cards.
Continue reading →
This post shows how it is possible to find the position of an object in space,
using a technique called trilateration. The traditional approach to this problem
relies on three measurements only. This tutorial addresses how it is possible to
take more measurements into account to improve the precision of the final result.
This algorithm is robust and can work even with inaccurate measurements.
Introduction
Part 1. Geometrical Interpretation
Part 2. Mathematical Interpretation
Part 3. Optimisation Algorithm
Part 4. Additional Considerations
Conclusion
If you are unfamiliar with the concepts of latitude and longitude, I suggest you
read the first post in this series: Understanding Geographical Coordinates.
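As a taste of the idea, here is a small Python sketch of trilateration with more than three measurements (station positions and the target point below are made up). Subtracting the first distance equation from the others linearises the problem, and a least-squares solve then absorbs any extra readings as extra rows:

```python
import numpy as np

# Trilateration from four known stations in 2D. Subtracting the first
# circle equation |p - s_0|^2 = d_0^2 from the others cancels the |p|^2
# term, leaving a linear system that lstsq solves; additional (possibly
# noisy) readings simply become additional rows.

stations = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
true_pos = np.array([3.0, 7.0])
distances = np.linalg.norm(stations - true_pos, axis=1)  # measured readings

# 2 (s_i - s_0) . p = d_0^2 - d_i^2 + |s_i|^2 - |s_0|^2
A = 2 * (stations[1:] - stations[0])
b = (distances[0] ** 2 - distances[1:] ** 2
     + np.sum(stations[1:] ** 2, axis=1) - np.sum(stations[0] ** 2))
pos, *_ = np.linalg.lstsq(A, b, rcond=None)
print(pos)   # recovers the position the distances were measured from
```

The series linked above develops a more robust formulation, but the core trick of turning distance readings into an overdetermined linear system is the same.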
Continue reading →
This series introduces the concept of trilateration. This technique can be applied
to a wide range of problems, from indoor localisation to earthquake detection. This
first post provides a general introduction to the concept of geographical
coordinates, and how they can be effectively manipulated. The second post in the
series, Positioning and Trilateration, will cover the actual techniques used to
identify the position of an object given independent distance readings. Most
trilateration tutorials require the measures from the sensors to be precise and
consistent. The approach presented here, instead, is highly robust and can tolerate
inaccurate readings.
Introduction
Part 1. Understanding Geographical Coordinates
Part 2. Local Geographical Distance
Part 3. Great-Circle Distance
Conclusion
Continue reading →
The previous post in this series, Understanding Deep Dreams, explained what deep
dreams are, and what they can be used for. In this second post you'll learn how to
create them, with a step-by-step guide.
Introduction
The Instructions
The Code
Conclusion
Continue reading →
Despite being a very serious language, Python is full of Easter eggs and hidden
references. This post shows the top 5:
Hello World…
The Zen of Python
Antigravity
C-style braces instead of indentation
Monty Python references
I have covered the 5 most interesting features of Python in this post. Continue
reading →
Introduction
Part 1. Implementation
Part 2. The yield Statement
Part 3. Limitations
Conclusion
Introduction
If you are familiar with C#, chances are you might have used the List class. Like
most of the modern data structures available in .NET, the elements within a List
can be iterated in many ways. The most common uses a for loop and an index i to
access the elements sequentially.
Implementation
Classes that can be iterated using foreach make such a behaviour possible by
implementing the IEnumerable interface (MSDN). Inside, it contains a method called
GetEnumerator that must be used to create and return an iterator. As the name
suggests, iterators are data structures that can be iterated upon. They implement
the interface IEnumerator (MSDN), which provides an API to iterate over a sequence
of elements. IEnumerator contains:
MoveNext: A method that forces the iterator to fetch the next element in the list.
It returns true if there is a next element; false if the sequence has terminated.
Current: This getter is used to return the current element of the iterator.
Having understood how an iterator class is implemented, it's easy to see how those
two code snippets are equivalent:
// Foreach
foreach (int i in list)
Debug.Log(i);
// Iterator
IEnumerator<int> iterator = list.GetEnumerator();
while (iterator.MoveNext())
{
int n = iterator.Current;
Debug.Log(n);
}
The yield Statement
Iterators are awesome. However, they are a pain to code. Recording the position of
the current object and moving it at each subsequent call of MoveNext is not the
most natural way to iterate over a sequence. This is why C# allows a more compact
way to write classes that are compatible with foreach. This is done thanks to a new
keyword: yield.
Let's imagine that we want to create an iterator that produces the elements 0, 1,
2, and 3. We could create a class that implements the IEnumerable interface,
instancing an IEnumerator that uses MoveNext and Current to produce the desired
sequence. Or, thanks to yield, we can simply write:
IEnumerator<int> OurNewEnumerator ()
{
yield return 0;
yield return 1;
yield return 2;
yield return 3;
}
The compiler will take this piece of code and convert it into a proper IEnumerator.
With this new syntax, it is incredibly easy to loop over the sequence.
Conclusion
This post introduced the foreach loop as a safer and more elegant alternative to
traditional index-based for loops. To sum up:
IEnumerator it = …
try {
}
finally {
if (it is IDisposable)
((IDisposable)it).Dispose();
}
Alan Zucconi
January 25, 2017
Oh thank you so much for the clarification!
I tried to make this as simple as possible. There are a few aspects I have
intentionally ignored, such as Reset and stuff. :p I didn't want to get too many
concepts in the same tutorial! :p
Tom
January 25, 2017
Where did "bat" come from for "bat.Current"?
Alan Zucconi
January 25, 2017
Ooops!
Thank you for spotting that!
That was from a previous version of that code snippet!
Sylvain
January 26, 2017
Your website feeds don't work anymore. Well, at least on Feedly they don't.
Alan Zucconi
January 26, 2017
Oh, not sure why, sorry!
rich
June 5, 2017
Can you talk a little about the memory implications of using foreach? Thanks!
Alan Zucconi
June 5, 2017
Hey!
I didn't want to go too much into those details for a very simple reason.
Sometimes developers get very worried about micro-optimisations (such as foreach vs
for), and end up writing much more code and losing the big picture. Since this was
a fairly basic tutorial, I didn't want to scare developers too much!
I will talk about it though, don't worry!
Bartosz Olszewski
August 3, 2017
"Indices like i and j can be easily swapped by mistake, and edge conditions are
sometimes hard to get right on the first try."
That's why I prefer using names like "enemyIndex" or "buttonIndex". They are longer
than single-letter ones, but it is harder to make a mistake, and the code is easier
to read.