Karel Ha
article by Google DeepMind
Applications of AI
spam filters
recommender systems (Netflix, YouTube)
predictive text (SwiftKey)
audio recognition (Shazam, SoundHound)
self-driving cars
Artistic-Style Painting (1/2)
[1] Gatys, Ecker, and Bethge 2015; [2] Li and Wand 2016
Artistic-Style Painting (2/2)
Champandard 2016
C Code Generated Character by Character
Karpathy 2015
Algebraic Geometry Generated Character by Character
Karpathy 2015
Game of Thrones Generated Character by Character
JON
Darknet (on OS X)
http://pjreddie.com/darknet/rnns-in-darknet/
DeepDrumpf: a Twitter bot / neural network which learned the language of Donald Trump from his speeches
We’ve got nuclear weapons that are obsolete. I’m going to create jobs just by making the worst thing ever.
The biggest risk to the world, is me, believe it or not.
I am what ISIS doesn’t need.
I’d like to beat that @HillaryClinton. She is a horror. I told my supporter Putin to say that all the time. He has been amazing.
I buy Hillary, it’s beautiful and I’m happy about it.
Hayes 2016
Atari Player by Google DeepMind
https://youtu.be/0X-NdPtFKq0?t=21m13s
Mnih et al. 2015
https://xkcd.com/1002/
Heads-up Limit Hold’em Poker Is Solved!
Cepheus: http://poker.srv.ualberta.ca/
exploitable by at most 0.000986 big blinds per game on expectation
Bowling et al. 2015
Basics of Machine Learning
https://dataaspirant.com/2014/09/19/supervised-and-unsupervised-learning/
Supervised Learning (SL)
1. data collection: Google Search, Facebook “Likes”, Siri, Netflix, YouTube views, LHC collisions, KGS Go Server...
2. training on the training set
3. testing on the testing set
4. deployment
http://www.nickgillian.com/
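The four-step workflow above can be sketched in a few lines. This is an illustrative toy, not code from the article: the dataset, the 80/20 split, and the 1-nearest-neighbour classifier are all assumptions chosen for brevity.

```python
import random

# 1. data collection: a toy labeled dataset of (feature, label) pairs,
# where inputs above 0.5 are labeled 1 (think: "spam").
data = [(x / 100, 1 if x > 50 else 0) for x in range(100)]

# 2./3. split the data into a training set and a testing set.
random.seed(0)
random.shuffle(data)
train, test = data[:80], data[80:]

# A 1-nearest-neighbour classifier: predict the label of the closest
# training example.
def predict(x):
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

# 3. testing on the testing set: accuracy on held-out examples.
accuracy = sum(predict(x) == y for x, y in test) / len(test)
print(accuracy)
```

Step 4 (deployment) would simply mean calling `predict` on genuinely new inputs.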
Regression
Mathematical Regression
https://thermanuals.wordpress.com/descriptive-analysis/sampling-and-regression/
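For the simplest case, fitting a line y ≈ a·x + b by ordinary least squares has a closed form: the slope is cov(x, y)/var(x). A minimal sketch on synthetic data (the data is made up so the fit is exact):

```python
# Synthetic data generated from y = 2x + 1, so least squares should
# recover slope 2 and intercept 1 exactly.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)

# slope = cov(x, y) / var(x); intercept makes the line pass through the means.
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
      / sum((x - mean_x) ** 2 for x in xs)
intercept = mean_y - slope * mean_x
print(slope, intercept)  # → 2.0 1.0
```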
Classification
https://kevinbinz.files.wordpress.com/2014/08/ml-svm-after-comparison.png
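Classification assigns a discrete label rather than a number. A hypothetical nearest-centroid classifier (much simpler than the SVM in the linked figure, and with invented toy points) shows the idea: summarize each class by the mean of its training points, then label a new point by its closest centroid.

```python
# Toy 2-D training data for two classes; values are illustrative only.
points = {
    "red":  [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0)],
    "blue": [(6.0, 6.0), (7.0, 5.5), (6.5, 7.0)],
}

def centroid(pts):
    # Mean of the points of one class.
    return (sum(p[0] for p in pts) / len(pts),
            sum(p[1] for p in pts) / len(pts))

centroids = {label: centroid(pts) for label, pts in points.items()}

def classify(p):
    # Label of the centroid with the smallest squared distance to p.
    return min(centroids, key=lambda l: (centroids[l][0] - p[0]) ** 2
                                      + (centroids[l][1] - p[1]) ** 2)

print(classify((2.0, 2.0)))  # → red
print(classify((6.0, 6.0)))  # → blue
```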
Underfitting and Overfitting
Beware of overfitting!
It is like studying for a mathematics exam by memorizing the proofs.
https://www.researchgate.net/post/How_to_Avoid_Overfitting
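The memorization analogy can be made concrete. Below, a hypothetical "model" that simply memorizes its training set scores perfectly on data it has seen and no better than chance on new data, while a simple hypothesis that captures the underlying rule generalizes. The datasets are invented for illustration.

```python
# Toy data: the label is True iff x is even.
train = {x: x % 2 == 0 for x in range(10)}
test = {x: x % 2 == 0 for x in range(10, 20)}

def memorizer(x):
    # Perfect recall of training examples, a blind guess (False) otherwise.
    return train.get(x, False)

def rule(x):
    # A simple hypothesis that actually generalizes.
    return x % 2 == 0

train_acc = sum(memorizer(x) == y for x, y in train.items()) / len(train)
test_acc = sum(memorizer(x) == y for x, y in test.items()) / len(test)
print(train_acc, test_acc)  # memorizer: 1.0 on train, only 0.5 on test
print(sum(rule(x) == y for x, y in test.items()) / len(test))  # rule: 1.0
```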
Reinforcement Learning (RL)
https://youtu.be/0X-NdPtFKq0?t=16m57s
Monte Carlo Tree Search
Tree Search
Neural Networks
Neural Networks (NN): Inspiration
http://pages.cs.wisc.edu/~bolo/shipyard/neural/local.html
Neural Networks (NN): Inspiration
Dieterle 2003
Neural Networks: Modes
Two modes
feedforward for making predictions
backpropagation for learning
Dieterle 2003
Neural Networks: an Example of Feedforward
http://stevenmiller888.github.io/mind-how-to-build-a-neural-network/
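A feedforward pass is just repeated weighted sums and activations. The sketch below pushes one input through a tiny 2-3-1 network with sigmoid activations, in the spirit of the linked tutorial; the weights and input are arbitrary illustrative values, not taken from the slides.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

inputs = [1.0, 0.0]
hidden_weights = [[0.8, 0.2], [0.4, 0.9], [0.3, 0.5]]  # one row per hidden neuron
output_weights = [0.3, 0.5, 0.9]

# Hidden layer: weighted sum of inputs per neuron, then the activation.
hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs)))
          for row in hidden_weights]
# Output layer: weighted sum of hidden activations, then the activation.
output = sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))
print(round(output, 3))  # ≈ 0.736
```

Training (backpropagation) would then adjust the weight matrices to reduce the error of this output.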
Gradient Descent in Neural Networks
http://pages.cs.wisc.edu/~bolo/shipyard/neural/local.html
http://code.flickr.net/2014/10/20/introducing-flickr-park-or-bird/
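Gradient descent in one dimension, stripped of the network: repeatedly step against the gradient of the loss. Here the "loss" is the toy function f(x) = (x − 3)², with gradient 2(x − 3), so the iteration converges to the minimum at x = 3; in a neural network, x would be the weight vector and f the training error.

```python
# Minimize f(x) = (x - 3)^2 by gradient descent.
x = 0.0
learning_rate = 0.1
for _ in range(100):
    gradient = 2 * (x - 3)   # derivative of (x - 3)^2
    x -= learning_rate * gradient
print(round(x, 4))  # → 3.0
```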
(Deep) Convolutional Neural Networks
The hierarchy of concepts is captured in the number of layers: the “deep” in “Deep Learning”.
http://pages.cs.wisc.edu/~bolo/shipyard/neural/local.html
Rules of Go
Backgammon: Man vs. Fate
Go: Man vs. Self
Robert Šámal (White) versus Karel Král (Black), Spring School of Combinatorics 2016
Rules of Go
https://en.wikipedia.org/wiki/Go_(game)
Scoring Rules: Area Scoring
https://en.wikipedia.org/wiki/Go_(game)
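Under area scoring, a player's score is the number of their stones on the board plus the number of empty points that reach only their colour. A toy flood-fill scorer makes this precise; the 5×5 position and the 'b'/'w'/'.' board encoding are assumptions for illustration.

```python
# A tiny illustrative position ('b' = black, 'w' = white, '.' = empty).
BOARD = [
    "bb.ww",
    "b..w.",
    "bbbww",
    ".b.w.",
    "bb.ww",
]

def area_score(board, colour):
    rows, cols = len(board), len(board[0])
    # Stones on the board...
    score = sum(row.count(colour) for row in board)
    # ...plus empty regions bordered exclusively by this colour.
    seen = set()
    for r in range(rows):
        for c in range(cols):
            if board[r][c] != "." or (r, c) in seen:
                continue
            # Flood-fill one empty region, recording bordering colours.
            region, borders, stack = [], set(), [(r, c)]
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                region.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < rows and 0 <= nx < cols:
                        if board[ny][nx] == "." and (ny, nx) not in seen:
                            seen.add((ny, nx))
                            stack.append((ny, nx))
                        elif board[ny][nx] != ".":
                            borders.add(board[ny][nx])
            if borders == {colour}:  # territory only if one colour encloses it
                score += len(region)
    return score

print(area_score(BOARD, "b"), area_score(BOARD, "w"))  # → 10 10
```

A real scorer would also handle komi and dead stones, which are omitted here.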
Ranks of Players
https://en.wikipedia.org/wiki/Go_(game)
Chocolate micro-break
AlphaGo: Inside Out
Policy and Value Networks
Results:
Beware of overfitting!
Move number is the number of moves that had been played in the given position.
1. selection phase
2. expansion phase
3. evaluation phase
4. backup phase (at end of all simulations)
where the bonus u(s, a) ∝ P(s, a) / (1 + N(s, a))
Silver et al. 2016
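A toy sketch of this selection rule: at a node, pick the action maximizing Q(s, a) + u(s, a), with the bonus proportional to the prior P(s, a) and decaying with the visit count N(s, a). The constant c and the per-action statistics below are made-up illustrative values, not AlphaGo's, and the full algorithm in Silver et al. 2016 uses a more elaborate form of the bonus.

```python
# Proportionality constant for the exploration bonus (an assumption).
c = 5.0

# Toy statistics at one tree node: prior P, visit count N, action value Q.
actions = {
    "a": {"P": 0.6, "N": 50, "Q": 0.55},
    "b": {"P": 0.3, "N": 5,  "Q": 0.50},
    "c": {"P": 0.1, "N": 0,  "Q": 0.00},
}

def select(stats):
    # argmax over actions of Q(s, a) + c * P(s, a) / (1 + N(s, a))
    return max(stats, key=lambda a: stats[a]["Q"]
               + c * stats[a]["P"] / (1 + stats[a]["N"]))

print(select(actions))  # → b
```

Note how the bonus steers the search toward "b": it has a lower Q than "a" but a decent prior and far fewer visits, so exploration wins out.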
MCTS Algorithm: Expansion
percentage frequency with which actions were selected from the root during simulations
AlphaGo:
40 search threads
40 CPUs
8 GPUs
Distributed AlphaGo:
40 search threads
1202 CPUs
176 GPUs
Silver et al. 2016
Elo Ratings for Various Combinations of Threads
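Elo ratings translate directly into predicted scores: a player rated R_a is expected to score E_a = 1 / (1 + 10^((R_b − R_a) / 400)) against a player rated R_b. The ratings in the sketch are illustrative round numbers, not the measured AlphaGo figures; they show that a 400-point gap predicts roughly a 91% score.

```python
def expected_score(r_a, r_b):
    # Standard Elo expectation for the player rated r_a against r_b.
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

print(round(expected_score(3100, 2700), 3))  # ≈ 0.909
```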
Fan Hui
professional 2 dan
European Go Champion in 2013, 2014 and 2015
European Professional Go Champion in 2016
biological neural network:
100 billion neurons
100 to 1,000 trillion neuronal connections
https://en.wikipedia.org/wiki/Fan_Hui
AlphaGo versus Fan Hui
Lee Sedol “The Strong Stone”
professional 9 dan
the 2nd in international titles
the 5th youngest (12 years 4 months) to become a professional Go player in South Korean history
Lee Sedol would win 97 out of 100 games against Fan Hui.
biological neural network comparable to Fan Hui’s (in number of neurons and connections)
https://en.wikipedia.org/wiki/Lee_Sedol
I heard Google DeepMind’s AI is surprisingly strong and getting stronger, but I am confident that I can win, at least this time.
Lee Sedol, interview in JTBC Newsroom
AlphaGo versus Lee Sedol
In March 2016, AlphaGo won 4:1 against the legendary Lee Sedol.
AlphaGo won all but the 4th game; all games were won by resignation.
The winner of the match was slated to win $1 million.
Since AlphaGo won, Google DeepMind stated that the prize would be donated to charities, including UNICEF, and to Go organisations.
Lee received $170,000 ($150,000 for participating in all five games, and an additional $20,000 for each game won).
https://en.wikipedia.org/wiki/AlphaGo_versus_Lee_Sedol
Who’s next?
http://www.goratings.org/ (18th April 2016)
AlphaGo versus Ke Jie?
professional 9 dan
the 1st in the (unofficial) world ranking list
the youngest player to win 3 major international tournaments
head-to-head record against Lee Sedol: 8:2
biological neural network comparable to Fan Hui’s, and thus, by transitivity, also comparable to Lee Sedol’s
https://en.wikipedia.org/wiki/Ke_Jie
I believe I can beat it. Machines can be very strong in many aspects but still have loopholes in certain calculations.
Ke Jie
Conclusion
Difficulties of Go
challenging decision-making
intractable search space
complex optimal solution
It appears infeasible to approximate the optimal solution directly using a policy or value function!
scalable implementation
multi-threaded simulations on CPUs
parallel GPU computations
distributed version over multiple machines
Backup Slides
Input features for rollout and tree policy
move probabilities taken directly from the SL policy network pσ (reported as a percentage if above 0.1%).
action values Q(s, a) for each tree-edge (s, a) from root position s (averaged over value network evaluations only)
Silver et al. 2016
Tree Evaluation from Rollouts
https://youtu.be/vFr3K2DORc8
https://en.wikipedia.org/wiki/AlphaGo_versus_Lee_Sedol
AlphaGo versus Lee Sedol: Game 2 (1/2)
https://youtu.be/l-GsfyVCBu0
https://en.wikipedia.org/wiki/AlphaGo_versus_Lee_Sedol
AlphaGo versus Lee Sedol: Game 2 (2/2)
https://en.wikipedia.org/wiki/AlphaGo_versus_Lee_Sedol
AlphaGo versus Lee Sedol: Game 3
https://youtu.be/qUAmTYHEyM8
https://en.wikipedia.org/wiki/AlphaGo_versus_Lee_Sedol
AlphaGo versus Lee Sedol: Game 4
https://youtu.be/yCALyQRN3hw
https://en.wikipedia.org/wiki/AlphaGo_versus_Lee_Sedol
AlphaGo versus Lee Sedol: Game 5 (1/2)
https://youtu.be/mzpW10DPHeQ
https://en.wikipedia.org/wiki/AlphaGo_versus_Lee_Sedol
AlphaGo versus Lee Sedol: Game 5 (2/2)
https://en.wikipedia.org/wiki/AlphaGo_versus_Lee_Sedol
Further Reading I
AlphaGo:
Atari player: a DeepRL system which combines Deep Neural Networks with Reinforcement Learning (Mnih
et al. 2015)
Neural Turing Machines (Graves, Wayne, and Danihelka 2014)
Artificial Intelligence:
Chess:
Machine Learning:
Neuroscience:
http://www.brainfacts.org/
References I
Allis, Louis Victor et al. (1994). Searching for solutions in games and artificial intelligence. Ponsen & Looijen.
Baudiš, Petr and Jean-loup Gailly (2011). “Pachi: State of the art open source Go program”. In: Advances in
Computer Games. Springer, pp. 24–38.
Bowling, Michael et al. (2015). “Heads-up limit hold’em poker is solved”. In: Science 347.6218, pp. 145–149. url:
http://poker.cs.ualberta.ca/15science.html.
Champandard, Alex J (2016). “Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks”. In:
arXiv preprint arXiv:1603.01768.
Conway, John Horton (1976). “On Numbers and Games”. In: London Mathematical Society Monographs 6.
Dieterle, Frank Jochen (2003). “Multianalyte quantifications by means of integration of artificial neural networks,
genetic algorithms and chemometrics for time-resolved analytical data”. PhD thesis. Universität Tübingen.
Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge (2015). “A Neural Algorithm of Artistic Style”. In:
CoRR abs/1508.06576. url: http://arxiv.org/abs/1508.06576.
Graves, Alex, Greg Wayne, and Ivo Danihelka (2014). “Neural turing machines”. In: arXiv preprint
arXiv:1410.5401.
Hayes, Bradley (2016). url: https://twitter.com/deepdrumpf.
Karpathy, Andrej (2015). The Unreasonable Effectiveness of Recurrent Neural Networks. url:
http://karpathy.github.io/2015/05/21/rnn-effectiveness/ (visited on 04/01/2016).
References II
Kurzweil, Ray (2005). The singularity is near: When humans transcend biology. Penguin.
LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton (2015). “Deep learning”. In: Nature 521.7553, pp. 436–444.
Li, Chuan and Michael Wand (2016). “Combining Markov Random Fields and Convolutional Neural Networks for
Image Synthesis”. In: CoRR abs/1601.04589. url: http://arxiv.org/abs/1601.04589.
Mnih, Volodymyr et al. (2015). “Human-level control through deep reinforcement learning”. In: Nature 518.7540,
pp. 529–533. url:
https://storage.googleapis.com/deepmind-data/assets/papers/DeepMindNature14236Paper.pdf.
Müller, Martin (1995). “Computer Go as a sum of local games: an application of combinatorial game theory”.
PhD thesis. TU Graz.
Silver, David et al. (2016). “Mastering the game of Go with deep neural networks and tree search”. In: Nature
529.7587, pp. 484–489.