
2017. 11. 12 / GDG DevFest Nanjing 2017
2017. 11. 19 / GDG DevFest Seoul 2017

The Flow of TensorFlow


Jeongkyu Shin
Lablup Inc.
Descript.ion
Jeongkyu Shin / @inureyes

§ CEO / Co-founder, Lablup Inc.


§ Develops Backend.AI

§ Open-source devotee
§ Google Developer Expert (Machine Learning)
§ Principal Researcher, KOSSLab., Korea
§ Textcube open-source project maintainer (10th anniversary!)

§ Physicist / Neuroscientist
§ Adj. professor (Dept. of CSE, Hanyang Univ.)
§ Ph.D. in Statistical Physics (complex systems / computational neuroscience)
Machine Learning Era: All came from dust
§ Machine learning
§ "Field of study that gives computers the ability to learn without being explicitly programmed"
Arthur Samuel (1959)

§ "A computer program is said to learn from experience E with respect to some class of tasks T
and performance measure P, if its performance at tasks in T, as measured by P, improves with
experience E." Tom Mitchell (1997)

§ Types of Machine Learning


§ Supervised learning
§ Unsupervised learning
§ Reinforcement learning
§ Recommender system
Artificial Intelligence
§ Definition
§ Alan Turing, "The Imitation Game" (1950) => Turing test
§ John McCarthy, Dartmouth Artificial Intelligence Conference (1956)

§ Information Processing Language (1955)


§ From axiom to theory
§ Heuristics to reduce probing space
§ Born of LISP programming language

§ First approach: IF-THEN rules

§ Probe every possible case and choose the pathway with the highest fitness
Artificial Neural Network: Basics
§ Effect of layers

A. K. Jain, J. Mao, K. M. Mohiuddin (1996), "Artificial Neural Networks: A Tutorial," IEEE Computer 29
Winter was coming
§ First winter (1970s)
§ Complex problems: too difficult to construct logic models (by hand)

§ Second winter (1990s)


§ Overfitting problem → pre-training, supervised backpropagation → dropout (2013)
§ Convergence → vanishing gradient problem (1991)
§ Divergence problem → weight decay / sparsity regularization
§ Slow training speed → IT evolution, mini-batch

§ And the spring: Environmental changes open the gate


§ Rise of big-data
§ Phenomenal computation cost reduction
Deep Learning: flower of the golden era
§ What if you have enough money to do (formerly) crazy experiments? Like:
§ Increase the number of hidden layers
§ Pour in unlimited amounts of data

§ Breakthrough of deep learning


§ Geoffrey Hinton (2005)
§ Andrew Ng (2012)

§ Convolutional Neural Network


§ Pooling layer + weight
§ Recurrent Neural Network
§ Feedforward routine with (long/short) term memory
§ Deep Belief Network
§ Multipartite neural network with generative model
§ Deep Q-Network
§ Using deep learning for reinforcement learning
AlphaGo as a mixture of Machine Learning techniques
§ Reducing search space
§ Breadth reduction
§ And depth reduction
§ Prediction
§ 13-layer convolutional NN

§ Value network
§ Policy network
§ Principal variation
Flow of TensorFlow
Less than two years have passed.
TensorFlow
§ Open-source software library for machine learning across a range of tasks
§ Developed by Google (Dec. 2015~)

§ Characteristics
§ Python API (like Theano)
§ Since 1.0, TensorFlow expands native API bindings to Java, C, etc.
§ Supports
§ Linux, macOS
§ NVidia GPUs (Pascal and above)
Before TensorFlow
§ User-friendly Deep-learning toolkits
§ Caffe (2012)
§ Generalized programming method to researchers
§ Provides common NN blocks
§ Configuration file + training kernel program
§ Theano (2013~2017)
§ User code / configuration part is written in Python
§ Keras (2015~)
§ Meta-framework for Deep Learning programming
§ Supports various backends:
§ Theano (default) / TensorFlow (2016~) / MXNet (2017~) / CNTK (WIP)
§ ETC
§ Paddle, Chainer, DL4J…
TensorFlow: Summary
§ Statistics
§ More than 24,000 commits since Dec. 2015
§ More than 1,140 committers
§ More than 24,000 forks in the last 12 months
§ Dominates Bootstrap! (15,000)
§ More than 6,400 TensorFlow-related repositories created on GitHub

§ Current
§ Complete ML model prototyping
§ Distributed training
§ CPU / GPU / TPU / Mobile support

§ TensorFlow Serving
§ Enables easier inference / model serving
§ XLA compiler (1.0~)
§ Supports various environments / speedups
§ Keras API Support (1.2~)
§ High-level programming API
§ Keras-compatible API
§ Eager Execution (1.4~)
§ Interactive mode of TensorFlow
§ Treats TensorFlow Python code as real Python code

https://www.infoworld.com/article/3233283/javascript/at-github-javascript-rules-in-usage-tensorflow-leads-in-forks.html
TensorFlow: Summary
2016
⏤ TFLearn (contrib)
⏤ SyntaxNet
⏤ Multi GPU support
⏤ SKLearn (contrib)
⏤ Distributed TensorFlow
⏤ OpenCL w/ OpenCompute
⏤ TensorFlow Serving
⏤ TensorFlow Slim
⏤ Mobile TensorFlow
2017
⏤ XLA
⏤ Keras API
⏤ DRAGNN
⏤ TensorFlow TimeSeries
⏤ TensorFlow Datasets
⏤ Eager Execution
⏤ TensorFlow Lite
How TensorFlow works
§ CPU
§ Multiprocessor
§ AVX-based acceleration
§ GPU part in chip
§ OpenMP
§ GPU
§ CUDA (NVidia) ➜ cuDNN
§ OpenCL (AMD) ➜ ComputeCpp / ROCm
§ TPU (1st, 2nd gen.)
§ ASIC for accelerating matrix calculation
§ In-house development by Google

https://www.tensorflow.org/get_started/graph_viz
How TensorFlow works
§ Python, but not Python
§ The Python API is the default API for TensorFlow
§ However, the TF core is written in C++, with the cuDNN library (for GPU acceleration)

§ Computation Graph
§ User TF code is not really code:
§ it is a configuration that generates the computation graph
§ Session
§ Creates the computation graph and runs the training using the C++ core
§ Tedious debugging process
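To make the graph/session split concrete, here is a minimal TF 1.x sketch (not from the talk itself): the first two lines only declare graph nodes; the multiplication actually happens inside sess.run().

import tensorflow as tf

a = tf.constant(3.0)
b = a * 2.0                     # "b" is a graph node, not the number 6.0
print(b)                        # Tensor("mul:0", shape=(), dtype=float32)

with tf.Session() as sess:      # the C++ core executes the graph here
    print(sess.run(b))          # 6.0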
How TensorFlow works

Google I/O 2017 / TensorFlow Frontiers


TensorFlow Features
§ Recent TensorFlow core features
§ TensorFlow Estimators
§ Included in 1.4 (Oct. 2017) / a high-level API for modeling with well-known estimators (a minimal sketch follows this list)
§ TensorFlow Serving (independent project)
§ TensorFlow Keras-compatible API (Sep. 2017)
§ Included in 1.3 (Sep. 2017)
§ TensorFlow Datasets
§ Included in 1.4 (Oct. 2017)

§ Upcoming/testing TensorFlow core features


§ TensorFlow eager execution
§ Introduced in 1.4 (Oct. 2017)
§ TensorFlow Lite
§ (Work-in-progress)
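As a hedged illustration of the Estimator workflow: the canned LinearRegressor and numpy_input_fn below are standard TF 1.4 APIs, but the feature column name and toy data are made up for this sketch.

import numpy as np
import tensorflow as tf

# Canned estimator over a single numeric feature column.
feature_columns = [tf.feature_column.numeric_column("x", shape=[1])]
estimator = tf.estimator.LinearRegressor(feature_columns=feature_columns)

# Toy data: y = -x + 1.
x_train = np.array([[1.], [2.], [3.], [4.]], dtype=np.float32)
y_train = np.array([0., -1., -2., -3.], dtype=np.float32)

input_fn = tf.estimator.inputs.numpy_input_fn(
    {"x": x_train}, y_train, batch_size=4, num_epochs=None, shuffle=True)
estimator.train(input_fn=input_fn, steps=1000)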
XLA: linear algebra compiler for TensorFlow

Google I/O 2017 / TensorFlow Frontiers


TensorFlow Serving
§ Serving system for inference service
§ Components
§ Servables
§ Loaders
§ Managers

§ Features
§ Model building
§ Model versioning
§ Model saving / loading
§ Online inference support with RPC
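A minimal sketch of the exporting side, using the stock TF 1.x SavedModel API (the export path and the toy graph are assumptions for illustration):

import tensorflow as tf

export_dir = "/tmp/my_model/1"   # version subdirectory, as TF Serving expects
builder = tf.saved_model.builder.SavedModelBuilder(export_dir)

with tf.Session(graph=tf.Graph()) as sess:
    x = tf.placeholder(tf.float32, shape=[None, 1], name="x")
    w = tf.Variable([[2.0]], name="w")       # toy "trained" weight
    y = tf.matmul(x, w, name="y")
    sess.run(tf.global_variables_initializer())
    builder.add_meta_graph_and_variables(
        sess, [tf.saved_model.tag_constants.SERVING])

builder.save()

TensorFlow Serving can then watch the parent directory and hot-load new numbered version subdirectories, which is what enables the model versioning listed above.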
Keras-compatible API for TensorFlow
§ Keras ( https://keras.io )
§ High-level API
§ Focus on user experience
§ “Deep learning accessible to everyone”
§ History
§ Announced in Feb. 2017
§ Bundled as a contrib package since TF 1.2
§ Official core package since 1.4
§ Characteristics
§ "Simplified workflow for TensorFlow users, more powerful features to Keras users"
§ Most Keras code runs on TensorFlow as-is (changing keras.* to tf.keras.*)
§ Keras code can be mixed with TensorFlow code
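For instance, a minimal tf.keras sketch (the layer sizes are arbitrary; this is ordinary Keras Sequential code under the tf.keras namespace):

import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='sgd',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(x_train, y_train, epochs=5)  -- trained like any Keras model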
TensorFlow Datasets
§ New way to generate data pipeline
§ Dataset classes
§ TextLineDataset
§ TFRecordDataset
§ FixedLengthRecordDataset
§ Iterator
Example: Decoding and resizing image data

# Reads an image from a file, decodes it into a dense tensor, and resizes it
# to a fixed shape.
def _parse_function(filename, label):
    image_string = tf.read_file(filename)
    image_decoded = tf.image.decode_image(image_string)
    image_resized = tf.image.resize_images(image_decoded, [28, 28])
    return image_resized, label

# A vector of filenames.
filenames = tf.constant(["/var/data/image1.jpg", "/var/data/image2.jpg", ...])

# `labels[i]` is the label for the image in `filenames[i]`.
labels = tf.constant([0, 37, ...])

dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))
dataset = dataset.map(_parse_function)
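To actually consume the pipeline, one would add an iterator; a minimal sketch (the batch size is arbitrary):

dataset = dataset.batch(32)
iterator = dataset.make_one_shot_iterator()
images, labels = iterator.get_next()

with tf.Session() as sess:
    first_images, first_labels = sess.run([images, labels])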
Eager execution
§ Announced on Oct. 30, 2017
§ Makes TensorFlow execute operations immediately
§ Returns concrete values
§ Provides
§ A NumPy-like library for numerical computation
§ Support for GPU acceleration and automatic differentiation
§ A flexible platform for machine learning research and experiments
§ Advantages
§ Python debugger tools
§ Immediate error reporting
§ Easy control flow
§ Python data structures
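In the 1.4 preview, eager mode lives in tf.contrib.eager and is switched on once at program start; a minimal sketch:

import tensorflow as tf
import tensorflow.contrib.eager as tfe

tfe.enable_eager_execution()   # must run before any other TF operation

x = tf.constant([[2.]])
print(tf.matmul(x, x))         # tf.Tensor([[4.]], ...) -- no Session needed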
Example: Session

Graph (Session) mode:

x = tf.placeholder(tf.float32, shape=[1, 1])
m = tf.matmul(x, x)

print(m)
# Tensor("MatMul:0", shape=(1, 1), dtype=float32)

with tf.Session() as sess:
    m_out = sess.run(m, feed_dict={x: [[2.]]})
    print(m_out)
# [[4.]]

Eager execution:

x = [[2.]]
m = tf.matmul(x, x)

print(m)
# tf.Tensor([[4.]], dtype=float32, shape=(1,1))
Example: Instant error

x = tf.gather([0, 1, 2], 7)

InvalidArgumentError: indices = 7 is not in [0, 3) [Op:Gather]


Example: removing metaprogramming

Graph (Session) mode:

x = tf.random_uniform([2, 2])

with tf.Session() as sess:
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            print(sess.run(x[i, j]))

Eager execution:

x = tf.random_uniform([2, 2])

for i in range(x.shape[0]):
    for j in range(x.shape[1]):
        print(x[i, j])
Eager execution: Python Control Flow

a = tf.constant(6)
while not tf.equal(a, 1):
    if tf.equal(a % 2, 0):
        a = a / 2
    else:
        a = 3 * a + 1
    print(a)

# Outputs
tf.Tensor(3, dtype=int32)
tf.Tensor(10, dtype=int32)
tf.Tensor(5, dtype=int32)
tf.Tensor(16, dtype=int32)
tf.Tensor(8, dtype=int32)
tf.Tensor(4, dtype=int32)
tf.Tensor(2, dtype=int32)
tf.Tensor(1, dtype=int32)
Eager execution: Gradients

def square(x):
    return tf.multiply(x, x)  # Or x * x

grad = tfe.gradients_function(square)

print(square(3.))  # tf.Tensor(9., dtype=tf.float32)
print(grad(3.))    # [tf.Tensor(6., dtype=tf.float32)]
Eager execution: Gradients

def square(x):
    return tf.multiply(x, x)  # Or x * x

grad = tfe.gradients_function(square)
gradgrad = tfe.gradients_function(lambda x: grad(x)[0])

print(square(3.))    # tf.Tensor(9., dtype=tf.float32)
print(grad(3.))      # [tf.Tensor(6., dtype=tf.float32)]
print(gradgrad(3.))  # [tf.Tensor(2., dtype=tf.float32)]
Eager execution: Custom Gradients

def log1pexp(x):
    return tf.log(1 + tf.exp(x))

grad_log1pexp = tfe.gradients_function(log1pexp)

print(grad_log1pexp(0.))
# Works fine, prints [0.5]
Eager execution: Custom Gradients

def log1pexp(x):
    return tf.log(1 + tf.exp(x))

grad_log1pexp = tfe.gradients_function(log1pexp)

print(grad_log1pexp(100.))
# [nan] due to numerical instability: tf.exp(100.) overflows
Eager execution: Custom Gradients

@tfe.custom_gradient
def log1pexp(x):
    e = tf.exp(x)
    def grad(dy):
        return dy * (1 - 1 / (1 + e))
    return tf.log(1 + e), grad

grad_log1pexp = tfe.gradients_function(log1pexp)

# Gradient at x = 0 works as before.
print(grad_log1pexp(0.))    # [0.5]
# And now gradient computation at x = 100 works as well.
print(grad_log1pexp(100.))  # [1.0]
Eager execution: Using GPUs

tf.device() for manual placement:

with tf.device("/gpu:0"):
    x = tf.random_uniform([10, 10])
    y = tf.matmul(x, x)
    # x and y reside in GPU memory
Eager execution: Building Models

The same APIs as graph building (tf.layers, tf.train.Optimizer, tf.data, etc.)

model = tf.layers.Dense(units=1, use_bias=True)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
Eager execution: Building Models

model = tf.layers.Dense(units=1, use_bias=True)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)

# Define a loss function
def loss(x, y):
    return tf.reduce_mean(tf.square(y - model(x)))
Eager execution: Training Models

Compute and apply gradients:

grad_fn = tfe.implicit_gradients(loss)

for (x, y) in get_next_batch():
    optimizer.apply_gradients(grad_fn(x, y))
Comparison

                         TensorFlow  TFLearn  TF Slim  TF Eager   Keras         Keras            PyTorch  CNTK  MXNet
                                                       Execution  (TF backend)  (MXNet backend)
Difficulty               ■■■■        ■■■      ■■       ■■         ■■            ■■■              ■        ■■■■  ■■■■
Extensibility            ■■■■        ■■■■     ■■■■     ■■         ■■            ■■               ■        ■■■■  ■■■■
Interactive mode         X           X        X        O          X             X                O        X     X
Multi-CPU (NUMA)         O           O        X        X          O             O                O        O     O
Multi-CPU (Cluster)      O           O        O        X          O             O                X        O     O
Multi-GPU (single node)  O           O        O        X          ?*            O                O        O     O
Multi-GPU (Cluster)      O           O        O        X          O             O                X        O     O

* manual multi-batch
TensorFlow Lite
§ TensorFlow Lite: Embedded TensorFlow
§ No additional environment installation required
§ OS-level hardware acceleration
§ Leverages Android NN
§ XLA-based optimization support
§ Enables binding to various programming languages

§ Developer Preview (4 days before this talk)
§ Part of Android O-MR1

Google I/O 2017 / Android meets TensorFlow


TensorFlow Lite
§ Format
§ FlatBuffers instead of ProtocolBuffers
§ Provides converter
§ Models
§ InceptionV3
§ MobileNets: vision-specific model family
§ API
§ Java
§ C++
TensorFlow Lite: Why and How
§ Why? Less traffic / faster response
§ Image / OCR, Speech <-> Text, Translation, NLP
§ Motion, GPS and more

§ ML can extract the meaning from raw data
§ Image recognition: send the raw image vs. send the detected label
§ Motion detection: send raw motion vs. send a feature vector

§ How? Model compression (a graph-freezing sketch follows this slide)
§ Graph freezing
§ Graph conversion tools
§ Quantization
§ Weight
§ Calculation
§ Memory mapping

Google I/O 2017 / Android meets TensorFlow
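To make "graph freezing" concrete, a hedged sketch using stock TF 1.x utilities (the toy graph and the "output" node name are assumptions for illustration):

import tensorflow as tf

with tf.Session() as sess:
    x = tf.placeholder(tf.float32, shape=[None, 1], name="input")
    w = tf.Variable([[2.0]], name="weight")     # stands in for trained weights
    y = tf.matmul(x, w, name="output")
    sess.run(tf.global_variables_initializer())

    # Fold the variables into constants...
    frozen = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph_def, ["output"])
    # ...and write a single self-contained graph file.
    tf.train.write_graph(frozen, "/tmp", "frozen.pb", as_text=False)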


Android Neural Network API
§ New APIs for neural networks
§ Part of the Android Framework
§ Available since the next Android release
§ Reduces library duplication across apps
§ Supports hardware acceleration
§ GPU, DSP, ISP, NeuralNet chips, etc.

Google I/O 2017 / Android meets TensorFlow


Flow goes to: market
What is flowing through the stream?
Market: API-based (personalized) deep learning service
§ Service with pre-baked models via API
§ Focuses on fields that do not require real-time responses
§ e.g. Microsoft Azure Cognitive Services
§ Pre-trained ANN + personalized data = personalized NN
§ Easy personalization: server-side training
Market: User-side deep learning services
§ Inference with trained models
§ Does not require heavy calculation
§ e.g. ARMv7 with ~512MB / 1GB RAM

§ Toys / light products
§ Smart toys for kidults (adults + kids): self-driving R/C cars / drones
§ Home appliances and controllers

§ IoT + ML
§ Locality: home (per room), car, office, etc.
§ e.g. smart home resource management systems
Market: Deep Learning services for everyone
§ The digital assistant war
§ Digital assistants (with speakers): the gateway to deep-learning-based services
§ Context extraction + inference + features

§ Echo (Amazon) / Google Home (Google)
§ Microsoft (Cortana in every MS product) / Apple (HomePod)

§ Korea? Also entering the battlefield
§ Naver: Wave / Friends
§ Kakao: Kakao Mini
§ SK: Nugu
Flow goes to: tech.
What is flowing through the stream?
Portability and extensibility
§ Training on
§ Mac / Windows
§ GPU servers
§ GPU / TPU on Cloud

§ Prediction / Inference using
§ Android / iOS
§ Raspberry Pi and TPU
§ Android Things

Google I/O 2017 / Android meets TensorFlow


Open-source Machine Learning Framework
§ Machine Learning frameworks: (almost all) open-source
§ Google: TensorFlow (2015~)
§ Microsoft: CNTK (2016~)
§ Amazon: MXNet (2015~)
§ Facebook: Caffe2 (2017~) / PyTorch (2016~)
§ Baidu: PaddlePaddle (2016~)

§ Why?
§ 2017
§ General goal of new versions: user-friendly syntax
§ The rise of Keras and PyTorch led to TensorFlow's eager execution
Server-side machine learning
§ Machine learning workload characteristics
§ Training
§ Requires ultra-heavy computation resources
§ Needs to feed big, indexed data
§ OR (reinforcement learning) needs a paired model / training environment to give feedback
§ Serving
§ Requires (relatively) light resources:
§ Low CPU cost
§ Medium memory capacity (to load the NeuralNet)
TensorFlow: Multiverse
§ TensorFlow AMD GPU acceleration
§ OpenCL with ComputeCpp (Feb. 2017)
§ Accelerates C++ code (Codeplay)
§ Khronos support / SYCL standard
§ Still in an early stage
§ Only supports Linux

§ ROCm (AMD) based TensorFlow (Sep. 2017)
§ First open-source HPC/Hyperscale-class platform for GPU computing
§ LLVM based / HCC C++ / GCN compiler
§ https://github.com/ROCmSoftwarePlatform/hiptensorflow
Hand-held machine learning: Why?
§ Issues from real-time models / apps
§ Autopilot
§ Real-time effects on photos / videos
§ Voice recognition
§ Automators

§ Privacy issues
§ Increasing amounts of private information

§ ETC
§ Leads to network cost reduction
Hand-held machine learning: How?
§ Apple's approach
§ Keeping user privacy with Differential Privacy
§ Gathers anonymized user data
§ User-specific machine learning models: kept on the phone
§ e.g. photo face detection / voice recognition / smart keyboard
§ Core ML (iOS 11)
§ Supports Machine Learning models as functions (.mlmodel format)

§ Google's approach
§ Ultra-large-scale server-side training using TPUs (2nd gen.)
§ Mobile: handles data compression and feature extraction (to reduce traffic)
§ On the mobile:
§ Android NeuralNet API (Android O)
§ TensorFlow Lite on Android (Android O)

https://backchannel.com/an-exclusive-look-at-how-ai-and-machine-learning-work-at-apple-8dbfb131932b
Hand-held machine learning: How?
§ Train on server, serve on smartphone
§ Enough to serve pre-trained models on smartphones

§ Both train and serve on smartphone
§ Keeps privacy / reduces traffic / enables personalization
§ Uses GPUs on recent smartphones

§ Working together
§ Feature extraction / compression / preprocessing ‒ mobile side
§ Machine Learning model training / updating / streaming advanced models ‒ server side
Hand-held machine learning: How?
§ TensorFlow
§ Supports both Android and iOS
§ Xcode and Android Studio
§ XLA compiler framework since TensorFlow 1.0:
§ Will support diverse languages / environments
§ Also optimizes for smartphones and tablets
§ MobileNet (Apr. 2017)
§ Efficient Convolutional Neural Networks for Mobile Vision Applications
§ TensorFlow Lite (Nov. 2017): development focus
§ Built-in operators for both quantized models (8-bit int / fixed point) and floating-point models (FP32, FP16)
§ Support for embedded GPUs / ASICs
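To illustrate what quantization means here, a hedged sketch of linear 8-bit quantization (the scale/zero-point scheme below is the generic textbook version, not necessarily TFLite's exact spec):

import numpy as np

# Toy weights to quantize (made-up values).
w = np.array([-1.0, 0.0, 0.5, 2.0], dtype=np.float32)

# Affine mapping: real_value ~= (q - zero_point) * scale, with q in [0, 255].
scale = (w.max() - w.min()) / 255.0
zero_point = int(round(-w.min() / scale))

q = np.round(w / scale + zero_point).astype(np.uint8)   # quantize
w_hat = (q.astype(np.float32) - zero_point) * scale     # dequantize

print(q)      # e.g. [  0  85 128 255]
print(w_hat)  # close to w, within one quantization step

Eight-bit weights cut the model size roughly 4x versus FP32 and map onto fast integer kernels, which is why quantization matters so much on phones.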
Browser-side machine learning
§ Machine Learning without hassle
§ Ingredients for machine learning: computation, data, algorithms
§ XLA: provides binary-code-level optimization for various environments
§ Do we have a cross-platform computation environment?
§ Java?
§ The browser!

§ Recent improvements in web browsers
§ WebGL
§ A unified programming environment for many GPU-enabled machines
§ WebAssembly
§ Binary-level optimization
§ Shipped in every mainstream browser! (just this week)
Convertible NeuralNet format
§ ONNX (Open Neural Network Exchange)
§ Microsoft / Facebook (Sep. 2017)
§ Caffe2, PyTorch (by Facebook), CNTK (Microsoft)

§ MLMODEL (Core ML Machine Learning Model)
§ Apple (Aug. 2017)
§ Caffe, Keras, scikit-learn, LIBSVM (open source)
§ Provides a Core ML converter / specification
Recap
§ Machine Learning / Artificial Intelligence
§ Flow of TensorFlow
§ TensorFlow Serving Project
§ Keras-compatible API
§ Datasets

§ Eager execution
§ TensorFlow Lite
§ Flow goes to
§ More user-friendly toolkits / frameworks
§ API-based / personalized
§ User-side inference / Hand-held ML
§ Convertible Machine Learning Model formats
End!
Thank you for listening
Lablup Inc. https://www.lablup.ai
Backend.AI https://backend.ai
Backend.AI Cloud https://cloud.backend.ai
CodeOnWeb Service https://www.codeonweb.com
Github repository https://github.com/lablup
