
Deep Learning – a cookbook view

or “Comparative Analysis of Different Deep Learning Solutions”


The evolution of artificial intelligence

Early artificial intelligence (1940 – 1980)
Small data sets; statistical and mathematical models applied to solve problems
– ENIAC Heralded the “Giant Brain”; used for WW II ballistics
– Industrial robots

Machine learning (1990 – 2000s)
Massive structured data sets; advanced analytics and heuristics
– Deep Blue Beat World Chess Champion Kasparov
– DARPA Challenge Autonomous vehicle drove 132 miles

Deep learning (Today)
Massive unstructured big data; predictive models defined by machines, based on neural networks
– Unsupervised training
– Generic code
– Pattern recognition
Systems can observe, test and refine
Successes
– AlphaGo First computer Go program to beat a human
– DeepFace Facial verification
– Libratus AI poker app
– Digital virtual assistants (Siri)
– Google self-driving cars


2
Traditional machine learning
Requires feature engineering

[Diagram: nested sets Artificial Intelligence ⊃ Machine Learning ⊃ Deep Learning; HPC]

Training: Data → Feature engineering → Machine learning algorithm → Learned model (prediction function)
Prediction: Data → Feature extraction → Learned model → Prediction

3
Deep learning
Efficient data representations, no more feature engineering

Training: Data → Deep learning algorithm → Learned model (transformation and prediction function)
Prediction (inference): Data → Learned model → Prediction

4
Types of artificial neural networks
Topology to fit data characteristics
Convolutional: images
Fully connected: speech, text, sensor data
Recurrent: speech, text, sensor data

[Diagram: each topology drawn as input → hidden layer 1 → hidden layer 2 → output; the recurrent network with a single recurrent hidden layer]

5
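The parameter-count arithmetic behind matching topology to data can be sketched in a few lines. The layer sizes below are illustrative assumptions, not figures from this deck: a convolutional layer shares one small kernel across the whole image, which is why it suits image data far better than a fully connected layer.

```python
def dense_params(n_in, n_out):
    """Weights plus biases for a fully connected layer."""
    return n_in * n_out + n_out

def conv_params(k_h, k_w, c_in, c_out):
    """One k_h x k_w kernel per (input, output) channel pair, plus biases;
    the kernels are shared across all spatial positions."""
    return k_h * k_w * c_in * c_out + c_out

# A 256x256 RGB image flattened into a 4096-unit fully connected layer:
print(dense_params(256 * 256 * 3, 4096))   # 805310464 (~805M parameters)

# The same input fed through 64 3x3 convolution filters:
print(conv_params(3, 3, 3, 64))            # 1792 parameters
```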
Terminology

[Diagram: training – a model maps training data (e.g. images labeled “Flower”, “House”) to predictions; comparing predictions with the true labels yields the errors used to update the model]

– Batch: the subset of the training data processed in one pass through the model
– Iteration: one model update on one batch
– Epoch: one full pass over the entire training data set

[Diagram: distributed training across workers]
– Strong scaling: the total workload is fixed and split across workers
– Weak scaling: the workload per worker is fixed, so the total grows with the number of workers

6
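The relationship between these terms can be made concrete with a little arithmetic; the data set and batch sizes below are illustrative (roughly ImageNet-1k scale), not figures from the deck.

```python
import math

num_samples = 1_281_167   # training samples (illustrative)
batch_size = 256          # samples processed per iteration

# Epoch: one full pass over the training data.
# Iteration: one model update on one batch.
iterations_per_epoch = math.ceil(num_samples / batch_size)
print(iterations_per_epoch)   # 5005

# Strong scaling: the global batch is fixed and split across workers.
# Weak scaling: each worker keeps a full batch, so the global batch grows.
workers = 8
strong_per_worker = batch_size // workers   # 32 samples per worker
weak_global_batch = batch_size * workers    # 2048 samples in total
print(strong_per_worker, weak_global_batch)
```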
Why deep learning?
Applications

Vision
‒ Search & information extraction
‒ Security/Video surveillance
‒ Self-driving cars
‒ Medical imaging
‒ Robotics

Speech
‒ Interactive voice response (IVR) systems
‒ Voice interfaces (Mobile, Cars, Gaming, Home)
‒ Security (speaker identification)
‒ Health care
‒ People with disabilities

Text
‒ Search and ranking
‒ Sentiment analysis
‒ Machine translation
‒ Question answering

Other
‒ Recommendation engines
‒ Advertising
‒ Fraud detection
‒ AI challenges
‒ Drug discovery
‒ Sensor data analysis
‒ Diagnostic support

7
Applications break down

Data types and example applications
– Images: image analysis
– Video: video surveillance
– Speech: speech recognition
– Text: sentiment analysis
– Sensor: predictive maintenance
– Other: fraud detection

Task types
– Detection: look for a known object/pattern
– Generation: generate content
– Classification: assign a label from a predefined set of labels
– Anomaly detection: look for abnormal, unknown patterns

8
How an individual customer’s AI evolves

Explore: How can AI help me?
Do things better
– Product development
– Customer experience
– Productivity
– Employee experience
Do new things
– New disruptions

Experiment: How can I get started?
Boundary constraints (regulations, etc.)
Data
– Data model? Location?
– How to create a model?
– Homegrown solution or open source?
– Simple ML or scalable DL?
Design
– How to design and deploy the PoC?
– On-prem, cloud?
– How to think about inference

Scale up and Optimize: How can I scale and optimize?
Provisioning for inference
Infrastructure scale up
– Training
– Inference
– On-prem / cloud / hybrid
Data management
– Between edge and core
– Security
– Updates
– Regulations
– Tracing
Performance
– What is the best config to run?
– How to tune the model to improve accuracy?
9
Key IT challenges are constraining deep learning adoption
Limited knowledge, resources and capabilities
How to get started?
“I need simple infrastructure and software capabilities to rapidly and efficiently support deep learning app development.”
→ Immature, sub-optimal foundation

How to go to production?
“I could use more expert advice and tailored solutions for migrating and integrating apps in a production environment.”
→ Inability to scale and integrate

How to optimize?
“I need help integrating the latest technologies into my deep learning environment to accelerate actionable insights.”
→ Lack of technology integration capabilities

Content under embargo until Oct 10, 2017
10


What about AI consumers?

Do it yourself
– Current wave of AI / machine learning is core to their business. All in-house.
– Google, Baidu, Facebook, Microsoft, Apple, etc.

How do I do it?
– Could benefit from better data science and machine learning, but it is not historically their core competency.
– Banks, advertisers, healthcare, manufacturing, food, automotive, etc.
– Not ready for an ASIC. Don’t know exactly what they need. Many still developing on CPUs. Can’t use solutions that can’t be verified or understood.

I know better
– Super-experts: the current wave is woefully inadequate.
– Government: DoD, DoE, NSA, NASA, etc.
– Begging for higher-performance ASICs. Know exactly what they want to do. Strong technology pull.
Where to start?
Recommended DL stack by vertical application

Verticals: voice interfaces, social media, manufacturing, oil & gas, connected cars
Data type: speech, images, video, sensor data
Data size: small, moderate, large
Typical layers: convolutional, fully connected, recurrent, … (the neural network sits here)
Frameworks: TensorFlow, Caffe2, CNTK, Torch, …
Infrastructure: x86, GPUs, FPGAs, TPU?, …

12
Neural networks: popular networks

Network     Model size (# params)   Model size (MB)   GFLOPs (forward pass)
AlexNet      60,965,224             233                0.7
GoogleNet     6,998,552              27                1.6
VGG-16      138,357,544             528               15.5
VGG-19      143,667,240             548               19.6
ResNet50     25,610,269              98                3.9
ResNet101    44,654,608             170                7.6
ResNet152    60,344,387             230               11.3

13
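The model-size column above follows directly from the parameter counts: at 32-bit precision each parameter occupies 4 bytes. A quick sanity check:

```python
# size in MB ≈ params * 4 bytes / 2**20
def model_size_mb(num_params, bytes_per_param=4):
    return num_params * bytes_per_param / 2**20

print(round(model_size_mb(60_965_224)))    # AlexNet  -> 233
print(round(model_size_mb(138_357_544)))   # VGG-16   -> 528
print(round(model_size_mb(25_610_269)))    # ResNet50 -> 98
```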
Today’s scale
Model size, data size, compute requirements

Application   Model (params)       Training data                          FLOPs per epoch
Vision        1.7*10^9 (~6.8 GB)   14*10^6 images: ~2.5 TB (256x256),     6*1.7*10^9*14*10^6 ~ 1.4*10^17
                                   ~10 TB (512x512)
Speech        60*10^6 (~240 MB)    100K hours of audio: ~34*10^9 frames,  6*60*10^6*34*10^9 ~ 1.2*10^19
                                   ~50 TB
Text          6.5*10^6 (~260 MB)   856*10^6 words                         6*6.5*10^6*856*10^6 ~ 3.3*10^16
Signals       1.2*10^6 (~4.8 MB)   3*10^6 frames                          6*1.2*10^6*3*10^6 ~ 6.5*10^13
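The FLOPs column uses the common rule of thumb of roughly 6 FLOPs per parameter per training sample (about 2 for the forward pass and 4 for the backward pass). Reproducing the first three rows:

```python
# ~6 FLOPs per parameter per sample: forward pass (~2) plus backward (~4).
def flops_per_epoch(num_params, num_samples):
    return 6 * num_params * num_samples

vision = flops_per_epoch(1.7e9, 14e6)     # ~1.4e17
speech = flops_per_epoch(60e6, 34e9)      # ~1.2e19
text   = flops_per_epoch(6.5e6, 856e6)    # ~3.3e16
print(f"{vision:.2g} {speech:.2g} {text:.2g}")
```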
Today’s hardware
Model size, data size, compute requirements

Vision example: 1.7*10^9 parameters (~6.8 GB); 14*10^6 images, ~2.5 TB (256x256) or ~10 TB (512x512); 6*1.7*10^9*14*10^6 ~ 1.4*10^17 FLOPs per epoch

1 epoch per hour requires ~39 TFLOPS sustained.

Today’s hardware:
Google TPU2: 180 TFLOPS Tensor ops (FP16?)
NVIDIA Tesla V100: 15 TFLOPS SP (30 TFLOPS FP16, 120 TFLOPS Tensor ops), 16 GB memory
NVIDIA Tesla P100: 10.6 TFLOPS SP, 16 GB memory
NVIDIA Tesla K40: 4.29 TFLOPS SP, 12 GB memory
NVIDIA Tesla K80: 5.6 TFLOPS SP (8.74 TFLOPS SP with GPU boost), 24 GB memory
Intel Xeon Phi: 2.4 TFLOPS SP

Superdome X: ~21 TFLOPS SP, 24 TB memory
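The “~39 TFLOPS” target above is just the vision epoch cost divided by one hour of wall-clock time:

```python
flops_per_epoch = 1.4e17     # vision workload from the scale table
seconds_per_hour = 3600

required_tflops = flops_per_epoch / seconds_per_hour / 1e12
print(round(required_tflops, 1))   # 38.9 TFLOPS sustained
```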


So what to recommend?

Software

Hardware

16
Building performance models

Models: AlexNet, GoogleNet, VGG-16, VGG-19, ResNet 50/101/152, Eng Acoustic Model
Frameworks: TensorFlow, Caffe2, TensorRT, BVLC Caffe
Hardware: systems populated with 8 GPUs; measured with strong and weak scaling across workers
Goal: scalable, automated real-time intelligence
17
TensorFlow – weak scaling – training: performance of different models in TensorFlow, scaling up to 8 GPUs
[Chart: speedup vs. number of GPUs (1, 2, 4, 8) for DeepMNIST, EngAcousticModel, GoogleNet, ResNet50, ResNet101, ResNet152, SensorNet, VGG16, VGG19]

18
TensorFlow – inference (inferences per second): different models with different batch sizes
[Charts: inferences per second vs. batch size (1 to 8192) for DeepMNIST, GoogleNet, ResNet50, VGG19]

How to analyze all the different numbers? As we add more options and more technologies, it becomes impossible to do by hand.
19
HPE demystifies deep learning for faster intelligence across all organizations
New IT expertise, blueprints and technologies to get started, scale, integrate and optimize

Get started rapidly: develop deep learning models
IT expertise and solutions to “get started” with deep learning
– Expertise: rapid technology selection guides; state-of-the-art training
– Solutions: integrated purpose-built solutions; out-of-the-box solutions

Scale and integrate: deliver attractive returns
Proven blueprints and services for “scalable” production deployments
– Proven blueprints: Reference Architectures; innovation labs for best practices
– Services: deploy, integrate and support; flexible, on-demand capacity

Optimize environment: enhance competitive advantage
Technology integration capabilities to maximize performance
– Integration capabilities: enhanced global Centers of Excellence; next-gen technology integration

20
Get started
Select ideal technology configurations with the HPE Deep Learning Cookbook

“Book of recipes” for deep learning workloads
– Comprehensive tool set based on extensive benchmarking
– Includes 11 workloads with 8 DL frameworks and 8 HPE hardware systems
– Estimates workload performance and recommends an optimal HW/SW stack for that workload

Expert advice to get you started
– Informed decision making: optimal hardware and software configurations
– Eliminates the “guesswork”: validated methodology and data
– Improves efficiency: detects bottlenecks in deep learning workloads

Availability of complete toolset
– Deep Learning Benchmarking Suite: available on GitHub Dec 2018
– Deep Learning Performance Analysis Tool: planned for release in the beginning of 2018
– Reference configurations: available soon on the HPE.com website

21
Deep Learning Cookbook helps to pick the right HW/SW stack

Benchmarking Suite
• Benchmarking scripts
• Reference models
• Performance metrics

Knowledgebase
• Performance results
• 11 reference models
• 8 frameworks
• 8 hardware systems

Performance and scalability models
• Machine learning (SVR) to predict performance of core operations
• Analytical communication models
• Analytical models for overall performance

Reporting tool
• Performance results
• Performance prediction for arbitrary ANNs
• Scalability prediction
• Optimal HW/SW configuration for a given workload

Reference configurations
• Image classification
• Others to come

The Benchmarking Suite and Reporting tool will be available externally; the knowledgebase and models remain internal assets.
22
23
Thank you
Natalia Vassilieva – nvassilieva@hpe.com
Sorin Cheran – sorin.cheran@hpe.com
Sergey Serebryakov – sergey.serebryakov@hpe.com
Bruno Monnet – bruno.monnet@hpe.com

24
