Inescapable trends
Data volumes increase drastically
Applications demand ever more complex algorithms
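[Figure: computational cost versus data volume. Cost axis: O(N), O(N^2), O(N^3). Volume axis: MB, GB, TB, PB. Labeled workloads: simple DB queries, information retrieval, clustering, graph analytics, dimensionality reduction, uncertainty quantification, knowledge graph creation. Labeled platforms: Hadoop, HPC.]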
[Figure: Gather, Connect, Reason, Adapt cycle. Labels: Information, Data, NoSQL, SQL, ESB, Context, Knowledge, Intelligence, Decisions & Actions.]
Gather
Connect
Reason
Analyze data in context to uncover hidden information and find new relationships.
Analytics both add to context via metadata extraction and use context to broaden the information exploited.
Adapt
Compose recommended interactions and use context to deliver them to the point of action.
Suggest material properties, suggest simulations.
[Figure: KNOWLEDGE DISCOVERY pipeline. Stages as labeled:
Start: unstructured documents (patents, scientific papers, technical reports)
Watson Core ingestion pipeline: standard infrastructure to handle ingestion, table narration & annotation
Domain definition: dictionary, domain-specific annotators
Hand off: facts & values extracted by annotators
Knowledge graph: repository for extracted information
Discovery Advisor & Graph Analytics: UI, query, inference & discovery
End: domain expert with Watson user experience]
Demo data:
Curated ground truth
Mentions (highlighted): Words annotated by entity type: sample, form, process, property,
unit, value, subvalue
Relations (arrows): Mentions mapped to one another
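The mention and relation annotations described above can be pictured as a small data structure. The following is a minimal sketch, not the annotators' actual format: the class names `Mention` and `Relation`, the helper `make_mention`, and the example sentence are all hypothetical and chosen only for illustration. The entity types are the ones listed on the slide.

```python
from dataclasses import dataclass

# Entity types listed on the slide.
ENTITY_TYPES = {"sample", "form", "process", "property", "unit", "value", "subvalue"}


@dataclass(frozen=True)
class Mention:
    """A span of text annotated with one entity type (hypothetical layout)."""
    text: str
    entity_type: str
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive

    def __post_init__(self):
        if self.entity_type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {self.entity_type}")


@dataclass(frozen=True)
class Relation:
    """A directed link mapping one mention to another."""
    source: Mention
    target: Mention


def make_mention(sentence: str, text: str, entity_type: str) -> Mention:
    """Locate the first occurrence of `text` in `sentence` and wrap it as a Mention."""
    start = sentence.index(text)
    return Mention(text, entity_type, start, start + len(text))


# Illustrative sentence, not taken from the demo data.
sentence = "The alloy has a hardness of 320 HV."
sample = make_mention(sentence, "alloy", "sample")
prop = make_mention(sentence, "hardness", "property")
value = make_mention(sentence, "320", "value")
unit = make_mention(sentence, "HV", "unit")

relations = [Relation(sample, prop), Relation(prop, value), Relation(value, unit)]
```

Storing character offsets alongside the text keeps each mention traceable to its position in the source document, which is what a highlighted view needs.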
SimGet: Architecture
Scale-out service to exploit word embeddings
[Figure: timeline, 1965 to 2015, in ten-year steps. Labels: Research, Watson Research, Watson Solutions.]
Queries
Sample queries of interest in materials science
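Chemical composition similarity estimator
[Figure: flowchart. Inputs: ALLOY_1 chemical composition extracted from text; ALLOY_2 chemical composition extracted from text. Each yields a VALUE; the comparison is tested against a Threshold with YES / NO branches. Additional label: CONT. VALUE.]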
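A composition comparison with a threshold, as in the flowchart above, could look like the sketch below. The slide does not say which similarity measure is used; cosine similarity over element fractions is an assumption made here for illustration, and the function names, the threshold of 0.95, and the example compositions are hypothetical.

```python
import math


def composition_similarity(comp_a: dict, comp_b: dict) -> float:
    """Cosine similarity between two compositions given as element -> fraction.

    Cosine similarity is an assumed choice; the slide does not name the measure.
    """
    elements = set(comp_a) | set(comp_b)
    dot = sum(comp_a.get(e, 0.0) * comp_b.get(e, 0.0) for e in elements)
    norm_a = math.sqrt(sum(v * v for v in comp_a.values()))
    norm_b = math.sqrt(sum(v * v for v in comp_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty composition is similar to nothing
    return dot / (norm_a * norm_b)


def are_similar(comp_a: dict, comp_b: dict, threshold: float = 0.95) -> bool:
    """YES/NO decision: is the continuous similarity value at or above the threshold?"""
    return composition_similarity(comp_a, comp_b) >= threshold


# Illustrative compositions, not extracted from any document.
alloy_1 = {"Fe": 0.70, "Cr": 0.20, "Ni": 0.10}
alloy_2 = {"Fe": 0.68, "Cr": 0.20, "Ni": 0.12}
alloy_3 = {"Al": 0.90, "Cu": 0.10}
```

Returning the continuous value separately from the thresholded decision lets a caller use either one, depending on whether a ranking or a yes/no answer is needed.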
We run text extraction on a set of documents and obtain document-type nodes.
The document nodes link process Proc_1 to each alloy separately:
PROC_1
ALLOY_1
ALLOY_2
Query: Find all alloys for which Proc_1 is used and for which certain properties hold
Action on graph:
Start from the node Proc_1 and visit its neighbors
The alloy-type nodes found this way that fulfill the user-defined criteria are the answers
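The graph action described above can be sketched as a short breadth-first traversal. This is a toy in-memory illustration, not the graph database used in the system: the node names `Doc_A` and `Doc_B`, the `hardness` property and its values, and the `max_hops` parameter are hypothetical. Two hops are allowed because the document nodes sit between the process and the alloys.

```python
from collections import deque

# Hypothetical toy graph: each node has a type and optional properties.
nodes = {
    "Proc_1": {"type": "process"},
    "Doc_A": {"type": "document"},
    "Doc_B": {"type": "document"},
    "Alloy_1": {"type": "alloy", "hardness": 320},
    "Alloy_2": {"type": "alloy", "hardness": 150},
}
# Document nodes link the process to each alloy separately.
edges = [
    ("Proc_1", "Doc_A"), ("Doc_A", "Alloy_1"),
    ("Proc_1", "Doc_B"), ("Doc_B", "Alloy_2"),
]


def build_adjacency(edge_list):
    """Build an undirected adjacency map from a list of node pairs."""
    adjacency = {}
    for a, b in edge_list:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def find_alloys(start, criteria, max_hops=2):
    """Return alloy nodes within `max_hops` of `start` whose properties satisfy `criteria`."""
    adjacency = build_adjacency(edges)
    seen = {start}
    queue = deque([(start, 0)])
    answers = []
    while queue:
        node, depth = queue.popleft()
        props = nodes[node]
        if props["type"] == "alloy" and criteria(props):
            answers.append(node)
        if depth < max_hops:
            for neighbor in adjacency.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append((neighbor, depth + 1))
    return sorted(answers)


# User-defined criterion (illustrative): hardness of at least 200.
hard_alloys = find_alloys("Proc_1", lambda p: p["hardness"] >= 200)
```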
2015 IBM Corporation
System Architecture
GRAPH DB
SUBGRAPH SELECTION
DEMO
INVERSE MATERIALS DESIGN
Roadmap
[Figure: roadmap chart, 2011 to 2020+. Labels recovered from the chart:
Time axis: 2011, 2012, 2015, 2016/2018, 2020+
Baseline BLAS3, single/double precision, sensor power measurement: 546 s / 104 kJ, 210 W peak / 140 W idle
Transprecision computing, sensor power measurement: 54 s / 9.6 kJ, 170 W peak / 140 W idle
Transprecision computing, single-double precision, libraries: ~5 s
Fixed point / 8b / 16b / 32b precision, reduced data accuracy, stochastic data transfer: projection 0.5-1 s / 0.05 kJ
Custom stochastic arithmetic, data movement & storage: projection 0.1 s / 0.001 kJ
Platforms named: POWER7, POWER7 + FPGA, POWER8 + K80, POWER8 + ASIC, new technology
Improvement labels: 10x, 50x / 500x, 100x, 1000x / 2000x]
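The two measured data points on the roadmap can be checked with a few lines of arithmetic. The sketch below only uses the figures stated on the slide (546 s / 104 kJ for the baseline, 54 s / 9.6 kJ for the transprecision run); the variable and function names are illustrative.

```python
# Measured figures as stated on the roadmap slide.
baseline = {"seconds": 546.0, "kilojoules": 104.0}
transprecision = {"seconds": 54.0, "kilojoules": 9.6}


def improvement(before: dict, after: dict) -> dict:
    """Ratio of time to solution and of energy to solution between two runs."""
    return {
        "time": before["seconds"] / after["seconds"],
        "energy": before["kilojoules"] / after["kilojoules"],
    }


ratios = improvement(baseline, transprecision)
print(f"time:   {ratios['time']:.1f}x faster")
print(f"energy: {ratios['energy']:.1f}x less energy")
```

Both ratios come out slightly above ten, which is in line with the "10x" label on the chart.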
IBM Research
[Figure: CG power measurement, maximum 179 W.]
In general, consider two classes of platform:
LP: low precision, low cost, low power
HP: high precision, high power
Let SLV(A,y,) be an LP procedure that approximately solves Ax=b
SLV: Analog? Neuromorphic (spikes)? Neural nets? Machine learning?
1.
Key properties:
Overall cost: O(n^2) instead of O(n^3)
Most of the arithmetic is performed on the low-power platform
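One classical scheme with these properties is mixed-precision iterative refinement: an inexact low-precision solver does the heavy work, and the high-precision side only computes residuals and updates, which cost O(n^2) per step. The slide does not spell out its algorithm, so the sketch below is an assumed reading, not the authors' method. Here `slv_low_precision` stands in for SLV and is simulated with a float32 solve in NumPy; the function names, tolerance, and iteration limit are hypothetical.

```python
import numpy as np


def slv_low_precision(A: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Stand-in for the LP procedure SLV: solve A z = y approximately in float32.

    A real LP platform might factor A once and reuse the factors; this sketch
    simply re-solves on every call for brevity.
    """
    z = np.linalg.solve(A.astype(np.float32), y.astype(np.float32))
    return z.astype(np.float64)


def refine(A: np.ndarray, b: np.ndarray, tol: float = 1e-12, max_iter: int = 20) -> np.ndarray:
    """Iterative refinement: HP residuals (O(n^2) each), LP correction solves."""
    x = np.zeros_like(b, dtype=np.float64)
    for _ in range(max_iter):
        r = b - A @ x  # high-precision residual, one matrix-vector product
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        x = x + slv_low_precision(A, r)  # low-precision correction
    return x


# Illustrative well-conditioned test system (random, diagonally shifted).
rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n)) + n * np.eye(n)
b = rng.standard_normal(n)
x = refine(A, b)
```

The refinement converges quickly only when the LP solve is accurate enough relative to the conditioning of A, which is why the test system is deliberately well conditioned.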
Learning approaches
Machine Learning / Statistical approach
Neural Networks
Neuromorphic approaches
Spike computing to simulate numerics
Hardware approaches
Accelerators (GPUs)
FPGAs
SPDs
Low reliability hardware (low voltage)
Examples...
Learning / stochastic approach: Reduce dimension by random sampling
XDATA DARPA PROJECT (2012-2016)
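As one illustration of reducing dimension with randomness, the sketch below applies a Gaussian random projection, which approximately preserves pairwise distances. The slide does not say which sampling scheme was used in the project, so this is a swapped-in, commonly used technique rather than the project's method; the function name `random_project` and the sizes are hypothetical.

```python
import numpy as np


def random_project(X: np.ndarray, k: int, seed: int = 0) -> np.ndarray:
    """Map rows of X from d dimensions to k dimensions with a random Gaussian matrix.

    Scaling by 1/sqrt(k) keeps squared distances unchanged in expectation.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    R = rng.standard_normal((d, k)) / np.sqrt(k)
    return X @ R


# Illustrative data: 20 points in 2000 dimensions, reduced to 256.
X = np.random.default_rng(1).standard_normal((20, 2000))
Y = random_project(X, k=256)

# Compare one pairwise distance before and after projection.
original = np.linalg.norm(X[0] - X[1])
projected = np.linalg.norm(Y[0] - Y[1])
ratio = projected / original
```

The ratio stays close to 1 for moderate k, so distance-based analytics such as clustering can run on the smaller matrix at a fraction of the cost.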
[Figure: inverse materials design loop. Labels: Input / Constraints, Big Data, Simulations, Theory, Output (i.e. new material/device process conditions).]