Beruflich Dokumente
Kultur Dokumente
Hands-On Labs
Microsoft Azure Subscription
Free trial available in some regions
Client computer
Windows
Linux
Mac OS X
nd
Machine
Learning
Single Threaded
Ingest
Manipulate
Predict
Tasks
Executor
Worker Node
Driver Program
Tasks
Executor
Worker Node
Umbrellas sold
Rainfall / inches
Temperature / F
10
0.2
65.1
56
12
2.1
64.6
70
3.0
67.3
21
1.5
65.3
0.1
65.1
Label
1.0
0.0
Feature
A
B
Feature
We
Are
1.0
1.0
C
A
No
Yes
DATA VOLUME
Validation,
25%
Test, 25%
Training 50%
Common for
DSP
MPEG/JPEG
What is it
Replaces discrete
values with
binned values
Uses a coefficient
matrix to
determine best fit
binned values
import org.apache.spark.ml.feature.QuantileDiscretizer
val metrics = Array((1, 10.2), (2, 17.1), (3, 9.6), (4, 5.0), (5, 3.4))
val df = metrics.toDF(day", rainfall")
val discretizer = new QuantileDiscretizer()
.setInputCol(rainfall")
.setOutputCol(discreterainfall")
.setNumBuckets(3)
val result = discretizer.fit(df).transform(df)
result.show()
Common for
Used for
dimensionality
reduction
Uses Eigenvectors
and eigenvalues
to determine most
relevant features
and rescale
Allows plotting in
2D
Speeds up
calculation
Lose some
information
What is normalization?
Normalization
Clip Values
12
10
0
Day 1
Day 2
Rainfall
Day 3
Temperature
Wind Speed
3.5
2.5
1.5
0.5
0
Day 1
Day 2
Rainfall
Day 3
Temperature
Wind Speed
2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, Office, Azure, System Center, Dynamics and other product names are or may be registered trademarks and/or trademarks in the
U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft
must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after
the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.