
Advanced/Adaptive Signal Processing
By
Dr. Supratim Gupta
Department of Electrical Engineering
National Institute of Technology Rourkela
Odisha-769008
The 2 Philosophies

• Information = Data: Big Data Processing

• Information < Data: Compressed Sensing


Chapter-5: Big Data Processing
Processing Tools

• Smaller Neural Network
   Logistic Regression
   Shallow Network

• Large Neural Network
   Deep Neural Network
   Convolutional Neural Network
Logistic Regression: Face Vs. Non-Face

[Figure: a face image; its R, G and B pixel values are unrolled into a feature vector $x$, and the classifier output is $\hat{y} = 1$.]
Logistic Regression: Face Vs. Non-Face

[Figure: a non-face image; its R, G and B pixel values are unrolled into a feature vector $x$, and the classifier output is $\hat{y} = 0$.]
Logistic Regression: Face Vs. Non-Face
Given an image represented by a feature vector $x$, the algorithm evaluates the
probability that a face is present in that image,

i.e. given $x$, $\hat{y} = P(y = 1 \mid x)$, where $0 \le \hat{y} \le 1$.

The parameters used in logistic regression are:

• The input feature vector: $x \in \mathbb{R}^{n_x}$, where $n_x$ is the number of features
• The training label: $y \in \{0, 1\}$
• The weights: $w \in \mathbb{R}^{n_x}$, where $n_x$ is the number of features
• The threshold (bias): $b \in \mathbb{R}$
• The output: $\hat{y} = \sigma(w^{T}x + b)$
• Sigmoid function: $\sigma(z) = \dfrac{1}{1 + e^{-z}}$
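A minimal NumPy sketch of this prediction step; the feature values, weights, and bias below are illustrative placeholders, not values from the slides:

import numpy as np

def sigmoid(z):
    # sigma(z) = 1 / (1 + e^{-z})
    return 1.0 / (1.0 + np.exp(-z))

def predict_proba(w, b, x):
    # y_hat = sigma(w^T x + b): probability that the image contains a face
    z = np.dot(w, x) + b
    return sigmoid(z)

# Illustrative example with n_x = 4 features
x = np.array([0.2, -1.0, 0.5, 0.3])
w = np.array([0.1, 0.4, -0.2, 0.05])
b = 0.1
print(predict_proba(w, b, x))   # a value in (0, 1)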
Logistic Regression: Face Vs. Non-Face
Sigmoid function: $\sigma(z) = \dfrac{1}{1 + e^{-z}}$

• For large positive $z$, $\sigma(z) \approx 1$
• For large negative $z$, $\sigma(z) \approx 0$
• For $z = 0$, $\sigma(z) = 0.5$
Logistic Regression: Training
Given a set of face and non-face images represented by feature vectors and labels
$\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \ldots, (x^{(m)}, y^{(m)})\}$, determine
$w \in \mathbb{R}^{n_x}$ and $b \in \mathbb{R}$ so that a suitable cost function $J(w, b)$ is minimized:

$$\min_{w, b} J(w, b)$$

[Figure: each training image's R, G and B pixel values are unrolled into a feature vector $x^{(i)}$ and fed through $\hat{y}^{(i)} = \sigma(w^{T}x^{(i)} + b)$.]
Logistic Regression: Cost Function
Loss for a single example:

$$\mathcal{L}(\hat{y}, y) = -\left[\, y \log \hat{y} + (1 - y)\log(1 - \hat{y}) \,\right]$$

Cost over the training set:

$$J(w, b) = \frac{1}{m}\sum_{i=1}^{m} \mathcal{L}\!\left(\hat{y}^{(i)}, y^{(i)}\right), \qquad \min_{w, b} J(w, b)$$

[Figure: the same R, G, B image pipeline, now with the cost $J(w, b)$ evaluated over all $m$ training examples.]
Logistic Regression: Geometric Interpretation

[Figure: geometric interpretation of the logistic regression output; the sigmoid maps $w^{T}x + b$ to probabilities ranging from values near 0.01 up to 1.]
Logistic Regression with Gradient Descent
Computational steps:

1. Forward propagation:
    Compute the cost function $J(w, b) = \frac{1}{m}\sum_{i=1}^{m} \mathcal{L}\!\left(\hat{y}^{(i)}, y^{(i)}\right)$

2. Backward propagation:
    Compute the derivatives/gradients $\frac{\partial J}{\partial w}$ (or $dw$) and $\frac{\partial J}{\partial b}$ (or $db$)
    Update $w$ and $b$ with learning rate $\alpha$:

$$w := w - \alpha\, dw, \qquad b := b - \alpha\, db$$
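A hedged NumPy sketch of this two-step loop; the learning rate alpha, the iteration count, and the data layout (X with one example per column) are illustrative assumptions, and the gradient formulas anticipate the expressions derived on the following slides:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.1, num_iters=1000):
    # X: (n_x, m) matrix of feature vectors, y: (m,) vector of 0/1 labels
    n_x, m = X.shape
    w = np.zeros(n_x)
    b = 0.0
    for _ in range(num_iters):
        # Forward propagation: predictions and cost
        a = sigmoid(w @ X + b)                               # shape (m,)
        J = -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))
        # Backward propagation: gradients of J w.r.t. w and b
        dz = a - y
        dw = X @ dz / m
        db = np.mean(dz)
        # Parameter update
        w -= alpha * dw
        b -= alpha * db
    return w, b, J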
Logistic Regression: Computation Graph
Example: $J(a, b, c) = 6\,(a + 0.5\,bc)$. Break the computation into elementary steps,

$$u = bc, \qquad v = a + 0.5\,u, \qquad J = 6v,$$

so that each node of the graph computes one intermediate quantity. The forward pass evaluates $J$ left to right; the backward pass propagates derivatives of $J$ right to left, first to $v$ and $u$, then to the inputs $a$, $b$ and $c$.
Logistic Regression Derivatives

Computation graph for one example: $z = w_1 x_1 + w_2 x_2 + b \;\rightarrow\; a = \sigma(z) \;\rightarrow\; \mathcal{L}(a, y)$

$$\mathcal{L}(a, y) = -\left[\, y \log a + (1 - y)\log(1 - a) \,\right]$$
$$da = \frac{\partial \mathcal{L}}{\partial a} = -\frac{y}{a} + \frac{1 - y}{1 - a}$$
$$dz = \frac{\partial \mathcal{L}}{\partial z} = da \cdot \sigma'(z) = a - y$$
$$dw_1 = x_1\, dz, \qquad dw_2 = x_2\, dz, \qquad db = dz$$
Logistic Regression Derivatives for m Examples

For each example $i$:

$$z^{(i)} = w^{T}x^{(i)} + b, \qquad a^{(i)} = \sigma(z^{(i)}), \qquad \mathcal{L}(a^{(i)}, y^{(i)})$$
$$da^{(i)} = -\frac{y^{(i)}}{a^{(i)}} + \frac{1 - y^{(i)}}{1 - a^{(i)}}, \qquad dz^{(i)} = a^{(i)} - y^{(i)}$$

Overall cost and gradients:

$$J(w, b) = \frac{1}{m}\sum_{i=1}^{m} \mathcal{L}\!\left(a^{(i)}, y^{(i)}\right)$$
$$dw = \frac{1}{m}\sum_{i=1}^{m} x^{(i)}\, dz^{(i)}, \qquad db = \frac{1}{m}\sum_{i=1}^{m} dz^{(i)}$$
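The per-example accumulation can be sketched in NumPy as follows; the layout of X (one example per column) is an assumption for illustration:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grads_loop(w, b, X, y):
    # Accumulate dw = (1/m) * sum_i x^(i) dz^(i) and db = (1/m) * sum_i dz^(i)
    n_x, m = X.shape
    dw, db, J = np.zeros(n_x), 0.0, 0.0
    for i in range(m):
        x_i = X[:, i]
        a_i = sigmoid(w @ x_i + b)                      # a^(i) = sigma(z^(i))
        J  += -(y[i] * np.log(a_i) + (1 - y[i]) * np.log(1 - a_i))
        dz  = a_i - y[i]                                # dz^(i) = a^(i) - y^(i)
        dw += x_i * dz
        db += dz
    return dw / m, db / m, J / m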
Logistic Regression: Vectorization

For a single example: $z^{(i)} = w^{T}x^{(i)} + b$.

For all $m$ examples, stack the examples as columns of $X = \left[\, x^{(1)}\; x^{(2)}\; \cdots\; x^{(m)} \,\right]$.

1. Forward propagation:

$$Z = \left[\, z^{(1)}\; z^{(2)}\; \cdots\; z^{(m)} \,\right] = w^{T}X + b$$
$$A = \left[\, a^{(1)}\; a^{(2)}\; \cdots\; a^{(m)} \,\right] = \sigma(Z)$$
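A minimal NumPy sketch of the vectorized forward pass, assuming w is a column vector of shape (n_x, 1), b is a scalar, and X stacks the m examples as columns:

import numpy as np

def forward_vectorized(w, b, X):
    # X = [x^(1) x^(2) ... x^(m)], shape (n_x, m)
    Z = w.T @ X + b                  # Z = [z^(1) ... z^(m)], shape (1, m)
    A = 1.0 / (1.0 + np.exp(-Z))     # A = sigma(Z)
    return Z, A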
Logistic Regression: Vectorization
For a single example: $dz^{(i)} = a^{(i)} - y^{(i)}$.

For all $m$ examples:

2. Backward propagation:

$$dZ = \left[\, dz^{(1)}\; dz^{(2)}\; \cdots\; dz^{(m)} \,\right] = A - Y, \qquad Y = \left[\, y^{(1)}\; y^{(2)}\; \cdots\; y^{(m)} \,\right]$$
$$dw = \frac{1}{m}\, X\, dZ^{T}, \qquad db = \frac{1}{m}\sum_{i=1}^{m} dz^{(i)}$$
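A matching sketch of the vectorized backward pass, under the same layout assumptions (Y and A as row vectors of shape (1, m)):

import numpy as np

def backward_vectorized(X, Y, A):
    # dZ = A - Y, dw = (1/m) X dZ^T, db = (1/m) * sum(dZ)
    m = X.shape[1]
    dZ = A - Y                  # shape (1, m)
    dw = X @ dZ.T / m           # shape (n_x, 1)
    db = np.sum(dZ) / m
    return dw, db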
Logistic Regression to Neural Network

A single logistic regression unit computes $z = w^{T}x + b \;\rightarrow\; a = \sigma(z) \;\rightarrow\; \mathcal{L}(a, y)$, with

$$da^{(i)} = -\frac{y^{(i)}}{a^{(i)}} + \frac{1 - y^{(i)}}{1 - a^{(i)}}, \qquad dz^{(i)} = a^{(i)} - y^{(i)}.$$
Logistic Regression to Neural Network

Each node computes $z = w^{T}x + b$ followed by $a = \sigma(z)$. Stacking such nodes gives layers [1] and [2] of a neural network; the activations of layer $l$ are written $a^{[l]}$, with the input denoted $x = a^{[0]}$.

Parameters: Layer 1: $W^{[1]}, b^{[1]}$; Layer 2: $W^{[2]}, b^{[2]}$.
Neural Network Representation
For each node $i$ of the hidden layer:

$$z_i^{[1]} = w_i^{[1]T}x + b_i^{[1]}, \qquad a_i^{[1]} = \sigma(z_i^{[1]})$$

Stacking the node equations row-wise gives the layer equations

$$z^{[1]} = W^{[1]}x + b^{[1]}, \qquad a^{[1]} = \sigma(z^{[1]}).$$
Neural Network Representation
With $x = a^{[0]}$, the two-layer network computes

$$z^{[1]} = W^{[1]}a^{[0]} + b^{[1]}, \qquad a^{[1]} = \sigma(z^{[1]})$$
$$z^{[2]} = W^{[2]}a^{[1]} + b^{[2]}, \qquad a^{[2]} = \sigma(z^{[2]}) = \hat{y}.$$
Neural Network: Forward Propagation
$$z^{[1]} = W^{[1]}a^{[0]} + b^{[1]}, \qquad a^{[1]} = g^{[1]}(z^{[1]})$$
$$z^{[2]} = W^{[2]}a^{[1]} + b^{[2]}, \qquad a^{[2]} = g^{[2]}(z^{[2]})$$
$$\vdots$$

So, for layer $l$, forward propagation is

$$z^{[l]} = W^{[l]}a^{[l-1]} + b^{[l]}, \qquad a^{[l]} = g^{[l]}(z^{[l]}).$$
NN: Forward Propagation with Multiple Examples

for i = 1 to m:
    a^[0](i) = x^(i)
    for l = 1 to L:
        z^[l](i) = W^[l] a^[l-1](i) + b^[l]
        a^[l](i) = g^[l](z^[l](i))
    end for
end for
NN: Forward Propagation with Multiple Examples
A^[0] = X
for l = 1 to L:
    Z^[l] = W^[l] A^[l-1] + b^[l]
    A^[l] = g^[l](Z^[l])
end for

where $X = \left[\, x^{(1)}\; x^{(2)}\; \cdots\; x^{(m)} \,\right]$ and $A^{[l]} = \left[\, a^{[l](1)}\; a^{[l](2)}\; \cdots\; a^{[l](m)} \,\right]$.
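A minimal NumPy sketch of this vectorized loop; the dictionary layout params["W1"], params["b1"], ... is an assumed convention, and a sigmoid is used in every layer purely for illustration (the next slides use tanh in the hidden layers):

import numpy as np

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def forward_L_layers(X, params, L):
    A = X                                  # A^[0] = X, shape (n_x, m)
    for l in range(1, L + 1):
        W, b = params["W" + str(l)], params["b" + str(l)]
        Z = W @ A + b                      # Z^[l] = W^[l] A^[l-1] + b^[l]
        A = sigmoid(Z)                     # A^[l] = g^[l](Z^[l])
    return A                               # A^[L] = Y_hat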
Neural Network: Activation Function Across Layers

The hidden layers may use a tanh activation while the output layer uses a sigmoid:

A^[0] = X
for l = 1 to L-1:
    Z^[l] = W^[l] A^[l-1] + b^[l]
    A^[l] = tanh(Z^[l])
end for
Z^[L] = W^[L] A^[L-1] + b^[L]
A^[L] = sigmoid(Z^[L])
Neural Network: Activation Functions
[Figure: plots of the four activation functions against $z$.]

sigmoid: $a = \dfrac{1}{1 + e^{-z}}$        tanh: $a = \dfrac{e^{z} - e^{-z}}{e^{z} + e^{-z}}$

ReLU: $a = \max(0, z)$        Leaky ReLU: $a = \max(0.01z,\, z)$
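These four activations can be written directly in NumPy; exposing the 0.01 slope of the Leaky ReLU as a parameter is a convenience added here:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    return np.tanh(z)                  # (e^z - e^-z) / (e^z + e^-z)

def relu(z):
    return np.maximum(0.0, z)

def leaky_relu(z, slope=0.01):
    return np.maximum(slope * z, z)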
Derivative of Activation Functions
Sigmoid Function
$$a = g(z) = \frac{1}{1 + e^{-z}}$$
$$g'(z) = \frac{e^{-z}}{(1 + e^{-z})^{2}} = a\,(1 - a)$$
Derivative of Activation Functions
Tanh Function
$$a = g(z) = \tanh(z)$$
$$g'(z) = 1 - \tanh^{2}(z) = 1 - a^{2}$$
Derivative of Activation Functions
ReLU and Leaky ReLU
ReLU: $g(z) = \max(0, z)$, with derivative

$$g'(z) = \begin{cases} 0 & z < 0 \\ 1 & z \ge 0 \end{cases}$$

Leaky ReLU: $g(z) = \max(0.01z,\, z)$, with derivative

$$g'(z) = \begin{cases} 0.01 & z < 0 \\ 1 & z \ge 0 \end{cases}$$
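The corresponding derivatives, again as a short NumPy sketch:

import numpy as np

def sigmoid_grad(z):
    a = 1.0 / (1.0 + np.exp(-z))
    return a * (1.0 - a)                 # g'(z) = a (1 - a)

def tanh_grad(z):
    return 1.0 - np.tanh(z) ** 2         # g'(z) = 1 - a^2

def relu_grad(z):
    return (z >= 0).astype(float)        # 0 for z < 0, 1 for z >= 0

def leaky_relu_grad(z, slope=0.01):
    return np.where(z < 0, slope, 1.0)   # 0.01 for z < 0, 1 for z >= 0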
Logistic Regression Derivatives

Recall, for a single unit $z = w_1 x_1 + w_2 x_2 + b \;\rightarrow\; a = \sigma(z) \;\rightarrow\; \mathcal{L}(a, y)$:

$$da = -\frac{y}{a} + \frac{1 - y}{1 - a}, \qquad dz = da \cdot a\,(1 - a) = a - y.$$
Neural Network Derivatives

Forward pass (one example): $z^{[1]} = W^{[1]}x + b^{[1]} \rightarrow a^{[1]} = g^{[1]}(z^{[1]}) \rightarrow z^{[2]} = W^{[2]}a^{[1]} + b^{[2]} \rightarrow a^{[2]} = \sigma(z^{[2]}) \rightarrow \mathcal{L}(a^{[2]}, y)$

Backward pass:

$$dz^{[2]} = a^{[2]} - y$$
$$dW^{[2]} = dz^{[2]}\, a^{[1]T}, \qquad db^{[2]} = dz^{[2]}$$
$$dz^{[1]} = W^{[2]T} dz^{[2]} * g^{[1]\prime}(z^{[1]})$$
$$dW^{[1]} = dz^{[1]}\, x^{T}, \qquad db^{[1]} = dz^{[1]}$$
Neural Network Derivatives for M Examples
Vectorized Implementation
$$dZ^{[2]} = A^{[2]} - Y$$
$$dW^{[2]} = \frac{1}{m}\, dZ^{[2]} A^{[1]T}, \qquad db^{[2]} = \frac{1}{m}\sum_{i=1}^{m} dz^{[2](i)}$$
$$dZ^{[1]} = W^{[2]T} dZ^{[2]} * g^{[1]\prime}(Z^{[1]}) \qquad (\text{element-wise product})$$
$$dW^{[1]} = \frac{1}{m}\, dZ^{[1]} X^{T}, \qquad db^{[1]} = \frac{1}{m}\sum_{i=1}^{m} dz^{[1](i)}$$
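A hedged NumPy sketch of these vectorized gradients for a two-layer network; the cache/params dictionary layout is an assumed convention, and a tanh hidden layer is assumed so that $g^{[1]\prime}(Z^{[1]}) = 1 - \tanh^{2}(Z^{[1]})$:

import numpy as np

def backward_two_layer(X, Y, cache, params):
    # cache holds Z1, A1, A2 from the forward pass; params holds W2
    m = X.shape[1]
    A1, A2, Z1 = cache["A1"], cache["A2"], cache["Z1"]
    W2 = params["W2"]

    dZ2 = A2 - Y                                     # dZ^[2] = A^[2] - Y
    dW2 = dZ2 @ A1.T / m                             # dW^[2] = (1/m) dZ^[2] A^[1]T
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * (1.0 - np.tanh(Z1) ** 2)    # element-wise product with g^[1]'(Z^[1])
    dW1 = dZ1 @ X.T / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    return {"dW1": dW1, "db1": db1, "dW2": dW2, "db2": db2}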
Neural Network: Weight Initialization
Initializing the parameters to zero:

$$W^{[1]} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \qquad b^{[1]} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

Forward pass: $z^{[1]} = W^{[1]}x + b^{[1]} = 0$, so $a_1^{[1]} = a_2^{[1]} = \sigma(0) = 0.5$; every hidden unit computes exactly the same value.

Backward pass: $dz^{[1]} = W^{[2]T}dz^{[2]} * a^{[1]}(1 - a^{[1]})$ has identical entries, so $dW^{[1]}$ has identical rows. The updates $W^{[1]} := W^{[1]} - \alpha\, dW^{[1]}$ and $W^{[2]} := W^{[2]} - \alpha\, dW^{[2]}$ therefore never break the symmetry: the hidden units keep computing the same function, and the network is no more expressive than a single unit.
Neural Network: Weight Initialization
Initializing the parameters to random values:

$$W^{[1]} = 0.01 \times \text{randn}(n^{[1]}, n^{[0]}), \qquad b^{[1]} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
$$W^{[2]} = 0.01 \times \text{randn}(n^{[2]}, n^{[1]}), \qquad b^{[2]} = 0$$

Random weights break the symmetry between hidden units, while the small factor (0.01) keeps $z^{[l]}$ close to zero. Large initial weight values push the output of the node into the saturated region of the sigmoid/tanh, where the gradient is nearly zero, leading to slower convergence.
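A minimal initialization sketch along these lines; the layer sizes and random seed below are illustrative:

import numpy as np

def init_params(n_x, n_h, n_y, scale=0.01):
    # Small random weights break symmetry without saturating sigmoid/tanh units;
    # biases can safely start at zero.
    rng = np.random.default_rng(0)
    return {
        "W1": scale * rng.standard_normal((n_h, n_x)),
        "b1": np.zeros((n_h, 1)),
        "W2": scale * rng.standard_normal((n_y, n_h)),
        "b2": np.zeros((n_y, 1)),
    }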
Neural Network: Matrix Dimension for L layers

Determine the parameter dimensions for each layer of the network.
Neural Network: Matrix Dimension for L layers
Forward propagation with one example:

$$z^{[l](i)} = W^{[l]} a^{[l-1](i)} + b^{[l]}, \qquad a^{[l](i)} = g^{[l]}\!\left(z^{[l](i)}\right)$$

Vectorized forward propagation with $m$ examples:

$$Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}, \qquad A^{[l]} = g^{[l]}\!\left(Z^{[l]}\right), \qquad l = 1, \ldots, L$$
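A short sketch that builds parameters with the matching dimensions, $W^{[l]}$ of shape $(n^{[l]}, n^{[l-1]})$ and $b^{[l]}$ of shape $(n^{[l]}, 1)$, for an arbitrary layer-size list and checks the shapes; the list [4, 5, 3, 1] is an illustrative example:

import numpy as np

def init_deep_params(layer_dims):
    # layer_dims = [n^[0], n^[1], ..., n^[L]]; W^[l]: (n^[l], n^[l-1]), b^[l]: (n^[l], 1),
    # so Z^[l] and A^[l] have shape (n^[l], m).
    rng = np.random.default_rng(0)
    params = {}
    for l in range(1, len(layer_dims)):
        params["W" + str(l)] = 0.01 * rng.standard_normal((layer_dims[l], layer_dims[l - 1]))
        params["b" + str(l)] = np.zeros((layer_dims[l], 1))
    return params

# Example: input size 4, two hidden layers, one output unit
params = init_deep_params([4, 5, 3, 1])
assert params["W1"].shape == (5, 4) and params["b2"].shape == (3, 1)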
L Layer Deep Neural Network

[Figure: block diagram of layer $l$. The forward block takes $a^{[l-1]}$, uses $W^{[l]}, b^{[l]}$, outputs $a^{[l]}$, and caches $z^{[l]}$; the corresponding backward block takes $da^{[l]}$ and the cache and outputs $da^{[l-1]}, dW^{[l]}, db^{[l]}$.]
L Layer Deep Neural Network
Forward Propagation

[Figure: the chain of forward blocks, layer 1 through layer $L$, computing $\hat{y} = a^{[L]}$ from $x = a^{[0]}$ while caching each $z^{[l]}$ for use in the backward pass.]
Backward Propagation