
OPTIMIZATION

SOFTWARE
as a Tool for Solving
Differential Equations Using

NEURAL NETWORKS

Fotiadis, D. I.
Karras, D. A.
Lagaris, I. E.
Likas, A.
Papageorgiou, D. G.
DIFFERENTIAL
EQUATIONS HANDLED

• ODEs
• Systems of ODEs
• PDEs (Boundary and Initial Value Problems)
• Eigenvalue PDE Problems
• IDEs
ARTIFICIAL
NEURAL NETWORKS

• Closed Analytic Form


• Universal Approximators
• Linear and Non-Linear Parameters
• Highly Parallel Systems
• Specialized Hardware for ANN
OPTIMIZATION
ENVIRONMENT

MERLIN / MCL 3.0


SOFTWARE

Features Include:
• A Host of Optimization Algorithms
• Special Merit for Sums of Squares
• Variable Bounds and Variable Fixing
• Command Driven User Interface
• Numerical Estimation of Derivatives
• Dynamic Programming of Strategies
ARTIFICIAL
NEURAL NETWORKS

[Figure: feed-forward network with an Input Layer (inputs x1, x2 and bias +1), Hidden Layers with weights w(1), w(2) and biases u, and an Output Layer with weights v]

• Inspired from biological NNs
• Input-output mapping via the weights u, w, v and the activation functions σ

Analytically this is given by the formula:

N(x, u, w^{(1)}, w^{(2)}, v) = \sum_{i=1}^{n_3} v_i \, \sigma\left( \sum_{j=1}^{n_2} w_{ij}^{(2)} \, \sigma\left( \sum_{k=1}^{n_1} w_{jk}^{(1)} x_k + u_j \right) \right)
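As a minimal sketch of this forward pass (assuming NumPy; the node counts and parameter shapes are illustrative, not the authors' configuration):

import numpy as np

def sigma(z):
    # Sigmoidal activation: 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))

def network(x, w1, u, w2, v):
    # Two-hidden-layer perceptron N(x) matching the formula above:
    #   w1: (n2, n1) first-layer weights,  u: (n2,) first-layer biases
    #   w2: (n3, n2) second-layer weights, v: (n3,) linear output weights
    h1 = sigma(w1 @ x + u)     # first hidden layer
    h2 = sigma(w2 @ h1)        # second hidden layer
    return v @ h2              # scalar network output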
Activation Functions

Many different functions can be used.

Our current choice: The Sigmoidal


A smooth function, infinitely differentiable, bounded in (0,1):

\sigma(x) = \frac{1}{1 + e^{-x}}

[Plot: σ(x) for x ∈ [−10, 10]]
The Sigmoidal
properties

\frac{d\sigma(x)}{dx} = \sigma(x)\left[1 - \sigma(x)\right]

\frac{d^2\sigma(x)}{dx^2} = \sigma(x)\left[1 - \sigma(x)\right]\left[1 - 2\sigma(x)\right]
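These closed-form identities let the derivatives of the network with respect to its inputs be computed analytically; a small sketch (NumPy assumed):

import numpy as np

def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))

def dsigma(z):
    # sigma'(z) = sigma(z) [1 - sigma(z)]
    s = sigma(z)
    return s * (1.0 - s)

def d2sigma(z):
    # sigma''(z) = sigma(z) [1 - sigma(z)] [1 - 2 sigma(z)]
    s = sigma(z)
    return s * (1.0 - s) * (1.0 - 2.0 * s)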
FACTS

Kolmogorov, Cybenko and Hornik
proved theorems concerning the
approximation capabilities of ANNs.

In fact, it has been shown that ANNs are

UNIVERSAL APPROXIMATORS
DESCRIPTION
OF THE METHOD
SOLVE THE EQUATION
LΨ(x) = f(x)
SUBJECT TO
DIRICHLET B.C.
Where
L is an Integrodifferential Operator,
Linear or Non-Linear.

Model:
Ψ_M(x) = B(x) + Z(x) N(x)
Where:
•B(x) satisfies the BC
•Z(x) vanishes on the boundary
•N(x) is an Artificial Neural Net
MODEL
PROPERTIES

The Model  Ψ_M(x) = B(x) + Z(x) N(x)
satisfies by construction the B.C.

The Model thanks to the Network is


“trainable”
The Network parameters
can be adjusted so that:

LΨ_M(x) − f(x) = 0,  ∀x ∈ [0, 1]^N

Pick a set of representative points x_1, x_2, ..., x_n
in the unit hypercube.

The residual “Error”

\sum_{i=1}^{n} \left[ L\Psi_M(x_i) - f(x_i) \right]^2
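As a sketch, this collocation error can be coded directly as a sum of squared residuals over the training points (NumPy assumed; operator_residual is a hypothetical callback that returns LΨ_M(x_i) − f(x_i) for a given parameter vector):

import numpy as np

def training_error(params, points, operator_residual):
    # Sum of squared residuals [L Psi_M(x_i) - f(x_i)]^2 over the training points
    residuals = np.array([operator_residual(params, x) for x in points])
    return np.sum(residuals ** 2)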
ILLUSTRATION

Simple 1-d example


\frac{d^2\Psi(x)}{dx^2} = f\left(x, \Psi, \frac{d\Psi}{dx}\right)

\Psi(0) = \Psi_0, \quad \Psi(1) = \Psi_1

Model
\Psi_M(x) = \Psi_0 (1 - x) + \Psi_1 x + x(1 - x) N(x)
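A minimal end-to-end sketch for a concrete 1-d case. The particular right-hand side, node count, finite-difference second derivative and SciPy's BFGS are illustrative assumptions, not the authors' setup (the slides rely on the MERLIN optimization environment and analytic derivatives):

import numpy as np
from scipy.optimize import minimize

# Illustrative problem (not from the slides): Psi''(x) = -pi^2 sin(pi x),
# Psi(0) = Psi(1) = 0, whose exact solution is sin(pi x).
def f_rhs(x):
    return -np.pi**2 * np.sin(np.pi * x)

def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))

def N(x, p, nodes=10):
    # Single-hidden-layer perceptron: sum_j v_j sigma(w_j x + u_j)
    w, u, v = p[:nodes], p[nodes:2*nodes], p[2*nodes:]
    return np.dot(v, sigma(w * x + u))

def psi_M(x, p):
    # Trial solution with Psi_0 = Psi_1 = 0:  x (1 - x) N(x)
    return x * (1.0 - x) * N(x, p)

def error(p, points, h=1e-3):
    # Central differences approximate d^2 Psi_M / dx^2 in the residual
    res = [(psi_M(x + h, p) - 2*psi_M(x, p) + psi_M(x - h, p)) / h**2 - f_rhs(x)
           for x in points]
    return np.sum(np.square(res))

np.random.seed(0)
points = np.linspace(0.05, 0.95, 10)
p0 = 0.1 * np.random.randn(30)                # 3 * nodes free parameters
result = minimize(error, p0, args=(points,), method='BFGS')
print(result.fun)                             # final sum-of-squares residual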
ILLUSTRATION

For a second order, two-dimensional PDE:

\Psi_M(x, y) = B(x, y) + x(1 - x)\, y(1 - y)\, N(x, y)

where

B(x, y) = (1 - x)\Psi(0, y) + x\Psi(1, y)
  + (1 - y)\left\{ \Psi(x, 0) - \left[ (1 - x)\Psi(0, 0) + x\Psi(1, 0) \right] \right\}
  + y\left\{ \Psi(x, 1) - \left[ (1 - x)\Psi(0, 1) + x\Psi(1, 1) \right] \right\}
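A sketch of this boundary-interpolating term as a reusable function (the four boundary traces are assumed to be supplied as Python callables):

def B(x, y, psi_x0, psi_x1, psi_0y, psi_1y):
    # psi_0y(y) = Psi(0, y), psi_1y(y) = Psi(1, y)
    # psi_x0(x) = Psi(x, 0), psi_x1(x) = Psi(x, 1)
    return ((1 - x) * psi_0y(y) + x * psi_1y(y)
            + (1 - y) * (psi_x0(x) - ((1 - x) * psi_x0(0) + x * psi_x0(1)))
            + y * (psi_x1(x) - ((1 - x) * psi_x1(0) + x * psi_x1(1))))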
EXAMPLES

Problem: Solve the 2-d PDE:


\nabla^2 \Psi(x, y) = e^{-x}\left( x - 2 + y^3 + 6y \right)

In the domain: x, y ∈ [0, 1]
Subject to the BC:

\Psi(0, y) = y^3, \qquad \Psi(1, y) = \frac{1 + y^3}{e}

\Psi(x, 0) = x e^{-x}, \qquad \Psi(x, 1) = (1 + x) e^{-x}

A single hidden layer Perceptron was used:

\Psi_M(x, y) = B(x, y) + x(1 - x)\, y(1 - y)\, N(x, y)

B(x, y) = (1 - x) y^3 + x\,\frac{1 + y^3}{e} + (1 - y)\, x \left( e^{-x} - e^{-1} \right)
  + y \left[ (1 + x) e^{-x} - \left( 1 - x + 2 x e^{-1} \right) \right]
GRAPHICAL
REPRESENTATION

The exact analytic solution is:

\Psi(x, y) = e^{-x}\left( x + y^3 \right)
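As a quick consistency check (SymPy assumed), one can verify that this exact solution satisfies the PDE and the stated boundary conditions:

import sympy as sp

x, y = sp.symbols('x y')
Psi = sp.exp(-x) * (x + y**3)

# Laplacian of the exact solution minus the PDE right-hand side should vanish
lhs = sp.diff(Psi, x, 2) + sp.diff(Psi, y, 2)
rhs = sp.exp(-x) * (x - 2 + y**3 + 6*y)
print(sp.simplify(lhs - rhs))                          # 0

# Boundary conditions at x = 0 and x = 1
print(sp.simplify(Psi.subs(x, 0) - y**3))              # 0
print(sp.simplify(Psi.subs(x, 1) - (1 + y**3)/sp.E))   # 0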
GRAPHS &
COMPARISON
Neural Solution accuracy
Plot Points: Training Points

Ψ(x, y) − Ψ_M(x, y)
GRAPHS &
COMPARISON
Neural Solution accuracy
Plot Points: Test Points

Ψ(x, y) − Ψ_M(x, y)
GRAPHS &
COMPARISON

Finite Element Solution accuracy


Plot Points: Training Points

Ψ(x, y) − Ψ_FE(x, y)
GRAPHS &
COMPARISON

Finite Element Solution accuracy


Plot Points: Test Points

Ψ(x, y) − Ψ_FE(x, y)
PERFORMANCE

• Highly Accurate Solution


(even with few training points)
• Uniform “Error” Distribution
• Superior Interpolation
Properties
The model solution is very flexible and can
easily be enhanced to offer even higher accuracy.
EIGEN VALUE
PROBLEMS

Problem: LΨ(x) = λΨ(x)
With appropriate Dirichlet BC

The model is the same as before.


However the “Error” is defined as:
\frac{\sum_{i=1}^{n} \left[ L\Psi_M(x_i) - \lambda \Psi_M(x_i) \right]^2}{\sum_{i=1}^{n} \left[ \Psi_M(x_i) \right]^2}
EIGEN VALUE
PROBLEMS

Where:

\lambda = \frac{\sum_{i=1}^{n} \Psi_M(x_i)\, L\Psi_M(x_i)}{\sum_{i=1}^{n} \left[ \Psi_M(x_i) \right]^2}

i.e. the value for which the “Error” is minimum.

Problems of that kind are often


encountered in Quantum Mechanics.
(Schrödinger’s equation)
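A sketch of these two quantities over the training points (NumPy assumed; psi and L_psi are hypothetical arrays holding Ψ_M(x_i) and LΨ_M(x_i)):

import numpy as np

def eigen_lambda(psi, L_psi):
    # lambda = sum(Psi_M * L Psi_M) / sum(Psi_M^2)
    return np.dot(psi, L_psi) / np.dot(psi, psi)

def eigen_error(psi, L_psi):
    # Normalized residual: sum([L Psi_M - lambda Psi_M]^2) / sum(Psi_M^2)
    lam = eigen_lambda(psi, L_psi)
    return np.sum((L_psi - lam * psi) ** 2) / np.dot(psi, psi)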
EXAMPLES
The non-local Schrödinger equation

-\frac{\hbar^2}{2\mu} \frac{d^2\psi(r)}{dr^2} + V(r)\,\psi(r) + \int_0^{\infty} K_0(r, r')\,\psi(r')\,dr' = \varepsilon\,\psi(r)

\psi(0) = 0, \qquad \psi(r) \sim e^{-kr}, \; k > 0

Describes the bound “n+α” system in the
framework of the Resonating Group Method.

Model:
\psi_M(r) = r\, e^{-br}\, N(r), \quad b > 0

Where:
N(r) = \sum_{j=1}^{nodes} v_j\, \sigma(w_j r + u_j)

is a single hidden layer, sigmoidal Perceptron.


OBTAINING
EIGENVALUES
Example: The Hénon-Heiles potential

-\frac{1}{2}\left( \frac{\partial^2\Psi}{\partial x^2} + \frac{\partial^2\Psi}{\partial y^2} \right) + \left[ \frac{1}{2}\left( x^2 + y^2 \right) + \frac{1}{4\sqrt{5}}\left( x y^2 - \frac{1}{3} x^3 \right) \right] \Psi = \varepsilon \Psi

Asymptotic behavior: \Psi(x, y) \sim e^{-k (x^2 + y^2)}

Model used: \Psi_M(x, y) = e^{-b (x^2 + y^2)}\, N(x, y)

Use the above model to obtain an eigen solution Φ.
Obtain a different eigen solution by deflation, i.e.:

\tilde{\Psi}_M(x, y) = \Psi_M(x, y) - \Phi(x, y) \iint \Phi(x', y')\, \Psi_M(x', y')\, dx'\, dy'

This model is orthogonal to Φ(x,y) by construction.


The procedure can be applied repeatedly.
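A sketch of that deflation step using a crude grid approximation of the overlap integral (NumPy assumed; Psi_M and Phi are hypothetical, vectorized callables for the current model and the previously found eigenfunction):

import numpy as np

def deflate(Psi_M, Phi, xs, ys):
    # Approximate  overlap = integral of Phi(x', y') Psi_M(x', y') dx' dy'
    # by a Riemann sum on a rectangular grid, then subtract the projection.
    X, Y = np.meshgrid(xs, ys)
    dx, dy = xs[1] - xs[0], ys[1] - ys[0]
    overlap = np.sum(Phi(X, Y) * Psi_M(X, Y)) * dx * dy
    return lambda x, y: Psi_M(x, y) - Phi(x, y) * overlap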
ARBITRARILY
SHAPED DOMAINS

For domains other than Hypercubes the


BC cannot be embedded in the model.

Let R_i, i = 1, 2, ..., m be the set of points
defining the arbitrarily shaped boundary.

The BC are then: Ψ(R_i) = b_i, ∀ i = 1, 2, ..., m

Let r_i, i = 1, 2, ..., n be the set of
training points inside the domain.

We describe two ways to proceed in
solving the LΨ(x) = f(x) problem.
OPTIMIZATION
WITH CONSTRAINTS

Model: Ψ_M(x) = N(x)

“Error” to be minimized:
Domain terms + Boundary terms

\sum_{i=1}^{n} \left[ L\Psi_M(r_i) - f(r_i) \right]^2 + \beta \sum_{i=1}^{m} \left[ \Psi_M(R_i) - b_i \right]^2

With β a penalty parameter that controls
the degree of satisfaction of the BC.
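A sketch of this penalized objective in plain Python (operator_residual and model are hypothetical callables evaluating LΨ_M(r_i) − f(r_i) and Ψ_M(R_i)):

def penalty_error(params, interior, boundary, bc_values, beta,
                  operator_residual, model):
    # Domain terms: squared PDE residual at the interior training points r_i
    domain = sum(operator_residual(params, r) ** 2 for r in interior)
    # Boundary terms: squared BC violation at the boundary points R_i
    bc = sum((model(params, R) - b) ** 2 for R, b in zip(boundary, bc_values))
    return domain + beta * bc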
PERCEPTRON-RBF
SYNERGY
Model:
\Psi_M(x) = N(x) + \sum_{i=1}^{m} a_i\, e^{-\lambda \| x - R_i \|^2}

Where the a_i are determined so that the
model satisfies the BC exactly, i.e.:

\sum_{k=1}^{m} a_k\, e^{-\lambda \| R_i - R_k \|^2} = b_i - N(R_i), \quad i = 1, 2, \ldots, m

The free parameter λ is chosen once,
initially, so that the system above is
easily solved.

“Error”:
\sum_{i=1}^{n} \left[ L\Psi_M(r_i) - f(r_i) \right]^2
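A sketch of determining the coefficients a_k by solving that linear system (NumPy assumed; N_vals is a hypothetical array holding the network values N(R_i)):

import numpy as np

def rbf_coefficients(R, b, N_vals, lam):
    # Solve  sum_k a_k exp(-lam ||R_i - R_k||^2) = b_i - N(R_i)  for the a_k.
    # R: (m, d) boundary points, b: (m,) BC values, N_vals: (m,) network values.
    diff = R[:, None, :] - R[None, :, :]             # pairwise differences
    A = np.exp(-lam * np.sum(diff ** 2, axis=-1))    # Gaussian RBF matrix
    return np.linalg.solve(A, b - N_vals)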
Pros & Cons ...
The RBF - Synergy is:
• Computationally costly. A linear
system is solved each time the model is
evaluated.
• Exact in satisfying the BC.

The Penalty method is:


• Approximate in satisfying the BC.

• Computationally efficient
IN PRACTICE . . .

• Initially proceed via the penalty method,


till an approximate solution is found.
• Refine the solution, using the RBF-
Synergy method, to satisfy the BC exactly.

Conclusions:
Experiments on several model
problems show performance
similar to that reported earlier.
GENERAL
OBSERVATIONS

Enhanced generalization performance is
achieved when the exponential weights of the
Neural Networks are kept small.
Hence box-constrained optimization
methods should be applied.
Bigger Networks (greater number of nodes)
can achieve higher accuracy.
This favors the use of:
• Existing Specialized Hardware
• Sophisticated Optimization Software
MERLIN 3.0
What is it? A software package offering
many optimization algorithms
and a friendly user interface.

What problems does it solve ?


Find a local minimum of the function:
f(x), \quad x \in \mathbb{R}^N, \quad x = (x_1, x_2, \ldots, x_N)

Under the conditions:
x_i \in [l_i, u_i], \quad \forall\, i = 1, 2, \ldots, N
ALGORITHMS
Direct Methods
• SIMPLEX
• ROLL
Gradient Methods

Conjugate Gradient:
• Polak-Ribiere
• Fletcher-Reeves
• Generalized P&R

Quasi Newton:
• BFGS (3 versions)
• DFP

Levenberg-Marquardt
• For Sum-Of-Squares
THE USER’S PART

What does the user have to do?


• Program the objective function
• Use Merlin to find an optimum

What might the user want to do?


• Program the gradient
• Program the Hessian
• Program the Jacobian
MERLIN
FEATURES & TOOLS
• Intuitive free-format I/O
• Menu assisted Input
• On-line HELP
• Several gradient modes
• Confidence parameter intervals
• Box constraints
• Postscript graphs
• Programmability
• “Open” to user enhancements
MCL: Merlin Control Language

What is it?
A High-Level Programming Language
that Drives Merlin Intelligently.

What are the benefits ?


• Abolishes User Intervention.
• Optimization Strategies.
• Handy Utilities.
• Global Optimum Seeking Methods.
MCL
REPERTOIRE
MCL command types:
• Merlin Commands
• Conditionals (IF-THEN-ELSE-ENDIF)
• Loops (DO type of loops)
• Branching (GO TO type)
• I/O (READ/WRITE)

MCL intrinsic variables:


All Merlin important variables, e.g.:
Parameters, Value, Gradient, Bounds ...
SAMPLE
MCL PROGRAM
program
var i; sml; bfgs_calls; nfix; max_calls
sml = 1.e-4 % Gradient threshold.
bfgs_calls = 1000 % Number of BFGS calls.
max_calls = 10000 % Max. calls to spend.
again:
loosall
nfix = 0
loop i from 1 to dim
if abs[grad[i]] <= sml then
fix (x.i)
nfix = nfix+1
end if
end loop
if nfix == dim then
display 'Gradient below threshold...'
loosall
finish
end if
bfgs (noc=bfgs_calls)
when pcount < max_calls just move to again
display 'We probably failed...'
end
MERLIN-MCL
Availability
The Merlin - MCL package is
written in ANSI Fortran 77 and
can be downloaded from the
following URL:

http://nrt.cs.uoi.gr/merlin/
It is maintained, supported, and
FREELY available to the
scientific community.
FUTURE
DEVELOPMENTS

• Optimal Training Point Sets


• Optimal Network Architecture
• Expansion & Pruning Techniques

Hardware Implementation on
NEUROPROCESSORS
