
OPTIMIZATION

SOFTWARE
as a Tool for Solving
Differential Equations Using

NEURAL NETWORKS

Fotiadis, D. I.
Karras, D. A.
Lagaris, I. E.
Likas, A.
Papageorgiou, D. G.
DIFFERENTIAL
EQUATIONS HANDLED

• ODEs
• Systems of ODEs
• PDEs (Boundary and Initial Value Problems)
• Eigenvalue PDE Problems
• IDEs
ARTIFICIAL
NEURAL NETWORKS

• Closed Analytic Form


• Universal Approximators
• Linear and Non-Linear Parameters
• Highly Parallel Systems
• Specialized Hardware for ANN
OPTIMIZATION
ENVIRONMENT

MERLIN / MCL 3.0


SOFTWARE

Features Include:
• A Host of Optimization Algorithms
• Special Merit for Sums of Squares
• Variable Bounds and Variable Fixing
• Command Driven User Interface
• Numerical Estimation of Derivatives
• Dynamic Programming of Strategies
ARTIFICIAL
NEURAL NETWORKS

[Figure: feed-forward network with an Input Layer (inputs x1, x2 and bias +1), Hidden Layers with weights w(1), w(2) and biases u, and an Output Layer with weights v]

• Inspired from biological NNs
• Input-output mapping via the weights u, w, v and the activation functions σ

Analytically this is given by the formula:

N(x, u, w^{(1)}, w^{(2)}, v) = \sum_{i=1}^{n_3} v_i \, \sigma\left( \sum_{j=1}^{n_2} w_{ij}^{(2)} \, \sigma\left( \sum_{k=1}^{n_1} w_{jk}^{(1)} x_k + u_j \right) \right)
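As a minimal sketch of this forward pass (assuming NumPy; the node counts and parameter shapes are illustrative, not the authors' configuration):

import numpy as np

def sigma(z):
    # Sigmoidal activation: 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))

def network(x, w1, u, w2, v):
    # Two-hidden-layer perceptron N(x) matching the formula above:
    #   w1: (n2, n1) first-layer weights,  u: (n2,) first-layer biases
    #   w2: (n3, n2) second-layer weights, v: (n3,) linear output weights
    h1 = sigma(w1 @ x + u)     # first hidden layer
    h2 = sigma(w2 @ h1)        # second hidden layer
    return v @ h2              # scalar network output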
Activation Functions

Many different functions can be used.

Our current choice: The Sigmoidal


A smooth function, infinitely differentiable, bounded in (0,1):

\sigma(x) = \frac{1}{1 + e^{-x}}

[Plot: σ(x) for x ∈ [−10, 10]]
The Sigmoidal
properties

\frac{d\sigma(x)}{dx} = \sigma(x)\left[1 - \sigma(x)\right]

\frac{d^2\sigma(x)}{dx^2} = \sigma(x)\left[1 - \sigma(x)\right]\left[1 - 2\sigma(x)\right]
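These closed-form identities let the derivatives of the network with respect to its inputs be computed analytically; a small sketch (NumPy assumed):

import numpy as np

def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))

def dsigma(z):
    # sigma'(z) = sigma(z) [1 - sigma(z)]
    s = sigma(z)
    return s * (1.0 - s)

def d2sigma(z):
    # sigma''(z) = sigma(z) [1 - sigma(z)] [1 - 2 sigma(z)]
    s = sigma(z)
    return s * (1.0 - s) * (1.0 - 2.0 * s)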
FACTS

Kolmogorov, Cybenko and Hornik
proved theorems concerning the
approximation capabilities of ANNs.

In fact, it has been shown that ANNs are

UNIVERSAL APPROXIMATORS
DESCRIPTION
OF THE METHOD
SOLVE THE EQUATION
LΨ(x) = f(x)
SUBJECT TO
DIRICHLET B.C.
Where
L is an Integrodifferential Operator,
Linear or Non-Linear.

Model:
Ψ_M(x) = B(x) + Z(x) N(x)
Where:
•B(x) satisfies the BC
•Z(x) vanishes on the boundary
•N(x) is an Artificial Neural Net
MODEL
PROPERTIES

The Model  Ψ_M(x) = B(x) + Z(x) N(x)
satisfies by construction the B.C.

The Model thanks to the Network is


“trainable”
The Network parameters
can be adjusted so that:

LΨ_M(x) − f(x) = 0,  ∀x ∈ [0, 1]^N

Pick a set of representative points x_1, x_2, ..., x_n
in the unit hypercube.

The residual “Error”

\sum_{i=1}^{n} \left[ L\Psi_M(x_i) - f(x_i) \right]^2
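As a sketch, this collocation error can be coded directly as a sum of squared residuals over the training points (NumPy assumed; operator_residual is a hypothetical callback that returns LΨ_M(x_i) − f(x_i) for a given parameter vector):

import numpy as np

def training_error(params, points, operator_residual):
    # Sum of squared residuals [L Psi_M(x_i) - f(x_i)]^2 over the training points
    residuals = np.array([operator_residual(params, x) for x in points])
    return np.sum(residuals ** 2)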
ILLUSTRATION

Simple 1-d example


\frac{d^2\Psi(x)}{dx^2} = f\left(x, \Psi, \frac{d\Psi}{dx}\right)

\Psi(0) = \Psi_0, \quad \Psi(1) = \Psi_1

Model
\Psi_M(x) = \Psi_0 (1 - x) + \Psi_1 x + x(1 - x) N(x)
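A minimal end-to-end sketch for a concrete 1-d case. The particular right-hand side, node count, finite-difference second derivative and SciPy's BFGS are illustrative assumptions, not the authors' setup (the slides rely on the MERLIN optimization environment and analytic derivatives):

import numpy as np
from scipy.optimize import minimize

# Illustrative problem (not from the slides): Psi''(x) = -pi^2 sin(pi x),
# Psi(0) = Psi(1) = 0, whose exact solution is sin(pi x).
def f_rhs(x):
    return -np.pi**2 * np.sin(np.pi * x)

def sigma(z):
    return 1.0 / (1.0 + np.exp(-z))

def N(x, p, nodes=10):
    # Single-hidden-layer perceptron: sum_j v_j sigma(w_j x + u_j)
    w, u, v = p[:nodes], p[nodes:2*nodes], p[2*nodes:]
    return np.dot(v, sigma(w * x + u))

def psi_M(x, p):
    # Trial solution with Psi_0 = Psi_1 = 0:  x (1 - x) N(x)
    return x * (1.0 - x) * N(x, p)

def error(p, points, h=1e-3):
    # Central differences approximate d^2 Psi_M / dx^2 in the residual
    res = [(psi_M(x + h, p) - 2*psi_M(x, p) + psi_M(x - h, p)) / h**2 - f_rhs(x)
           for x in points]
    return np.sum(np.square(res))

np.random.seed(0)
points = np.linspace(0.05, 0.95, 10)
p0 = 0.1 * np.random.randn(30)                # 3 * nodes free parameters
result = minimize(error, p0, args=(points,), method='BFGS')
print(result.fun)                             # final sum-of-squares residual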
ILLUSTRATION

For a second order, two-dimensional PDE:

\Psi_M(x, y) = B(x, y) + x(1 - x)\, y(1 - y)\, N(x, y)

where

B(x, y) = (1 - x)\Psi(0, y) + x\Psi(1, y)
  + (1 - y)\left\{ \Psi(x, 0) - \left[ (1 - x)\Psi(0, 0) + x\Psi(1, 0) \right] \right\}
  + y\left\{ \Psi(x, 1) - \left[ (1 - x)\Psi(0, 1) + x\Psi(1, 1) \right] \right\}
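A sketch of this boundary-interpolating term as a reusable function (the four boundary traces are assumed to be supplied as Python callables):

def B(x, y, psi_x0, psi_x1, psi_0y, psi_1y):
    # psi_0y(y) = Psi(0, y), psi_1y(y) = Psi(1, y)
    # psi_x0(x) = Psi(x, 0), psi_x1(x) = Psi(x, 1)
    return ((1 - x) * psi_0y(y) + x * psi_1y(y)
            + (1 - y) * (psi_x0(x) - ((1 - x) * psi_x0(0) + x * psi_x0(1)))
            + y * (psi_x1(x) - ((1 - x) * psi_x1(0) + x * psi_x1(1))))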
EXAMPLES

Problem: Solve the 2-d PDE:


\nabla^2 \Psi(x, y) = e^{-x}\left( x - 2 + y^3 + 6y \right)

In the domain: x, y ∈ [0, 1]
Subject to the BC:

\Psi(0, y) = y^3, \qquad \Psi(1, y) = \frac{1 + y^3}{e}

\Psi(x, 0) = x e^{-x}, \qquad \Psi(x, 1) = (1 + x) e^{-x}

A single hidden layer Perceptron was used:

\Psi_M(x, y) = B(x, y) + x(1 - x)\, y(1 - y)\, N(x, y)

B(x, y) = (1 - x) y^3 + x\,\frac{1 + y^3}{e} + (1 - y)\, x \left( e^{-x} - e^{-1} \right)
  + y \left[ (1 + x) e^{-x} - \left( 1 - x + 2 x e^{-1} \right) \right]
GRAPHICAL
REPRESENTATION

The exact analytic solution is:

\Psi(x, y) = e^{-x}\left( x + y^3 \right)
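As a quick consistency check (SymPy assumed), one can verify that this exact solution satisfies the PDE and the stated boundary conditions:

import sympy as sp

x, y = sp.symbols('x y')
Psi = sp.exp(-x) * (x + y**3)

# Laplacian of the exact solution minus the PDE right-hand side should vanish
lhs = sp.diff(Psi, x, 2) + sp.diff(Psi, y, 2)
rhs = sp.exp(-x) * (x - 2 + y**3 + 6*y)
print(sp.simplify(lhs - rhs))                          # 0

# Boundary conditions at x = 0 and x = 1
print(sp.simplify(Psi.subs(x, 0) - y**3))              # 0
print(sp.simplify(Psi.subs(x, 1) - (1 + y**3)/sp.E))   # 0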
GRAPHS &
COMPARISON
Neural Solution accuracy
Plot Points: Training Points

Ψ(x, y) − Ψ_M(x, y)
GRAPHS &
COMPARISON
Neural Solution accuracy
Plot Points: Test Points

Ψ(x, y) − Ψ_M(x, y)
GRAPHS &
COMPARISON

Finite Element Solution accuracy


Plot Points: Training Points

Ψ(x, y) − Ψ_FE(x, y)
GRAPHS &
COMPARISON

Finite Element Solution accuracy


Plot Points: Test Points

Ψ(x, y) − Ψ_FE(x, y)
PERFORMANCE

• Highly Accurate Solution


(even with few training points)
• Uniform “Error” Distribution
• Superior Interpolation
Properties
The model solution is very flexible and can
easily be enhanced to offer even higher accuracy.
EIGEN VALUE
PROBLEMS

Problem: LΨ(x) = λΨ(x)
With appropriate Dirichlet BC

The model is the same as before.


However the “Error” is defined as:
\frac{\sum_{i=1}^{n} \left[ L\Psi_M(x_i) - \lambda \Psi_M(x_i) \right]^2}{\sum_{i=1}^{n} \left[ \Psi_M(x_i) \right]^2}
EIGEN VALUE
PROBLEMS

Where:

\lambda = \frac{\sum_{i=1}^{n} \Psi_M(x_i)\, L\Psi_M(x_i)}{\sum_{i=1}^{n} \left[ \Psi_M(x_i) \right]^2}

i.e. the value for which the “Error” is minimum.

Problems of that kind are often


encountered in Quantum Mechanics.
(Schrödinger’s equation)
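A sketch of these two quantities over the training points (NumPy assumed; psi and L_psi are hypothetical arrays holding Ψ_M(x_i) and LΨ_M(x_i)):

import numpy as np

def eigen_lambda(psi, L_psi):
    # lambda = sum(Psi_M * L Psi_M) / sum(Psi_M^2)
    return np.dot(psi, L_psi) / np.dot(psi, psi)

def eigen_error(psi, L_psi):
    # Normalized residual: sum([L Psi_M - lambda Psi_M]^2) / sum(Psi_M^2)
    lam = eigen_lambda(psi, L_psi)
    return np.sum((L_psi - lam * psi) ** 2) / np.dot(psi, psi)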
EXAMPLES
The non-local Schrödinger equation

-\frac{\hbar^2}{2\mu} \frac{d^2\psi(r)}{dr^2} + V(r)\,\psi(r) + \int_0^{\infty} K_0(r, r')\,\psi(r')\,dr' = \varepsilon\,\psi(r)

\psi(0) = 0, \qquad \psi(r) \sim e^{-kr}, \; k > 0

Describes the bound “n+α” system in the
framework of the Resonating Group Method.

Model:
\psi_M(r) = r\, e^{-br}\, N(r), \quad b > 0

Where:
N(r) = \sum_{j=1}^{nodes} v_j\, \sigma(w_j r + u_j)

is a single hidden layer, sigmoidal Perceptron.


OBTAINING
EIGENVALUES
Example: The Hénon-Heiles potential

-\frac{1}{2}\left( \frac{\partial^2\Psi}{\partial x^2} + \frac{\partial^2\Psi}{\partial y^2} \right) + \left[ \frac{1}{2}\left( x^2 + y^2 \right) + \frac{1}{4\sqrt{5}}\left( x y^2 - \frac{1}{3} x^3 \right) \right] \Psi = \varepsilon \Psi

Asymptotic behavior: \Psi(x, y) \sim e^{-k (x^2 + y^2)}

Model used: \Psi_M(x, y) = e^{-b (x^2 + y^2)}\, N(x, y)

Use the above model to obtain an eigen solution Φ.
Obtain a different eigen solution by deflation, i.e.:

\tilde{\Psi}_M(x, y) = \Psi_M(x, y) - \Phi(x, y) \iint \Phi(x', y')\, \Psi_M(x', y')\, dx'\, dy'

This model is orthogonal to Φ(x,y) by construction.


The procedure can be applied repeatedly.
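A sketch of that deflation step using a crude grid approximation of the overlap integral (NumPy assumed; Psi_M and Phi are hypothetical, vectorized callables for the current model and the previously found eigenfunction):

import numpy as np

def deflate(Psi_M, Phi, xs, ys):
    # Approximate  overlap = integral of Phi(x', y') Psi_M(x', y') dx' dy'
    # by a Riemann sum on a rectangular grid, then subtract the projection.
    X, Y = np.meshgrid(xs, ys)
    dx, dy = xs[1] - xs[0], ys[1] - ys[0]
    overlap = np.sum(Phi(X, Y) * Psi_M(X, Y)) * dx * dy
    return lambda x, y: Psi_M(x, y) - Phi(x, y) * overlap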
ARBITRARILY
SHAPED DOMAINS

For domains other than Hypercubes the


BC cannot be embedded in the model.

Let R_i, i = 1, 2, ..., m be the set of points
defining the arbitrarily shaped boundary.

The BC are then: Ψ(R_i) = b_i, ∀ i = 1, 2, ..., m

Let r_i, i = 1, 2, ..., n be the set of
training points inside the domain.

We describe two ways to proceed in
solving the LΨ(x) = f(x) problem.
OPTIMIZATION
WITH CONSTRAINTS

Model: Ψ_M(x) = N(x)

“Error” to be minimized:
Domain terms + Boundary terms

\sum_{i=1}^{n} \left[ L\Psi_M(r_i) - f(r_i) \right]^2 + \beta \sum_{i=1}^{m} \left[ \Psi_M(R_i) - b_i \right]^2

With β a penalty parameter that controls
the degree of satisfaction of the BC.
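A sketch of this penalized objective in plain Python (operator_residual and model are hypothetical callables evaluating LΨ_M(r_i) − f(r_i) and Ψ_M(R_i)):

def penalty_error(params, interior, boundary, bc_values, beta,
                  operator_residual, model):
    # Domain terms: squared PDE residual at the interior training points r_i
    domain = sum(operator_residual(params, r) ** 2 for r in interior)
    # Boundary terms: squared BC violation at the boundary points R_i
    bc = sum((model(params, R) - b) ** 2 for R, b in zip(boundary, bc_values))
    return domain + beta * bc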
PERCEPTRON-RBF
SYNERGY
Model:
\Psi_M(x) = N(x) + \sum_{i=1}^{m} a_i\, e^{-\lambda \| x - R_i \|^2}

Where the a_i are determined so that the
model satisfies the BC exactly, i.e.:

\sum_{k=1}^{m} a_k\, e^{-\lambda \| R_i - R_k \|^2} = b_i - N(R_i), \quad i = 1, 2, \ldots, m

The free parameter λ is chosen once,
initially, so that the system above is
easily solved.

“Error”:
\sum_{i=1}^{n} \left[ L\Psi_M(r_i) - f(r_i) \right]^2
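A sketch of determining the coefficients a_k by solving that linear system (NumPy assumed; N_vals is a hypothetical array holding the network values N(R_i)):

import numpy as np

def rbf_coefficients(R, b, N_vals, lam):
    # Solve  sum_k a_k exp(-lam ||R_i - R_k||^2) = b_i - N(R_i)  for the a_k.
    # R: (m, d) boundary points, b: (m,) BC values, N_vals: (m,) network values.
    diff = R[:, None, :] - R[None, :, :]             # pairwise differences
    A = np.exp(-lam * np.sum(diff ** 2, axis=-1))    # Gaussian RBF matrix
    return np.linalg.solve(A, b - N_vals)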
Pros & Cons ...
The RBF - Synergy is:
• Computationally costly. A linear
system is solved each time the model is
evaluated.
• Exact in satisfying the BC.

The Penalty method is:


• Approximate in satisfying the BC.

• Computationally efficient
IN PRACTICE . . .

• Initially proceed via the penalty method,


till an approximate solution is found.
• Refine the solution, using the RBF-
Synergy method, to satisfy the BC exactly.

Conclusions:
Experiments on several model
problems show performance
similar to that reported earlier.
GENERAL
OBSERVATIONS

Enhanced generalization performance is
achieved when the exponential weights of the
Neural Networks are kept small.
Hence box-constrained optimization
methods should be applied.
Bigger Networks (greater number of nodes)
can achieve higher accuracy.
This favors the use of:
• Existing Specialized Hardware
• Sophisticated Optimization Software
MERLIN 3.0
What is it? A software package offering
many optimization algorithms
and a friendly user interface.

What problems does it solve ?


Find a local minimum of the function:
f(x), \quad x \in \mathbb{R}^N, \quad x = (x_1, x_2, \ldots, x_N)

Under the conditions:
x_i \in [l_i, u_i], \quad \forall\, i = 1, 2, \ldots, N
ALGORITHMS
Direct Methods
• SIMPLEX
• ROLL
Gradient Methods

Conjugate Gradient:
• Polak-Ribiere
• Fletcher-Reeves
• Generalized P&R

Quasi Newton:
• BFGS (3 versions)
• DFP

Levenberg-Marquardt
• For Sum-Of-Squares
THE USER’S PART

What does the user have to do?


• Program the objective function
• Use Merlin to find an optimum

What might the user want to do?


• Program the gradient
• Program the Hessian
• Program the Jacobian
MERLIN
FEATURES & TOOLS
• Intuitive free-format I/O
• Menu assisted Input
• On-line HELP
• Several gradient modes
• Confidence parameter intervals
• Box constraints
• Postscript graphs
• Programmability
• “Open” to user enhancements
MCL: Merlin Control Language

What is it?
A High-Level Programming Language
that Drives Merlin Intelligently.

What are the benefits ?


• Abolishes User Intervention.
• Optimization Strategies.
• Handy Utilities.
• Global Optimum Seeking Methods.
MCL
REPERTOIRE
MCL command types:
• Merlin Commands
• Conditionals (IF-THEN-ELSE-ENDIF)
• Loops (DO type of loops)
• Branching (GO TO type)
• I/O (READ/WRITE)

MCL intrinsic variables:


All Merlin important variables, e.g.:
Parameters, Value, Gradient, Bounds ...
SAMPLE
MCL PROGRAM
program
var i; sml; bfgs_calls; nfix; max_calls
sml = 1.e-4 % Gradient threshold.
bfgs_calls = 1000 % Number of BFGS calls.
max_calls = 10000 % Max. calls to spend.
again:
loosall
nfix = 0
loop i from 1 to dim
if abs[grad[i]] <= sml then
fix (x.i)
nfix = nfix+1
end if
end loop
if nfix == dim then
display 'Gradient below threshold...'
loosall
finish
end if
bfgs (noc=bfgs_calls)
when pcount < max_calls just move to again
display 'We probably failed...'
end
MERLIN-MCL
Availability
The Merlin - MCL package is
written in ANSI Fortran 77 and
can be downloaded from the
following URL:

http://nrt.cs.uoi.gr/merlin/
It is maintained, supported, and
FREELY available to the
scientific community.
FUTURE
DEVELOPMENTS

• Optimal Training Point Sets


• Optimal Network Architecture
• Expansion & Pruning Techniques

Hardware Implementation on
NEUROPROCESSORS
