
Editors: Radim Briš, Václav Snášel, Chu Duc Khanh, Phan Dao

Applied Mathematics in Engineering and Reliability contains papers presented at the International
Conference on Applied Mathematics in Engineering and Reliability (ICAMER 2016, Ho Chi Minh City,
Viet Nam, 4-6 May 2016). The book covers a wide range of topics within mathematics applied in
reliability, risk and engineering, including:

Risk and Reliability Analysis Methods
Maintenance Optimization
Bayesian Methods
Monte Carlo Methods for Parallel Computing of Reliability and Risk
Advanced Mathematical Methods in Engineering
Methods for Solutions of Nonlinear Partial Differential Equations
Statistics and Applied Statistics, etc.

The application areas range from Nuclear, Mechanical and Electrical Engineering to Information
Technology and Communication, Safety Engineering, Environmental Engineering, Finance to Health
and Medicine. The papers cover both theory and applications, and are focused on a wide range of
sectors and problem areas. Integral demonstrations of the use of reliability and engineering
mathematics are provided in many practical applications concerning major technological systems
and structures.

Applied Mathematics in Engineering and Reliability will be of interest to academics and professionals
working in a wide range of industrial, governmental and academic sectors, including Electrical and
Electronic Engineering, Safety Engineering, Information Technology and Telecommunications, Civil
Engineering, Energy Production, Infrastructures, Insurance and Finance, Manufacturing, Mechanical
Engineering, Natural Hazards, Nuclear Engineering, Transportation, and Policy Making.

APPLIED MATHEMATICS IN ENGINEERING AND RELIABILITY

AMER16-FM.indd i 3/15/2016 2:12:24 PM


PROCEEDINGS OF THE 1ST INTERNATIONAL CONFERENCE ON APPLIED MATHEMATICS IN
ENGINEERING AND RELIABILITY (ICAMER 2016), HO CHI MINH CITY, VIETNAM, 4-6 MAY
2016

Applied Mathematics in Engineering


and Reliability

Editors
Radim Briš & Václav Snášel
Faculty of Electrical Engineering and Computer Science
VŠB – Technical University of Ostrava, Czech Republic

Chu Duc Khanh


Faculty of Mathematics and Statistics, Ton Duc Thang University, Vietnam

Phan Dao
European Cooperation Center, Ton Duc Thang University, Vietnam



CRC Press/Balkema is an imprint of the Taylor & Francis Group, an informa business

© 2016 Taylor & Francis Group, London, UK

Typeset by V Publishing Solutions Pvt Ltd., Chennai, India


Printed and bound in Great Britain by CPI Group (UK) Ltd, Croydon, CR0 4YY

All rights reserved. No part of this publication or the information contained herein may be reproduced,
stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical,
by photocopying, recording or otherwise, without prior written permission from the publisher.

Although all care is taken to ensure the integrity and quality of this publication and the information
herein, no responsibility is assumed by the publishers or the authors for any damage to property or
persons as a result of operation or use of this publication and/or the information contained herein.

Published by: CRC Press/Balkema


P.O. Box 11320, 2301 EH Leiden, The Netherlands
e-mail: Pub.NL@taylorandfrancis.com
www.crcpress.com www.taylorandfrancis.com

ISBN: 978-1-138-02928-6 (Hbk)


ISBN: 978-1-315-64165-2 (eBook PDF)



Applied Mathematics in Engineering and Reliability – Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

Table of contents

Preface ix
Organization xi
Message from Professor Vinh Danh Le xiii
Introduction xv

Applied mathematics in reliability engineering


Bayesian methods, Bayesian reliability
A conjugate prior distribution for Bayesian analysis of the Power-Law Process 3
V.C. Do & E. Gouno
Bayesian approach to estimate the mixture of failure rate model 9
R. Briš & T.T. Thach
Cost oriented statistical decision problem in acceptance sampling and quality control 19
R. Briš
High-dimensional simulation experiments with particle filter
and ensemble Kalman filter 27
P. Bui Quang & V.-D. Tran
The Prior Probability in classifying two populations by Bayesian method 35
V.V. Tai, C.N. Ha & N.T. Thao

Efficient methods to solve optimization problems


Estimation of parameters of Rikitake systems by SOMA 43
T.D. Nguyen & T.T.D. Phan
Clustering for probability density functions based on Genetic Algorithm 51
V.V. Tai, N.T. Thao & C.N. Ha
Optimization of truss structures with reliability-based frequency constraints under
uncertainties of loadings and material properties 59
V. Ho-Huu, T. Vo-Duy, T. Nguyen-Thoi & L. Ho-Nhat
Optimum revenue calculation method to generate competitive hydroelectric power
on Hua Na hydropower 67
P.T.H. Long, L.Q. Hung & P. Dao

Maintenance modelling and optimization


A dynamic grouping model for the maintenance planning of complex structure systems
with consideration of maintenance durations 73
H.C. Vu, A. Barros, M.A. Lundteigen & P. Do
A parametric predictive maintenance decision framework considering the system
health prognosis accuracy 81
K.T. Huynh, A. Grall & C. Bérenguer



Condition-based maintenance by minimax criteria 91
O. Abramov & D. Nazarov
Impact of cost uncertainty on periodic replacement policy with budget constraint:
Application to water pipe renovation 95
K.T.P. Nguyen, D.T. Tran & B.M. Nguyen

Monte Carlo methods for parallel computing of reliability and risk


Acceleration of multi-factor Merton model Monte Carlo simulation via Importance
Sampling and GPU parallelization 107
M. Béreš & R. Briš
Highly reliable systems simulation accelerated using CPU and GPU parallel computing 119
S. Domesová & R. Briš

Network and wireless network reliability


Advanced protocol for wireless information and power transfer in full duplex DF
relaying networks 133
X.-X. Nguyen, D.-T. Pham, T.-Q. Nguyen & D.-T. Do
A performance analysis in energy harvesting full-duplex relay 139
T.N. Kieu, T.N. Hoang, T.-Q.T. Nguyen, H.H. Duy, D.-T. Do & M. Vozňák
A stochastic model for performance analysis of powered wireless networks 145
N.D. Ut & D.-T. Do
Energy harvesting in amplify-and-forward relaying systems with
interference at the relay 153
T.-L. Nguyen & D.-T. Do
On a tandem queueing network with breakdowns 159
A. Aissani

Risk and hazard analysis


Risk assessment of biogas plants 169
K. Derychova & A. Bernatik
Verification of the design for forced smoke and heat removal from a sports hall 177
P. Kučera & H. Dvorská

Stochastic reliability modelling, applications of stochastic processes


The methods of parametric synthesis on the basis of acceptability region
discrete approximation 187
Y. Katueva & D. Nazarov
Inference for a one-memory Self-Exciting Point Process 193
E. Gouno & R. Damaj

System reliability analysis


House events matrix for application in shutdown probabilistic safety assessment 201
M. Čepin
Imprecise system reliability using the survival signature 207
F.P.A. Coolen, T. Coolen-Maturi, L.J.M. Aslett & G. Walter
Parallel algorithms of system availability evaluation 215
M. Kvassay, V. Levashenko & E. Zaitseva




Advanced mathematical methods for engineering
Advanced methods to solve partial differential equations
A note on the Cauchy-Riemann equation on a class of convex domains
of finite and infinite type in C² 225
L.K. Ha
A review on global and non-global existence of solutions of source types of degenerate
parabolic equations with a singular absorption: Complete quenching phenomenon 231
D.N. Anh, K.H. Van & J.I. Díaz
A spectral decomposition in vector spherical harmonics for Stokes equations 237
M.-P. Tran & T.-N. Nguyen
On convergence result for a finite element scheme of Landau-Lifschitz-Gilbert equation 243
M.-P. Tran
Some results on the viscous Cahn-Hilliard equation in R^N 249
L.T.T. Bui & N.A. Dao

Inverse problems
On a multi-dimensional initial inverse heat problem with a time-dependent coefficient 255
C.D. Khanh & N.H. Tuan

Advanced numerical methods


A study of Boundary Element Method for 3D homogeneous Helmholtz equation
with Dirichlet boundary condition 271
M.-P. Tran & V.-K. Huynh
A study of stochastic FEM method for porous media flow problem 281
R. Blaheta, M. Béreš & S. Domesová
Parallel resolution of the Schur interface problem using the Conjugate gradient method 291
M.-P. Tran

Statistics and applied statistics


Determining results of the Phadiatop test using logistic regression and contingency
tables based on a patient's anamnesis 303
P. Kuráňová
On the distortion risk measure using copulas 309
S. Ly, U.H. Pham & R. Briš
On the performance of sequential procedures for detecting a change,
and Information Quality (InfoQ) 317
R.S. Kenett
Two-factor hypothesis testing using the Gamma model 327
N. Pal, N.T.T. Tran & M.-P. Tran

Author index 335





Preface

ICAMER (International Conference on Applied Mathematics in Engineering and Reliability) is the first
conference on this topic in Vietnam, promoted by the following institutions: Ton Duc Thang University
in Ho Chi Minh City, VŠB – Technical University of Ostrava, and the European Safety and Reliability
Association (ESRA). The Conference attracts a broad international community, a good mix of
academic and industry participants who present and discuss subjects of interest and application across
various industries.
The main theme of the Conference is Applied Mathematics in Reliability and Engineering. The Confer-
ence covers a number of topics within mathematics applied in reliability, risk and engineering, including
risk and reliability analysis methods, maintenance optimization, Bayesian methods, methods to solve
nonlinear differential equations, etc. The application areas range from nuclear engineering, mechanical
engineering and electrical engineering to information technology and communication, safety engineering,
finance and health. The Conference provides a forum for presentation and discussion of scientific papers
covering theory, methods and applications to a wide range of sectors and problem areas. Integral demon-
strations of the use of reliability and engineering mathematics were provided in many practical applica-
tions concerning major technological systems and structures.
The ICAMER Conference is organized for the first time in Vietnam, and Ho Chi Minh City has been
selected as the venue. Ho Chi Minh City, one of the biggest cities in Vietnam, lies in the southern part
of the country and ranks amongst the most impressive and modern cities in the world. The city has
always played an important part in the history of the country.
The host of the conference is Ton Duc Thang University, in close cooperation with VŠB – Technical
University of Ostrava and ESRA. Ton Duc Thang University and VŠB – Technical University of Ostrava
rank among the top technical universities in their respective countries. They develop traditional branches
of industry such as metallurgy, material engineering, mechanical, electrical, civil and safety engineering,
economics, computer science, automation, environmental engineering and transportation. Research and
development activities of these universities are crucial for the restructuring process in both countries and
the corresponding regions.
The program of the Conference includes around 50 papers from prestigious authors coming from
all over the world. Originally, about 72 abstracts were submitted. After review of the full papers by the
Technical Programme Committee, 40 were selected for inclusion in these Proceedings. The work
and effort of the peers involved in the Technical Program Committee in helping the authors to improve
their papers are greatly appreciated.
We thank the authors as well as the reviewers for their contributions to this process. The review process
has been conducted electronically through the Conference webpage.
Finally we would like to acknowledge the local organizing committee for their careful planning of the
practical arrangements.

Radim Briš, Václav Snášel,
Chu Duc Khanh and Phan Dao
Editors





Organization

HONORARY CHAIRS

Prof. Le Vinh Danh


President of Ton Duc Thang University, Vietnam
Prof. Ivo Vondrák
President of VŠB – Technical University of Ostrava, Czech Republic
Prof. Terje Aven
President of ESRA, University of Stavanger, Norway

CONFERENCE CHAIRMAN

Radim Briš, VŠB – Technical University of Ostrava, Czech Republic

CONFERENCE CO-CHAIRMEN

Václav Snášel, VŠB – Technical University of Ostrava, Czech Republic


Chu Duc Khanh, Ton Duc Thang University, Vietnam
Phan Dao, Ton Duc Thang University, Vietnam

ORGANIZING INSTITUTIONS

Faculty of Mathematics and Statistics, Ton Duc Thang University, Vietnam


Faculty of Electrical Engineering and Computer Science, VŠB – Technical University of Ostrava,
Czech Republic
European Safety and Reliability Association
European Cooperation Center, Ton Duc Thang University, Vietnam

INTERNATIONAL CONFERENCE COMMITTEE

John Andrews, ESRA TC Chair, The University of Nottingham, UK


Christophe Bérenguer, ESRA TC Chair, Grenoble Institute of Technology, France
Radim Briš, Vice-Chairman of ESRA, VŠB – Technical University of Ostrava, Czech Republic
Marko Čepin, ESRA TC Chair, University of Ljubljana, Slovenia
Éric Châtelet, Troyes University of Technology, France
Frank Coolen, Durham University, UK
Phan Dao, Ton Duc Thang University, Vietnam
Tran Trong Dao, Ton Duc Thang University, Vietnam
Jesús Ildefonso Díaz, Complutense University of Madrid, Spain
Vo Hoang Duy, Ton Duc Thang University, Vietnam
Antoine Grall, Chairman of ESRA Committee for Conferences, Troyes University of Technology, France
Dang Dinh Hai, University of Mississippi, USA
Nguyen Thanh Hien, Ton Duc Thang University, Vietnam
Chu Duc Khanh, Ton Duc Thang University, Vietnam




Krzysztof Kołowrocki, Past Chairman of ESRA Committee for Conferences, Gdynia Maritime University,
Poland
Jan Kracík, VŠB – Technical University of Ostrava, Czech Republic
Miroslav Vozňák, VŠB – Technical University of Ostrava, Czech Republic
Vitaly Levashenko, University of Zilina, Slovakia
Gregory Levitin, ESRA TC Chair, The Israel Electric Corporation, Israel
Phan Tran Hong Long, Water Resources University, Vietnam
Le Van Nghi, Key Laboratory of River and Coastal Engineering, Vietnam
Nabendu Pal, Ton Duc Thang University, Vietnam and University Louisiana, Lafayette, USA
Do Phuc, Lorraine University, France
Pavel Praks, VŠB – Technical University of Ostrava, Czech Republic
Joanna Soszyńska-Budny, Gdynia Maritime University, Poland
Do Dinh Thuan, HCMC University of Technology and Education, Vietnam
Nguyen Thoi Trung, Ton Duc Thang University, Vietnam
David Vališ, University of Defence, Brno, Czech Republic
Elena Zaitseva, ESRA TC Chair, University of Zilina, Slovakia

LOCAL ORGANIZING COMMITTEE

Tran Trong Dao, Ton Duc Thang University, Vietnam


Trinh Minh Huyen, Ton Duc Thang University, Vietnam
Phan Dao, Ton Duc Thang University, Vietnam
Chu Duc Khanh, Ton Duc Thang University, Vietnam
Vo Hoang Duy, Ton Duc Thang University, Vietnam

SECRETARY OF THE CONFERENCE

Dao Nguyen Anh, Ton Duc Thang University, Vietnam


E-mail: icamer@tdt.edu.vn

SPONSORED BY

Ton Duc Thang University, Vietnam


VŠB – Technical University of Ostrava, Czech Republic
European Safety and Reliability Association





Message from Professor Vinh Danh Le

Welcome to the 1st International Conference on Applied Mathematics in Engineering and Reliability
(ICAMER 2016), held at Ton Duc Thang University, Vietnam. This Conference aims to offer a forum
for scientists, researchers, and managers from universities and companies to share their research findings
and experiences in the field. In recognition of its special meaning and broad influence, we consider the
organization of this Conference one of the strategic activities in our three-decade development into an
applied research university.
Ton Duc Thang University (TDTU) has always described itself as a young, inspiring and dynamically
growing higher education institution in vibrant Ho Chi Minh City.
TDTU is steadily growing to meet the expanding demand for higher education as well as high-quality
human resources in Vietnam. With fifteen faculties and around 25,000 students, the University is now
ranked among the largest and fastest growing universities in Vietnam in all aspects.
On behalf of TDTU, the host institution of ICAMER 2016, I would like to express my sincere
appreciation to our great partners, the European Safety and Reliability Association (ESRA) and
VŠB – Technical University of Ostrava (Czech Republic), for their great efforts in organizing this Conference. I would
also like to send my special thanks to conference committees, track chairs, reviewers, speakers and authors
around the world for their contributions to and interest in our event.
I believe that you will have an interesting and fruitful conference in Vietnam. I really look forward to
welcoming all of you at our campus and hope that this Conference will start a long-term partnership
between you and our university.

February 2016

Prof. Vinh Danh Le, Ph.D.


President
Ton Duc Thang University, Vietnam





Introduction

The Conference covers a number of topics within engineering and mathematics. The Conference is
especially focused on advanced engineering mathematics which is frequently used in reliability, risk and
safety technologies.

I APPLIED MATHEMATICS IN RELIABILITY ENGINEERING

Bayesian Methods, Bayesian Reliability


Efficient Methods to Solve Optimization Problems
Maintenance Modelling and Optimization
Monte Carlo Methods for Parallel Computing of Reliability and Risk
Network and Wireless Network Reliability
Risk and Hazard Analysis
Stochastic Reliability Modelling, Applications of Stochastic Processes
System Reliability Analysis

II ADVANCED MATHEMATICAL METHODS FOR ENGINEERING

Advanced Methods to Solve Partial Differential Equations


Inverse Problems
Advanced Numerical Methods
Statistics and Applied Statistics




Applied mathematics in reliability engineering

Bayesian methods, Bayesian reliability

AMER16_Book.indb 1 3/15/2016 11:22:22 AM



A conjugate prior distribution for Bayesian analysis
of the Power-Law Process

V.C. Do & E. Gouno
Laboratoire de Mathématiques de Bretagne Atlantique, Université de Bretagne Sud, France

ABSTRACT: This paper focuses on the Bayesian analysis of the Power-Law Process. We investigate the
possibility of a natural conjugate prior. Relying on the work of Huang and Bier (1998), we introduce and
study the H-B distribution. This distribution is a natural conjugate prior since the posterior distribution
is again an H-B distribution. We describe a strategy to elicit the prior distribution parameters. Results on
simulated and real data are displayed.

1 INTRODUCTION

The Power-Law Process (PLP) is a non-homogeneous Poisson process {N(t), t ≥ 0} with a power-law
intensity m(t) = βt^(β−1)/θ^β, β > 0, θ > 0. The literature on this process is abundant. It has been
widely used to describe repairable systems, in software reliability, in reliability growth, etc. Inference
was carried out by many authors from a frequentist and a Bayesian perspective. Choosing the
prior distribution is an important matter. Guida et al. (1989) propose different choices: a joint
non-informative prior of the form (θβ)^(−1), a uniform distribution for β and 1/θ for θ. Then,
considering a gamma prior distribution on m(t), the expected number of failures, they express a
distribution for θ given β. Bar-Lev et al. (1992) consider a joint prior for (θ, β) of the form (θβ)^(−1).
They obtain a chi-square distribution for the posterior distribution of β but a cumbersome expression
for the posterior distribution of θ. Sen and Khattree (1998) study specifically the Bayesian estimator
of m(t) considering different loss functions. Our purpose here is to investigate a conjugate prior for
the Bayesian analysis of the PLP. This problem has already been addressed by Huang and Bier (1998)
and Oliveira & Gilardoni (2012). In Section 2, we define a 4-parameter distribution that we name
the H-B distribution (for Huang-Bier distribution). Properties of this distribution are given and it is
shown that this distribution is a natural conjugate prior for the Bayesian analysis of the PLP. The
Bayes estimates are then obtained and we suggest a technique to elicit the parameters of the prior
distribution. This technique is very attractive and simple since the practitioner only has to give a prior
guess on β and a standard deviation associated with this guess. To end with, we apply the method on
simulated data and on data from an aircraft generator.

2 THE HUANG-BIER DISTRIBUTION

In this section, we introduce a new bivariate distribution. This distribution has four parameters. One
of its components has a gamma distribution while the marginal distribution of the other one does not
have an explicit expression. However, the expectation and the variance of each component can be
obtained. First of all, we give the definition of the H-B distribution.

Definition 1. A bivariate r.v. (X, Y) ∈ R⁺ × R⁺ has a Huang-Bier distribution with parameters
(a, b, c, d), where a, b, c, d > 0 and such that c < d^a, if it has a p.d.f. of the form:

f_{X,Y}(x, y) = K (xy)^(a−1) c^y exp{−b d^y x}    (1)

where K = [b log(d^a/c)]^a / Γ(a)².

We denote (X, Y) ~ HB(a, b, c, d). Figure 1 displays an H-B density with parameters
(1.5, 5, 0.5, 1). As mentioned before, the marginal distribution of X cannot be obtained explicitly.
The following theorem provides the conditional distribution of X.

Theorem 1. Let (X, Y) ~ HB(a, b, c, d). Then

i. X given Y = y has a gamma distribution with parameters (a, b d^y),
ii. Y has a gamma distribution with parameters (a, log(d^a/c)).

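As a numerical companion to the PLP model above, the following sketch (our own illustration; function names are ours) simulates jump dates of a PLP via the standard time-transformation property (the transformed dates λT_i^β form a unit-rate Poisson process) and recovers β by the maximum-likelihood formula recalled later in the paper:

```python
import math
import random

def simulate_plp(lam, beta, n, rng):
    """Return the first n jump dates of a PLP with intensity
    m(t) = lam * beta * t**(beta - 1): the cumulative intensity is
    lam * t**beta, so inverting unit-rate Poisson arrivals gives the dates."""
    s, times = 0.0, []
    for _ in range(n):
        s += rng.expovariate(1.0)          # unit-rate Poisson arrival
        times.append((s / lam) ** (1.0 / beta))
    return times

def plp_mle(times):
    """Maximum-likelihood estimates of (beta, lam) for failure-truncated data."""
    n, tn = len(times), times[-1]
    beta_hat = n / sum(math.log(tn / t) for t in times)
    lam_hat = n / tn ** beta_hat
    return beta_hat, lam_hat

rng = random.Random(42)
times = simulate_plp(lam=0.0008, beta=1.38, n=2000, rng=rng)
beta_hat, lam_hat = plp_mle(times)
print(beta_hat)  # close to the true beta = 1.38
```

The parameter values mirror those used in the paper's simulation study.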


Figure 1. Probability density function for the HB distribution with a = 1.5, b = 5, c = 0.5 and d = 1.
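As a quick sanity check of the density (1) (an illustration of ours, not from the paper), one can verify numerically that it integrates to one for the Figure 1 parameters:

```python
import math

def hb_pdf(x, y, a, b, c, d):
    """Huang-Bier density (1), with K = [b*log(d**a/c)]**a / Gamma(a)**2."""
    K = (b * math.log(d ** a / c)) ** a / math.gamma(a) ** 2
    return K * (x * y) ** (a - 1) * c ** y * math.exp(-b * d ** y * x)

# Midpoint-rule integration over a box large enough to capture the mass
# for a = 1.5, b = 5, c = 0.5, d = 1 (the Figure 1 parameters).
a, b, c, d = 1.5, 5.0, 0.5, 1.0
hx, hy = 0.02, 0.02
total = sum(hb_pdf((i + 0.5) * hx, (j + 0.5) * hy, a, b, c, d) * hx * hy
            for i in range(400) for j in range(1500))
print(round(total, 3))  # approximately 1.0
```

The integration box (x up to 8, y up to 30) is chosen so that the truncated tails are negligible for these parameters.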

Proof. To prove (ii), we compute the marginal density of Y:

f_Y(y) = ∫₀^∞ K (xy)^(a−1) c^y exp{−b d^y x} dx
       = K y^(a−1) c^y ∫₀^∞ x^(a−1) exp{−b d^y x} dx
       = K y^(a−1) c^y Γ(a)/(b d^y)^a
       = [log(d^a/c)]^a / Γ(a) · y^(a−1) exp{−log(d^a/c) y}.

Therefore Y has a gamma distribution with parameters (a, log(d^a/c)).
To prove (i) we write:

f_{X|Y}(x) = f_{X,Y}(x, y) / f_Y(y)
           = K (xy)^(a−1) c^y exp{−b d^y x} / [K y^(a−1) c^y Γ(a)/(b d^y)^a]
           = (b d^y)^a / Γ(a) · x^(a−1) exp{−b d^y x}.

Thus X | Y = y has a gamma distribution with parameters (a, b d^y).

The previous theorem allows us to compute the expectation and the variance of X and Y.
Let k = log(d^a/c). We have

E(Y) = a/k    (2)

and

Var(Y) = a/k².    (3)

To compute E(X) we consider the conditional expectation and compute E[E(X | Y)] to obtain:

E(X) = (a/b) [k/(k + log d)]^a.    (4)

A similar reasoning provides:

Var(X) = (a/b²) [(a + 1)(k/(k + 2 log d))^a − a (k/(k + log d))^(2a)].    (5)

It is interesting to remark that when d = 1, X and Y are independent and have gamma
distributions. We have the following theorem:

Theorem 2. Let (X, Y) ~ HB(a, b, c, 1). Then X and Y are independent, X has a gamma
distribution with parameters (a, b) and Y has a gamma distribution with parameters (a, log(1/c)).

Proof. When d = 1,

f_{X,Y}(x, y) = K (xy)^(a−1) c^y exp{−b x}

where K = [b log(1/c)]^a / Γ(a)². Clearly,

f_{X,Y}(x, y) = [b^a/Γ(a) · x^(a−1) exp{−b x}] · [[log(1/c)]^a/Γ(a) · y^(a−1) exp{−log(1/c) y}]
              = f_X(x) f_Y(y).

Note that in this case, the expectations and the variances are easily obtained.

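The conditional structure in Theorem 1 also gives a direct way to sample from the H-B distribution, which can be used to check the moment formulas (2)-(5) numerically (our own illustration, with arbitrarily chosen parameters):

```python
import math
import random

def hb_sample(a, b, c, d, rng):
    """Draw (X, Y) ~ HB(a, b, c, d) using Theorem 1:
    Y ~ Gamma(a, rate k) with k = log(d**a / c),
    then X | Y = y ~ Gamma(a, rate b * d**y)."""
    k = math.log(d ** a / c)
    y = rng.gammavariate(a, 1.0 / k)               # gammavariate takes a scale
    x = rng.gammavariate(a, 1.0 / (b * d ** y))
    return x, y

a, b, c, d = 2.0, 1.0, 0.5, 2.0
k = math.log(d ** a / c)                            # = log(8)
rng = random.Random(7)
samples = [hb_sample(a, b, c, d, rng) for _ in range(100_000)]
mean_x = sum(s[0] for s in samples) / len(samples)
mean_y = sum(s[1] for s in samples) / len(samples)

print(mean_y, a / k)                                # compare with E(Y) = a/k, eq. (2)
print(mean_x, (a / b) * (k / (k + math.log(d))) ** a)  # compare with eq. (4)
```

The sample means agree with the closed-form expectations to Monte Carlo accuracy.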


We have E(X) = a/b, Var(X) = a/b², E(Y) = a/log(1/c) and Var(Y) = a/[log(1/c)]².

3 CONJUGATE PRIOR

We now consider the Bayesian inference for the PLP. We reparametrize the intensity setting
1/θ^β = λ, so that m(t) = λβt^(β−1). A prior distribution for (λ, β) is then needed. The
H-B distribution is a natural conjugate prior.

Theorem 3. Let t = (t₁, …, tₙ) be the jump dates of a Power-Law Process with intensity
λβt^(β−1). Then a Huang-Bier distribution with parameters (a, b, c, tₙ) is a natural conjugate
prior and the posterior distribution is a Huang-Bier distribution with parameters
(n + a, b + 1, c ∏_{i=1}^n tᵢ, tₙ).

Proof. We have

f(t | λ, β) = λⁿ βⁿ (∏_{i=1}^n tᵢ)^(β−1) exp{−λ tₙ^β}.

Let us consider an HB(a, b, c, tₙ) distribution as the joint prior:

π(λ, β) ∝ (λβ)^(a−1) c^β exp{−b tₙ^β λ}.

The posterior distribution is:

π(λ, β | t) ∝ λ^(n+a−1) β^(n+a−1) (c ∏_{i=1}^n tᵢ)^β exp{−(b + 1) tₙ^β λ}.

That is to say an HB distribution with parameters (n + a, b + 1, c ∏_{i=1}^n tᵢ, tₙ).

Assuming a quadratic loss, the Bayes estimators are the expectations of the posterior
distributions. Therefore, by (4) and (2), we have:

β_Bayes = (n + a)/(k + Sₙ)    (6)

and

λ_Bayes = (n + a)/(b + 1) · [(k + Sₙ)/(k + Sₙ + log tₙ)]^(n+a)    (7)

where Sₙ = ∑_{i=1}^n log(tₙ/tᵢ) and k = log(tₙ^a/c).

Let us recall the expressions of the MLE:

β_MLE = n / ∑_{i=1}^n log(tₙ/tᵢ)  and  λ_MLE = n / tₙ^(β_MLE).

One can see that β_Bayes can be expressed as a convex combination of the MLE and the
expectation of the prior distribution:

β_Bayes = qₙ(a, c) β_MLE + [1 − qₙ(a, c)] (a/k),

where

qₙ(a, c) = Sₙ/(k + Sₙ).    (8)

This remark will be useful to choose the parameters (a, c) in the sequel.

A relationship between λ_Bayes and λ_MLE can also be proposed. From (6),
k + Sₙ = (n + a)/β_Bayes. Substituting in (7),

λ_Bayes = (n + a)/(b + 1) · [1 + β_Bayes log tₙ/(n + a)]^(−(n+a)),

which can be approximated by:

λ_Bayes ≈ (n + a)/(b + 1) · exp{−β_Bayes log tₙ} = (n + a)/[(b + 1) tₙ^(β_Bayes)].    (9)

Therefore λ_Bayes can be expressed as a convex combination of the MLE and the prior
expectation of λ given β:

λ_Bayes ≈ ω · a/(b tₙ^(β_Bayes)) + (1 − ω) λ_MLE,

where ω = b/(1 + b). This approximation will be used in the next section to elicit the prior
parameters.

4 PRIOR ELICITATION

We suggest a first strategy (elicitation strategy 1) to choose the values of the prior parameters.
Suppose that the practitioner has a guess β_{g,1} at the value of β and a guess β_{g,2} at the
standard deviation associated with β_{g,1}. Then a value for a can be obtained by solving the
system:

a/k = β_{g,1}    (10)
√a/k = β_{g,2}    (11)

We have:

a = (β_{g,1}/β_{g,2})²,  k = a/β_{g,1}.

Then (6) can be computed. According to (9), (n + a)/[n(1 + b)] can be interpreted as a
confidence or corrective factor associated with the MLE. A value for b can be obtained by
solving the equation

(n + a)/[n(1 + b)] = γ  to obtain  b = (n + a)/(γn) − 1,

with γ = qₙ(a, c) for example.

A second strategy (elicitation strategy 2) consists in considering a guess at ω and a guess
at β, β_g. From the guess at ω, a value for b can be deduced. Setting n = a/b, a value for a is
obtained. The guess at β provides a value for k since k = a/β_g. The results using this strategy
are displayed in Table 2.

5 APPLICATIONS

5.1 Simulated data

In order to investigate the behaviour of the H-B natural conjugate prior, we make a comparison
between Bayesian estimation and maximum-likelihood estimation relying on simulated data from
a PLP. For elicitation strategy 1, three different values of the prior mean of β are investigated:
case [1], the prior mean underestimates the true value of the parameter; case [2], the prior mean
overestimates the true value; and case [3], the prior mean is relatively close to the true value
used in generating the data sets. For each given prior guess β_{g,1}, computations are carried
out using three values of the variability β_{g,2} according to the scheme β_{g,2} = τβ_{g,1},
where τ = 0.3, 0.6, 0.9 is the coefficient of variation. The sample sizes vary from small (n = 10)
to medium (n = 150) and then to very large (n = 2000). The small-size case is in favour of
showing the advantage of the Bayesian approach.

Table 1 and Table 2 describe the results of estimation based on the data sets generated by a
PLP with true parameters β = 1.38 and λ = 0.0008. In Table 1, we use elicitation strategy 1 for
choosing the values of the prior parameters. For the underestimated, accurate and overestimated
prior guesses, we choose respectively β_{g,1} = 0.9, β_{g,1} = 1.4 and β_{g,1} = 2.1.

Table 1. Mean of the Bayes estimates with elicitation strategy 1 for simulated data from a PLP
with input parameter values β = 1.38 and λ = 0.0008.

n      β_{g,1}  β_{g,2}  β_Bayes  λ_Bayes
10     0.90     0.27     1.1338   0.006949
                0.54     1.4009   0.012574
                0.81     1.5412   0.014838
       1.40     0.42     1.4973   0.002126
                0.84     1.6157   0.007978
                1.26     1.6750   0.011449
       2.10     0.63     1.8461   0.000892
                1.26     1.7543   0.006538
                1.89     1.7318   0.010739
       MLE               1.4343   0.001604
150    0.90     0.27     1.3545   0.001994
                0.54     1.3936   0.001714
                0.81     1.4016   0.001662
       1.40     0.42     1.4114   0.001363
                0.84     1.4124   0.001506
                1.26     1.4127   0.001535
       2.10     0.63     1.4464   0.001085
                1.26     1.4226   0.001411
                1.89     1.4180   0.001484
       MLE               1.3995   0.001082
2000   0.90     0.27     1.3848   0.000904
                0.54     1.3879   0.000881
                0.81     1.3885   0.000877
       1.40     0.42     1.3855   0.000899
                0.84     1.3855   0.000904
                1.26     1.3854   0.000905
       2.10     0.63     1.3906   0.000854
                1.26     1.3887   0.000875
                1.89     1.3883   0.000879
       MLE               1.3803   0.000834

In the case of a large sample size, it is not surprising that the Bayesian estimates are
relatively close to the MLEs and tend to the true values of the parameters, whether the parameter
is underestimated or overestimated. For small and medium sizes, one can see that the
underestimating scenario is more accurate than the two other scenarios. In more detail, the
Bayesian estimators seem to increase when we let the variability values become larger. In the
case of a medium sample size, the underestimated prior guess with moderate variability
β_{g,2} = 0.6β_{g,1} gives Bayesian estimators which are more accurate than the MLEs, but in
the small-sample case the MLEs seem to perform better.

On the other hand, the results in Table 2 illustrate elicitation strategy 2.

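Elicitation strategy 1 and the Bayes estimates (6)-(7) can be put together in a short sketch (our own illustration: the simulation routine and the prior guesses are ours, chosen to mimic one row of Table 1, and b is set to 1 arbitrarily rather than via the confidence factor γ):

```python
import math
import random

def elicit_strategy1(beta_g1, beta_g2):
    """Strategy 1: solve a/k = beta_g1 and sqrt(a)/k = beta_g2 for (a, k)."""
    a = (beta_g1 / beta_g2) ** 2
    k = a / beta_g1
    return a, k

def bayes_estimates(times, a, b, k):
    """Posterior means (6)-(7) under the HB(a, b, c, t_n) prior,
    parametrized here by k = log(t_n**a / c)."""
    n, tn = len(times), times[-1]
    Sn = sum(math.log(tn / t) for t in times)
    beta_bayes = (n + a) / (k + Sn)                                   # eq. (6)
    lam_bayes = ((n + a) / (b + 1)
                 * ((k + Sn) / (k + Sn + math.log(tn))) ** (n + a))   # eq. (7)
    return beta_bayes, lam_bayes

# Simulate n = 150 jump dates from a PLP with beta = 1.38, lam = 0.0008.
rng = random.Random(3)
lam, beta, n = 0.0008, 1.38, 150
s, times = 0.0, []
for _ in range(n):
    s += rng.expovariate(1.0)
    times.append((s / lam) ** (1.0 / beta))

a, k = elicit_strategy1(beta_g1=1.4, beta_g2=0.42)   # accurate guess, tau = 0.3
beta_b, lam_b = bayes_estimates(times, a=a, b=1.0, k=k)
print(beta_b)  # typically close to the true beta = 1.38
```

With an accurate prior guess, the posterior mean of β shrinks the MLE toward the prior mean 1.4, as equation (8) predicts.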


We choose different values for ω depending on the confidence we might have in the data. We set
ω = 0.3, 0.6, 0.8, 0.95. We consider, as in Table 1, three values for β_g. In small and medium
sample sizes, it turns out that if we choose the accurate prior for β then the Bayesian estimator
of β performs better but the Bayesian estimator of λ performs worse compared to strategy 1.
We remark that globally the results with strategy 2 are worse than with strategy 1. But for some
prior schemes, the estimation of λ, for example, is closer to the input value. With strategy 2,
we observe more dispersion on the estimates.

Table 2. Mean of the Bayes estimates with elicitation strategy 2 for simulated data from a PLP
with input parameter values β = 1.38 and λ = 0.0008.

n      β_g   ω     β_Bayes  λ_Bayes
10     0.90  0.30  1.3084   0.0172384
             0.60  1.0856   0.0187112
             0.80  0.9821   0.0205621
             0.95  0.9189   0.0224425
       1.40  0.30  1.5894   0.0080820
             0.60  1.4833   0.0033231
             0.80  1.4350   0.0016563
             0.95  1.4076   0.0009654
       2.10  0.30  1.7735   0.0057058
             0.60  1.8691   0.0008951
             0.80  1.9654   0.0001251
             0.95  2.0617   0.0000164
       MLE         1.4343   0.001604
150    0.90  0.30  1.1988   0.0060447
             0.60  1.0488   0.0177631
             0.80  0.9686   0.0329844
             0.95  0.9162   0.0500529
       1.40  0.30  1.4018   0.0012170
             0.60  1.4000   0.0009408
             0.80  1.3996   0.0008128
             0.95  1.3998   0.0007406
       2.10  0.30  1.5625   0.0003637
             0.60  1.7534   0.0000530
             0.80  1.9103   0.0000106
             0.95  2.0489   0.0000026
       MLE         1.3995   0.001082
2000   0.90  0.30  1.1944   0.0064560
             0.60  1.0475   0.0298960
             0.80  0.9681   0.0687878
             0.95  0.9161   0.1189907
       1.40  0.30  1.3912   0.0008157
             0.60  1.3949   0.0007593
             0.80  1.3974   0.0007268
             0.95  1.3993   0.0007047
       2.10  0.30  1.5447   0.0001643
             0.60  1.7420   0.0000196
             0.80  1.9042   0.0000034
             0.95  2.0474   0.0000007
       MLE         1.3803   0.000834

5.2 Real data

Table 3 gives data that have been discussed many times in the literature (Bar-Lev, Lavi, &
Reiser 1992). Those are failure times in hours for a complex type of aircraft generator.

Table 3. Failure times in hours for aircraft generator.

Failure  Time    Failure  Time
1        55      8        1308
2        166     9        2050
3        205     10       2453
4        341     11       3115
5        488     12       4017
6        567     13       4596
7        731

Table 4. Bayes estimates with strategy 1 for aircraft generator data.

β_{g,1}  β_{g,2}  β_Bayes  λ_Bayes
0.25     0.075    0.3583   0.2561
         0.15     0.4646   0.2642
         0.225    0.5123   0.2457
0.5      0.15     0.5350   0.1054
         0.30     0.5555   0.1730
         0.45     0.5623   0.1959
0.75     0.225    0.6402   0.0604
         0.45     0.5943   0.1441
         0.675    0.5812   0.1797
MLE               0.5690   0.1076

Table 5. Bayes estimates with strategy 2 for aircraft generator data.

β_g   ω     β_Bayes  λ_Bayes
0.25  0.30  0.4115   0.5399
      0.60  0.3223   0.9559
      0.80  0.2816   1.2621
      0.95  0.2572   1.4992
0.5   0.30  0.5464   0.2120
      0.60  0.5255   0.2041
      0.80  0.5124   0.1981
      0.95  0.5031   0.1934
0.75  0.30  0.6134   0.1355
      0.60  0.6653   0.0735
      0.80  0.7051   0.0439
      0.95  0.7383   0.0277
MLE         0.5690   0.1076

AMER16_Book.indb 7 3/15/2016 11:22:37 AM


Reiser 1992). Those are failure times in hours for a complex type of aircraft generator.

The MLEs for β and λ are easily obtained: β_MLE = 0.5690 and λ_MLE = 0.10756. We compare the MLE with the Bayes estimates in Table 4 for strategy 1 and in Table 5 for strategy 2. Strategy 1 leads to an estimate close to the MLE when the guess on β is 0.5, with a small standard deviation. Strategy 2 is unable, whatever the guess, to provide an estimate close to the MLE. Again, the only case where it is close to the MLE is when the guess is 0.5. The comments are very similar to those for the simulated data.
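The quoted MLEs can be reproduced from the standard closed-form estimators for a failure-truncated PLP with intensity λβt^(β−1), namely β̂ = n / Σ ln(t_n/t_i) and λ̂ = n / t_n^β̂ (this parameterization is assumed, not spelled out in the text); a short Python check on the Table 3 data:

```python
import math

# Failure times in hours for the aircraft generator (Table 3).
times = [55, 166, 205, 341, 488, 567, 731, 1308, 2050, 2453, 3115, 4017, 4596]

def plp_mle(times):
    """MLEs for a failure-truncated PLP with intensity lam * beta * t**(beta - 1)."""
    n, t_n = len(times), times[-1]
    # The last term of the sum is ln(t_n / t_n) = 0, so it is omitted.
    beta = n / sum(math.log(t_n / t) for t in times[:-1])
    # Expected number of failures by t_n is lam * t_n**beta, matched to n.
    lam = n / t_n**beta
    return beta, lam

beta_hat, lam_hat = plp_mle(times)
print(beta_hat, lam_hat)  # beta close to 0.569, lambda close to 0.107
```

Running this reproduces the values quoted in the text to the stated precision.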
6 CONCLUDING REMARKS

We introduce in this work a new distribution: the H-B distribution. This distribution is a natural conjugate prior for making Bayesian inference on the PLP. More investigations concerning the properties of this distribution need to be carried out. In particular, a better understanding of these properties will be helpful to elicit prior parameters. We suggest two strategies that are easy to implement, relying on expert guessing. The results show that the choice of the elicitation strategy is very sensitive. More needs to be done in order to improve the accuracy of the estimates. Other strategies should be investigated. We are working in this direction at the present time.

REFERENCES

Bar-Lev, S., I. Lavi, & B. Reiser (1992). Bayesian inference for the power law process. Ann. Inst. Statist. Math. 44(4), 623–639.
Guida, M., R. Calabria, & G. Pulcini (1989). Bayes inference for non-homogeneous Poisson process with power intensity law. IEEE Transactions on Reliability 38, 603–609.
Huang, Y.S. & V.B. Bier (1998). A natural conjugate prior for non-homogeneous Poisson process with power law intensity function. Communications in Statistics-Simulation and Computation 27, 525–551.
Oliveira, M.D., E.A. Colosimo, & G.L. Gilardoni (2012). Bayesian inference for power law process with applications in repairable systems. Journal of Statistical Planning and Inference 142, 1151–1160.
Sen, A. & R. Khattree (1998). On estimating the current intensity of failure for the power law process. Journal of Statistical Planning and Inference 74, 252–272.



Applied Mathematics in Engineering and Reliability - Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

Bayesian approach to estimate the mixture of failure rate model

R. Briš
Department of Applied Mathematics, Faculty of Electrical Engineering and Computer Science,
VŠB-Technical University of Ostrava, Ostrava, Czech Republic

T.T. Thach
Ton Duc Thang University, Faculty of Mathematics and Statistics, Ho Chi Minh City, Vietnam

ABSTRACT: Engineering systems are subject to continuous stresses and shocks which may (or may not) cause a change in the failure pattern of the system with unknown probability. A mixture of failure rate models can be used to represent such frequently occurring realistic situations, and the corresponding failure time distribution is derived. Classical and Bayesian estimation of the parameters and reliability characteristics of this failure time distribution is the subject matter of the paper, where particular emphasis is put on the Weibull wear-out failure model.

1 INTRODUCTION

Engineering systems, while in operation, are always subject to environmental stresses and shocks which may or may not alter the failure rate function of the system. Suppose p is the unknown probability that the system is able to bear these stresses and its failure model remains unaffected, and q is the probability of the complementary event. In such situations, a failure distribution is generally used to describe mathematically the failure rate of the system. To some extent, the solution to the proposed problem is attempted through the mixture of distributions (Mann et al. 1974, Sinha 1986, Lawless 1982). However, in this regard we are faced with two problems. Firstly, there are many physical causes that individually or collectively cause the failure of the system or device. At present, it is not possible to differentiate between these physical causes and mathematically account for all of them, and, therefore, the choice of a failure distribution becomes difficult. Secondly, even if a goodness-of-fit technique is applied to actual observations of time to failure, we face a problem arising due to the non-symmetric nature of the life-time distributions, whose behaviour is quite different at the tails, where actual observations are sparse in view of the limited sample size (Mann et al. 1974). Obviously, the best one can do is to look out for a concept which is useful for differentiating between different life-time distributions. Failure rate is one such concept in the literature on reliability. After analyzing such physical considerations of the system, we can formulate a mixture of failure rate functions which, in turn, provide the failure time distributions. In view of the above, and due to continuous stresses and shocks on the system, let us suppose that the failure rate function of a system remains unaltered with probability p, and that it undergoes a change with probability q. Let the failure rate function of the system in these two situations be in either of the following two states (Sharma et al. 1997):

1.1 State 1

Initially the system experiences a constant failure rate model, and this model may (or may not) change with probability q (q = 1 − p).

1.2 State 2

If the stresses and shocks alter the failure rate model of the system with probability q, then it experiences a wear-out failure model. In comparison with Sharma et al. (1997), this study brings a distinctive generalization of the state by implementation of a new parameter, which enables taking into account also the more general Weibull model.



In probability theory and statistics, the Weibull distribution is a continuous probability distribution named after Waloddi Weibull. As a result of its flexibility in modelling the time-to-failure of a wide diversity of mechanisms, the two-parameter Weibull distribution has recently been used quite extensively in reliability and survival analysis, particularly when the data are not censored. Much of the attractiveness of the Weibull distribution is due to the wide variety of shapes it can assume by altering its parameters. Using such a failure rate pattern, the characterization of the life-time distribution in the corresponding situation is given. Various inferential properties of this life-time distribution, along with the estimation of parameters and reliability characteristics, are the subject matter of the present study. Since the estimates based on the operational data can be updated by incorporating past environmental experiences on the random variations in the life-time parameters (Martz & Waller 1982), the Bayesian analysis of the parameters and other reliability characteristics is also given.

2 BACKGROUND

Let

T: the random variable denoting the life-time of the system.
h(t): the failure rate function.
f(t): the probability density function (p.d.f.) of T.
F(t): the cumulative distribution function of T.
R(t) = P(T > t): the reliability/survival function.
E(T) = ∫_0^∞ R(t) dt: Mean Time To Failure (MTTF).

3 ASSUMPTIONS

Let

p: the probability of the event A, that the system is able to bear the stresses and shocks and its failure pattern remains unaltered.
q = 1 − p: the probability of the complementary event A^c.

Further, let the mixture of the failure rate functions be

h(t) = λ(p + (1 − p) t^k), λ > 0, t > 0, 0 < p < 1,  (1)

for

1. p = 1: represents the failure rate of an exponential distribution.
2. k = 1 and p = 0: represents the failure rate of the Rayleigh distribution, or Weibull distribution with shape parameter 2.
3. k = 1: represents the linear ageing process.
4. 0 < k < 1: represents the concave ageing process.
5. k > 1: represents the convex ageing process.

At the beginning of the research our study distinguishes between 3 different practical situations: Case 1, when p is known; Case 2, when λ is known; and Case 3, when both λ and p are unknown.

In Weibull reliability analysis it is frequently the case that the value of the shape parameter is known (Martz & Waller 1982). For example, the Rayleigh distribution is obtained when k = 1. The earliest references to Bayesian estimation of the unknown scale parameter are in Harris & Singpurwalla (1968). Since that time this case has been considered by numerous authors, see Sharma et al. (1997), Canavos (1974), Moore & Bilikam (1978), Tummala & Sathe (1978), Alexander Aron et al. (2009) & Aslam et al. (2014). This study is a free continuation and generalization of the research of Sharma et al. (1997).

4 CHARACTERISTICS OF THE LIFE-TIME DISTRIBUTION

Using the well-known relationship

f(t) = h(t) exp{−∫_0^t h(x) dx}  (2)

and in view of equation (1), the p.d.f. of the life-time T is

f(t) = λ(p + (1 − p) t^k) exp{−λ(pt + ((1 − p)/(k + 1)) t^(k+1))} for t > 0, and 0 otherwise.  (3)

The reliability function is

R(t) = exp{−λ(pt + ((1 − p)/(k + 1)) t^(k+1))}, t > 0.  (4)
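The relations (1)-(4) are easy to sanity-check numerically: f(t) = h(t)R(t) must hold, and f must integrate to one. A minimal sketch in Python (the parameter values are illustrative only, not taken from the paper):

```python
import math
from scipy.integrate import quad

def h(t, lam, p, k):
    """Mixture failure rate, equation (1)."""
    return lam * (p + (1 - p) * t**k)

def R(t, lam, p, k):
    """Reliability function, equation (4)."""
    return math.exp(-lam * (p * t + (1 - p) * t**(k + 1) / (k + 1)))

def f(t, lam, p, k):
    """Life-time p.d.f., equation (3), built as f = h * R."""
    return h(t, lam, p, k) * R(t, lam, p, k)

lam, p, k = 0.2, 1/3, 1.0   # illustrative values
total, _ = quad(f, 0, math.inf, args=(lam, p, k))
print(round(total, 6))  # -> 1.0, so f is a proper density
```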




The MTTF is given by

MTTF = E(T) = ∫_0^∞ R(t) dt = ∫_0^∞ exp{−λ(pt + ((1 − p)/(k + 1)) t^(k+1))} dt.  (5)

This integral can be obtained by using some suitable numerical method.

5 ESTIMATION OF PARAMETERS AND RELIABILITY CHARACTERISTICS

Let t_1, t_2, ..., t_n be the random failure times of n items under test whose failure time distribution is as given in equation (3). Then the likelihood function is

L(t_1, ..., t_n | λ, p) = λ^n ∏_{i=1}^n (p + (1 − p) t_i^k) exp{−λ Σ_{i=1}^n (p t_i + ((1 − p)/(k + 1)) t_i^(k+1))}.  (6)

5.1 MLEs

5.1.1 Case 1: When p is known
To find the MLE of λ, say λ̂, we consider

log L(t_1, ..., t_n | λ, p) = n log λ + Σ_{i=1}^n log(p + (1 − p) t_i^k) − λ Σ_{i=1}^n (p t_i + ((1 − p)/(k + 1)) t_i^(k+1)).  (7)

Now,

∂ log L(t_1, ..., t_n | λ, p) / ∂λ = 0  (8)

gives

λ̂ = n / Σ_{i=1}^n (p t_i + ((1 − p)/(k + 1)) t_i^(k+1)).  (9)

By using the invariance property of MLEs,

1. The MLE for R(t), say R̂_1(t), will be

R̂_1(t) = exp{−λ̂(pt + ((1 − p)/(k + 1)) t^(k+1))}.  (10)

2. The MLE for h(t), say ĥ_1(t), will be

ĥ_1(t) = λ̂(p + (1 − p) t^k).  (11)

3. The MLE for MTTF will be

MTTF_1 = MTTF(λ̂, p),  (12)

which can be obtained by installing λ̂ into formula (5) and integrating.

5.1.2 Case 2: When λ is known
To find the MLE of p, say p̂, we consider

∂ log L(t_1, ..., t_n | λ, p) / ∂p = 0,  (13)

or

Σ_{i=1}^n (1 − t_i^k) / (p + (1 − p) t_i^k) − λ Σ_{i=1}^n (t_i − t_i^(k+1)/(k + 1)) = 0.  (14)

An estimate of p, i.e. p̂, can be obtained from equation (14) by using some suitable numerical iteration method. By using the invariance property of MLEs,

1. The MLE for R(t), say R̂_2(t), will be

R̂_2(t) = exp{−λ(p̂ t + ((1 − p̂)/(k + 1)) t^(k+1))}.  (15)

2. The MLE for h(t), say ĥ_2(t), will be

ĥ_2(t) = λ(p̂ + (1 − p̂) t^k).  (16)

3. The MLE for MTTF will be

MTTF_2 = MTTF(λ, p̂),  (17)

which can be obtained by installing p̂ into formula (5) and integrating.

5.1.3 Case 3: When both λ and p are unknown
To find the MLEs of λ and p, we consider

∂ log L(t_1, ..., t_n | λ, p) / ∂λ = 0  (18)



and

∂ log L(t_1, ..., t_n | λ, p) / ∂p = 0.  (19)

We get

λ̂ = n / Σ_{i=1}^n (p̂ t_i + ((1 − p̂)/(k + 1)) t_i^(k+1))  (20)

and

Σ_{i=1}^n (1 − t_i^k) / (p̂ + (1 − p̂) t_i^k) − n Σ_{i=1}^n (t_i − t_i^(k+1)/(k + 1)) / Σ_{i=1}^n (p̂ t_i + ((1 − p̂)/(k + 1)) t_i^(k+1)) = 0.  (21)

Equation (21) may be solved for p̂ by Newton-Raphson or other suitable iterative methods, and this value is substituted into equation (20) to obtain λ̂. By using the invariance property of MLEs,

1. The MLE for R(t), say R̂_3(t), will be

R̂_3(t) = exp{−λ̂(p̂ t + ((1 − p̂)/(k + 1)) t^(k+1))}.  (22)

2. The MLE for h(t), say ĥ_3(t), will be

ĥ_3(t) = λ̂(p̂ + (1 − p̂) t^k).  (23)

3. The MLE for MTTF will be

MTTF_3 = MTTF(λ̂, p̂),  (24)

which can be obtained by installing λ̂ and p̂ into formula (5) and integrating.

6 BAYESIAN ESTIMATION

6.1 Case 1: when p is known

6.1.1 Non-informative prior
We are going to use the non-informative prior

π(λ) = 1/λ.  (25)

The likelihood function in equation (6) may be rewritten as

L(t_1, ..., t_n | λ, p) = λ^n T_1 e^(−λ T_2),  (26)

where

T_1 = ∏_{i=1}^n (p + (1 − p) t_i^k)  (27)

and

T_2 = Σ_{i=1}^n (p t_i + ((1 − p)/(k + 1)) t_i^(k+1)).  (28)

In view of the prior in equation (25), the posterior distribution of λ given t_1, t_2, ..., t_n is given by

π(λ | t_1, ..., t_n, p) = L(t_1, ..., t_n | λ, p) π(λ) / ∫_0^∞ L(t_1, ..., t_n | λ, p) π(λ) dλ = (T_2^n / Γ(n)) λ^(n−1) e^(−λ T_2), λ > 0.  (29)

Therefore, the Bayes estimate of λ, say λ*, under the squared-error loss function, becomes

λ* = E(λ | t_1, ..., t_n, p) = n / T_2,  (30)

i.e. it reduces to the usual ML estimator, which is in agreement with Martz & Waller (1982).

Also, the Bayes estimate of R(t), say R_1*(t), is

R_1*(t) = E(R(t) | t_1, ..., t_n, p) = ∫_0^∞ e^(−λ T_3) π(λ | t_1, ..., t_n, p) dλ = (1 + T_3/T_2)^(−n),  (31)

where

T_3 = pt + ((1 − p)/(k + 1)) t^(k+1).

Similarly, the Bayes estimate of h(t), say h_1*(t), is

h_1*(t) = E(h(t) | t_1, ..., t_n, p) = ∫_0^∞ λ(p + (1 − p) t^k) π(λ | t_1, ..., t_n, p) dλ = n (p + (1 − p) t^k) / T_2.  (32)
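The Case 3 likelihood equations (20) and (21) above are straightforward to solve numerically: λ is profiled out via (20) and the root of (21) in (0, 1) gives p̂. A sketch with illustrative (made-up) data, using bracketing bisection (`brentq`) in place of Newton-Raphson:

```python
from scipy.optimize import brentq

# Illustrative failure-time data and known shape k.
times = [0.47, 0.81, 1.10, 1.45, 1.86, 2.30, 2.91, 3.64, 4.52, 5.95]
k = 1.0
n = len(times)

def lam_hat(p):
    """Equation (20): MLE of lambda for a fixed p."""
    return n / sum(p * t + (1 - p) * t**(k + 1) / (k + 1) for t in times)

def profile_eq(p):
    """Left-hand side of equation (21); its root in (0, 1) is the MLE of p."""
    s1 = sum((1 - t**k) / (p + (1 - p) * t**k) for t in times)
    s2 = sum(t - t**(k + 1) / (k + 1) for t in times)
    return s1 - lam_hat(p) * s2

# brentq assumes a sign change of profile_eq over (0, 1), which holds here.
p_hat = brentq(profile_eq, 1e-6, 1 - 1e-6)
print(p_hat, lam_hat(p_hat))
```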



6.1.2 Informative prior
Let the conjugate prior of λ be gamma with p.d.f.

π(λ) = (β^α / Γ(α)) λ^(α−1) e^(−βλ), λ > 0; α, β > 0.  (33)

In view of the prior in equation (33), the posterior distribution of λ given t_1, t_2, ..., t_n is given by

π(λ | t_1, ..., t_n, p) = L(t_1, ..., t_n | λ, p) π(λ) / ∫_0^∞ L(t_1, ..., t_n | λ, p) π(λ) dλ = ((β + T_2)^(n+α) / Γ(n + α)) λ^(n+α−1) e^(−λ(β + T_2)), λ > 0.  (34)

Therefore, the Bayes estimate of λ, say λ*, under the squared-error loss function, becomes

λ* = E(λ | t_1, ..., t_n, p) = (n + α) / (β + T_2).  (35)

Also, the Bayes estimate of R(t), say R_1*(t), is

R_1*(t) = E(R(t) | t_1, ..., t_n, p) = ∫_0^∞ e^(−λ T_3) π(λ | t_1, ..., t_n, p) dλ = (1 + T_3/(β + T_2))^(−(n+α)),  (36)

where

T_3 = pt + ((1 − p)/(k + 1)) t^(k+1).

Similarly, the Bayes estimate of h(t), say h_1*(t), is

h_1*(t) = E(h(t) | t_1, ..., t_n, p) = ∫_0^∞ λ(p + (1 − p) t^k) π(λ | t_1, ..., t_n, p) dλ = (n + α)(p + (1 − p) t^k) / (β + T_2).  (37)

6.1.3 Super parameter estimation
Prior parameters α and β can be obtained by the Empirical Bayes approach, which is now under study by the authors of the paper.

6.2 Case 2: when λ is known

6.2.1 Non-informative prior
We are going to use the prior

π(p) = 2p, 0 < p < 1.  (38)

The likelihood function in equation (6) may be rewritten as

L(t_1, ..., t_n | λ, p) = λ^n e^(−λ(p T_4 + T_5)/(k+1)) Σ_{j=0}^n p^(n−j) (1 − p)^j k_j,  (39)

where

k_0 = 1,  (40)

k_j = Σ_{1≤i_1<i_2<...<i_j≤n} t_{i_1}^k t_{i_2}^k ... t_{i_j}^k, j = 1, ..., n,  (41)

T_4 = (k + 1) Σ_{i=1}^n t_i − Σ_{i=1}^n t_i^(k+1),  (42)

and

T_5 = Σ_{i=1}^n t_i^(k+1),  (43)

or

L(t_1, ..., t_n | λ, p) = λ^n e^(−λ T_5/(k+1)) Σ_{r=0}^∞ Σ_{j=0}^n ((−λ T_4/(k + 1))^r / r!) p^(n+r−j) (1 − p)^j k_j.  (44)

In view of the prior in equation (38), the posterior distribution of p given t_1, ..., t_n is given by

π(p | t_1, ..., t_n, λ) = L(t_1, ..., t_n | λ, p) π(p) / ∫_0^1 L(t_1, ..., t_n | λ, p) π(p) dp
= Σ_{r=0}^∞ Σ_{j=0}^n ((−λ T_4/(k + 1))^r / r!) p^(n+r−j+1) (1 − p)^j k_j / Σ_{r=0}^∞ Σ_{j=0}^n ((−λ T_4/(k + 1))^r / r!) B(n + r − j + 2, j + 1) k_j.  (45)
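The closed-form gamma-posterior estimates (35) and (36) can be cross-checked by integrating directly against the posterior (34); a sketch with illustrative data and hyperparameters (none of the numbers come from the paper):

```python
import math
from scipy.integrate import quad

times = [0.47, 0.81, 1.10, 1.45, 1.86, 2.30]
k, p = 1.0, 0.4                  # known shape and mixing probability
alpha, beta = 2.0, 1.0           # illustrative gamma hyperparameters
n = len(times)
T2 = sum(p * t + (1 - p) * t**(k + 1) / (k + 1) for t in times)

def posterior(lam):
    """Gamma posterior (34), shape n + alpha and rate beta + T2."""
    a, b = n + alpha, beta + T2
    return b**a / math.gamma(a) * lam**(a - 1) * math.exp(-b * lam)

t = 1.5
T3 = p * t + (1 - p) * t**(k + 1) / (k + 1)

# Equation (36) in closed form ...
R_closed = (1 + T3 / (beta + T2))**(-(n + alpha))
# ... and by integrating e^(-lam*T3) against the posterior.
R_numeric, _ = quad(lambda lam: math.exp(-lam * T3) * posterior(lam), 0, math.inf)
print(abs(R_closed - R_numeric) < 1e-6)  # -> True
```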



Therefore, the Bayes estimate of p, say p*, under the squared-error loss function, becomes

p* = E(p | t_1, ..., t_n, λ) = ∫_0^1 p π(p | t_1, ..., t_n, λ) dp
= Σ_{r=0}^∞ Σ_{j=0}^n ((−λ T_4/(k + 1))^r / r!) B(n + r − j + 3, j + 1) k_j / Σ_{r=0}^∞ Σ_{j=0}^n ((−λ T_4/(k + 1))^r / r!) B(n + r − j + 2, j + 1) k_j.  (46)

The Bayes estimate of R(t), say R_2*(t), is

R_2*(t) = E(R(t) | t_1, ..., t_n, λ) = ∫_0^1 exp{−λ(pt + ((1 − p)/(k + 1)) t^(k+1))} π(p | t_1, ..., t_n, λ) dp
= e^(−λ t^(k+1)/(k+1)) Σ_{r=0}^∞ Σ_{m=0}^∞ Σ_{j=0}^n ((−λ T_4/(k + 1))^r / r!) ((−λ((k + 1)t − t^(k+1))/(k + 1))^m / m!) B(n + r + m − j + 2, j + 1) k_j
/ Σ_{r=0}^∞ Σ_{j=0}^n ((−λ T_4/(k + 1))^r / r!) B(n + r − j + 2, j + 1) k_j.  (47)

Similarly, the Bayes estimate of h(t), say h_2*(t), becomes

h_2*(t) = E(h(t) | t_1, ..., t_n, λ) = ∫_0^1 λ(p + (1 − p) t^k) π(p | t_1, ..., t_n, λ) dp = λ[(1 − t^k) p* + t^k] = λ(p* + (1 − p*) t^k),  (48)

where p* is given in equation (46).

6.2.2 Informative prior
Let the prior distribution of p be a Beta distribution with p.d.f.

π(p) = (1/B(a, b)) p^(a−1) (1 − p)^(b−1), a, b > 0, 0 < p < 1.  (49)

In view of the prior in equation (49), the posterior distribution of p given t_1, ..., t_n is given by

π(p | t_1, ..., t_n, λ) = L(t_1, ..., t_n | λ, p) π(p) / ∫_0^1 L(t_1, ..., t_n | λ, p) π(p) dp
= Σ_{r=0}^∞ Σ_{j=0}^n ((−λ T_4/(k + 1))^r / r!) p^(n+r+a−j−1) (1 − p)^(b+j−1) k_j / Σ_{r=0}^∞ Σ_{j=0}^n ((−λ T_4/(k + 1))^r / r!) B(n + r + a − j, b + j) k_j.  (50)

Therefore, the Bayes estimate of p, say p*, under the squared-error loss function, becomes

p* = E(p | t_1, ..., t_n, λ) = ∫_0^1 p π(p | t_1, ..., t_n, λ) dp
= Σ_{r=0}^∞ Σ_{j=0}^n ((−λ T_4/(k + 1))^r / r!) B(n + r + a − j + 1, b + j) k_j / Σ_{r=0}^∞ Σ_{j=0}^n ((−λ T_4/(k + 1))^r / r!) B(n + r + a − j, b + j) k_j.  (51)
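The alternating double series in (46) and (51) is delicate to truncate in practice; an equivalent direct numerical evaluation of the posterior mean of p (likelihood (6) times the Beta prior (49)) is a convenient cross-check. A sketch with illustrative data and hyperparameters:

```python
import math
from scipy.integrate import quad

times = [0.47, 0.81, 1.10, 1.45, 1.86, 2.30]
k, lam = 1.0, 0.4          # known shape and scale
a, b = 2.0, 3.0            # illustrative Beta hyperparameters

def unnorm_posterior(p):
    """Likelihood (6) times the Beta(a, b) prior (49), up to a constant."""
    like = math.exp(-lam * sum(p * t + (1 - p) * t**(k + 1) / (k + 1) for t in times))
    for t in times:
        like *= p + (1 - p) * t**k
    return like * p**(a - 1) * (1 - p)**(b - 1)

# Posterior mean p* by direct quadrature instead of the series (51).
norm, _ = quad(unnorm_posterior, 0, 1)
p_star, _ = quad(lambda p: p * unnorm_posterior(p), 0, 1)
p_star /= norm
print(0 < p_star < 1)  # -> True
```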



The Bayes estimate of R(t), say R_2*(t), is

R_2*(t) = E(R(t) | t_1, ..., t_n, λ) = ∫_0^1 exp{−λ(pt + ((1 − p)/(k + 1)) t^(k+1))} π(p | t_1, ..., t_n, λ) dp
= e^(−λ t^(k+1)/(k+1)) Σ_{r=0}^∞ Σ_{m=0}^∞ Σ_{j=0}^n ((−λ T_4/(k + 1))^r / r!) ((−λ((k + 1)t − t^(k+1))/(k + 1))^m / m!) B(n + r + a + m − j, b + j) k_j
/ Σ_{r=0}^∞ Σ_{j=0}^n ((−λ T_4/(k + 1))^r / r!) B(n + r + a − j, b + j) k_j.  (52)

Similarly, the Bayes estimate of h(t), say h_2*(t), becomes

h_2*(t) = E(h(t) | t_1, ..., t_n, λ) = ∫_0^1 λ(p + (1 − p) t^k) π(p | t_1, ..., t_n, λ) dp = λ[(1 − t^k) p* + t^k] = λ(p* + (1 − p*) t^k),  (53)

where p* is given in equation (51).

6.2.3 Super parameter estimation
Prior parameters a and b can be obtained by the Empirical Bayes approach, which is now under study by the authors of the paper.

7 AN EXAMPLE

In case k = 1, p = 1/3, λ = 1/5, the failure rate function is

h(t) = 1/15 + (2/15) t, t > 0,  (54)

and the p.d.f. is

f(t) = (1/15)(1 + 2t) exp{−(1/15)(t + t^2)} for t > 0, and 0 otherwise.  (55)

Figure 1 shows the curve of the p.d.f. in (55). Table 1 shows data generated from this p.d.f. with sample size n = 30, and the corresponding histogram is shown in Fig. 2. Table 2 shows the estimated values for p, λ and MTTF using the data in Table 1. Figs. 3-5 demonstrate the estimated reliability functions for the different cases in comparison with the true reliability function computed by formula (4).

Figure 1. Density curve of T when k = 1, p = 1/3, λ = 1/5.

Table 1. Generated data values.

3.6145404   1.2608646   1.9644814   4.5339338
2.1762887   1.7987543   6.7039167   1.1689263
4.5634580   1.3711159   2.7837231   4.7787410
2.3456293   2.1053338   5.0590295   3.6569535
1.8818134   5.2699757   5.9550330   2.8941610
2.4522999   0.8210216   0.8628159   0.4675159
3.7480263   4.4549035   3.3256436   0.8107626
2.4162468   2.4680479
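Equations (54) and (55) follow from (1) and (3) with k = 1, p = 1/3, λ = 1/5, and can be verified against the general relations f = h · exp(−∫h) and ∫f = 1:

```python
import math
from scipy.integrate import quad

def h(t):
    return 1/15 + (2/15) * t          # failure rate, equation (54)

def f(t):
    return (1/15) * (1 + 2*t) * math.exp(-(t + t**2) / 15)   # p.d.f., equation (55)

# f(t) must equal h(t) * exp(-cumulative hazard), and must integrate to one.
t = 2.0
H, _ = quad(h, 0, t)                  # cumulative hazard up to t
assert abs(f(t) - h(t) * math.exp(-H)) < 1e-10
total, _ = quad(f, 0, math.inf)
print(round(total, 6))  # -> 1.0
```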


Table 2. Estimated values for parameters using data in Table 1.

Figure 2. Histogram for generated data.
Figure 3. Plot of reliability functions in case 1.
Figure 4. Plot of reliability functions in case 2.
Figure 5. Plot of reliability functions in case 3.
Figure 6. Plot of failure rate functions in case 1.
Figure 7. Plot of failure rate functions in case 2.

We can see that the Bayes approximation of the reliability function is the closest to the true function. Figs. 6-8 show the estimated failure rate functions in comparison with the true function computed by formula (1). The Bayes approximation looks promising as well, in comparison with the MLE method.


Figure 8. Plot of failure rate functions in case 3.

8 CONCLUSION

Our study shows that the Bayesian approach can give better results than the MLE method for estimating the mixture of failure rate model. The study reveals an interesting fact: the non-informative prior seems to work well, especially in case 2. This result would be a good platform for the selection of a non-informative prior in case of no information about the prior. This work gives good evidence that the Bayes approach is a viable method to model real situations which can be approximated by a mixture of failure rate functions.

ACKNOWLEDGEMENT

This work was supported by The Ministry of Education, Youth and Sports from the National Programme of Sustainability (NPU II) project "IT4Innovations excellence in science - LQ1602".

REFERENCES

Alexander Aron, A., H. Guo, A. Mettas, & D. Ogden (2009). Improving the 1-parameter Weibull: A Bayesian approach. IEEE.
Aslam, M., S.M.A. Kazmi, I. Ahmad, & S.H. Shah (2014). Bayesian estimation for parameters of the Weibull distribution. Sci. Int. (Lahore) 26(5), 1915–1920.
Canavos, G. (1974). On the robustness of a Bayes estimate. Annual Reliability and Maintainability Symposium, 432–435.
Harris, C. & N. Singpurwalla (1968). Life distributions derived from stochastic hazard functions. IEEE Trans. Reliab. 17, 70–79.
Lawless, J.F. (1982). Statistical Models and Methods for Lifetime Data. New York: John Wiley and Sons.
Mann, N., E. Schaffer, & N. Singpurwalla (1974). Methods for Statistical Analysis of Reliability and Life Data. NY: Wiley.
Martz, H.F. & R.A. Waller (1982). Bayesian Reliability Analysis. New York: John Wiley and Sons.
Moore, A.H. & J.E. Bilikam (1978). Bayesian estimation of parameters of life distributions and reliability from type II censored samples. IEEE Trans. Reliab. 27, 64–67.
Pandey, A., A. Singh, & W.J. Zimmer (1993). Bayes estimation of the linear hazard-rate model. IEEE Trans. Reliab. 42.
Sharma, K.K., H. Krishna, & B. Singh (1997). Bayes estimation of the mixture of hazard rate model. Reliability Engineering and System Safety 55, 9–13.
Sinha, S.K. (1986). Reliability and Life Testing. USA: Wiley Eastern Ltd.
Tummala, V.M.R. & P.T. Sathe (1978). Minimum expected loss estimators of reliability and parameters of certain lifetime distributions. IEEE Trans. Reliab. 27, 283–285.


Applied Mathematics in Engineering and Reliability - Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

Cost oriented statistical decision problem in acceptance sampling and quality control

R. Briš
Department of Applied Mathematics, Faculty of Electrical Engineering and Computer Science,
VŠB-Technical University of Ostrava, Ostrava, Czech Republic

ABSTRACT: A statistical decision problem on the basis of the Bayes approach is presented in the paper. Bayes inference for hypothesis testing is introduced in general, as is its extension to acceptance sampling and quality control. A common decision problem for a quality characteristic is described, taking into account the cost dependent on the decision. An asymmetric loss function is proposed to quantify the cost oriented decision and, mainly, to set the optimal Bayes decision rule for an acceptance sampling plan. Application of the method to a real example from industry is clearly demonstrated.

1 INTRODUCTION

1.1 Acceptance sampling in modern industry

In the mass production environments of the twentieth century, quality control rested on the triple sorting principle of inspecting product, detecting inappropriate product, and sorting out inappropriate product. Accordingly, Statistical Product Inspection (SPI) based on sampling was the first strong branch of modern statistical quality control. The sorting principle is clearly expressed by the definition of the purpose of sampling inspection given in Dodge & Romig (1959): The purpose of the inspection is to determine the acceptability of individual lots submitted for inspection, i.e., sorting good lots from bad. SPI is a tool of product quality control, where the term product is used in a broad sense to include any completed identifiable entity like materials, parts, products, services, or data. Product control is located at the interface of two parties: a supplier or vendor or producer as the party providing product, and a customer or consumer as the party receiving product. Traditional product control relied on the sorting principle at the interface of supplier and consumer. Modern product control primarily relies on quality provision and prevention at the interface. The particular task of the interface is to integrate supplier and consumer into joint and concerted efforts to create and maintain favorable conditions for quality in design, manufacture, handling, transport, and service. In this scheme, inspection may be useful, however not as the primary tool of product control, but in an adjunct role, subordinate to the guiding principles of provision and prevention. The adjunct role of SPI is expressed in (Godfrey & Mundel 1984): The shipper must understand the acceptance sampling plan in order to be sure that the quality control of the production process provides lots of a quality sufficient to meet his objective. This is the basic principle of acceptance sampling: to assure the production of lots of acceptable quality. The important roles of SPI are:

- it serves as a supplementary source of information on the product quality level,
- it is used to learn about the product quality level if the supplier-consumer interface is new and sufficient quality records do not exist,
- it serves as a precaution for the consumer, if the supplier-consumer interface is new or if the quality efforts of the supplier are not completely reliable.

SPI is implemented according to a sampling plan. A single sampling plan is defined by a pair (n, r), consisting of the sample size n and the acceptance number r, and a sample statistic t. The decision rule of a single sampling plan prescribes the following steps: a) n items are sampled at random; b) the value of the sample statistic t is calculated from the elements of the sample; and c) acceptance is warranted if t ≤ r, otherwise the decision is rejection.

In acceptance sampling schemes, it is the usual practice for acceptance criteria to be specified in terms of some parameter or parameters of the distribution of the quality characteristic or characteristics which are the basis for acceptance. For example, a minimum value for the percentage of non-functioning items or a minimum value for the mean operating life of some component may be specified. In a classical acceptance sampling scheme, sample test data are used to construct a hypothesis test for the specified parameter. If the hypothesis is not rejected and the sample is accepted, the distribution of the quality characteristic can be calculated from the null distribution. Therefore the null distribution can be used to calculate the number of individual items which will not function or will fail to meet some minimum standard. In practice, the acceptable null standard for the test parameter will have been determined by the performance requirements of individual items. For example, to insure that only a small number of individual items have an operating life less than some minimum standard, the minimum acceptable mean failure time must be considerably higher than this minimum standard for individual items. It is interesting to note that if the failure time has an exponential distribution, then fully 63.2%, or almost two thirds, of individual items will fail before the mean failure time. However, in the classical acceptance sampling scheme, the minimum acceptable parameter values will have been chosen to insure that, if the sample is accepted, only a small number of individual items will fail to meet minimum performance standards.

However, when prior information is available for the relevant parameter values and a Bayesian procedure is employed, acceptance criteria are not so easy to establish. It is possible to compute a posterior parameter distribution which gives the complete current probability information about the parameter. However, the relation between a posterior distribution for the parameter and the likely distribution of the quality characteristic of individual items is much more complex than the relation of the individual quality characteristic to a single null parameter value. For example, merely requiring the mean of the posterior parameter distribution to meet some specified standard is not sufficient to determine the proportion of individual items which will perform satisfactorily. Other characteristics of the posterior distribution, notably the variance, are also of importance. Even specifying the posterior probability of the parameter lying in some critical range is not sufficient to guarantee the proportion of individual items which will perform satisfactorily. Such naive criteria can produce inconsistent acceptance decisions.

In most cases, as well as in this paper, decision in product control is directed toward a single assembled lot of product. A common decision situation is determining whether a product meets a pre-determined quality specification. The decision is to either accept or reject the assembled lot of product, so superficially it would appear to be a simple hypothesis testing problem. However, both the null and alternate hypotheses are composite, embodying many parameter values each. The true loss is likely to depend on the actual parameter value rather than on whether the parameter value exceeds some precisely defined specification.

In the Bayesian approach the best sampling plan is defined as the one which minimizes the average cost. This is essentially the paradigm of Bayesian decision theory, see (Berger 1993). However, it is not restricted to costs in a monetary sense. It only requires a numerical goal function that measures the consequences of possible decisions. Bayesian sampling plans in context with the optimal Bayes decision rule will be introduced in the paper and applied to a real example from industry.

1.2 Statistical decision problem

Consider a problem in which a Decision Maker (DM) must choose a decision from some class of available decisions, and suppose that the consequences of this decision depend on the unknown value of some parameter θ. We use the term "parameter" here in a very general sense, to represent any variable or quantity whose value is unknown to the DM but is relevant to his or her decision: some authors refer to θ as the unknown "state of nature". The set Ω of all possible values of θ is called the parameter space. The set D of all possible decisions d that the DM might make in the given problem is called the decision space.

Conceptually, every combination of state of nature (θ) and decision (d) will be associated with some loss ℓ(θ, d). In practical problems of inference it is usually not possible to specify the loss precisely. However, in general we can say that the loss increases with the distance between decision and state of nature. This conceptual loss function is often sufficient to assess known inference procedures and eliminate those which are obviously flawed or unacceptable according to decision theoretic criteria. Thus when applying decision theory to problems of estimation, it is common to assume a quadratic loss function

ℓ(θ, d) = k(θ − d)^2,  (1)

and for hypothesis testing problems to assign relative losses to Type I and Type II errors and no loss to the action of choosing the correct hypothesis:

                      State of Nature
                      H0        H1
Decision    H0        0         l2
            H1        l1        0

For the purposes of applying decision theory to the problem of hypothesis testing, it is often sufficient to know only whether

l1 < l2 or l1 > l2.

AMER16_Book.indb 20 3/15/2016 11:22:55 AM


Suppose that θ has a specified probability distribution g(θ). An optimal or Bayes decision with respect to the distribution g(θ) will be a decision d* for which the expected loss E(ℓ | g(θ), d) is a minimum. For any given distribution g(θ) of θ, the expected loss E(ℓ | g(θ), d) is called the risk R(g, d) of the decision d. The risk of the Bayes decision, i.e. the minimum R0(g) of R(g, d) over all decisions d ∈ D, is called the Bayes risk.

In many decision problems, the DM has the opportunity, before choosing a decision in D, of observing the value of a random variable or random vector X that is related to the parameter θ. The observation of X provides information that may be helpful to the DM in choosing a good decision. We shall assume that the conditional distribution of X given θ can be specified for every value of θ ∈ Θ. A problem of this type is called a statistical decision problem.

Thus the components of a statistical decision problem are a parameter space Θ, a decision space D, a loss function ℓ, and a family of conditional densities f(x|θ) of an observation X whose value will be available to the DM when he or she chooses a decision. In a statistical decision problem, the probability density distribution of θ is called its prior distribution because it is the distribution of θ before X has been observed. The conditional distribution of θ given the observed value X = x is then called the posterior distribution of θ.

1.3 The posterior distribution h(θ | x)

By applying Bayes' theorem to the conditional and prior distributions, a posterior distribution for the state of nature can be derived:

h(θ | x) = f(x|θ) g(θ) / p(x),  where  p(x) = ∫ f(x|θ) g(θ) dθ  (2)

Note: if g(θ) is actually a discrete probability function, the integral should be replaced by a sum.

Since the posterior distribution of the state of nature depends on the information X, it is reasonable to expect that the information or data will affect the decision which is chosen as optimal. The relation between information and decision is called the decision function d(x). The decision function is really a statistic. A decision function is a rule that specifies a decision d(x) ∈ D that will be chosen for each possible observed value x of X.

In inferential problems of estimation, it is the estimator. In problems of hypothesis testing, it is the rule for rejecting the null hypothesis, or equivalently, the critical region. In the presence of information, the decision problem becomes choosing a decision function rather than choosing a single decision.

The risk function R(θ, d) is the equivalent of the loss function in the presence of information. The risk function for any decision function d is the expected loss of using that decision function when the state of nature is θ:

R(θ, d) = ∫ ℓ[θ, d(x)] f(x|θ) dx  (3)

The decision-theoretic problem is to find a decision function that is optimal.

The risk R(g, d) of any decision function d(x) with respect to the prior distribution g(θ) is the expected loss (or expected value of the risk function R(θ, d)) with respect to g(θ):

R(g, d) = E{ℓ[θ, d(X)]} = ∫ R(θ, d) g(θ) dθ = ∫∫ ℓ[θ, d(x)] f(x|θ) g(θ) dx dθ  (4)

We recall that the decision function d*(x) for which the risk R(g, d) is minimum is called a Bayes decision function with respect to g, and its risk R(g, d*) is called the Bayes risk. We see that the Bayes decision function is equivalent to the Bayes decision.

After the value x has been observed, the DM simply chooses a Bayes decision d*(x) with respect to the posterior distribution h(θ | x) rather than the prior distribution g(θ).

When the observed value x is known, the DM can choose a Bayes decision d*(x) with respect to the posterior distribution. At this stage of the decision-making process, the DM is not interested in the risk R(g, d), which is an average over all possible observations, but in the posterior risk

∫ ℓ[θ, d(x)] h(θ|x) dθ,  (5)

which is the risk from the decision he or she is actually making.

1.4 Application to point estimation

For problems of estimation, it is common to assume a quadratic loss function. In this case, the estimator becomes the mean of the posterior distribution of θ. It can be easily shown (O'Hagan 1994, Weerahandi 1995) that under quadratic loss the optimal Bayes estimate is the posterior mean.

When a uniform prior is used, the posterior distribution is proportional to the likelihood function of θ.



h(θ | x) = f(x|θ) / p(x) ∝ l(x | θ)  (6)

Thus, the optimal Bayes estimator is the first moment of the likelihood function. If the likelihood is symmetric about its maximum, as is the case for a normal likelihood, the optimal Bayes estimator is then the Maximum Likelihood Estimator (MLE). Important results concerning alternative asymmetric precautionary loss functions, presenting a general class of precautionary loss functions with the quadratic loss functions as a special case, are introduced by Norstrom (1996).

2 BAYES DECISION MAKING IN HYPOTHESIS TESTING

2.1 Application to hypothesis testing

Now consider a hypothesis H, that θ ∈ Θ0 ⊂ Θ. Inference consists of accepting or rejecting H. A correct inference will result in zero loss. The loss associated with an incorrect inference may depend on the kind of error. Let d0 be the inference to accept H and d1 be the inference to reject H, so D = {d0, d1}. Let Θ1 = (Θ0)^C. Then we let

ℓ(θ, di) = 0   if θ ∈ Θi
         = ai  if θ ∉ Θi  (7)

for i = 0, 1. Therefore

E(ℓ(θ, di)) = ai P(θ ∉ Θi)  (8)

and the optimal inference (O'Hagan 1994) is to reject H if

E(ℓ(θ, d1)) < E(ℓ(θ, d0)),  (9)

i.e. if a1 P(θ ∈ Θ0) < a0 {1 − P(θ ∈ Θ0)},  (10)

or

P(θ ∈ Θ0) < a0 / (a0 + a1)  (11)

Therefore, we reject H if its probability is sufficiently low. The critical value on the right side of equation (11) depends on the relative seriousness of the kinds of error, as measured by a0/a1.

In what follows we also adopt a conservative strategy of rejecting H when its probability is small. In terms of Bayesian decision theory, this strategy implies that the loss for Type I error is much greater than for Type II error.

This formulation of the hypothesis testing problem is very simple because we have expressed the loss function in very stark terms. If we choose d0 and it turns out that θ ∈ Θ1, then the inference is wrong and incurs a loss of a0 regardless of where θ lies in Θ1. In a practical decision problem, as mentioned below (cost-oriented acceptance sampling), it may be more appropriate to make a0 and a1 functions of θ. Then E(ℓ(θ, di)) will not have a simple formula, but it may be computed for i = 0, 1 and an optimal inference thereby selected.

3 BAYES DECISION MAKING IN ACCEPTANCE SAMPLING AND QUALITY CONTROL

3.1 Acceptance sampling

Acceptance sampling relates to the acceptance or rejection of a product or process on the basis of SPI. SPI is the process of evaluating the quality of a product by inspecting some, but not all, of the items. Its methods constitute decision rules for the disposition or sentencing of the product sampled. In this sense it may be contrasted with survey sampling, the purpose of which is largely estimation. Sampling plans, which specify sample size and acceptance criteria, are fundamental to acceptance sampling.

In a classical acceptance sampling scheme, sample test data are used to construct a hypothesis test for the specified parameter. If the hypothesis is not rejected and the sample is accepted, the distribution of the quality characteristic (such as, for example, time to failure) can be calculated from the null distribution.

However, when prior information is available for the relevant parameter values and a Bayesian procedure is employed, acceptance criteria are not so easy to establish. The relation between a posterior distribution for the parameter and the likely distribution of the quality characteristic of individual items is much more complex than the relation of the individual quality characteristic to a single null parameter value (Bris 2002). However, using the Bayes approach we can take into account the cost-oriented statistical decision problem demonstrated below, i.e. the loss associated with over-estimation (below specified quality) and the loss associated with under-estimation (higher than specified quality).

3.2 Description of a common decision problem for quality characteristic

A common decision theory situation is determining whether a product meets a pre-determined quality specification. The decision is to either accept or reject the product, so superficially it would
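Rule (11) can be checked numerically; a small Python sketch (the loss values and posterior probabilities are illustrative):

```python
def reject_H(p_H, a0, a1):
    # Rule (11): reject H when its posterior probability falls below
    # the critical value a0 / (a0 + a1).
    return p_H < a0 / (a0 + a1)

# Conservative strategy: the loss a1 of wrongly rejecting H (Type I error)
# is much larger than a0, so the critical value a0/(a0+a1) is small and H
# is rejected only when its probability is very low.
print(reject_H(0.04, a0=1.0, a1=19.0))   # True  (critical value 0.05)
print(reject_H(0.10, a0=1.0, a1=19.0))   # False
```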



appear to be a simple hypothesis testing problem. However, both the null and alternate hypotheses are composite, each embodying many parameter values. The true loss is likely to depend on the actual parameter value rather than on whether the parameter value exceeds some precisely defined specification. (Such a problem is of particular interest whenever reliability demonstration testing of highly reliable products is required.)

Suppose the parameter θ cannot exceed some specification limit θ0. Values less than θ0 mean better than specified quality; values greater than θ0 mean increasingly worse quality. The smaller the value of θ below the specification limit θ0, the greater the loss to the manufacturer, who has exceeded the quality requirement most likely at some additional cost. The greater the value of θ above the specification limit θ0, the greater the loss to either the manufacturer or the consumer in terms of low product quality.

In this example, a quadratic loss function would seem to be a reasonable model. The greater the distance from the specification value θ0, the greater the loss. However, the loss associated with over-estimation (below specified quality) is unlikely to equal the loss associated with under-estimation (higher than specified quality). In most circumstances, the loss associated with higher than specified quality should be lower than the cost of lower than specified quality. However, there may be exceptions depending on how the specification limit was set and the type of product being tested. Therefore, an asymmetric quadratic loss function would be a reasonable choice for this decision problem.

Let m1 be the manufacturer's cost of exceeding the quality specification, m2 the manufacturer's cost of a rejected sample, m3 the manufacturer's cost of an accepted sample which fails to meet the quality specification, and k the consumer's cost of an accepted sample which fails to meet the quality specification. Then

ℓ(θ, d0) = m1 (θ − θ0)²;        θ ≤ θ0
         = (m3 + k)(θ − θ0)²;   θ > θ0  (12)

ℓ(θ, d1) = m1 (θ − θ0)² + m2;   θ ≤ θ0
         = m2;                  θ > θ0  (13)

The idea is demonstrated schematically in Figure 1.

Figure 1. Asymmetric loss function for cost-dependent acceptance sampling.

Suppose that on the basis of some test observation X the posterior distribution of θ is known. For each observation X the decision function d must take one of two values, d0 or d1, which mean Accept or Reject. The Bayes-risk-minimizing decision function is the one which minimizes the posterior expected loss at each value of X. The expected posterior loss for each of the two possible actions is:

E_h(θ|x)[ℓ(θ, d0)] = ∫ ℓ(θ, d0) h(θ|x) dθ
  = m1 ∫_{−∞}^{θ0} (θ − θ0)² h(θ|x) dθ + (m3 + k) ∫_{θ0}^{+∞} (θ − θ0)² h(θ|x) dθ  (14)

E_h(θ|x)[ℓ(θ, d1)] = ∫ ℓ(θ, d1) h(θ|x) dθ
  = m1 ∫_{−∞}^{θ0} (θ − θ0)² h(θ|x) dθ + m2  (15)

The difference between the expected posterior loss of accepting and of rejecting the sample is:

E_h(θ|x)[ℓ(θ, d0)] − E_h(θ|x)[ℓ(θ, d1)] = (m3 + k) ∫_{θ0}^{+∞} (θ − θ0)² h(θ|x) dθ − m2  (16)

Therefore, the optimal Bayes Decision Rule (BDR) is to accept the sample whenever:

(m3 + k) ∫_{θ0}^{+∞} (θ − θ0)² h(θ|x) dθ < m2  (17)

That is, the sample is accepted whenever the expected total cost to both manufacturer and consumer of accepting a sample which fails to meet specifications is less than the manufacturer's cost of discarding any sample, whether or not it meets specifications.

3.3 Example in case of uniform prior

If a uniform prior is assumed, the decision rule becomes
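A numerical sketch of acceptance rule (17) in Python; the normal posterior h(θ|x) and all cost values below are assumed for illustration and are not from the paper:

```python
# Numerical sketch of rule (17): accept the sample when
# (m3 + k) * integral_{theta0}^{inf} (theta - theta0)^2 h(theta|x) dtheta < m2.
import math

theta0 = 1.0                  # specification limit
mu, sigma = 0.8, 0.1          # assumed posterior h(theta|x): N(mu, sigma^2)
m2, m3, k = 0.05, 10.0, 10.0  # costs (illustrative)

# midpoint-rule approximation of the tail integral over theta > theta0
step = 1e-4
tail = sum(
    (t - theta0) ** 2
    * math.exp(-0.5 * ((t - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    for t in (theta0 + (i + 0.5) * step for i in range(20000))
) * step

accept = (m3 + k) * tail < m2   # rule (17): expected cost of a bad accept vs m2
```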



(m3 + k) / p(x) ∫_{θ0}^{+∞} (θ − θ0)² f(x|θ) dθ < m2  (18)

where

p(x) = ∫ f(x|θ) dθ  (19)

Of course the relative magnitudes of the losses m1, m3 and k, as well as the scale factor p(x), are important in determining the optimal decision, but in general the sample will be accepted when the likelihood of x for those parameter values θ > θ0, associated with an unacceptable sample, is small.

4 EXAMPLE FROM INDUSTRY

4.1 Reliability demonstration testing the failure rate of an exponential distribution

We give a good motivating example from industry (Bris 2002) for the use of the BDR.

Notation
MLE       Maximum Likelihood Estimator
RDT       Reliability Demonstration Testing
λ0        requested (by consumer) limit of failure rate at normal condition
λ2        failure rate in accelerated condition
i         index for condition given by temperature; i = 1, 2
λ1        failure rate tested at given (normal) temperature condition
ti        total test time in condition i during which ri failures have occurred
(ti, ri)  parameters of acceptance sample plan in condition i
β*        posterior risk, Pr{λ1 > λ0 | passing RDT}
1 − β*    consumer's posterior assurance
A         acceleration factor

The consumer asks for the following limitation in reliability of delivered Electronic Components (EC): λ1 ≤ λ0. The MLE of λ1 for time-censored data is given by

λ̂1 = r1 / t1  (20)

However, the total time t1 is usually very large even when r1 = 1 is admitted. Let us perform RDT in an accelerated condition:

λ̂2 = r2 / t2  (21)

but λ2 = A λ1 and λ1 ≤ λ0, so that the condition for r2, t2 is given by

r2 / t2 ≤ A λ0  (22)

Using the Bayes approach, we consider the failure rates and the acceleration factor to vary randomly according to some prior distributions. To meet condition (22), possible acceptance sample plans for RDT are presented in Table 1, given real data: λ0 = 8.76e-4 /year, A = 13.09.

The posterior distribution of λ2, conditioned by t2 and r2 and derived in (Bris 2000), is given as follows:

h(λ2 | t2, r2) ∝ λ2^{r2} exp(−λ2 t2) ∫_0^b [exp(−a λ2 u) / (c + u)²] du,

where a = 7.247e6, b = 2.1063, c = 0.001435  (23)

Having the knowledge of the posterior, we can optimize not only the variance Var{λ2 | t2, r2} (demonstrated in Figure 2, in units [hour⁻²]) and the mean E{λ2 | t2, r2}; moreover, we are able to quantify the consumer's posterior assurance 1 − β*, given by the following relationship:

1 − β* = Pr{λ1 ≤ λ0 | passing RDT}  (24)

as demonstrated in Figure 3.

Using the criterion of small variance of λ2 at acceptable test time for RDT, sample plan P2 seems to be optimal. The process of optimization is demonstrated in Figure 2, which corresponds with the results in Table 1. Achieving the variance of 7.6e-13/hour² (i.e. 0.5832e-4/year²) requires a minimal total test time t2 = 1.53e6 test hours at maximally 2 allowed failures (r2 = 2). The calculated posterior assurance that λ1 ≤ λ0, 1 − β*, in a test that has passed is 77% (Figure 3).

The optimal Bayes decision rule derived above is to accept the sample whenever:

(m3 + k) ∫_{λ0}^{+∞} (λ2 − λ0)² h(λ2 | t2, r2) dλ2 < m2,

i.e.

∫_{λ0}^{+∞} (λ2 − λ0)² h(λ2 | t2, r2) dλ2 < m2 / (m3 + k)  (25)

where h(λ2 | t2, r2) is given by (23). Computing the value on the left side of the last equation, we can obtain a clear limitation on cost in the given example of
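The left side of rule (25) can be evaluated numerically. A Python sketch with a simplified stand-in posterior: instead of the paper's posterior (23), which also averages over acceleration-factor uncertainty, we use Gamma(r2 + 1, rate t2) — the posterior of an exponential failure rate under a flat prior. All numbers below are illustrative choices, not a reproduction of the paper's computation:

```python
# Sketch of decision rule (25) with a Gamma(r2+1, rate t2) stand-in posterior
# (flat-prior exponential model; the paper's posterior (23) is more involved).
import math

def gamma_pdf(lam, shape, rate):
    return rate ** shape * lam ** (shape - 1) * math.exp(-rate * lam) / math.gamma(shape)

def lhs_rule25(lam0, r2, t2, n=20000):
    # midpoint-rule approximation of integral_{lam0}^{inf} (lam-lam0)^2 h(lam) dlam
    upper = lam0 + 20 * (r2 + 1) / t2          # generous upper cut-off
    step = (upper - lam0) / n
    return sum(
        (lam - lam0) ** 2 * gamma_pdf(lam, r2 + 1, t2)
        for lam in (lam0 + (i + 0.5) * step for i in range(n))
    ) * step

lam0 = 1.147e-2        # approx. A * lambda0 = 13.09 * 8.76e-4 per year
r2, t2 = 2, 175.0      # a P2-like plan: 2 failures, roughly 1.53e6 hours in years
lhs = lhs_rule25(lam0, r2, t2)
accept = lhs < 2e-4    # accept if lhs < m2 / (m3 + k); 2e-4 as quoted in the text
```

With this simplified posterior the integral comes out on the order of 1e-4, the same order of magnitude as the value quoted in the paper, though not equal to it.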



Table 1. Sample plans for RDT with corresponding characteristics.

Sample plan   t2 (10^6 test hours)   r2   E{λ2 | t2, r2} (e-2/year)   Var{λ2 | t2, r2} (e-4/year²)   √Var{λ2 | t2, r2} (e-2/year)
P1            2.3                    3    1.253                       0.4036                         0.635
P2            1.53                   2    1.297                       0.5832                         0.764
P3            0.76                   1    1.402                       1.036                          1.018
P4            5.0                    6    1.113                       0.077                          0.277

Figure 2. The optimization process for t2.

Figure 3. Posterior assurance that λ1 ≤ λ0 (λ0 = 8.76e-4 /year) for passing tests in dependence on total test time t2.

RDT. For example, taking into account sample plan P2, the value is about 2e-4.

In all practical production situations, the following relation is usually valid:

(m3 + k) ≫ m2,  i.e.  0 < m2 / (m3 + k) ≪ 1  (26)

Of course, the later a newly developed product (with a covert fail) is taken off the market, the more expensive it is (from both the producer's and the consumer's point of view). We can recall many past examples from industry concerning this problem (the car industry, producers of electronic devices, etc.). The findings of the paper answer the question of which cost relation should be observed in acceptance sampling plans to make an optimal decision. We usually know that the fraction in relationship (26) is far less than 1, but in practice we do not know how close to zero it should be. In our example, the optimal decision is made whenever:

2e-4 ≤ m2 / (m3 + k) ≤ 1,  i.e.  (m3 + k) · 2e-4 ≤ m2

5 CONCLUSIONS

A Bayes framework for hypothesis testing was developed and proposed for use in acceptance sampling plans. Asymmetric cost-oriented loss functions are proposed to allow optimal decision making in acceptance sampling plans.

An optimal Bayes decision rule considering the loss associated with both higher and lower than specified quality was derived.

Possibilities of using the rule in practice are also demonstrated on a given example from the electronics industry.

ACKNOWLEDGEMENT

This work was supported by The Ministry of Education, Youth and Sports from the National Programme of Sustainability (NPU II) project IT4Innovations excellence in science LQ1602.



REFERENCES

Berger, J.O. 1993. Statistical Decision Theory and Bayesian Analysis, 3rd Edition, Springer, New York.
Briš, R. 2000. Bayes approach in RDT using accelerated and long-term life data. Reliability Engineering and System Safety 67: 9–16, Elsevier.
Briš, R. 2002. Using posterior information about the distribution of failure time parameter in acceptance sampling schemes. ESREL 2002 — European Conference on System Dependability and Safety, Lyon, France, March 18–21, 2002, Proceedings of the Conference II: 627–630.
Dodge, H.F. & Romig, H.G. 1959. Sampling Inspection Tables, 2nd Edition, John Wiley & Sons, New York.
Godfrey, A.B. & Mundel, A.B. 1984. Guide for selection of an acceptance sampling plan. Journal of Quality Technology 16: 50–55.
Norstrom, J.G. 1996. The use of precautionary loss functions in risk analysis. IEEE Transactions on Reliability, Vol. 45, No. 3, September 1996.
O'Hagan, A. 1994. Kendall's Advanced Theory of Statistics, Volume 2B: Bayesian Inference. ISBN 0340529229, University Press, Cambridge.
Weerahandi, S. 1995. Exact Statistical Methods for Data Analysis. Springer Verlag, ISBN 0387943609.



Applied Mathematics in Engineering and Reliability — Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

High-dimensional simulation experiments with particle filter and ensemble Kalman filter

P. Bui Quang & V.-D. Tran
Hoa Sen University, Ho Chi Minh City, Vietnam

ABSTRACT: Particle filter and ensemble Kalman filter are two Bayesian filtering algorithms adapted to nonlinear state-space models. The problem of nonlinear Bayesian filtering is challenging when the state dimension is high, since approximation methods tend to degrade as dimension increases. We experimentally investigate the performance of particle filter and ensemble Kalman filter as the state dimension increases. We run simulations with two different state dynamics: a simple linear dynamics, and the Lorenz-96 nonlinear dynamics. In our results, the approximation error of both algorithms grows at a linear rate when the dimension increases. This linear degradation appears to be much slower for ensemble Kalman filter than for particle filter.

1 INTRODUCTION

Bayesian filtering consists in computing the conditional distribution of the unobserved state of a dynamical system w.r.t. available observations (the posterior distribution), in order to estimate the state. When the state dynamics and the observation model are linear with additive Gaussian noise, the Kalman filter yields optimal state estimation. Otherwise, approximation methods must be used. The Particle Filter (PF) and the Ensemble Kalman Filter (EnKF) are two algorithms that compute an approximation of the posterior in the form of a random sample of points. These points are weighted in PF, whereas in EnKF they are not attached to a weight.

Approximation problems are known to be more difficult when the state dimension is large. This phenomenon, referred to as the curse of dimensionality (Bengtsson, Bickel, & Li 2008, Daum & Huang 2003), has been observed in particle filtering. In (Bengtsson, Bickel, & Li 2008, Bickel, Li, & Bengtsson 2008, Bui Quang, Musso, & Le Gland 2010, Snyder, Bengtsson, Bickel, & Anderson 2008), the authors argue that the particle sample size must increase exponentially with the dimension to avoid the curse of dimensionality. A simple linear target tracking problem is analyzed in (Bui Quang, Musso, & Le Gland 2010), and a theoretical study is made in (Bengtsson, Bickel, & Li 2008, Bickel, Li, & Bengtsson 2008, Snyder, Bengtsson, Bickel, & Anderson 2008) under particular model assumptions. The problem occurring in PF when the dimension is large is that the particle approximation tends to collapse to a single Dirac measure, i.e. all the points but one have a zero weight and only one point has a nonzero weight. In this case, the particle approximation of the posterior, and therefore the state estimation, is very poor. A strategy to avoid this problem has recently been proposed in (Beskos, Crisan, & Jasra 2014). In EnKF, however, the sample points are not weighted, so this weight degeneracy phenomenon cannot occur (Le Gland, Monbet, & Tran 2011). EnKF is a popular algorithm for models in which the state dimension is very large, such as models describing geophysical systems (Evensen 2009, van Leeuwen 2009).

In this paper, we conduct two simulation experiments in which we compare the performance of PF and EnKF as the state dimension increases. Algorithm performance is assessed in terms of the mean squared error between the approximated posterior mean and the true posterior mean. Numerical results clearly indicate that the error increases linearly with the state dimension for both algorithms. This increase is slower for EnKF than for PF.

The outline of the paper is as follows. In Section 2, we present the problem of Bayesian filtering in state-space models. The PF and EnKF algorithms, and the type of models to which they apply, are then described in Section 3. In Section 4, we describe how we assess the performance of the algorithms. We define the algorithm error, which is related to the algorithm approximation, and the model error, which is related to the intrinsic model uncertainty and is independent of the algorithm. Simulation experiments are conducted in Section 5 and conclusions are drawn in Section 6.

Throughout the paper, we use the following notations: a1:n = (a1, …, an), In is the n × n identity matrix, 0n,p is the n × p zero matrix.

The (MATLAB) computer code used for simulation experiments is available from the first author.



2 STATE-SPACE MODELS AND BAYESIAN FILTERING

A state-space model is a time series model describing the observation of a dynamical system. The system is characterized by a hidden (unobserved) state variable taking values in R^m. State-space models are hidden Markov models where the state variable is continuous (Cappé, Moulines, & Ryden 2005).

The state dynamics is a Markov process with transition kernel

Xk | Xk−1 = xk−1 ∼ qk(xk−1, ·),  (1)

with initial distribution X0 ∼ q0 (in terms of densities). The observation model is

Yk | Xk = xk ∼ gk(xk, ·),  (2)

where gk is the likelihood, i.e. gk(xk, yk) is the density of the conditional probability P[Yk ∈ dyk | Xk = xk]. The observation sequence {Yk}k≥0 verifies

P[Yk ∈ dyk | X0:n] = P[Yk ∈ dyk | Xk].

The problem of Bayesian filtering consists in computing sequentially the conditional distribution of the hidden state Xk w.r.t. past observations Y0:k−1, called the predictor, and the conditional distribution of Xk w.r.t. all available observations Y0:k, called the posterior, at each time step k ≥ 0. The predictor and the posterior, respectively denoted pk|k−1 and pk, obey the recursive relations

pk|k−1(x) = ∫_{R^m} pk−1(z) qk(z, x) dz,  (3)

pk(x) = gk(x) pk|k−1(x) / ∫_{R^m} gk(x) pk|k−1(x) dx,  (4)

with the convention p0|−1 = q0 (in terms of densities). Equation (3) is the prediction step and Equation (4) is the update step. Note that Equation (4) is the Bayes formula, where the prior is the predictor.

3 PARTICLE FILTER AND ENSEMBLE KALMAN FILTER

We now present the two nonlinear Bayesian filtering techniques considered in the paper: the particle filter and the ensemble Kalman filter. Comprehensive presentations of these algorithms can be found, for example, in (Arulampalam, Maskell, Gordon, & Clapp 2002, Cappé, Moulines, & Ryden 2005) for the particle filter and in (Evensen 2003, Evensen 2009) for the ensemble Kalman filter.

3.1 Particle Filter

Particle Filters (PF), also known as sequential Monte Carlo methods, are Bayesian filtering algorithms for general state-space models, as described in Equations (1)–(2). They are based on the principle of importance sampling. They recursively approximate the predictor and the posterior by a weighted sample of points, called particles, in the state space, i.e. a weighted sum of Dirac measures Σ_{i=1}^N w_k^i δ_{ξ_k^i}, where ξ_k^i denotes the particle position and w_k^i denotes the particle weight.

A common problem with particle filtering is weight degeneracy, which occurs when most of the particles have a weight that is numerically zero, whereas a few particles have a nonzero weight. Weight degeneracy yields a poor particle approximation, and it is particularly severe when the state dimension is high (Bengtsson, Bickel, & Li 2008, Bickel, Li, & Bengtsson 2008, Bui Quang, Musso, & Le Gland 2010, Daum & Huang 2003, Snyder, Bengtsson, Bickel, & Anderson 2008, van Leeuwen 2009). A common strategy to (partially) avoid it is to resample the particles according to their weights, i.e. the probability that a particle is drawn equals its weight. This multinomial resampling tends to discard particles with a low weight and to duplicate particles with a large weight. Weight degeneracy can be quantified by the effective sample size Neff = 1 / Σ_{i=1}^N (w_k^i)². Neff verifies 1 ≤ Neff ≤ N, and it is small in case of weight degeneracy.

There exist many versions of particle filters. The first practical implementation of PF was proposed in (Gordon, Salmond, & Smith 1993). We give in Algorithm 1 below the implementation of the most classical particle filter, called SIR (sequential importance resampling), where particles are propagated according to the Markov dynamics explicitly given by Equation (1), and resampled if the effective sample size is too small.

3.2 Ensemble Kalman Filter

The Ensemble Kalman Filter (EnKF) is applicable to state-space models where the observation model is linear with additive noise, i.e.

Yk = Hk Xk + Vk,  (5)

where Hk is a d × m matrix and Vk ∼ pVk.

In EnKF, the predictor and the posterior are approximated by a sample of points, like in PF. This sample is referred to as an ensemble here.



Ensemble members are not weighted as in PF. Instead, each point is moved according to an affine transformation that mimics the update step of the Kalman filter. This transformation involves a gain matrix which depends on the predictor covariance matrix. The predictor covariance matrix can be approximated using ensemble members, as

P^N_{k|k−1} = (1/N) Σ_{i=1}^N (ξ^i_{k|k−1} − m^N_{k|k−1})(ξ^i_{k|k−1} − m^N_{k|k−1})^T  (6)

with m^N_{k|k−1} = (1/N) Σ_{i=1}^N ξ^i_{k|k−1}, where {ξ^i_{k|k−1}}_i is a sample approximation of the predictor. The gain matrix is then

K^N_k = P^N_{k|k−1} H_k^T (H_k P^N_{k|k−1} H_k^T + R^N)^{−1},  (7)

where R^N is the sample covariance matrix of the i.i.d. sample V^1, …, V^N from the noise distribution pVk.

EnKF is displayed in Algorithm 2 below. This algorithm has been developed for data assimilation problems arising in geophysics (especially meteorology and oceanography). Seminal papers on EnKF are (Burgers, van Leeuwen, & Evensen 1998, Evensen 1994).

In practice, the sample covariance matrix P^N_{k|k−1} defined in Equation (6) does not need to be computed or stored. Indeed, the gain formula in Equation (7) shows that only the matrix product P^N_{k|k−1} H_k^T is required. Firstly, P^N_{k|k−1} H_k^T has size m × d, whereas P^N_{k|k−1} has size m × m. Secondly,

P^N_{k|k−1} H_k^T = (1/N) Σ_{i=1}^N (ξ^i_{k|k−1} − m^N_{k|k−1})(ξ^i_{k|k−1} − m^N_{k|k−1})^T H_k^T
  = (1/N) Σ_{i=1}^N (ξ^i_{k|k−1} − m^N_{k|k−1}) [H_k (ξ^i_{k|k−1} − m^N_{k|k−1})]^T
  = (1/N) Σ_{i=1}^N (ξ^i_{k|k−1} − m^N_{k|k−1}) (h_k^i)^T,

where h_k^i = H_k (ξ^i_{k|k−1} − m^N_{k|k−1}). To get P^N_{k|k−1} H_k^T, the following operations (scalar additions or multiplications) are performed for each ensemble member:
– d scalar products of m-dimensional vectors to compute h_k^i, requiring O(dm) operations,
– one (matrix) multiplication of an m-dimensional column vector with a d-dimensional row vector, to compute (ξ^i_{k|k−1} − m^N_{k|k−1})(h_k^i)^T, requiring O(dm) operations,
yielding O(Ndm)
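The analysis step with this economy can be sketched in code (a stand-in for the Algorithm 2 box, which is not reproduced in this extraction; sizes and values are illustrative, and a perturbed-observation update is assumed):

```python
# Sketch of an EnKF analysis step computing only P^N H^T (Equations (6)-(7)),
# never the full m-by-m covariance, following the operation-count argument above.
import numpy as np

rng = np.random.default_rng(1)
m, d, N = 50, 3, 100                    # state dim, observation dim, ensemble size
H = np.zeros((d, m))
H[np.arange(d), np.arange(d)] = 1.0     # observe the first d state components
R = 0.2 * np.eye(d)

ensemble = rng.normal(0.0, 1.0, (N, m))  # predictor ensemble (one member per row)
y = rng.normal(0.0, 1.0, d)              # observation

anomalies = ensemble - ensemble.mean(axis=0)   # xi^i - m^N
h = anomalies @ H.T                            # h^i = H (xi^i - m^N), size N x d
PHt = anomalies.T @ h / N                      # P^N H^T, size m x d (P^N never formed)
noise = rng.multivariate_normal(np.zeros(d), R, N)
RN = np.cov(noise, rowvar=False)               # sample noise covariance R^N
K = PHt @ np.linalg.inv(h.T @ h / N + RN)      # gain (7): H P^N H^T = h^T h / N
analysis = ensemble + (y + noise - ensemble @ H.T) @ K.T   # perturbed-obs update
```

The update pulls the observed components of the ensemble mean toward the observation, while only m × d and d × d matrices are ever materialised.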



operations in total. On the other hand, to get P^N_{k|k−1}, an m-dimensional column vector is multiplied with an m-dimensional row vector to compute (ξ^i_{k|k−1} − m^N_{k|k−1})(ξ^i_{k|k−1} − m^N_{k|k−1})^T, requiring O(m²) operations for each ensemble member, yielding O(Nm²) operations in total. Thus, when d ≪ m, as is the case in many practical geophysical models (van Leeuwen 2009), it is less computationally demanding to store and compute P^N_{k|k−1} H_k^T than P^N_{k|k−1}.

3.3 Linear formulation of observation model

State-space models with linear observation are a rather general family of models. In particular, state-space models of the form

Xk = Fk(Xk−1) + Wk,  (8)
Yk = Hk(Xk) + Vk,  (9)

where Fk and Hk are nonlinear mappings, and Wk and Vk are noise, can be cast in this family, and therefore can be handled by EnKF.

Consider the state (column) vector augmented by the observation vector, X̄k = [Xk^T  Yk^T]^T, and the dynamics noise (column) vector augmented by the observation noise vector, Uk = [Wk^T  Vk^T]^T, taking values in R^{m+d}. Let Em = (Im  0m,d) and Ed = (0d,m  Id) be matrices with respective sizes m × (m + d) and d × (m + d). Then the state dynamics

X̄k = F̄k(X̄k−1, Uk),  (10)

where

F̄k(x, u) = [ Fk(Em x) + Em u
             Hk(Fk(Em x) + Em u) + Ed u ],

is a Markov process. Besides, the observation model

Yk = H̄ X̄k,  (11)

where H̄ = Ed, is linear.

We can readily use EnKF to perform Bayesian filtering in the state-space model defined by Equations (10) and (11), as the observation model is linear. Note that there is no observation noise here, i.e. the observation noise variance is zero. EnKF can handle such a model, since rank(H̄) = d (so that the matrix H̄ P^N_{k|k−1} H̄^T + 0d,d is invertible and the gain matrix defined in Equation (7) can be computed). Let p̄k be the conditional density of X̄k w.r.t. Y0:k. Then pk, the conditional density of Xk w.r.t. Y0:k (see Section 2), is obtained by marginalizing p̄k as

pk(x) = pk(x1, …, xm) = ∫_{R^d} p̄k(x1, …, xm, xm+1, …, xm+d) dxm+1 … dxm+d.

In terms of sample approximation, this marginalization consists in removing the d last vector components of the ensemble members, i.e. ξ_k^i = (ξ̄_k^{i,1}, …, ξ̄_k^{i,m})^T, where ξ̄_k^1, …, ξ̄_k^N is the sample approximating p̄k and ξ̄_k^i = (ξ̄_k^{i,1}, …, ξ̄_k^{i,m+d})^T. The predictor approximation (conditional density of Xk w.r.t. Y0:k−1) is obtained similarly by marginalizing the conditional density of X̄k w.r.t. Y0:k−1.

PF cannot be applied when the observation noise is zero. Indeed, the likelihood associated with the observation model (11) takes positive values only over the linear subspace {x̄ ∈ R^{m+d} : H̄ x̄ = yk}. In this case, weighting yields (almost surely) a zero weight for each particle and the algorithm degenerates at the first time iteration. PF, however, can readily be used when the state-space model is in the form of Equations (8)–(9); there is no need to put the observation model in a linear form.

4 PERFORMANCE ASSESSMENT OF BAYESIAN FILTERS

The problem of Bayesian filtering consists in computing the posterior (and the predictor) at each time iteration. From the posterior, one can compute Bayesian estimators of the hidden state. Two classical Bayesian estimators are the Maximum A Posteriori (MAP) estimator, argmax_{x ∈ R^m} {pk(x)}, and the posterior mean E[Xk | Y0:k]. The MAP is not readily available in PF or EnKF, because the posterior approximation is in the form of a sample, not a smooth density. The posterior mean, however, can straightforwardly be approximated by averaging the particles, i.e. E[Xk | Y0:k] ≈ Σ_{i=1}^N w_k^i ξ_k^i in PF and E[Xk | Y0:k] ≈ (1/N) Σ_{i=1}^N ξ_k^i in EnKF. Besides, the posterior mean has the minimum mean squared error among unbiased estimators, i.e. for every estimator X̂k of the hidden state such that E[X̂k] = E[Xk] we have that

E |X̂k − Xk|² ≥ E |E[Xk | Y0:k] − Xk|²

(where |·| denotes the Euclidean norm in R^m). Hence, in this paper we use the posterior mean as the state estimator in PF and EnKF.

A natural way to assess the performance of filtering algorithms is to compute the difference between the estimated state X̂k and the true state Xk. The true state is defined as the state value from which the observations are generated, i.e. it verifies



Y_k = g_k(X_k). In a simulation framework, the true state is known since the observations are generated in the experiment. The difference between X̂_k and X_k can be decomposed in two terms, as

X̂_k − X_k = (X̂_k − E[X_k | Y_{0:k}]) + (E[X_k | Y_{0:k}] − X_k),

which are related to different sources of error. The first term X̂_k − E[X_k | Y_{0:k}] is related to the algorithm approximation of the posterior mean, hence we refer to it as algorithm error. The second term E[X_k | Y_{0:k}] − X_k is related to the model and to the choice of the posterior mean as state estimator, hence we refer to it as model error.

In the simulations presented in this paper, we analyze separately these two error terms by computing their squared mean at final time step k = n. To do so, we generate R i.i.d. true state sequences X_{0:n}^1, ..., X_{0:n}^R, from which we generate R observation trajectories Y_{0:n}^1, ..., Y_{0:n}^R, where Y_k^r = g_k(X_k^r) for all r ∈ {1, ..., R} and k ∈ {0, ..., n}. The (total) Mean Squared Error (MSE), the algorithm MSE, and the model MSE, are then respectively approximated as

MSE = E|X̂_n − X_n|² ≈ (1/R) Σ_{r=1}^R |X̂_n^r − X_n^r|²,

algorithm MSE = E|X̂_n − E[X_n | Y_{0:n}]|² ≈ (1/R) Σ_{r=1}^R |X̂_n^r − E[X_n | Y_{0:n}^r]|²,

model MSE = E|E[X_n | Y_{0:n}] − X_n|² ≈ (1/R) Σ_{r=1}^R |E[X_n | Y_{0:n}^r] − X_n^r|²,

where X̂_n^r is the algorithm approximation of the posterior mean using observations Y_{0:n}^r.

5 HIGH-DIMENSIONAL SIMULATIONS

5.1 Linear dynamics

We firstly consider a very simple state-space model where the state dynamics and the observation model are linear with additive Gaussian white noise,

X_k = X_{k-1} + W_k,
Y_k = H X_k + V_k,

where W_k ~ N(0, Q) and V_k ~ N(0, σ²), and where H = (1 0 ... 0) is a 1 × m matrix. The initial state X_0 follows the distribution N(0, Q_0).

The true state sequence is simulated thanks to the state dynamics, i.e. X_0^r ~ N(0, Q_0) and X_k^r | X_{k-1}^r = x_{k-1}^r ~ N(x_{k-1}^r, Q) for all r ∈ {1, ..., R} and k ∈ {1, ..., n}. Consequently, the model MSE E|E[X_n | Y_{0:n}] − X_n|² is equal to E[E[|X_n − E[X_n | Y_{0:n}]|² | Y_{0:n}]], which is the trace of the posterior covariance matrix. In a linear Gaussian model, the exact posterior mean and posterior covariance matrix are given by the Kalman filter at each time step, so that the model MSE can easily be computed, without approximation.

The model parameters are set as follows: Q_0 = I_m, Q = 10^{-6} I_m, σ = 1. The number of time iterations is set to n = 100.

5.2 Lorenz96 nonlinear dynamics

We secondly consider a state-space model where the state dynamics is the highly nonlinear Lorenz96 dynamics and the observation model is linear. The Lorenz96 dynamics is a standard dynamical model in geophysical data assimilation (Hoteit, Pham, Triantafyllou, & Korres 2008, Nakano, Ueno, & Higuchi 2007).

The Lorenz96 dynamics is a deterministic time-continuous multidimensional dynamics. It is defined by the nonlinear ordinary differential equation

dx_{t,j}/dt = (x_{t,j+1} − x_{t,j-2}) x_{t,j-1} − x_{t,j} + f, (12)

for j ∈ {1, ..., m}, where x_t = (x_{t,1}, ..., x_{t,m})^T ∈ R^m and f is a scalar constant parameter. By convention, x_{t,-1} = x_{t,m-1}, x_{t,0} = x_{t,m} and x_{t,m+1} = x_{t,1}. Equation (12) is discretized with the (fourth order) Runge-Kutta method to get the discrete-time dynamics x_{t+h} = F(x_t), where h is the discretization step. The state dynamics in the state-space model is then

X_k = F(X_{k-1}) + W_k,

where W_k ~ N(0, Q) and X_0 ~ N(m_0, Q_0). The observation is the first component of vector X_k disrupted by an additive Gaussian noise with variance σ², i.e. the observation model is the same as in Section 5.1 above.

The true state sequence is generated thanks to the discretized Lorenz96 dynamics, without dynamics noise, with a fixed initial condition x_0, i.e. X_0^r = x_0 and X_k^r = F(X_{k-1}^r) for all r ∈ {1, ..., R} and k ∈ {1, ..., n}. Thus, the true state sequence is deterministic here and needs to be generated only once. Note that, unlike in the linear Gaussian model in Section 5.1, the exact posterior mean cannot be computed here. We must therefore approximate it accurately to get a reference value to compute the MSEs.

The model parameters are set as follows: m_0 = (0, ..., 0)^T, Q_0 = 64 I_m, Q = 0.2 I_m, σ = 1, f = 8, h = 0.005. The number of time iterations is set to n = 2000.



5.3 Numerical results

We run PF (Algorithm 1) and EnKF (Algorithm 2) to perform Bayesian filtering in the two models described above. In both algorithms, we set the sample size to N = 100. In PF, the threshold for multinomial resampling is set to N_th = 2N/3. The number of simulation runs is R = 100.

Results for the linear dynamics model from Section 5.1 are presented in Figures 1 and 2. Figure 1 displays the time evolution of the total MSE for PF, EnKF, and (optimal) Kalman filter, when the state dimension is m = 1, 4, and 16. The performance of PF and EnKF reaches optimality when m = 1, but it diverges from optimality when m increases. This divergence is more severe for PF than for EnKF.

Figure 2 displays the evolution of the model MSE and the algorithm MSE, for PF and EnKF, at final time step (k = n), when the state dimension increases. The model MSE and the algorithm MSE increase at a linear rate with dimension. The increase of algorithm MSE is faster for PF than for EnKF. Figure 2 illustrates that PF is less robust to high dimensionality than EnKF for this linear model.

Figure 1. Total MSE vs. time for PF (solid line), EnKF (dashed line) and Kalman filter (dotted line), for different state dimensions (linear dynamics model).

Figure 2. Model MSE and algorithm MSE for PF and EnKF when the state dimension increases (linear dynamics model).
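Since, in the linear Gaussian model, the model MSE equals the trace of the Kalman posterior covariance, it can be computed exactly by iterating the covariance recursion, which does not depend on the simulated observations. Below is a hedged sketch (assuming the filter is initialized at the prior N(0, Q_0); the paper's exact Algorithm 2 and initialization details are not reproduced here).

```python
import numpy as np

def kalman_posterior_cov(m, n=100, q=1e-6, sigma2=1.0):
    """Posterior covariance P_{k|k} for X_k = X_{k-1} + W_k, Y_k = H X_k + V_k,
    with H = (1 0 ... 0), Q = q*I_m, Var(V_k) = sigma2, Q_0 = I_m.

    The recursion is observation-independent, so the model MSE
    (trace of the posterior covariance at time n) needs no simulation.
    """
    H = np.zeros((1, m)); H[0, 0] = 1.0
    P = np.eye(m)                      # initial covariance Q_0 = I_m
    Q = q * np.eye(m)
    for _ in range(n):
        P_pred = P + Q                 # prediction step: F = I_m
        S = H @ P_pred @ H.T + sigma2  # innovation variance
        K = P_pred @ H.T / S           # Kalman gain
        P = (np.eye(m) - K @ H) @ P_pred
    return P

model_mse = np.trace(kalman_posterior_cov(m=4))
```

Because only the first state component is observed, the unobserved components keep a posterior variance close to their prior variance, so the trace (and hence the model MSE) grows roughly linearly with m, consistent with Figure 2.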



Figure 3. Total MSE vs. time for PF (solid line) and EnKF (dashed line), for different state dimensions (Lorenz96 nonlinear dynamics model).

Figure 4. Model MSE and algorithm MSE for PF and EnKF when the state dimension increases (Lorenz96 nonlinear dynamics model).

Results for the nonlinear Lorenz96 dynamics model from Section 5.2 are presented in Figures 3 and 4. Figure 3 displays the time evolution of the total MSE for PF and EnKF, when the state dimension is m = 4, 20, and 40. Figure 4 displays the evolution of the model MSE and the algorithm MSE, for the two filters, at final time step, when the state dimension increases. The same observations as for the linear model can be made here: PF is less robust to high dimension than EnKF, although the algorithm MSE of both algorithms increases at a linear rate.

To get the results presented in Figures 3 and 4, the reference approximation of the posterior mean (required to compute the model and algorithm MSEs) is computed thanks to a particle filter with a large number of particles (N = 10^5). We use the optimal particle filter (for this type of model), described in (Le Gland, Monbet, & Tran 2011, Section 6), that differs from the SIR implementation presented in Algorithm 1 in Section 3. When the sample size N is large, we preferably use PF because EnKF has been proven asymptotically



biased, i.e. the ensemble members distribution does not converge to the true posterior distribution as N → ∞ (Le Gland, Monbet, & Tran 2011).

6 CONCLUSION

High dimensional nonlinear Bayesian filtering is a difficult approximation problem. In this paper, we study how the performance of two popular nonlinear Bayesian filters, PF and EnKF, is degraded when the state dimension increases.

Regarding PF, several authors argue that, when the dimension increases, the particle sample size must grow at an exponential rate to maintain the approximation quality constant (Bengtsson, Bickel, & Li 2008, Bickel, Li, & Bengtsson 2008, Bui Quang, Musso, & Le Gland 2010, Snyder, Bengtsson, Bickel, & Anderson 2008). EnKF, on the other hand, is widely applied to data assimilation problems in geophysics (Evensen 2009). The models involved in such problems often have a very large dimension (van Leeuwen 2009).

In this paper, we lead simulation experiments to quantify the degradation of PF and EnKF as the state dimension increases. We consider two models with two different state dynamics: a simple linear dynamics, and the nonlinear Lorenz96 dynamics. The observation model is linear in the two models. We assess the performance of PF and EnKF in terms of the algorithm MSE (the MSE between the approximated posterior mean and the true posterior mean) and the model MSE (the MSE between the true posterior mean and the true state value).

In our simulations, it appears that the algorithm MSE of both algorithms increases linearly with the state dimension. In PF, the algorithm MSE is proportional to 1/N (Cappé, Moulines, & Rydén 2005), so that it can be maintained constant if the number of particles grows linearly with the dimension. This empirical result differs from previous results in the literature, showing the need for further analysis to describe the phenomenon. Besides, in our simulations, the algorithm MSE of EnKF increases at a (linear) rate much slower than that of PF. This justifies that EnKF is preferable to PF in high dimensional models.

REFERENCES

Arulampalam, M., S. Maskell, N. Gordon, & T. Clapp (2002). A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing 50.
Bengtsson, T., P. Bickel, & B. Li (2008). Curse-of-dimensionality revisited: Collapse of the particle filter in very large scale systems. In Probability and statistics: Essays in honor of David A. Freedman.
Beskos, A., D. Crisan, & A. Jasra (2014). On the stability of sequential Monte Carlo methods in high dimensions. The Annals of Applied Probability 24.
Bickel, P., B. Li, & T. Bengtsson (2008). Sharp failure rates for the bootstrap particle filter in high dimensions. In Pushing the limits of contemporary statistics: Contributions in honor of Jayanta K. Ghosh.
Bui Quang, P., C. Musso, & F. Le Gland (2010). An insight into the issue of dimensionality in particle filtering. In Proceedings of 13th International Conference on Information Fusion, Edinburgh.
Burgers, G., P. van Leeuwen, & G. Evensen (1998). Analysis scheme in the ensemble Kalman filter. Monthly Weather Review 126.
Cappé, O., E. Moulines, & T. Rydén (2005). Inference in hidden Markov models. New York: Springer.
Daum, F. & J. Huang (2003). Curse of dimensionality and particle filters. In Proceedings of IEEE Aerospace Conference, Big Sky, MT.
Evensen, G. (1994). Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. Journal of Geophysical Research 99.
Evensen, G. (2003). The ensemble Kalman filter: theoretical formulation and practical implementations. Ocean Dynamics 53.
Evensen, G. (2009). Data assimilation, the ensemble Kalman filter. Second edition. Berlin: Springer.
Gordon, N., D. Salmond, & A. Smith (1993). Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proceedings F 140.
Hoteit, I., D.T. Pham, G. Triantafyllou, & G. Korres (2008). Particle Kalman filtering for data assimilation in meteorology and oceanography. In Proceedings of 3rd WCRP International Conference on Reanalysis, Tokyo.
Le Gland, F., V. Monbet, & V.D. Tran (2011). Large sample asymptotics for the ensemble Kalman filter. In D. Crisan and B. Rozovskii (Eds.), Handbook on Nonlinear Filtering. Oxford University Press.
Nakano, S., G. Ueno, & T. Higuchi (2007). Merging particle filter for sequential data assimilation. Nonlinear Processes in Geophysics 14.
Snyder, C., T. Bengtsson, P. Bickel, & J. Anderson (2008). Obstacles to high-dimensional particle filtering. Monthly Weather Review 136.
van Leeuwen, P. (2009). Particle filtering in geophysical systems. Monthly Weather Review 137.



Applied Mathematics in Engineering and Reliability Bri, Snel, Khanh & Dao (Eds)
2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

The Prior Probability in classifying two populations by Bayesian method

V.V. Tai
Department of Mathematics, Can Tho University, Can Tho, Vietnam

C.N. Ha
Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam

N.T. Thao
Division of Computational Mathematics and Engineering, Institute for Computational Science,
Ton Duc Thang University, Ho Chi Minh City, Vietnam
Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam

ABSTRACT: This paper considers the Prior Probability (PP) in classifying two populations by the Bayesian approach. Specifically, we establish the probability density functions for the ratio and the distance between two PPs that are supposed to have Beta distributions. Also, we build the posterior distribution for the PPs when knowing their prior Beta distributions. From the established probability density functions, we can calculate some typical parameters such as mean, variance and mode. According to these parameters, we can survey and also estimate the prior probabilities of two populations to apply to practical problems. The numerical examples on one synthetic data set, four benchmark data sets and one real data set not only illustrate the proposed theories but also present the applicability and feasibility of the researched problem.

1 INTRODUCTION

Classification is an important problem in multivariate statistical analysis and is applied in many fields, such as economics, physics, sociology, etc. In the literature, many different methods have been proposed to perform the classification problem, like the logistic regression method, Fisher method, Support Vector Machine method, Bayesian method, etc., among which the Bayesian approach has attracted special interest (Mardia, Kent, & Bibby 1979, Webb 2003, Ghosh, Chaudhuri, & Sengupta 2006). In classifying by the Bayesian approach, we often study the case of two populations because it can be applied in many practical problems and it is also the theoretical foundation for the case of more than two populations. We suppose to have two populations W_i, i = 1, 2, where q_i is the prior probability and f_i(x) is the Probability Density Function (pdf) of the variable X for the ith population, respectively. According to Pham-Gia (Pham-Gia, Turkkan, & Vovan 2008), classifying a new observation x_0 by the Bayesian method is performed by the rule: if max{q_i f_i(x_0)} = q_1 f_1(x_0) then x_0 is assigned to W_1; otherwise, we assign it to W_2. Pham-Gia (Pham-Gia, Turkkan, & Vovan 2008) also identified the misclassification in this approach, called the Bayes error, calculated by the formula

Pe = ∫_{R^n} min{q_1 f_1(x), q_2 f_2(x)} dx = 1 − ∫_{R^n} g_max(x) dx,

in which g_max(x) = max{q_1 f_1(x), q_2 f_2(x)}. Therefore, in the Bayesian approach, classifying a new observation and computing its error depend on two factors: the pdfs and the PPs. From the given data, we have many methods to determine the pdfs. This problem has been studied intensively in the theoretical aspect and has had many good applications with real data (Pham-Gia, Turkkan, & Vovan 2008, Vo Van & Pham-Gia 2010). In fact, when knowing the exact pdfs, determining suitable PPs is a significant factor to improve the performance in Bayesian classification. Normally, depending on known information about the researched problem or the training data, we can determine the prior probabilities. If there is no such information, we usually choose the prior probabilities by the uniform distribution. When basing on training data, the prior probabilities are often estimated by two main methods: the Laplace method, q_i = (n_i + 1)/(N + n), and the ratio of samples method, q_i = n_i/N, where n_i is the number of elements in W_i, n is the number of dimensions and N is the number of all objects in the training data (James 1978, Everitt 1985). There were many



authors who studied and applied these results, such as (McLachlan & Basford 1988, Inman & Bradley Jr 1989, Miller, Inkret, Little, Martz, & Schillaci 2001). Besides, determining specified distributions for the PPs has also attracted interest in the case of two populations. We can list some researches about this problem such as (McLachlan & Basford 1988, Jasra, Holmes, & Stephens 2005, Pham-Gia, Turkkan, & Bekker 2007). To inherit their ideas, this article studies the PPs of two populations by building pdfs for the ratio and the distance between the PPs. According to prior information and sampling data, we can establish posterior pdfs for the ratio and the distance between two PPs. Then, we can estimate and test the differences between two prior probabilities. Because the sum of the PPs is equal to 1, we can survey and determine the PPs for two populations when knowing their ratio or distance.

The remainder of this article is organized as follows. Section 2 presents the theories about the prior pdfs and posterior pdfs for the ratio and the distance between two PPs. In this section q is assumed to have a Beta prior distribution and the posterior distribution of q is updated from the sample information. Section 3 discusses some relations of the established pdfs in Section 2 and the computational problems in practice. Section 4 examines three numerical examples to illustrate the proposed theories and compare the obtained results with those of existing methods. The final section is the conclusion.

2 THE RATIO AND THE DISTANCE BETWEEN TWO PRIOR PROBABILITIES

Given the variable X and two populations W_1, W_2 with pdfs f_1(x) and f_2(x), respectively, q is the PP for W_1 and 1 − q is the PP for W_2. Let y = q/(1 − q) and z = |1 − 2q|. If q is a random variable, then y and z are also random variables. In this section, we build the prior pdfs for y and z when q has a Beta distribution. The posterior distributions of y and z are also established when we consider the data samples. When the posterior pdfs are computed, we will have a general look at the difference between two PPs and can also find them via some representing parameters of y and z (e.g., mean or mode). This idea can also be carried out in a similar way when q has other distributions on [0, 1].

2.1 Distribution of the ratio between two prior probabilities

Theorem 1. Assuming that q has the prior distribution Beta(α, β), we have the following results for the pdf of the variable y.

a. The prior pdf of y is determined as follows:

f_pri(y) = (1/B(α, β)) · y^{α−1}/(y + 1)^{α+β}, (1)

where

B(α, β) = Γ(α)Γ(β)/Γ(α + β), Γ(α) = ∫_0^∞ x^{α−1} e^{−x} dx.

b. If Beta(α, β) is the prior distribution of q and m is the number of observations belonging to W_1 when collecting n elements, the posterior pdf of y is determined as follows:

f_pos(y) = (1/B(α*, β*)) · y^{α*−1}/(y + 1)^{α*+β*}, (2)

where

α* = α + m, β* = β + n − m.

Proof

a. Because q has distribution Beta(α, β), the pdf of q is:

f_pri(q) = (1/B(α, β)) q^{α−1}(1 − q)^{β−1}, 0 ≤ q ≤ 1.

Clearly, q = y/(y + 1) and the derivative of q with respect to y is 1/(y + 1)². Thus,

f_pri(y) = (1/(y + 1)²) f_pri(y/(y + 1))
= (1/B(α, β)) (y/(y + 1))^{α−1} (1/(y + 1))^{β−1} (1/(y + 1))²
= (1/B(α, β)) · y^{α−1}/(y + 1)^{α+β}.

b. We call A the event of obtaining m observations of W_1 when collecting n observations:

P(A) = C(n, m) q^m (1 − q)^{n−m}.

Then we have

M(q) = (1/B(α, β)) q^{α−1}(1 − q)^{β−1} · C(n, m) q^m (1 − q)^{n−m}
= (C(n, m)/B(α, β)) q^{α+m−1}(1 − q)^{β+n−m−1}.



The posterior pdf of q is

f_pos(q) = M(q) / ∫_0^1 M(q) dq = q^{α*−1}(1 − q)^{β*−1} / B(α*, β*). (3)

Proceeding as in case a) with f_pos(q) in (3), we obtain (2).

2.2 Distribution of the distance between two prior probabilities

Theorem 2. Let z be the distance between q and (1 − q), z = |1 − 2q|. We have the following results for the pdf of z.

a. If q has distribution Beta(α, β), the pdf of z (0 ≤ z ≤ 1) is

g_pri(z) = C_1 [(1 − z)^{α−1}(1 + z)^{β−1} + (1 + z)^{α−1}(1 − z)^{β−1}], (4)

where

C_1 = 1 / (2^{α+β−1} B(α, β)).

b. If q has prior distribution Beta(α, β) and m is the number of observations belonging to W_1 when choosing n elements, the posterior pdf of z is determined as follows:

g_pos(z) = C_2 [(1 − z)^{α*−1}(1 + z)^{β*−1} + (1 + z)^{α*−1}(1 − z)^{β*−1}], (5)

where

C_2 = 1 / (2^{α*+β*−1} B(α*, β*)).

Proof

a. We have the prior pdf of q:

f_pri(q) = (1/B(α, β)) q^{α−1}(1 − q)^{β−1}.

When q ≤ 1/2, z = 1 − 2q or q = (1 − z)/2. In this range, the pdf of z is determined by

g_1(z) = (1/2) f_pri((1 − z)/2),

where f_pri((1 − z)/2) is given by

(1/B(α, β)) ((1 − z)/2)^{α−1} ((1 + z)/2)^{β−1}.

So we have

g_1(z) = C_1 (1 − z)^{α−1}(1 + z)^{β−1}. (6)

For q > 1/2, z = 2q − 1 or q = (z + 1)/2. Using the similar way used to establish (6), we have

g_2(z) = C_1 (1 + z)^{α−1}(1 − z)^{β−1}. (7)

Clearly, from (6) and (7), we have (4).

b. We have

M(q) = (1/B(α, β)) q^{α−1}(1 − q)^{β−1} · C(n, m) q^m (1 − q)^{n−m}
= (C(n, m)/B(α, β)) q^{α+m−1}(1 − q)^{β+n−m−1}.

So the posterior pdf of q is

f_pos(q) = M(q) / ∫_0^1 M(q) dq = q^{α*−1}(1 − q)^{β*−1} / B(α*, β*).

It is easy to prove (5) by using the same way as in (a).

Moreover, if we set w = 1 − 2q, −1 ≤ w ≤ 1, and the conditions in part (b) of Theorem 2 are unchanged, we can prove the following result:

g_pos(w) = (1 / (2^{α*+β*−1} B(α*, β*))) (1 − w)^{α*−1}(1 + w)^{β*−1}. (8)

3 SOME RELATIONS

3.1 Some surveys

i. Because q is in the interval [0, 1], y is in the interval [0, +∞). If q = 1/2 then y = 1. When q → 0, we have y → 0. Similarly, when q → 1, we have y → +∞. When y → 0 or y → +∞ we receive the best classification. We also have that z is a random variable whose value changes from 0 to 1. When q = 1/2, we have z = 0 and when q = 0 or q = 1 we receive z = 1. In the classification problem, if there is no prior information, we often choose y = 1 or z = 0. When y → 0, y → +∞ or z = 1, we obtain the best classification.



ii. The kth moment for the posterior of y and z is determined by

E_pos[y^k] = (1/B(α*, β*)) ∫_0^{+∞} y^{k+α*−1}/(y + 1)^{α*+β*} dy, (9)

E_pos[z^k] = C_2 ∫_0^1 z^k [(1 − z)^{α*−1}(1 + z)^{β*−1} + (1 + z)^{α*−1}(1 − z)^{β*−1}] dz. (10)

According to the above equations, we can easily compute the means and the variances of y and z (Berg 1985) with the help of some mathematical software packages in Matlab, Maple, etc.

iii. When having the posterior pdfs of y and z, we can compute the highest posterior density (hpd) regions for them. The hpd credible interval I_1 is often numerically computed although tables exist for some distributions (Isaacs 1974). Berger (Berger 1985) proposed an algorithm to determine the hpd, and Turkkan and Pham-Gia (Pham-Gia, Turkkan, & Eng 1993) wrote a program to determine the hpd in different cases of distributions.

iv. The built pdfs determined by (1), (2), (4) and (5) depend on the prior pdf of q. In practice, this distribution is not easy to survey. It really depends on the known information about the research. Although a lot of authors have discussed this problem, such as (McLachlan & Basford 1988), (Inman & Bradley Jr 1989), (Miller, Inkret, Little, Martz, & Schillaci 2001), there is no optimal solution for all cases. According to (Pham-Gia, Turkkan, & Eng 1993), there are at least two priors that are often used for the ratio between two Beta distributions. These are the uniform distribution (or Beta(1,1)) and the Jeffreys prior (or Beta(1/2, 1/2)), see Figures 1-2.

Figure 1. The prior pdf of y when q has distribution Beta(1/2,1/2), Beta(1/2,1), Beta(1,1), Beta(1,1/2).

Figure 2. The prior pdf of z when q has distribution Beta(1/2,1/2), Beta(1/2,1), Beta(1,1), Beta(1,1/2).

3.2 The computational problem

Because the features of populations are often discrete, we must estimate their pdfs before running the Bayesian method. There are some proposed methods to solve this problem; however, the kernel function method is the most popular one in practice. In this method, the choices of the smoothing parameter and the kernel function have effects on the result. Although (Scott 1992), (Martinez & Martinez 2007), (Vo Van & Pham-Gia 2010), etc. have had many discussions about this problem, the optimal choice has not been found. Here, the smoothing parameter is chosen by Scott's rule and the kernel function is the Gaussian.

When the prior probabilities and the pdfs have been identified, we have to find the maximum function g_max(x) to compute the Bayes error. In the unidimensional case, we can use the specified expression to compute the maximum function of the pdfs and the Bayes error (Pham-Gia, Turkkan, & Vovan 2008). In the multidimensional case, this calculation is really complicated. Vovan and Pham-Gia (Vo Van & Pham-Gia 2010) and some other researchers have mentioned this problem. In this case, the Bayes error is estimated by Monte-Carlo simulation with the help of some mathematical software packages in Maple, Matlab, etc.

In this article, the programs used for estimating the pdfs and computing the Bayes error are coded in Matlab.

4 THE NUMERICAL EXAMPLE

This section examines three examples to illustrate the proposed theories in Sections 2 and 3. Example 1 considers a synthetic data set containing 20 observations in two populations. Population I includes 9 observations and population II includes 11. We survey this simple example to test the theoretical results in Section 2 and compare the performance of the proposed method with those of other choices, which compute prior probabilities according to the Uniform distribution, the ratio of samples method and the Laplace method.
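The Monte-Carlo estimate of the Bayes error described in Section 3.2 can be sketched as follows. This is a hedged Python illustration, not the authors' Matlab program: it uses Gaussian kernel density estimates with Scott's bandwidth (SciPy's default, matching the choices stated above), and sampling from the mixture q_1 f_1 + q_2 f_2 is one possible Monte-Carlo scheme, not necessarily the authors'.

```python
import numpy as np
from scipy.stats import gaussian_kde

def bayes_error_mc(sample1, sample2, q1, q2, n_mc=50_000, seed=0):
    """Monte-Carlo estimate of Pe = integral of min(q1*f1, q2*f2).

    f1 and f2 are Gaussian kernel density estimates (Scott's bandwidth).
    Points are drawn from the mixture q1*f1 + q2*f2, under which
    E[min(q1*f1, q2*f2) / (q1*f1 + q2*f2)] equals the Bayes error.
    """
    f1, f2 = gaussian_kde(sample1), gaussian_kde(sample2)
    n1 = int(round(q1 * n_mc))                 # assumes 0 < q1 < 1
    x = np.hstack([f1.resample(n1, seed=seed),
                   f2.resample(n_mc - n1, seed=seed + 1)])
    p1, p2 = q1 * f1(x), q2 * f2(x)
    return float(np.mean(np.minimum(p1, p2) / (p1 + p2)))
```

Since min(p1, p2)/(p1 + p2) ≤ 1/2 pointwise, the estimate always lies in [0, 1/2], as the Bayes error must.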

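As a numerical cross-check of Theorem 1(b) and the representing parameters of Section 3.1, the posterior Beta(α + m, β + n − m) and the derived quantities for y and z can be computed directly. The sketch below, in Python rather than the paper's Matlab, uses the setting of Example 1 (prior Beta(10, 5), 19 training objects of which 8 belong to W_1); the closed-form moment identities for y = q/(1 − q) are standard Beta results, stated here as part of the sketch.

```python
from scipy.stats import beta as beta_dist

# Setting of Example 1: prior Beta(10, 5); 19 training objects,
# 8 of them in W1 (Theorem 1's m; the example's total is Theorem 1's n).
a, b, n, m = 10, 5, 19, 8
a_post, b_post = a + m, b + n - m              # Beta(18, 16) by Theorem 1(b)

# Moments of y = q/(1 - q) for q ~ Beta(a*, b*) (standard Beta identities)
y_mean = a_post / (b_post - 1)                 # = 1.2
y_var = a_post * (a_post + 1) / ((b_post - 1) * (b_post - 2)) - y_mean**2
y_mode = (a_post - 1) / (b_post + 1)           # = 1.0

# Mean of z = |1 - 2q|, integrating over q (equivalent to Equation (10))
z_mean = beta_dist(a_post, b_post).expect(lambda q: abs(1 - 2 * q))

# Prior probabilities of W1 recovered from the two parameters
q_from_y = y_mean / (1 + y_mean)               # 1.2/2.2, about 0.5455
q_from_z = (1 + z_mean) / 2                    # the paper reports 0.5719
```

The mean 1.2, variance 0.1886 and mode 1.0 of the posterior of y reported in Example 1 are reproduced by these identities, which supports the reconstruction of (2).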


Example 2 compares the results of surveying meth- the ninth object. These methods also have the smaller
ods throughout four bench mark data sets including Bayes error than those of others. This result presents
Seed, Thyroid, User and Breast Tissue. For each data that if we choose the suitable prior probability for
set, we choose randomly two populations for experi- , using proposed method, we can be received a better
ment. These popular data sets are often studied in classification than other traditional methods.
recognized statistics. When there is a new method
that relates to classification problem, these data are Example 2. In this example, BayesU, BayesP,
also used to compare the result of the new method BayesL, BayesR and BayesD will be used to classify
with traditional ones. In the third example, we resolve some bench mark data sets that include Thyroid,
a practical issue in Vietnam: appraising the ability to Seeds, User, and Breast Tissue. In each data set, we
repay loans of the bank costumers in BinhPhuoc choose randomly two populations. The summary of
province. In this section, the prior probability chosen data features is presented by Table 3. The detailed
by Uniform distribution, ratio of sample method, data sets are given by [http://archive.ics.uci.edu/ml].
Laplace method, and z are respectively denoted The survey of bench mark data whose sizes,
by BayesU, BayesP, BayesL, BayesR and BayesD. In dimensions are various will show the effectiveness
cases of BayesR and BayesD, from the posterior pdf and the stability of new method. Assuming that
of y or z , we calculate the mean value and use it as the prior probability q of W1 is a random vari-
the prior probability of population. able having distribution Beta ([N/2],[N/2]) (N is the
number of elements in data set). Applying (2) and
Example 1. Give the studied marks (scale 10 grad- (5) with n and m got from training data, we com-
ing system) of 20 students, in which 9 students have pute the posterior pdfs of y and z. Also, using
marks being smaller than 5 (W W1 : fail the exam) and the mean value for each case, we calculate the prior
11students have marks being larger or equal to 5 probabilities of populations. In this example, the
(W
W2 : pass the exam). The data set is given by the ratio between training and test set is 1:1. The results
Table 1. Assuming that we need to classify the that we received when running 10 times randomly
ninth object and the prior probability q of W1 is are summarized in Table 4. Table 4 shows that
a random variable having distribution Beta(10,5). BayesR, BayesD are more stable than the existing
The training set presents that the total observa- ones. In almost of data sets, BayesR and BayesD
tions N = 19 and the number of observations in have the smaller errors than other methods.
W1 is n = 8 . Then, we have:
The mean, variance and mode of posterior dis- Example 3. In bank credit operation, determining
tribution of y are 1.2, 0.1886, 1.0, respectively. The the repay ability of customers is really important.
95% hpd credible interval of y : (1.2845, 1.6739). If the lending is too easy, the bank may have bad
The mean, variance and mode of posterior distri- debt problems. In contrast, the bank will miss good
bution of z are 0.1438, 0.0112, 0, respectively. The opportunities to lend. Therefore, in recent years,
95% hpd credible interval of z : (0.0962, 0.1914). the classification of credit applicants has been
Using the mean value of y and z , we have the especially studied in Vietnam. In this example, the
prior probabilities of two populations respectively are data including 27 cases of bad debt and 33 cases of
(0.5455;0.4545) and (0.5719;0.428). According to the good debt of a bank in BinhPhuoc province, Viet-
prior probabilities and those of the existing methods,
we classify the ninth element. The results presented Table 2. The result when classifying the ninth object.
by the following Table 2. It can be seen that only
BayeR and BayesD give the right classification for Method Prior gmax(x0) Population Bayes error

Table 1. The studied marks of 20 students and the BayesU (0.5;0.5) 0.0353 2 0.0538
actual result. BayesB (0.421;0.579) 0.0409 2 0.0558
BayesL (0.429;0.571) 0.0403 2 0.0557
Objects Marks Group Objects Marks Group BayesR (0.5545;0.454) 0.0365 1 0.0517
BayesD (0.572;0.428) 0.0383 1 0.0503
1 0.6 W1 11 5.6 W2
2 1.0 W1 12 6.1 W2
3 1.2 W1 13 6.4 W2 Table 3. Summary of four bench mark data sets.
4 1.6 W1 14 6.4 W2
5 2.2 W1 15 7.3 W2 Data No of objects No of dimensions
6 2.4 W1 16 8.4 W2
7 2.4 W1 17 9.2 W2 Thyroid 185 5
8 3.9 W1 18 9.4 W2 Seed 140 7
9 4.3 W1 19 9.6 W2 Breast 70 9
10 5.5 W2 20 9.8 W2 Users 107 5

39

AMER16_Book.indb 39 3/15/2016 11:23:36 AM


Table 4. Summary of five Bayesian methods on the benchmark data.

Data     Method  Empirical error (%)
Thyroid  BayesU  1.304
         BayesP  1.196
         BayesL  1.196
         BayesR  0.979
         BayesD  1.195
Breast   BayesU  8.284
         BayesP  7.427
         BayesL  7.427
         BayesR  7.713
         BayesD  7.427
Seeds    BayesU  3.715
         BayesP  4.001
         BayesL  4.001
         BayesR  3.857
         BayesD  3.715
Users    BayesU  12.643
         BayesP  15.661
         BayesL  15.661
         BayesR  12.264
         BayesD  12.264

Table 5. The results of five Bayesian methods for Example 3.

Method   Empirical error (%) (X, Y)   Bayes error (X, Y)
BayesU   20                           0.1168
BayesP   20.33                        0.1170
BayesL   20.33                        0.1170
BayesR   21                           0.1170
BayesD   20                           0.1170

nam will be considered. The objects in the data are bank borrowers who are immigrants. The two independent features are X (years of schooling) and Y (years of immigration). Because of the sensitivity of the problem, the authors have to conceal the detailed data. In this example, the choice of prior for the Beta distribution, the ratio between training and test set, and the way the experiment is performed are similar to Example 2. The results are presented in Table 5. It can be seen that the result in this example is quite similar to the two previous ones. In particular, BayesD remains reasonable in all cases and gives the best result.

5 CONCLUSION

This article establishes the prior and posterior distributions for the ratio and the distance between two prior probabilities having Beta distributions in Bayesian classification. From the related pdfs that have been built, we can survey and compute typical parameters for the prior probabilities of the populations themselves. The numerical examples proved that if we have good prior information and choose reasonable prior distributions for the prior probabilities, we can determine prior probabilities that give better results in comparison with traditional methods. In future work, we will apply these results to different real data.

REFERENCES

Berg, A.C. (1985). SVM-KNN: Discriminative Nearest Neighbor Classification for Visual Category Recognition.
Berger, J.O. (1985). Statistical Decision Theory and Bayesian Analysis. Springer Science & Business Media.
Everitt, B.S. (1985). Mixture Distributions. Encyclopedia of Statistical Sciences (5), 559-569.
Ghosh, A.K., P. Chaudhuri, & D. Sengupta (2006). Classification Using Kernel Density Estimates. Technometrics 48(1).
Inman, H.F. & E.L. Bradley Jr (1989). The overlapping coefficient as a measure of agreement between probability distributions and point estimation of the overlap of two normal densities. Communications in Statistics-Theory and Methods 18(10), 3851-3874.
Isaacs, G.L. (1974). Tables for Bayesian statisticians. Number 31. University of Iowa.
James, I.R. (1978). Estimation of the mixing proportion in a mixture of two normal distributions from simple, rapid measurements. Biometrics, 265-275.
Jasra, A., C.C. Holmes, & D.A. Stephens (2005). Markov chain Monte Carlo methods and the label switching problem in Bayesian mixture modeling. Statistical Science, 50-67.
Mardia, K.V., J.T. Kent, & J.M. Bibby (1979). Multivariate Analysis. Academic Press.
Martinez, W.L. & A.R. Martinez (2007). Computational Statistics Handbook with MATLAB. CRC Press.
McLachlan, G.J. & K.E. Basford (1988). Mixture models: Inference and applications to clustering. Applied Statistics.
Miller, G., W.C. Inkret, T.T. Little, H.F. Martz, & M.E. Schillaci (2001). Bayesian prior probability distributions for internal dosimetry. Radiation Protection Dosimetry 94(4), 347-352.
Pham-Gia, T., N. Turkkan, & A. Bekker (2007). Bounds for the Bayes error in classification: a Bayesian approach using discriminant analysis. Statistical Methods and Applications 16(1), 7-26.
Pham-Gia, T., N. Turkkan, & P. Eng (1993). Bayesian analysis of the difference of two proportions. Communications in Statistics-Theory and Methods 22(6), 1755-1771.
Pham-Gia, T., N. Turkkan, & T. Vovan (2008). Statistical discrimination analysis using the maximum function. Communications in Statistics-Simulation and Computation 37(2), 320-336.
Scott, D.W. (1992). Multivariate Density Estimation: Theory, Practice and Visualization.
Vo Van, T. & T. Pham-Gia (2010). Clustering probability distributions. Journal of Applied Statistics 37(11), 1891-1910.
Webb, A.R. (2003). Statistical Pattern Recognition. John Wiley & Sons.


Efficient methods to solve optimization problems

Applied Mathematics in Engineering and Reliability - Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

Estimation of parameters of Rikitake systems by SOMA

T.D. Nguyen
Vietnam Aviation Academy, Ho Chi Minh, Vietnam

T.T.D. Phan
HCMC University of Food Industry, Ho Chi Minh, Vietnam

ABSTRACT: This paper presents the combination of a chaotic signal and the Self-Organizing Migrating Algorithm (SOMA) to estimate the unknown parameters in a chaos synchronization system via the active-passive decomposition method. The unknown parameters were estimated by SOMA and, based on the results, two Rikitake chaotic systems were synchronized.

1 INTRODUCTION

Chaos theory is one of the most important achievements in nonlinear system research. Chaos dynamics are deterministic but extremely sensitive to initial conditions. Chaotic systems and their applications to secure communications have received a great deal of attention since Pecora and Carroll proposed a method to synchronize two identical chaotic systems (Pecora & Carroll 1990). The high unpredictability of the chaotic signal is the most attractive feature of chaos-based secure communication. Several types of synchronization have been considered in communication systems. The Active Passive Decomposition (APD) method was proposed by Kocarev and Parlitz (1995); it is known as one of the most versatile schemes, in which the original autonomous system is rewritten as a controlled system with the desired synchronization properties. Many of the proposed solutions focused on synchronization-based methods for parameter estimation (Shen & Wang 2008, Ge & Chen 2005), among others. In (Parlitz & Junge 1996), the parameters of a given dynamic model were estimated by minimizing the average synchronization error using a scalar time series.
Recently, a new class of stochastic optimization algorithms called the Self-Organizing Migrating Algorithm (SOMA) was proposed in the literature (Zelinka 2004, Zelinka 2008). SOMA works on a population of potential solutions called specimens and is based on the self-organizing behavior of groups of individuals in a social environment. It has been proven that SOMA is able to escape traps in local optima and can easily reach the global optimum. Therefore, SOMA has attracted much attention and wide application in different fields, mainly for various continuous optimization problems. However, to the best of our knowledge, there has been no research on SOMA for the estimation of the parameters of chaos synchronization via the APD method.
Motivated by the aforementioned studies, this paper combines a chaotic signal with SOMA: the unknown parameters of the chaos synchronization system are estimated via the APD method, and, based on the results of the SOMA algorithm, the estimated parameters are used to synchronize two chaotic systems.

2 PROBLEM FORMULATION

2.1 The active-passive decomposition method

Kocarev & Parlitz (1995) proposed a general drive-response scheme named Active Passive Decomposition (APD). The basic idea of the active-passive synchronization approach consists in a decomposition of a given chaotic system into an active and a passive part, where different copies of the passive part synchronize when driven by the same active component. In the following, we explain the basic concept and terminology of the active-passive decomposition.
Consider an autonomous n-dimensional dynamical system which is chaotic,

u̇ = g(u)    (1)

The system is rewritten as a non-autonomous system:

ẋ = f(x, s)    (2)

where x is a new state vector corresponding to u and s is some vector-valued function of time given by

s = h(x)    (3)
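The core property used here can be illustrated with a toy example (our own sketch, not from the paper): when the passive part is contracting, any two copies driven by the same signal s(t) converge to the same trajectory. Below, a hypothetical one-dimensional passive part y' = -y + s(t) with s(t) = sin t stands in for the passive copies of (2):

```python
import math

def simulate_passive(y0, dt=0.001, steps=20000):
    # Forward-Euler integration of the passive part y' = -y + s(t),
    # driven by the shared active signal s(t) = sin(t).
    y, t = y0, 0.0
    traj = []
    for _ in range(steps):
        y += dt * (-y + math.sin(t))
        t += dt
        traj.append(y)
    return traj

# Two copies of the passive part: same driving signal, different starts.
ya = simulate_passive(5.0)
yb = simulate_passive(-3.0)

err0 = abs(ya[0] - yb[0])    # initial synchronization error
errT = abs(ya[-1] - yb[-1])  # error after t = 20
```

Since d(ya - yb)/dt = -(ya - yb), the error decays exponentially regardless of s(t); this is exactly the mechanism the APD scheme exploits.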



The pair of functions f and h constitutes a decomposition of the original vector field g; they are chosen such that any system

ẏ = f(y, s)    (4)

given by the same vector field f and the same driving signal s, but different variables y, synchronizes with the original system. Here, x constitutes the active system while y is the passive one.
The synchronization of the pair of identical systems (2) and (4) occurs if the dynamical system describing the evolution of the difference satisfies yk − xk → 0 for k → ∞.

2.2 The parameter estimation

When estimating the parameters, suppose that the structure of the system is known in advance, the transmitter (driver) system is set with the original parameters, and the parameters in the receiver (response) system are unknown. The problem of parameter estimation can therefore be formulated as the following optimization problem:

Cost function = (1/M) Σ_{k=1}^{M} ‖yk − xk‖²    (5)

where M denotes the length of the data used for parameter estimation; the parameters are estimated by minimizing the cost function (5).
Because of the unstable dynamic behavior of chaotic systems, parameter estimation for chaotic systems is a multidimensional continuous optimization problem and the parameters are not easy to obtain. In addition, there are often multiple variables in the problem and multiple local optima in the landscape of the cost function, so traditional optimization methods easily become trapped in local optima and it is difficult to reach the globally optimal parameters. Therefore, SOMA was chosen, because it has been proven that the algorithm has the ability to converge toward the global optimum.

3 SELF-ORGANIZING MIGRATING ALGORITHM

SOMA is an evolutionary algorithm which imitates the natural process of wildlife migration. The method was established in 1999 and developed by Prof. Ivan Zelinka at Tomas Bata University in Zlín. SOMA is a stochastic optimization algorithm modeled on the social behavior of cooperating individuals. The approach is similar to that of genetic algorithms, although it is based on the idea of a series of migrations by a fixed set of individuals, rather than the development of successive generations. It can be applied to any cost-minimization problem with a bounded parameter space, and is robust to local minima. SOMA works on a population of candidate solutions in loops called migration loops. The population is initialized randomly, distributed over the search space at the beginning of the search. In each loop, the population is evaluated and the solution with the highest fitness becomes the leader L. Apart from the leader, in one migration loop all individuals traverse the input space in the direction of the leader. Mutation, the random perturbation of individuals, is an important operation for Evolutionary Strategies (ES). It ensures diversity amongst the individuals and also provides the means to restore lost information in a population. Mutation is different in SOMA compared with other ES strategies. SOMA uses a parameter called PRT to achieve perturbation; this parameter has the same effect for SOMA as mutation has for GA.
The novelty of this approach is that the PRT vector is created before an individual starts its journey over the search space. The PRT vector defines the final movement of an active individual in the search space.
The randomly generated binary perturbation vector controls the dimensions allowed for an individual: if an element of the perturbation vector is set to zero, the individual is not allowed to change its position in the corresponding dimension.
An individual travels a certain distance (called the Path Length) towards the leader in n steps of defined length. If the Path Length is chosen to be greater than one, the individual will overshoot the leader. This path is perturbed randomly.
The following parameters of the SOMA algorithm must be specified:
Cost function: determines how to evaluate individuals.
Specimen: describes the form of individuals.
Population size: the number of individuals in the population contained in one migration.
Migrations: the maximum number of migrations to complete.
Step: the step size of an individual during migration.
Path Length: the length of the path an individual uses for migration.
PRT: the perturbation of migration.
Minimal diversity: the diversity of the evolutionary process.
A more detailed description of SOMA can be found, e.g., in (Zelinka 2004).
There are many SOMA variations, which are differentiated by the way of migration. In our case, the SOMA-All-To-One variation has been chosen, in which individuals migrate towards the best one.



4 RIKITAKE SYSTEMS

In this section, we apply the ADP technique to achieve synchronization between two identical Rikitake systems. The mathematical description of the Rikitake system is as follows:

ẋ = −μx + zy
ẏ = −μy + (z − a)x    (6)
ż = 1 − xy

where x, y and z are the state variables, and μ and a are positive real constants. The Rikitake system (6) exhibits a chaotic attractor for μ = 2 and a = 5, as shown in Figure 1.
To illustrate the synchronization of two identical Rikitake systems, we consider different active-passive decompositions, denoting the drive system by X and the response system by Y.
The identical drive system X (μ = 2 and a = 5) is given by:

ẋd = −2xd + zd s(t)
X: ẏd = −2yd + (zd − 5)xd    (7)
żd = 1 − xd yd

The response system Y is described by the following equations:

ẋr = −μxr + zr s(t)
Y: ẏr = −μyr + (zr − a)xr    (8)
żr = 1 − xr yr

where a and μ are the unknown parameters of the response system, and s(t) is the transmitted signal.
Subtracting system (7) from system (8) yields the error dynamical system between systems (7) and (8), ek = ((xr, yr, zr)k − (xd, yd, zd)k), which is used to create a cost function CF representing the Root Mean Square Error (RMSE) of synchronization between X and Y:

CF = sqrt( (1/M) Σ_{k=1}^{M} ‖Yk(xr, yr, zr) − Xk(xd, yd, zd)‖² )    (9)

The parameter estimation can be formulated as a multidimensional nonlinear problem: minimize the cost function CF. SOMA is used to find suitable parameters a and μ such that the cost function CF asymptotically approaches its minimum point. The minimum value of the cost function guarantees the best solution with suitable parameters; the systems are then asymptotically (and globally) synchronized.
In our simulations, the transmitted signal is chosen as s(t) = yd. The initial states of the drive system (7) and the response system (8) are taken as xd(0) = 6, yd(0) = 0, zd(0) = 0 and xr(0) = 1, yr(0) = 2, zr(0) = 1, respectively. Hence the error system has the initial values e1(0) = −5, e2(0) = 2 and e3(0) = 1. SOMA-All-To-One is used to solve the systems, with the control parameter settings given in Table 1. The simulation was implemented in the Mathematica programming language and executed on a Pentium D 2.0 GHz, 2 GB personal computer.

Table 1. SOMA parameter setting.

Parameter          Value
Population size    20
Migrations         50
Step               0.11
Path length        3
Perturbation       0.1
Minimal diversity  1
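The drive-response setup of eqs. (7)-(9) can be sketched numerically as follows. This is an illustrative reconstruction, not the authors' Mathematica code; the integrator, step size, horizon and the deliberately wrong comparison value a = 3 are our own choices:

```python
import math

def rikitake(state, s, mu, a):
    # Response-form Rikitake equations (8), driven by the transmitted signal
    # s(t): x' = -mu*x + z*s, y' = -mu*y + (z - a)*x, z' = 1 - x*y.
    # With (mu, a) = (2, 5) and s = y_d this is also the drive system (7).
    x, y, z = state
    return (-mu * x + z * s, -mu * y + (z - a) * x, 1.0 - x * y)

def rk4_step(state, s, mu, a, dt):
    # One RK4 step; s is held constant over the step (a simplification).
    k1 = rikitake(state, s, mu, a)
    k2 = rikitake([v + 0.5 * dt * k for v, k in zip(state, k1)], s, mu, a)
    k3 = rikitake([v + 0.5 * dt * k for v, k in zip(state, k2)], s, mu, a)
    k4 = rikitake([v + dt * k for v, k in zip(state, k3)], s, mu, a)
    return [v + dt / 6.0 * (w1 + 2 * w2 + 2 * w3 + w4)
            for v, w1, w2, w3, w4 in zip(state, k1, k2, k3, k4)]

def cf(mu, a, dt=0.005, steps=6000):
    """RMSE cost of eq. (9): the drive X runs with the true (mu, a) = (2, 5),
    the response Y runs with the trial parameters (mu, a)."""
    xd = [6.0, 0.0, 0.0]   # drive initial state
    xr = [1.0, 2.0, 1.0]   # response initial state
    sq = 0.0
    for _ in range(steps):
        s = xd[1]          # transmitted signal s(t) = y_d
        xd = rk4_step(xd, s, 2.0, 5.0, dt)
        xr = rk4_step(xr, s, mu, a, dt)
        sq += sum((r - d) * (r - d) for r, d in zip(xr, xd))
    return math.sqrt(sq / steps)

cf_true = cf(2.0, 5.0)   # matching parameters: error decays after a transient
cf_wrong = cf(2.0, 3.0)  # mismatched a: persistent synchronization error
```

A SOMA run would simply call cf inside its cost evaluation; the minimum over (μ, a) is expected near the true values (2, 5), mirroring the case studies below.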

4.1 Case study 1: simulation on one-dimensional parameter estimation

In this case, we consider one-dimensional parameter estimation. That means one parameter, a (or μ), is known with its original value, while the other, μ (or a), is unknown and needs to be estimated.

a. When a = 5 is known in advance, the initial guesses are in the range [0, 5] for μ, and the control parameters were set as in Table 1. SOMA-All-To-One found the best result with μ = 1.97209, as shown in the 3D cost function (Fig. 3). After 3 migrations, both the worst and the best values of the cost function quickly approach the minimum value 0.634789, as shown in Figure 2.

Figure 1. The Rikitake chaotic attractor.



Figure 2. The worst and the best values of the cost function (1a).

Figure 3. 3D cost function (a = 5).

Figure 4. The worst and the best values of the cost function (1b).

Figure 5. 3D cost function (μ = 2).

b. When μ = 2 is known in advance, the initial guesses are in the range [0, 10] for a, and the control parameters were set as in Table 1. SOMA-All-To-One found the best result with a = 4.99441, as shown in the 3D cost function (Fig. 5). Both the worst and the best values of the cost function approach the minimum quickly, as shown in Figure 4 (CF = 0.635367).

4.2 Case study 2: simulation on two-dimensional parameter estimation

In this case, we consider two-dimensional parameter estimation: both parameters a and μ are unknown and need to be estimated. The initial guesses are in the range [0, 5] for μ and [0, 10] for a. SOMA-All-To-One found the best result (CF = 0.634578) with parameters μ = 1.9658 and a = 4.98468, as shown in the 3D cost function (Fig. 7). Both the worst and the best values of the cost function approach the minimum gradually after 24 migrations, as shown in Figure 6.
The final estimated values are μ = 1.9658 and a = 4.98468. Thus, the actual parameters were fully identified. As shown in Figures 2 to 7, the values of the cost function always approach the original minimum value CF = 0.635398, and the estimated parameters obtained by SOMA are also very close to the true values of the original parameters. This shows that SOMA is effective for estimating the parameters of a chaos synchronization system.
Based on the values estimated by SOMA (μ = 1.9658 and a = 4.98468), the response system was constructed. The effect of the estimated values on the synchronization errors of the drive system X(4, 4, 1) and the response system Y(1, 1, 1) via ADP is demonstrated in Figure 8. Without ADP, synchronization between the two systems was not achieved at all, as shown in Figure 8 (a, e, i), and the trajectories of e(t) were unpredictable, as shown in Figure 8 (c, g, k). In contrast,



Figure 6. The worst and the best values of the cost function (2).

Figure 7. 3D cost function.

Figure 8a. Synchronization of xd and xr without using ADP.
Figure 8b. Synchronization of xd and xr with ADP.
Figure 8c. Difference of xd and xr without using ADP.
Figure 8d. Difference of xd and xr with ADP.
Figure 8e. Synchronization of yd and yr without using ADP.
Figure 8f. Synchronization of yd and yr with ADP.
Figure 8g. Difference of yd and yr without using ADP.
Figure 8h. Difference of yd and yr with ADP.
Figure 8i. Synchronization of zd and zr without using ADP.
Figure 8j. Synchronization of zd and zr with ADP.
Figure 8k. Difference of zd and zr without using ADP.
Figure 8l. Difference of zd and zr with ADP.

Figure 8 (d, h, l) displays that the trajectories of e(t) tend to zero after t > 12, and the trajectories of xr, yr, zr converge to xd, yd, zd absolutely when ADP is applied, as shown in Figure 8 (b, f, j). This proves that the estimated values and the ADP method are effective for synchronizing the two chaotic systems.

5 CONCLUSIONS

In this paper, the ADP method is applied to synchronize two identical Rikitake chaotic systems. Parameter estimation for the chaotic system was formulated as a multidimensional optimization problem. The Self-Organizing Migrating Algorithm (SOMA) was used to find the unknown values of the chaotic parameters. Based on the results from the



SOMA algorithm, two chaotic systems were synchronized absolutely.

REFERENCES

Cuomo, K.M. & A.V. Oppenheim. 1993. Circuit implementation of synchronized chaos with applications to communications. Phys. Rev. Lett., 71(1), 65-68.
Ge Z.M. & Cheng J.W. 2005. Chaos synchronization and parameter identification of three time scales brushless DC motor system. Chaos, Solitons & Fractals, 24, 597-616.
Guan X.P., Peng H.P., Li L.X. & Wang Y.Q. 2001. Parameter identification and control of Lorenz system. Acta Phys Sin, 50, 26-29.
Huang L.L., Wang M. & Feng R.P. 2005. Parameters identification and adaptive synchronization of chaotic systems with unknown parameters. Phys Lett A, 342, 299-304.
Keisuke Ito. 1980. Chaos in the Rikitake two-disc dynamo system. Earth and Planetary Science Letters, 51(2), 451-456.
Kocarev, L. & U. Parlitz. 1995. General approach for chaotic synchronization with applications to communication. Phys. Rev. Lett., 74, 5028-5031.
Li R.H., Xu W. & Li S. 2007. Adaptive generalized projective synchronization in different chaotic systems based on parameter identification. Phys Lett A, 367, 199-206.
Liao, X., Chen, G. & O. Wang. 2006. On global synchronization of chaotic systems. Dynamics of Continuous, Discrete and Impulsive Systems, Vol. 1.
McMillen. 1999. The shape and dynamics of the Rikitake attractor. The Nonlinear Journal, Vol. 1, 1-10.
Parlitz U. & Junge L. 1996. Synchronization based parameter estimation from time series. Phys Rev E, 54, 6253-9.
Parlitz U. 1996. Estimating model parameters from time series by autosynchronization. Phys Rev Lett, 76, 1232-5.
Pecora, L.M. & T.L. Carroll. 1990. Synchronization in chaotic systems. Phys. Rev. Lett., 64(8), 821-824.
Schuster, H.G. & W. Just. 2005. Deterministic Chaos. Wiley-VCH.
Shen L.Q. & Wang M. 2008. Robust synchronization and parameter identification on a class of uncertain chaotic systems. Chaos, Solitons & Fractals, 38, 106-111.
Wu X.G., Hu H.P. & Zhang B.L. 2004. Parameter estimation only from the symbolic sequences generated by chaos system. Chaos, Solitons & Fractals, 22, 359-366.
Xiaoxin Liao, Guanrong Chen & Hua O. Wang. 2002. On global synchronization of chaotic systems. AK, May.
Zelinka, I. 2008. Real-time deterministic chaos control by means of selected evolutionary techniques. Engineering Applications of Artificial Intelligence, 10.
Zelinka, I. 2004. SOMA - Self-Organizing Migrating Algorithm. In: B.V. Babu & G. Onwubolu (Eds.), New Optimization Techniques in Engineering, Springer-Verlag, chapter 7.




Clustering for probability density functions based on Genetic Algorithm

V.V. Tai
Department of Mathematics, Can Tho University, Can Tho, Vietnam

N.T. Thao
Division of Computational Mathematics and Engineering, Institute for Computational Science,
Ton Duc Thang University, Ho Chi Minh City, Vietnam
Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam

C.N. Ha
Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam

ABSTRACT: Based on the L1-distance between the Probability Density Functions (pdfs) in a cluster and its representing pdf, and the L1-distances between the representing pdfs of different clusters, this article proposes two new internal validity measures for the clustering of pdfs. We then apply a Genetic Algorithm coded for solving integer optimization problems to minimize these internal validity measures and thereby establish suitable clusters. Numerical examples on both synthetic and real data show that the proposed algorithm gives better results than existing ones when tested by both internal and external validity measures.

1 INTRODUCTION

In recent years, because of the fast development of networking, data storage and data collection capacity, there has been a rapid increase in the data that we receive and exchange every day, especially big data. According to Wu et al. (Wu, Zhu, Wu, & Ding 2014), 2.5 quintillion bytes of data are created every day, and 90% of the data in the world were produced from 2009 to 2011. Therefore, how to effectively analyse big data, which has a huge volume and is received from many uncertain sources, is a challenge for many researchers in data mining and statistics (George, Haas, & Pentland 2014, Wu, Zhu, Wu, & Ding 2014). Clustering, which can partition unknown large data into groups such that the elements in each group have similar properties, is a basic method in data mining and statistics. It is an important step in understanding the data before performing further analysis. Therefore, the clustering problem has been researched extensively in many areas such as physics, biology, economics, engineering, sociology and any field that needs to group similar elements together.
There are many algorithms that can solve the problem of Clustering for Discrete Elements (CDE). These algorithms were summarized by Fukunaga and Webb (Keinosuke 1990, Webb 2003). However, because of the various strategies in CDE, the clustering results are also different. Therefore, how to evaluate these different results is an interesting question for many researchers. Generally, there are two main types of validity measures used to evaluate the quality of a clustering result: external validity measures and internal validity measures. Some popular external validity measures are the Rand index (Rand 1971), the F-index (Larsen & Aone 1999) and the Jaccard index, all of which evaluate the clustering against some specific references. Therefore, an external evaluation is impossible when we do not have any reference. The internal criteria consider metrics which are based on the data set and the clustering schema (analyzing intrinsic characteristics of a clustering), so they can be applied in all cases. A large number of popular internal validity measures have been proposed in both non-fuzzy and fuzzy clustering, such as the Intra index (MacQueen 1967), the Xie-Beni index (S index) (Xie & Beni 1991), the Dunn index (Dunn 1973) and the DB index (Davies & Bouldin 1979). Most of them evaluate the quality of the clustering result by the compactness and the separation of the clusters. Based on these internal validity measures, many algorithms have been proposed to search for the optimal values of these measures, so that the compactness and separation of the established clusters are optimized. We can list many studies using the Genetic Algorithm for CDE, e.g. (Falkenauer 1992, Jain, Murty, & Flynn 1999, Hruschka 2003,



Agustín-Blas, Salcedo-Sanz, Jiménez-Fernández, Carro-Calvo, Del Ser, & Portilla-Figueras 2012). Besides, some other evolutionary approaches have also been applied to solve clustering problems, such as Particle Swarm Optimization (Das, Abraham, & Konar 2008), Ant Colony algorithms (Jiang, Yi, Li, Yang, & Hu 2010) and Artificial Bee Colony algorithms (Zhang, Ouyang, & Ning 2010). All of them have supplied novel approaches to establish clusters and improve the quality of the result.
The clustering for probability density functions (CDF), which is necessary for big data, has attracted many researchers recently. We can find some important studies in the literature, such as Matusita (Matusita 1967) and Glick (Glick 1972), which proposed some standard measures to compute the similarity of two or more pdfs; (Vo Van & Pham-Gia 2010), which established the cluster width criterion and applied it to the hierarchical and non-hierarchical approaches in CDF; (Goh 2008, Montanari & Calò 2013), which proposed novel methods to build clusters in CDF; and (Chen & Hung 2014), which introduced a method called the automatic clustering algorithm to find the number of clusters and then establish the optimal result. However, the validity measures used in the above studies are external validity measures, and none of the previous studies proposed an internal validity measure for CDF. Therefore, it is impossible to perform CDF when we do not have any reference. In addition, evolutionary approaches cannot be applied to optimize the clusters without internal validity measures. Although Chen and Hung (Chen & Hung 2014) proposed the automatic clustering algorithm, we cannot evaluate whether their result is really optimal or not. Furthermore, the automatic clustering algorithm easily merges all pdfs into a single cluster (number of clusters k = 1) when the pdfs have a high overlapping degree. Based on the idea of optimizing the compactness and separation of the established clusters, this article proposes two internal validity measures for CDF. From them, we apply the Genetic Algorithm coded for solving integer optimization problems (Deep, Singh, Kansal, & Mohan 2009) to minimize these internal validity measures; hence, suitable clusters are established. This algorithm is integrated in the Global Optimization Toolbox of the Matlab software and is easy to use. The numerical examples in this article will show that the proposed method can find the optimal internal validity measure. The results with optimal internal measures are then re-tested using external measures (with references). It can be seen that the proposed algorithm significantly improves the performance of CDF.
This article is organized as follows. In Section 2, we summarize some issues relating to the L1-distance and the representing pdf of a cluster, and propose two new internal validity measures. Section 3 reviews the Genetic Algorithm called MI-LXPM presented in (Deep, Singh, Kansal, & Mohan 2009). Section 4 presents numerical examples that use MI-LXPM to optimize the internal validity measures proposed in Section 2; it demonstrates that our algorithm can improve the performance of CDF. Section 5 is the conclusion.

2 SOME RELATIONS

2.1 L1-distance and representing pdf

Let F = {f1(x), f2(x), ..., fN(x)}, N ≥ 2, be the set of pdfs to be partitioned into k clusters C = (C1, C2, ..., Ck), k ≥ 2.
Definition 1. The L1-distance of F is defined as follows. For N > 2,

‖f1, f2, ..., fN‖1 = ∫_{R^n} fmax(x) dx − 1    (1)

And for N = 2,

‖f1 − f2‖1 = ∫_{R^n} |f1(x) − f2(x)| dx    (2)

where fmax(x) = max{f1(x), f2(x), ..., fN(x)}.
From (1), it is easy to prove that ‖f1, f2, ..., fN‖1 is a non-decreasing function of N, with 0 ≤ ‖f1, f2, ..., fN‖1 ≤ N − 1. Equality on the left occurs when all fi are identical, and on the right when the fi have disjoint supports. From (2), we have

(1/2)‖f1 − f2‖1 = ∫_{R^n} fmax(x) dx − 1.

2.2 The representing probability functions of clusters

Definition 2. Given the set of pdfs F = (f1, f2, ..., fN), N ≥ 2, separated into k clusters C = (C1, C2, ..., Ck), k ≥ 2, the representing pdf of cluster Ci is defined by

fvi = (Σ_{fi∈Ci} fi) / ni    (3)

where ni is the number of pdfs in cluster Ci. We also have fvi(x) ≥ 0 for all x and ∫_{R^n} fvi dx = 1.

2.3 Two new proposed internal validity measures

In this section, we propose the IntraF and SF indices to evaluate the quality of the established clusters in CDF. The two internal validity measures are presented as follows:
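On a discretization grid, the quantities in (1)-(3) reduce to simple sums. The sketch below is our own illustration; the grid and the two uniform example densities are arbitrary choices. It also checks the stated bounds: the set distance is 0 for identical pdfs and N − 1 for disjoint supports:

```python
import numpy as np

x = np.arange(0.0, 4.0, 0.0005)        # integration grid on [0, 4)
dx = x[1] - x[0]

def l1_set(pdfs):
    """L1-distance of a set of pdfs, eq. (1): integral of f_max minus 1."""
    return np.maximum.reduce(pdfs).sum() * dx - 1.0

def l1_pair(f, g):
    """L1-distance between two pdfs, eq. (2)."""
    return np.abs(f - g).sum() * dx

def representing_pdf(cluster):
    """Representing pdf of a cluster, eq. (3): the average of its members."""
    return sum(cluster) / len(cluster)

# Two uniform densities with disjoint supports on [0, 1) and [2, 3).
f1 = ((x >= 0) & (x < 1)).astype(float)
f2 = ((x >= 2) & (x < 3)).astype(float)

d_set = l1_set([f1, f2])    # = N - 1 = 1 for disjoint supports
d_pair = l1_pair(f1, f2)    # = 2 * d_set, by the identity below eq. (2)
```

The representing pdf of any cluster built this way still integrates to one, as Definition 2 requires.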



IntraF index Crossover
The crossover operator used in (Deep, Singh,
1 k Kansal, & Mohan 2009) is the Laplace p crosso-
f
2
IntraF = ffvi (4) ver. Give two individual x1 x11, x12 , , x1n
n i =1 f Ci
and x 2 x12 , x22 , , xn2 , their offsprings
p
1 1 1 1
y y1 , y2 yn and y2 y12 , y22 yn2 are
where f ffvi is the L1 - distance between f and generated in following way:
fvi and n is the number of all pdfs.
The more similar between pdfs in cluster to yi1 xi1 + i | xi1 xi2 | yi2 xi2 + i | xi1 xi2 |, (6)
their representing pdf are, the smaller IntraF is.
Therefore, the IntraF index reflects the compactness of the established clusters and, at first sight, it is suitable for evaluating cluster quality.

SF index
The IntraF index can compute the compactness of clusters but cannot assess the separation between different clusters. Therefore, we propose a new index to measure this separation. This index is called SF and is defined as follows:

SF = [ (1/n) Σ_{i=1}^{k} Σ_{f∈Ci} ||f − fvi||² ] / [ min_{i≠j} ||fvi − fvj||² ] = IntraF / min_{i≠j} ||fvi − fvj||²   (5)

where ||fvi − fvj|| is the L1 distance between the representing pdfs of cluster i and cluster j.
The SF index computes the pairwise L1 distances between all representing pdfs of all clusters. Then their minimum is taken as the separation measurement. The more separated the clusters are, the larger the denominator and the smaller SF become. Thus, the smallest SF indeed indicates a valid optimal partition which considers both compactness and separation of clusters.

3 GENETIC ALGORITHM FOR SOLVING INTEGER OPTIMIZATION PROBLEM

Firstly, we have to encode a solution of the clustering problem into a chromosome before applying the Genetic Algorithm to optimize the internal validity measures. Each individual is represented by a chromosome whose length equals the number of pdfs. The value lj of each gene in the chromosome represents the label of the cluster to which the jth pdf is assigned. For example, the clustering result with C1 = {f1, f4}, C2 = {f2, f5, f7}, C3 = {f3, f6} is represented by the chromosome: 1 2 3 1 2 3 2.
The Genetic Algorithm for solving integer optimization problems (Deep, Singh, Kansal, & Mohan 2009) is called MI-LXPM and is presented as follows.

In (6), βi follows the Laplace distribution and is generated as

βi = a − b·log(ui)   if ri ≤ 1/2,
βi = a + b·log(ui)   if ri > 1/2,

where a is the location parameter, b > 0 is the scaling parameter, and ui, ri ∈ [0, 1] are uniform random numbers. For the CDF problem, in each individual above, n is the number of pdfs and 1 ≤ xi ≤ k, with k the number of clusters.

Mutation
The mutation operator used in MI-LXPM is the power mutation. By it, a solution x̄ is created in the vicinity of a parent solution x in the following manner:

x̄ = x − s(x − x^l)   if t < r,
x̄ = x + s(x^u − x)   if t ≥ r.

In the above equation, s is a random number having a power distribution, calculated by s = (s1)^p, where s1 is chosen randomly in the interval [0, 1] and p, called the index of mutation, is an integer; t = (x − x^l)/(x^u − x), where x^l and x^u are the lower and upper bounds on the value of the decision variable (in CDF, x^l = 1 and x^u = k); r is a random number between 0 and 1.

Truncation procedure for integer restriction
In order to ensure that the integer restrictions are satisfied after the crossover and mutation operations have been performed, the following truncation procedure is applied. For all i = 1,…,n, xi is truncated to the integer value x̄i by the rule: if xi is an integer then x̄i = xi; otherwise x̄i is equal to [xi] or [xi] + 1, each with probability 0.5, where [xi] is the integer part of xi.

Selection
MI-LXPM uses the tournament selection presented by Goldberg and Deb (Goldberg & Deb 1991).
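The operators just described can be sketched in Python; a minimal, illustrative transcription (the parameter defaults a, b and p are example choices, not values prescribed by the paper):

```python
import math
import random

def laplace_beta(a=0.0, b=0.5):
    """beta following the Laplace(a, b) law used by the Laplace
    crossover: a -/+ b*log(u) depending on a fair coin r."""
    u = 1.0 - random.random()          # u in (0, 1], keeps log(u) finite
    r = random.random()
    return a - b * math.log(u) if r <= 0.5 else a + b * math.log(u)

def power_mutation(x, x_lo, x_hi, p=4):
    """Power mutation: move x toward one of its bounds x_lo, x_hi."""
    s = random.random() ** p           # s = s1^p with s1 uniform on [0, 1]
    t = (x - x_lo) / (x_hi - x) if x < x_hi else float("inf")
    r = random.random()
    return x - s * (x - x_lo) if t < r else x + s * (x_hi - x)

def truncate_integer(x):
    """Integer restriction: keep integers, otherwise round to floor(x)
    or floor(x) + 1, each with probability 0.5."""
    if x == int(x):
        return int(x)
    lo = math.floor(x)
    return lo if random.random() < 0.5 else lo + 1
```

For cluster labels the bounds are x_lo = 1 and x_hi = k, so a mutated gene always stays a valid (after truncation, integer) label.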

53

AMER16_Book.indb 53 3/15/2016 11:23:50 AM


The above part presents the detailed MI-LXPM algorithm. This algorithm is then applied to optimize the SF index proposed in Section 2 for solving the CDF problem. We call this hybrid algorithm, presented by the five steps below, MI-LXPM-CDF:

St. 1 Start with random clustering solutions represented by chromosomes.
St. 2 Evaluate the SF index for each clustering solution.
St. 3 Perform the genetic operations, such as selection, crossover, and mutation, on the current clustering solutions to introduce new ones.
St. 4 Replace the current clustering solutions with the new ones having smaller SF index.
St. 5 If some stopping criterion is met then stop, else go to St. 2.

The main idea of MI-LXPM-CDF is the following: throughout each iteration, from the existing clustering solutions, we create some new ones and choose a determined number of best ones for the next iteration. In the end, we obtain the solution whose internal validity measure is optimized. Because MI-LXPM is an algorithm for finding the global optimum, the new approach increases the chance of avoiding being trapped in a local solution in comparison with some hill-climbing algorithms, such as k-means or the non-hierarchical approach. The suitability, feasibility and applicability of MI-LXPM-CDF will be tested by the numerical examples in the following section.

4 NUMERICAL EXAMPLES

In this section, we conduct three experiments to compare the proposed algorithm with Vo Van and Pham-Gia's non-hierarchical approach (Vo Van & Pham-Gia 2010) and the Automatic clustering of Chen and Hung (Chen & Hung 2014). In the first example, we consider seven univariate normal probability densities whose variances are the same and whose means are different. This is a simple example presented in (Vo Van & Pham-Gia 2010). We review this example to illustrate the theoretical results and test the suitability of the proposed algorithm. The more complicated synthetic example studied in (Chen & Hung 2014) will be reviewed in Example 2. This example contains 100 uniform distribution pdfs with dynamic parameters, separated into two groups with 50 pdfs in each group. In the final example, we apply the proposed algorithm to image recognition, an interesting problem for many researchers in data mining with big data. We take 26 images from the Caltech 101 dataset (Fei-Fei 2004). These 26 images contain 2 categories (lotus and sunflowers) with 13 images each. In each example, we apply MI-LXPM-CDF to optimize the internal validity measure, then compare the external measure (the error in comparison with the ground truth) of the new algorithm with the existing ones in (Vo Van & Pham-Gia 2010, Chen & Hung 2014). The detailed results are shown as follows.

Example 1: Suppose we have seven populations with univariate normal pdfs, with the specific parameters:

σ1 = σ2 = … = σ7 = 1;
μ1 = 0.3; μ2 = 4.0; μ3 = 9.1; μ4 = 1.0; μ5 = 5.5; μ6 = 8.0; μ7 = 4.8.

Figure 1. The pdfs of 7 univariate normal distributions.

From Figure 1, it can be seen that the suitable separation for these pdfs is

C1 = {f1, f4}, C2 = {f2, f5, f7}, C3 = {f3, f6}

Table 1. The results of MI-LXPM-CDF and existing algorithms.

                         Misclustering rate (%)   SF index
Vo Van & Pham-Gia 2010   0                        0.0493
Chen & Hung 2014         0                        0.0493
MI-LXPM-CDF              0                        0.0493

The clustering results of MI-LXPM-CDF and the other algorithms are presented in Table 1. It can be observed that all methods are absolutely accurate in comparison with the remark mentioned before. In fact, this is a simple and easy example, used only as a first test for our algorithm. The result verifies that our algorithm is suitable at first and needs to be retested in a more complicated example, as follows.
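To make the computation of (5) concrete, the following Python sketch evaluates the SF index for pdfs sampled on a common grid. As an assumption made purely for this illustration, the representing pdf fvi is taken to be the pointwise mean of the pdfs in cluster i (the actual representing pdf is defined earlier in the article):

```python
import math

def l1_distance(f, g, dx):
    """L1 distance between two pdfs sampled on a common grid."""
    return sum(abs(a - b) for a, b in zip(f, g)) * dx

def mean_pdf(pdfs):
    """Pointwise mean of a list of sampled pdfs (illustrative fv_i)."""
    return [sum(vals) / len(pdfs) for vals in zip(*pdfs)]

def sf_index(pdfs, labels, dx):
    """SF index (5): IntraF divided by the minimal squared pairwise
    L1 distance between the cluster-representing pdfs."""
    clusters = sorted(set(labels))
    reps = {c: mean_pdf([f for f, l in zip(pdfs, labels) if l == c])
            for c in clusters}
    intra = sum(l1_distance(f, reps[l], dx) ** 2
                for f, l in zip(pdfs, labels)) / len(pdfs)
    sep = min(l1_distance(reps[a], reps[b], dx) ** 2
              for i, a in enumerate(clusters) for b in clusters[i + 1:])
    return intra / sep
```

For Example 1 one would discretize the seven normal densities on a fine grid and pass the label vector [1, 2, 3, 1, 2, 3, 2]; the returned value is the quantity that MI-LXPM-CDF minimizes.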



Example 2: In this example, we review the synthetic data studied in (Chen & Hung 2014). The data consist of two classes f1 and f2 with 100 uniform pdfs on the interval [0, 1000] (see Figure 2). The pdfs of these two classes are defined as follows:

f1,i ~ U(ai, bi),  f2,i ~ U(ci, di),  i = 1,…,50,

where

ai = 4(i − 1) + ε1,  bi = ai + 2 + ε2,
ci = 805 + 5(i − 1) + ε3,  di = ci + 4 + ε4,

with U(ai, bi) and U(ci, di) denoting the uniform distributions on the intervals (ai, bi) and (ci, di), respectively, and ε1,…,ε4 drawn from U(0, 1).

Figure 2. Two classes of pdfs f1 (black) and f2 (red).

We next create mixtures of the above uniform distributions. Consider the first class g1 = f1 and the second class g2 = εf1 + (1 − ε)f2, where ε ∈ [0, 1] takes the values 0, 0.1, 0.2 and 0.3, respectively, in this paper. Figure 3 shows the classes g1 (black) and g2 (red) in the cases ε = 0.1 and ε = 0.3. The clustering results of MI-LXPM-CDF and other methods for the four cases of ε are presented in Table 2.

In this example, the larger ε is, the greater the overlapping degree between the pdfs and the more complicated the problem becomes. It can be seen that our method significantly improves on the performance of the non-hierarchical approach in (Vo Van & Pham-Gia 2010) for all cases. Especially, MI-LXPM-CDF has maximal accuracy in three cases: ε = 0, ε = 0.2 and ε = 0.3. In the case of ε = 0.1, the proposed method makes an error of 5%; the reason is that the algorithm cannot find the global optimum. This shows that our algorithm, especially the mutation operator, needs some improvement in the future so that it can escape the local optimum and give better results. Anyway, the clustering results in Table 2 demonstrate that MI-LXPM-CDF is feasible and improves the clustering performance when evaluated by either internal or external validity measures.

Example 3: In this example, we apply our algorithm to a more challenging problem, image recognition. The data set of images collected from the Caltech 101 Objects database is considered. We take 26 images in 2 categories (lotus and sunflowers) with 13 images each. The details are shown in Figure 4.

Figure 3. Two classes of pdfs g1 (black) and g2 (red) in the cases ε = 0.1 and ε = 0.3.
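The two-class construction behind Figures 2 and 3 is easy to reproduce for pdfs sampled on a grid (an illustrative sketch, not the authors' code):

```python
def uniform_pdf(a, b, grid):
    """pdf of U(a, b) sampled at the points of grid."""
    h = 1.0 / (b - a)
    return [h if a <= x <= b else 0.0 for x in grid]

def mixture(f1, f2, eps):
    """g2 = eps*f1 + (1 - eps)*f2, computed pointwise."""
    return [eps * u + (1.0 - eps) * v for u, v in zip(f1, f2)]
```

With eps = 0 the two classes are disjoint; raising eps toward 0.3 moves part of the mass of g2 onto the support of f1, which is exactly the growing overlap visible in Figure 3.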



We use the raw colour data in Grayscale colour space for these images and estimate the pdf of each image by the Grayscale distribution of its pixels. Figure 5 shows the pdfs estimated from the two classes of images, with the red pdfs for lotus and the black pdfs for sunflowers. Each researched method is run 10 times and the averages of the misclustering rates (%) of all methods are shown in Table 3.

Table 2. The results of MI-LXPM-CDF and existing algorithms in each case of ε.

                         Misclustering rate (%)
                         ε = 0   ε = 0.1   ε = 0.2   ε = 0.3
Vo Van & Pham-Gia 2010   9.2     9         8.8       13.4
Chen & Hung 2014         0       0         0         0
MI-LXPM-CDF              0       5         0         0

Figure 4. The detail of the images data.

Figure 5. Two classes of pdfs: lotus (red), sunflowers (black).

Table 3. The results of MI-LXPM-CDF and existing algorithms.

Methods                  Misclustering rate (%)
Vo Van & Pham-Gia 2010   31.61
Chen & Hung 2014         50
MI-LXPM-CDF              11.54

In this example, the disadvantage of the Automatic clustering is shown when it gives a single cluster containing all pdfs (k = 1); therefore the misclustering rate of this method is 50%. The non-hierarchical approach gives a result with an average misclustering rate of 31.61%, while MI-LXPM-CDF is the best, with an error of 11.54%. This proves that the proposed algorithm can significantly improve the clustering performance and can be well applied to many practical problems in data mining with big data.

5 CONCLUSION

Basing on the L1 distance, the representing pdf of a cluster and some related problems, this article proposes two internal validity measures, named the IntraF and SF indexes, to evaluate clustering results. The SF index is used as the objective function to be minimized. Furthermore, this article applies the Genetic Algorithm named MI-LXPM, coded for solving integer optimization problems, to find the optimal value of the SF index in CDF. The proposed algorithm, MI-LXPM-CDF, is tested by external validity measures on many synthetic and real data sets. The numerical examples show that MI-LXPM-CDF not only performs well on simulated problems but also improves the clustering performance in practical problems, such as image recognition. Clearly, as the era of big data has arrived, with uncertain and large-volume data, this research and others which focus on improving the performance of clustering and classification are really necessary. In future work, some operators of MI-LXPM-CDF will be studied and improved to increase its ability to search for the globally optimal internal validity measure. Besides, some problems in data mining with big data, such as image, sound and video recognition, will be researched.

REFERENCES

Agustín-Blas, L., S. Salcedo-Sanz, S. Jiménez-Fernández, L. Carro-Calvo, J. Del Ser & J. Portilla-Figueras (2012). A new grouping genetic algorithm for clustering problems. Expert Systems with Applications 39(10), 9695–9703.
Chen, J.H. & W.L. Hung (2014). An automatic clustering algorithm for probability density functions. Journal of Statistical Computation and Simulation (ahead-of-print), 1–17.
Das, S., A. Abraham & A. Konar (2008). Automatic kernel clustering with a multi-elitist particle swarm optimization algorithm. Pattern Recognition Letters 29(5), 688–699.
Davies, D.L. & D.W. Bouldin (1979). A cluster separation measure. IEEE Transactions on Pattern Analysis and Machine Intelligence (2), 224–227.



Deep, K., K.P. Singh, M. Kansal & C. Mohan (2009, June). A real coded genetic algorithm for solving integer and mixed integer optimization problems. Applied Mathematics and Computation 212(2), 505–518.
Dunn, J.C. (1973). A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters.
Falkenauer, E. (1992). The grouping genetic algorithms – widening the scope of the GAs. Belgian Journal of Operations Research, Statistics and Computer Science 33(1), 2.
Fei-Fei, L., R. Fergus & P. Perona (2004). Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories. In CVPR Workshop on Generative-Model Based Vision.
Fukunaga, K. (1990). Introduction to Statistical Pattern Recognition. Academic Press, Boston.
George, G., M.R. Haas & A. Pentland (2014). Big data and management. Academy of Management Journal 57(2), 321–326.
Glick, N. (1972). Sample-based classification procedures derived from density estimators. Journal of the American Statistical Association 67(337), 116–122.
Goh, A. (2008). Unsupervised Riemannian clustering of probability density functions. pp. 377–392.
Goldberg, D.E. & K. Deb (1991). A comparative analysis of selection schemes used in genetic algorithms. Foundations of Genetic Algorithms 1, 69–93.
Hruschka, E.R. (2003). A genetic algorithm for cluster analysis. Intelligent Data Analysis 7(1), 15–25.
Jain, A.K., M.N. Murty & P.J. Flynn (1999). Data clustering: a review. ACM Computing Surveys (CSUR) 31(3), 264–323.
Jiang, H., S. Yi, J. Li, F. Yang & X. Hu (2010). Ant clustering algorithm with K-harmonic means clustering. Expert Systems with Applications 37(12), 8679–8684.
Larsen, B. & C. Aone (1999). Fast and effective text mining using linear-time document clustering. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 16–22. ACM.
MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1, pp. 281–297. Oakland, CA, USA.
Matusita, K. (1967). On the notion of affinity of several distributions and some of its applications. Annals of the Institute of Statistical Mathematics 19(1), 181–192.
Montanari, A. & D.G. Calò (2013). Model-based clustering of probability density functions.
Rand, W.M. (1971). Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association 66(336), 846–850.
Vo Van, T. & T. Pham-Gia (2010). Clustering probability distributions. Journal of Applied Statistics 37(11), 1891–1910.
Webb, A.R. (2003). Statistical Pattern Recognition. John Wiley & Sons.
Wu, X., X. Zhu, G.-Q. Wu & W. Ding (2014). Data mining with big data. IEEE Transactions on Knowledge and Data Engineering 26(1), 97–107.
Xie, X.L. & G. Beni (1991). A validity measure for fuzzy clustering. IEEE Transactions on Pattern Analysis & Machine Intelligence (8), 841–847.
Zhang, C., D. Ouyang & J. Ning (2010). An artificial bee colony approach for clustering. Expert Systems with Applications 37(7), 4761–4767.



Applied Mathematics in Engineering and Reliability – Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

Optimization of truss structures with reliability-based frequency constraints under uncertainties of loadings and material properties

V. Ho-Huu, T. Vo-Duy & T. Nguyen-Thoi
Division of Computational Mathematics and Engineering, Institute for Computational Science, Ton Duc Thang University, Ho Chi Minh City, Vietnam
Faculty of Civil Engineering, Ton Duc Thang University, Ho Chi Minh City, Vietnam

L. Ho-Nhat
Hochiminh University of Technology, Ho Chi Minh City, Vietnam

ABSTRACT: In this paper, the Reliability Based Design Optimization (RBDO) problem of truss structures with frequency constraints under uncertainties of loading and material properties is presented. Moreover, a double loop method with a new combination of a recently proposed improved differential evolution algorithm and an inverse reliability analysis is used for solving this problem. Three numerical examples, including a welded beam and 10-bar and 52-bar trusses, are considered for evaluating the efficiency and applicability of the proposed approach.

1 INTRODUCTION

In the engineering discipline, optimization always plays a very important role in the design of structures. The optimal design would help significantly reduce the cost and also improve the performance of the structures. However, engineering design problems are often related to uncertainties which derive from various sources like the manufacturing process, material properties and operating environments, etc. These uncertainties may cause structures to suffer working conditions different from the initial design. Sometimes, this results in risks to structures. Therefore, considering the influence of uncertain factors during the design process is really necessary.
The optimization of truss structures with frequency constraints is to minimize the whole weight of the structures while frequency constraints must be satisfied. The design variables are the element areas or/and nodal coordinates. For this kind of optimization problem, the frequency constraints play an important role in avoiding resonance phenomena of structures (Grandhi 1993), but from a mathematical point of view they are not easy to handle because of their highly nonlinear, non-convex and implicit function properties. Therefore, despite being introduced in (Bellagamba & Yang 1981) and being presented in more detail in (Kaveh & Zolghadr 2014), structural optimization with frequency constraints still has a lot of room for improvement and attracts certain attention from researchers. Besides, Reliability-Based Design Optimization (RBDO) for truss structures with frequency constraints has not yet been considered in the literature. Thus, it is really necessary to develop efficient tools for the optimization of truss structures with reliability-based frequency constraints under uncertainties of loadings and material properties.
In solving RBDO problems, the most direct approach is a double loop method in which the optimization loop (outer loop) is a deterministic optimization process; it repeatedly calls the reliability analysis loop (inner loop) in each cycle (Chen et al. 2013; Valdebenito & Schuëller 2010). The reliability analysis loop is a separate optimization problem which can be evaluated using direct methods such as the reliability index approach (Chen et al. 2013) or inverse methods such as the inverse reliability strategy (Du et al. 2007; Du et al. 2004). In the double loop method, choosing an optimization algorithm for the optimization loop is of crucial importance for solving a particular RBDO problem (Valdebenito & Schuëller 2010). For example, gradient-based optimization methods such as Sequential Quadratic Programming (SQP), the Generalized Reduced Gradient algorithm (GRG), etc. can be quite efficient for optimization problems with explicit, convex and continuous objective functions, but they will be inefficient for optimization problems with implicit, non-convex and discontinuous objective functions. This is because these methods use gradient information in the search progress. In contrast, global optimization methods such as the Genetic Algorithm (GA), Differential Evolution (DE), Particle Swarm



Optimization (PSO), etc. search for solutions in the whole design space with only objective function information. Therefore, they can easily deal with various optimization problems. However, these methods are still costly in computational resources for searching for the global solution.
Recently, Ho-Huu et al. 2016 proposed an adaptive elitist Differential Evolution (aeDE) algorithm for truss optimization with discrete variables. The aeDE is a newly improved version of the Differential Evolution (DE) algorithm based on three modifications. The effectiveness and robustness of the aeDE were verified through six optimization problems of truss structures. The numerical results demonstrated that the aeDE is more efficient than the DE and some other methods in the literature in terms of the quality of the solution and the convergence rate.
This paper hence tries to fill the above-mentioned research gaps by solving the RBDO problem for trusses with frequency constraints for the first time. For solving this problem, the double loop procedure is employed. In particular, for the optimization loop the aeDE is employed, while for the reliability analysis loop an inverse reliability strategy (Du et al. 2004) is used. Three numerical examples, including a welded beam and 10-bar and 52-bar trusses, are considered for evaluating the efficiency and applicability of the proposed approach.

2 FORMULATION OF RBDO PROBLEM

A typical RBDO problem is defined by

Minimize: f(d, x, p)
Design variables: DV = {d, μx}
Subject to: Prob{gi(d, x, p) ≥ 0} ≥ ri, i = 1, 2, …, m.   (1)

where f(d, x, p) is the objective function; d is the vector of deterministic design variables; x is the vector of random design variables; p is the vector of random parameters; gi(d, x, p) (i = 1, 2, …, m) are the constraint functions; ri (i = 1, 2, …, m) = Φ(βi) are the desired probabilities of constraint satisfaction; m is the number of probabilistic constraints; Φ(.) is the standard cumulative distribution function of the normal distribution; βi is the target reliability index of the ith probabilistic constraint; μx is the mean of the random design variables x.

3 INVERSE RELIABILITY ASSESSMENT METHOD

In conventional reliability analysis, the probabilistic constraint is checked by finding the probability of the constraint function gi such that this probability is greater than or equal to a desired probability given by the user. In the presence of multiple constraints, however, some constraints may never be active and consequently their reliabilities are extremely high (approaching 1.0). Although these constraints are the least critical, the evaluations of these reliabilities will unfortunately dominate the computational effort in probabilistic optimization (Du et al. 2004). To overcome this drawback, Du et al. 2004 proposed an inverse reliability strategy in which the reliability assessment of the constraint function gi is carried out only up to the necessary level. A brief description of the inverse strategy is summarized as follows.
A percentile performance is defined as

g^α ≥ 0   (2)

where g^α is the α-percentile performance of g(d, X, P), namely,

Prob{g(d, X, P) ≥ g^α} = α   (3)

It has been shown in (Du et al. 2004) that g^α ≥ 0 indicates that Prob{g(d, X, P) ≥ 0} ≥ α. Therefore, the original constraint is now converted to evaluating the α-percentile performance. More details of this strategy and the method for evaluating the α-percentile performance may be found in (Du et al. 2004).

4 ADAPTIVE ELITIST DIFFERENTIAL EVOLUTION ALGORITHM

Among a variety of global optimization algorithms, the Differential Evolution (DE) algorithm has shown better performance than some other methods in the literature (Civicioglu & Besdok 2013; Vesterstrom & Thomsen 2004). However, it still requires a high computational cost during the search process. One of the main reasons for this restriction is that the DE does not keep the trade-off between the global and local search capabilities. Hence, in our previous work, we proposed the adaptive elitist Differential Evolution (aeDE) algorithm, which ensures the balance between the global and local search capabilities of the DE. Through six numerical examples for truss structures, it was shown that the aeDE is better than the DE in terms of both quality of solution and convergence rate. For more details of the aeDE, readers can refer to (Ho-Huu et al. 2016). In this paper, the aeDE is extended to solve the RBDO problem.
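Du et al. 2004 evaluate the α-percentile with an efficient inverse most-probable-point search; purely to illustrate the definition in (3), the percentile can also be estimated by brute-force Monte Carlo sampling:

```python
import random

def percentile_performance(g, sample_x, alpha=0.99865, n=200_000, seed=0):
    """Estimate g_alpha with Prob{g(X) >= g_alpha} = alpha, cf. Eq. (3).

    g        : scalar constraint function of one sample
    sample_x : function drawing one realization of the random inputs
    """
    rnd = random.Random(seed)
    gx = sorted(g(sample_x(rnd)) for _ in range(n))
    # Prob{g >= g_alpha} = alpha  <=>  g_alpha is the (1 - alpha) quantile
    return gx[int((1.0 - alpha) * n)]

# Toy check: X ~ N(4, 1) and g(X) = X - 1, so g ~ N(3, 1) and
# Prob{g >= 0} = Phi(3), about 0.99865; hence g_alpha should be near 0.
g_alpha = percentile_performance(lambda x: x - 1.0,
                                 lambda rnd: rnd.gauss(4.0, 1.0))
```

The sign test g_alpha ≥ 0 then reproduces the percentile form (2) of the probabilistic constraint without ever computing the full reliability of inactive constraints.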



Figure 1. Flow chart of the double-loop method.

5 A GLOBAL INTEGRATED FRAMEWORK FOR SOLVING RBDO PROBLEM

The aeDE and the inverse reliability strategy are integrated into the double loop procedure. This integration is named DLM-aeDE and is summarized in the flow chart of Figure 1.

6 NUMERICAL EXAMPLES

In this section, three numerical examples consisting of a welded beam, a 10-bar truss and a 52-bar truss are considered. Because the RBDO for truss structures with frequency constraints has not been provided in the literature, a welded beam, a benchmark problem in the RBDO field, is presented as the first example to validate the accuracy of the implementation codes. Then, the 10-bar and 52-bar truss structures are carried out. The parameters of the aeDE, including the population size NP, threshold, delta and MaxIter, are set to 20, 10⁻³, 10⁻⁶ and 1000, respectively. In this study, all codes, including the finite element analysis of the beam and the trusses and the aeDE, are written in Matlab.

6.1 Welded beam

The first example is a welded beam as shown in Figure 2. This beam was previously solved by Cho & Lee 2011 using a CL-SORA method, Hyeon & Chai 2008 using a moment-based RBDO method, and Ho-Huu et al. 2016 using a SORA-ICDE method. The objective function is the welding cost. Five probabilistic constraints are related to physical quantities such as shear stress, bending stress, buckling, and displacement.

Figure 2. The welded beam.

The RBDO problem has four random variables (x1, x2, x3, x4) which are statistically independent and follow normal distributions. The RBDO model of the welded beam problem is given by

find d = [d1, d2, d3, d4]^T
minimize f(d, z) = c1 d1² d2 + c2 d3 d4 (z2 + d2)
subject to Prob{gj(x, z) ≤ 0} ≥ Φ(βj^t), j = 1, …, 5   (4)

where

g1(x, z) = τ(x, z)/z6 − 1;  g2(x, z) = σ(x, z)/z7 − 1;
g3(x, z) = x1/x4 − 1;  g4(x, z) = δ(x, z)/z5 − 1;
g5(x, z) = 1 − Pc(x, z)/z1, with Pc(x, z) the buckling load;
τ(x, z) = {t(x, z)² + 2 t(x, z) tt(x, z) x2/(2R(x)) + tt(x, z)²}^(1/2);
t(x, z) = z1/(√2 x1 x2);  tt(x, z) = M(x, z) R(x)/J(x);
M(x, z) = z1 (z2 + x2/2);  R(x) = {x2²/4 + (x1 + x3)²/4}^(1/2);
J(x) = 2√2 x1 x2 {x2²/12 + (x1 + x3)²/4};
σ(x, z) = 6 z1 z2/(x4 x3²);  δ(x, z) = 4 z1 z2³/(z3 x4 x3³);
xi ~ N(di, 0.1693²) for i = 1, 2;
xi ~ N(di, 0.0107²) for i = 3, 4;
β1^t = β2^t = β3^t = β4^t = β5^t = 3;
3.175 ≤ d1 ≤ 50.8;  0 ≤ d2 ≤ 254;  0 ≤ d3 ≤ 254;  0 ≤ d4 ≤ 50.8;
z1 = 2.6688 × 10⁴ (N);  z2 = 3.556 × 10² (mm);
z3 = 2.0685 × 10⁵ (MPa);  z4 = 8.274 × 10⁴ (MPa);
z5 = 6.35 (mm);  z6 = 9.377 × 10¹ (MPa);  z7 = 2.0685 × 10² (MPa);
c1 = 6.74135 × 10⁻⁵ ($/mm³);  c2 = 2.93585 × 10⁻⁶ ($/mm³).

The obtained results of the DLM-aeDE are listed in Table 1 in comparison with those obtained by moment-based RBDO, SORA-ICDE and other methods.



Table 1. Optimization results for welded beam problem.

                          Hyeon & Chai 2008   Ho-Huu et al. 2016   This work
Design variable (mm)      Moment              SORA-ICDE            DLM-aeDE
x1                        5.729               5.730                5.728
x2                        200.59              201.00               201.089
x3                        210.59              210.63               210.610
x4                        6.238               6.240                6.239
Cost ($)                  2.5895              2.5926               2.5923
Reliability index  β1     3.01                3.01                 3.01
                   β2     3.52                3.29                 3.07
                   β3     3.01                3.00                 3.01
                   β4     Infinite            Infinite             Infinite
                   β5     3.31                3.12                 3.01
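The reliability indexes in Table 1 map to probabilities of constraint satisfaction through the standard normal CDF, r = Φ(β); a one-line check with the Python standard library:

```python
from statistics import NormalDist

def prob_of_safety(beta):
    """Probability of constraint satisfaction for reliability index beta."""
    return NormalDist().cdf(beta)
```

prob_of_safety(3.0) gives approximately 0.99865, the target safety level quoted throughout the examples; the inverse mapping is NormalDist().inv_cdf(0.99865), which is approximately 3.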

Table 2. Data for the 10-bar planar truss structure.

Parameters (unit)                         Value
Modulus of elasticity E (N/m²)            6.89 × 10¹⁰
Material density ρ (kg/m³)                2770
Added mass (kg)                           454
Allowable range of cross-sections (m²)    0.645 × 10⁻⁴ ≤ A ≤ 50 × 10⁻⁴
Constraints on first three
frequencies (Hz)                          ω1 ≥ 7, ω2 ≥ 15, ω3 ≥ 20

Figure 3. Model of a 10-bar planar truss structure.


It can be seen that the results obtained by the DLM-aeDE are in good agreement with those gained by the other studies. It can also be seen from Table 1 that all reliability levels satisfy the required reliability indexes. These results demonstrate that the Matlab implementation of the DLM-aeDE is reliable and accurate.

6.2 10-bar planar truss

In the second example, a simple 10-bar truss structure, as depicted in Figure 3, is considered. A non-structural mass of 454 kg is added to all free nodes. Data for the problem, including the material properties, design variable bounds, and frequency constraints, are summarized in Table 2. This example was investigated by some researchers, such as Kaveh & Zolghadr 2014 utilizing democratic PSO and Zuo et al. 2014 using a hybrid algorithm between an optimality criterion and a genetic algorithm (OC-GA), etc. However, these studies are limited to solving the deterministic optimization problem, in which the cross-sectional areas of the bars are assumed to be independent design variables while Young's modulus and the mass density of the truss and the added masses are fixed as given in Table 2. In this study, the cross-sectional areas of the bars, Young's modulus, the mass density of the truss and the added masses are all assumed to be random design variables which have normal distributions with expected values equal to those of the Deterministic Optimization (DO) problem and a standard deviation of 5%. The reliability indexes of all frequency constraints are set to 3. This is equivalent to assuming that the safety level of the structure must be greater than or equal to 99.865%.
The results of the DLM-aeDE are presented in Table 3 in comparison with those obtained by some methods for deterministic optimization. It can be seen that the reliability indexes for all frequency constraints satisfy the required reliability indexes of the RBDO problem. The best weight obtained by the DLM-aeDE is 665.637 lb, corresponding to a probability of safety of 99.865%. The results in Table 3 also show that for the DO problem, the reliability of the structure is very low (around 50%).
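For the optimizer, the frequency requirements of Table 2 are conveniently written as normalized functions that are non-positive exactly when the bound is met (an illustrative sketch; the frequencies themselves come from a finite element eigenvalue analysis, not shown here):

```python
def frequency_constraints(freqs, limits=((">=", 7.0), (">=", 15.0), (">=", 20.0))):
    """Normalized constraints for the first natural frequencies (Hz).
    Each returned value is <= 0 iff the corresponding bound is met."""
    gs = []
    for f, (sense, limit) in zip(freqs, limits):
        gs.append(1.0 - f / limit if sense == ">=" else f / limit - 1.0)
    return gs
```

For instance, a design whose first three frequencies are (7.0, 16.2, 20.3) Hz sits exactly on the first bound and strictly inside the other two.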



Table 3. Optimum results for the 10-bar planar truss structure.

                        Kaveh & Zolghadr 2014   This work
                        Deterministic           Deterministic   Reliability-based
                        Optimization            Optimization    Design Optimization
Design variable
(area in²)              DPSO                    aeDE            DLM-aeDE
A1                      35.944                  35.775          42.893
A2                      15.53                   14.926          19.020
A3                      35.285                  34.840          45.926
A4                      15.385                  14.252          18.729
A5                      0.648                   0.646           0.661
A6                      4.583                   4.569           5.714
A7                      23.61                   24.632          30.599
A8                      23.599                  23.043          30.019
A9                      13.135                  11.932          15.320
A10                     12.357                  12.601          15.883
Weight (lb)             532.39                  524.629         665.637
Reliability index (Probability of safety %)
  β1                                            0.00 (50.00%)   3.00 (99.86%)
  β2                                            −2.20 (1.35%)   4.79 (100%)
  β3                                            0.00 (49.99%)   3.00 (99.86%)
Number of structural analyses                   3940            774000
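The last row of Table 3 reflects the cost structure of the double loop: every candidate design of the outer aeDE loop triggers one inner reliability analysis per probabilistic constraint. A schematic with hypothetical callables (not the authors' implementation):

```python
def double_loop(candidate_designs, constraints, reliability_analysis, weight):
    """Schematic double-loop method: outer deterministic optimization,
    inner reliability analysis. Returns the best feasible design and
    the accumulated number of structural analyses."""
    analyses = 0
    best = None
    for d in candidate_designs:              # outer loop (aeDE in the paper)
        feasible = True
        for g in constraints:                # inner loop, one per constraint
            g_alpha, n_calls = reliability_analysis(g, d)
            analyses += n_calls
            if g_alpha > 0.0:                # percentile check, g <= 0 wanted
                feasible = False
        if feasible and (best is None or weight(d) < weight(best)):
            best = d
    return best, analyses
```

With hundreds of outer candidates and an inner analysis costing many structural evaluations each, totals like the 774,000 analyses in Table 3 follow naturally.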

This illustrates that the safety of the whole truss is enhanced effectively and becomes more applicable in reality when the influence of uncertain factors during the design process is taken into account.

Table 4. Eight element groups for the 52-bar dome truss structure.

Group number   Elements
1              1–4
2              5–8
3              9–16
4              17–20
5              21–28
6              29–36
7              37–44
8              45–52

Figure 4. Model of a 52-bar dome truss structure.

6.3 52-bar dome truss

In the last example, a simple 52-bar dome truss structure, as shown in Figure 4, is considered. All of the bars are arranged into eight groups as in Table 4. All free nodes are permitted to move 2 m in each allowable direction from their initial position, but must preserve symmetry of the whole structure. Therefore, there are five shape variables and eight sizing variables. The material properties, design variable bounds, and frequency constraints of the problem are given in Table 5.

Table 5. Data for the 52-bar dome truss structure.

Parameters (unit)                        Value
Modulus of elasticity E (N/m²)           2.1 × 10¹¹
Material density ρ (kg/m³)               7800
Added mass (kg)                          50
Allowable range of cross-sections (m²)   0.0001 ≤ A ≤ 0.001
Constraints on first three
frequencies (Hz)                         ω1 ≤ 15.961, ω2 ≥ 28.648
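The probabilistic model used in both truss examples (every random variable normal, centered at its deterministic value, with a 5% standard deviation) can be sampled with a small illustrative helper:

```python
import random

def sample_random_design(means, cov=0.05, seed=None):
    """One realization of the random design variables: normal with the
    deterministic values as means and cov*mean as standard deviation."""
    rnd = random.Random(seed)
    return [rnd.gauss(m, cov * m) for m in means]
```

For example, sampling the 52-bar Young's modulus E = 2.1 × 10¹¹ N/m² yields values whose one-sigma spread is 5% around the nominal value, which is exactly the uncertainty the RBDO run must tolerate.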



Table 6. Optimum results for the 52-bar dome truss structure.

                        Miguel & Fadel Miguel 2012   This work
                        Deterministic                Deterministic   Reliability-based
                        Optimization                 Optimization    Design Optimization
Design variable
(area in²)              FA                           aeDE            DLM-aeDE
ZA                      6.4332                       5.9889          4.0553
XB                      2.2208                       2.2482          2.4973
ZB                      3.9202                       3.7658          4.0568
XF                      4.0296                       3.9865          3.9998
ZF                      2.5200                       2.5005          2.6939
A1                      1.0050                       1.0014          1.0106
A2                      1.3823                       1.1288          1.0076
A3                      1.2295                       1.1843          2.0040
A4                      1.2662                       1.4444          2.0138
A5                      1.4478                       1.3897          1.5572
A6                      1.0000                       1.0002          1.0201
A7                      1.5728                       1.5531          2.3314
A8                      1.4153                       1.4354          2.2555
Weight (lb)             197.53                       193.479         271.036
Reliability index (Probability of safety %)
  β1                                                 2.72 (99.67%)   3.00 (99.86%)
  β2                                                 0.00 (50.03%)   3.00 (99.86%)
Number of structural analyses                        7200            2100261

This problem was previously studied by some researchers such as (Gomes 2011), (Miguel & Fadel Miguel 2012) and (Khatibinia & Sadegh Naseralavi 2014), etc. However, similar to the 10-bar planar truss, these studies are limited to solving the deterministic optimization problem, in which the cross-sectional areas of the bars are assumed to be independent design variables while Young's modulus and the mass density of the truss and the added masses are fixed as given in Table 5. In this study, the cross-sectional areas of the bars, Young's modulus, the mass density of the truss and the added masses are all assumed to be random design variables which have normal distributions with expected values equal to those of the Deterministic Optimization (DO) problem and a standard deviation of 5%. The reliability indexes of all frequency constraints are set to 3, which is equivalent to a safety level of the structure greater than or equal to 99.865%.
The results of the problem are provided in Table 6 in comparison with those in (Miguel & Fadel Miguel 2012) for deterministic optimization. From Table 6, it can be seen that for the DO problem, the reliability of the structure is very low (around 50%) for the second constraint. This may lead to danger for the structure when the input parameters are changed. On the other hand, with the results of the RBDO problem, the reliability of the structure may be ensured at the required safety levels.

7 CONCLUSIONS

In this study, the RBDO problem for truss structures with frequency constraints under uncertainties of loading and material properties is presented. Moreover, a new double loop approach combining an inverse reliability method and an adaptive elitist differential evolution algorithm (DLM-aeDE) is employed to solve this problem. The proposed method is then applied to a welded beam and to 10-bar and 52-bar truss structures. The results reveal that (1) the DLM-aeDE is a good competitor to the other algorithms for solving the RBDO problem; (2) the best solutions of the RBDO for the 10-bar and 52-bar trusses are found with a reliability of 99.865%; (3) the RBDO for truss structures with frequency constraints makes the design process of truss structures more practical in reality.

ACKNOWLEDGEMENTS

This research is funded by Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 107.99-2014.11.

REFERENCES

Bellagamba, L. & Yang, T.Y. 1981. Minimum-mass truss structures with constraints on fundamental natural

64

AMER16_Book.indb 64 3/15/2016 11:24:03 AM


frequency. AIAA Journal, 19(11), 14521458. http:// ter & Structures, 165, 5975. http://doi.org/10.1016/j.
doi.org/10.2514/3.7875. compstruc.2015.11.014.
Chen, Z., Qiu, H., Gao, L., Su, L. & Li, P. 2013. An Hyeon Ju, B. & Chai Lee, B. 2008. Reliability-based
adaptive decoupling approach for reliability-based design optimization using a moment method and a
design optimization. Computers & Structures, 117(0), kriging metamodel. Engineering Optimization, 40(5),
5866. http://doi.org/http://dx.doi.org/10.1016/j. 421438. http://doi.org/10.1080/03052150701743795.
compstruc.2012.12.001. Kaveh, A. & Zolghadr, A. 2012. Truss optimization with
Cho, T.M. & Lee, B.C. 2011. Reliability-based design natural frequency constraints using a hybridized CSS-
optimization using convex linearization and sequential BBBC algorithm with trap recognition capability.
optimization and reliability assessment method. Struc- Computers and Structures, 102103, 1427. http://doi.
tural Safety, 33(1), 4250. http://doi.org/10.1016/j. org/10.1016/j.compstruc.2012.03.016.
strusafe.2010.05.003. Kaveh, A. & Zolghadr, A. 2014. Democratic PSO
Civicioglu, P. & Besdok, E. 2013. A conceptual compari- for truss layout and size optimization with fre-
son of the Cuckoo-search, particle swarm optimization, quency constraints. Computers & Structures, 130(0),
differential evolution and artificial bee colony algo- 1021. http://doi.org/http://dx.doi.org/10.1016/j.
rithms. Artificial Intelligence Review (Vol. 39). http:// compstruc.2013.09.002.
doi.org/10.1007/s10462-011-9276-0. Khatibinia, M. & Sadegh Naseralavi, S. 2014. Truss
Du, X., Guo, J. & Beeram, H. 2007. Sequential optimi- optimization on shape and sizing with frequency con-
zation and reliability assessment for multidiscipli- straints based on orthogonal multi-gravitational search
nary systems design. Structural and Multidisciplinary algorithm. Journal of Sound and Vibration, 333(24),
Optimization, 35(2), 117130. http://doi.org/10.1007/ 63496369. http://doi.org/http://dx.doi.org/10.1016/j.
s00158-007-0121-7. jsv.2014.07.027.
Du, X., Sudjianto, A. & Chen, W. 2004. An Integrated Miguel, L.F.F. & Fadel Miguel, L.F. 2012. Shape and size
Framework for Optimization Under Uncertainty Using optimization of truss structures considering dynamic
Inverse Reliability Strategy. Journal of Mechanical constraints through modern metaheuristic algorithms.
Design, 126(4), 562. http://doi.org/10.1115/1.1759358. Expert Systems with Applications, 39(10), 94589467.
Gomes, H.M. 2011. Truss optimization with dynamic http://doi.org/10.1016/j.eswa.2012.02.113.
constraints using a particle swarm algorithm. Expert Valdebenito, M.A. & Schuller, G.I. 2010. A survey on
Systems with Applications, 38(1), 957968. http://doi. approaches for reliability-based optimization. Struc-
org/http://dx.doi.org/10.1016/j.eswa.2010.07.086. tural and Multidisciplinary Optimization, 42(5), 645
Grandhi, R. 1993. Structural optimization with fre- 663. http://doi.org/10.1007/s00158-010-0518-6.
quency constraintsA review. AIAA Journal, 31(12), Vesterstrom, J. & Thomsen, R. 2004. A comparative
22962303. http://doi.org/10.2514/3.11928. study of differential evolution, particle swarm opti-
Ho-Huu, V., Nguyen-Thoi, T., Le-Anh, L. & Nguyen- mization, and evolutionary algorithms on numeri-
Trang, T. 2016. An effective reliability-based improved cal benchmark problems. Evolutionary Computation,
constrained differential evolution for reliability-based 2004. CEC2004. Congress on. http://doi.org/10.1109/
design optimization of truss structures. Advances CEC.2004.1331139.
in Engineering Software, 92, 4856. http://doi. Zuo, W., Bai, J. & Li, B. 2014. A hybrid OCGA approach
org/10.1016/j.advengsoft.2015.11.001. for fast and global truss optimization with frequency
Ho-Huu, V., Nguyen-Thoi, T., Vo-Duy, T. & Nguyen- constraints. Applied Soft Computing, 14, Part C(0),
Trang, T. 2016. An adaptive elitist differential evolution 528535. http://doi.org/http://dx.doi.org/10.1016/j.
for truss optimization with discrete variables. Compu- asoc.2013.09.002.



Applied Mathematics in Engineering and Reliability - Briš, Šnášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

Optimum revenue calculation method to generate competitive


hydroelectric power on Hua Na hydropower

Phan T.H. Long & L.Q. Hung


Water Resources University, Hanoi, Vietnam

Phan Dao
Ton Duc Thang University, Ho Chi Minh City, Vietnam

ABSTRACT: The paper introduces calculation methods, based on dynamic programming with an irregular mesh, for planning reservoir operation with periods reserved for flood control. The program was applied to calculate the operation of the Hua Na hydropower plant with two different operating models. The objective function for generating competitive hydroelectric power is the maximum revenue.

1 INTRODUCTION

1.1 Competitive generating market


The deregulation and development of the electricity market have proved that it is an advanced stage of management science in the energy field. The electricity market (Figure 1) creates a fair competitive business environment for participants and has become an outstanding solution to attract investment and to raise the efficiency of production and business activities of the power industry. Electricity markets have been developed all over the world, not only single national markets but also multi-national ones trading power among countries in the same region. ASEAN member countries such as Singapore, the Philippines, Thailand, Malaysia, etc. have taken positive steps in forming their individual electricity markets and moving toward an ASEAN electricity market in the future. Following the Prime Minister's Decision 63/2013/QD-TTg, the competitive electricity wholesale market would be put under pilot implementation from 2016 to 2018 and was expected to begin official operation from 2019.

Figure 1. Roadmap of competitive power market in Vietnam.

1.2 Hua Na hydropower plant

Hua Na hydropower plant (basic parameters in Table 1) is located in Nghe An province, Que Phong district, Dong Van commune, on the river Chu. With a total investment capital of VND 7,065 trillion, this hydropower plant was the first large-scale project of the Vietnam National Oil and Gas Group.

Following the Prime Minister's Decision 1911/2015/QD-TTg dated 5-Nov-2015 on operating the reservoirs on the Ma river basin, the Hua Na HPP must be operated under new conditions such as a minimum reservoir level, a flood control term, etc. (unlike the process of the design consultants), besides having engaged in the competitive electricity market. This is a problem that needs a calculation solution to optimize revenue and limit excessive discharge.

Table 1. Basic parameters of Hua Na HPP.

                               Unit        Value
Installed Capacity             MW          180
Number of units                            2
Maximum Capacity per unit      MW          90
Minimum Capacity per unit      MW          70
Average Annual Generation      GWh         716.6
Full water supply level        m           240
Pre-flood water level          m           235
Dead water level               m           215
Flood control volume           10^6 m^3    100
Total volume                   10^6 m^3    569.35
Active volume                  10^6 m^3    390.99
Area of reservoir              km^2        5,345
Maximum head                   m           118.30
Design head                    m           100
Minimum head                   m           85.43
Turbine full-gate discharge    m^3/s       203.4

Table 2. Price (VND) per power (kW) and per energy (kWh).

               Dry season          Rainy season
Price          kW       kWh        kW       kWh
Peak hour      250      2000       100      1500
Normal hour    250      900        100      500
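To illustrate how the tariff of Table 2 is applied, the sketch below prices one month's output by splitting it into a peak-hour part and a normal-hour part (the 108 monthly peak hours are derived in Section 2.2). The function name and the energy figures are hypothetical, and the capacity charge (VND/kW) is omitted for simplicity:

```python
# Energy tariff of Table 2 (VND per kWh); capacity charge omitted for simplicity.
ENERGY_PRICE = {
    ("dry", "peak"): 2000, ("dry", "normal"): 900,
    ("rainy", "peak"): 1500, ("rainy", "normal"): 500,
}

PEAK_HOURS_PER_MONTH = 108   # 5 peak hours x 5 working days x ~4.33 weeks
CAPACITY_MW = 180            # installed capacity of Hua Na HPP (Table 1)

def monthly_revenue(energy_mwh: float, season: str) -> float:
    """Split a month's output into peak and normal parts and price each (VND).

    The peak part is capped by what the plant can physically deliver during
    the month's peak hours; the remainder is sold at the normal-hour rate.
    """
    peak_mwh = min(energy_mwh, CAPACITY_MW * PEAK_HOURS_PER_MONTH)
    normal_mwh = energy_mwh - peak_mwh
    kwh_per_mwh = 1000.0
    return (peak_mwh * ENERGY_PRICE[(season, "peak")]
            + normal_mwh * ENERGY_PRICE[(season, "normal")]) * kwh_per_mwh

# Hypothetical example: 60 GWh generated in a dry-season month
print(f"{monthly_revenue(60_000, 'dry') / 1e9:.1f} billion VND")
```

Shifting as much output as possible into peak hours is exactly what makes the revenue-maximizing plan differ from the energy-maximizing one.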

2 METHODOLOGY

2.1 Dynamic programming

Dynamic programming is a technique used for optimizing a multistage process. It is a solution-seeking concept which replaces a problem of n decision variables by n subproblems having preferably one decision variable each. Such an approach allows analysts to make decisions stage by stage, until the final result is obtained. For operating the reservoir, the water levels have been divided from the full water supply level to the dead water level. For one month, with two values of the water level at the beginning and at the end of the month, the values of discharge, head, power and revenue will be calculated.

Figure 2. Model dynamic programming applied to one-dimensional storage reservoir with irregular mesh.

2.2 Dividing power output for peak hours

According to the rules, the Vietnam power market has 5 peak hours in working days. Every week has 5 working days; thus a month has about 108 peak hours. The monthly power output is divided into two parts: one with the high price and the other with the mean price.

2.3 Selection of the calculation term

Following the new conditions, such as flood control, the water level in the reservoir must always be less than 235 m from 01/07 to 30/11; the optimal plan is therefore suggested to be calculated from 01/12 of one year to 30/11 of the next year. However, according to Article 11, from 15/10 the water level can be raised to 240 m on condition of a good hydrological forecast. The two different operating models are therefore to store from 16/10 and to store from 1/12 every year.

2.4 Monthly revenue

Based on the dynamic programming calculation, for each month three values of head, discharge and price have been determined. The prices could be increased, changed by year, or determined by the ratio between the dry season and the rainy season (see Table 2). The result will be better if the monthly price has been determined.

2.5 Irregular meshing

Two meshes have been applied: a regular mesh with steps of 1 m from the dead water level (215 m) to the full supply water level (240 m); and an irregular mesh with two parts: steps of 0.1 m from 230 to 240 m and of 0.5 m from 215 to 230 m.

2.6 Step-wise procedure of the algorithm

Depending on the natural inflow, the release capacity, and the boundary conditions of the reservoir, the maximum values of revenue for all reservoirs (in case of a multiple reservoir system) at every time step of the operating horizon are found. The maximum revenue is computed in the Visual Basic 2010 code of the program. At the end of the period, i.e. 15-Oct or 30-Nov, conventional dynamic programming is run through this corridor (see Figure 2) to find the trajectory, water_level, which gives the maximum objective function value, amount.

3 RESULTS AND DISCUSSIONS

3.1 Meshing method

The use of meshing methods increases the amount and time of calculation, but will result in more consistent calculations; the meshing with a small step in the upper part and a longer step in the lower part corresponds to how the lake volume is shared between those parts. A finer meshing will give better calculation results. However, for multiple reservoirs, the application of finer meshing, for example a step of 1 cm, rapidly increases the amount of calculation. The irregular mesh is useful when applied to the calculation for multiple reservoirs. The model of the irregular mesh is demonstrated in Figure 3.
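The stage-by-stage recursion of Section 2, combined with the irregular mesh of Section 2.5, can be sketched as follows. This is a minimal sketch only: the function names are illustrative, and the toy revenue model (a crude discharge/head proxy with hypothetical monthly prices) stands in for the real computation of discharge, head, power and revenue from the plant's characteristics:

```python
def build_irregular_mesh():
    """Level grid of Section 2.5: 0.5 m steps on 215-230 m, 0.1 m steps on 230-240 m."""
    levels = [215.0 + 0.5 * k for k in range(30)]    # 215.0, 215.5, ..., 229.5
    levels += [230.0 + 0.1 * k for k in range(101)]  # 230.0, 230.1, ..., 240.0
    return [round(z, 1) for z in levels]

def optimise_revenue(levels, months, transition_revenue):
    """Backward dynamic programming over discretised reservoir levels.

    transition_revenue(month, z0, z1) returns the revenue earned in `month`
    when the upstream level moves from z0 to z1, or None if infeasible.
    Returns (maximum total revenue, best level trajectory) for a reservoir
    starting at levels[0].
    """
    n = len(levels)
    value = [0.0] * n                          # value-to-go after the last month
    choice = [[None] * n for _ in months]
    for m in range(len(months) - 1, -1, -1):   # stage by stage, backwards
        stage = [float("-inf")] * n
        for i, z0 in enumerate(levels):
            for j, z1 in enumerate(levels):
                r = transition_revenue(months[m], z0, z1)
                if r is not None and r + value[j] > stage[i]:
                    stage[i], choice[m][i] = r + value[j], j
        value = stage
    path, i = [levels[0]], 0                   # recover the optimal trajectory
    for m in range(len(months)):
        i = choice[m][i]
        path.append(levels[i])
    return value[0], path

# Toy illustration (hypothetical physics): drawdown plus inflow is turbined,
# the mean level stands in for head, and each month has one mean price.
month_price = {"Dec": 2.0, "Jan": 1.5, "Feb": 1.0}

def toy_revenue(month, z0, z1):
    release = z0 - z1 + 2.0          # inflow of 2 m-equivalent per month
    if release < 0.0:                # cannot turbine more than is available
        return None
    head = (z0 + z1) / 2.0 - 210.0   # head above a hypothetical tailwater
    return release * head * month_price[month]

levels3 = [215.0, 217.0, 219.0]
revenue, trajectory = optimise_revenue(levels3, ["Dec", "Jan", "Feb"], toy_revenue)
print(revenue, trajectory)
```

With the 131-point irregular grid of `build_irregular_mesh()` in place of `levels3`, each stage evaluates 131 x 131 transitions, which is exactly the growth in computation that the two-part mesh keeps manageable compared with a uniformly fine grid.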
3.2 Result for chosen year and observed years

The annual mean price is shown in Table 3, and the real mean price of Hua Na HPP from the date it started generating is shown in Figure 4.

Table 3. Annual mean price.

                   Irregular mesh    Regular mesh
Mean price         VND/kWh           VND/kWh
Store since 16/10  1072.222          1071.822
Store since 1/12   1105.223          1105.036

Figure 3. Model dynamic programming applied to two-dimensional storage reservoirs with irregular mesh.

Figure 4. Real mean price of Hua Na HPP (VND/kWh).

For the irregular mesh, the active volume has been used, contrary to the regular mesh. The water level at 30-Jun is 217 m in the case of the irregular mesh, compared to 222 m in the case of the regular mesh. The comparison is shown in Figure 5.

Figure 5. Evolution process of upstream water level in the chosen year.

In 2015, all reservoirs of Vietnam had the minimum water level in the dry season. In the case of Hua Na HPP, the new values have been calculated. Annual revenue results are clearly demonstrated and compared in Tables 4-5.

Table 4. Annual revenue.

                   Irregular mesh    Regular mesh
Setting            Billion VND       Billion VND
Store since 16/10  788.310           787.866
Store since 1/12   772.108           771.688

Table 5. Results in minimum water level conditions in dry season (store since 16/10).

                     Unit       Value
Annual revenue       bil. VND   827.3478
Annual power output  GWh        679.7706
Annual mean price    VND        1217.099

4 CONCLUSIONS

The operators who comply with the new reservoir procedures may have more difficulty, but putting the competitive electricity market into operation will help businesses sell electricity at times of high prices during water and electricity shortages. Moreover, the overall revenue divided by a unit of power volume is higher than under the fixed price.

The use of the irregular meshing method increases the amount and time of calculation, but the result is suitable. The mesh with two parts corresponds to how the lake volume is shared between those parts.



The selected period should not be calculated depending on the hydrological year; it should be based on the operational procedures. The price has been changed between the dry and rainy seasons of the electricity market, which may differ from the seasonal distribution applied to the river. For competitive power plants, the objective of the investor is the largest revenue, which sometimes no longer coincides with the calculation of the greatest amount of electricity.

REFERENCES

Hua Na Hydropower JSC. 2015. Report on the assessment of change in average power of Hua Na NMTD when operating under inter-reservoir operation on the Ma River basin.
Ministry of Industry and Trade. 2014. The circular "Rules to operate the competitive electricity market".
Nandalal, K.D.W. & Bogardi, J.J. 2013. Dynamic Programming Based Operation of Reservoirs: Applicability and Limits. Cambridge.
Prime Minister. 2013. The Decision "Regulation of the roadmap, the conditions and electricity industry structure to form and develop the levels of the electricity market in Vietnam".
Prime Minister. 2015. The Decision "Promulgating the operation process of reservoirs on the Ma River basin".



Maintenance modelling and optimization




A dynamic grouping model for the maintenance planning of complex


structure systems with consideration of maintenance durations

H.C. Vu, A. Barros & M.A. Lundteigen


Department of Production and Quality Engineering, Norwegian University of Science and Technology,
Trondheim, Norway

P. Do
Research Centre for Automatic Control, Lorraine University, Nancy, France

ABSTRACT: In the framework of maintenance optimization of multi-component systems, dynamic grouping has been developed and has become an interesting approach. However, most existing dynamic grouping models assume that the repair time is negligible. This assumption may not always be relevant and limits the application of these models in many real situations. The main objective of this paper is to develop a dynamic grouping model taking into account both preventive and corrective maintenance durations for complex structure systems. An analytical method is developed for the evaluation of the total maintenance cost. This analytical method helps to overcome, when compared with simulation methods, the computational time problem which is often a major issue for the maintenance optimization of systems with a large number of components. A numerical example is presented to show how the proposed grouping approach can be used for the maintenance planning of a complex structure system containing 12 components. A link with the coastal highway route E39 project is also discussed.

1 INTRODUCTION

Many industrial systems involve a high number of components where maintenance is necessary to maintain the performance throughout the system lifetime. Maintenance planning that results in the most economical way of grouping maintenance is essential, and failing to do so will expose the system owner to very large costs. This is a question that is of significant interest to, e.g., the Norwegian Road Administration, which is currently planning to replace many ferry crossings along the coastal E-39 in Norway with new strait crossing bridges and submergible tunnels (the so-called Ferry free E-39 project).

In the last decade, grouping maintenance has been developed and has become an interesting approach in the maintenance optimization framework of multi-component systems (Dekker 1996). The idea of this approach is to take advantage of positive economic dependence to reduce the maintenance cost by jointly maintaining several components at the same time. The positive economic dependence among components implies that costs can be saved when several components are jointly maintained instead of separately (Nicolai & Dekker 2008). Among many grouping maintenance strategies, the dynamic grouping developed in Wildeman et al. (1997) has drawn much attention, thanks to its ability to take into account different online information which may occur over time (e.g. a varying deterioration of components, unexpected opportunities, changes of utilization factors of components). This approach has been developed to deal with different maintenance challenges in many papers, e.g. condition-based model (Bouvard et al. 2011), predictive maintenance model (Van Horenbeek & Pintelon 2013), time limited opportunities (Do et al. 2013), availability constraint under limited access to repairmen (Do et al. 2015). However, these works only deal with series structure systems, in which the economic dependence among components is always positive. Indeed, the maintenance of a group of components can save the setup cost paid for preparation tasks of a maintenance action, and the downtime cost. Recently, in response to the fact that system structures are usually more complex and include redundancy (it could be a mixture of some basic configurations, e.g. series, parallel, series-parallel, k-out-of-n), the dynamic grouping has been developed to take into consideration the complexity due to the system structures (Vu et al. 2014, Vu et al. 2015, Nguyen et al. 2014). In these papers, under the impacts of the complex structures, the economic dependence can be either positive or negative, depending on the considered group of components.



In order to facilitate the maintenance modeling, most of the existing works consider that the maintenance durations are negligible. This assumption may be unrealistic and limits the application of these models in real situations. To this end, dynamic grouping taking into account the preventive maintenance durations is developed in Vu et al. (2014), but even so, the corrective maintenance duration (repair time) is still not investigated. Van Horenbeek & Pintelon (2013) have considered the repair time in their predictive grouping model, and used Monte Carlo simulation to deal with the unpredictability of component failures. Unfortunately, the model is only valid for systems with a series structure where the stoppage of any component leads to the shutdown of the entire system. The situation is likely to be more complicated in the case of complex structures where the system can still operate (partially functioning) when some redundant components fail.

In this paper, a dynamic grouping strategy is developed for complex structure systems taking into account both preventive and corrective maintenance durations. However, under consideration of the maintenance durations, the maintenance model becomes much more complex. In the present paper, an analytical method is developed to find the optimal maintenance planning where the maintenance cost is the criterion. When compared with the Monte Carlo simulation, the proposed analytical method can reduce the computational time significantly, and may be applied to systems with a large number of components.

The rest of the paper is organized as follows. Section 2 is devoted to presenting the maintenance modeling and some general assumptions of this work. The dynamic grouping approach proposed in Vu et al. (2014) and Vu et al. (2015) for complex structure systems is shortly described in Section 3. The development of the dynamic grouping approach to take into account the durations of repair actions is shown in Section 4. In Section 5, a numerical example is proposed to show how the developed grouping approach can be applied to the maintenance planning of a complex system of 12 components. The link between the research and the Ferry free E-39 project is discussed in Section 6. Finally, concluding remarks are made in Section 7.

2 MAINTENANCE MODELING

2.1 General assumptions

During the development of the proposed grouping strategy, the following assumptions are considered.

- Consider a multi-component system with a complex structure where the economic dependence among components could be both positive and negative (i.e. the maintenance cost of a group of components is not equal to the total maintenance cost of all components in the group).
- The system contains n repairable components which have two possible states: operational or failed.
- A time-based model is used to model the time to failure of components. r_i(t) denotes the failure rate of component i, and r_i(t) > 1 (i = 1, ..., n).
- The logistic supports (e.g. repair teams, spare parts) are sufficient, available, and efficient to ensure that the repair at failures and the preventive replacement can be successfully and quickly carried out.
- The maintenance durations of a preventive maintenance action (denoted by τ_i^p) and a corrective maintenance action (denoted by τ_i^c) are constant and bigger than zero.

According to the complex structure, two kinds of components are here distinguished.

- Critical components: a shutdown of a critical component for whatever reason leads to a shutdown of the whole system.
- Non-critical components: the system can partially work when a non-critical component stops.

2.2 Maintenance cost structure

The cost to be paid for a maintenance action (preventive or corrective) of a component contains the three following parts (Vu et al. 2014, Vu et al. 2015).

- A setup cost that can be composed of the cost of crew traveling and preparation costs (e.g. erecting a scaffolding or opening a machine).
- A specific cost that is related to the specific characteristics of the component, such as spare part costs, specific tools and maintenance procedures.
- A downtime cost that has to be paid if the component is a critical one, because the system is not functioning during the maintenance of the component. This downtime cost could be production loss costs, quality loss costs, restart costs, or machine damage costs, etc.

In general, the above costs may change over time, and are not the same for every component or every maintenance action. In this paper, in order to simplify the maintenance model, the setup cost and the downtime cost are assumed to be independent from the component characteristics. Moreover, all the costs are constant and depend on the nature of the maintenance action (preventive or corrective).
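The three-part cost structure of Section 2.2 can be written out directly. In this minimal sketch the class name, field names and all numerical values are hypothetical; the criticality flag plays the role of the indicator function introduced in the next subsection:

```python
from dataclasses import dataclass

@dataclass
class MaintenanceAction:
    setup_cost: float      # crew travelling and preparation tasks
    specific_cost: float   # spare parts, specific tools and procedures
    downtime_rate: float   # mean downtime cost per unit time
    duration: float        # constant maintenance duration (> 0)

    def total_cost(self, component_is_critical: bool) -> float:
        """Setup + specific cost, plus the downtime cost only when the
        maintained component is critical (the whole system stops)."""
        cost = self.setup_cost + self.specific_cost
        if component_is_critical:
            cost += self.downtime_rate * self.duration
        return cost

# Hypothetical PM action on one component
pm = MaintenanceAction(setup_cost=50.0, specific_cost=120.0,
                       downtime_rate=200.0, duration=0.5)
print(pm.total_cost(component_is_critical=True))    # 270.0: downtime included
print(pm.total_cost(component_is_critical=False))   # 170.0: system keeps running
```

Because criticality can change while other components are under maintenance, the flag must be evaluated at the time of the action rather than fixed per component.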



2.3 Preventive Maintenance

The aim of Preventive Maintenance (PM) is to reduce the probability of experiencing a failure of a component/system. In this paper, after a PM action, the maintained component is considered to be a new one. The cost that has to be paid for a PM action of a component i can be written as

C_i^p = S^p + c_i^p + δ_i(t) · c_d^p · τ_i^p    (1)

where S^p and c_i^p are the setup cost and the specific cost of a PM action respectively; δ_i(t) is an indicator function, which presents the criticality of the component i at time t, defined as

δ_i(t) = 1 if component i is critical at time t; 0 otherwise    (2)

c_d^p · τ_i^p is the downtime cost that has to be paid during the replacement of the critical component i; and c_d^p is the mean downtime cost per unit time.

Note that the consideration of both the maintenance durations and the complexity of the system structure leads to the possibility that a non-critical component can become a critical one within the maintenance period of the other components. That is the reason why the indicator function δ_i is presented as a function of time (Eq. 2).

2.4 Corrective Maintenance

During the system operation, if a component i fails, the component is then immediately repaired. After the minimal repair action, the repaired component is considered to be in the state that it had just before the failure. As with the preventive maintenance, when a Corrective Maintenance (CM) action is carried out on a component i, it requires a CM cost, denoted by C_i^c, which can be expressed as follows

C_i^c = S^c + c_i^c + δ_i(t) · c_d^c · τ_i^c    (3)

where S^c, c_i^c, and c_d^c are the setup cost, the specific cost, and the mean downtime cost per time unit related to a CM action of the component i respectively.

In the next section, the dynamic grouping strategy developed in (Vu et al. 2014, Vu et al. 2015) will be briefly presented. Under this strategy, the setup cost and the downtime cost paid for PM can be saved by simultaneously performing some PM actions. Note that, in these papers, the grouping of CM actions is not allowed due to the limitations of the logistic support.

3 DYNAMIC GROUPING APPROACH FOR COMPLEX STRUCTURE SYSTEMS

In a complex structure system, the maintenance grouping of a group of components can have both negative and positive impacts on the system function depending on the criticality of the group and its components. Thus, the consideration of the criticality in grouping optimization is important, and can help to improve the grouping performance (Vu et al. 2014, Vu et al. 2015).

The dynamic grouping model, developed for complex structure systems, contains the four following phases (Fig. 1).

- Phase 1: System analysis. In this phase, the solution for determining the criticality of components (δ_i) or groups of components (δ_{G^k}) at time t with respect to a specific system function is developed. For this purpose, the reliability block diagram is used in this paper.
- Phase 2: Individual optimization. This phase is designed to determine the long-term maintenance plan for each component separately by minimizing its long-run expected maintenance cost rate. In this phase, the economic dependency among components is not considered, and the criticality of components is considered to be fixed.
- Phase 3: Grouping optimization. A specific short-term horizon and all the PM actions of components within the horizon are firstly identified based on the individual maintenance plan obtained from phase 2. These PM actions are then grouped to take advantage of the positive economic dependence among components. The grouping solution is found by maximizing the total economic profit within the considered horizon.
- Phase 4: Update of the grouping solution. The grouping solution obtained in phase 3 needs to be updated in the two following cases: grouping planning for a new short-term horizon (rolling horizon); occurrences of dynamic contexts such as maintenance opportunities, changes in production planning, changes in operation conditions (Vu et al. 2014).

Figure 1. Dynamic grouping for complex structure systems.

The above paragraph presents the four phases of the dynamic grouping approach developed for



complex structure systems. In the next section, we will describe how this approach can be developed to take into account both the complexity of the system structure and the repair time.

4 DYNAMIC GROUPING APPROACH WITH TAKING INTO ACCOUNT THE REPAIR TIME

The consideration of the repair time does not lead to a complete change of the above grouping approach. Indeed, to take into account the repair time, only phase 2 and phase 3 are developed and presented in this section. The other phases of the approach remain unchanged and can be found in more detail in Vu et al. (2014) and Vu et al. (2015).

4.1 Individual optimization

As mentioned above, in this phase, the long-term maintenance plan is separately determined for each component. To do this, the age-based replacement strategy (Barlow & Hunter 1960) is usually chosen thanks to its high performance at component level. When the repair time is considered, the replacement decisions based on the component's age will face many difficulties in maintenance modeling and maintenance optimization due to the unpredictability of the failures. For this reason, the calendar-based replacement strategy is used in this paper (Fig. 2).

Figure 2. Calendar-based replacement strategy.

According to the calendar-based replacement strategy, the component i is replaced at fixed-time intervals T_i and minimally repaired at its failures. The long-term expected maintenance cost rate of the component i is calculated based on the renewal theory as follows

CR_i(T_i) = [C_i^p + C_i^c · Φ_i(T_i)] / (T_i + τ_i^p)    (4)

where Φ_i(T_i) is the mean number of failures of component i on (0, T_i]. Under the minimal repair, Φ_i(T_i) is equal to

Φ_i(T_i) = ∫_0^{T_i} r_i(v_i(t)) dt = ∫_0^{x_i} r_i(t) dt    (5)

where v_i(t) is the age (total operating time) of component i at time t. We have v_i(0) = 0, and v_i(T_i) = x_i.

Equation 4 can be rewritten as follows

CR_i(x_i) = [C_i^p + C_i^c · ∫_0^{x_i} r_i(t) dt] / [x_i + τ_i^c · ∫_0^{x_i} r_i(t) dt + τ_i^p]    (6)

The optimal replacement interval of the component i (denoted by T_i*) is then determined by minimizing its long-term expected maintenance cost rate:

x_i* = arg min_{x_i} CR_i(x_i)    (7)

and

T_i* = x_i* + τ_i^c · ∫_0^{x_i*} r_i(t) dt    (8)

The corresponding minimal maintenance cost rate is

CR_i* = [C_i^p + C_i^c · ∫_0^{x_i*} r_i(t) dt] / (T_i* + τ_i^p)    (9)

4.2 Grouping optimization

Individual maintenance plan. Based on the replacement intervals obtained in the previous phase, in this phase, the individual maintenance dates of components in a specific short-term horizon are determined.

In detail, consider a planning horizon PH between a and b. The first PM activity of the component i in PH, denoted by t_i^1, is then determined as follows

t_i^1 = T_i* − d_i(a) + a    (10)

where d_i(a) is the time between a and the last PM activity of the component i before a.

The other PM activities of the component i in PH can be determined as

t_i^j = t_i^{j−1} + τ_i^p + T_i*   if j > 1 and t_i^j ≤ b    (11)

where t_i^j denotes the jth PM activity of the component i in PH.

Grouping solution. A partition of {1, ..., N} is a collection of m mutually exclusive groups G^1, ..., G^m which cover all N PM activities in PH.
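The individual optimization of Section 4.1 and the schedule of Eqs. 10-11 can be sketched numerically. This is a hedged sketch under an extra assumption of our own: a Weibull failure rate, for which the cumulative hazard ∫_0^x r_i(t) dt has the closed form (x/η)^β. The function names and all parameter values are hypothetical:

```python
def make_component(Cp, Cc, tau_p, tau_c, beta, eta):
    """Calendar-based replacement model of Section 4.1 for one component.

    Assumes a Weibull failure rate, so the cumulative hazard
    int_0^x r(t) dt has the closed form (x / eta) ** beta.
    """
    def H(x):
        return (x / eta) ** beta

    def cost_rate(x):
        # Eq. 6: long-term expected cost per unit of calendar time
        return (Cp + Cc * H(x)) / (x + tau_c * H(x) + tau_p)

    def optimise(x_grid):
        x_star = min(x_grid, key=cost_rate)       # Eq. 7 (grid search)
        T_star = x_star + tau_c * H(x_star)       # Eq. 8: calendar interval
        return x_star, T_star, cost_rate(x_star)  # Eq. 9: minimal cost rate

    return cost_rate, optimise

def pm_dates(T_star, tau_p, a, b, d_a):
    """PM dates of one component in the horizon [a, b] (Eqs. 10-11)."""
    t = a + T_star - d_a          # first PM activity in the horizon
    dates = []
    while t <= b:
        dates.append(t)
        t += tau_p + T_star       # next PM: replacement interval plus PM duration
    return dates

# Hypothetical component: PM cost 100, CM cost 500, PM/CM durations 2 and 5,
# Weibull shape 2.5 and scale 1000 (time units arbitrary)
cost_rate, optimise = make_component(Cp=100.0, Cc=500.0, tau_p=2.0,
                                     tau_c=5.0, beta=2.5, eta=1000.0)
grid = [10.0 * k for k in range(1, 301)]          # candidate ages 10 .. 3000
x_star, T_star, cr_star = optimise(grid)
print(x_star, T_star, cr_star)
print(pm_dates(T_star, tau_p=2.0, a=0.0, b=2000.0, d_a=50.0))
```

Note how the calendar interval T* of Eq. 8 stretches the operating age x* by the expected time spent in minimal repairs, which is precisely what the age analysis of Section 4.2 must track.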



G_l ∩ G_k = ∅, ∀ l ≠ k    (12)

and

G_1 ∪ G_2 ∪ ... ∪ G_m = {1, ..., N}    (13)

A grouping solution, denoted by GS, is a partition of {1, ..., N} such that all PM activities in each group are jointly executed at the same time.

Evaluation of the grouping performance. The cost saving of the grouping maintenance compared to the individual maintenance is used as the only criterion to evaluate the performance of a grouping solution. The cost saving of a grouping solution GS is

CS(GS) = Σ_{k=1}^{m} CS(G_k) = Σ_{k=1}^{m} (U_{G_k} − H¹_{G_k} − H²_{G_k} − H³_{G_k})    (14)

where

CS(G_k) is the cost saving when all components of group G_k are jointly maintained.

U_{G_k} is the saving of the PM setup cost when all n_k components of group G_k are grouped. We consider that only one setup cost has to be paid when a group of components is maintained together.

U_{G_k} = (n_k − 1) S^p    (15)

H¹_{G_k} is the penalty cost due to the changes of the maintenance dates from the optimal individual ones t_i^j to the group execution date t_{G_k}.

H¹_{G_k} = Σ_{i_j ∈ G_k} {C_i^c [Φ_i(0, t_{G_k}) − Φ_i(0, t_i^j)] − CR_i (t_{G_k} − t_i^j)}    (16)

H²_{G_k} is the cost related to the change in the total planned downtime of the system.

H²_{G_k} = c_d^p [μ_{G_k} ω_{G_k} − Σ_{i_j ∈ G_k} μ_i ω_i^p]    (17)

where ω_{G_k} is the maintenance duration of group G_k. When the number of repair teams is sufficient, we have ω_{G_k} = max_{i_j ∈ G_k} ω_i^p.

H³_{G_k} is the cost related to the change in the total unplanned downtime of the system.

H³_{G_k} = c_d^c [(1 − μ_{G_k}) ω_{G_k} Σ_{q ∈ Q_{G_k}} Φ_q(t_{G_k}, t_{G_k} + ω_{G_k}) − Σ_{i_j ∈ G_k} (1 − μ_i) ω_i^p Σ_{l ∈ L_i} Φ_l(t_i^j, t_i^j + ω_i^p)]    (18)

where Q_{G_k} and L_i are the sets of non-critical components which become critical ones during the maintenance of group G_k and of component i respectively.

The following points need to be noted during the calculation of CS(GS).

The optimal value of t_{G_k} is determined as

t_{G_k}* = arg min_{t_{G_k}} {H¹_{G_k} + c_d^c (1 − μ_{G_k}) ω_{G_k} Σ_{q ∈ Q_{G_k}} Φ_q(t_{G_k}, t_{G_k} + ω_{G_k})}    (19)

The calculation of CS(GS) is mostly based on the mean number of failures of the components in different intervals; in other words, it is based on the age of the components at different instants. Unfortunately, the determination of the age of the components over time becomes a real challenge when the repair time is taken into account. To overcome this problem, numerical simulation methods have been widely used in the literature. The use of simulation methods leads to a high computational time and to difficulties in the case of systems with a large number of components. For this reason, in the next paragraph, an analytical method is developed to determine the age of the components.

Age analysis method. This analytical method is developed to calculate the age of the components at the instants t_i^j and t_i^j + ω_i^p in the individual maintenance plan, and at the instants t_{G_k} and t_{G_k} + ω_{G_k} in the grouped maintenance plan. The following steps of the proposed method are separately and repeatedly applied to the two above plans.

Step 1: Determination of a partition of PH. A partition of PH containing V sub-intervals (PH_1, PH_2, ..., PH_V) is determined so that a component can have only one state, either under PM or not, in each sub-interval.

Step 2: Analyze the system structure in a sub-interval PH_v = [a_v, b_v]. This analysis is done in order to determine the following sets of components:
G_v is the set of components which are preventively maintained in PH_v.
A_v is the set of components which are not functioning due to the PM of G_v.
B_v is the set of components which are not in G_v and A_v.
For a component i in B_v, a set of components C_v^i is determined such that the PM of any component in C_v^i leads the component i to stop.

Step 3: Calculate the age of the components at b_v.
x_i(b_v) = 0 if i ∈ G_v.
x_i(b_v) = x_i(a_v) if i ∈ A_v.



For a component i in B_v, x_i(b_v) is determined by solving the following equation system:

a_v + [x_i(b_v) − x_i(a_v)] + ω_i^c Φ_i(a_v, b_v) + Σ_{j ∈ C_v^i} ω_j^c Φ_j(a_v, b_v) = b_v, ∀ i ∈ B_v    (20)

Step 4: Return to Step 2 for all sub-intervals from v = 1 to V.

Optimal grouping solution. Based on the above calculation of the grouping performance, we can compare different grouping solutions and determine the optimal one:

GS* = arg max_{GS} CS(GS)    (21)

The finding of the optimal grouping solution is an NP-complete problem because the number of possible grouping solutions increases very quickly with the number of PM activities in PH. Consequently, in this paper, the Genetic Algorithm (Holland 1975) is used to search for the optimal grouping solution.

5 NUMERICAL EXAMPLE

In order to show how our dynamic grouping approach can be applied to the maintenance planning with the repair time taken into account, a specific system and its data are randomly created. The system contains 12 components with the reliability block diagram shown in Figure 3.

The failure behavior of the components is described by the Weibull distribution with scale parameter α_i and shape parameter β_i > 1. The failure rate of component i is

r_i(t) = (β_i / α_i) (t / α_i)^{β_i − 1}    (22)

The data of the components are given in Table 1. The other costs are S^p = S^c = 15, c_d^p = 80, and c_d^c = 120.

Figure 3. Reliability block diagram.

Table 1. Given data of components.

Component   α_i    β_i    c_i^p   ω_i^p   c_i^c   ω_i^c
1           253    2.45   155     1       22      0.35
2           205    2.35   225     3       36      0.81
3           117    1.87   785     1       66      0.29
4           190    2.00   245     2       28      0.54
5           119    1.65   375     1       92      0.36
6           284    2.50   300     2       76      0.44
7           297    3.05   345     1       55      0.27
8           108    1.55   555     1       102     0.31
9           200    1.95   190     3       45      0.39
10          125    1.85   350     2       44      0.32
11          189    2.75   460     2       30      0.78
12          275    1.85   130     1       24      0.34

Table 2. Intermediate results and optimal replacement intervals of the components.

Component   μ_i   C_i^p   C_i^c   T_i*     CR_i*
1           1     250     79      348.75   1.21
2           0     240     51      352.19   1.18
3           0     800     81      434.42   3.97
4           0     260     43      471.75   1.10
5           1     470     150     310.39   3.83
6           1     475     143     390.06   2.02
7           1     440     102     379.20   1.72
8           0     570     117     444.85   3.61
9           0     205     60      385.20   1.08
10          0     365     59      367.79   2.15
11          0     475     45      371.72   2.02
12          0     145     39      612.40   0.52

5.1 Individual optimization

In this phase, the block replacement strategy is used for the maintenance planning at component level. The intermediate results and the optimal maintenance frequencies of the components are presented in Table 2. The long-term maintenance cost rate of the system when all PM activities are individually performed is

CR_sys = Σ_{i=1}^{12} CR_i* = 24.41    (23)

5.2 Grouping optimization

The new system is put into operation at time zero; therefore, we have x_i(0) = 0, d_i(0) = 0, and t_i^1 = T_i for all i = 1, ..., n. Consider a finite interval PH = [a, b] in which each component is preventively maintained once. We have a = 0, and b = max_{i=1,...,12} t_i^1 = 612.40.
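The individual optimization of Section 5.1 can be cross-checked numerically. The sketch below is our own reconstruction (the authors used Matlab): it minimizes the cost rate of eq (9) for the data of Tables 1 and 2, where the age x_i at calendar time T_i is assumed to solve x_i + ω_i^c Φ_i(x_i) = T_i, i.e. ageing is suspended during minimal repairs — an assumption on our part that reproduces the reported T_i* and CR_i* closely:

```python
from scipy.optimize import brentq, minimize_scalar

# (alpha_i, beta_i, omega_i^p, omega_i^c, C_i^p, C_i^c) taken from Tables 1 and 2
components = [
    (253, 2.45, 1, 0.35, 250,  79), (205, 2.35, 3, 0.81, 240,  51),
    (117, 1.87, 1, 0.29, 800,  81), (190, 2.00, 2, 0.54, 260,  43),
    (119, 1.65, 1, 0.36, 470, 150), (284, 2.50, 2, 0.44, 475, 143),
    (297, 3.05, 1, 0.27, 440, 102), (108, 1.55, 1, 0.31, 570, 117),
    (200, 1.95, 3, 0.39, 205,  60), (125, 1.85, 2, 0.32, 365,  59),
    (189, 2.75, 2, 0.78, 475,  45), (275, 1.85, 1, 0.34, 145,  39),
]

def cost_rate(T, alpha, beta, wp, wc, Cp, Cc):
    """Cost rate CR_i(T_i) of eq (9) for a Weibull failure rate (eq (22)).

    Phi(x) = (x/alpha)**beta is the integrated failure rate; the age x at
    calendar time T solves x + wc * Phi(x) = T (assumed repair-time model).
    """
    phi = lambda x: (x / alpha) ** beta
    x = brentq(lambda x: x + wc * phi(x) - T, 0.0, T)
    return (Cp + Cc * phi(x)) / (T + wp)

results = []
for alpha, beta, wp, wc, Cp, Cc in components:
    res = minimize_scalar(cost_rate, bounds=(1.0, 2000.0), method="bounded",
                          args=(alpha, beta, wp, wc, Cp, Cc))
    results.append((res.x, res.fun))

for i, (T, CR) in enumerate(results, 1):
    print(f"component {i:2d}: T* = {T:7.2f}, CR* = {CR:.2f}")
print("system cost rate:", round(sum(CR for _, CR in results), 2))  # ~24.41, cf. eq (23)
```

With these inputs, the first component comes out at T* ≈ 348.7 and CR* ≈ 1.21, matching Table 2 up to the rounding of the cost data.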



Table 3. Optimal grouping solution.

Group   Components          μ_{G_k}   t_{G_k}   ω_{G_k}   CS(G_k)
1       1, 4, 5, 6, 7, 10   1         367.08    2         326.29
2       2, 9, 11            0         373.38    3         62.77
3       3, 8, 12            0         458.74    1         95.98

The total maintenance cost of the system in PH when all PM activities are individually performed, denoted TC^I, is calculated as follows

TC^I = CR_sys (b − a) = 14948.68    (24)

In order to find the optimal grouping solution, the Genetic Algorithm is used with the following parameters: crossover probability = 0.8, mutation probability = 0.02, population size = 60, number of generations = 500. The program is implemented in Matlab R2015a on a DELL computer (Intel Core i7, 2.6 GHz, 16 GB RAM). The computational time is approximately 40 minutes. The optimal grouping solution obtained by the program is reported in Table 3.

The total economic profit of the optimal grouping solution is

CS(GS*) = Σ_{k=1}^{3} CS(G_k) = 485.03    (25)

The total maintenance cost of the system in PH when the PM activities are grouped, denoted TC^G, is calculated as follows

TC^G = TC^I − CS(GS*) = 14463.65    (26)

From the obtained results, we can conclude that the maintenance grouping helps to reduce the total maintenance cost of the system in PH from 14948.68 to 14463.65. The reduction is equal to 3.24% of the total maintenance cost of the system in PH when all PM activities are individually performed.

6 THE LINK BETWEEN THE PROPOSED RESEARCH AND THE COASTAL HIGHWAY ROUTE E39 PROJECT

In this section, we briefly discuss the link between the presented research and the E39 project. Norway's coastal highway E39 is part of the European trunk road system. The route runs from Kristiansand in the south to Trondheim in central Norway, a distance of almost 1100 km.

Figure 4. Submerged floating tunnel bridge concept.

There are eight wide and deep fjords along the route. The fjord crossings will require massive investments and larger bridges than any previously installed in Norway. Figure 4 describes the submerged floating tunnel bridge concept for crossing the Sognefjord, which is both deep and wide, and is considered challenging to cross. More information about the E39 project can be found at http://www.vegvesen.no/Vegprosjekter/ferjefriE39/.

The construction and operation of such a superstructure system face many technological challenges, including maintenance planning problems. This research is therefore partially funded by the E39 project, and motivated by the following maintenance planning problems.

The bridge is a complex system containing a large number of components which are interdependent. Moreover, the bridge availability is critical due to the important impacts of its closures on the traffic, the environment, and people's safety. For this purpose, the developed dynamic grouping approach based on the analytical method can help to improve the bridge availability, and to avoid the computational time problem in the maintenance optimization.

The taking into account of the repair time in the maintenance modeling is necessary when the estimated repair time is considerable.

Given the above efforts, many challenges related to the maintenance planning of the superstructure system still exist, such as the uncertainties of the data and their impacts on the maintenance performance, the imperfect preventive maintenance, the maintainability of the components, etc. These challenges will be the objectives of our future research.

7 CONCLUSION

In this paper, a dynamic grouping approach is



developed for the maintenance planning of complex structure systems with consideration of both preventive and corrective maintenance durations. To overcome the problem of computational time, an age analysis method is proposed, given the complexity of the maintenance optimization problem when the maintenance durations are considered. The numerical example describes how the proposed grouping approach can be applied to the maintenance planning of a system of 12 components. The obtained results confirm the advantage of the proposed grouping strategy and show that the computational time is reasonable. Finally, it should be noted that this research is motivated by the real problems that we have to face within the scope of the E39 project. However, the application of the proposed approach is still limited, and needs to be investigated in future research.

ACKNOWLEDGMENTS

This research was supported by the Ferry-free E39 project and the Department of Production and Quality Engineering, Norwegian University of Science and Technology.

REFERENCES

Barlow, R. & L. Hunter (1960). Optimum preventive maintenance policies. Operations Research 8(1), 90–100.
Bouvard, K., S. Artus, C. Berenguer, & V. Cocquempot (2011). Condition-based dynamic maintenance operations planning & grouping. Application to commercial heavy vehicles. Reliability Engineering & System Safety 96(6), 601–610.
Dekker, R. (1996). Applications of maintenance optimization models: a review and analysis. Reliability Engineering & System Safety 51(3), 229–240.
Do, P., A. Barros, C. Berenguer, K. Bouvard, & F. Brissaud (2013). Dynamic grouping maintenance with time limited opportunities. Reliability Engineering & System Safety 120, 51–59.
Do, P., H.C. Vu, A. Barros, & C. Berenguer (2015). Maintenance grouping for multi-component systems with availability constraints and limited maintenance teams. Reliability Engineering & System Safety 142, 56–67.
Holland, J. (1975). Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence. Ann Arbor, MI, United States: University of Michigan Press.
Nguyen, K.A., P. Do, & A. Grall (2014). Condition-based maintenance for multi-component systems using importance measure and predictive information. International Journal of Systems Science: Operations & Logistics 1(4), 228–245.
Nicolai, R. & R. Dekker (2008). Optimal maintenance of multi-component systems: a review. In Complex System Maintenance Handbook, pp. 263–286. London, United Kingdom: Springer.
Van Horenbeek, A. & L. Pintelon (2013). A dynamic predictive maintenance policy for complex multi-component systems. Reliability Engineering & System Safety 120, 39–50.
Vu, H.C., P. Do, A. Barros, & C. Berenguer (2014). Maintenance grouping strategy for multi-component systems with dynamic contexts. Reliability Engineering & System Safety 132, 233–249.
Vu, H.C., P. Do, A. Barros, & C. Berenguer (2015). Maintenance planning and dynamic grouping for multi-component systems with positive and negative economic dependencies. IMA Journal of Management Mathematics 26(2), 145–170.
Wildeman, R.E., R. Dekker, & A. Smit (1997). A dynamic policy for grouping maintenance activities. European Journal of Operational Research 99(3), 530–551.



Applied Mathematics in Engineering and Reliability – Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

A parametric predictive maintenance decision framework considering
the system health prognosis accuracy

K.T. Huynh & A. Grall
ICD, ROSAS, LM2S, Université de technologie de Troyes, UMR 6281, CNRS, Troyes, France

C. Bérenguer
Université Grenoble Alpes, GIPSA-lab, Grenoble, France
CNRS, GIPSA-lab, Grenoble, France

ABSTRACT: Nowadays, health prognosis is widely recognized as a significant lever to improve the maintenance performance of modern industrial systems. Nevertheless, how to efficiently exploit prognostic information for maintenance decision-making support is still a very open and challenging question. In this paper, we attempt to contribute to the answer by developing a new parametric predictive maintenance decision framework considering the improving health prognosis accuracy. The study is based on a single-unit deteriorating system subject to a stochastic degradation process, and to maintenance actions such as inspection and replacement. Within the new framework, the system health prognosis accuracy is used as a condition index to decide whether or not to carry out an intervention on the system. The associated mathematical cost model is also developed and optimized on the basis of the semi-regenerative theory, and is compared to a more classical benchmark framework. Numerical experiments emphasize the performance of the proposed framework, and confirm the interest of introducing the system health prognosis accuracy in maintenance decision-making.

1 INTRODUCTION

Maintenance involves a wide range of activities, such as inspection, testing, repair, replacement, etc., in order to extend the equipment life, improve the equipment availability, as well as to retain the equipment in its proper condition. Since these activities are costly, maintenance strategies are needed to organize and schedule them in a logical and economical manner. In the literature, maintenance strategies have evolved from the naive breakdown maintenance, to the blind time-based maintenance, and lately towards the sophisticated Condition-Based Maintenance (CBM) (Jardine et al. 2006). Nowadays, with the development of prognostics and health management technologies, a new trend of maintenance strategies called predictive maintenance has recently emerged (Shin and Jun 2015). Theoretically, predictive maintenance belongs to the class of CBM strategies; but unlike the conventional ones, it bases the maintenance decisions on the system health prognostic information instead of the diagnostic information. Such a strategy anticipates more efficiently the system failures, allows timely interventions on the system, and hence promises a better performance than conventional CBM strategies. However, various recent works on the comparison between predictive maintenance and conventional CBM strategies (see e.g., (Khoury et al. 2013, Huynh et al. 2014)) have shown the contrary: the former is not really more profitable than the latter. This leads us to think that the current predictive maintenance frameworks (used e.g., in the above references) are not suitable to efficiently exploit the prognostic information for maintenance decision-making. Faced with this issue, the present paper attempts to develop a new predictive maintenance decision framework which can overcome the drawbacks of the more classical ones.

More precisely, the study is based on a single-unit deteriorating system subject to maintenance actions such as inspection and replacement. The system degradation evolution is described by a homogeneous Gamma process, and the system fails when the degradation level exceeds a fixed failure threshold. Such a degradation and failure model allows us to compute and analyze some prognostic condition indices characterizing the future health state of the system. The standard deviation and the mean value of the system Residual Useful Life (RUL) are the prognostic condition indices of interest, and we investigate how these indices can be exploited to make pertinent maintenance decisions. In fact, adopting the well-known parametric structure-based decision rule (Bérenguer 2008), the former is used to decide whether or not to carry out an intervention (i.e., inspection or replacement) on the system, while the latter is used to determine proper replacement times. A framework with such maintenance decisions is known as a parametric predictive maintenance decision framework considering the improving health prognosis accuracy. To quantify the performance of the new maintenance framework, we develop and optimize its mathematical cost model on the basis of the long-run expected maintenance cost rate and the semi-regenerative theory. The comparisons with a more classical benchmark framework under various configurations of maintenance costs and system characteristics allow us to emphasize the performance of the proposed framework, and to justify the interest of introducing the system health prognostic information in maintenance decision-making.

The remainder of this paper is organized as follows. Section 2 is devoted to modeling the system and to computing the associated condition indices. Section 3 deals with the detailed description and the theoretical analyses of the considered predictive maintenance decision frameworks, including the maintenance assumptions, the most well-known maintenance framework, and the new framework considering the system health prognosis accuracy. In Section 4, the maintenance cost models of these frameworks are developed and optimized. The assessment and discussions on the performance of the new maintenance framework are carried out in Section 5. Finally, the paper ends with some conclusions and perspectives.

2 SYSTEM MODELING, CONDITION INDICES

2.1 System modeling

Consider a single-unit deteriorating system consisting of one component or one group of associated components (from the maintenance viewpoint). The system suffers an underlying degradation process which can cause random failures. Such a process may be a physical deterioration process such as cumulative wear, crack growth, erosion, corrosion, fatigue, etc. (Grall et al. 2002); or it may be an artificial process describing the phenomenon that the system health state or its performance worsens with usage and age (Xu et al. 2008). For such a system, it is recommended by Singpurwalla (1995) to base the degradation modeling on time-dependent stochastic processes. The notion of process helps us to describe more finely the behavior of the system, hence allowing, e.g., a more accurate prediction of its RUL (Si et al. 2011). By this way, let X_t be a scalar random variable representing the accumulated degradation of the system at time t ≥ 0; without any maintenance operation, {X_t}_{t ≥ 0} is an increasing stochastic process with X_0 = 0 (i.e., the system is initially new). Moreover, assuming that the degradation increment between 2 times t and s (t ≤ s), X_s − X_t, is s-independent of the degradation levels before t, one can apply any monotone stochastic process from the Lévy family (Abdel-Hameed 2014) to model the evolution of the system degradation. In the present paper, the well-known homogeneous Gamma process with shape parameter α and scale parameter β is used. The choice of such a process for degradation modeling has been justified by diverse practical applications (e.g., corrosion damage mechanism (Kallen and van Noortwijk 2005), carbon-film resistors degradation (Wang 2009), SiC MOSFET threshold voltage degradation (Santini et al. 2014), fatigue crack growth (Bousquet et al. 2015), actuator performance loss (Langeron et al. 2015), etc.), and it is considered appropriate by experts (Blain et al. 2007). Moreover, using the Gamma process can make the mathematical formulation feasible. As such, for t ≤ s, the degradation increment X_s − X_t follows a Gamma law with probability density function (pdf)

f_{α(s−t),β}(x) = β^{α(s−t)} x^{α(s−t)−1} e^{−βx} / Γ(α(s−t)) · 1_{{x ≥ 0}},    (1)

and cumulative distribution function (cdf)

F_{α(s−t),β}(x) = γ(α(s−t), βx) / Γ(α(s−t)),    (2)

where Γ(α) = ∫_{ℝ+} z^{α−1} e^{−z} dz and γ(α, x) = ∫_0^x z^{α−1} e^{−z} dz denote the complete and the lower incomplete Gamma functions respectively, and 1_{{·}} denotes the indicator function which equals 1 if the argument is true and 0 otherwise. The couple of parameters (α, β) allows modeling various degradation behaviors from almost-deterministic to very chaotic, and the average degradation rate and the associated variance are m = α/β and σ² = α/β² respectively. When degradation data are available, these parameters can be estimated by classical statistical methods such as maximum likelihood estimation, moments estimation, etc. (Van Noortwijk 2009).

Associated with the degradation process, we use a threshold-type model to define the system failure. For economic (e.g., poor products quality, high consumption of raw material) or safety reasons (e.g., high risk of hazardous breakdowns), a


system is usually declared as failed when it is no longer able to fulfill its mission in an acceptable condition even if it is still functioning. A high system degradation level is thus unacceptable. According to this view, we consider that the system fails as soon as its degradation level exceeds a critical prefixed threshold L. The system failure time τ_L is thus expressed as

τ_L = inf{t : X_t ≥ L}.    (3)

2.2 Condition indices

Condition indices are indices characterizing the health state of a system, based on which one can make a maintenance decision (Huynh et al. 2014). Such indices may be the result of the real-time diagnosis of impending failures (i.e., diagnostic condition indices) or of the prognosis of the future system health (i.e., prognostic condition indices). For our considered system, the degradation level returned by an inspection at time τ_i, X_{τ_i}, is a diagnostic condition index, because it can define the system health state at the current time τ_i. Note that the diagnosis in reality is not a simple task and may require sophisticated techniques (Travé-Massuyès 2014). However, since the diagnosis is beyond the scope of the paper, we simply assume that the diagnosis is attached to inspection operations which can perfectly reveal the system degradation level. Given the diagnostic information and the degradation and failure model, one can predict prognostic condition indices. In the literature, the RUL, defined as the length from the current time to the end of the system useful life, is a well-known prognostic index because it can provide an idea about how long a system at a particular age will still survive. The concept of RUL has been widely investigated by many works in the Prognostics and Health Management research area (Liao and Köttig 2014). In this paper, a so-called conditional RUL is considered, and its mathematical expression at time τ_i given the degradation level X_{τ_i} is defined as (Banjevic 2009)

ρ(τ_i | X_{τ_i}) = (τ_L − τ_i | τ_L > τ_i) · 1_{{X_{τ_i} < L}},    (4)

where τ_L is the system failure time given from (3). Obviously, ρ(τ_i | X_{τ_i}) is a random variable; it can then be characterized by its mean value and its standard deviation. Indeed, the former is usually used to locate the distribution of ρ(τ_i | X_{τ_i}), while the latter is adopted to describe the variability existing in ρ(τ_i | X_{τ_i}). The mean value of ρ(τ_i | X_{τ_i}) is known as the conditional mean residual lifetime (MRL) of the system (Huynh et al. 2014). At time τ_i, given X_{τ_i} = x, the conditional MRL and the associated standard deviation are computed by

μ(τ_i | x) = ∫_{ℝ+} F_{αu,β}(L − x) du    (5)

σ(τ_i | x) = (2 ∫_{ℝ+} u F_{αu,β}(L − x) du − μ²(τ_i | x))^{1/2}    (6)

where F_{αu,β}(·) is derived from (2). Applying (5) and (6) to the system characterized by L = 15 and α = β = 1/3, we obtain the shape of μ(τ_i | x) and σ(τ_i | x) as in Figure 1. It is easy to prove that μ(τ_i | x) and σ(τ_i | x) are non-increasing functions in x. The RUL prognosis thus becomes more precise for higher values of x. Moreover, μ(τ_i | x) and σ(τ_i | x) depend only on x. Thus, given the degradation and failure model as in Section 2.1, the considered diagnostic and prognostic condition indices are equivalent. Even so, each of them has its own meaning in maintenance decision-making support. A proper maintenance framework should take care of this point.

Figure 1. μ(τ_i | x) and σ(τ_i | x) with respect to x.

3 MAINTENANCE DECISION FRAMEWORKS

We propose in this section a new predictive maintenance decision framework considering the improving health prognosis accuracy. The framework relies on the parametric structure-based decision rule described in (Bérenguer 2008). To better understand the originality of the proposed framework, we first introduce the assumptions on the maintained system; then we analyze the maintenance decision framework most well-known in the literature through a representative maintenance



strategy. Our maintenance decision framework is introduced next. To illustrate how to use this new framework, a predictive maintenance strategy is also derived. The proposed illustrations in this section are based on the system defined by α = β = 1/3 and L = 15, and on the optimal configuration of the considered strategies when the set of maintenance costs C_i = 5, C_p = 50, C_c = 100 and C_d = 25 is applied.

3.1 Maintenance assumptions

Consider the system presented in Section 2.1; we assume that its degradation level is hidden, and its failure state is non-self-announcing. Inspection activities are then necessary to reveal the system state. The notion of inspection here is not simply the data collection, but also the feature extraction from the collected data, the construction of degradation indicators, and perhaps more. In other words, this activity includes all the tasks before the Maintenance Decision Making task in a predictive maintenance program. Such an inspection operation is itself costly, and takes time; but, compared to the life cycle of a system, the time for an inspection is negligible. Thus, we assume that each inspection operation is instantaneous, perfect, non-destructive, and incurs a cost C_i > 0.

Two maintenance actions are available: a Preventive Replacement (PR) with cost C_p ≥ C_i, and a Corrective Replacement (CR) with cost C_c. Since maintenance actions are true physical replacements such that the system is as-good-as-new, rather than repairs, they take negligible times and incur fixed costs irrespective of the degradation level of the system. Even though both the PR and CR operations put the system back in the as-good-as-new state, they are not necessarily identical in practice, because the CR is unplanned and performed on a more deteriorated system; moreover, the cost C_c can comprise different costs associated to the failure, like damage to the environment. A CR is thus likely to be more complex and more expensive (i.e., C_c ≥ C_p). Furthermore, a replacement, whether preventive or corrective, can only be instantaneously performed at predetermined times (i.e., inspection times or scheduled replacement times). Therefore, there exists a system downtime after failure, and an additional cost is incurred from the failure time until the next replacement time at a cost rate C_d.

3.2 Classical framework and representative strategy

The classical predictive maintenance decision framework considers that a replacement is always attached to an inspection and can only be carried out at a certain inspection time τ_i. Most maintenance strategies in the literature belong to this framework. For example, in (Huynh et al. 2011), a periodic inspection schedule is implemented and a replacement is performed whenever the system degradation level at an inspection time reaches a threshold. In (Ponchet et al. 2010), the same degradation index is used for the replacement decision, but the inspection schedule is non-periodic. In (Huynh et al. 2014) and (Huynh et al. 2015), the replacement decisions are made according to the system conditional MRL and the system conditional reliability respectively. Within this classical framework, when the system is multi-unit or subject to multiple failure modes (e.g., shock and degradation), using prognostic condition indices (e.g., the system MRL, the system reliability, the system RUL standard deviation, etc.) for maintenance decision-making is still more profitable than using diagnostic ones (e.g., the system degradation level) thanks to their overarching property (Huynh et al. 2015). But, when the system is single-unit and its failure is due to the degradation solely, they always lead to the same maintenance performance (Huynh et al. 2014). This means that the classical predictive maintenance decision framework does not allow efficiently exploiting the prognostic condition indices. In the following, we will learn about the reasons through a representative maintenance strategy of this classical framework.

To facilitate the comprehension, a periodic inspection schedule is assumed and the replacement decisions are made according to the detected system degradation level at an inspection time. Let us define a renewal cycle as the time interval between two successive replacement operations; the periodic inspection and degradation-based replacement strategy over a renewal cycle is then stated as follows. The system is regularly inspected with period Δ > 0. At a scheduled inspection date τ_i = iΔ, i = 1, 2, ..., a CR of the system is carried out if it fails (i.e., X_{τ_i} ≥ L). But, if the system is still running, a decision based on the degradation level X_{τ_i} is made. If M ≤ X_{τ_i} < L, the running system is considered too degraded, and a PR should be carried out at τ_i. Otherwise, nothing is done at τ_i, and the maintenance decision is postponed until the next inspection time at τ_{i+1} = τ_i + Δ. The inspection period Δ and the PR threshold M are the 2 decision variables of this strategy, so we call it the (Δ, M) strategy. Applying the (Δ, M) strategy to the system defined at the beginning of Section 3, one obtains the behavior of the maintained system as in Figure 2. The optimal decision variables are Δ_opt = 4.6 and M_opt = 9.1478 (the cost model to derive these optimal variables will be dealt with in Section 4). In Figure 2, the degradation evolution of the maintained system and the associated inspection and replacement operations are shown at the top, and the evolution of the conditional RUL standard deviation at the inspection times τ_i and the replacement times τ_r are represented at the bottom.
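The periodic-inspection strategy just described (inspection period Δ, PR threshold written M here) can be illustrated by a quick Monte Carlo simulation; the paper itself evaluates the strategies analytically in Section 4, so the sketch below is only our own illustration. The Gamma parameters are assumed to be α = β = 1/3, the remaining values (L = 15, C_i = 5, C_p = 50, C_c = 100, C_d = 25, Δ = 4.6, M ≈ 9.15) are those of the text, and the linear interpolation of the failure time inside a period is a simplification of ours:

```python
import numpy as np

rng = np.random.default_rng(42)

alpha, beta = 1 / 3, 1 / 3          # assumed Gamma process parameters
L, delta, M = 15.0, 4.6, 9.1478     # failure threshold, inspection period, PR threshold
Ci, Cp, Cc, Cd = 5.0, 50.0, 100.0, 25.0

def one_cycle(rng):
    """One renewal cycle of the (delta, M) strategy.

    Degradation increments over a period are Gamma(alpha*delta, 1/beta)
    distributed, cf. eq (1). Returns (cycle cost, cycle length)."""
    x = t = cost = 0.0
    t_fail = None
    while True:
        x_new = x + rng.gamma(alpha * delta, 1.0 / beta)
        if t_fail is None and x_new >= L:
            # approximate the hitting time of L inside the period
            t_fail = t + delta * (L - x) / (x_new - x)
        x, t = x_new, t + delta
        cost += Ci                                   # inspection cost
        if x >= L:                                   # failed -> corrective replacement
            return cost + Cc + Cd * (t - t_fail), t  # plus downtime cost since failure
        if x >= M:                                   # too degraded -> preventive replacement
            return cost + Cp, t

costs, lengths = zip(*(one_cycle(rng) for _ in range(20000)))
print("estimated long-run cost rate:", sum(costs) / sum(lengths))
```

By the renewal reward theorem, the ratio of the expected cycle cost to the expected cycle length estimates the long-run maintenance cost rate that Section 4 derives analytically.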



Figure 2. Illustration of the (Δ, ζ) strategy.

Under such a maintenance strategy, the condition index (e.g., the threshold ζ of the (Δ, ζ) strategy) has a double role. On the one hand, together with the inspection period Δ, it decides the number of inspections carried out over a renewal cycle; on the other hand, it is used to decide whether or not to trigger a PR. This is the biggest weakness of the classical maintenance decision framework, because a single condition index may not have all the required properties to make efficient decisions that, at the same time, avoid inopportune inspection operations and properly prolong the system useful lifetime. Indeed, to avoid a large number of inspection operations, the (Δ, ζ) strategy lowers the value of ζ; however, a lower ζ will shorten the system useful lifetime because of early PR operations. On the contrary, a high value of ζ can lengthen the system useful lifetime, but more cost could be paid for redundant inspection operations. More reasonable maintenance decisions should handle this weakness.

3.3 Proposed framework and representative strategy

To avoid the drawback of the above classical maintenance decision framework, we propose not to use the same condition index for both scheduling the inspections and deciding on a replacement. Two condition indices should be used instead of a single one: one to control the inspection decision, and the other to control the replacement decision. Accordingly, a replacement is not necessarily performed at an inspection time τ_i, but at a predetermined time τ_r (normally τ_r ≥ τ_i). The decision to perform an inspection could be based on the property that the system health prognosis accuracy improves with age. As such, from a certain age τ_i of the system, we can know its failure time quite precisely. Additional inspection operations are thus no longer necessary, and it might be enough just to wait ω time units, ω ≥ 0, from the last inspection and do a system replacement at τ_r = τ_i + ω. Obviously, because of the use of two different health indices for the inspection scheduling and the replacement decision, this new predictive maintenance decision framework is more flexible than the classical one. Furthermore, when the waiting time ω = 0, the new framework returns to the classical one; it is thus more general, and more profitable in most cases. In the following, a typical and representative maintenance strategy is presented in order to illustrate this new framework.

Usually, the accuracy of a health prognosis can be measured through the standard deviation of the conditional system RUL, and the waiting time ω can be determined from the system conditional MRL. For the system model considered in Section 2.1, the degradation level detected at an inspection time returns the same information as the standard deviation of the conditional system RUL: the higher the degradation, the more accurate the system RUL prognosis, so we merely use the system degradation level to control inspection decisions. This choice, on the one hand, simplifies the computation and, on the other hand, allows a maintenance decision rule consistent with the above representative of the classical framework. Of course, when the system is more complex (e.g., multi-unit or with multiple failure modes), the more overarching standard deviation of the conditional system RUL should be used instead of the system degradation level. As a result, an exemplary maintenance strategy can be stated as follows. Over a certain renewal cycle, the system is regularly inspected with period Δ. At a scheduled inspection date τ_i, i = 1, 2, ..., a CR of the system is carried out if it has failed (i.e., X_τi ≥ L). But, if the system is still running, a decision based on the accuracy of the system RUL prognosis given at τ_i is adopted. If η ≤ X_τi < L, where η is a degradation threshold indicating the accuracy of the RUL prognosis, no additional inspection is needed for the current renewal cycle, because the system failure time can already be predicted with an acceptable precision, and a system replacement is planned ω time units later (i.e., at time τ_r = τ_i + ω). The waiting time ω is defined from the system conditional MRL as follows:

ω = (μ(τ_i | y) − δ) · 1{μ(τ_i | y) ≥ δ},   (7)

where μ(τ_i | y) is the system conditional MRL given from (5), and δ is known as the safety time interval. The replacement at τ_r may be either preventive or corrective, depending on the working or failed state of the system at this time. After the replacement, a new renewal cycle begins.



The next inspection time on this cycle will then be carried out at τ_{i+1} = τ_r + Δ. Otherwise (i.e., X_τi < η), we cannot predict the failure time precisely, so one or more inspections are needed to gather additional information about the system, and the maintenance decisions are postponed until the next inspection time τ_{i+1} = τ_i + Δ. The maintenance strategy admits the inspection period Δ, the degradation threshold η controlling the prognosis accuracy, and the safety time interval δ as decision variables, so we call it the (Δ, η, δ) strategy. With the same system and maintenance costs as in the illustration of Section 3.2, the (Δ, η, δ) strategy reaches its optimal configuration at Δ_opt = 6, η_opt = 5.5526 and δ_opt = 4.8. The evolution of the maintained system under the optimal configuration is shown in Figure 3. We can see that η_opt of the (Δ, η, δ) strategy is much smaller than ζ_opt of the (Δ, ζ) strategy; this means that the (Δ, η, δ) strategy does not need a RUL prediction accuracy as high as the (Δ, ζ) strategy. In other words, the information about the health prognosis accuracy has been taken into account to improve the maintenance decision-making.

Figure 3. Illustration of the (Δ, η, δ) strategy.

4 COST MODEL AND OPTIMIZATION

The long-run expected maintenance cost rate is used here as a cost criterion to assess the performance of the considered maintenance frameworks:

C∞ = lim_{t→∞} E[C(t)] / t,   (8)

where C(t) denotes the total maintenance cost, including the downtime cost, up to time t: C(t) = C_i N_i(t) + C_p N_p(t) + C_c N_c(t) + C_d W(t), where N_i(t), N_p(t) and N_c(t) are respectively the numbers of inspections, PR and CR operations in [0, t], and W(t) is the system downtime in [0, t]. In the literature, the cost rate C∞ is usually evaluated analytically by the renewal-reward theorem (Tijms 2003). This classical method is normally useful for static maintenance decision rules (Huynh et al. 2014). When the decision rules are dynamic, as in the present paper, it is more interesting to take advantage of the semi-regenerative theory (Cocozza-Thivent 1997). Consequently, (8) can be rewritten as (Bérenguer 2008)

C∞ = C_i E_π[N_i(S)]/E_π[S] + C_p E_π[N_p(S)]/E_π[S] + C_c E_π[N_c(S)]/E_π[S] + C_d E_π[W(S)]/E_π[S],   (9)

where S = τ_i − τ_{i−1}, i = 1, 2, ..., denotes the length of a single Markov renewal cycle, i.e., the time interval between two successive inspections, and E_π denotes the expectation with respect to the stationary law π. In the following, we focus on formulating the stationary law π and the expectation quantities in (9).

4.1 Stationary law of the maintained system state

The behavior of the maintained system at inspection times can be characterized by the stationary law π of the Markov chain {X_τi}_{i∈N} with continuous state space R+. Consider the Markov renewal cycle [τ_{i−1}, τ_i], i = 1, 2, ..., and let y and x be respectively the degradation levels of the maintained system at the beginning and the end of the cycle (i.e., X_{τ_{i−1}} = y and X_τi = x); the stationary law π is the solution of the following invariant equation:

π(x) = ∫_{R+} F(x | y) π(y) dy,   (10)

where F(x | y) is the degradation transition law from y to x. This transition law can be obtained by an exhaustive analysis of all the possible evolution and maintenance scenarios on the Markov renewal cycle [τ_{i−1}, τ_i]. As a result, we obtain

π(x) = ∫_0^η f_{αΔ,β}(x − y) π(y) dy + f_{αΔ,β}(x) ∫_η^L (∫_y^L f_{αω(y),β}(z − y) dz) π(y) dy + f_{αΔ,β}(x) ∫_L^∞ π(y) dy,   (11)

where the probability density functions f_{α·,β}(·) are derived from (1). Given (11), we can adapt the fixed-point iteration algorithm to numerically evaluate π(·). Many numerical tests were carried out, and they have shown that the algorithm converges very quickly to the true stationary law.
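The fixed-point iteration mentioned above can be sketched numerically. For brevity, the sketch uses the simpler transition law of the classical (Δ, ζ) strategy (a state at or above ζ is renewed before the next inspection, so the next inspected level is a fresh gamma increment over Δ); the grid size, the truncation point `x_max` and the iteration count are arbitrary choices of this sketch, not values from the paper.

```python
import math

def gamma_pdf(x, shape, rate):
    """Density of the gamma increment over one inspection period."""
    if x <= 0.0:
        return 0.0
    return (rate ** shape) * x ** (shape - 1) * math.exp(-rate * x) / math.gamma(shape)

def stationary_law(delta=4.6, zeta=9.1478, L=15.0, alpha=0.2, beta=0.2,
                   x_max=30.0, n=100, n_iter=50):
    """Fixed-point iteration pi <- K pi for Eq. (10) on a midpoint grid.
    Simplified kernel: a state y >= zeta has been replaced, so the next
    inspected level is a fresh increment; otherwise the increment adds
    to y (coarse quadrature, renormalised for grid truncation)."""
    h = x_max / n
    grid = [(i + 0.5) * h for i in range(n)]
    shape = alpha * delta
    pi = [1.0 / x_max] * n                      # uniform initial guess
    for _ in range(n_iter):
        new = [0.0] * n
        for j, y in enumerate(grid):
            mass = pi[j] * h
            for i, x in enumerate(grid):
                inc = x if y >= zeta else x - y  # renewed vs. continuing
                new[i] += mass * gamma_pdf(inc, shape, beta)
        s = sum(v * h for v in new)
        pi = [v / s for v in new]
    return grid, pi
```

In line with the paper's observation, the iteration stabilises after only a few dozen sweeps on this coarse grid, and most of the stationary mass sits below the PR threshold.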



4.2 Expected quantities

On a Markov renewal cycle, E_π[N_i(S)] = 1. The other expected quantities are all computed in a similar way, by integration with respect to the stationary law π determined in Section 4.1:

E_π[S] = ∫_{R+} (Δ + ω(y) · 1{y ≥ η}) π(y) dy.   (12)

E_π[N_p(S)] = ∫_η^L F_{αω(y),β}(L − y) π(y) dy.   (13)

E_π[N_c(S)] = ∫_η^L F̄_{αω(y),β}(L − y) π(y) dy + ∫_L^∞ π(y) dy + ∫_0^η F̄_{αΔ,β}(L − y) π(y) dy.   (14)

E_π[W(S)] = ∫_0^η (∫_0^Δ F̄_{αu,β}(L − y) du) π(y) dy + ∫_η^L (∫_0^{ω(y)} F̄_{αu,β}(L − y) du) π(y) dy + ∫_L^∞ (Δ + ω(y)) π(y) dy.   (15)

In the above expressions, F̄_{α·,β}(·) = 1 − F_{α·,β}(·), and F_{α·,β}(·) is given from (2).

4.3 Maintenance optimization

Using (11), and introducing (12), (13), (14) and (15) into (9), we obtain the full mathematical cost model of the proposed predictive maintenance decision framework. The classical framework is a particular case of the new one, and its cost model is also derived from (9) by taking ω = 0. Given the cost model, optimizing the strategies (Δ, ζ) and (Δ, η, δ) returns to finding the set of decision variables of each strategy that minimizes the associated long-run expected maintenance cost rate:

C∞(Δ_opt, ζ_opt) = min_{Δ,ζ} {C∞(Δ, ζ)},
C∞(Δ_opt, η_opt, δ_opt) = min_{Δ,η,δ} {C∞(Δ, η, δ)},

where Δ ≥ 0, ζ ≥ 0, 0 ≤ η ≤ L and δ ≥ 0. The generalized pattern search algorithm presented in (Audet et al. 2002) can be resorted to in order to find the optimal maintenance cost rate and the associated decision variables. Applying this algorithm to the system and the set of maintenance costs presented at the beginning of Section 3, we obtain the optimal quantities in Table 1.

Table 1. Optimal configurations of the (Δ, ζ) and (Δ, η, δ) strategies.

  Optimal decision variables                  Optimal cost rate
  Δ_opt = 4.6, ζ_opt = 9.1478                 C∞(Δ_opt, ζ_opt) = 6.3422
  Δ_opt = 6, η_opt = 5.5526, δ_opt = 4.8      C∞(Δ_opt, η_opt, δ_opt) = 5.9746

Based on the optimal values of the cost rates, the (Δ, η, δ) strategy is more profitable than the (Δ, ζ) strategy. However, this is just the conclusion for the present special case. More general conclusions on the performance of these strategies are given in Section 5.

5 PERFORMANCE ASSESSMENT

This section aims at seeking a general conclusion on the effectiveness of the new predictive maintenance decision framework. To this end, we compare the performance of the (Δ, η, δ) strategy to the (Δ, ζ) strategy under various configurations of maintenance operation costs and system characteristics. The so-called relative gain in the optimal long-run expected maintenance cost rate (Huynh et al. 2015) is resorted to for this purpose:

ΔC∞(%) = (C∞(Δ_opt, ζ_opt) − C∞(Δ_opt, η_opt, δ_opt)) / C∞(Δ_opt, ζ_opt) · 100%.

If ΔC∞(%) > 0, the (Δ, η, δ) strategy is more profitable than the (Δ, ζ) strategy; if ΔC∞(%) = 0, they have the same profit; otherwise, the (Δ, η, δ) strategy is less profitable than the (Δ, ζ) strategy.

At first, we are interested in the impact of the maintenance costs on the performance of the new predictive maintenance decision framework. This is why we fix the system characteristics at L = 15 and α = β = 0.2 (i.e., m = 1 and σ² = 5); the practical constraint C_i ≤ C_p < C_c also leads us to take C_c = 100 and consider the three cases:

- varied inspection cost: C_i varies from 1 to 49 with a step equal to 1, C_p = 50, and C_d = 25;
- varied PR cost: C_i = 5, C_p varies from 6 to 99 with a step equal to 3, and C_d = 25;
- varied downtime cost rate: C_i = 5, C_p = 50, and C_d varies from 10 to 190 with a step equal to 5.

For each of the above cases, we sketch the relative gain ΔC∞(%), and the results are as in Figure 4. Not surprisingly, the (Δ, η, δ) strategy is always more profitable than the (Δ, ζ) strategy (i.e., ΔC∞(%) > 0). Thus, there is no risk in using the proposed maintenance decision framework (i.e., it returns to the classical one in the worst case). This is a significant advantage of this new framework. Moreover, it is especially profitable when inspections are costly.
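The relative gain used above to compare the two optimal cost rates is a one-line computation; the helper below simply evaluates it for the values reported in Table 1.

```python
def relative_gain(c_classical, c_new):
    """Relative gain (in percent) of the (Delta, eta, delta) strategy over
    the classical (Delta, zeta) strategy:
    (C_classical - C_new) / C_classical * 100."""
    return (c_classical - c_new) / c_classical * 100.0

# Table 1 values: 6.3422 (classical optimum) vs 5.9746 (proposed optimum)
gain = relative_gain(6.3422, 5.9746)   # about 5.8 %
```

A positive value confirms that, for the running example, the proposed (Δ, η, δ) strategy outperforms the classical one by roughly 5.8% of the classical optimal cost rate.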



Figure 4. ΔC∞(%) with respect to the maintenance costs.

Figure 5. ΔC∞(%) with respect to the degradation variance.

Indeed, when the inspection is expensive (see the left side of Figure 4), the proposed framework can avoid inopportune inspection operations. When the PR cost C_p or the downtime cost rate C_d increases, the inspection cost C_i becomes relatively smaller, and hence the relative gain ΔC∞(%) is weaker (see the middle and the right side of Figure 4). Consequently, unlike the inspection cost, the PR cost and the downtime cost rate do not much affect the performance of the proposed framework.

To investigate the impact of the degradation variance σ² on the performance of the proposed maintenance decision framework, we take L = 15 and m = 1, vary σ² from 1 to 19 with step 1, and study the evolution of ΔC∞(%) when the set of maintenance costs is fixed at C_i = 5, C_p = 50, C_c = 100 and C_d = 25. The result is shown in Figure 5. Once again, the (Δ, η, δ) strategy is always economically better than the (Δ, ζ) strategy, for the same reason as above. Under the new maintenance decision framework, the sooner the accuracy level of the RUL prognosis is reached, the fewer inopportune inspections are performed. A lower degradation variance allows a RUL prognosis with higher precision, hence it is not surprising that the (Δ, η, δ) strategy is most profitable at small values of σ², and that its profit decreases when the system becomes more chaotic (see Figure 5).

6 CONCLUSIONS & PERSPECTIVES

We have proposed in this paper a new parametric predictive maintenance decision framework that takes the improving health prognosis accuracy into account for stochastically deteriorating single-unit systems. Many numerical experiments show the advantage of the proposed maintenance decision framework compared with the most well-known one in the literature. In fact, the proposed framework is more general and more flexible than the classical one, so there is no risk in using the new framework. Furthermore, this new framework is especially suitable for systems with a small degradation variance and high inspection costs. The results in the present paper also confirm the interest of system health prognosis information for maintenance decision-making when it is properly used. This encourages us to continue investing in prognostics and health management technologies, and building new predictive maintenance strategies.

For our future works, we will continue studying the advantage of the proposed predictive maintenance decision framework for multi-unit systems (e.g., deteriorating systems with a k-out-of-n structure). We also believe that the framework will be particularly suitable for systems with a limited number of inspection and repair facilities (e.g., offshore systems such as submarine power cables, offshore wind turbines, subsea blowout preventer systems, etc.).

REFERENCES

Abdel-Hameed, M. (2014). Lévy Processes and Their Applications in Reliability and Storage. SpringerBriefs in Statistics. Springer.
Audet, C. & J.E. Dennis (2002). Analysis of generalized pattern searches. SIAM Journal on Optimization 13(3), 889–903.
Banjevic, D. (2009). Remaining useful life in theory and practice. Metrika 69(2–3), 337–349.
Bérenguer, C. (2008). On the mathematical condition-based maintenance modelling for continuously deteriorating systems. International Journal of Materials and Structural Reliability 6(2), 133–151.
Blain, C., A. Barros, A. Grall, & Y. Lefebvre (2007). Modelling of stress corrosion cracking with stochastic processes: application to steam generators. In Proc.



of the European Safety and Reliability Conference, ESREL 2007, pp. 2395–2400.
Bousquet, N., M. Fouladirad, A. Grall, & C. Paroissin (2015). Bayesian gamma processes for optimizing condition-based maintenance under uncertainty. Applied Stochastic Models in Business and Industry 31(3), 360–379.
Cocozza-Thivent, C. (1997). Processus stochastiques et fiabilité des systèmes, Volume 28 of Mathématiques & Applications. Springer. In French.
Grall, A., C. Bérenguer, & L. Dieulle (2002). A condition-based maintenance policy for stochastically deteriorating systems. Reliability Engineering & System Safety 76(2), 167–180.
Huynh, K.T., A. Barros, & C. Bérenguer (2015). Multi-level decision-making for the predictive maintenance of k-out-of-n:F deteriorating systems. IEEE Transactions on Reliability 64(1), 94–117.
Huynh, K.T., A. Barros, C. Bérenguer, & I.T. Castro (2011). A periodic inspection and replacement policy for systems subject to competing failure modes due to degradation and traumatic events. Reliability Engineering & System Safety 96(4), 497–508.
Huynh, K.T., I.T. Castro, A. Barros, & C. Bérenguer (2014). On the use of mean residual life as a condition index for condition-based maintenance decision-making. IEEE Transactions on Systems, Man, and Cybernetics: Systems 44(7), 877–893.
Jardine, A.K.S., D. Lin, & D. Banjevic (2006). A review on machinery diagnostics and prognostics implementing condition-based maintenance. Mechanical Systems and Signal Processing 20(7), 1483–1510.
Kallen, M.J. & J.M. van Noortwijk (2005). Optimal maintenance decisions under imperfect inspection. Reliability Engineering & System Safety 90(2–3), 177–185.
Khoury, E., E. Deloux, A. Grall, & C. Bérenguer (2013). On the use of time-limited information for maintenance decision support: A predictive approach under maintenance constraints. Mathematical Problems in Engineering 2013. Article ID 983595, 11 pages, doi:10.1155/2013/983595.
Langeron, Y., A. Grall, & A. Barros (2015). A modeling framework for deteriorating control system and predictive maintenance of actuators. Reliability Engineering & System Safety 140, 22–36.
Liao, L. & F. Köttig (2014). Review of hybrid prognostics approaches for remaining useful life prediction of engineered systems, and an application to battery life prediction. IEEE Transactions on Reliability 63(1), 191–207.
Ponchet, A., M. Fouladirad, & A. Grall (2010). Assessment of a maintenance model for a multi-deteriorating mode system. Reliability Engineering & System Safety 95(11), 1244–1254.
Santini, T., S. Morand, M. Fouladirad, L.V. Phung, F. Miller, B. Foucher, A. Grall, & B. Allard (2014). Accelerated degradation data of SiC MOSFETs for lifetime and remaining useful life assessment. Microelectronics Reliability 54(9), 1718–1723.
Shin, J.H. & H.B. Jun (2015). On condition based maintenance policy. Journal of Computational Design and Engineering 2(2), 119–127.
Si, X.S., W. Wang, C.H. Hu, & D.H. Zhou (2011). Remaining useful life estimation: a review on the statistical data driven approaches. European Journal of Operational Research 213(1), 1–14.
Singpurwalla, N.D. (1995). Survival in dynamic environments. Statistical Science 10(1), 86–103.
Tijms, H. (2003). A First Course in Stochastic Models. Wiley, New York.
Travé-Massuyès, L. (2014). Bridging control and artificial intelligence theories for diagnosis: A survey. Engineering Applications of Artificial Intelligence 27, 1–16.
Van Noortwijk, J.M. (2009). A survey of the application of gamma processes in maintenance. Reliability Engineering & System Safety 94(1), 2–21.
Wang, X. (2009). Nonparametric estimation of the shape function in a gamma process for degradation data. Canadian Journal of Statistics 37(1), 102–118.
Xu, Z., Y. Ji, & D.H. Zhou (2008). Real-time reliability prediction for a dynamic system based on the hidden degradation process identification. IEEE Transactions on Reliability 57(2), 230–242.



Applied Mathematics in Engineering and Reliability, Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

Condition-based maintenance by minimax criteria

O. Abramov & D. Nazarov
Laboratory of Complex Systems Reliability Control, Institute of Automation and Control Processes, FEB RAS, Vladivostok, Russia

ABSTRACT: An approach to designing the condition-based maintenance of engineering systems is considered in this paper. The approach is based on prediction of the system state. Algorithms that provide prediction of the system state and the design of optimal preventive maintenance strategies in the case when inspection data are incomplete and insignificant are discussed.

1 INTRODUCTION

The excessive servicing and increasing requirements for the efficient operation of equipment determine the necessity to state the maintenance problem in a new way and to solve it for every particular unit of equipment on an individual basis. An information base for preventive maintenance is formed by predicting and estimating the state of an engineering system. There prove to be many difficulties in solving this matter. These difficulties are caused by the lack of stored a priori statistical information on the field variation of system parameters. In this case, the application of classical methods of mathematical statistics to the problems of state prediction and predictive maintenance scheduling may cause serious errors. One is then confronted with the problems of designing a predictor and an optimal strategy of predictive maintenance in the event of incomplete model knowledge.

Both of these problems are referred to as minimax problems. The first problem, minimax prediction of the state, was considered earlier (Abramov et al. 2000), (Abramov and Rozenbaum 2004), but it should not be considered settled completely. The second problem, referred to as minimax control, has attracted attention as a way to handle predictive maintenance. In this paper, we present a new minimax algorithm for predicting the state and solve the problem of picking an optimal minimax strategy of individual maintenance without using information about the stochastic properties of the measurement noises and of the errors of the model chosen to describe the random processes of parameter variation. The predicting algorithm is constructed by a minimax squaring criterion. An optimal minimax strategy of individual maintenance is found by methods of dynamic programming. Our approach requires only the possible variation limits for the unknown noises and model errors, which may be given in a rather coarse way.

This paper is organized as follows. In Section 2, we formulate the problem of predicting the state of an engineering system in the case when inspection data are incomplete and insignificant. In Section 3, we formulate the problem of optimal minimax (maximin) predictive maintenance. In Section 4, we construct the predictor by the minimax squaring criterion. In Section 5, we solve the problem of picking an optimal minimax strategy of predictive maintenance. Some conclusions complete the paper.

2 FORMULATION OF THE PROBLEM OF MINIMAX STATE PREDICTING

Assume that the variations in the system state parameters can be approximated, in a rather coarse way, as follows:

y(t) = Aᵀ u(t),  t ∈ T,   (1)

where A = {a_j}_{j=0}^n is a set of random coefficients, u(t) = {u_j(t)}_{j=0}^n are continuous deterministic functions of time, and T is the operating period. Model errors are present here, but they are not determined.

The representation (1) can be viewed as an expansion of y(t) over some function system. Such an expansion allows one to approximate, in theory, any real process y(t), and this accords with the facts. The system state y(t) is monitored on the time interval T_p ⊂ T with an additive error e(t). The measurements form a sequence Z = {z(t_k)}_{k=1}^p, t_k ∈ T_p ⊂ T. The probability properties of e(t) are not determined; it is only known that

|e(t)| ≤ c(t),  t ∈ T_p ⊂ T,   (2)



where c(t) is a given function.

The problem consists in determining the estimates ŷ(t), t ∈ T \ T_p. The model of system state variations (1), the constraints (2) on the disturbances, and the measurements Z form the initial data base for a solution of the problem. The deficiency and uncertainty of this data base, in particular the absence of full probability properties of the disturbances and the presence of unknown errors in the model (1), make it difficult to obtain the sought estimate ŷ(t), t ∈ T \ T_p, by using well-known statistical techniques such as the least-squares method, the least-magnitudes method, and so on. It seems preferable to construct the estimate ŷ(t), t ∈ T \ T_p, proceeding from the worst cases, i.e., on the basis of the minimax concept (Abramov, Rozenbaum, & Suponya 2000).

3 OPTIMAL MINIMAX (MAXIMIN) PREDICTIVE MAINTENANCE

Let the state of an equipment unit during service be described by a parameter y. Under the influence of destabilizing factors, y varies in a random manner. These variations can be approximated by the expansion (1). The stochastic properties of y(t) are unknown.

The variations of y(t) can, as a matter of fact, lead to failures or to a deterioration of the functioning of engineering systems. To prevent such undesirable occurrences we must control y(t). The control is realized by the predictive maintenance of engineering systems.

It is necessary to give a performance criterion in order to optimize predictive maintenance. Different technical-and-economic indexes (reliability, capacity, efficiency, and so on) can serve as such criteria. A performance index generally represents a functional. It is evident that any change of the equipment state y ∈ Y involves a change in the performance index G(Y, T). Predictive maintenance then consists in tracking y(t) and forcedly changing it at some instants of time t ∈ T. It comprises field inspection and adjustment of equipment parameters, and replacement of units, assemblies and components whose parameters reach their critical values. An inspection gives a result z(t) = y(t) + e(t), where e(t) is a random error whose stochastic properties are undeterminable but whose range of values E is known. An adjustment consists in changing y(t) by a nonrandom value r ∈ R.

We shall consider, as a control strategy, a function s(t), where s ∈ S (S is the set of preventive actions). The problem of constructing s(t) in the minimax (maximin) statement may be written as follows:

g* = min_{s(t) ∈ S_T} max_{y(t) ∈ Y_T} G(y, s, t),
g* = max_{s(t) ∈ S_T} min_{y(t) ∈ Y_T} G(y, s, t).   (3)

It is evident that a function which delivers the minimum of the maximum (or the maximum of the minimum) of the index G(y, s, t) is a sought minimax (or maximin) predictive maintenance strategy. In other words, the representations (3) are the problem formulation of optimal minimax (maximin) predictive maintenance.

4 MINIMAX PREDICTING ALGORITHM

In accordance with the statement made in Section 2, the problem of predicting the state consists in the definition of estimates ŷ(t), t ∈ T \ T_p; such estimates must ensure the fulfillment of min |y(t) − ŷ(t)|, t ∈ T \ T_p. Under the assumption that model errors e(t) are present in the relationship (1) for t ∈ T, the considered problem can be formulated as follows:

G* = min_A max_{|e| ≤ c} ||z − e − AᵀU||,   (4)

where z = {z(t_k)}_{k=1}^p, U = {u_j(t_k)}_{k=1,j=0}^{p,n}, e = {e(t_k)}_{k=1}^p, and c = {c(t_k)}_{k=1}^p.

A norm of the misclosure z − e − AᵀU serves as the optimality criterion in (4). This norm can be given as ||z − e − AᵀU|| = ((z − e − AᵀU)ᵀ(z − e − AᵀU))^{1/2}, i.e., as a squaring criterion. With that we obtain:

G* = min_A max_{|e| ≤ c} (z − e − AᵀU)ᵀ(z − e − AᵀU).   (5)

By finding the minimum over A in (5) we obtain

A_opt = (UᵀU)^{-1} Uᵀ (z − e).   (6)

The substitution of (6) into (5) gives the following:

G* = max_{|e| ≤ c} (z − e)ᵀ L (z − e),   (7)

where L is the symmetric p × p matrix L = (I − U(UᵀU)^{-1}Uᵀ)ᵀ(I − U(UᵀU)^{-1}Uᵀ) and I is the unit matrix. Using (6) and (7) we can obtain the numerical values {a_j^opt}_{j=0}^n, and then define the sought estimate ŷ(t), t ∈ T \ T_p, using (1). A solution of (6) can be found by methods of nonlinear programming.



It should be noted that the maximum error Δ(t̂) of the determination of ŷ(t), t ∈ T \ T_p, can be estimated as follows:

Δ(t̂) = max_{|e| ≤ c} |y(t̂) − ŷ(t̂)|,   (8)

where t̂ is a fixed instant out of T_p ⊂ T.

The algorithm under discussion meets the general requirements for any predicting procedure. The estimates found are unique, optimal, and unbiased.

5 PREDICTIVE MAINTENANCE STRATEGY

To obtain a concrete solution of the task (3) we must give a concrete optimality criterion. In principle, such a criterion should be chosen based on the requirements on the performance of the particular equipment unit. The requirements are specified on the basis of a certain index system; within this system, economic indexes are the most general. These indexes, in particular, include a guaranteed level of the total losses when using the engineering system on the set T. This index can be written as follows:

W_T = max_{y(t) ∈ Y_T} ∫_T H(y(t)) dt + V_T,   (9)

where H(y(t)) is the loss function, which describes the losses when the equipment state differs from the normal one, and V_T is the operating expenses.

We may obtain a globally optimal strategy S(t) by stepwise minimization of the criterion W_T based on Bellman's optimality principle. The algorithms applied are simple enough and can be implemented in recurrent form. We should build a state space in order to form the recurrent relations for solving the problem on the basis of the optimality principle. In other words, we should find a coordinate set which contains all the information about the engineering system in a given time interval regardless of its past behavior. When y(t) is described by (1), a sought set can be represented as a collection (A, t, t_c), where A is a vector whose members define the ranges of the coefficients a_j, j = 0, ..., n, in the model (1), and t, t_c ∈ T, t ≤ t_c.

Let the function W(A, t, t_c) describe the limiting losses associated with the optimal servicing of the engineering system in the state (A, t, t_c). Predictive maintenance consists in inspecting and adjusting y(t) (it is not difficult to show that replacing units, assemblies and components of the engineering system is equivalent to the adjustment of y(t)).

The limiting losses when using the engineering system in the state (A, t, t_c) without any maintenance in the interval [t, t_c] can be represented in the form:

W₁(A, t, t_c) = max_A ∫_t^{t_c} H(y(τ)) dτ.   (10)

If at an instant t′, t ≤ t′ ≤ t_c, we take a reading of y(t) associated with expenses λ, and we obtain a value z(t′) = y(t′) + e(t′) (e(t′) is a random measurement error whose stochastic properties are undeterminable but whose range E is known), then the information state of the engineering system will be (A′, t′, t_c), where A′ is the vector of coefficients a_j obtained from the measurement result. The limiting losses are then W₂(A, t, t_c) = W₁(A, t, t′) + λ + W(A′, t′, t_c).

If at an instant t′, t ≤ t′ ≤ t_c, we adjust y(t) (a change of y(t) by r ∈ R) and the associated expenses are μ, then the limiting losses for the state (A, t, t_c) can be described as follows:

W₃(A, t, t_c) = W₁(A, t, t′) + μ + W(A″, t′, t_c),   (11)

where A″ is the vector of coefficients a_j with allowance for the change in the state of the engineering system after the adjustment.

Based on (10), we can form recurrent relations for finding an optimal strategy S(t):

W(A, t, t_c) = min_{i=1,2,3} W_i,   (12)

W₁ = W₁(t, t_c, A),   (13)

W₂ = min_{t ≤ t′ ≤ t_c} [W₁(t, t′, A) + λ + min_{A′} W(A′, t′, t_c)],   (14)

W₃ = min_{t ≤ t′ ≤ t_c} [W₁(t, t′, A) + μ + min_{r ∈ R} W(A″, t′, t_c)].   (15)

The value of i for which the minimum in (12) is achieved, and the values t′, r for which the minima in (14) and (15) are achieved, are functions of (A, t, t_c) and describe a sought optimal predictive maintenance strategy S(t).

As a matter of fact, solving equations (12), (13), (14), (15) is a problem of dynamic programming. To form S(t) we can make use of a space approximation technique. With that, for finding A′ it is necessary to use minimax algorithms of state prediction; in particular, it can be the algorithm from Section 4.

6 CONCLUSIONS

The problem of designing the predictive maintenance of engineering systems is solved for the case when inspection data are incomplete and insignificant.
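The recurrences (12)-(15) of Section 5 admit a compact illustration by backward induction. Everything in the sketch below is a hypothetical stand-in: a scalar "uncertainty" u and "deviation" d replace the state (A, t, t_c), and linear growth replaces the model (1); only the wait/inspect/adjust structure of (12)-(15) is mirrored.

```python
from functools import lru_cache

def optimal_policy(T=12, lam=1.0, mu=3.0, gu=0.5, gd=0.8):
    """Toy backward induction in the spirit of recurrences (12)-(15).
    Per-step loss H = u + d; inspection (cost lam) resets the prediction
    uncertainty u (W2-like), adjustment (cost mu) resets the parameter
    deviation d (W3-like), waiting lets both grow (W1-like)."""
    @lru_cache(maxsize=None)
    def W(t, u, d):
        if t == T:
            return 0.0
        loss = u + d                                   # stand-in for H(y(t))
        w1 = loss + W(t + 1, u + gu, d + gd)           # (13): no maintenance
        w2 = loss + lam + W(t + 1, 0.0, d + gd)        # (14): inspect
        w3 = loss + mu + W(t + 1, u + gu, 0.0)         # (15): adjust
        return min(w1, w2, w3)                         # (12): best action
    return W(0, 0.0, 0.0)
```

By construction, the optimal total loss can never exceed the loss of the pure "wait" policy, which is what the minimax maintenance strategy formalizes on the real state space.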



It is usually the case in actual practice. With that, we find a globally optimal (in the minimax sense) strategy of predictive maintenance. This strategy guarantees the efficient functioning of the operated engineering systems on the operating interval with minimal expenses. A minimax predicting algorithm is also given in this paper. The algorithm allows one to obtain strict estimates for the unknown quantities. These estimates are more useful and reliable than statistical estimates in the state prediction task, where only a single realization of y(t) is observed. The considered algorithm is intended to accompany the solution of predictive maintenance tasks. The proposed approach has been realized in practice as the theoretical basis for the design of control and measurement systems (Abramov and Rozenbaum 2004), (Abramov 2010).

ACKNOWLEDGMENT

This work is partially supported by grant 14-08-00149 of the Russian Foundation for Basic Research.

REFERENCES

Abramov, O. & A. Rozenbaum (2004). Passive control of the operation of measuring instruments. Measurement Techniques 47(3), 233–239.
Abramov, O. (2010). Parallel algorithms for computing and optimizing reliability with respect to gradual failures. Automation and Remote Control 71(7), 1394–1402.
Abramov, O., A. Rozenbaum, & A. Suponya (2000). Failure prevention based on parameters estimation and prediction. In Preprints of the 4th IFAC Symposium SAFEPROCESS 2000, Budapest, Hungary, pp. 584–586.

94

AMER16_Book.indb 94 3/15/2016 11:24:50 AM


Applied Mathematics in Engineering and Reliability – Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

Impact of cost uncertainty on periodic replacement policy with budget constraint: Application to water pipe renovation

Khanh T.P. Nguyen
University of Technology of Troyes, LM2S, Troyes, France

Do T. Tran
Inria Lille-Nord Europe, Villeneuve d'Ascq, France
University of Lille, CRIStAL (CNRS UMR 9189), Lille, France

Bang Minh Nguyen
Non-Revenue Water Reduction Department, Nha Be Water Supply Joint-Stock Company, HCMC, Vietnam

ABSTRACT: Due to market flexibility, repair and replacement costs in reality are often uncertain and
can be described by an interval that includes the most preferable values. Such a portrayal of uncertain
costs naturally calls for the use of fuzzy sets. In this paper, we therefore propose using fuzzy numbers
to characterize uncertainty in repair and replacement costs. The impact of fuzzy costs on the optimal
decision is then investigated in the context of an industrial problem: optimizing water pipe renovation
strategies. Here we examine specifically the risk of violating a budget constraint imposed on the annual
cost associated with pipe failure repairs. This risk is evaluated using the mean chance of the random fuzzy
events that represent random fuzzy costs exceeding a given budget. The benefit of taking account of cost
uncertainty is then validated through various numerical examples.

1 INTRODUCTION

The role of maintenance policies in industry is increasingly highlighted as they can reduce production costs, extend the useful life of industrial equipment, and also alter the strategy for new investments in equipment. Among maintenance policies, periodic maintenance is a traditional policy that is commonly used in practice thanks to its simplicity of implementation and its ability to easily integrate related constraints into decision processes. The main objective of periodic maintenance is to determine the best replacement time that maximizes system reliability, availability and safety, and minimizes maintenance cost. The basic policy proposed by Barlow & Hunter (1960) recommends that equipment be replaced after k·T hours, where k = 1, 2, ..., and that any failure occurring between two successive replacements be restored with a minimal repair. This minimal-repair model assumes that the cost of minimal repairs is lower than the cost of a preventive replacement.

Various extensions and variations of this basic model have been proposed in the literature over time. Many of them have been surveyed in (Wang 2002, Ahmad & Kamaruddin 2012). Those extensions can generally be classified into three groups. The first group focuses on improving system failure models and takes into account the effects of different failure modes or shock conditions. Sheu & Griffith (2002) and Sheu et al. (2010) proposed a periodic replacement policy for systems subjected to shocks. Lai & Chen (2006) considered a two-unit system with failure rate interaction between the units. In (Sheu et al. 2012), the authors considered non-homogeneous pure birth shocks for the block replacement policy. The second group aims to solve the questions of large-size and complex systems. Wang & Pham (2006) studied the correlated failures of multiple components in a serial system and aimed to optimize system availability and/or maintenance costs. Scarf & Cavalcante (2010) proposed hybrid block replacement and inspection policies for a multi-component system in serial structures. The third group extends maintenance policies by considering numerous maintenance decisions and evaluating the performance of these activities. The model proposed by Sheu (1992) considered the possibility of installing a new alternative, performing a minimal repair, or doing nothing when a failure occurs. Jamali et al. (2005) proposed a joint optimal periodic and
conditional maintenance strategy to improve the policy's efficiency. Lai (2007) optimized a periodic replacement model based on the information of a cumulative repair-cost limit.

All of the above studies are based on the assumption of constant costs. This assumption may not reflect market flexibility in practice. In (Sheu et al. 1995), the authors extended a replacement problem with two types of failures considering random repair costs that depend on the system age. However, the assumption of random costs is only valid when there are enough samples to perform statistical evaluations of probability measures. In many cases, there is not sufficient information and it is not easy to propose an adequate probability distribution for random parameters. Especially in the case of water pipe breakages, damage costs, such as water losses and social damage consequences, are uncertain, and there are not enough samples to estimate a probability distribution. Then, consulting with the expert, the repair/replacement cost can be predicted to lie somewhere between a and c with a preference for b, where a < b < c. In this context, it is natural to extend the representation to using fuzzy quantities (Zimmermann 2010).

To the best of our knowledge, no previous work addresses the problem of fuzzy costs in block replacement with a budget constraint and studies their impacts on the optimal policy. This paper is therefore aimed at filling this gap in the literature. On the other hand, our study is especially dedicated to a real case study of the replacement problem in water distribution networks. The paper is structured as follows. Section 2 presents the statement and formulation of the water pipe renovation problem. Section 3 is dedicated to identifying the optimal renovation time when considering fuzzy costs and budget constraints. In Section 4, we perform an experimental analysis to examine the impact of fuzzy costs on the optimal decision. Finally, Section 5 concludes the paper with prospects for future work.

2 PROBLEM DESCRIPTION AND FORMULATION

In this paper, we consider a block replacement policy in which the system is correctively repaired at failures and preventively replaced at periodic times, T years. We assume that the failure rate is not disturbed by each repair. This policy is applied to a District Metering Area (DMA), i.e., a small water distribution network in which pipes are subject to the same shocks and the same hydraulic pressure. In fact, the block replacement policy is mainly developed in the literature of the water resource domain (Shamir & Howard 1979, Kleiner & Rajani 1999, Kanakoudis & Tolikas 2001, Kleiner & Rajani 2010, Nguyen 2014), and is usually used by the Nha Be Water Supply Company¹, a major water supply company for Districts 4, 7, and Nha Be of Ho Chi Minh City, Vietnam.

Let Cb be the cost associated with a pipe repair event and let Cr be the preventive renovation cost. The objective of block replacement is to determine the value of T in order to minimize the expected life cycle cost per unit time, C(T):

C(T) = (Cb N(T) + Cr) / T   (1)

where N(T) is the expected number of pipe-break repair events during the time period (0, T]. The details of N(T) will be described in the next section.

2.1 Pipe break prediction model

As pipe break prediction is one of the most important aspects of water network management, numerous studies have addressed this issue in the literature; see (Kleiner & Rajani 2001), for example, for an overview of statistical breakage models. Most pipe break prediction models are deduced from the maintenance records of pipe repair data, thus they could be considered as pipe-break repair event models.

Shamir & Howard (1979) and Kleiner & Rajani (1999) considered deterministic exponential and linear models for the annual breakages. Kanakoudis & Tolikas (2001) used an exponential expression of breakage numbers to perform economic analysis of the pipe replacement policy. In (Dandy & Engelhardt 2001), the authors performed a linear regression of breakage data to predict pipe breaks for the scheduling of pipe replacement with genetic algorithms. However, it is not easy to collect enough data required to establish a deterministic model. Therefore, probabilistic models have been proposed to model pipe breakage in water distribution networks. Among them, the Weibull-exponential distribution has been widely used in statistical models to describe the interval time between pipe installation and the first pipe break or between two successive breaks (Eisenbeis 1994, Mailhot et al. 2000). Le Gat & Eisenbeis (2000) used the Weibull proportional hazard model to characterize the distribution of times to failure and showed that short maintenance records (5-10 years) could give as good results as long maintenance records. Kleiner & Rajani (2010) developed a non-homogeneous Poisson model that allows us to take

¹ According to the Company's breakage repair reports.
into account pipe-dependent, time-dependent, and pipe-and-time-dependent breakages, to represent the probability of breakage in individual water pipelines. Renaud et al. (2012) presented a break prediction tool, the Casses freeware, which is based on a counting process that relies not only on the pipe age and previous breakages, but also on the pipe's characteristics and the environment.

Focusing on the impact of cost uncertainty on the periodic replacement optimization, we only consider in this paper a small DMA in which the environment has the same characteristics. Hence, we propose using a non-homogeneous counting process to model the pipe break repairs in time. In detail, let the non-homogeneous Poisson process {N(t), t ≥ 0} characterize the number of pipe break repairs during the interval (0, t]; the expected value N(t) is given by:

N(t) = ∫0^t w(x) dx   (2)

where w(x) is called the Rate Of Occurrence Of Failures (ROOF). Considering an increasing ROOF, w(x) is assumed to follow:

Case 1: exponential expression,

w(x) = α exp(βx), 0 < α, β < ∞; x ≥ 0   (3)

Case 2: Weibull expression,

w(x) = αβ x^(β−1), 0 < α, β < ∞; x ≥ 0   (4)

2.2 Handling cost uncertainty with fuzzy numbers

Depending on the available knowledge, the cost associated with a pipe breakage, Cb, and the preventive renovation cost, Cr, can be modeled by precise values, probability distributions, or the most typical values. In reality, when only little knowledge is available, the costs can be predicted to lie somewhere between a and c, with a preference for b, where of course a < b < c. In this context, it is preferred to extend the representation to using a Triangular Fuzzy Number (TFN), C̃b or C̃r (Zimmermann 2010).

Definition 1. Fuzzy number. Let X be a universal set; then a fuzzy number X̃ is a convex normalized fuzzy set X̃, defined by its membership function μX̃: X → [0, 1], called the grade of membership of x in X̃. This membership function assigns a real number μX̃(x) in the interval [0, 1] to each element x ∈ X. The triangular fuzzy number is also noted X̃ = (x1, x2, x3), where x1, x2, x3 ∈ R and x1 < x2 < x3; the term x2 is the most probable value of X̃, with μX̃(x2) = 1, and the terms x1 and x3 are the lower and upper bounds of the possible area.

Definition 2. Fuzzy measure. Let r be a real number; the possibility, the necessity and the credibility of the event (X̃ ≥ r) are respectively given by (Liu & Liu 2002):

Pos(X̃ ≥ r) = sup{μX̃(x): x ≥ r},
Nec(X̃ ≥ r) = 1 − sup{μX̃(x): x < r},
Cre(X̃ ≥ r) = (1/2)(Pos(X̃ ≥ r) + Nec(X̃ ≥ r))

Considering the example of a triangular fuzzy number X̃ = (x1, x2, x3), its credibility is given by:

Cre(X̃ ≥ x) = 1, if x ≤ x1;
Cre(X̃ ≥ x) = 1 − (x − x1) / (2(x2 − x1)), if x1 < x ≤ x2;
Cre(X̃ ≥ x) = (x3 − x) / (2(x3 − x2)), if x2 < x ≤ x3;
Cre(X̃ ≥ x) = 0, if x > x3.   (5)

Definition 3. Fuzzy arithmetic. Let Z̃ = f(X̃, Ỹ) denote the system characteristic of interest (e.g. steady-state availability), evaluated by a function of fuzzy numbers X̃ and Ỹ; then Z̃ is also a fuzzy number. Following Zadeh's extension principle (Zadeh 1965), the membership function of Z̃ is defined as:

μZ̃(z) = sup{min(μX̃(x), μỸ(y)): z = f(x, y)}   (6)

In practice, the α-cut method is developed to evaluate the membership function of Z̃ = f(X̃, Ỹ).

Definition 4. α-cut set. Given a fuzzy set X̃ in X and any real number α ∈ [0, 1], the α-cut set of X̃, denoted X̃α, is the crisp set X̃α = {x ∈ X: μX̃(x) ≥ α}.

Definition 5. Fuzzy arithmetic with α-cut sets. Let [xLα, xRα] and [yLα, yRα] be respectively the α-cut intervals of X̃ and Ỹ for a given value of α; then the α-cut interval [zLα, zRα] of Z̃ = f(X̃, Ỹ) is defined by:

zLα = min{f(x, y): x ∈ [xLα, xRα], y ∈ [yLα, yRα]},
zRα = max{f(x, y): x ∈ [xLα, xRα], y ∈ [yLα, yRα]}   (7)

Therefore, the membership function μZ̃(z) can be deduced by considering the lower and upper bounds of the α-cuts of Z̃.
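Definitions 1-5 lend themselves to a compact implementation. The sketch below is an illustrative Python rendering of ours (the class and method names are assumptions, not from the paper): a triangular fuzzy number with its α-cuts (Definition 4), positive scaling, and the credibility measure of Definition 2 specialized to the triangular case of Eq. (5).

```python
class TFN:
    """Triangular fuzzy number (x1, x2, x3), Definition 1 (assumes x1 < x2 < x3)."""
    def __init__(self, x1, x2, x3):
        assert x1 < x2 < x3
        self.x1, self.x2, self.x3 = x1, x2, x3

    def alpha_cut(self, a):
        """Definition 4: the crisp interval {x : mu(x) >= a}, for 0 <= a <= 1."""
        return (self.x1 + a * (self.x2 - self.x1),
                self.x3 - a * (self.x3 - self.x2))

    def scale(self, n):
        """Positive scaling: n * X (n > 0) is again a TFN."""
        return TFN(n * self.x1, n * self.x2, n * self.x3)

    def cre_geq(self, k):
        """Credibility of the event (X >= k), the triangular case of Eq. (5)."""
        if k <= self.x1:
            return 1.0
        if k <= self.x2:
            return 1.0 - (k - self.x1) / (2.0 * (self.x2 - self.x1))
        if k <= self.x3:
            return (self.x3 - k) / (2.0 * (self.x3 - self.x2))
        return 0.0

    def cre_leq(self, k):
        """Credibility of (X <= k); dual of cre_geq, valid here because the
        triangular membership function is continuous."""
        return 1.0 - self.cre_geq(k)

cb = TFN(232, 525, 734)            # the fuzzy repair cost of Table 2
print(cb.alpha_cut(0.5))           # -> (378.5, 629.5)
print(round(cb.scale(3).cre_leq(2000), 3))   # -> 0.839
```

The last line is the kind of quantity used later in the budget filter: the credibility that the cost of three repairs stays below a hypothetical 2000 $ limit.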
From Eq. (7), the following α-cuts of functions of positive fuzzy numbers can be easily derived:

Z̃ = X̃ + Ỹ; Z̃α: [xLα + yLα, xRα + yRα]   (8)

Z̃ = AX̃ + B; Z̃α: [AxLα + B, AxRα + B]   (9)

where A and B are constants and A, B > 0.

Figure 1. Illustration of the α-cut set of a triangular fuzzy number.

2.3 Handling the annual budget constraint for fuzzy random costs

As the number of pipe breaks N(t) occurring during a certain time (0, t] is a random variable and the cost associated with every pipe break, C̃b, is a fuzzy number, the accumulated pipe-break repair cost during a certain time, N(t)·C̃b, is a fuzzy discrete random variable.

Definition 6. A fuzzy random variable is a random variable taking fuzzy values (Liu 2001). Let (Ω, A, P) be the probability space of a discrete random variable N, and let F be a collection of fuzzy variables X̃. A fuzzy discrete random variable, noted X̃N, is defined by a function from Ω to F such that:

X̃N(x) = X̃n1(x) with probability P(N = n1),
X̃N(x) = X̃n2(x) with probability P(N = n2),
...
X̃N(x) = X̃nm(x) with probability P(N = nm).

Definition 7. Let X̃N be a fuzzy discrete random variable. Then the mean chance that the fuzzy random event (X̃N ≤ B) occurs is given by:

Ch(X̃N ≤ B) = Σn Cre(X̃n ≤ B) P(N = n)   (10)

In the case of a limited annual repair resource, a budget B is allocated to the pipe-break repair cost during a given strategic time unit (i.e., each year). Hence, the manager wants to handle the risk that the annual pipe-break repair cost exceeds the given budget B. That risk is a fuzzy random event, whose mean chance should be lower than a real number r, i.e., Ch(X̃N > B) ≤ r. This is equivalent to:

Ch(X̃N ≤ B) ≥ 1 − r   (11)

Note that, when:
- the pipe-break repair cost is a constant and the annual pipe break number is a random variable, the constraint in Eq. (11) is measured by P(XN ≤ B) ≥ 1 − r, where XN is a random variable representing the annual pipe-break repair cost;
- the pipe-break repair cost is a fuzzy variable and the pipe break number is characterized by its mean value N̄, the constraint in Eq. (11) is measured by Cre(X̃N̄ ≤ B) ≥ 1 − r, where X̃N̄ is a fuzzy variable representing the fuzzy cost associated with the expected annual pipe break number.

3 IDENTIFYING THE OPTIMAL RENOVATION TIME

3.1 Without budget constraint

Let Tl (years) be the life time of the water pipe network; the objective of the problem is to determine the optimal renovation time T* (years), where T* ∈ (0, Tl], that is:

T* = argmin{C̃(T): T ∈ (0, Tl]} = argmin{(C̃b N(T) + C̃r) / T: T ∈ (0, Tl]}   (12)

where N(T) is evaluated corresponding to the two cases of the ROOF w(x) as follows:

N(T) = (α/β)(exp(βT) − 1)   (13)
N(T) = α T^β   (14)

As Eq. (12) cannot be directly solved by an analytical approach, in this paper we propose using a grid search to find T* ∈ (0, Tl]. We evaluate C̃(T) for every T ∈ (0 : 1/12 : Tl], i.e. one month for each search step, and then compare the values to find the minimum. As the repair and/or renovation costs are characterized by TFNs, C̃(T) is also a TFN. To compare TFNs, we propose using the expected value method. For a given TFN X̃ = (x1, x2, x3), a typical model (Liu & Liu 2002) for defining its expected value E[X̃] is given by:

E[X̃] = (1/4)(x1 + 2x2 + x3)   (15)

This expected value coincides with the neutral scalar substitute of a fuzzy interval (Yager 1981). The neutral scalar substitute is among the most natural defuzzification procedures proposed in the literature (Bortolan & Degani 1985). We have: X̃ ≤ Ỹ ⟺ E[X̃] ≤ E[Ỹ].

3.2 With budget constraint

As the ROOF is increasing with time, more pipe breaks occur as time goes by. Recall that T is the preventive renovation year and let [T] be the nearest integer less than or equal to T; then the worst situation may appear during:
- the year preceding the renovation time, ([T] − 1, [T]);
- or the renovation year, ([T], T).

Let N([T] − 1, [T]) and N([T], T) be respectively the pipe break numbers that occur during the year preceding the renovation time and during the renovation year itself. In order to handle the budget constraint, when performing the grid search, we evaluate the following mean chances corresponding to every T:

Ch(N([T] − 1, [T]) · C̃b ≤ B),
Ch(N([T], T) · C̃b ≤ B)   (16)

If the above mean chances are higher than 1 − r, we then evaluate the corresponding life cycle cost per year, C̃(T). Otherwise, this value of T is eliminated from the set of possible solutions of the grid search.

Let C̃b = (cb1, cb2, cb3) be a TFN; the mean chance Ch(·) is evaluated by Eq. (10), in which the probability that the number of pipe breaks in the time interval (t, t + v] is n, denoted P(t,t+v)(n), is given by:

P(t,t+v)(n) = P(N(t + v) − N(t) = n) = [(N(t + v) − N(t))^n / n!] exp(−(N(t + v) − N(t)))   (17)

and, corresponding to each integer n, from Eq. (5), the credibility that the repair cost associated with n pipe breaks is lower than the budget constraint B is given by:

Cre(n·C̃b ≤ B) = 1, if n ≤ B/cb3;
Cre(n·C̃b ≤ B) = 1 − (1/2)(n·cb3 − B)/(n·cb3 − n·cb2), if B/cb3 < n ≤ B/cb2;
Cre(n·C̃b ≤ B) = (1/2)(B − n·cb1)/(n·cb2 − n·cb1), if B/cb2 < n ≤ B/cb1;
Cre(n·C̃b ≤ B) = 0, if n > B/cb1.   (18)

4 EXPERIMENTAL ANALYSIS

4.1 Parameter estimation for pipe repair event models

We have processed the data obtained from the reports on daily pipe break repair activities of the Nha Be Water Supply Company in the period from January 2008 to September 2015. From this data set, we specifically select two DMAs, namely Tan My Street and Cu xa Ngan Hang, to apply the proposed models. These DMAs have homogeneous pipes across the areas and, on the other hand, present sufficient data for the estimation. Their information is provided in Table 1. We see that most of the breakages occurred on branch pipes: 92.6% for DMA I and 98.7% for DMA II. The parameters α, β of the repair event models in Eqs. (3) and (4) for branch pipes are estimated as follows.

Let si and n be the occurrence time of the i-th pipe-break repair event and the number of repair events during the observed period To. Then the maximum likelihood estimates α̂ and β̂ of α and β, respectively, in Eq. (3) are obtained by solving the following equations (Rausand & Høyland 2004):
Table 1. Data of the selected District Metering Areas (DMAs).

                     DMA I (Tan My Street)    DMA II (Cu xa Ngan Hang)
Pipe type            Main      Branch         Main      Branch
Material             uPVC      PE             uPVC      PE
Installation year    2000      2000           2003      2003
Total length (m)     6355      4610           4712      3025
Number of breaks     6         75             1         67

Σ(i=1..n) si + n/β̂ − n·To / (1 − exp(−β̂·To)) = 0   (19)

α̂ = n·β̂ / (exp(β̂·To) − 1)   (20)

Similarly, the maximum likelihood estimates α̂ and β̂ of α and β, respectively, in Eq. (4) are given by:

β̂ = n / (n ln To − Σ(i=1..n) ln si)   (21)

α̂ = n / To^β̂   (22)

Figure 2. Accumulated number of pipe repair events from January 2008 to September 2015.

Considering Figure 2, we observe that both models (the exponential and the Weibull ROOF) are appropriate for the repair data of both DMAs from January 2008 to September 2015. Among them, the coefficient of determination R² of the Weibull model is higher than that of the exponential model. Hereafter, the Weibull model is chosen to characterize the counting process of pipe-repair events for both DMAs.

4.2 Impact of fuzzy costs on the optimal renovation time without budget constraint

The pipe repair or replacement cost depends on the pipe material/diameter and especially on the road type, such as alley or road/route, asphalt or dirt road. In addition, the variation of the pipe-break detection time and maintenance time can lead to different damage costs, including water loss, disruption in service, and so on. Therefore, it is difficult to expect a precise value of the renovation cost for the overall DMA or of the cost associated with only pipe repair activities. Classical approaches normally use the most probable value in the calculation and optimization. In this paper, we employ TFNs, C̃r = (cr1, cr2, cr3) and C̃b = (cb1, cb2, cb3), to solve the problem.

Table 2. Values of the fuzzy costs.

C̃b ($): (232, 525, 734)
C̃r ($): DMA I (300000, 375000, 450000); DMA II (300000, 320000, 450000)

On the other hand, the risk that the annual cost associated with repair events exceeds $20,000 is recommended to be lower than 10%. In this section, we will study how fuzzy costs impact the optimal decision in the case without this budget constraint and in the case with this budget constraint.

4.2.1 Optimal renovation time
The detailed values of the fuzzy costs are presented in Table 2. As it is generally recommended that the life time of an uPVC main pipe should not exceed 25 years, we run the grid search over the interval (0, 25] years with a step of one month to find the optimal renovation time for both DMAs. Four cases will be examined:
- Case A: not considering fuzzy costs and budget constraint;
- Case B: considering fuzzy costs but not taking into account the budget constraint;
- Case C: not considering fuzzy costs but taking into account the budget constraint;
- Case D: considering both fuzzy costs and the budget constraint.

Table 3 presents the optimal renovation time corresponding to the above cases for both DMAs. In detail, if we do not take into account the budget constraint, the optimal renovation time in Case B
Table 3. Optimal renovation time (years after installation).

DMA    Case A    Case B    Case C    Case D
I      19        19.5      18.83     17.83
II     16.83     17        14.92     13.92

is longer than that of Case A. Indeed, with fuzzy costs, the renovation time of DMA I is postponed by six months from the renovation time of 19 years obtained when using precise values.

However, if we consider the budget constraint, the renovation time when using fuzzy numbers is earlier than that obtained with only the most probable value. For instance, the optimal decisions obtained say that DMA II needs to be renovated after 14 years and 11 months from its installation, and this renovation time is one year earlier if we evaluate with fuzzy numbers.

In the next part, we will focus on DMA I to study in depth the impact of fuzzy costs on the optimal decision. We also assess whether the manager can obtain a real benefit from taking fuzzy costs into account.

4.2.2 Improvement factor when using fuzzy numbers
In this section, we present a parameter, called the improvement factor, in order to assess whether using fuzzy numbers brings more benefit or not. Firstly, we find the renovation times for the cases with and without the budget constraint, using the most probable value or the fuzzy number (Ti*, i ∈ {A, B, C, D}). Then, we sample 1000 values of the cost, which is predicted in the interval [a, c] with the most probable value b, and evaluate the life cycle cost per year C(Ti*), with i ∈ {A, B, C, D}, corresponding to every sample of the cost. Finally, the improvement factor f(i,j), which corresponds to every cost sample, is evaluated by the following equation:

f(i,j) = (C(Ti*) − C(Tj*)) / C(Ti*)

where C(T) is evaluated by Eq. (1) and Ti* is the optimal renovation time corresponding to case i, with i, j ∈ {A, B, C, D}.

Figure 3. Impact of cb2 on the optimal renovation time (T*).

Figure 4. Distribution of the improvement factor according to cb2.

Impact of the most probable value of C̃b: The renovation cost is assumed to be a precise value. The cost associated with a pipe break event is characterized by a TFN (cb1, cb2, cb3) where cb1 and cb3 are fixed while cb2 varies from cb1 to cb3. Figure 3 presents the impact of the position of the most probable value in the possible area of C̃b on the optimal renovation time. As the corresponding expected cost increases when cb2 goes up, the optimal renovation time is accelerated in all four Cases A, B, C, and D.

When the budget constraint is not considered, we find that the renovation time of Case B (fuzzy cost) is sooner than that of Case A (precise cost) if cb2 < (cb1 + cb3)/2. On the contrary, the renovation time is postponed when considering the fuzzy cost if cb2 > (cb1 + cb3)/2. They coincide if cb2 = (cb1 + cb3)/2.

These T* adjustments help the manager reduce the annual life cycle cost. Indeed, considering Figure 4a, the distribution of the improvement factor f(A,B) shows that using fuzzy numbers helps us save 0 to 2.5% or 0 to 10% (on average) of the
expected annual life cycle cost when the most probable value moves from the middle point to the left end point or the right end point of the possible area.

When considering the budget constraint, we find that the renovation time of Case D (fuzzy cost) is sooner than that of Case C (precise cost) if cb2 < $600. The gap is larger when the most probable value cb2 approaches the left end of the possible value interval. When cb2 > $600, the optimal renovation times in Cases C and D are almost the same. If the right end of the possible value interval is also the most probable value, the renovation time when considering the fuzzy cost is slightly postponed compared to that of the case without fuzzy cost. These T* adjustments help the manager reduce the risk that the annual cost associated with pipe repair events exceeds the budget B. However, the annual life cycle cost slightly increases when using fuzzy numbers in this case (Figure 4b). It is necessary to balance the satisfaction degree of the budget constraint and the reduction in the annual life cycle cost.

Figure 5. Impact of cr2 on the optimal renovation time (T*).

Figure 6. Distribution of the improvement factor according to cr2 when the budget constraint is not considered.

Impact of the most probable value of C̃r: Considering a precise value of the cost associated with a pipe break event, the renovation cost is characterized by a TFN (cr1, cr2, cr3) where cr1 and cr3 are fixed while cr2 varies. Figure 5 presents the impact of the position of the most probable value in the possible area of C̃r on the optimal renovation time. When cr2 goes up, the optimal renovation time is postponed in all four cases A, B, C, and D because the corresponding expected renovation cost is increasing.

When the budget constraint is not considered, we find that the renovation time of Case B (fuzzy number) is later than that of Case A (precise cost) if cr2 < (cr1 + cr3)/2. On the contrary, the renovation time is accelerated when considering the fuzzy cost if cr2 > (cr1 + cr3)/2. They coincide if cr2 = (cr1 + cr3)/2. These adjustments of the renovation time help the manager reduce the annual life cycle cost. Indeed, considering the fuzzy number C̃r, the manager can save 0 to 0.2%, or 0 to 0.15% (on average), of the expected annual life cycle cost when the most probable value changes from the middle point to the left end point or the right end point of the possible area (Figure 6).

Considering the budget constraint, as only the cost associated with a pipe break, Cb, affects the risk of violating the budget constraint, we find that the renovation time of Case D (fuzzy cost) is equal to that of Case C (precise cost) in most cases. Only for the case where cr2 = cr1 = $300,000 is the renovation time sooner. However, it is not caused by the impact of fuzzy numbers but by the description of the optimal decision in the case without a budget constraint.

5 CONCLUSIONS

We have examined an optimal periodic replacement policy in which repair and replacement costs are not precise. The model was used to optimize the strategy for water pipe renovation. First, a pipe-break repair event model was constructed from real maintenance records of the Nha Be Water Supply Company. Based on this model, we considered the impact of fuzzy costs on the optimal decision and also highlighted the real benefit of using fuzzy numbers. It was shown that, without budget constraints, the use of fuzzy numbers helps reduce the life cycle cost per year. When budget constraints are taken into account, it is necessary to weigh the degree of satisfying the budget constraints against the augmentation of annual life cycle costs.

In future work, hydraulic constraints may be considered in the optimization of maintenance policies for water pipe networks. Moreover, the impact of the spread of fuzzy numbers, i.e. the degree of cost uncertainty, on the optimal decision will be studied. In that case, a distance-based ranking method, such as the Bertoluzza measure, can be used to compare fuzzy life cycle costs in order to find the optimal renovation time.
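Pulling Sections 2-4 together, the decision procedure (the Weibull break model of Eq. (14), TFN costs, the monthly grid search with the expected-value ranking of Eq. (15), and the budget filter of Eqs. (16)-(18) through the mean chance of Eq. (10)) can be sketched in a few dozen lines. All numeric parameter values below (alpha, beta, the budget, and the risk level r) are illustrative assumptions of ours, not the case-study estimates:

```python
# End-to-end sketch of the renovation-time search of Section 3: Weibull ROOF,
# TFN costs, monthly grid, budget filter via the mean chance, expected-value
# ranking. alpha, beta, budget and r are toy values, not estimated ones.
import math

alpha, beta = 0.08, 1.9            # power-law model N(T) = alpha * T**beta
cb = (232, 525, 734)               # fuzzy repair cost (Table 2)
cr = (300000, 375000, 450000)      # fuzzy renovation cost (Table 2, DMA I)
budget, r = 20000, 0.10            # annual repair budget and risk level

def n_expected(t):                 # Eq. (14)
    return alpha * t ** beta

def expected_value(tfn):           # Eq. (15): E[X] = (x1 + 2*x2 + x3) / 4
    return (tfn[0] + 2 * tfn[1] + tfn[2]) / 4.0

def cost_rate(t):                  # fuzzy C(T) of Eq. (12), as a TFN
    n = n_expected(t)
    return tuple((cb[i] * n + cr[i]) / t for i in range(3))

def cre_leq(tfn, k):               # credibility that a TFN is <= k, cf. Eq. (18)
    x1, x2, x3 = tfn
    if k >= x3: return 1.0
    if k >= x2: return 1.0 - (x3 - k) / (2.0 * (x3 - x2))
    if k >= x1: return (k - x1) / (2.0 * (x2 - x1))
    return 0.0

def chance_ok(t):                  # budget filter, Eqs. (16)-(17): worst years
    lo = max(int(t) - 1, 0)
    for a, b in ((lo, min(lo + 1, t)), (int(t), t)):
        m = n_expected(b) - n_expected(a)        # mean breaks in that year
        ch = sum(math.exp(-m) * m ** n / math.factorial(n)
                 * (1.0 if n == 0 else cre_leq(tuple(n * c for c in cb), budget))
                 for n in range(120))            # mean chance, Eq. (10)
        if ch < 1.0 - r:
            return False
    return True

months = [k / 12.0 for k in range(1, 25 * 12 + 1)]
best = min((t for t in months if chance_ok(t)),
           key=lambda t: expected_value(cost_rate(t)))
print(round(best, 2))              # -> 25.0 with these toy parameters
```

With these illustrative parameters the budget constraint never binds and the expected cost rate is still decreasing at the 25-year cap, so the search returns the cap; with the estimated case-study parameters, interior optima such as those of Table 3 appear instead.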



Monte Carlo methods for parallel computing of reliability and risk



Applied Mathematics in Engineering and Reliability – Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

Acceleration of multi-factor Merton model Monte Carlo simulation via Importance Sampling and GPU parallelization

M. Béreš & R. Briš
VŠB–Technical University of Ostrava, Ostrava, Czech Republic

ABSTRACT: Credit risk refers to the risk of losses due to unexpected credit events, such as the default of a counterparty. The modelling and controlling of credit risk is a very important topic within banks. Very popular and frequently used tools for modelling credit risk are multi-factor Merton models. Practical implementation of these models requires time-consuming Monte Carlo (MC) simulations, which significantly limits their usability in daily credit risk calculation. In this paper we present acceleration techniques for Merton model Monte Carlo simulations, namely a parallel GPU implementation and the employment of Importance Sampling (IS). As the importance sampling distribution we choose the Gaussian mixture model, and for calculating the IS shifted probability distribution we use the Cross-Entropy (CE) method. The speed-up results are demonstrated using portfolio Value at Risk (VaR) and Expected Shortfall (ES) calculation.

1 INTRODUCTION

In this paper we present a new approach to Importance Sampling (IS) in the multi-factor Merton model. In the standard IS approach the normal distribution is used as the family of IS distributions. This approach results in a decent variance reduction, but a certain level of degeneracy of probability can be observed. The observed degeneracy of probability is caused by a relatively high difference between the IS distribution chosen from the normal distribution family and the optimal IS distribution, and it also limits the achievable variance reduction. As a correction to this problem we use the Gaussian mixture model for the IS family of distributions. This new approach limits the level of the observed degeneracy of probability as well as increases the variance reduction.

The other significant part of this paper is the implementation of the discussed models and IS procedures via CUDA on GPU devices. The GPU implementation of the model enables very fast calculation of the observed parameters (VaR or ES) with or without the use of the IS.

First we present a short recapitulation of the multi-factor Merton model and the terminology used, then we state a detailed specification of the tested model. For a deeper understanding of the Merton model see (Lütkebohmert 2008).

1.1 Briefly about multi-factor Merton model

Let us assume we have a portfolio of N risky loans (exposures) indexed by n = 1, …, N. We are interested in the possible defaults, which can occur in the fixed time interval [0, T]. Let D_n denote the default indicator of an exposure n, which can be represented as a Bernoulli random variable taking the values

  D_n = { 1, if the exposure n is in default; 0, otherwise }.  (1)

We assume that the probabilities PD_n = P(D_n = 1) are given as a portfolio parameter.

The portion of the exposure n which can be lost at the time of default is called the exposure at default, denoted by EAD_n. For simplicity we assume EAD_n is constant in the whole time interval [0, T] and is given as a portfolio parameter. The portion of EAD_n representing the real loss in the case of default is given by a random variable called loss given default, LGD_n ∈ [0, 1]. The distribution, the expectation ELGD_n and the standard deviation VLGD_n of LGD_n are given as portfolio parameters. The portfolio loss L_N is then defined as a random variable



  L_N = Σ_{n=1}^{N} EAD_n · LGD_n · D_n.  (2)

Now we can define the Value at Risk (VaR) as the p-quantile (or confidence level) of L_N

  VaR_p(L_N) = inf{ x ∈ R : P(L_N > x) ≤ 1 − p } = inf{ x ∈ R : F_{L_N}(x) ≥ p },  (3)

where F_{L_N}(x) is the cumulative distribution function of L_N, and the Expected Shortfall (ES) as a conditional tail expectation with the condition x ≥ VaR_p(L_N)

  ES_p(L_N) = (1/(1 − p)) ∫_{VaR_p(L_N)}^{∞} x dF_{L_N}(x) = (1/(1 − p)) ∫_p^1 VaR_u(L_N) du.  (4)

1.1.1 Exposure correlation factors
In a realistic portfolio the defaults of single exposures are correlated; let us outline how the correlation is handled in the Merton model. We assume that every exposure has a unique owner (obligor). Let V_n(t) denote the n-th obligor's assets, S_n(t) the obligor's equity and B_n(t) the obligor's bond, so that

  V_n(t) = S_n(t) + B_n(t), 0 ≤ t ≤ T.  (5)

In the Merton model a default can occur only at the maturity T, which leads to two possibilities:
1. V_n(T) > B_n(T): the obligor has sufficient assets to fulfil the debt, D_n = 0.
2. V_n(T) ≤ B_n(T): the obligor cannot fulfil the debt and defaults, D_n = 1.

Let r_n denote the n-th obligor's asset-value log-return, r_n = ln(V_n(T)/V_n(0)). The multi-factor Merton model assumptions to resolve the correlations between exposure defaults are:
1. r_n depends linearly on K standard normally distributed risk (systemic) factors X = (X_1, …, X_K).
2. r_n depends linearly on the standard normally distributed idiosyncratic term ε_n, which is independent of the systemic factors X_k.
3. The single idiosyncratic factors ε_n are uncorrelated.
4. The asset-value log-return random variable can be represented as r_n = ρ_n Y_n + √(1 − ρ_n²) ε_n, where Y_n = Σ_{k=1}^{K} α_{n,k} X_k represents the exposure composite factor, ρ_n represents the exposure sensitivity to the systemic risk and the weights α_{n,k} represent the dependence on the single factors X_k.
5. r_n has the standard normal distribution if the condition Σ_{k=1}^{K} α_{n,k}² = 1 is satisfied.

The variables α_{n,k} and ρ_n are assumed to be given portfolio parameters. When PD_n is given and r_n has the standard normal distribution, one can calculate the threshold c_n = Φ^{−1}(1 − PD_n) so that the default indicator can be represented as

  D_n = 1(r_n > c_n).  (6)

1.1.2 Monte Carlo simulation of multi-factor Merton model
With the previous knowledge and a full portfolio specification we can now approximate the portfolio VaR and ES via Monte Carlo simulations. Single exposure defaults can be directly calculated from the systemic and the idiosyncratic shocks X_k^{(i)} and ε_n^{(i)} drawn from the standard normal distribution N(0, 1); the upper index (i) indicates the index of the Monte Carlo sample. With the generated random LGD_n^{(i)} we can calculate the total random scenario loss

  L_N^{(i)} = Σ_{n=1}^{N} EAD_n LGD_n^{(i)} D_n^{(i)}.  (7)

The Monte Carlo simulation consisting of M trials approximates the portfolio VaR as

  VaR_p(L_N) ≈ min{ L_N^{(i)} : #(L_N^{(i)}) ≥ ⌈pM⌉ } = L_N^{[⌈Mp⌉]},  (8)

where #(L_N^{(i)}) = Σ_{j=1}^{M} 1(L_N^{(j)} ≤ L_N^{(i)}) and L_N^{[j]} is the j-th loss in the ascending sorted loss sequence L_N^{(i)}, and the ES as

  ES_p(L_N) ≈ (1/(M − ⌈Mp⌉ + 1)) Σ_{j=⌈Mp⌉}^{M} L_N^{[j]}.  (9)

1.2 Tested portfolio structure specification

The most important part of the multi-factor Merton model is the structure of the portfolio (the dependence of the exposures on the risk factors). To obtain a portfolio with realistic behaviour we use a natural risk factor construction considering the region-industry (sector) links and the direct (hierarchy) links between exposures.

Hierarchy links are represented by Hierarchy Systemic Factors (HSF), which can be interpreted as direct links between the exposures (for example two subsidiary companies with a common parent company); each of these systemic factors usually has an impact only on a small fraction of the portfolio exposures.
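The plain MC procedure of Section 1.1.2 can be sketched as follows. This is a simplified illustration only — a single common systemic factor instead of the full factor structure, a constant LGD, and hypothetical parameter values, cf. Eqs. (6)–(9):

```python
import numpy as np
from statistics import NormalDist

def mc_var_es(ead, pd, rho, lgd, p, M, seed=0):
    """Plain MC estimate of VaR_p and ES_p of L_N = sum_n EAD_n*LGD_n*D_n.
    Sketch: one systemic factor, r_n = rho_n*X + sqrt(1-rho_n^2)*eps_n,
    default iff r_n > c_n with c_n = Phi^{-1}(1 - PD_n), cf. Eqs. (6)-(9)."""
    rng = np.random.default_rng(seed)
    c = np.array([NormalDist().inv_cdf(1.0 - q) for q in pd])  # thresholds c_n
    X = rng.standard_normal(M)                # one systemic shock per scenario
    eps = rng.standard_normal((M, len(ead)))  # idiosyncratic shocks
    r = rho * X[:, None] + np.sqrt(1.0 - rho**2) * eps  # asset-value log-returns
    L = ((r > c) * (ead * lgd)).sum(axis=1)   # scenario losses, Eq. (7)
    L_sorted = np.sort(L)
    k = int(np.ceil(p * M)) - 1               # 0-based index of the ceil(Mp)-th loss
    return L_sorted[k], L_sorted[k:].mean()   # VaR, Eq. (8); tail-average ES, Eq. (9)

ead = np.full(50, 1.0 / 50)   # normalized EADs, Eq. (14)
pd = np.full(50, 0.02)        # default probabilities
rho = np.full(50, 0.4)        # systemic sensitivities
var, es = mc_var_es(ead, pd, rho, lgd=0.6, p=0.995, M=20000)
print(var <= es <= 0.6)       # ES dominates VaR; losses bounded by total EAD*LGD
```

As the paper notes in Section 2, at confidence levels near 1 only a tiny fraction of these M scenarios lands in the tail, which is what motivates the IS machinery below.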



Sector links are represented by Sector Systemic Factors (SSF), which can be interpreted as industrial and regional factors; each of these systemic factors usually impacts the majority of the portfolio exposures. Therefore every exposure's asset-value log-return random variable r_n depends on two composite factors H_n (hierarchy composite factor) and S_n (sector composite factor) according to the following formulas:

  r̃_n = g_n H_n + √(1 − g_n²) ε_n,  (10)

  r_n = √(1 − ν_n²) S_n + ν_n r̃_n,  (11)

where H_n is the composite factor of the hierarchy correlation risk factors (HSF), g_n ∈ (0, 1) is the group correlation coefficient with the composite HSF, S_n is the composite factor of the sector correlation risk factors (SSF), ν_n ∈ (0, 1) is the idiosyncratic weight towards the composite SSF and ε_n is the exposure idiosyncratic factor.
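The two-level composition (10)–(11) can be sketched as follows; the factor draws and coefficients are illustrative stand-ins, and the point checked is only that the resulting r_n keeps unit variance (consistent with assumption 5 of Section 1.1.1):

```python
import numpy as np

def exposure_log_return(H_n, S_n, eps_n, g_n, nu_n):
    """Two-level mixing of Eqs. (10)-(11): first mix the hierarchy composite
    factor with the idiosyncratic shock, then mix in the sector composite
    factor. Inputs are standard-normal draws and coefficients in (0, 1)."""
    r_tilde = g_n * H_n + np.sqrt(1.0 - g_n**2) * eps_n   # Eq. (10)
    return np.sqrt(1.0 - nu_n**2) * S_n + nu_n * r_tilde  # Eq. (11)

rng = np.random.default_rng(1)
M = 200_000
H, S, eps = rng.standard_normal((3, M))  # independent composite/idiosyncratic shocks
r = exposure_log_return(H, S, eps, g_n=0.6, nu_n=0.5)
print(abs(r.var() - 1.0) < 0.02)         # r_n remains standard normal
```

Since each mixing step preserves unit variance, the nested construction (tree-recursive HSF per (12), correlated SSF per (13)) also stays within the standard α_{n,k}, ρ_n form described below.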
Let K_S denote the number of SSF and K_H denote the number of HSF. We assume that there are corresponding K_S sector composite factors and K_H hierarchy composite factors. The links (correlation) between the single composite factors are represented differently for HSF and SSF.

In the case of HSF we assume the links between the systemic factors take the form of a dependence tree structure. Let H_(1), …, H_(K_H) denote the unique composite factors of HSF corresponding to K_H. The composite factors are ordered according to a given tree structure and their calculation is given recursively, where every node H_(k) has at most one parent H_(p(k)) and a specified correlation coefficient g_k^H, see formula (12):

  H_(k) = { g_k^H H_(p(k)) + √(1 − (g_k^H)²) η_k^H, if p(k) ≠ ∅; η_k^H, if p(k) = ∅ },  (12)

where η_k^H denotes the idiosyncratic term for the HSF k and p(k) is the parent mapping function. An example of calculating the HSF composite factors can be seen in Figure 1.

Figure 1. Example of group correlation tree.

In the case of SSF we assume the links between the systemic factors take the form of a full correlation matrix. Let S_(1), …, S_(K_S) denote the unique composite factors of SSF; they are defined by a given correlation matrix Σ and are calculated as

  (S_(1), …, S_(K_S))^T = Σ^{1/2} (η_1^S, …, η_{K_S}^S)^T,  (13)

where η_k^S denotes the idiosyncratic term for the SSF k.

All of the aforementioned parameters g_n, ν_n, g_k^H, the HSF tree structure and the correlation matrix are given as portfolio parameters and can be interpreted in the standard form of the α_{n,k} and ρ_n parameters, where Σ_{k=1}^{K} α_{n,k}² = 1 is satisfied. For the tested model the LGDs are considered from the Beta distribution with the mean and the standard deviation given by portfolio parameters.

For better illustration, in the further text we will use normalized EAD_n:

  Σ_{n=1}^{N} EAD_n = 1,  (14)

which expresses EAD_n as a portion of the total portfolio exposure.

2 EMPLOYING IMPORTANCE SAMPLING

As mentioned before, we are interested in the VaR and the ES of the observed portfolio loss random variable L_N. The Monte Carlo approximation of these values is highly sensitive to the stated confidence level p, which is usually very close to 1. In our study we use the confidence levels 0.99995, 0.9995 and 0.995. For example, when the confidence level is 0.99995, an MC simulation of 10^6 samples provides only 50 samples with information about VaR/ES.

One of the straightforward ways to increase the number of samples in the region of the VaR/ES calculation is to change the distribution of the portfolio loss random variable — the so-called Importance Sampling (IS) method. The principle of the IS can be easily demonstrated on the ES calculation. The ES can be represented as the conditional mean or the mean of the specific function

  H_p(x) = { 0, x < VaR_p(L_N); x/(1 − p), x ≥ VaR_p(L_N) },  (15)

  ES_p(L_N) = E_f(H_p(L_N)) = ∫_Ω H_p(L_N(y)) f(y) dy,  (16)



where y are the values of the random vector Y of all random variables contributing to L_N (idiosyncratic terms, LGDs), Ω is the set of all possible values of y, f(y) is the joint probability density function of Y, L_N(y) is the function mapping y to the corresponding value of L_N and E_f is the mean under the pdf f(y). If we use the IS with the new probability distribution of L_N given by the pdf g(y), we can calculate the original ES as

  E_g( H_p(L_N(Y)) f(Y)/g(Y) ) = ∫_Ω H_p(L_N(y)) (f(y)/g(y)) g(y) dy = E_f(H_p(L_N)) = ES_p(L_N).  (17)

The ratio of the probability density functions w(y) := f(y)/g(y) is called the likelihood ratio (LR). From formula (17) we can see the natural requirement on g(y): H_p(L_N(y)) f(y) > 0 ⟹ g(y) > 0. Formula (17) also provides the MC estimation of the ES when using the IS

  ES_p^g(L_N) = (1/M) Σ_{i=1}^{M} H_p(L_N(Y_i)) w(Y_i) = Σ_{i=1}^{M} L_N(Y_i) 1(L_N(Y_i) ≥ VaR_p^g(L_N)) w(Y_i) / (M (1 − p)),  (18)

where Y_i is the i-th sample of Y ∼ g(y) and M is the number of random samples. It remains to define VaR_p^g(L_N) as

  VaR_p^g(L_N) = min{ L_N(Y_i) : W(Y_i) ≥ pM },  (19)

where W(Y_i) := Σ_{j=1}^{M} w(Y_j) 1(L_N(Y_j) ≤ L_N(Y_i)).
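The self-normalized IS estimators (18)–(19) can be sketched as a generic routine on precomputed loss and likelihood-ratio arrays. The sanity check below uses synthetic exponential losses with unit weights (so the estimators must reduce to the plain MC ones), not the credit model itself:

```python
import numpy as np

def is_var_es(losses, weights, p):
    """Self-normalized IS estimators, cf. Eqs. (18)-(19): sort the scenario
    losses, accumulate the likelihood-ratio weights w(Y_i) = f(Y_i)/g(Y_i),
    take VaR as the smallest loss whose cumulative weight reaches p*M, and
    average the weighted tail for the ES."""
    M = len(losses)
    order = np.argsort(losses)
    L, w = losses[order], weights[order]
    cum = np.cumsum(w)
    var = L[np.searchsorted(cum, p * M)]              # Eq. (19)
    tail = L >= var
    es = (L[tail] * w[tail]).sum() / (M * (1.0 - p))  # Eq. (18)
    return var, es

# sanity check: with w == 1 the IS estimators reduce to plain MC estimators
rng = np.random.default_rng(2)
L = rng.exponential(size=100_000)
var, es = is_var_es(L, np.ones_like(L), 0.99)
print(abs(var - np.log(100)) < 0.15, es > var)   # exp(1) 0.99-quantile is ln(100)
```

With a genuine IS distribution g, more samples fall in the tail and the weights w(Y_i) < 1 there compensate, which is where the variance reduction comes from.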

2.1 Cross-Entropy method

We already know the principles of the IS and have the IS estimators of VaR and ES, but the new IS pdf g(y) is still unknown. The most straightforward method for the estimation of g(y) is to minimize the variance of the ES IS estimator:

  g(y) = argmin_{v(y) ∈ X} S_v²( H_p(L_N(Y)) f(Y)/v(Y) ),  (20)

where S_v²(·) denotes the variance according to the pdf v(y) and X is an arbitrary system of pdfs fulfilling the condition v(y): H_p(L_N(y)) f(y) > 0 ⟹ v(y) > 0. This approach is called the Variance Minimization (VM) method. Usually the VM method leads to very difficult problems, which have to be solved numerically.

Another approach to obtain the IS pdf g(y) is the Cross-Entropy (CE) method. The CE method, similarly to the VM method, solves a minimization problem, but instead of minimizing the variance it minimizes the Kullback-Leibler (KL) divergence D(g*, v) with the optimal (zero variance) IS distribution

  g*(y) = H_p(L_N(y)) f(y) / E_f(H_p(L_N)),  (21)

  g(y) = argmin_{v(y) ∈ X} D(g*(y), v(y)) = argmin_{v(y) ∈ X} ∫_Ω g*(y) ln( g*(y)/v(y) ) dy = argmin_{v(y) ∈ X} { −∫_Ω H_p(L_N(y)) f(y) ln v(y) dy }.  (22)

To obtain a solvable problem, we need to add some constraint to the system of pdfs X. The usual choice is a parametrized family of pdfs:

  X := { v(x; θ), θ ∈ Θ },  (23)

where v(x; θ) is a pdf taking a vector of parameters θ and Θ := { θ : H_p(L_N(y)) f(y) > 0 ⟹ v(y; θ) > 0 }. The obtained minimization problem is usually concave, therefore we can replace the optimization problem with the following equation

  θ* : ∫_Ω H_p(L_N(y)) f(y) ∇_θ ln v(y; θ) dy = 0.  (24)

To solve the problem (24) we use the Monte Carlo simulation:

  Σ_{i=1}^{M} H_p(L_N(Y_i)) ∇_θ ln v(Y_i; θ) = 0,  (25)

which is called the Stochastic Counterpart (SC) of the problem (24). Note that (25) is usually a system of non-linear equations, but for some pdfs it results in an explicit solution.

In this paper we focus mainly on the IS of the idiosyncratic terms of the systemic factors (HSF and SSF). Therefore, to simplify the notation, the random vector Y of all random variables contributing to L_N will be in the further text understood as a vector of K_S + K_H independent standard normal random variables. LGDs and other random variables will still be part of Y, but the IS will not affect them.

Now if we consider X as a system of K_S + K_H independent normally distributed random variables parametrized by the mean and the variance, we get the following solution of the problem (25):

  μ̂_j = Σ_{i=1}^{M} H_p(L_N(Y_i)) (Y_i)_j / Σ_{i=1}^{M} H_p(L_N(Y_i)), ∀j,  (26)

  σ̂_j² = Σ_{i=1}^{M} H_p(L_N(Y_i)) ((Y_i)_j − μ̂_j)² / Σ_{i=1}^{M} H_p(L_N(Y_i)), ∀j,  (27)

where μ̂_j and σ̂_j² are the SC approximations of the mean and the variance of the j-th component of Y and (Y_i)_j is the j-th component of the i-th MC sample.
2.2 Gaussian mixture model

At the end of the previous part we presented the formulas for calculating the optimal IS distribution in the family of normal distributions. This approach is commonly used for the IS in the multi-factor Merton model, see for example (Glasserman & Li 2005). The choice of the normal distributions as the IS family is not always optimal and can be improved by a more complex IS family of distributions.

The IS family of distributions examined in this paper is the family of the Gaussian mixture distributions; the same approach in a different application can be found in (Kurtz & Song 2013). The Gaussian mixture random variable is defined as a weighted sum of different normal random variables. The pdf of the Gaussian mixture random variable can be expressed as

  g(x; p, μ, σ) = Σ_{i=1}^{n} p_i f_N(x; μ_i, σ_i),  (28)

where f_N(x; μ_i, σ_i) is the pdf of the normal distribution with the mean μ_i and the variance σ_i² and ‖p‖_1 = Σ_{i=1}^{n} p_i = 1. The new IS Gaussian mixture joint pdf of Y will be

  g_Y(x; p, μ, σ) = Π_{j=1}^{K_S + K_H} g(x_j; p_j, μ_j, σ_j),  (29)

where p, μ, σ are matrices of K_S + K_H columns of parameters p_j, μ_j, σ_j. Therefore the system of pdfs for the IS is

  X := { g_Y(x; p, μ, σ) : ‖p_j‖_1 = 1, p_{j,i} > 0 }.  (30)

Because the support of the pdf of the normal distribution is R, the condition f(x) > 0 ⟹ g_Y(x; p, μ, σ) > 0 is fulfilled. Since the components of g_Y(x; p, μ, σ) are independent, the problem (24) reduces into K_S + K_H systems of non-linear equations. Therefore, together with the condition ‖p_j‖_1 = 1, we receive for all j = 1, …, K_S + K_H and i = 1, …, n:

  μ_{j,i} = Σ_{k=1}^{M} H_p(L_N(Y_k)) γ_{k,j,i} (Y_k)_j / Σ_{k=1}^{M} H_p(L_N(Y_k)) γ_{k,j,i},

  σ²_{j,i} = Σ_{k=1}^{M} H_p(L_N(Y_k)) γ_{k,j,i} ((Y_k)_j − μ_{j,i})² / Σ_{k=1}^{M} H_p(L_N(Y_k)) γ_{k,j,i},

  p_{j,i} = Σ_{k=1}^{M} H_p(L_N(Y_k)) γ_{k,j,i} / Σ_{k=1}^{M} H_p(L_N(Y_k)),  (31)

where

  γ_{k,j,i} := p_{j,i} f_N((Y_k)_j; μ_{j,i}, σ_{j,i}) / Σ_{i'=1}^{n} p_{j,i'} f_N((Y_k)_j; μ_{j,i'}, σ_{j,i'}).  (32)

We obtain K_S + K_H systems, each representing a problem of the approximation of a Gaussian mixture from a data sample. These sub-problems can be solved for example by the EM or K-means algorithm, see (Bishop 2006, Redner & Walker 1984).

But the computation effort of the system (31) will be significantly smaller if we have the information from which component of g(x_j; p_j, μ_j, σ_j) the sample (Y_k)_j was generated. Let z_{k,j} denote a Bernoulli vector of identificators, such that

  (z_{k,j})_i = { 1, if (Y_k)_j was generated from f_N(x; μ_{j,i}, σ_{j,i}); 0, otherwise }.  (33)
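One pass through the system (31)–(32) amounts to a weighted EM step. A one-dimensional sketch (synthetic two-cluster data, equal weights H_p ≡ 1, hypothetical initial parameters — not values from the tested portfolio):

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def ce_mixture_update(y, h, p, mu, sigma):
    """One CE update of a 1-D Gaussian mixture: responsibilities gamma_{k,i}
    per Eq. (32), then the H_p-weighted mean/variance/weight updates per
    Eq. (31). y: samples (Y_k)_j; h: weights H_p(L_N(Y_k))."""
    dens = p * normal_pdf(y[:, None], mu, sigma)     # (M, n) component densities
    gamma = dens / dens.sum(axis=1, keepdims=True)   # Eq. (32)
    hw = h[:, None] * gamma                          # H_p-weighted responsibilities
    mu_new = (hw * y[:, None]).sum(axis=0) / hw.sum(axis=0)
    var_new = (hw * (y[:, None] - mu_new) ** 2).sum(axis=0) / hw.sum(axis=0)
    p_new = hw.sum(axis=0) / h.sum()
    return p_new, mu_new, np.sqrt(var_new)

rng = np.random.default_rng(3)
y = np.concatenate([rng.normal(-2, 0.5, 5000), rng.normal(2, 0.5, 5000)])
h = np.ones_like(y)   # H_p == 1 for every sample in this illustration
p, mu, s = ce_mixture_update(y, h, np.array([0.5, 0.5]),
                             np.array([-1.0, 1.0]), np.array([1.0, 1.0]))
print(mu[0] < 0 < mu[1], abs(p.sum() - 1.0) < 1e-9)
```

If the source component z_{k,j} is known, gamma collapses to the indicator (z_{k,j})_i and the update becomes explicit, which is exactly the simplification discussed next.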



One can show that if we know the values of z_{k,j}, then γ_{k,j,i} = (z_{k,j})_i. Therefore the system (31) results in an explicit solution of the problem (24).

2.3 Objective function for component identification

In the previous part we constructed the formulas for the calculation of the IS Gaussian mixture distribution. These formulas depend on the knowledge of the sample's source component z_{k,j}, but this is not easily obtainable information. In this part we propose a numerical approximation of z_{k,j} based on the model behaviour.

First let us consider a set of K_S + K_H functions

  Ψ_j(y) := Σ_{i=1}^{N} EAD_i D_i(y) ρ_i α_{i,j} / ( max_{i=1,…,N}{ ρ_i α_{i,j} } · Σ_{i=1}^{N} EAD_i D_i(y) ),  (34)

where ρ_i, α_{i,j}, EAD_i are the portfolio parameters of the exposure i and D_i(y) is the default indicator of the exposure i under the vector of all idiosyncratic shocks y. In the case of no defaulting exposure the function Ψ_j(y) yields 0. It can be easily shown that 0 ≤ Ψ_j(y) ≤ 1.

To demonstrate a link between z_{k,j} and Ψ_j(Y_k), let us consider a portfolio containing a component j with a huge impact on L_N. In Figure 2 we show the dependence between the component idiosyncratic shock X_j and Ψ_j(y) under the condition L_N ≥ VaR_p(L_N). From the study of the aforementioned figure we can conclude that:
- the X_j distribution under the condition L_N ≥ VaR_p(L_N) consists of multiple components,
- Ψ_j(y) separates these components by its value; in other words we can assume

  Ψ_j(Y_k) ∈ (a_i, a_{i+1}) ⟹ (z_{k,j})_i = 1,  (35)

where 0 = a_1 ≤ … ≤ a_{n+1} = 1 (n denotes the number of the Gaussian mixture components) are some known values.

Figure 2. Approximation of dependence between Ψ_j(y) and X_j.

Figure 3. Approximation of the X_j distribution under the condition L_N ≥ VaR_p(L_N) by a 3 component Gaussian mixture.

A numerical justification of the assumption (35) can be seen in Figure 3, which shows the histogram of the simulated X_j distribution under the condition L_N ≥ VaR_p(L_N) and its approximation by the 3 component Gaussian mixture, in comparison with the approximation by the normal distribution. The approximation by the Gaussian mixture was obtained by using the objective function Ψ_j(y) and the pre-calculated bounds a_1 = 0, a_2 = 0.2, a_3 = 0.8, a_4 = 1. Another fact, beside the very good approximation obtained from the proposed procedure, is that the approximation obtained by the normal distribution differs significantly from the approximated distribution. Note that the X_j distribution under the condition L_N ≥ VaR_p(L_N) is the optimal distribution found by the CE method for H_p(x) = x · 1(x ≥ VaR_p(L_N)).

Since we want to calculate both VaR and ES, the CE problem formulation based on H_p(x) given by (15) does not have to be optimal. The VaR approximation can suffer if the CE method favours samples with a very high value of loss and disfavours those close to the VaR_p(L_N) bound. Therefore we will use

  H_p(x) = 1(x ≥ VaR_p(L_N)),  (36)

which gives all samples with L_N ≥ VaR_p(L_N) the same weight.

Until now we have not dealt with the calculation of the bounds a_i. Generally it can be a difficult problem, but the Ψ_j(y) component recognition is not sensitive to small changes of a_i, therefore a rough approximation is sufficient. Such a computationally feasible sufficient approximation can be obtained by minimizing (by e.g. line-search methods) the difference between the MC sample of the X_j distribution under



the condition LN V VaRp ( LN ) and the Gaus- In the definition of H p Yk is still present
sian mixture obtained using j ( y ) component the unknown value of VaR Rp ( LN ) , which can be
recognition. replaced by its approximation VaR g
Rp ( LN ) from
t-th iteration. The last obstacle is that the H p Yk
2.4 Adaptive CE method for IS calculation will be for most samples zero and the iteration
process will crash at the beginning. The solution
So far we have constructed formulas for calculating
to this is the replacement of the confidence level p
the Gaussian mixture IS, stated the optimal form
by a sequence of pi which is at first few iteration
of the function H p ( x ) in (36) and constructed an
significantly lower than p and at the and of itera-
instrument for the Gaussian mixture j-th compo-
tive process equals p.
nent identification using objective function j ( y ) .
All of the previous observations lead to algorithm
But the single calculation from M MC samples
1. Obtained algorithm can be further enhanced for
would result in the poor approximation, if the M
example by the Screening method or by the adap-
was not high enough. The sufficient number of the
tive smoothing parameter sequence see (Kroese,
MC samples for stable and precise approximation
Taimre, & Botev 2013, Rubinstein & Kroese 2013,
of the CE problem is comparable with the number
Rubinstein & Kroese 2011).
of MC samples for sufficient approximation of
VaR/ES. This would make the whole IS princi-
ple useless, because it wont bring savings in the
3 IMPLEMENTATION AND GPU
computational time/effort. Solution to this incon-
PARALLELIZATION
venience is iterative process, slowly shifting the IS
distribution to the CE method optimal one.
The serial Matlab implementation is a straight-
The formulas for the CE method SC (31) can be
forward interpretation of the multi-factor Mer-
modified by using the IS during the SC process:
ton model with the Matlab built-in functions. The
M whole simulation (all of the MC samples) can be
H p Yk w Yk ( zk j )i Yk j calculated at once without the use of loops. Most
computationally expensive parts of the simula-

= k =1
,
j i ,t M
H p (Yk )w Yk ( ) , i
k =1
M 2
H p Yk w Yk ( zk , j )i Yk j  j ,i ,t
 j ,i ,t =
2 k =1
M
,
H p Yk w Yk ( ) i
k =1
M
H p Yk w Yk ( ) i
p = k =1
,
j ,i ,t M
H p Yk w Yk
k =1

(37)
where t denotes iteration, ( k , j )i denote if the
i-th component of j-th systemic factors Gaus-
sian mixture was the source of the sample k,
H p (Yk ) := ( LN (Yk ) VaR
Rp ( LN )) and

f Yk
w Yk = , (38)
gY Yk ( t t t )
where f Yk is the pdf of nominal distribution
(joint distribution of the independent
p normal dis-
tributions) and gY Yk ; t , t (
t is the pdf of )
IS Gaussian mixture distribution given by param-
eters approximated in the iteration t 1 .

113

CH14_38_S04.indd 113 3/15/2016 1:16:11 PM


tion can be calculated by very well optimized cients is handled in sparse format (only column/
Matlab matrix functions and therefore this imple- row index and value of non-zero elements is
mentation can serve as a good comparison tool stored)
of the performance efficiency for further GPU specialized GPU implementation: is applica-
implementations. ble only on specialized type of portfolios which
use systemic factor grouping into SSF and HSF,
3.1 GPU parallelization implementation fits the mathematical descrip-
tion in subsection 1.2 (correlation matrix of SSF
As was already mentioned the simulation of the is stored in constant memory).
multi-factor Merton model consists of many MC
samples, that are mutually independent. This is Finally some remarks shared by all GPU
suitable for a massively parallel computation hard- implementations:
ware such as the GPU device. usage of shared memory buffering - as all cores
need the same portfolio data, we can (by selected
3.1.1 Shortly about GPUs cores) copy the data from global to shared mem-
Let us very shortly outline main param- ory (which is much faster than global),
eters of GPUs, which are crucial for model generating random numbers from normal or uni-
implementation: form distribution is done by cuRAND library,
GPUs consist of many (in current devices in compiled with -use_fast_math tag, which
order of thousands) computation cores, grouped decreases precision of math functions in favour
into streaming multiprocessors (SM), communi- of speed
cation between single cores is strictly restricted Beta random number generator is not present in
to groups belonging to one SM unit. Execution the cuRAND library, therefore we implemented
of CUDA kernel (parallel GPU implementa- own procedure based on rejection-sampling
tion) must mirror this structure and we must method see (Dubi 2000, Kroese, Taimre, &
specify block size (how many threads per SM Botev 2013).
will run) and grid size (how many blocks will be
executed).
There are four basic types of memory on the 4 NUMERICAL RESULTS
GPUs:
global memory: main storage memory, large, In this section we test all of the aforementioned
high latency (thread waits long time before procedures and implementations. First we examine
get the data), must be accessed in pattern (i-th the behaviour of the GPU implementations and
core access i-th element) to obtain reasonable then we look at the variance reduction achievable
utilization of bandwidth by the proposed Gaussian mixture IS.
shared memory: small, shared between cores
in one SM, low latency 4.1 GPU acceleration
constant memory: small, can broadcast con-
tent of array among all cores As was mentioned before we implemented three
registers: cannot be directly accessed, sepa- different approaches to simulate the multi-factor
rated for every core, very fast, buffer some Merton model. Now we test their behaviour in
small local variables comparison with the Matlab serial implementation
For software implementation on GPU we use on three different scenarios.
the NVIDIA CUDA technology. For further infor- 1. increasing number of the systemic factors which
mations see (NVIDIA 2015). impacts majority of exposures (SSF), majority
of corresponding i j are non-zero
3.1.2 GPU implementations overview 2. increasing number of systemic factors which
When implementing multi-factor Merton model we impacts a small fraction of exposures (HSF),
decided to create multiple implementations, which majority of corresponding i j are zero
can benefit from different type of portfolios: 3. increasing number of exposures
base GPU implementation: straightforward All tests were performed on Intel Sandy Bridge
interpretation of the model, single threads per- E5-2470 processor (294.4 Gflops, 38.4 GB/s) and
form single MC samples in the same way as the NVIDIA Kepler K20 accelerator (3520 Gflops,
serial implementation, 208 GB/s), the serial Matlab implementation uses
sparse GPU implementation: similar to base double precision and the GPU implementations
implementation, but the matrix of i j coeffi- use single precision. The theoretical performance



benefit of the GPU implementations is 192× (single core + double precision vs. all GPU cores + single precision) and the theoretical memory bandwidth benefit of the GPU implementations is 11× (double vs. single precision).

4.1.1 Increasing number of SSF
This test is designed to examine the behaviour of the implementations when the number of systemic factors increases while the matrix of α_ij coefficients becomes more dense. We use a sequence of portfolios with 1000 exposures, 100 HSF and a sequence of (16, 25, 36, 49, 64, 81, 100) SSF. The density of the matrix of α_ij coefficients rises from 16% up to 51%. The scaling results can be seen in Figure 4. From the results we can observe the following:
- the speed-up of the specialized GPU implementation drops from a factor of 515× (for 16 SSF) to 209× (for 100 SSF),
- the sparse GPU implementation suffers the most: its speed-up drops from 77× (for 16 SSF) to 16× (for 100 SSF); this could be expected because the size of the sparse representation equals 3× the number of non-zero elements,
- the speed-up of the base GPU implementation drops from 35× (for 16 SSF) to 19× (for 100 SSF).
The drop in performance of all the GPU implementations is caused by the increasing memory complexity, which bounds the computation utilization.

Figure 4. Implementations scaling based on rising number of high impact systemic factors.

4.1.2 Increasing number of HSF
The second test is designed as the counter-example to the first one. Now we test a sequence of portfolios with 1000 exposures, 25 SSF and a sequence of (100, 200, 400, 800, 1600) HSF. The density of the matrix of α_ij coefficients decreases from 22% down to 1.7%. The results can be seen in Figure 5.

Figure 5. Implementations scaling based on rising number of low impact systemic factors.

From the results we can observe the following:
- the speed-up of the specialized GPU implementation rises from 537× (for 100 HSF) to 1001× (for 1600 HSF),
- the sparse GPU implementation benefits the most: its speed-up rises from 51× (for 100 HSF) to 287× (for 1600 HSF); this could again be expected, because the number of non-zero elements of the matrix of α_ij coefficients does not increase much,
- the speed-up of the base GPU implementation drops from 32× (for 100 HSF) to 18× (for 1600 HSF).
The drop in performance of the base GPU implementation is caused again by the increasing memory complexity, because it does not take into account the sparsity of the matrix of α_ij coefficients.

4.1.3 Increasing number of exposures
The last test serves as an insight into the behaviour of the implementations when applied to very large portfolios. We test a sequence of portfolios with 25 SSF, 100 HSF and a sequence of (1000, 2000, 4000, 8000, 16000, 32000) exposures. The results can be seen in Figure 6. From the results we can observe the following:
- the speed-up of the specialized GPU implementation rises from 537× (for 1000 exposures) to 784× (for 32000 exposures),
- the speed-up of the sparse GPU implementation is approximately 50× for all tested portfolios,
- the speed-up of the base GPU implementation is approximately 30× for all tested portfolios.
All of the GPU implementations exhibit good scaling when the number of exposures rises; even more, the specialized GPU implementation benefits from the large portfolios.
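The storage argument behind the sparse implementation's sensitivity to density (a coordinate-style sparse format stores roughly three values per non-zero element, so it only pays off while the loading matrix stays sparse) can be illustrated with a short sketch; the matrix values below are hypothetical:

```python
def dense_size(matrix):
    """Storage cells needed for a dense row-major matrix."""
    return len(matrix) * len(matrix[0])

def sparse_size(matrix):
    """Storage cells for a coordinate-style representation: roughly 3 values
    (row index, column index, coefficient) per non-zero entry, matching the
    '3 x number of non-zero elements' estimate in the text."""
    nnz = sum(1 for row in matrix for a in row if a != 0.0)
    return 3 * nnz

# Toy loading matrix: 4 exposures x 5 factors, mostly zero (hypothetical numbers).
A = [
    [0.9, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.7, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.8, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.6, 0.1],
]
print(dense_size(A), sparse_size(A))  # 20 vs 15: sparse wins only while the matrix stays sparse
```

As the density grows, 3 × nnz eventually exceeds the dense cell count, which is exactly the effect observed in section 4.1.1.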



Figure 6. Implementations scaling based on rising number of exposures.

Figure 7. Template structure of HSF correlation tree.

4.2 IS variance reduction
In this part we examine the variance reduction achievable by the IS. We compare the standard IS approach using the family of normal distributions and the IS with the Gaussian mixture family of distributions.

4.2.1 Portfolio parameters specification
For the numerical tests we constructed four different portfolios according to the structure mentioned in section 1.2. Each of the constructed portfolios consists of N = 10^4 exposures, K_S = 25 SSF and K_H = 600 HSF. The properties shared by all of the constructed portfolios are:
- EAD_i = i^2 / Σ_{j=1}^N j^2,
- PD_i = 0.001 + 0.001 (1 − i/N),
- the distribution of the LGDs is the Beta distribution with mean ELGD_n = 0.5 and standard deviation VLGD_n = 0.25 for all exposures,
- the structure of the HSF correlation is defined by the tree template shown in Figure 7, duplicated 60 times, with correlation coefficients g_k^H, k = 1, ..., K_H,
- the SSF correlation matrix is defined by 5 region and 5 industry factors; each SSF represents a unique combination of the region and the industry. The correlation between two SSF is 0.2 if they share the same region, 0.15 if they share the same industry and 0.03 otherwise,
- exposures are assigned to a composite SSF/HSF factor randomly with defined probabilities p_k^S = P(S_n = S_(k)) and p_k^H = P(H_n = H_(k)).
The single portfolios differ in the assignment of exposures to SSF and HSF and in the coefficients g_n, h_n:
- Portfolio 1: p_k^S ~ lnN(0, 0.5) and normalized, p_k^H ~ lnN(0, 10) and normalized, g_n = 0.9, h_n = 0.5,
- Portfolio 2: p_k^S = 1/K_S, p_k^H = 1/K_H, g_n = 0.9, h_n = 0.5,
- Portfolio 3: p_k^S ~ lnN(0, 0.5) and normalized, p_k^H ~ lnN(0, 10) and normalized, g_n = 0.5, h_n = 0.9,
- Portfolio 4: p_k^S = 1/K_S, p_k^H = 1/K_H, g_n = 0.5, h_n = 0.9.
Portfolio 1 represents a portfolio with clustered exposures (large groups of exposures with the same HSF/SSF composite factor) with a high dependence on the systemic factors. Portfolio 2 has the same level of exposure dependence on the systemic factors as portfolio 1, but the exposures are equally distributed among the HSF/SSF composite factors. Portfolio 3 has exposures clustered as in portfolio 1, but its level of exposure dependence is low. Portfolio 4 has exposures evenly distributed as portfolio 2 and a low level of exposure dependence as in portfolio 3.

4.2.2 Variance reduction in comparison with the standard approach
Besides the different portfolios we also test different confidence levels p ∈ {0.99995, 0.9995, 0.995}.
First let us examine the VaR and ES of the selected portfolios and confidence levels; the VaR/ES values calculated by MC using 10^7 samples are listed in Table 1. The measured levels of VaR and ES show that a lower level of exposure dependence and an even distribution of exposures lead to lower values of VaR and ES. This suggests that the IS for portfolios 3 and 4 could be less effective. The impact of the confidence level is predictable: the IS effectiveness will be lower for lower confidence levels. This is caused by the reduced rarity of the samples providing information about VaR and ES, so no large change of the distribution is needed.

Table 1. Tested portfolios VaR and ES.

                              Confidence level p
Characteristic  Portf. idx.   0.99995   0.9995   0.995
VaR             1             0.0371    0.0251   0.0129
                2             0.0291    0.0203   0.0123
                3             0.0057    0.0041   0.0027
                4             0.0051    0.0038   0.0026
ES              1             0.0417    0.0304   0.0181
                2             0.0332    0.024    0.016
                3             0.0065    0.0048   0.0033
                4             0.0058    0.0044   0.0032
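The shared exposure and default-probability parameters of section 4.2.1 can be generated in a few lines. This is a sketch assuming the formulas EAD_i = i^2 / Σ_j j^2 and PD_i = 0.001 + 0.001 (1 − i/N) as read from the garbled original, which may differ in detail:

```python
def portfolio_parameters(N):
    """Exposure weights and default probabilities for the test portfolios
    (formulas as reconstructed from section 4.2.1; assumptions, not a
    verified copy of the authors' setup)."""
    norm = sum(j * j for j in range(1, N + 1))
    ead = [i * i / norm for i in range(1, N + 1)]                 # EAD_i = i^2 / sum_j j^2
    pd = [0.001 + 0.001 * (1 - i / N) for i in range(1, N + 1)]  # PD_i in [0.001, 0.002)
    return ead, pd

ead, pd = portfolio_parameters(10_000)
print(sum(ead))           # weights sum to 1 by construction
print(min(pd), max(pd))   # all PDs stay between 0.001 and 0.002
```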

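Section 4.2 compares normal and Gaussian-mixture IS densities. As a generic, self-contained illustration of importance sampling with a mixture proposal (this is not the authors' CE-fitted mixture; the target probability, the proposal and the sample size below are invented for the sketch):

```python
import math
import random

def phi(x, mu=0.0, sigma=1.0):
    """Normal probability density function."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def is_tail_prob(threshold=4.0, n=200_000, seed=0):
    """Estimate P(X > threshold) for X ~ N(0,1) by importance sampling with a
    two-component Gaussian mixture proposal g = 0.5 N(0,1) + 0.5 N(threshold+1, 1).
    The weight f/g is the likelihood ratio of the IS estimator."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        if rng.random() < 0.5:                    # sample the mixture proposal
            x = rng.gauss(0.0, 1.0)
        else:
            x = rng.gauss(threshold + 1.0, 1.0)
        g = 0.5 * phi(x) + 0.5 * phi(x, threshold + 1.0, 1.0)
        if x > threshold:
            total += phi(x) / g                   # indicator times likelihood ratio
    return total / n

print(is_tail_prob())  # close to 1 - Phi(4) = 3.17e-5; crude MC would need ~10^7 samples
```

The second mixture component places mass directly in the rare-event region, which is the same mechanism the paper exploits for loss tails driven by a high-impact systemic factor.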


Table 2. Measured variance of crude MC, IS normal dist. and IS 3-comp. Gaussian mixture (10^6 samples, 1000 simulations).

                          Crude Monte Carlo               IS normal distribution          IS Gaussian mixture
                          Confidence level p              Confidence level p              Confidence level p
Char.  Portf. idx.  0.99995   0.9995    0.995      0.99995   0.9995    0.995      0.99995   0.9995    0.995
VaR    1            4.95e-07  6.21e-08  4.88e-09   6.40e-09  2.42e-09  6.96e-10   6.71e-10  4.93e-10  1.90e-10
       2            3.41e-07  2.10e-08  2.95e-09   4.98e-09  1.01e-09  6.39e-10   5.47e-10  1.59e-10  1.50e-10
       3            1.14e-08  7.63e-10  5.73e-11   8.62e-11  2.33e-11  6.14e-12   2.81e-11  1.21e-11  4.79e-12
       4            7.34e-09  6.02e-10  4.64e-11   8.31e-11  2.23e-11  5.77e-12   2.31e-11  9.36e-12  4.25e-12
ES     1            8.64e-07  1.02e-07  1.05e-08   4.24e-09  1.17e-09  4.65e-10   5.06e-10  3.69e-10  2.02e-10
       2            6.99e-07  6.03e-08  5.25e-09   2.78e-09  9.70e-10  3.37e-10   3.66e-10  1.53e-10  8.00e-11
       3            2.81e-08  1.89e-09  1.42e-10   6.88e-11  1.90e-11  5.17e-12   2.05e-11  1.00e-11  4.01e-12
       4            1.61e-08  1.37e-09  1.13e-10   7.12e-11  1.99e-11  4.52e-12   1.73e-11  7.84e-12  2.96e-12
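The entries of Table 3 and the total reduction factors quoted in the conclusion can be reproduced directly from the Table 2 variances, for instance (values copied from Table 2, p = 0.99995):

```python
# Variances from Table 2: crude MC, IS normal, IS Gaussian mixture.
crude_var_p1, normal_var_p1, gm_var_p1 = 4.95e-07, 6.40e-09, 6.71e-10  # VaR, portfolio 1
crude_es_p2, normal_es_p2, gm_es_p2 = 6.99e-07, 2.78e-09, 3.66e-10     # ES, portfolio 2

print(round(normal_var_p1 / gm_var_p1, 2))  # 9.54, the first entry of Table 3
print(round(crude_var_p1 / gm_var_p1))      # ~738, the 'up to 737x' VaR reduction of the conclusion
print(round(crude_es_p2 / gm_es_p2))        # ~1910, the 'up to 1911x' ES reduction of the conclusion
```

The small discrepancies (738 vs. 737, 1910 vs. 1911) come only from the rounding of the variances printed in Table 2.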

Figure 8. Variance reduction achieved by IS: Gaussian mixture and normal distribution.

Let us proceed to the testing of the variance reduction. In Table 2 we can see the variance for all combinations of the tested confidence levels and portfolios for the plain (crude) MC simulation, the IS using the normal distribution and the IS using the Gaussian mixture. The variance is calculated as an empirical value over 1000 simulations, each consisting of 10^6 samples.
For a more illustrative view of the achieved variance reduction see Figure 8, which shows a comparison of the variance reduction between the standard and the Gaussian mixture approach for all combinations of confidence levels and portfolios. Clearly the IS using the Gaussian mixture achieves a better variance reduction in every test; this was expected, because the family of normal distributions is a subset of the family of Gaussian mixture distributions.
For an exact comparison of the two IS approaches, see Table 3, which shows the ratios of the variance reduction between the IS using the normal distribution and the IS using the Gaussian mixture.

Table 3. Variance reduction ratio Gaussian mix./normal dist.

                              Confidence level p
Characteristic  Portf. idx.   0.99995   0.9995   0.995
VaR             1             9.54      4.90     3.65
                2             9.10      6.35     4.26
                3             3.06      1.91     1.28
                4             3.58      2.38     1.35
ES              1             8.37      3.16     2.30
                2             7.59      6.34     4.20
                3             3.35      1.88     1.28
                4             4.11      2.54     1.52

The improvement of the IS by using the Gaussian mixture is given by the presence of a systemic



factor with a very high impact on the loss L_N. These components can be found mostly in portfolios 1 and 2, therefore in these portfolios we obtain the best improvements in the variance reduction. A sample of such a component was presented in Figure 3.

5 CONCLUSION

The objective of this paper was to speed up the MC simulation of the multi-factor Merton model. This was fully accomplished by the GPU implementation and the IS application.
We presented three different GPU implementations, each better for a different purpose. Two of the GPU implementations solve the general multi-factor Merton model with a speed-up against the serial model in the range of 19× to 287×, depending on the structure of the portfolio, see section 4.1. The third GPU implementation is specialized, taking input in the form of the structure described in section 1.2. This implementation achieves a speed-up in the range of 209× to 1001×, depending on the portfolio structure.
For the IS we proposed a new approach using the Gaussian mixture distribution. Using this approach we achieved a significant variance reduction improvement for certain portfolio structures, see section 4.2.2. In comparison to the standard IS approach we obtained from 9.5× down to 1.3× better results. The total achieved variance reduction was up to 1911× for the ES calculation and up to 737× for the VaR calculation.
The combination of the IS and the GPU implementation can bring a speed-up of the standard serial MC simulation in orders of hundreds of thousands for portfolios with a high dependence on the systemic factors.

ACKNOWLEDGEMENT

This work was supported by The Ministry of Education, Youth and Sports from the National Programme of Sustainability (NPU II) project IT4Innovations excellence in science - LQ1602.

REFERENCES

Bishop, C.M. (2006). Pattern Recognition and Machine Learning. Springer.
Dubi, A. (2000). Monte Carlo Applications in Systems Engineering. Wiley.
Glasserman, P. & J. Li (2005). Importance sampling for portfolio credit risk. Management Science 51(11), 1643-1656.
Kroese, D.P., T. Taimre, & Z.I. Botev (2013). Handbook of Monte Carlo Methods. John Wiley & Sons.
Kurtz, N. & J. Song (2013). Cross-entropy-based adaptive importance sampling using Gaussian mixture. Structural Safety 42, 35-44.
Lütkebohmert, E. (2008). Concentration Risk in Credit Portfolios. Springer Science & Business Media.
NVIDIA (2015). CUDA C Best Practices Guide. http://docs.nvidia.com/cuda/cuda-c-best-practices-guide/. Version 7.5.
Redner, R.A. & H.F. Walker (1984). Mixture densities, maximum likelihood and the EM algorithm. SIAM Review 26(2), 195-239.
Rubinstein, R.Y. & D.P. Kroese (2011). Simulation and the Monte Carlo Method. John Wiley & Sons.
Rubinstein, R.Y. & D.P. Kroese (2013). The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation and Machine Learning. Springer Science & Business Media.



Applied Mathematics in Engineering and Reliability - Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

Highly reliable systems simulation accelerated using CPU and GPU parallel computing

S. Domesová & R. Briš
VŠB - Technical University of Ostrava, Ostrava-Poruba, Czech Republic

ABSTRACT: Highly reliable systems simulation is a complex task that leads to a problem of rare event probability quantification. The basic Monte Carlo method is not a sufficiently powerful technique for solving this type of problem, therefore it is necessary to apply more advanced simulation methods. This paper offers an approach based on the importance sampling method with distribution parameters estimation via the cross-entropy method in combination with the screening algorithm. This approach is compared to another one based on the Permutation Monte Carlo method, particularly in terms of the achieved variance reduction. The paper also explains how to apply these simulation methods to systems with independent components that can be represented by the use of the adjacency matrix. A new generalized algorithm for the system function evaluation, which takes into account an asymmetric adjacency matrix, is designed. The proposed simulation method is further parallelized in two ways, on GPU using the CUDA technology and on CPU using the OpenMP library. Both types of implementation are run from the MATLAB environment; the MEX interface is used for calling the C++ subroutines.

1 INTRODUCTION

Highly reliable systems can be found in many branches of engineering; typical examples include communication networks, production lines, storage systems, etc. Computer simulation helps to ensure almost flawless operation of these systems.
This paper offers an innovative approach to the simulation methodology usable for the reliability quantification of highly reliable systems with independent components. This approach uses a static model of the simulated system, however it can also be utilized while simulating the system operation over a long period of time. Generally the approach can cover the following situations:
1. A static system simulation. The values of probability that the individual components are operational are given. Based on these values, the system reliability (i.e. the probability that the whole system is operational) is estimated.
2. A dynamic system simulation. When simulating the system operation over a period of time, it is assumed that the time dependent components reliability follows a proper probability distribution. In specific time points within the studied period, the components reliabilities are computed to be then used for the system reliability quantification; the problem can simply be converted to the situation 1. For example, when the operation of a component is modelled using two random variables, time to failure and time to repair, both following the exponential distribution, the well-known formula

p(t) = μ/(λ + μ) + λ/(λ + μ) · e^(−(λ + μ)t)    (1)

(see e.g. (Dubi 2000)) can be used to calculate the probability that the component is operational in a given time t; λ stands for the failure rate, μ stands for the repair rate and it is assumed that p(0) = 1. The reference (Briš 2007) offers other component models, for example for the case when the component is subject to more than one type of failure, or the case when the time to failure follows the log-normal distribution.
3. A steady state simulation. If the steady state of the system is studied, the input values of the problem are the values of probability that the components are operational in the infinite time. Therefore the problem is also convertible to the situation 1. In the aforementioned case of the random variables time to failure and time to repair following the exponential distribution, this probability is given by

p(∞) = lim_(t→+∞) p(t) = μ/(λ + μ).    (2)

In the case of highly reliable systems, the probability of the system failure is very low, therefore

119

AMER16_Book.indb 119 3/15/2016 11:25:39 AM


the system simulation leads to the problem of rare event probability quantification.

1.1 System specification
Consider a system of n (n ∈ ℕ) components. Each component remains in one of two states, operational or failed. The state of the system is described by the state vector b = (b_1, ..., b_n); its elements represent the states of the components. If the ith component is operational, b_i = 0, and if it is failed, b_i = 1. Furthermore it is necessary to define the system function H. This function returns 0 if the system is operational for a specific state vector b, and otherwise H(b) = 1.
The stochastic properties of the system are described by the random vector B = (B_1, ..., B_n), where the random variable B_i is assigned to the ith component. The probability distribution of B_i is Bernoulli with the parameter p_i, i.e. P(B_i = 1) = p_i. The event B_i = 1 indicates the failure of the ith component, whereas B_i = 0 indicates the operational state. Actually the vector p = (p_1, ..., p_n) is a vector of the unreliabilities of the components.
System availability is defined as the probability that the system is operational. However, when dealing with rare event probabilities it is more suitable to formulate the problem as calculating the unavailability of the system, i.e. the probability ℓ that the system is not operational. Obviously ℓ = E(H(B)) holds.

2 SIMULATION METHODS

A system of n components, a system function H and a vector p of components unreliabilities are given. The aim is to estimate the probability ℓ.

2.1 Monte Carlo
When the basic Monte Carlo (MC) approach is used, we first generate N (N ∈ ℕ) samples of the random vector B and therefore we obtain random samples B_1, ..., B_N. The value of ℓ is then estimated as

ℓ̂_MC = (1/N) Σ_(k=1)^N H(B_k).    (3)

It is an unbiased estimator of ℓ, E(ℓ̂_MC) = ℓ, however this approach is not suitable for highly reliable systems (Kleijnen, Ridder, & Rubinstein 2010). For the variance of ℓ̂_MC we can easily obtain Var(ℓ̂_MC) = σ²/N, where σ² = Var(H(B)). Using the central limit theorem we can determine a 1 − α confidence interval for ℓ as

(ℓ̂_MC − z_(1−α/2) σ/√N, ℓ̂_MC + z_(1−α/2) σ/√N),    (4)

where α usually equals 0.05 or less in practical applications and z_(1−α/2) denotes the (1 − α/2)-quantile of the standard normal distribution. The number of samples required to achieve a predetermined accuracy ε with probability 1 − α is

N ≥ (σ²/ε²) z²_(1−α/2).    (5)

Example 1. Consider an example of rare event probability estimation. Determine the number of samples required to estimate the value of ℓ with accuracy ε = 0.1ℓ and with probability 1 − α = 0.95. In this case σ² = ℓ − ℓ², therefore N ≥ ((ℓ − ℓ²)/(0.1ℓ)²) z²_0.975 ≈ 384 (1/ℓ − 1). For ℓ = 10^(−m) it is necessary to perform more than 3.84 · 10^(m+2) samples, which would be excessively time consuming, especially for complicated systems and high values of m. (The value of m equals 4 and more for rare event probabilities.)
The previous example shows the need of using variance reduction techniques that allow achieving the same accuracy when performing a significantly lower number of samples.

2.2 Importance sampling
For variance reduction the Importance Sampling (IS) technique can be used. Random samples B_1, ..., B_N are generated from a different distribution and the value of ℓ is then estimated as

ℓ̂_IS = (1/N) Σ_(k=1)^N H(B_k) f(B_k)/g(B_k),    (6)

where f is the original probability density function of the random vector B (called the nominal pdf) and g is the probability density function from which the samples were generated (called the IS pdf). The ratio f(·)/g(·) = W(·) is often called the likelihood ratio. The IS pdf must satisfy the condition

g(x) = 0 ⟹ H(x) f(x) = 0,    (7)

see for example (Rubinstein & Kroese 2011, Kroese, Taimre, & Botev 2013). The principle of this technique is simple, but it can be difficult to find an appropriate IS pdf which will lead to a massive variance reduction.
It is not a rule, but it is usual to select an IS pdf from the same family of distributions as the nomi-



nal pdf comes from. In this case the nominal pdf is a joint probability density function of a multivariate Bernoulli distribution with independent variables, therefore it is a product of the probability density functions of the random variables B_1, ..., B_n. As the IS pdf we will also use a product of Bernoulli probability density functions, so the key issue is to find appropriate parameters q_1, ..., q_n of the new Bernoulli distributions. This can be done using the cross-entropy method (Rubinstein & Kroese 2013).

2.3 The Cross-Entropy method
The Cross-Entropy (CE) method offers an effective way to find an IS pdf g for which the variance of the estimator (6) is small. This method is based on the minimization of the Kullback-Leibler divergence between the unknown IS pdf and the theoretical optimal IS pdf, which is explained thoroughly by (Rubinstein & Kroese 2013). For this purpose it is sufficient just to briefly outline the method.
It is assumed that the nominal pdf f has the form f(·; p) and the IS pdf belongs to the same parametric family, i.e. g = f(·; q). It can be shown that the aforementioned minimization is equivalent to the maximization of E_p(H(B) ln f(B; q)) according to the vector q. The requested vector

q* = argmax_q E_p(H(B) ln f(B; q))    (8)

can naturally be estimated using the MC method as

q̂ = argmax_q (1/N) Σ_(k=1)^N H(B_k) ln f(B_k; q),    (9)

where B_1, ..., B_N are random samples generated from the nominal pdf f(·; p).
In the case of the distributions from the exponential families (for example Bernoulli or exponential) the stochastic program (9) has a simple solution. The elements of the vector q̂ = (q̂_1, ..., q̂_n) can be computed as

q̂_i = (Σ_(k=1)^N H(B_k) B_ki) / (Σ_(k=1)^N H(B_k)),  i = 1, ..., n,    (10)

where B_ki means the ith coordinate of the vector B_k (Kroese, Taimre, & Botev 2013).
There is also a possibility to use the CE method as an iterative method and gradually refine the vector of parameters of the IS pdf. The iterative version for the distributions from the exponential families is given by the following Algorithm 1.
At first sight it seems that the CE algorithm provides a straightforward way to find an appropriate vector q of parameters of the IS pdf, however the algorithm should be used with caution.
In the step 2 the samples B_1, ..., B_N are generated from the pdf f(·; q^(j−1)). If the system is operational for all of these samples, the new vector q^(j) cannot be determined. This situation is often caused by low values of the elements of the vector q^(j−1). It is possible to add the step
If H(B_1) = ... = H(B_N) = 0, repeat step 2.
between steps 2 and 3, however this would prolong the computation time and may lead to the likelihood ratio degeneracy; for more information about the degeneracy of the likelihood ratio see (Rubinstein & Kroese 2011). To avoid this situation, it is therefore necessary to pay sufficient attention to the choice of the initial vector q^(0). It is evident that when the elements of the vector p of parameters of the nominal distribution are rare event probabilities, the choice q^(0) = p is not convenient. A proper way of choosing the vector q^(0) will be proposed in section 6.
Consider also the case when for some component index i the values B_ki equal 0 for all k = 1, ..., N. In this case the ith element of the vector q^(j) would be set to zero and the IS pdf would not satisfy the condition (7). The screening algorithm, which will be described below, is primarily intended to prevent the likelihood ratio degeneracy, but it also solves this problem.
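The closed-form CE update (10) is simple enough to check by hand; a minimal sketch with a hypothetical two-component system function:

```python
def ce_update(samples, H):
    """One CE update per eq. (10): q_i = sum_k H(B_k) B_ki / sum_k H(B_k).
    `samples` is a list of 0/1 state vectors, `H` the system function
    returning 1 for a failed system state."""
    total = sum(H(b) for b in samples)
    n = len(samples[0])
    return [sum(H(b) * b[i] for b in samples) / total for i in range(n)]

# Toy system of 2 components: failed exactly when component 1 is failed (hypothetical H).
H = lambda b: b[0]
samples = [(1, 0), (1, 1), (0, 1), (0, 0), (1, 0)]
print(ce_update(samples, H))  # component 1 pushed to failure prob. 1.0, component 2 to 1/3
```

The update concentrates the IS parameters on components that actually appear failed in system-failure samples, which is precisely why an all-operational batch (total = 0) leaves q undefined, as discussed above.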



2.4 Screening algorithm
The screening algorithm is often used to identify the non-bottleneck (i.e. unimportant) components of the vector q^(j). If the relative difference between the nominal and the new parameter,

(q_i^(j) − p_i) / p_i,    (11)

is smaller than some threshold δ, the ith component is marked as non-bottleneck and the value of q_i^(j) is set to p_i. This means that the ith component does not influence the likelihood ratio in the IS estimator (6). When applying this algorithm to the highly reliable systems simulation, it is sufficient to set δ to zero. Consequently the value of q_i cannot be smaller than the nominal value p_i. Since p_i denotes the (usually rare event) probability of the ith component failure, it would be unreasonable to decrease this probability in the IS pdf.
The use of this type of the screening algorithm means to insert the following step after the step 4 of the Algorithm 1:
For all i ∈ I check if q_i^(j) < p_i. If this condition is fulfilled, set q_i^(j) = p_i and remove i from I.

3 SYSTEM REPRESENTATION AND CORRESPONDING SYSTEM FUNCTION

Every system is a set of components, however each type of system requires a different kind of representation and a different approach to the simulation. For easier work with systems, many types of system representation have already been developed. An example is explained in (Briš 2010), where the system is represented as a directed acyclic graph. Other ways to construct the system function for special types of systems are discussed in (Ross 2007).

3.1 System representation using adjacency matrix
In this case a representation using an adjacency matrix is suggested. This representation is intended for systems that can be interpreted as a collection of n components and two special elements called IN and OUT, where some of these n + 2 elements are connected to (or in relation with) other elements. Each of the n components is either operational or failed. We say that a system of this kind is operational if there exists a path from IN to OUT leading only through the operational components. Such a system can be depicted as a directed graph with n + 2 nodes.

Example 2. The schemes in Figure 1 show an example of a system with independent components that can be represented using an adjacency matrix. The corresponding adjacency matrix for this system is

        0 1 0 0 1 0 0 0
        0 0 1 0 1 0 0 0
        0 0 0 1 0 0 0 0
        0 0 0 0 0 0 1 1
M =     0 0 0 0 0 1 0 0
        0 0 1 0 0 0 1 0
        0 0 0 0 0 0 0 1
        0 0 0 0 0 0 0 0

Figure 1. System example.

In the language of relations, the system and the corresponding system function can be represented as follows. A system is determined by a set Q = {IN, c_1, ..., c_n, OUT} and a relation R on the set Q; the set C = {c_1, ..., c_n} is the set of all system components. For each two elements in Q it can be decided whether they are in relation or not. If q_i is in relation with q_j, we write (q_i, q_j) ∈ R. Generally this relation is not symmetric: if (q_i, q_j) ∈ R holds, then (q_j, q_i) ∈ R does not necessarily hold. The computer representation of the relation R can be easily realised using an (n + 2) × (n + 2) matrix of logical values 0 and 1, which we call the matrix of the relation or simply the adjacency matrix. The first row and column belong to the IN element, the last row and column to the OUT element and the remaining ones belong to the components. The value 1 at position (i, j) indicates that (q_i, q_j) ∈ R.
The system function H is specified by the following condition. The system is operational if there


exists a sequence k_1, ..., k_d (d ∈ ℕ) of indexes of operational components such that

(IN, c_k1) ∈ R, (c_kd, OUT) ∈ R, and ∀i ∈ {1, ..., d − 1}: (c_ki, c_k(i+1)) ∈ R    (12)

holds, and failed otherwise.
This representation is suitable for example for the reliability networks studied in (Kroese, Taimre, & Botev 2013), which are usually represented as undirected graphs with components represented by edges. This reference also describes a simulation method based on the Conditional Monte Carlo designed for reliability networks. This method will be modified for the more general system representation and compared with the simulation method based on the importance sampling.

3.2 System function evaluation algorithm
For the system function evaluation we created the Algorithm 2, based on our previous results in (Briš & Domesová 2014). Even though the former algorithm was originally intended for systems with a symmetric adjacency matrix, it can be modified to reflect the more general case, i.e. the asymmetric adjacency matrix.

4 IMPLEMENTATION AND PARALLELIZATION

The IS-based simulation method presented in section 2 in combination with the function H evaluation forms a useful tool for the simulation of systems specified by the adjacency matrix. Its principle is based on the generation of independent samples, therefore we can easily reduce the simulation time using parallel computing.
There are many ways to implement the method. For comfortable work with the simulation results in the form of graphs it is convenient to use the Matlab environment. However, to reduce the computing time it is better to focus on lower-level programming languages. There is a possibility to combine the advantages of both approaches: to use Matlab for the user-friendly work with results and implement the most important algorithms in other languages. The MEX interface of Matlab allows calling functions written in C or C++ from Matlab as easily as if they were usual Matlab functions.
After consideration of possible solutions, the two following alternatives of Matlab implementation acceleration were chosen:
1. parallel computing on CPU using the OpenMP library (via the MEX interface),
2. parallel computing on GPU using the CUDA technology (via the Parallel Computing Toolbox).
In the first alternative the source codes of the accelerated functions are written in the C++ language and for random numbers generation the Boost library, version 1.56, is used. The second alternative uses source codes written in the CUDA C extension of the C language; random numbers are generated via the cuRAND library (NVIDIA 2014).

4.1 System function implementation
The process of the system function evaluation consumes most of the simulation time, therefore the efficiency of the implementation of this function determines the computation time of the simulation to some extent.
For the function H evaluation the Algorithm 2 is used. To reduce the memory requirements and the consumption of computation time of the implementation, bitwise operations are used. The matrix M and each of the sets M1, M2 and M3 are implemented as arrays of the unsigned int data type; each variable has the length of 32 bits. In the case of the sets M1, M2 and M3 the individual bits determine the presence of a certain component in the set, and in the case of the matrix M the bits symbolize the relations between the components and the elements IN and OUT. For example the representation of the matrix M from the example 2 as an unsigned int array is



$[18\ 20\ 8\ 192\ 32\ 68\ 128\ 0]$.	(13)

If the simulated system has 32 components or less, the implementation works with only one variable for each of the sets M1, M2 and M3.

4.2 Simulation method implementation

The simulation is divided into two basic steps: the first one is the cross-entropy algorithm for the determination of the distribution parameters and the second one is the importance sampling method itself. The basic scheme of the CE algorithm is written in the Matlab language and it runs in m iterations.

The most important part of the implementation is formed by the function that includes the algorithm for the evaluation of the function H. The function is accelerated using the two ways mentioned above and it is executed m + 1 times in total, once for every CE method iteration and once for the IS method. Its input arguments are the matrix M converted into an array of variables, the number of components n, the number of samples N to be performed during one CE method iteration/IS method execution, the vector p of parameters of the nominal pdf, the vector q of parameters of the IS pdf and a vector of bottleneck component indexes. The function outputs the value $\sum_{k=1}^{N} H(B_k)\,W(B_k; p, q)$ and in the case of the CE method iterations the second output is a vector of the values $\sum_{k=1}^{N} H(B_k)\,W(B_k; p, q)\,B_{ki}$ for every $i \in \{1, \ldots, n\}$.

4.3 Accuracy of likelihood ratio calculation

The CE algorithm and the IS method both contain the calculation of the likelihood ratio

$$W(b; p, q) = \frac{f(b; p)}{f(b; q)} = \frac{\prod_{i=1}^{n} p_i^{b_i}(1-p_i)^{1-b_i}}{\prod_{i=1}^{n} q_i^{b_i}(1-q_i)^{1-b_i}}.\qquad(14)$$

The problem with the accuracy arises while computing $1 - p_i$ for small values of $p_i$: if $p_i < 0.1^8$, the expression $1 - p_i$ returns 1 in the single precision. To prevent this, we define five sets of component indexes. The set $S_1$ contains the indexes of components for which $b_i = 1$. The set $P$ contains the indexes of components for which $b_i = 0$ and $p_i < \delta$, and $\bar{P} = \{1, \ldots, n\} \setminus (S_1 \cup P)$. Similarly, $Q$ contains the indexes of components for which $b_i = 0$ and $q_i < \delta$, and $\bar{Q} = \{1, \ldots, n\} \setminus (S_1 \cup Q)$. It is convenient to choose for example $\delta = 0.1^5$ as the threshold. With this notation the likelihood ratio can be written as

$$W(b; p, q) = W_1 W_2,\qquad(15)$$

where

$$W_1 = \frac{\prod_{i \in S_1} p_i \prod_{i \in \bar{P}} (1-p_i)}{\prod_{i \in S_1} q_i \prod_{i \in \bar{Q}} (1-q_i)}\qquad(16)$$

and

$$W_2 = \frac{\prod_{i \in P} (1-p_i)}{\prod_{i \in Q} (1-q_i)}.\qquad(17)$$

For the second factor we can write

$$\log W_2 = \sum_{i \in P} \log(1-p_i) - \sum_{i \in Q} \log(1-q_i)\qquad(18)$$

and using the Taylor series we can approximate

$$\log W_2 \approx \sum_{i \in Q} \left(q_i + \frac{q_i^2}{2}\right) - \sum_{i \in P} \left(p_i + \frac{p_i^2}{2}\right).\qquad(19)$$

4.4 Computation time

In this section we compare the computation time of the two ways of the function implementation. The computation time of one function run is examined, therefore there are no iterations of the CE method and the vector q is pre-chosen.

Two testing systems, shown in Figure 2 and Figure 3, were chosen for the experiments. System A is a reliability network with a regular structure; it has 60 components represented as edges of this network. System B is a reliability network with 18 components taken from (Kroese, Taimre, & Botev 2013), where it was used as a testing problem for the demonstration of the efficiency of the Permutation Monte Carlo method. This method will be discussed in section 5 as an alternative to the IS approach. For simplicity we consider p = 0.01 and q = 0.1 for both systems.

The graphs in Figure 4 show the dependence of the computation time on the number of threads for the first way of acceleration. This implementation uses OpenMP and it was tested on a double Intel Sandy Bridge E5-2470 processor (16 cores). The number of samples in each simulation was 7.488 × 10⁷. The values of the computation time for a few different thread counts are written in Table 1. Each of the simulations was executed three times; the reported values of the computation time are averages of the results obtained from these three simulations.
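The set splitting of Section 4.3 (Eqs. (15)-(19)) can be sketched in a few lines. Python and the concrete probability values below are illustrative only; the paper's actual implementation is in Matlab/CUDA.

```python
import numpy as np

def log_likelihood_ratio(b, p, q, delta=0.1**5):
    """log W(b; p, q) split as log W1 + log W2 (Eqs. (15)-(17)).
    Components with b_i = 0 and a probability below delta go into the
    Taylor-approximated factor W2 (Eq. (19)), so the difference 1 - p_i
    is never formed explicitly for tiny p_i."""
    S1 = (b == 1)
    P, Q = ~S1 & (p < delta), ~S1 & (q < delta)
    Pbar, Qbar = ~S1 & ~P, ~S1 & ~Q
    log_w1 = (np.log(p[S1]).sum() + np.log1p(-p[Pbar]).sum()
              - np.log(q[S1]).sum() - np.log1p(-q[Qbar]).sum())
    # Eq. (19): log W2 ~ sum_Q (q + q^2/2) - sum_P (p + p^2/2)
    log_w2 = (q[Q] + q[Q]**2 / 2).sum() - (p[P] + p[P]**2 / 2).sum()
    return log_w1 + log_w2

# The precision problem from Section 4.3: in single precision,
# 1 - p_i collapses to 1 once p_i < 0.1^8.
print(np.float32(1.0) - np.float32(1e-8) == np.float32(1.0))  # prints True
```

The same collapse occurs in double precision once the probabilities fall below roughly 10⁻¹⁶, so the splitting remains useful for extremely reliable components.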

124

AMER16_Book.indb 124 3/15/2016 11:25:52 AM


In the case of the CUDA implementation, 192 threads per block, 390 blocks per grid and batches of 1000 samples per thread are used; this gives the same total of 7.488 × 10⁷ samples as in the previous case. As the testing device a NVIDIA Kepler K20 accelerator was used. The values of the computation time for both testing systems are written in Table 1.
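As a quick cross-check of the reported figures, the acceleration factors implied by the System A column of Table 1 are simple ratios of the measured times:

```python
# Computation times from Table 1, System A, in seconds
t_serial, t_openmp16, t_cuda = 191.69, 14.20, 2.42

print(round(t_serial / t_openmp16, 1))  # OpenMP with 16 threads: 13.5x
print(round(t_serial / t_cuda, 1))      # CUDA on the Kepler K20: 79.2x
```

The sub-linear OpenMP scaling and the much larger CUDA factor are consistent with the scalability graph in Figure 4.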

Figure 2. System A.

Figure 3. System B.

Figure 4. OpenMP implementation scalability graph.

Table 1. IS method implementation acceleration.

                                      Time [s]
         Number of threads      System A   System B
OpenMP   1                        191.69      34.65
         2                         96.05      17.53
         4                         48.48       9.02
         8                         25.27       4.77
         16                        14.20       2.83
CUDA                                2.42       1.06

5 COMPARISON WITH THE PMC METHOD

The abbreviation PMC stands for the Permutation Monte Carlo method, which is used for network reliability calculations in (Kroese, Taimre, & Botev 2013). However, the principle of the method can be used for a wider group of systems. With a modification based on the Algorithm 2 we applied this method to general systems determined by an adjacency matrix and we accelerated it using the CUDA technology.

The PMC method is based on a variance reduction technique called Conditional Monte Carlo, which also provides an unbiased estimator of the value $\ell = E(H(B))$. The Conditional Monte Carlo technique is based on the fact that $E(E(Y \mid Z)) = E(Y)$ for any random variable Y and random vector Z. It is assumed that there exists such a random vector Z that $E(H(B) \mid Z = z)$ can be computed analytically for any value of z. The value of $\ell$ is estimated as

$$\hat{\ell}_C = \frac{1}{N} \sum_{k=1}^{N} E(Y \mid Z_k),\qquad(20)$$

where $Z_1, \ldots, Z_N$ are samples of the random vector Z.

The PMC method uses a different formulation of the problem. The static system described above is interpreted as an alternative evolution model captured at a specific point of time. The alternative system is described using the adjacency matrix and a random vector $X = (X_1, \ldots, X_n)$. In this case the component states change in time. At time t = 0 all of the components are failed. The random variable $X_i$ has the exponential distribution with parameter $\lambda_i$ and it describes the time to repair of the i-th component $c_i$. It is defined by the relation $P(X_i > 1) = p_i$, which is fulfilled for $\lambda_i = -\ln(p_i)$. Using this formulation the unavailability $\ell$ of the system can be expressed as

$$\ell = E\left(I_{S(X) > 1}\right) = P(S(X) > 1),\qquad(21)$$

where S(X) is a function that returns the time when the system starts operating for a sample of X.



Let us define $\Pi = (\Pi_1, \ldots, \Pi_n)$ as the permutation of the values $1, \ldots, n$ obtained by sorting the values $X_1, \ldots, X_n$ in ascending order, i.e. $X_{\Pi_1} \leq \cdots \leq X_{\Pi_n}$. The random variables $X_1, \ldots, X_n$ denote the times to repair of the components, therefore the random vector $\Pi$ denotes the order in which the components were put into operation. The function $\mathrm{crit}(\Pi)$ returns the smallest number of components that must be put into operation until the whole system is operational. Therefore $S(X) = X_{\Pi_{\mathrm{crit}(\Pi)}}$.

At this point all the random variables needed to use the Conditional Monte Carlo method are defined. In this case $Y = I_{S(X) > 1}$ and $Z = \Pi$. It is necessary to compute the value

$$G(\pi) = E\left(I_{S(X) > 1} \mid \Pi = \pi\right)\qquad(22)$$

analytically. In (Kroese, Taimre, & Botev 2013) the value of $G(\pi)$ is computed as

$$G(\pi) = \sum_{j=1}^{c} \rho_{c,j} \exp(-\sigma_j),\qquad(23)$$

where $c = \mathrm{crit}(\pi)$,

$$\sigma_i = \sum_{j=i}^{n} \lambda_{\pi_j} \quad \text{for } i \in \{1, \ldots, c\}\qquad(24)$$

and the values $\rho_{c,j}$ are given by the recursive formula

$$\rho_{1,1} = 1, \qquad \rho_{k+1,j} = \rho_{k,j}\, \frac{\sigma_{k+1}}{\sigma_{k+1} - \sigma_j},\qquad(25)$$

$$\rho_{k+1,k+1} = 1 - \sum_{j=1}^{k} \rho_{k+1,j}\qquad(26)$$

for $k \in \{1, \ldots, c-1\}$ and $j \in \{1, \ldots, k\}$.

5.1 CUDA accelerated implementation

The Matlab implementation presented in (Kroese, Taimre, & Botev 2013) uses pre-computed values $v_k$ ($k \in \{1, \ldots, c\}$) for the calculation of the values $\rho_{k,j}$, which are saved in the form of a matrix. This approach is not suitable for the CUDA implementation, because the threads can only use a limited amount of the local memory. However, for the calculation of the values $\rho_{c,j}$, $j \in \{1, \ldots, c\}$, an explicit formula

$$\rho_{c,j} = \prod_{i=1,\, i \neq j}^{c} \frac{\sigma_i}{\sigma_i - \sigma_j}\qquad(27)$$

can be derived; the use of this formula leads to the reduction of the memory requirements.

The calculation of S(x) presented in (Kroese, Taimre, & Botev 2013) is based on the sequential construction of the incidence matrix, and in each step it is decided whether the system is operational or not. For the CUDA implementation it is not convenient to use this method of calculation, because the process of the construction of the incidence matrix differs for the specific vectors x and every thread would need to record its own adjacency matrix. For this reason a new implementation based on the Algorithm 2 was chosen. This implementation (see Algorithm 3) uses the matrix M, which is common for all the threads and therefore can be stored in the global or constant memory.

5.2 Computation time

The CUDA accelerated implementation was compared to the original Matlab implementation using the testing reliability network with 18 components, see Figure 3. The same hardware as in section 4.4 was used for the testing. Due to the higher computation time we chose N = 7.488 × 10⁶ and we also considered p = 0.01. The original Matlab implementation uses a loop over the samples, therefore it can be easily parallelized using the parfor construct. The results of these three versions of the implementation are shown in Table 2.

Table 2. PMC method implementation acceleration.

                                 Time [s]
Matlab                           2332.18
Matlab + parfor (16 threads)      183.58
Matlab + CUDA                        1.91
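The recursive and the explicit computation of the mixture coefficients can be checked against each other numerically. The sketch below (Python for illustration, hypothetical distinct sojourn rates) evaluates the conditional probability G(π) = P(S > 1 | π) both ways:

```python
import math

def g_recursive(sigma):
    """G(pi) for a sum of independent Exp(sigma[i]) sojourn times,
    via the recursion rho_{1,1} = 1,
    rho_{k+1,j} = rho_{k,j} * s_{k+1} / (s_{k+1} - s_j),
    rho_{k+1,k+1} = 1 - sum_{j<=k} rho_{k+1,j}."""
    rho = [1.0]
    for k in range(1, len(sigma)):
        rho = [r * sigma[k] / (sigma[k] - sigma[j]) for j, r in enumerate(rho)]
        rho.append(1.0 - sum(rho))
    return sum(r * math.exp(-s) for r, s in zip(rho, sigma))

def g_explicit(sigma):
    """Same quantity via the explicit product formula for rho_{c,j},
    which avoids storing the whole triangular coefficient matrix."""
    c = len(sigma)
    rho = [math.prod(sigma[i] / (sigma[i] - sigma[j])
                     for i in range(c) if i != j) for j in range(c)]
    return sum(r * math.exp(-s) for r, s in zip(rho, sigma))

sigma = [5.0, 3.0, 1.5]   # hypothetical decreasing sojourn rates
assert abs(g_recursive(sigma) - g_explicit(sigma)) < 1e-12
print(round(g_recursive(sigma), 6))
```

Both routines assume the rates are distinct; repeated rates would need the confluent form of the hypoexponential distribution.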



6 APPLICATIONS

The proposed approach based on the IS method was applied to the testing systems presented in section 4.4. For the experiments the CUDA accelerated version of the implementation was used. For comparison, the same problems were also solved using the PMC method, likewise accelerated using CUDA.

The unreliability of all components is identical and equals p. Both problems are solved for ten different values of p, specifically $p \in \{0.1, 0.1^2, \ldots, 0.1^{10}\}$. The following inputs of the CE method are chosen: the sample size $N_{CE} = 7.488 \times 10^5$, 10 iterations predetermined as the stopping criterion, and an initial vector $q^{(0)}$ that differs depending on the vector p. The sample size is $N = 7.488 \times 10^6$ for both simulation methods.

Figure 5. Ratio of the variances, higher value is better.

Table 3. Achieved variance reduction.

       Variance s²                        Variance reduction
p      MC        IS + CE    PMC          IS + CE    PMC
0.1    2.38e-02  5.40e-03   1.28e-02     4.4        1.9
0.1²   2.02e-04  5.49e-07   1.79e-05     368        11
0.1³   2.00e-06  4.34e-11   3.23e-09     46156      621
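The kind of variance reduction reported in Table 3 can be reproduced in miniature. The sketch below (Python; a hypothetical four-component series-parallel system, not the paper's networks) estimates the system failure probability by crude MC and by IS with the component unreliability biased from p = 0.01 up to q = 0.1, and compares the variances of the two estimators:

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, n, N = 0.01, 0.1, 4, 200_000

def system_failed(B):
    # hypothetical structure: two parallel pairs in series;
    # the system fails when both components of either pair fail
    return (B[:, 0] & B[:, 1]) | (B[:, 2] & B[:, 3])

# crude Monte Carlo: components fail with probability p
B_mc = rng.random((N, n)) < p
mc = system_failed(B_mc).astype(float)

# importance sampling: sample failures with probability q, then reweight
B_is = rng.random((N, n)) < q
k = B_is.sum(axis=1)                               # failures per sample
w = (p / q) ** k * ((1 - p) / (1 - q)) ** (n - k)  # likelihood ratio W
is_est = system_failed(B_is) * w

exact = 2 * p**2 - p**4
print(mc.mean(), is_est.mean(), exact)
print("variance ratio:", mc.var() / is_est.var())
```

With these hypothetical numbers the IS estimator typically reduces the variance by roughly two orders of magnitude, mirroring the trend in Table 3, and the effect grows rapidly as p decreases.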

6.1 Reliability network of 60 components

The network graph of this system is shown in Figure 2.

For p = 0.1 the initial vector $q^{(0)} = p$ was chosen and as a result of the CE method we obtained an optimal vector q of the parameters of the IS pdf, denoted $q = q_{0.1}$. This vector was used as an input of the IS method; the estimation of the unavailability of the whole system is $\hat{\ell}_{IS} = 2.44 \times 10^{-2}$. For p = 0.01 we have many possibilities of choosing the initial vector; however, the vector $q_{0.1}$ obtained for the previous value of p appeared to be an appropriate choice. For the remaining values $p \in \{0.1^3, \ldots, 0.1^{10}\}$ the procedure is analogous: as the initial vector $q^{(0)}$ we always use the optimal vector q that was obtained for the higher value of the unreliability p.

The results of the IS method are written in Table 4; for comparison the results of the PMC method are listed too. The approximations of the system unavailability are almost equal; both methods work properly. The differences are in the accuracy of these results; in Table 4 the accuracy is represented by the Relative Standard Deviation (RSD) estimated as

$$\mathrm{RSD} = \frac{s}{\hat{\ell}\sqrt{N}} \cdot 100,\qquad(28)$$

where s is the standard deviation and $\hat{\ell}$ is the approximation of the system unavailability; a smaller value of the RSD is better. The graph in Figure 5 shows the ratio of the variances achieved by both methods; for $p < 0.1^3$ the variance achieved by the IS estimator is more than 100 times lower than the variance of the PMC estimator. For the suitability of the IS approach it is especially convenient that the ratio of the variances grows with lower unavailability, i.e. it is particularly suitable for highly reliable systems.

Table 3 compares both methods with the simple MC simulation in terms of the achieved variance reduction. Results for $p < 0.1^3$ are not listed because the sample size N was not sufficient to capture the rare event.

6.2 Reliability network of 18 components

This system is given by the network graph in Figure 3.

Figure 6. Vector q for the non-directed system B.

We applied the same procedure to the series of problems depending on $p \in \{0.1, 0.1^2, \ldots, 0.1^{10}\}$. For p = 0.1 we chose $q^{(0)} = p$ and the CE method returned the optimal vector $q_{0.1}$ of the IS pdf parameters, which is demonstrated by the upper graph of Figure 6. For p = 0.01 we chose $q^{(0)} = q_{0.1}$ and obtained $q_{0.01}$ as a result of the CE method, see the lower graph of Figure 6. Graphs for p = 0.001 and lower are not plotted; they would coincide with the lower graph. We can notice that the values of these vectors correspond to the importance of the individual components, i.e. a high value of



Table 4. Comparisons of the proposed IS approach and the PMC method.

        System A (60 components)                System B (18 components)                System B directed
        IS                 PMC                  IS                 PMC                  IS                 PMC
p       est.      RSD      est.      RSD       est.      RSD      est.      RSD       est.      RSD      est.      RSD
0.1     2.44e-02  0.110    2.44e-02  0.169     1.90e-02  0.129    1.91e-02  0.090     2.34e-02  0.112    2.35e-02  0.083
0.1²    2.03e-04  0.133    2.05e-04  0.753     2.00e-05  0.294    1.99e-05  0.206     2.75e-05  0.213    2.77e-05  0.175
0.1³    2.00e-06  0.120    1.98e-06  1.051     2.01e-08  0.279    1.99e-08  0.228     2.80e-08  0.218    2.79e-08  0.192
0.1⁴    2.00e-08  0.097    2.00e-08  1.083     2.00e-11  0.288    2.00e-11  0.230     2.79e-11  0.216    2.80e-11  0.194
0.1⁵    2.00e-10  0.098    1.98e-10  1.092     2.00e-14  0.282    2.00e-14  0.230     2.82e-14  0.216    2.81e-14  0.193
0.1⁶    2.00e-12  0.097    1.98e-12  1.093     1.99e-17  0.267    1.99e-17  0.231     2.81e-17  0.215    2.81e-17  0.193
0.1⁷    2.00e-14  0.097    1.99e-14  1.089     1.99e-20  0.293    1.99e-20  0.231     2.81e-20  0.218    2.80e-20  0.194
0.1⁸    2.00e-16  0.097    2.03e-16  1.077     2.00e-23  0.284    2.00e-23  0.231     2.78e-23  0.216    2.79e-23  0.194
0.1⁹    2.00e-18  0.097    2.01e-18  1.084     2.00e-26  0.274    2.00e-26  0.230     2.79e-26  0.222    2.80e-26  0.194
0.1¹⁰   2.00e-20  0.097    1.96e-20  1.096     2.01e-29  0.271    2.00e-29  0.231     2.80e-29  0.220    2.79e-29  0.194
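Since both estimators in Table 4 use the same sample size, the variance ratio between the PMC and the IS estimators follows directly from the squared ratio of their RSD values. For example, for System A at p = 0.1⁴ (plain arithmetic on the table entries):

```python
# System A, p = 0.1^4: RSD values taken from Table 4
rsd_is, rsd_pmc = 0.097, 1.083

# equal N and nearly equal estimates => variance ratio ~ (RSD ratio)^2
print(round((rsd_pmc / rsd_is) ** 2, 1))  # prints 124.7
```

This is consistent with the "more than 100 times lower variance" reading of Figure 5 for p below 0.1³.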

$q_i$ means that if the component i is failed, the whole system is failed with high probability.

The computed values of the system unavailability and the RSD for both simulation methods are written in Table 4. The values of the unavailability agree with the results reported in (Kroese, Taimre, & Botev 2013). The values of the RSD are now slightly lower for the PMC method; however, we can see that the IS approach is suitable for this reliability network, which serves as a testing problem for the PMC method.

6.3 System with oriented edges

Consider again the system given by the network graph in Figure 3. We are interested in a system with a similar structure, but the edges are now treated as oriented. Obviously only the edge number 10 is working in both directions. Let us say this edge will communicate only in the direction from left down to up right, e.g. the path going from IN to OUT through the edges 2, 6, 10, 14, 18 is valid but the path going from IN to OUT through the edges 1, 4, 8, 13, 10, 11, 15 is not valid. It is expected that after this restriction the system unavailability will grow.

The IS approach was applied to this system using the same procedure as in the previous cases, see Figure 7. We can observe the effect of this structural change on the importance of the individual components. After the modifications in the implementation, the PMC method can also be used for directed systems. For the results of both methods see Table 4. As expected, the values of the unavailability are higher than in the previous case.

Figure 7. Vector q for the directed system B.

7 CONCLUSIONS

The simulation method based on the importance sampling technique, with the IS pdf parameters estimated using the cross-entropy method, was successfully applied to highly reliable systems with independent components. The proposed procedure of choosing the initial vector of the CE method has proven to be beneficial. In section 6 it was used for solving a series of problems with decreasing unreliability; however, this process can be used generally when estimating very low values of the unavailability $\ell$. The procedure is summarized in our general Algorithm 4 for rare event probability quantification.

Notice that in section 6 we worked with a sequence $10^{-9}, 10^{-8}, \ldots, 10^{0}$.

The results show that this IS-based approach is well suited for rare events quantification in the field of highly reliable systems simulation due to its massive variance reduction. For example, in the case of the testing system with unavailability $\ell = 2 \times 10^{-6}$ the variance was reduced more than 4 × 10⁴ times in comparison to the MC method. It was



not possible to apply the simple MC method to more reliable systems; however, it was shown that the variance reduction increases with the increasing system reliability.

The approach was verified by applying the PMC method to the same series of testing problems. The results obtained by both methods were comparable; the IS-based method was successful especially in the case of the system of 60 components with a different impact to the system reliability, where it achieved about 100 times lower variance.

Significant acceleration of the simulations was achieved using CPU and GPU parallel computing. Especially CUDA has proven to be a powerful technology for simulation based algorithms. For example, the computation time of the CUDA accelerated PMC method was approximately 1200 times shorter than the computation time of the non-accelerated Matlab implementation. The modified implementation of this successful simulation method also brought a generalization for directed systems given by an adjacency matrix.

ACKNOWLEDGEMENT

This work was supported by The Ministry of Education, Youth and Sports from the National Programme of Sustainability (NPU II) project "IT4Innovations excellence in science" (LQ1602).

REFERENCES

Briš, R. (2007). Inovation methods for reliability quantification of systems and elements. Ostrava, Czech Republic: Vysoká škola báňská - Technická univerzita Ostrava.
Briš, R. (2010). Exact reliability quantification of highly reliable systems with maintenance. Reliability Engineering & System Safety 95(12), 1286-1292.
Briš, R. & S. Domesová (2014). New computing technology in reliability engineering. Mathematical Problems in Engineering.
Dubi, A. (2000). Monte Carlo applications in systems engineering. Wiley.
Kleijnen, J.P.C., A.A.N. Ridder, & R.Y. Rubinstein (2010). Variance reduction techniques in Monte Carlo methods.
Kroese, D.P., T. Taimre, & Z.I. Botev (2013). Handbook of Monte Carlo Methods. John Wiley & Sons.
NVIDIA (2014). CUDA C best practices guide. URL: http://docs.nvidia.com/cuda/cuda-c-best-practicesguide/.
Ross, S.M. (2007). Introduction to probability models. Elsevier Inc.
Rubinstein, R.Y. & D.P. Kroese (2011). Simulation and the Monte Carlo method. John Wiley & Sons.
Rubinstein, R.Y. & D.P. Kroese (2013). The cross-entropy method: a unified approach to combinatorial optimization, Monte-Carlo simulation and machine learning. Springer Science & Business Media.



Network and wireless network reliability

Applied Mathematics in Engineering and Reliability - Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

Advanced protocol for wireless information and power transfer in full duplex DF relaying networks


Xuan-Xinh Nguyen
Wireless Communications Research Group, Ton Duc Thang University, Ho Chi Minh City, Vietnam

Duy-Thanh Pham & Thu-Quyen Nguyen


Ton Duc Thang University, Ho Chi Minh City, Vietnam

Dinh-Thuan Do
Ho Chi Minh City of Technology and Education, Ho Chi Minh City, Vietnam

ABSTRACT: The issue of energy consumption in wireless network communication systems has attracted much attention from researchers in recent years. The problem of effective energy consumption for cellular networks has become an important key in the system design process. This paper proposes a new protocol for power transfer, named the Time Switching Aware Channel (TSAC) protocol, in which the system can be aware of the channel gain to adjust the proper time for the power transfer. This paper also investigates the throughput optimization problem of energy consumption in a Decode-and-Forward (DF) based cooperative network approach, in which an allocated relay power transmission is proposed. By assuming that the signal at a relay node is decoded correctly when there is no outage, the optimal throughput efficiency of the system is analytically evaluated.

1 INTRODUCTION

The trend of the researchers towards energy consumption in wireless communication systems has experienced a drastic change over recent years. There have been increasing energy-aware radio access solutions, where a prudent use of energy is one of the decisive design elements. Besides, applications involving wireless sensor networks are becoming increasingly popular in today's demanding life.

In conventional sensor networks and cellular networks, wireless devices are equipped with replaceable or rechargeable batteries. However, the lifetime of these battery powered devices is usually limited. Due to many inconvenient circumstances, such as a sensor network with thousands of distributed sensor nodes, devices located in toxic environments, or medical sensors implanted inside human bodies, replacing or recharging the available batteries periodically may not be a reasonable option.

For those reasons, obtaining a permanent power supply through energy harvesting (Ozel et al., 2011, Chin Keong and Rui, 2012) has become an attractive methodology to prolong the wireless network lifetime. Nowadays, solar and wind are utilized as typical energy resources in our life. In addition, by taking advantage of the idea that wireless signals can carry both energy as well as information (Varshney, 2008), the surrounding radio signal, considered as a novel viable source, is attracting more and more research attention in the field of wireless communication systems.

Basing on the first announced approach in (Varshney, 2008), more practical receiver architectures have been developed by supposing that the receiver has two circuits to separately perform energy harvesting and information decoding (Zhou et al., 2013, Nasir et al., 2013). Especially, with a strategy called time switching, the receiver can switch on the two circuits at separate times. With a power splitting strategy, it is possible to divide the observations into two streams which are directed to the two circuits at the same time. The work (Zhou et al., 2013) has taken into account a simple single-input single-output scenario, and the upgrade to multi-input multi-output broadcasting scenarios has been considered in (Zhang and Ho, 2013).

In the paper (Xiaoming et al., 2014), a time allocation policy is carried out for two transmitters; the efficiency of the energy transfer is maximized by means of an energy beamformer that exploits a quantized version of the CSI which is received in the uplink by the energy transmitter. Moreover, G. Yang investigates the optimal time and power allocation strategies (Gang et al., 2014) so that



the total amount of harvested energy is maximized, and he takes into account the effect of the CSI accuracy on the latter quantity.

Relay assisted systems with energy transfer have focused on two main directions: a) Simultaneous Wireless Information and Power Transfer (SWIPT) scenarios, where the employed relay (Ng and Schober, 2013, Ng et al., 2013a) (or the source terminal (Ng et al., 2013b)) salvages energy from the radiated signal incident from the source terminal (or the employed relay); b) multi-hop energy transfer scenarios, in which the energy is transferred to remote terminals (Xiaoming et al., 2014, Gang et al., 2014). The results in (Xiaoming et al., 2014) show that multi-hop energy transfer can decrease the high path-loss of the energy-bearing signal, while (Gang et al., 2014) deals with the case where a multi-antenna relay feeds two separate terminals with information and power, respectively, and studies the transmission rate and outage probability that is sacrificed in the remote energy transfer.

The principle of the Full Duplex (FD) technique, which allows a communication node to transmit and receive signals simultaneously over the same frequency, has been announced and discussed in (Choi et al., 2010, Duarte et al., 2012, Yingbo et al., 2012, Rui et al., 2010) and (Krikidis et al., 2012). In comparison with the Half Duplex (HD) mode, the FD mode has the ability to double the spectral efficiency due to its efficient exploitation of the limited resources. The self-interference of the FD mode, however, leaking from a node's transmission to its own reception, reduces the performance of FD communication.

An objective of the energy harvesting communications, called throughput optimization, has been broadly studied in volumes of literature. In (Tutuncuoglu and Yener, 2012) and (Jing and Ulukus, 2012), the throughput optimization for a transmitter with a deadline constraint was investigated over a static channel condition. In addition, the problems of throughput optimization were extended and applied to fading and multiple access channels (Chin Keong and Rui, 2010, Ozel et al., 2011, Jing and Ulukus, 2011). Besides, the cooperation between nodes was also introduced and considered for throughput optimization in the energy harvesting communications. In (Chuan et al., 2013), the problem of throughput maximization was investigated for the orthogonal Gaussian relay channel with the energy harvesting constraints.

In particular, apart from the aforementioned literature, this paper considers a FD DF relaying network underlaying wireless energy transfer. We use and improve the Time Switching (TS) receiver mechanism so that the relay harvests the energy from the source RF radiation, in which the system can be aware of the channel gain to adjust the proper time for the power transfer and the information communication. The main contributions can be described as follows:

1. This paper proposes a new protocol for energy harvesting at an energy constrained relay that can be aware of the Channel State Information (CSI) to allocate the time for a fixed pre-defined power.
2. A closed-form analytical expression for the system's throughput and a numerical result for the optimal relay transmission power allocation are also derived.

The rest of this paper is organized as follows: section 2 introduces the system model and presents the energy harvesting protocol. Section 3 derives the outage probability and throughput analysis and the power allocation policy for the throughput optimization of the system. The numerical result is presented in section 4. Finally, section 5 concludes the proposed protocol of this paper.

2 SYSTEM MODEL AND PROTOCOL DESCRIPTION

As shown in Fig. 1, we consider a system model which includes a source denoted by S, a destination denoted by D and an intermediate assistance relay denoted by R. The nodes S and D are each installed with one antenna and therefore work in half-duplex mode; R is equipped with two antennas and operates in full-duplex mode.

Channel assumptions: h, f, g are independent and identically distributed (i.i.d.) exponential random variables with means $\lambda_h$, $\lambda_f$ and $\lambda_g$, respectively. The channel gains can be obtained by using training sequences.

Figure 1. System model of full duplex relay communication.

2.1 Signal model

In the Wireless Information Transfer (WIT) phase, the received signals at R, $y_{R,i}$, and at D, $y_{D,i}$, in the i-th time slot are given by



Figure 2. Energy harvesting system model.

$$y_{R,i} = \sqrt{\frac{P_S}{d_1^m}}\, h_i x_i + \sqrt{P_R}\, f_i \tilde{x}_i + n_{R,i}\qquad(1)$$

$$y_{D,i} = \sqrt{\frac{P_R}{d_2^m}}\, g_i \tilde{x}_i + n_{D,i}\qquad(2)$$

where $(.)_i$ denotes the block time index i of (.); $x_i$ and $\tilde{x}_i$ are the message symbol at S and the decoded symbol at R, respectively, with unit power and zero average. It is supposed that the relay node always decodes exactly when the S-R link has no outage. Besides, $h_i$, $g_i$ are the source to relay channel gain and the relay to destination channel gain in the time slot i, respectively; $P_S$, $P_R$ are the transmitted powers from the source and the relay, respectively; m is the path loss exponent; $d_1$ is the source to relay distance and $d_2$ is the relay to destination distance; $n_{R,i}$ and $n_{D,i}$ are the respective AWGNs at R and D in the block time i.

From (1) and (2) the SINRs at R and D in the time slot i, respectively, are determined by

$$\gamma_{R,i} = \frac{P_S |h_i|^2}{P_R d_1^m |f_i|^2 + d_1^m \sigma_{R,i}^2}\qquad(3)$$

$$\gamma_{D,i} = \frac{P_R |g_i|^2}{d_2^m \sigma_{D,i}^2}\qquad(4)$$

where $\sigma_{R,i}^2$, $\sigma_{D,i}^2$ are the variances of the AWGNs $n_{R,i}$ and $n_{D,i}$, respectively.

Based on the DF relaying scheme, the end-to-end SINR in the block time i is given by

$$\gamma_{e2e,i} = \min\left(\gamma_{R,i}, \gamma_{D,i}\right)\qquad(5)$$

where $\gamma_{R,i}$, $\gamma_{D,i}$ are the performances of the first and the second hop, given by (3) and (4) respectively.

2.2 Energy harvesting protocol

In the Wireless Power Transfer (WPT) time slot, we focus on the performance of the scheme that uses the new energy harvesting protocol, which is named Time Switching Aware Channel (TSAC). The protocol can be described as below (see Figure 2).

In this protocol, the antenna which is responsible for the received signal absorbs the RF signal and converts it to a DC signal. Therefore, the received signal at the energy constrained relay is given by

$$y_{R,i} = \sqrt{\frac{P_S}{d_1^m}}\, h_i x_i + n_{R,i}\qquad(6)$$

As in (Zhou et al., 2013, Nasir et al., 2013), the absorbable energy can be saved in an extreme-capacitor and then entirely used for the transmission stage. Hence, the harvested energy at R can be described as

$$E_i = \eta\, \frac{P_S |h_i|^2}{d_1^m}\, \alpha_i T\qquad(7)$$

where $0 < \eta < 1$ is the energy conversion efficiency, which depends on the rectification process and the energy harvesting circuitry.

In this protocol, the transmitted power at the relay node is predefined with a fixed value, denoted by $P_R$. Hence, the utilizable energy to work is given by

$$E_i = P_R (1 - \alpha_i) T\qquad(8)$$

Setting (7) equal to (8), the allocated fraction of time for the power transfer in any block time is derived as follows

$$\alpha_i = \frac{P_R d_1^m}{\eta P_S |h_i|^2 + P_R d_1^m}\qquad(9)$$

In equation (9), the duration of the time allocated to the energy harvesting phase is a function of several parameters, including the channel gain $h_i$ (which can be estimated as in (Love et al., 2008))¹, the preset power at the relay node $P_R$, the distance $d_1$ between S and R, the absorbable coefficient $\eta$ and the power transmission $P_S$ at the source. Specially, this fraction of time is always less than one, i.e. $\alpha_i < 1$, which implies that the allocated time can respond for the wireless information and power transfer.

¹ For the channel gain, the source and relay node have to obtain the CSI. At the beginning of each block (e.g. the block time i), the CSI acquisition is achieved in two steps. In the first step, the source transmits its pilot signal to the relay, and the relay estimates $h_i$. In the second step, the relay R feeds back $h_i$ to the source node S. In order to reduce the feedback overhead, the relay can feed back a quantized version to the source.
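The time split in Eq. (9) is exactly the value that balances the harvested energy (7) against the energy spent on transmission (8). A quick numeric check (Python; all parameter values hypothetical, in linear scale):

```python
def alpha(P_R, P_S, h2, d1, m, eta):
    """TSAC harvest/transmit time split of Eq. (9); h2 = |h_i|^2."""
    return P_R * d1**m / (eta * P_S * h2 + P_R * d1**m)

# hypothetical parameter values
P_S, P_R, h2, d1, m, eta, T = 4.0, 0.5, 1.3, 3.0, 3, 0.8, 1.0
a = alpha(P_R, P_S, h2, d1, m, eta)

harvested = eta * P_S * h2 / d1**m * a * T   # Eq. (7)
spent = P_R * (1 - a) * T                    # Eq. (8)
print(0 < a < 1, abs(harvested - spent) < 1e-12)  # prints: True True
```

A larger channel gain |h_i|² shrinks α_i, which is the point of the protocol: good channels need less harvesting time, leaving more of the block for information transfer.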



3 OUTAGE PROBABILITY AND THROUGHPUT ANALYSIS

In this section, the throughput and the outage probability of the proposed protocol are analyzed. In this mode, the outage occurs when the system performance, i.e. $\gamma_{e2e,i}$, drops below the threshold value $\gamma_0$, which is defined as $\gamma_0 = 2^{R_c} - 1$, with $R_c$ the transmission rate of the system. The expression of the outage probability can therefore be obtained as

$$OP_i = \Pr\{\gamma_{e2e,i} < \gamma_0\} = \Pr\{\min(\gamma_{R,i}, \gamma_{D,i}) < \gamma_0\}.\qquad(10)$$

Because $\gamma_{R,i}$ is independent of $\gamma_{D,i}$, the outage probability can be rewritten as

$$OP_i = 1 - \Pr\{\gamma_{R,i} > \gamma_0\}\,\Pr\{\gamma_{D,i} > \gamma_0\}.\qquad(11)$$

In this transmission mode, i.e. the delay-constrained mode, the throughput efficiency of the system in the time slot i is a function of the outage probability and the EH duration, which is formulated as

$$t_i = (1 - OP_i)(1 - \alpha_i)\qquad(12)$$

and the average throughput efficiency of the system is

$$t = E\{t_i\},\qquad(13)$$

where $E\{x\}$ is the expectation of the variable x.

The optimal value of the transmitted power for the throughput maximization, which can vary with the system parameters, i.e. $P_R$, $P_S$, $\eta$, $\gamma_0$, is determined by solving

$$P_R^{opt} = \arg\max_{P_R}\{t(P_R)\}, \quad \text{subject to } P_R \geq 0.\qquad(14)$$

The throughput of the system in this case can be determined as in (13), and after some algebraic manipulations we have the proposition below.

Proposition 1: the average throughput of the full-duplex relaying energy harvesting network with the TSAC protocol can be expressed as

$$t = \exp\left(-\frac{\gamma_0 d_2^m \sigma_D^2}{\lambda_g P_R}\right)\left[\exp\left(-\frac{\gamma_0 d_1^m \sigma_R^2}{\lambda_h P_S}\right)\frac{\lambda_h P_S}{\lambda_h P_S + \gamma_0 P_R d_1^m \lambda_f} - \frac{X}{\lambda_h}\exp\left(\frac{X}{\lambda_h}\right)\left(\exp\left(\frac{X+\delta}{\beta\lambda_f}\right)\mathrm{Ei}\left(-\frac{X+\delta}{\beta\lambda_f}-\frac{X+\delta}{\lambda_h}\right)-\mathrm{Ei}\left(-\frac{X+\delta}{\lambda_h}\right)\right)\right],\qquad(15)$$

where $X = P_R d_1^m/(\eta P_S)$, $\beta = \gamma_0 P_R d_1^m/P_S$, $\delta = \gamma_0 d_1^m \sigma_R^2/P_S$ and $\mathrm{Ei}$ is the exponential integral function as in eq. 8.211 of (Jeffrey and Zwillinger, 2007).

Proof: the proposition 1 can be derived as below.

Substituting (9) and (11) into (12), the average throughput of the system can be expressed as

$$t = E_{|g_i|^2}\left\{\mathbf{1}\{\gamma_{D,i} > \gamma_0\}\right\}\, E_{|h_i|^2,|f_i|^2}\left\{\mathbf{1}\{\gamma_{R,i} > \gamma_0\}(1-\alpha_i)\right\} = t_1 (t_2 - t_3).\qquad(16)$$

The first factor can be given by

$$t_1 = E_{|g_i|^2}\left\{\mathbf{1}\{\gamma_{D,i} > \gamma_0\}\right\} = \Pr\left\{P_R |g_i|^2 > \gamma_0 d_2^m \sigma_D^2\right\} = \exp\left(-\frac{\gamma_0 d_2^m \sigma_D^2}{\lambda_g P_R}\right)\qquad(17)$$

and the second term is

$$t_2 = E_{|h_i|^2,|f_i|^2}\left\{\mathbf{1}\{\gamma_{R,i} > \gamma_0\}\right\} = \int_0^\infty \frac{1}{\lambda_f}\exp\left(-\frac{x}{\lambda_f}\right)\exp\left(-\frac{\gamma_0 P_R d_1^m x + \gamma_0 d_1^m \sigma_R^2}{\lambda_h P_S}\right)dx = \exp\left(-\frac{\gamma_0 d_1^m \sigma_R^2}{\lambda_h P_S}\right)\frac{\lambda_h P_S}{\lambda_h P_S + \gamma_0 P_R d_1^m \lambda_f}.\qquad(18)$$

Finally, the third term is determined as

$$t_3 = E_{|h_i|^2,|f_i|^2}\left\{\mathbf{1}\{\gamma_{R,i} > \gamma_0\}\,\alpha_i\right\} = E_{|h_i|^2,|f_i|^2}\left\{\mathbf{1}\left\{|f_i|^2 < \frac{|h_i|^2 - \delta}{\beta}\right\}\frac{X}{|h_i|^2 + X}\right\} = \frac{1}{\lambda_h}\int_{\delta}^{\infty}\exp\left(-\frac{x}{\lambda_h}\right)\left(1-\exp\left(-\frac{x-\delta}{\beta\lambda_f}\right)\right)\frac{X}{x+X}\,dx$$



$$= \frac{X}{\lambda_h}\exp\left(\frac{X}{\lambda_h}\right)\left[\exp\left(\frac{X+\delta}{\beta\lambda_f}\right)\mathrm{Ei}\left(-\frac{X+\delta}{\beta\lambda_f}-\frac{X+\delta}{\lambda_h}\right)-\mathrm{Ei}\left(-\frac{X+\delta}{\lambda_h}\right)\right],\qquad(19)$$

where $X = P_R d_1^m/(\eta P_S)$, $\beta = \gamma_0 P_R d_1^m/P_S$, $\delta = \gamma_0 d_1^m \sigma_R^2/P_S$ and $\mathrm{Ei}$ is the exponential integral function as in eq. 8.211 of (Jeffrey and Zwillinger, 2007). The last integral can be derived by applying eq. 3.352.2 given in (Jeffrey and Zwillinger, 2007).

Substituting (17), (18) and (19) into (16), the proposition 1 is easily derived. This completes the proof.
Solving (14) by using (15), the optimal transmit power at the relay that maximizes the throughput can be obtained. Because of the complexity of the expression, a closed-form optimal solution cannot be derived. However, the optimal value can be found by numerical simulation, as presented in the next section.

Figure 3. Throughput efficiency versus P_R with η = 0.6, 0.8, 1.
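Since (15) does not yield a closed-form maximizer for (14), the optimum is typically located by a one-dimensional numerical search. The sketch below is a minimal grid search; the throughput callable keeps only the success factors t1 and t2 from (17)–(18) (a simplifying assumption for illustration, not the full expression (15)), and all parameter values are illustrative:

```python
import numpy as np

# Simplified stand-in for the average throughput: only the success factors
# t1 (17) and t2 (18) are kept; t3 and the exact constants of (15) are omitted.
def throughput(PR, PS=10**2.6, g0=10.0, d1=3.0, d2=1.0, m=3,
               lam_h=1.0, lam_g=1.0, lam_f=1e-2,
               s2D=10**-0.5, s2R=10**-1.0, alpha=0.3):
    t1 = np.exp(-g0 * d2**m * s2D / (lam_g * PR))                  # eq. (17)
    t2 = (np.exp(-g0 * d1**m * s2R / (lam_h * PS))
          * lam_h * PS / (lam_h * PS + g0 * PR * d1**m * lam_f))   # eq. (18)
    return (1.0 - alpha) * t1 * t2

# Numerical solution of (14): grid search over PR >= 0.
PR_grid = np.linspace(0.1, 100.0, 2000)
tau = throughput(PR_grid)
PR_opt = PR_grid[np.argmax(tau)]
```

The curve is unimodal here because t1 grows with P_R while t2 shrinks with it, so a coarse grid followed by local refinement is usually sufficient.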

4 NUMERICAL RESULTS

This section uses the derived analytical results and compares them with Monte Carlo simulation. There is close agreement between the two, which verifies the validity of the analytical results.
We set the SINR threshold γ_0 = 10 dB; the average channel gains λ_h = λ_g = 0 dB and λ_f = −20 dB; the distance of the first hop S→R, d_1 = 3 m; the distance between the R node and the D node, d_2 = 1 m; the path loss exponent m = 3; the energy harvesting efficiency η = 0.8; the transmission power at the source P_S = 26 dB; and the noise variances at the destination and relay nodes σ_D² = −5 dB and σ_R² = −10 dB, respectively.
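The agreement between analysis and simulation reported here can be spot-checked for the per-link success probabilities behind (17) and (18). The sketch below uses the stated parameters converted from dB to linear scale; the per-link SINR models are the reconstructions used above and should be read as assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200_000

# Stated parameters, dB -> linear (a modelling assumption for this sketch).
PS, PR = 10**2.6, 10**0.9          # 26 dB source power, 9 dB relay power
g0 = 10.0                          # 10 dB SINR threshold
d1, d2, m = 3.0, 1.0, 3
lam_h = lam_g = 1.0                # 0 dB mean channel gains
lam_f = 1e-2                       # -20 dB self-interference link
s2D, s2R = 10**-0.5, 10**-1.0      # -5 dB and -10 dB noise variances

g2 = rng.exponential(lam_g, N)     # |g|^2
h2 = rng.exponential(lam_h, N)     # |h|^2
f2 = rng.exponential(lam_f, N)     # |f|^2 (residual self-interference)

# Monte Carlo vs. closed forms for t1 (17) and t2 (18).
t1_mc = np.mean(PR * g2 / (d2**m * s2D) > g0)
t1_cf = np.exp(-g0 * d2**m * s2D / (lam_g * PR))

t2_mc = np.mean(PS * h2 / (d1**m * (PR * f2 + s2R)) > g0)
t2_cf = (np.exp(-g0 * d1**m * s2R / (lam_h * PS))
         * lam_h * PS / (lam_h * PS + g0 * PR * d1**m * lam_f))
```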
Fig. 3 plots the throughput efficiency versus the transmitted power for η = 0.6, 0.8 and 1, respectively.
The throughput efficiency is maximal at a transmitted power of approximately P_R = 9 dB, with τ ≈ 0.35, 0.32 and 0.27 at η = 1, 0.8 and 0.6, respectively; this can be called the optimal power allocation. Besides, the throughput efficiency can be enhanced by increasing the energy conversion efficiency.
Fig. 4 examines the impact of the threshold value γ_0 (or the transmission rate) on the throughput of the system. It can be observed from Fig. 4 that a greater throughput is obtained with a small threshold γ_0, e.g. at η = 1, τ ≈ 0.65 at γ_0 = 0 dB and τ ≈ 0.35 at γ_0 = 10 dB. For threshold values γ_0 below 16 dB, the higher the energy conversion efficiency is, the higher the optimal throughput becomes. On the other hand, for values of γ_0 greater than 16 dB, the different values of the energy conversion efficiency hardly affect the throughput of the system.

Figure 4. Throughput efficiency versus γ_0. Other parameters: P_R = 10 dB.

5 CONCLUSION

In this paper, a decode-and-forward wireless cooperative or sensor network with a new power transfer protocol, i.e. the TSAC protocol, has been considered, where the harvested energy of the relay



node from the source can be used effectively to forward the source signal to the destination node. In order to determine the optimal throughput at the destination, an analytical expression for the outage probability is derived. The simulated results in this paper have given practical insights into the impact of various system parameters, i.e. P_R, α, γ_0, on the performance of wireless energy harvesting and information processing using DF relay nodes.

REFERENCES

Chin Keong, H. & Rui, Z. 2012. Optimal energy allocation for wireless communications with energy harvesting constraints. Signal Processing, IEEE Transactions on, 60, 4808-4818.
Chin Keong, H. & Rui, Z. Optimal energy allocation for wireless communications powered by energy harvesters. Information Theory Proceedings (ISIT), 2010 IEEE International Symposium on, 13-18 June 2010. 2368-2372.
Choi, J.I., Jain, M., Srinivasan, K., Levis, P. & Katti, S. Achieving single channel, full duplex wireless communication. Proceedings of the sixteenth annual international conference on Mobile computing and networking, 2010. ACM, 1-12.
Chuan, H., Rui, Z. & Shuguang, C. 2013. Throughput Maximization for the Gaussian Relay Channel with Energy Harvesting Constraints. Selected Areas in Communications, IEEE Journal on, 31, 1469-1479.
Duarte, M., Dick, C. & Sabharwal, A. 2012. Experiment-Driven Characterization of Full-Duplex Wireless Systems. Wireless Communications, IEEE Transactions on, 11, 4296-4307.
Gang, Y., Chin Keong, H. & Yong Liang, G. 2014. Dynamic Resource Allocation for Multiple-Antenna Wireless Power Transfer. Signal Processing, IEEE Transactions on, 62, 3565-3577.
Jeffrey, A. & Zwillinger, D. (eds). 2007. Gradshteyn, I.S. & Ryzhik, I.M., Table of Integrals, Series, and Products (Seventh Edition). Boston: Academic Press.
Jing, Y. & Ulukus, S. 2012. Optimal Packet Scheduling in an Energy Harvesting Communication System. Communications, IEEE Transactions on, 60, 220-230.
Jing, Y. & Ulukus, S. Optimal Packet Scheduling in a Multiple Access Channel with Rechargeable Nodes. Communications (ICC), 2011 IEEE International Conference on, 5-9 June 2011. 1-5.
Krikidis, I., Suraweera, H.A., Smith, P.J. & Chau, Y. 2012. Full-Duplex Relay Selection for Amplify-and-Forward Cooperative Networks. Wireless Communications, IEEE Transactions on, 11, 4381-4393.
Love, D.J., Heath, R.W., Lau, V.K.N., Gesbert, D., Rao, B.D. & Andrews, M. 2008. An overview of limited feedback in wireless communication systems. Selected Areas in Communications, IEEE Journal on, 26, 1341-1365.
Nasir, A.A., Xiangyun, Z., Durrani, S. & Kennedy, R.A. 2013. Relaying Protocols for Wireless Energy Harvesting and Information Processing. Wireless Communications, IEEE Transactions on, 12, 3622-3636.
Ng, D.W.K. & Schober, R. Spectral efficient optimization in OFDM systems with wireless information and power transfer. Signal Processing Conference (EUSIPCO), 2013 Proceedings of the 21st European, 9-13 Sept. 2013. 1-5.
Ng, D.W.K., Lo, E.S. & Schober, R. 2013b. Wireless Information and Power Transfer: Energy Efficiency Optimization in OFDMA Systems. Wireless Communications, IEEE Transactions on, 12, 6352-6370.
Ng, D.W.K., Lo, E.S. & Schober, R. Energy-efficient resource allocation in multiuser OFDM systems with wireless information and power transfer. Wireless Communications and Networking Conference (WCNC), 2013 IEEE, 7-10 April 2013, 2013a. 3823-3828.
Ozel, O., Tutuncuoglu, K., Jing, Y., Ulukus, S. & Yener, A. 2011. Transmission with Energy Harvesting Nodes in Fading Wireless Channels: Optimal Policies. Selected Areas in Communications, IEEE Journal on, 29, 1732-1743.
Rui, X., Hou, J. & Zhou, L. 2010. On the performance of full-duplex relaying with relay selection. Electronics Letters, 46, 1674-1676.
Tutuncuoglu, K. & Yener, A. 2012. Optimum Transmission Policies for Battery Limited Energy Harvesting Nodes. Wireless Communications, IEEE Transactions on, 11, 1180-1189.
Varshney, L.R. Transporting information and energy simultaneously. Information Theory, 2008. ISIT 2008. IEEE International Symposium on, 6-11 July 2008. 1612-1616.
Xiaoming, C., Chau, Y. & Zhaoyang, Z. 2014. Wireless Energy and Information Transfer Tradeoff for Limited-Feedback Multiantenna Systems With Energy Beamforming. Vehicular Technology, IEEE Transactions on, 63, 407-412.
Yingbo, H., Ping, L., Yiming, M., Cirik, A.C. & Qian, G. 2012. A Method for Broadband Full-Duplex MIMO Radio. Signal Processing Letters, IEEE, 19, 793-796.
Zhang, R. & Ho, C.K. 2013. MIMO Broadcasting for Simultaneous Wireless Information and Power Transfer. Wireless Communications, IEEE Transactions on, 12, 1989-2001.
Zhou, X., Zhang, R. & Ho, C.K. 2013. Wireless Information and Power Transfer: Architecture Design and Rate-Energy Tradeoff. Communications, IEEE Transactions on, 61, 4754-4767.



Applied Mathematics in Engineering and Reliability – Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

A performance analysis in energy harvesting full-duplex relay

Tam Nguyen Kieu, Tuan Nguyen Hoang, Thu-Quyen T. Nguyen & Hung Ha Duy
Ton Duc Thang University, Ho Chi Minh City, Vietnam

D.-T. Do
Ho Chi Minh City University of Technology and Education, Ho Chi Minh City, Vietnam

M. Vozňák
VŠB-Technical University of Ostrava, Ostrava, Czech Republic

ABSTRACT: In this paper, we compare the impact of some relay parameters on two relaying schemes, Amplify-and-Forward (AF) and Decode-and-Forward (DF), in full-duplex cooperative networks. In particular, closed-form expressions for the outage probability and throughput of the system are derived. Furthermore, we evaluate the dependence of the system performance, in terms of outage probability and throughput, on the noise at the nodes, the transmission distance and the relay transmission power.

1 INTRODUCTION

In recent years, a number of radio system applications that require a long lifetime face a challenging constraint on energy consumption. In order to determine the throughput, a delay-limited transmission mode is usually used to derive analytical expressions for the outage probability and throughput. Simultaneous Wireless Power and Information Transfer (SWPIT) for duplex relaying has been introduced, in which two sources communicate over an energy harvesting relay. By examining the Time Switching Relay (TSR) receiving structure, the TS-based Two-Way Relaying (TS-TWR) protocol is introduced (Ke, Pingyi & Ben Letaief 2015a, b, Krikidis et al. 2014, Nasir 2013).
There has been research which deeply studied the interference from Secondary Users (SUs) to Primary Transmitters (PTs) and from PTs to SUs in cognitive radio networks. The secondary users are able not only to transmit a packet on a licensed channel to a primary user when the selected channel is idle or occupied by the primary user, but also to harvest RF (radio frequency) energy from the primary user's transmissions when the channel is busy (Mousavifar et al. 2014, Sixing et al. 2014, Sixing et al. 2015, Tong et al. 2014).
In another direction, we investigate Decode-and-Forward (DF) and Amplify-and-Forward (AF) relaying systems relying on radio power harvesting. The power constrained relay node first collects power from the Radio-Frequency (RF) signals of the source node (Nasir 2014, Yiyang et al. 2013, Yuanwei et al. 2014).
The remainder of this paper is arranged as follows. Section 2 shows the system model of the EH enabled FD relaying network using the delay-limited mode in TSR. Section 3 presents the outage probability and throughput analysis. Simulation results are introduced in Section 4. Finally, the conclusion is given in Section 5.

2 SYSTEM MODEL

In this section, we describe the Time Switching Based Relaying (TSR) protocol and derive expressions for the outage probability and throughput, which are considered in the delay-limited transmission mode.
As in Figure 1, the suggested model comprises three nodes. The source node is denoted by S, the destination node is denoted by D and the relay node is denoted by R. Each node is equipped with two antennas, one of which is responsible for signal transmission while the other is for signal reception. The cooperative relay is assumed to be an energy constrained device, so it must harvest energy from the source and use that energy to transfer the source information to the destination node (D). Terms g1 and g2 respectively represent the quasi-static block-fading channels from the source to the relay and from the relay to the destination node. In addition, terms l1 and l2 denote the distances from the source to the relay and from the relay to the destination, respectively.
The TSR protocol for the proposed system is illustrated in Figure 2.



Figure 1. System model of one-way full-duplex relaying.

Figure 2. The parameters of the TSR protocol.

The information process is separated into two stages. First, energy is transferred from the source to the relay for a duration αT, (0 ≤ α ≤ 1). The remaining time, (1 − α)T, is employed to convey information, where α is the time switching coefficient and T is the duration of one signal block.
During the energy harvesting phase, the received signal at the relay node can be expressed as

y_R = sqrt(P_S / l_1^m) g_1 x_S + n_R   (1)

where P_S is the source transmission power and m is the path loss exponent.
In this work, we assume a normalized path loss model in order to show the path loss degradation effects on the system performance. For simplicity, n_R and n_d are zero-mean Additive White Gaussian Noise (AWGN) with variance 1.
Regarding the wirelessly received power, the harvested energy at the relay is given by

E_h = η α T P_S |g_1|² / l_1^m   (2)

where η is the energy conversion efficiency.
For the information transfer phase, assume that the source node transmits the signal x_S to R and R forwards the signal x_R to the destination node. Both signals have unit energy and zero mean, i.e., E[|x_i|²] = 1 and E[x_i] = 0, for i ∈ {S, R}. Therefore, the signal received at the relay under the self-interference source is rewritten as

y_R = sqrt(P_S / l_1^m) g_1 x_S + g̃_R x_R + n_R   (3)

where g̃_R is the residual self-interference factor at R.
In this paper, we investigate both AF and DF schemes in the duplex relaying system. For AF, the relay amplifies the signal with an amplification factor

K^{-1} = P_S |g_1|² / l_1^m + P_R |g̃_R|² + I   (4)

With DF, the relay decodes the signal before retransmitting it. So the transmitted signal from the relay can be expressed as follows.

x_R(i) = { sqrt(K P_R) y_R[i − τ]     with AF
         { sqrt(P_R / P_S) x_S[i − τ]  with DF   (5)

where τ accounts for the time delay introduced by relay processing.
Since the harvested power supports the operation in the next transmission stage, P_R is given by

P_R = E_h / ((1 − α)T) = κ η P_S |g_1|² / l_1^m   (6)

where κ is defined as α / (1 − α).
Therefore, the received signal at the destination is given by

y_d(k) = sqrt(1 / l_2^m) g_2 x_R[k] + n_d[k]   (7)

With AF, we have

y_D(k) = sqrt(K P_R P_S / (l_1^m l_2^m)) g_1 g_2 x_S   [signal]
       + sqrt(K P_R / l_2^m) g_2 g̃_R x_R            [RSI]
       + sqrt(K P_R / l_2^m) g_2 n_R + n_D           [noise]   (8)

With DF we obtain

y_D(t) = sqrt(P_R / l_2^m) g_2 x_S(t) + n_d(t)   (9)



In the above results, the instantaneous received SINR at D through R is determined as

γ_{s→d} = E{ |signal|² } / ( E{ |RSI|² } + E{ |noise|² } )   (10)

We have

γ_{s→d} = [ P_S |g_1|² P_R |g_2|² / ( l_1^m l_2^m P_R |g̃_r|² ) ] / [ I P_S |g_1|² / ( l_1^m P_R |g̃_r|² ) + I P_R |g_2|² / l_2^m + I ]   (11)

We assume that the channel gains |g_1|², |g_2|² are independent and identically distributed (i.i.d.) exponential random variables.

3 OUTAGE PROBABILITY AND THROUGHPUT ANALYSIS

In this section, we compare the outage probability and throughput of full-duplex one-way relaying with energy harvesting and information transfer in two relaying modes: AF and DF. Based on these analytical expressions, we can clearly see some of the factors influencing system performance and learn how to deploy the system in different situations.

3.1 Outage probability analysis

The outage probability of the FD relaying network in the delay-limited model is calculated as

P_out = P(γ < H)   (12)

where R is the target rate and H = 2^R − 1.
Proposition 1: the outage probability of the energy-harvesting-enabled two-way full-duplex relay with the AF protocol is derived as

P_out^AF = Pr{ γ_{s→d} < H }
         = 1 − (1/λ_r) ∫_0^{1/(κH)} 2 sqrt( Θ(y)/λ_{sd} ) K_1( 2 sqrt( Θ(y)/λ_{sd} ) ) e^{−y/λ_r} dy,
  with Θ(y) = I ( l_1^m l_2^m H + I H y ) / ( κ ( P_S² − P_S κ H y ) )   (13)

where λ_{sd} and λ_r are the mean values of the exponential random variables |g_1|²|g_2|² and |g̃_r|², respectively, and K_1(x) is the Bessel function defined as (8.423.1) in (David H. A. 1970).
Proof: We denote x ≜ |g_1|²|g_2|² and y ≜ |g̃_r|². If x and y are independent, then we have

P_out^AF = { Pr{ x < Θ(y) },   y < 1/(κH)
           { 1,                y ≥ 1/(κH)   (14)

Interestingly, the cumulative distribution function of x is calculated by

F_X(a) = P(X ≤ a) = 1 − 2 sqrt(a/λ_{sd}) K_1( 2 sqrt(a/λ_{sd}) )   (15)

and y can be modeled with the probability distribution function f_Y(b) = (1/λ_r) e^{−b/λ_r}. Then Proposition 1 is achieved after some simple manipulations.
Proposition 2: the outage probability of the energy-harvesting-enabled two-way full-duplex relaying with the DF protocol is derived as

P_out^DF = 1 − (1/λ_r) ∫_0^{1/(κH)} e^{−y/λ_r} · 2 sqrt( Θ'(y)/λ_{sd} ) K_1( 2 sqrt( Θ'(y)/λ_{sd} ) ) dy,
  with Θ'(y) = I ( l_1^m l_2^m H + I y ) / ( κ ( P_S² − P_S κ H y ) )   (16)



Proof: Based on

γ_DF = min( 1/(κ y), κ P_S x / (l_1^m l_2^m I) )   (17)

and the above results, we obtain the desired result after some algebra.
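Both proofs hinge on the c.d.f. (15) of the product x = |g_1|²|g_2|² of two exponential channel gains. That Bessel-function form can be checked by simulation; here λ_sd is taken as the product of the two link means, an assumption consistent with how (15) is used above:

```python
import numpy as np
from scipy.special import k1

rng = np.random.default_rng(7)
N = 400_000

# x = |g1|^2 * |g2|^2 with independent exponential factors.
m1, m2 = 1.0, 1.0
lam_sd = m1 * m2                    # assumed: product of the two means
x = rng.exponential(m1, N) * rng.exponential(m2, N)

# Empirical CDF at a test point vs. the closed form (15).
a = 0.5
F_mc = np.mean(x <= a)
F_cf = 1 - 2 * np.sqrt(a / lam_sd) * k1(2 * np.sqrt(a / lam_sd))
```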

3.2 Optimal throughput analysis

In Propositions 1 and 2, the outage probability of the considered model, when the relay harvests energy from the source signal and employs that power to forward the source signal to the destination, is a function of the distance l_1 and the noise factor I, and it increases when l_1 increases from 0 to 2 and when I increases from 0.1 to 1.

Figure 4. Throughput of AF and DF model versus distance.

In the delay-limited transmission mode, the transmitter communicates at a fixed transmission rate R (in bits/sec/Hz) and (1 − α)T is the effective information interval. Hence, the throughput of the system can be written as

τ = (1 − P_out) R (1 − α)T / T = (1 − α)(1 − P_out) R   (18)

Unfortunately, it is highly complex to obtain the optimal throughput mathematically. However, we can obtain the optimal value by a numerical method, as given in the next part.
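A minimal numerical search of the kind referred to here can use a Monte Carlo estimate of the DF outage via the SINR model (17) inside the throughput (18). All constants below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 100_000

# Illustrative constants (not the paper's values).
PS, I, R = 10.0, 1.0, 1.0
l1 = l2 = 1.0
m = 3
H = 2**R - 1

x = rng.exponential(1.0, N) * rng.exponential(1.0, N)   # |g1|^2 |g2|^2
y = rng.exponential(0.1, N)                             # |g_r|^2

def tau(alpha):
    """Throughput (18) with Monte Carlo outage from the DF SINR (17)."""
    kappa = alpha / (1 - alpha)
    gamma = np.minimum(1.0 / (kappa * y),
                       kappa * PS * x / (l1**m * l2**m * I))
    p_out = np.mean(gamma < H)
    return (1 - alpha) * R * (1 - p_out)

alphas = np.linspace(0.05, 0.95, 91)
taus = np.array([tau(a) for a in alphas])
alpha_opt = alphas[np.argmax(taus)]
```

The trade-off is visible directly: a small α starves the relay of power (the second term of (17) dominates the outage), while a large α leaves too little time for information, so the maximum lies strictly inside (0, 1).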

4 SIMULATION RESULTS

In this section, we employ the results of the derived analysis to offer insight into the variety of design options. The energy harvesting efficiency is set to η = 1 and the path loss exponent is set to m = 3. For simplicity, we set λ_{sd} = 1, λ_r = 0.1 and l_1 = l_2 = 1 (except in Figures 3 and 4), as well as I = 1 (except in Figures 5 and 6).

Figure 5. Outage probability of AF and DF model versus I.

Figure 3. Outage probability of AF and DF model versus distance.

Figure 6. Throughput of AF and DF model versus I.



It can be seen from Figure 3 and Figure 4 that the outage probability of the DF model is better than that of the AF model while its throughput is worse. At close and intermediate distances, the outage probability gradually increases while the throughput behaves contrariwise. The outage is maximal at some specific distance from the S node.
The same happens in Figure 5 and Figure 6: the outage probability of the DF model is still better than that of the AF model, but its throughput is worse than AF. This is due to the noise at the relay node, which has an impact on the system performance.

5 CONCLUSION

In this paper, mathematical and numerical analysis has shown practical insight into the full-duplex relaying system in terms of the effect of different system parameters on the performance of the wireless energy collecting and information processing system, which employs AF and DF relay modes. The throughput results in this paper account for the upper bound on the realistically attainable throughput. Moreover, we also find that the AF model outperforms the DF model in the delay-limited scheme of the full-duplex relaying network.

REFERENCES

David H.A. 1970. Order Statistics. New York, NY, USA: Wiley.
Ke, X., Pingyi F. & Ben Letaief K. 2015a. Time-switching based SWPIT for network-coded two-way relay transmission with data rate fairness. Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on, pp. 5535-5539.
Ke, X., Pingyi F. & Ben Letaief K. 2015b. Wireless Information and Energy Transfer for Two-Hop Non-Regenerative MIMO-OFDM Relay Networks. Selected Areas in Communications, IEEE Journal on, vol. 33, pp. 1595-1611.
Krikidis I., Timotheou S., Nikolaou S., Gan Z., Ng D.W.K. & Schober R. 2014. Simultaneous wireless information and power transfer in modern communication systems. Communications Magazine, IEEE, vol. 52, pp. 104-110.
Mousavifar S.A., Yuanwei L., Leung C., Elkashlan M. & Duong T.Q. 2014. Wireless Energy Harvesting and Spectrum Sharing in Cognitive Radio. Vehicular Technology Conference (VTC Fall), 2014 IEEE 80th, pp. 1-5.
Nasir A.A., Xiangyun Z., Durrani S. & Kennedy R.A. 2013. Relaying Protocols for Wireless Energy Harvesting and Information Processing. Wireless Communications, IEEE Transactions on, vol. 12, pp. 3622-3636.
Nasir A.A., Xiangyun Z., Durrani S. & Kennedy R.A. 2014. Throughput and ergodic capacity of wireless energy harvesting based DF relaying network. Communications (ICC), 2014 IEEE International Conference on, pp. 4066-4071.
Sixing Y., Erqing Z., Zhaowei Q., Liang Y. & Shufang L. 2014. Optimal Cooperation Strategy in Cognitive Radio Systems with Energy Harvesting. Wireless Communications, IEEE Transactions on, vol. 13, pp. 4693-4707.
Sixing Y., Zhaowei Q. & Shufang L. 2015. Achievable Throughput Optimization in Energy Harvesting Cognitive Radio Systems. Selected Areas in Communications, IEEE Journal on, vol. 33, pp. 407-422.
Tong C., Zhiguo D. & Guiyun T. 2014. Wireless information and power transfer using energy harvesting relay with outdated CSI. High Mobility Wireless Communications (HMWC), 2014 International Workshop on, pp. 1-6.
Yiyang N., Shi J., Ran T., Kai-Kit W., Hongbo Z. & Shixiang S. 2013. Outage analysis for device-to-device communication assisted by two-way decode-and-forward relaying. Wireless Communications & Signal Processing (WCSP), 2013 International Conference on, pp. 1-6.
Yuanwei L., Lifeng W., Elkashlan M., Duong T.Q. & Nallanathan A. 2014. Two-way relaying networks with wireless power transfer: Policies design and throughput analysis. Global Communications Conference (GLOBECOM), 2014 IEEE, pp. 4030-4035.




A stochastic model for performance analysis of powered wireless networks

Nhi Dang Ut & Dinh-Thuan Do


Department of Computer and Communications Engineering, HCMC University of Technology
and Education, Ho Chi Minh City, Vietnam

ABSTRACT: A wireless network using a relay node to harvest energy and process information simultaneously is considered in this paper. The relay node uses the energy harvested from the source signal, then amplifies and forwards that signal to the destination node. Based on two receiver architectures, namely time switching and power switching, this paper introduces a stochastic model for the analysis of the Time Switching based Relaying protocol (TSR) and the Time Power Switching based Receiver (TPSR), respectively. To determine the throughput at the destination node, the analytical expression for the outage probability is derived for the delay-limited transmission mode. The numerical results confirm the effect of several system parameters on the optimal throughput at the destination node, such as the time fraction for energy harvesting, the power splitting ratio, the source transmission rate, the noise power, and the energy harvesting efficiency. More particularly, we compare the throughput at the destination node between the TSR protocol and the ideal receiver, and between the TSR protocol and the TPSR receiver, for the delay-limited transmission mode.

1 INTRODUCTION

In recent years, energy harvesting through Radio Frequency (RF) solutions has received significant research attention as a way to extend the lifetime of a wireless network. In contrast with traditional energy supplies such as batteries or internal charging sources, energy harvesting enables wireless networks to operate using energy harvested from external sources such as RF signals (Varshney 2008), due to the fact that RF signals can carry energy and information at the same time.
The concept of harvesting energy and processing information with an ideal receiver was first introduced by Varshney, who studied the fundamental performance tradeoff for simultaneous information and power transfer. However, this approach has been shown to be impractical, since practical receivers still have limitations that prevent them from decoding the carried information directly (Zhou, Zhang and Ho 2012). Other works have used realizable receivers with separate circuits for energy harvesting and information processing. The work by Varshney has been extended to frequency-selective channels with additive white Gaussian noise (Grover & Sahai 2010). Then a hybrid network was studied using a stochastic-geometry model where Base Stations (BSs) and PBs form independent homogeneous Poisson Point Processes (PPPs) and mobiles are uniformly distributed in the corresponding Voronoi cells with respect to the BSs (Kaibin Huang & Vincent Lau 2014). A MIMO wireless broadcast system consisting of three nodes, where one receiver harvests energy and another receiver decodes information separately from the signals, is considered by Rui Zhang and Chin Keong Ho (2011). This work was then extended by considering imperfect channel state information at the transmitter (Xiang & Tao 2012).
In this paper, we introduce a wireless cooperative network where a relay node harvests energy from the source signal and then amplifies and forwards (AF) that signal to the destination node. Based on the time switching and power switching architectures (Zhou, Zhang & Ho 2012) and the AF relaying protocol (Laneman, Tse & Wornell 2004), we introduce the Time-Switching based Relaying (TSR) protocol for energy harvesting and information processing at the relay node in the delay-limited transmission mode. We also compare the optimal throughput observed at the destination node between TSR and the time power switching based relaying receiver (Dinh-Thuan Do 2015) to have a deeper look inside the behavior of the system.

2 SYSTEM MODEL

Figure 1 shows the wireless system under study, where the information from the source (denoted by S) is transmitted to the destination node (denoted



by D), through a relay node (denoted by R). The distances from the source to the relay node and from the relay node to the destination node are denoted by l1 and l2, respectively. The channel gains from the source to the relay node and from the relay node to the destination node are denoted by h and g, respectively.

Figure 1. System model for the energy constrained relay wireless network.

Based on the system model and the time switching receiver architecture, this paper introduces the time switching-based relaying and the time power switching based relaying receiver for energy harvesting and information processing from source to destination at the relay node in the delay-limited transmission mode. The delay-limited transmission mode means that the destination node has to decode the received information block-by-block; as a result, the code length cannot be larger than the transmission block time (Liu, Zhang & Chua 2012). The figure of merit for the system under study is the throughput at the destination node, which is defined as the number of bits successfully decoded per unit time per unit bandwidth.

3 TIME SWITCHING-BASED RELAYING (TSR) PROTOCOL

The time switching-based protocol (TSR) for energy harvesting and information processing at the relay node can be seen in Figure 2.
In Figure 2, T is the block time to transmit information from S to D, and α is the fraction of T in which the relay node harvests energy from the source signal, where 0 ≤ α ≤ 1. The rest of the time, (1 − α)T, is used to transmit information. The first remaining half, (1 − α)T/2, is used to transmit information from S to R, and the other remaining half, (1 − α)T/2, is used to transmit information from R to D. The chosen value of α affects the throughput at the destination node, as illustrated in the following sections.

3.1 Energy harvesting and information processing at the relay node

Figure 2. TSR protocol: (a) TSR model and (b) block diagram.

Figure 2b shows the block diagram for the TSR protocol. The received signal at the relay node, y_r(t), corrupted by the noise signal n_{a,r}(t) generated by the antenna, is first sent to the energy harvesting receiver. Then, in the remaining time (1 − α)T/2, it is sent to the information receiver. The energy harvesting receiver rectifies the received signal to get the direct current and uses that harvested energy for information processing.
The received signal, y_r(t), at the relay node is given by:

y_r(t) = sqrt(P_s / l_1^m) h s(t) + n_{a,r}(t)   (1)

where h is the channel gain from the source to the relay node, l_1 is the distance from the source node to the relay node, P_s is the power transmitted from the source, m is the path loss exponent, and s(t) is the normalized information signal from the source node.
The energy harvested at the relay node, denoted by E_r, is defined by:

E_r = η α T P_s |h|² / l_1^m   (2)

where η is the energy conversion efficiency and 0 ≤ η ≤ 1.
After harvesting energy from y_r(t), the information receiver converts y_r(t) to a baseband signal and processes it; this introduces an additive noise due to the conversion from the RF signal to the baseband signal, denoted as n_{c,r}(t). The sampled baseband signal at the relay node after conversion is given by:

y_r(k) = sqrt(P_s / l_1^m) h s(k) + n_{a,r}(k) + n_{c,r}(k)   (3)

where k is the index value, s(k) is the sampled normalized signal from the source, n_{a,r}(k) is the AWGN noise introduced by the receiving antenna at the



relay node, and n_{c,r}(k) is the AWGN noise introduced by the conversion process from RF to baseband signal. The relay node then amplifies this sampled signal and transmits it. The transmitted signal from the relay node, denoted as x_r(k), can be expressed as:

x_r(k) = sqrt(P_r) y_r(k) / sqrt( P_s |h|² / l_1^m + σ²_{a,r} + σ²_{c,r} )   (4)

where sqrt(P_s |h|² / l_1^m + σ²_{a,r} + σ²_{c,r}) is the power constraint factor, σ²_{a,r} and σ²_{c,r} are the variances of n_{a,r}(k) and n_{c,r}(k), respectively, and P_r is the power transmitted from the relay node.
The sampled signal received at the destination node, denoted as y_d(k), is given by:

y_d(k) = sqrt(1 / l_2^m) g x_r(k) + n_{a,d}(k) + n_{c,d}(k)   (5)

where n_{a,d}(k) and n_{c,d}(k) are the AWGN noises introduced by the antenna and by the conversion at the destination node, respectively, and g is the channel gain from R to D.
Substituting x_r(k) from (4) into (5), we have:

y_d(k) = g sqrt(P_r l_1^m) y_r(k) / sqrt( l_2^m ( P_s |h|² + l_1^m (σ²_{a,r} + σ²_{c,r}) ) ) + n_{a,d}(k) + n_{c,d}(k)   (6)

And by substituting y_r(k) from (3) into (6), we have:

y_d(k) = sqrt(P_r P_s) h g s(k) / sqrt( l_2^m ( P_s |h|² + l_1^m σ²_r ) )
       + sqrt(P_r l_1^m) g r(k) / sqrt( l_2^m ( P_s |h|² + l_1^m σ²_r ) ) + n_d(k)   (7)

where r(k) is defined as r(k) ≜ n_{a,r}(k) + n_{c,r}(k), and n_d(k) is defined as n_d(k) ≜ n_{a,d}(k) + n_{c,d}(k); these are the overall AWGN noises at the relay and destination node, respectively, and σ²_r ≜ σ²_{a,r} + σ²_{c,r} is the overall noise variance at the relay node.
From (2), we can calculate the power transmitted from the relay node as:

P_r = E_r / ((1 − α)T/2) = 2 η α P_s |h|² / ( (1 − α) l_1^m )   (8)

Finally, substituting P_r from (8) into (7):

y_d(k) = sqrt( 2 η α / ((1 − α) l_1^m l_2^m) ) · P_s |h| h g s(k) / sqrt( P_s |h|² + l_1^m σ²_r )   [signal part]
       + sqrt( 2 η α P_s / ((1 − α) l_2^m) ) · |h| g r(k) / sqrt( P_s |h|² + l_1^m σ²_r ) + n_d(k)   [overall noise]   (9)

The received signal at the destination node, y_d(k), is thus expressed by (9) in terms of P_s, η, α, l_1, l_2, h and g.

3.2 Throughput analysis

The SNR at the destination node, denoted by γ_D, can be calculated using (9) by γ_D = E{ |signal part in (9)|² } / E{ |overall noise in (9)|² }, and is expressed by:

γ_D = 2 η α P_s² |h|⁴ |g|² / ( 2 η α P_s |h|² |g|² l_1^m σ²_r + P_s |h|² l_1^m l_2^m σ²_d (1 − α) + l_1^{2m} l_2^m σ²_r σ²_d (1 − α) )   (10)

where σ²_d ≜ σ²_{d,a} + σ²_{d,c}.
The throughput at the destination node, denoted by τ, is determined by evaluating the outage probability, denoted as P_out, given a constant transmission rate from the source node of R bits/sec/Hz, R = log₂(1 + γ₀), where γ₀ is the threshold value of the SNR for data detection at the destination node. The outage probability at the destination node for the TSR protocol is given by:

P_out = p( γ_D < γ₀ )   (11)

where γ₀ = 2^R − 1.
The outage probability at the destination node can be expressed analytically by:

P_out = 1 − (1/λ_h) ∫_{z = d/c}^{∞} e^{ −z/λ_h − (a z + b) / ( (c z² − d z) λ_g ) } dz   (12a)
      ≈ 1 − e^{ −d/(c λ_h) } u K₁(u)   (12b)

where:



a ≜ P_s l_1^m l_2^m σ²_d γ₀ (1 − α)   (13a)

b ≜ l_1^{2m} l_2^m σ²_r σ²_d γ₀ (1 − α)   (13b)

c ≜ 2 η α P_s²   (13c)

d ≜ 2 η α P_s l_1^m σ²_r γ₀   (13d)

u = sqrt( 4a / (c λ_h λ_g) )   (13e)

Here λ_h is the mean value of |h|², λ_g is the mean value of |g|², and K₁(·) is the first-order modified Bessel function (Gradshteyn & Ryzhik 1980).
Finally, the throughput at the destination node is given by:

τ = (1 − P_out) R · ((1 − α)T/2) / T = (1 − P_out) R (1 − α) / 2   (14)

This is based on the fact that the transmission rate from the source is R bits/sec/Hz and (1 − α)T/2 is the effective time used to transmit information from the source node to the destination node. The throughput depends on P_s, η, α, l_1, l_2, R, σ²_r and σ²_d.
The following is the demonstration (Ali Nasir, Xiangyun Zhou, Salman Durrani & Rodney Kennedy 2013) for equations (12) and (13). Substituting the value of the SNR in (10) into (11), we get:

P_out = p( 2 η α P_s² |h|⁴ |g|² / ( 2 η α P_s |h|² |g|² l_1^m σ²_r + P_s |h|² l_1^m l_2^m σ²_d (1 − α) + l_1^{2m} l_2^m σ²_r σ²_d (1 − α) ) < γ₀ )
      = p( |g|² ( 2 η α P_s² |h|⁴ − 2 η α P_s l_1^m σ²_r γ₀ |h|² ) < P_s l_1^m l_2^m σ²_d γ₀ (1 − α) |h|² + l_1^{2m} l_2^m σ²_r σ²_d γ₀ (1 − α) )
      = p( |g|² < ( a |h|² + b ) / ( c |h|⁴ − d |h|² ) )   (A1)

where a, b, c and d are as defined in (13a)-(13d).
The denominator in (A1), c|h|⁴ − d|h|², can be either positive or negative; thus P_out is given by:

P_out = { p( |g|² < (a|h|² + b)/(c|h|⁴ − d|h|²) ),       |h|² ≥ d/c
        { p( |g|² > (a|h|² + b)/(c|h|⁴ − d|h|²) ) = 1,   |h|² < d/c   (A2)

The second leg in (A2) is due to the fact that if |h|² < d/c, then c|h|⁴ − d|h|² is a negative number, and the probability of |g|² being greater than a negative number is always 1. Because of (A2), P_out is given by:

P_out = ∫_{z=0}^{d/c} f_{|h|²}(z) p( |g|² > (a z + b)/(c z² − d z) ) dz + ∫_{z=d/c}^{∞} f_{|h|²}(z) p( |g|² < (a z + b)/(c z² − d z) ) dz
      = ∫_{z=0}^{d/c} (1/λ_h) e^{−z/λ_h} dz + ∫_{z=d/c}^{∞} (1/λ_h) e^{−z/λ_h} ( 1 − e^{ −(a z + b)/((c z² − d z) λ_g) } ) dz   (A3)

where z is the integration variable, f_{|h|²}(z) = (1/λ_h) e^{−z/λ_h} is the Probability Density Function (PDF) of the exponential random variable |h|², F_{|g|²}(z) = p(|g|² ≤ z) = 1 − e^{−z/λ_g} is the Cumulative Distribution Function (CDF) of the exponential random variable |g|², and λ_g is the mean of the exponential random variable |g|². By substituting f_{|h|²}(z) = (1/λ_h) e^{−z/λ_h} in (A3), P_out is given by:



\[ p_{out} = 1 - \frac{1}{\lambda_h}\int_{z=d/c}^{\infty} e^{-\frac{az+b}{(cz^2-dz)\lambda_g} - \frac{z}{\lambda_h}}\, dz \]  (A4)

The analytical expression of p_out for the TSR protocol presented in (12) is given by (A4). The integral in (A4) cannot be simplified any further in closed form. However, we can apply a high-SNR approximation to simplify p_out further, because at high SNR the third factor in the denominator of (10), l₁²l₂σ_r²σ_d²(1−α), is very small compared to the other factors, 2ηαP_s|h|²|g|²l₁σ_r² and P_s|h|²l₁l₂σ_d²(1−α). So we can re-write:

\[ \gamma_D \approx \frac{2\eta\alpha P_s^2 |h|^4 |g|^2}{2\eta\alpha P_s |h|^2 |g|^2 l_1\sigma_r^2 + P_s |h|^2 l_1 l_2 \sigma_d^2 (1-\alpha)} \]  (A5)

In other words, at high SNR b can be replaced by 0. Due to this, p_out in (A4) can be re-written as:

\[ p_{out} \approx 1 - \frac{1}{\lambda_h}\int_{z=d/c}^{\infty} e^{-\frac{az}{(cz^2-dz)\lambda_g} - \frac{z}{\lambda_h}}\, dz \]  (A6)

Let us define x ≜ cz − d. The approximated outage probability at high SNR is:

\[ p_{out} \approx 1 - \frac{e^{-\frac{d}{c\lambda_h}}}{c\lambda_h}\int_{x=0}^{\infty} e^{-\frac{a}{x\lambda_g} - \frac{x}{c\lambda_h}}\, dx = 1 - e^{-\frac{d}{c\lambda_h}}\, u\, K_1(u) \]  (A7)

where u = √(4a/(cλ_hλ_g)), K₁(·) is the first-order modified Bessel function of the second kind, and the last equality is obtained by using the formula (Gradshteyn & Ryzhik 1980):

\[ \int_0^{\infty} e^{-\frac{\beta}{4x} - \gamma x}\, dx = \sqrt{\frac{\beta}{\gamma}}\, K_1\!\left(\sqrt{\beta\gamma}\right) \]

3.3 Time power-switching-based relaying receiver (TPSR)

Figure 3. Block diagram for the TPSR receiver.

Figure 3 illustrates the block diagram for the Time Power Switching-based Relaying (TPSR) receiver, in which T is the block time for information transmission from the source node to the destination node, and α, with 0 < α < 1, is the fraction of the block time: αT is used for energy harvesting and the remaining time is used to transmit the signal to the destination node. In terms of power splitting, P is the power of the received signal and ρ, with 0 < ρ < 1, denotes the fraction of the power split off for harvested energy. The received power is divided into two parts, ρP and (1−ρ)P, which are used for energy harvesting and for signal transmission from source to destination node, respectively.

Based on the previous analysis for the TSR protocol, the harvested energy at the TPSR receiver is given by:

\[ E_r = \frac{\eta\rho P_s |h|^2 \alpha T}{l_1} \]  (15)

The transmitted power from the relay node is:

\[ P_r = \frac{E_r}{(1-\alpha)T} = \frac{\eta\rho\alpha P_s |h|^2}{l_1(1-\alpha)} \]  (16)

The received signal at the destination node after sampling is:

\[ y_d(k) = \underbrace{\frac{\sqrt{(1-\rho)P_r P_s}\, h\, g\, s(k)}{\sqrt{l_2}\sqrt{(1-\rho)P_s|h|^2 + l_1\sigma_r^2}}}_{\text{signal part}} + \underbrace{\frac{\sqrt{(1-\rho)P_r l_1}\, g\, n_r(k)}{\sqrt{l_2}\sqrt{(1-\rho)P_s|h|^2 + l_1\sigma_r^2}} + n_d(k)}_{\text{overall noise}} \]  (17)

Next, we can calculate the SNR at the destination node, γ_D = E{|signal part|²}/E{|overall noise|²}:

\[ \gamma_D = \frac{\eta\rho\alpha P_s^2 |h|^4 |g|^2}{\eta\rho\alpha P_s |h|^2 |g|^2 l_1\sigma_r^2 + P_s |h|^2 l_1 l_2 \sigma_d^2(1-\alpha) + \dfrac{l_1^2 l_2 \sigma_d^2 \sigma_r^2 (1-\alpha)}{1-\rho}} \]  (18)

The throughput at the destination node for the TPSR receiver with delay-limited transmission mode can be calculated based on (12) as:



\[ \tau = \frac{(1-p_{out})(1-\alpha)T}{T}\, R = (1-p_{out})(1-\alpha)R \]  (19a)

where:

\[ a \triangleq P_s l_1 l_2 \sigma_d^2 \gamma_0 (1-\alpha) \]  (19b)

\[ b \triangleq \frac{l_1^2 l_2 \sigma_r^2 \sigma_d^2 \gamma_0 (1-\alpha)}{1-\rho} \]  (19c)

\[ c \triangleq \eta\rho\alpha P_s^2 \]  (19d)

\[ d \triangleq \eta\rho\alpha P_s l_1 \sigma_r^2 \gamma_0 \]  (19e)

\[ u = \sqrt{\frac{4a}{c\,\lambda_h \lambda_g}} \]  (19f)

3.4 Ideal receiver

In this section, we introduce an ideal relay receiver, which harvests energy and processes information from the same received signal (Varshney 2008). In the first half of the block time, T/2, it harvests energy from and processes the information in the signal received from the source node, and in the remaining half it transmits the source signal to the destination node.

The energy harvested in T/2 is:

\[ E_r^i = \frac{\eta P_s^i |h|^2 T}{2 l_1} \]  (20)

The power transmitted from the relay node, using the harvested energy E_r^i, is:

\[ P_r^i = \frac{E_r^i}{T/2} = \frac{\eta P_s^i |h|^2}{l_1} \]  (21)

At the destination node, the received signal y_d^i(k) is expressed by:

\[ y_d^i(k) = \underbrace{\frac{\sqrt{\eta}\, P_s^i\, |h|\, h\, g\, s(k)}{\sqrt{l_1 l_2}\sqrt{P_s^i|h|^2 + l_1\sigma_r^2}}}_{\text{signal part}} + \underbrace{\frac{\sqrt{\eta P_s^i}\, |h|\, g\, n_r(k)}{\sqrt{l_2}\sqrt{P_s^i|h|^2 + l_1\sigma_r^2}} + n_d^i(k)}_{\text{overall noise}} \]  (22)

The SNR at the destination node, γ_D^i, can be calculated as γ_D^i = E{|signal part|²}/E{|overall noise|²}, and is given by:

\[ \gamma_D^i = \frac{\eta (P_s^i)^2 |h|^4 |g|^2}{\eta P_s^i |h|^2 |g|^2 l_1\sigma_r^2 + P_s^i |h|^2 l_1 l_2 \sigma_d^2 + l_1^2 l_2 \sigma_r^2 \sigma_d^2} \]  (23)

The throughput at the destination node for the ideal receiver, with the SNR given in (23), for the delay-limited transmission mode is calculated by:

\[ \tau_i = \frac{(1-p_{out}^i)\, R}{2} \]  (24)

This is due to the fact that the effective time for communication between the source and destination node is T/2.

The outage probability p_out^i is calculated based on (12), where:

\[ a \triangleq P_s^i l_1 l_2 \sigma_d^2 \gamma_0 \]  (25)

\[ b \triangleq l_1^2 l_2 \sigma_r^2 \sigma_d^2 \gamma_0 \]  (26)

\[ c \triangleq \eta (P_s^i)^2 \]  (27)

\[ d \triangleq \eta P_s^i l_1 \sigma_r^2 \gamma_0 \]  (28)

4 NUMERICAL RESULTS

In this section, numerical results are provided to illustrate the TSR protocol and the TPSR receiver with delay-limited transmission mode. The distances from the source node to the relay node and from the relay node to the destination node are denoted by l₁ and l₂, respectively; α is the fraction of the block time T, and η is the energy harvesting efficiency. For simplicity, we choose a default source transmission rate R = 3, energy harvesting efficiency η = 1, source transmit power P_s = 1, and pathloss exponent λ = 2.7 (corresponding to an urban cellular network). We also assume that σ_a² ≜ σ_{a,r}² = σ_{a,d}², σ_c² ≜ σ_{c,r}² = σ_{c,d}², and that the mean values λ_h and λ_g of the random variables |h|² and |g|² are set equal to 1.

Figure 4 shows the optimal throughput at the destination node for the TSR protocol with delay-limited transmission mode for different values of α. As we can see, the throughput increases when α increases,



Figure 4. Throughput at the destination node for the TSR protocol, with σ_a² = σ_c² = 0.01, P_s = 1, η = 1, and l₁ = l₂ = 1.

Figure 5. Optimal throughput for the TSR protocol and the ideal receiver for different values of the antenna noise variance, with σ_c² = 0.01, P_s = 1, η = 1, and l₁ = l₂ = 1.

but when it reaches its maximum value, at approximately α = 0.28, it starts to decrease. This is because when α exceeds its optimal value, too much of the block time is spent harvesting energy from the source signal and too little remains for information transmission, which leads to a smaller throughput observed at the destination node. In conclusion, the greater the value of α beyond its optimum, the more time is used for energy harvesting and the less time for transmitting the signal to the destination node, and the result is a smaller throughput observed at the destination node.

Figure 5 shows the optimal throughput for the TSR protocol and the ideal receiver in comparison, with delay-limited transmission mode, for different values of the antenna noise variance σ_a²; the conversion noise variance is set to σ_c² = 0.01. The ideal receiver harvests energy from and processes information in the same signal, so its optimal throughput is much better than that of the TSR protocol. As can be observed, the optimal throughput of both the TSR protocol and the ideal receiver increases when the antenna noise variance decreases. This is simply understood: the less noise in the signal, the better the quality of the signal received for information processing.

Figure 6. Optimal throughput for the TSR protocol and the TPSR receiver with delay-limited transmission mode for different values of R.

Figure 6 shows the optimal throughput for the TSR protocol and the TPSR receiver with delay-limited transmission mode for different values of the source transmission rate, R bit/s/Hz. It can be seen that the throughput increases as R increases, but it starts to decrease as R grows beyond 3. This is because the throughput in (14) depends on R: when R becomes larger, the receiver at the destination node fails to decode the large amount of data arriving within the block time T. This increases the outage probability and hence decreases the throughput at the destination node. We can also observe that when R is low the optimal throughput of the TPSR receiver is better than that of the TSR protocol, but there is not much difference between them when R is large.

5 CONCLUSIONS

In this paper, we consider a wireless network where a relay node harvests energy from the source signal and uses that harvested energy to forward the



source signal to the destination node. The throughput of the delay-limited transmission mode for the TSR protocol and the TPSR receiver has been discussed. The throughput at the destination node is determined by analytical expressions for the outage probability in the delay-limited transmission mode. The analytical results for the optimal throughput at the destination node for the TSR protocol and the TPSR receiver are provided to give a deeper look into the system and to show the effect of the system parameters on the optimal throughput value at the destination node.

REFERENCES

Chalise, B.K., Zhang, Y.D., & Amin, M.G., Energy harvesting in an OSTBC based amplify-and-forward MIMO relay system, in Proc. 2012 IEEE ICASSP.
Do, Dinh-Thuan, Time power switching based relaying protocol in energy harvesting mobile node: optimal throughput analysis, Mobile Information Systems Journal, Article ID 769286, 2015.
Fouladgar, A.M., & Simeone, O., On the transfer of information and energy in multi-user systems, IEEE Commun. Lett., 2012.
Grover, P., & Sahai, A., Shannon meets Tesla: wireless information and power transfer, in Proc. 2010 IEEE ISIT.
Ho, C.K., & Zhang, R., Optimal energy allocation for wireless communications with energy harvesting constraints, IEEE Trans. Signal Process., vol. 60, no. 9, pp. 4808–4818, Sept. 2012.
Huang, K., & Lau, V.K.N., Enabling wireless power transfer in cellular networks: architecture, modeling and deployment, IEEE Trans. Wireless Commun., vol. 13, 2014.
Laneman, J.N., Tse, D.N.C., & Wornell, G.W., Cooperative diversity in wireless networks: efficient protocols and outage behavior, IEEE Trans. Inf. Theory, vol. 50, no. 12, pp. 3062–3080, Dec. 2004.
Liu, L., Zhang, R., & Chua, K.C., Wireless information transfer with opportunistic energy harvesting, IEEE Trans. Wireless Commun., 2012.
Luo, S., Zhang, R., & Lim, T.J., Optimal save-then-transmit protocol for energy harvesting wireless transmitters, IEEE Trans. Wireless Commun., 2013.
Nasir, A.A., Zhou, X., Durrani, S., & Kennedy, R.A., Relaying protocols for wireless energy harvesting and information processing, IEEE Trans. Wireless Commun., vol. 12, no. 7, July 2013.
Varshney, L.R., Transporting information and energy simultaneously, in Proc. 2008 IEEE ISIT.
Xiang, Z., & Tao, M., Robust beamforming for wireless information and power transmission, IEEE Wireless Commun. Lett., vol. 1, no. 4, pp. 372–375, 2012.
Xu, J., & Zhang, R., Throughput optimal policies for energy harvesting wireless transmitters with non-ideal circuit power, IEEE J. Sel. Areas Commun., 2013.
Zhang, R., & Ho, C.K., MIMO broadcasting for simultaneous wireless information and power transfer, in Proc. IEEE Global Communications Conference (Globecom), December 5–9, 2011, Houston, USA.
Zhou, X., Zhang, R., & Ho, C.K., Wireless information and power transfer: architecture design and rate-energy tradeoff, in Proc. IEEE Global Communications Conference (Globecom), December 3–7, 2012, California, USA.
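As a numerical companion to the TSR analysis above (this sketch is our illustration, not part of the original paper), the following Python code evaluates the exact outage integral (12a) and the high-SNR approximation (12b), then locates the throughput-optimal α from (14) by a grid search. It uses the default parameters of Section 4 (P_s = 1, η = 1, R = 3, l₁ = l₂ = 1, λ_h = λ_g = 1); the overall noise variances σ_r² = σ_d² = 0.02 are an assumption chosen to mirror the σ_a² = σ_c² = 0.01 setting of Figure 4.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import k1

def tsr_outage(alpha, Ps=1.0, eta=1.0, R=3.0, l1=1.0, l2=1.0,
               var_r=0.02, var_d=0.02, lam_h=1.0, lam_g=1.0):
    """Exact outage integral (12a) and high-SNR approximation (12b)."""
    g0 = 2.0**R - 1.0                                   # gamma_0 = 2^R - 1
    a = Ps * l1 * l2 * var_d * g0 * (1 - alpha)         # (13a)
    b = l1**2 * l2 * var_r * var_d * g0 * (1 - alpha)   # (13b)
    c = 2 * eta * alpha * Ps**2                         # (13c)
    d = 2 * eta * alpha * Ps * l1 * var_r * g0          # (13d)
    f = lambda z: np.exp(-(a*z + b)/((c*z**2 - d*z)*lam_g) - z/lam_h)
    exact = 1.0 - quad(f, d/c * (1 + 1e-9), np.inf)[0] / lam_h   # (12a)
    u = np.sqrt(4*a / (c * lam_h * lam_g))              # (13e)
    approx = 1.0 - np.exp(-d/(c*lam_h)) * u * k1(u)     # (12b)
    return exact, approx

# throughput (14) on a grid of alpha; an interior optimum is expected
alphas = np.linspace(0.02, 0.98, 97)
R = 3.0
tau = np.array([(1 - tsr_outage(al)[1]) * R * (1 - al) / 2 for al in alphas])
alpha_opt = alphas[np.argmax(tau)]
print(f"optimal alpha ~ {alpha_opt:.2f}, throughput ~ {tau.max():.3f}")
```

Under these assumed noise variances the grid search reproduces the qualitative behaviour reported around Figure 4: the throughput first rises with α, peaks, then falls once too much of the block is spent harvesting.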



Applied Mathematics in Engineering and Reliability – Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

Energy harvesting in amplify-and-forward relaying systems with


interference at the relay

Thanh-Luan Nguyen & Dinh-Thuan Do


HCMC University of Technology and Education, Ho Chi Minh, Vietnam

ABSTRACT: Harvesting energy from Radio-Frequency (RF) signals is an emerging solution for prolonging the lifetime of wireless networks in which the relay node is energy-constrained. In this paper, an interference-aided energy harvesting scheme is proposed for cooperative relaying systems, where the relay harvests energy from the signals transmitted by the source and from co-channel interference, and then consumes that energy to forward the information signal to the destination. A Time Switching-based Relaying (TSR) protocol is proposed to enable energy harvesting and information processing at the relay. Applying the proposed approach to an amplify-and-forward relaying system with the three-terminal model (the source, the relay and the destination), an approximate closed-form analytical expression for the outage probability is derived to analyze the performance of the system. Furthermore, the ergodic capacity, expressed in integral form, is derived in order to determine the achievable throughput. In addition, the achievable throughput of the system is investigated.

1 INTRODUCTION

Nowadays wireless communication devices are developing at an incredible speed and exist all over the world. Both the size and the number of such devices increase every year. However, a major, and maybe the leading, problem is that they consume a tremendous amount of energy (Hasan et al. 2011). Traditional recharging and wiring methods are inefficient because of the large number of small devices. Another considerable answer to the problem is to apply far-field microwave power transfer for long-distance transmission, or to deploy additional power beacons. But the fact remains that such technology is not ready for today's communication systems and is not feasible for deployment throughout the world.

A compelling solution pursued by many researchers is to harvest energy from Radio Frequency (RF) radiation captured by the receive antennas, in order to support energy-constrained communication devices. Ambient RF signals from communication devices are widely available in urban areas and are present around the clock, indoors and outdoors. These characteristics make energy harvesting a highly promising technology. In this technique, the receive antennas convert ambient RF radiation into Direct Current (DC) voltage to supply the appropriate circuits (Paing et al. 2008, Rajesh et al. 2011).

Because of the limitations of the circuitry, one is unable to process information decoding and energy harvesting simultaneously. Conversely, the source signal carries both information and energy at the same time. Furthermore, the receiver is assumed to decode the information and harvest energy from the same signal (Varshney 2008, Grover & Sahai 2010). There are two protocols for harvesting energy and decoding information separately (Zhang & Ho 2013, Nasir et al. 2013, Liu et al. 2013, Medepally & Mehta 2010): one is the Time Switching-based Relaying (TSR) protocol, where the relay switches over time between decoding and harvesting; the other is the Power Splitting-based Relaying (PSR) protocol, where a portion of the received power is used for energy harvesting and the remaining power is used for information processing.

In cooperative networks, by setting up an intermediate relay between the source and the destination, the coverage area and capacity of the communication system can effectively be enhanced (Laneman et al. 2004). However, since the relay is energy-constrained, it is difficult to prolong the lifetime of relaying systems. Optimistically, one can apply the energy harvesting approach in order to achieve the desired performance, since joint information decoding and energy harvesting is advantageous in wireless networks when the nodes cooperate in transmitting the source's signal to the destination. For both protocols, the Co-Channel Interference (CCI) signals supply energy in the energy harvesting phase and act as noise in the information decoding phase.

In this paper, an Amplify-and-Forward (AF) wireless cooperative network is investigated, where



the relay harvests energy from the RF signals broadcast by a source. The impact of Co-Channel Interference (CCI) signals is considered. Specifically, the relay harvests energy from the information signal and the CCI signals, and utilizes that harvested energy to forward the source signal to its destination. The TSR receiver architecture is adopted, and the corresponding protocol is then proposed.

A three-terminal model of AF relaying is proposed, where the source node communicates with the destination node through an intermediate relay node. Due to the effect of the CCI signals, the outage probability is derived approximately in order to determine the outage capacity, defined as the maximum constant rate that can be maintained over fading blocks with a given outage probability. First the ergodic capacity is illustrated numerically. The corresponding achievable throughputs of the proposed energy harvesting system are also studied.

2 SYSTEM MODEL

Consider a cooperative AF relaying system, where the source S communicates with the destination D through an intermediate relay R. The relay is assumed to be energy-constrained, as illustrated in Figure 1. A single antenna operating in half-duplex mode is equipped at each node.

Figure 1 shows the system model for cooperative relaying, where the relay harvests energy from the signal transmitted by the source S and from the interferers, and then uses that energy to charge its battery. Both the source-to-relay and relay-to-destination transmissions experience independent Rayleigh fading, with channel gains h_S and h_D satisfying E{|h_S|²} = λ_S and E{|h_D|²} = λ_D, respectively, where E{·} denotes the expectation operator and |·| is the absolute value operator.

We assume that there are M CCI signals affecting the relay. The complex channel fading gain between the ith interferer and the relay is denoted l_i, with E{|l_i|²} = λ_i. In this paper, all channels follow a Rayleigh distribution.

Figure 1. System model for the two-hop channel-state-information amplify-and-forward system, with co-channel interference at the relay.

Figure 2. Block diagram of the TSR protocol for energy harvesting and information processing at the relay.

Figure 2 illustrates the key parameters of the TSR protocol, where T is the block time in which a certain block of information is transmitted from the source node to the destination node. The relay spends α_r T of the block time on energy harvesting, where 0 ≤ α_r ≤ 1 denotes the energy harvesting ratio, and the remaining block time is divided into two equal parts, each of duration (1−α_r)T/2, for source-to-relay and relay-to-destination transmission, respectively.

3 TIME SWITCHING-BASED RELAYING (TSR) PROTOCOL

3.1 Energy harvesting and information processing at the relay node

Firstly, the source transmits its signal s(t) to the relay. Accordingly, in the presence of co-channel interference, the received signal y_R^{TSR}(t) at the relay node can be expressed as:

\[ y_R^{TSR}(t) = h_S s(t) + \sum_{i=1}^{M} l_i s_i(t) + n_a^{[R]}(t) \]  (1)

where n_a^{[R]}(t) is the narrowband Gaussian noise introduced by the receiving antenna, and s_i(t) is the signal transmitted by the ith interferer.

After down-conversion, the sampled baseband signal at the relay node, y_R^{TSR}(k), is given by:

\[ y_R^{TSR}(k) = h_S s(k) + \sum_{i=1}^{M} l_i s_i(k) + \underbrace{n_a^{[R]}(k) + n_c^{[R]}(k)}_{n_R^{TSR}(k)} \]  (2)

where s(k) and s_i(k) are the sampled information signals from the source and the ith interferer, respectively; n_a^{[R]}(k) denotes the baseband Additive White Gaussian Noise (AWGN) introduced by the receiving antenna at the relay, and n_c^{[R]}(k) is the sampled AWGN due to the conversion from RF band to baseband signal, both with zero mean and variances N_a^{[R]} and N_c^{[R]}, respectively; and n_R^{TSR}(k) is the overall AWGN at the relay.



The harvested energy during the harvesting time α_r T is given by:

\[ E_h = \eta_e \left( P_S |h_S|^2 + \sum_{i=1}^{M} P_i |l_i|^2 \right) \alpha_r T \]  (3)

where η_e, with 0 < η_e ≤ 1, is the energy conversion efficiency, whose value depends upon the harvesting circuitry, and P_S = E{|s(t)|²} and P_i = E{|s_i(t)|²} are the transmit powers of the source and of the interference sources, respectively.

The transmit power of the relay is:

\[ P_R = \frac{E_h}{(1-\alpha_r)T/2} = \frac{2\eta_e \alpha_r}{1-\alpha_r}\left( P_S |h_S|^2 + \sum_{i=1}^{M} P_i |l_i|^2 \right) \]  (4)

Before forwarding y_R^{TSR}(k) to D, the relay amplifies the received signal by multiplying it with the gain G, which can be expressed as:

\[ G = \sqrt{\frac{P_R}{P_S |h_S|^2 + \sum_{i=1}^{M} P_i |l_i|^2 + N_R}} \]  (5)

where N_R ≜ N_a^{[R]} + N_c^{[R]} is the variance of the overall AWGN at the relay.

Hence, the received signal at the destination node after the sampling process, y_D^{TSR}(k), is given by:

\[ y_D^{TSR}(k) = y_R^{TSR}(k)\, h_D G + \underbrace{n_a^{[D]}(k) + n_c^{[D]}(k)}_{n_D^{TSR}(k)} \]  (6)

where n_a^{[D]}(k) and n_c^{[D]}(k) are the AWGNs at the destination node due to the antenna and the conversion, both with zero mean and variances N_a^{[D]} and N_c^{[D]}, respectively, and n_D^{TSR}(k) is the overall AWGN at the destination. By substituting y_R^{TSR}(k) from (2) into (6), y_D^{TSR}(k) is given by:

\[ y_D^{TSR}(k) = h_S s(k)\, h_D G + \sum_{i=1}^{M} l_i s_i(k)\, h_D G + n_R^{TSR}(k)\, h_D G + n_D^{TSR}(k) \]  (7)

As a result, the SINR of the decision variable, γ_{S2D}^{TSR}, is given by (13) below.

3.2 Outage probability

In this paper, the outage probability is defined as the probability that γ_{S2D}^{TSR} drops below an acceptable SINR threshold γ_th. This concept is mathematically illustrated by:

\[ P_{outage} \triangleq P(\gamma_{S2D}^{TSR} < \gamma_{th}) = F_{\gamma_{S2D}^{TSR}}(\gamma_{th}) \]  (8)

At high SNR, the outage probability at the destination node is approximately given by:

\[ P_{outage} \approx 1 - 2\left(\frac{\gamma_{th}}{\bar\gamma_1 \bar\gamma_g}\right)^{1/2} K_1\!\left(2\sqrt{\frac{\gamma_{th}}{\bar\gamma_1 \bar\gamma_g}}\right) \exp\!\left(-\frac{N_{R/D}\,\gamma_{th}}{\bar\gamma_1}\right) \sum_{i=1}^{\rho(A)} \sum_{j=1}^{\tau_i(A)} \chi_{i,j}(A)\, \frac{\Gamma(j)}{(j-1)!\,\lambda_i^j} \left(\frac{\gamma_{th}}{\bar\gamma_1} + \frac{1}{\lambda_i}\right)^{-j} \]  (9)

where A ≜ diag(λ₁, …, λ_M) with diagonal entries λ_i ≜ P_i E{|l_i|²}/N_D, ρ(A) is the number of distinct diagonal elements of A, λ₁ > λ₂ > … > λ_{ρ(A)} are the distinct diagonal elements in decreasing order, τ_i(A) is the multiplicity of λ_i, and χ_{i,j}(A) is the (i, j)th characteristic coefficient of A (Gu & Aissa 2014).

When the interfering signals are statistically independent and identically distributed (i.i.d.), i.e., λ_i = λ for all i = 1, 2, …, M, then ρ(A) = 1 and τ₁(A) = M, and the outage probability P_outage reduces to:

\[ P_{outage} \approx 1 - 2\left(\frac{\gamma_{th}}{\bar\gamma_1 \bar\gamma_g}\right)^{1/2} K_1\!\left(2\sqrt{\frac{\gamma_{th}}{\bar\gamma_1 \bar\gamma_g}}\right) \exp\!\left(-\frac{N_{R/D}\,\gamma_{th}}{\bar\gamma_1}\right) \frac{\Gamma(M)}{(M-1)!\,\lambda^M} \left(\frac{\gamma_{th}}{\bar\gamma_1} + \frac{1}{\lambda}\right)^{-M} \]  (10)

where

\[ \bar\gamma_1 = \frac{P_S \lambda_S}{N_D} \]  (10a)

\[ \bar\gamma_g = \frac{2\eta_e \alpha_r}{1-\alpha_r}\,\lambda_D \]  (10b)

\[ N_{R/D} = \frac{N_R}{N_D} \]  (10c)

and γ̄₁ is defined as the average Signal-to-Noise Ratio (SNR).

Proof: See Appendix A.

3.3 Ergodic capacity and the achievable throughput

The second parameter used to evaluate the performance of the cooperative network is the



throughput, which is determined by evaluating the ergodic capacity C_E, in bit/s/Hz, at the destination. In AF cooperative communication, using the received SINR at the destination, γ_{S2D}^{TSR} in (13), C_E is given by:

\[ C_E = E\{\log_2(1 + \gamma_{S2D}^{TSR})\} \]  (11)

\[ \phantom{C_E} = \int_0^{\infty} \log_2(1+\gamma)\, f_{\gamma_{S2D}^{TSR}}(\gamma)\, d\gamma \]  (12)

\[ \gamma_{S2D}^{TSR} = \frac{2\eta_e\alpha_r |h_D|^2 P_S|h_S|^2 \left(P_S|h_S|^2 + \sum_{i=1}^{M} P_i|l_i|^2\right)}{2\eta_e\alpha_r |h_D|^2 \left(P_S|h_S|^2 + \sum_{i=1}^{M} P_i|l_i|^2\right)\left(\sum_{i=1}^{M} P_i|l_i|^2 + N_R\right) + N_D(1-\alpha_r)\left(P_S|h_S|^2 + \sum_{i=1}^{M} P_i|l_i|^2\right) + (1-\alpha_r)N_D N_R} \]  (13)

where f_{γ_{S2D}^{TSR}}(γ) stands for the PDF of the random variable γ_{S2D}^{TSR}. Using the integration-by-parts method, the expression in (12) can be rewritten as:

\[ C_E = \Big\{\log_2(1+\gamma)\left(F_{\gamma_{S2D}^{TSR}}(\gamma) - 1\right)\Big\}_0^{\infty} - \frac{1}{\ln 2}\int_0^{\infty} \frac{1}{1+\gamma}\left(F_{\gamma_{S2D}^{TSR}}(\gamma) - 1\right) d\gamma \]  (14)

\[ \phantom{C_E} = \frac{1}{\ln 2}\int_0^{\infty} \frac{1}{1+\gamma}\left(1 - F_{\gamma_{S2D}^{TSR}}(\gamma)\right) d\gamma \]  (15)

where {f(x)}_a^b ≜ f(b) − f(a).

The throughput at the destination depends only on the effective information transmission time, (1−α_r)T/2, and is given by:

\[ \tau_E = \frac{(1-\alpha_r)T/2}{T}\, C_E = \frac{1-\alpha_r}{2}\, C_E \]  (16)

3.4 Outage capacity and the achievable throughput

Outage capacity, in bit/s/Hz, is defined as the maximum constant rate that can be maintained over fading blocks with a specified outage probability (Gu & Aissa 2015). In the AF cooperative communication system under study, the outage capacity is expressed as:

\[ C_O = \left(1 - P_{outage}(\gamma_{th})\right) \log_2(1 + \gamma_{th}) \]  (17)

The achievable throughput at the destination, which relates only to the transmission time, is given by:

\[ \tau_O = \frac{1-\alpha_r}{2}\, C_O \]  (18)

4 NUMERICAL RESULTS

In this section, the approximated analytical results are derived, and Monte Carlo simulation results are illustrated to corroborate the proposed analysis. To evaluate the effects of the interferences on the system's performance, we define the average signal-to-interference ratio as SIR ≜ γ̄₁/γ̄_INF, where γ̄_INF ≜ Σ_{i=1}^{M} P_iλ_i/N_D. Hereafter, and unless otherwise stated, the variances are assumed identical, N_a^{[R]} = N_c^{[R]} = N_a^{[D]} = N_c^{[D]}, the number of interferers is set to 1 (M = 1), and the value of the energy conversion efficiency is set to 1 (η_e = 1).

Figure 3. Throughput τ_E and τ_O versus the energy harvesting ratio α_r for different values of SIR, where the SNR is fixed at 20 dB, with N_R = N_D = 1 and γ_th = 5 dB.

Figure 3 shows the throughput τ_E and τ_O versus the energy harvesting ratio α_r for different values of the average SIR received at the relay, where the average SNR is fixed at 20 dB and γ_th = 5 dB. The analytical and simulation results of the ergodic capacity are from (15) and (12), respectively. It is observed that the analytical results match the simulation results well. In general, the throughput increases as α_r increases up to some optimal value. But later, as α_r increases beyond its optimal value, more of the block time is wasted on energy harvesting, so that the throughput of the system gradually drops down from its maximum value. Furthermore, when the average SIR increases, the optimal throughput also increases. This implies that an increase in the power of the CCI signals can deteriorate the system performance, but reduces the α_r required to achieve the same value of throughput.

Figure 4. Optimal throughput τ_E and τ_O versus the average SIR for different values of average SNR, with N_R = N_D = 1 and γ_th = 5 dB.

Figure 4 shows the optimal throughput τ_E and τ_O versus the average SIR for different values of the average SNR, where γ_th = 5 dB. It shows that, for a given average SNR, the optimal throughput increases as the average SIR increases; an increase in the average SIR can effectively enhance the system throughput. In order to enhance the system's throughput, we can either increase the signal power or decrease the noise variances.

5 CONCLUSION

In this paper, an interference-aided energy harvesting amplify-and-forward relaying system was proposed, where the energy-constrained relay harvests energy from the received information signal and the Co-Channel Interference (CCI) signals, then uses that harvested energy to forward the signal to the destination after multiplying it by the gain. The time switching-based relaying protocol was adopted here for circuit simplicity. The achievable throughput of the system is numerically derived from the ergodic capacity and analytically derived from the outage capacity. The outage probability is calculated approximately at high SNR for simplicity. It is shown that when the SNR is fixed, an increase in the power of the CCI signals can reduce the system performance, but requires less time for energy harvesting at the relay. In order to enhance the system throughput, one can either increase the power of the information signal or decrease the noise variances.

APPENDIX A

At high SNR, the third factor in the denominator of (13), (1−α_r)N_D N_R, can be ignored, since its value is too small compared to the other two factors in the denominator, 2η_e α_r |h_D|²(P_S|h_S|² + Σ_{i=1}^{M} P_i|l_i|²)(Σ_{i=1}^{M} P_i|l_i|² + N_R) and N_D(1−α_r)(P_S|h_S|² + Σ_{i=1}^{M} P_i|l_i|²). As a result, the approximated SINR at the destination under the high-SNR approximation is given by:

\[ \gamma_{S2D}^{TSR} \approx \frac{\gamma_1}{\gamma_{INF} + N_{R/D} + \dfrac{1}{\gamma_g}} \]  (A.1)

where:

\[ \gamma_1 = \frac{P_S}{N_D}\,|h_S|^2 \]  (A.1a)

\[ \gamma_{INF} = \sum_{i=1}^{M} \frac{P_i}{N_D}\,|l_i|^2 \]  (A.1b)

\[ \gamma_g = \frac{2\eta_e \alpha_r}{1-\alpha_r}\,|h_D|^2 \]  (A.1c)

\[ N_{R/D} = \frac{N_R}{N_D} \]  (A.1d)

In order to find P_outage, the cumulative distribution function F_{γ_{S2D}^{TSR}}(γ_th) is approximately given by:

\[ F_{\gamma_{S2D}^{TSR}}(\gamma_{th}) \approx \int_0^{\infty}\!\!\int_0^{\infty} P\!\left(\gamma_1 < \gamma_{th}\left(y + N_{R/D} + \frac{1}{z}\right)\right) f_{\gamma_g}(z)\, f_{\gamma_{INF}}(y)\, dy\, dz \]  (A.2)

where f_{γ_g}(z) and f_{γ_INF}(y) denote the Probability Density Functions (PDFs) of γ_g and γ_INF, respectively. The PDF of γ_INF is given by (for details on this analysis, see Bletsas, Shin & Win (2007)):

\[ f_{\gamma_{INF}}(y) = \sum_{i=1}^{\rho(A)} \sum_{j=1}^{\tau_i(A)} \chi_{i,j}(A)\, \frac{\lambda_i^{-j}}{(j-1)!}\, y^{\,j-1} \exp\!\left(-\frac{y}{\lambda_i}\right) \]  (A.3)



If the interfering signals are i.i.d., the PDF of γ_INF reduces to:

\[ f_{\gamma_{INF}}(y) = \frac{1}{(M-1)!\,\lambda^M}\, y^{M-1} \exp\!\left(-\frac{y}{\lambda}\right) \]  (A.4)

In addition, to evaluate F_{γ_{S2D}^{TSR}}(γ_th), we also need the CDF and PDF of the random variables γ₁ and γ_g, respectively. Note that the CDF of γ₁ and the PDF of γ_g can be expressed as F_{γ₁}(γ) = 1 − exp{−γ/γ̄₁} and f_{γ_g}(z) = (1/γ̄_g) exp{−z/γ̄_g}, respectively. Substituting (A.3) into (A.2), and with the given CDF of γ₁ and PDF of γ_g, we have:

\[ F_{\gamma_{S2D}^{TSR}}(\gamma_{th}) = 1 - \exp\!\left(-\frac{N_{R/D}\,\gamma_{th}}{\bar\gamma_1}\right) \frac{1}{\bar\gamma_g} \sum_{i=1}^{\rho(A)} \sum_{j=1}^{\tau_i(A)} \frac{\chi_{i,j}(A)\,\lambda_i^{-j}}{(j-1)!} \int_0^{\infty} y^{\,j-1} \exp\!\left(-\left(\frac{\gamma_{th}}{\bar\gamma_1} + \frac{1}{\lambda_i}\right) y\right) dy \int_0^{\infty} \exp\!\left(-\frac{\gamma_{th}}{\bar\gamma_1 z} - \frac{z}{\bar\gamma_g}\right) dz \]  (A.5)

The two integrals can be evaluated as follows (Prudnikov, Brychkov & Marichev 1986, eqs. (2.3.3.1) and (2.3.16.1)):

\[ I_1 \triangleq \int_0^{\infty} y^{\,j-1} \exp\!\left(-\left(\frac{\gamma_{th}}{\bar\gamma_1} + \frac{1}{\lambda_i}\right) y\right) dy = \Gamma(j)\left(\frac{\gamma_{th}}{\bar\gamma_1} + \frac{1}{\lambda_i}\right)^{-j} \]  (A.6)

\[ I_2 \triangleq \int_0^{\infty} \exp\!\left(-\frac{\gamma_{th}}{\bar\gamma_1 z} - \frac{z}{\bar\gamma_g}\right) dz = 2\left(\frac{\gamma_{th}\,\bar\gamma_g}{\bar\gamma_1}\right)^{1/2} K_1\!\left(2\sqrt{\frac{\gamma_{th}}{\bar\gamma_1\,\bar\gamma_g}}\right) \]  (A.7)

where K₁(·) stands for the first-order modified Bessel function of the second kind and Γ(j) denotes the Gamma function. The approximated outage probability P_outage in (10) is obtained by substituting (A.6) and (A.7) into (A.5).

REFERENCES

Bletsas, A., Shin, H. & Win, M.Z. 2007. Cooperative communications with outage-optimal opportunistic relaying. IEEE Trans. Wireless Commun., 6: 3450–3460.
Grover, P. & Sahai, A. 2010. Shannon meets Tesla: Wireless information and power transfer. Proc. 2010 IEEE Int. Symp. Inf. Theory: 2363–2367.
Gu, Y. & Aissa, S. 2014. Interference aided energy harvesting in decode-and-forward relaying systems. Proc. IEEE Int. Conf. Commun.: 5378–5382.
Gu, Y. & Aissa, S. 2015. RF-based energy harvesting in decode-and-forward relaying systems: Ergodic and outage capacities. Proc. IEEE Int. Conf. Commun.: 6425–6434.
Hasan, Z., Boostanimehr, H. & Bhargava, V.K. 2011. Green cellular networks: A survey, some research issues and challenges. IEEE Commun. Surveys Tuts., 13: 524–540.
Laneman, J.N., Tse, D.N.C. & Wornell, G.W. 2004. Cooperative diversity in wireless networks: Efficient protocols and outage behaviour. IEEE Trans. Inf. Theory, 50: 3062–3080.
Liu, L., Zhang, R. & Chua, K.C. 2013. Wireless information transfer with opportunistic energy harvesting. IEEE Trans. Wireless Commun., 12: 288–300.
Medepally, B. & Mehta, N.B. 2010. Voluntary energy harvesting relays and selection in cooperative wireless networks. IEEE Trans. Wireless Commun., 9: 3543–3553.
Nasir, A.A., Zhou, X., Durrani, S. & Kennedy, R.A. 2013. Relaying protocols for wireless energy harvesting and information processing. IEEE Trans. Wireless Commun., 12(7): 3622–3636.
Paing, T., Shin, J., Zane, R. & Popovic, Z. 2008. Resistor emulation approach to low-power RF energy harvesting. IEEE Trans. Power Elec., 23: 1494–1501.
Prudnikov, A.P., Brychkov, Y.A. & Marichev, O.I. 1986. Integrals and Series (vols. 1 & 2). New York: Gordon and Breach Science Publishers.
Rajesh, R., Sharma, V. & Viswanath, P. 2011. Information capacity of energy harvesting sensor nodes. Proc. 2011 IEEE Int. Symp. Inf. Theory: 2363–2367.
Varshney, L.R. 2008. Transporting information and energy simultaneously. Proc. 2008 IEEE Int. Symp. Inf. Theory: 1612–1616.
Zhang, R. & Ho, C.K. 2013. MIMO broadcasting for simultaneous wireless information and power transfer. IEEE Trans. Wireless Commun., 12: 1989–2001.

158

AMER16_Book.indb 158 3/15/2016 11:26:52 AM
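Equation (A.7) rests on the tabulated integral int_0^inf exp(-a/z - z/b) dz = 2*sqrt(a*b)*K1(2*sqrt(a/b)) (Prudnikov et al. 1986, eq. (2.3.16.1)). A quick pure-Python quadrature check is sketched below; K1 is evaluated from its integral representation K1(x) = int_0^inf exp(-x cosh t) cosh t dt, and the values of a and b are arbitrary positive test values, not parameters from the paper:

```python
import math

def lhs(a, b, n=200_000, zmax=80.0):
    # Riemann/trapezoid quadrature of the left-hand side over (0, zmax];
    # the integrand vanishes at both ends, so truncation error is negligible.
    h = zmax / n
    return h * sum(math.exp(-a / (k * h) - (k * h) / b) for k in range(1, n + 1))

def bessel_k1(x, n=20_000, tmax=20.0):
    # Integral representation K1(x) = int_0^inf exp(-x cosh t) cosh t dt.
    h = tmax / n
    total = 0.5 * math.exp(-x)  # t = 0 endpoint: cosh 0 = 1
    for k in range(1, n + 1):
        t = k * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(t)
    return total * h

def rhs(a, b):
    return 2.0 * math.sqrt(a * b) * bessel_k1(2.0 * math.sqrt(a / b))

a, b = 0.8, 1.5  # arbitrary positive test values
assert abs(lhs(a, b) - rhs(a, b)) < 1e-5
```

The two sides agree to within the quadrature tolerance, which is ample for checking the algebra behind (A.7).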


Applied Mathematics in Engineering and Reliability – Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

On a tandem queueing network with breakdowns

A. Aissani
Department of Computer Science, University of Science and Technology Houari Boumediene (USTHB),
El Alia, Bab-Ez-Zouar, Algiers, Algeria

ABSTRACT: The purpose of this paper is to provide a method for finding the probability distribution of the virtual waiting time of a customer in a closed queueing network with two stations in tandem and unreliable servers. We obtain the joint probability distribution of the server state (busy or out of order) and the residual work in each station. Then we derive the probability distribution of the virtual waiting time of a customer in the network in terms of its Laplace-Stieltjes transform. These results are useful for deriving performance metrics such as the utilization or the load of a central node (or base station) in physical networks such as mobile or Wireless Sensor Networks (WSNs), databases and other telecommunication or computer systems.

1 INTRODUCTION

Figure 1. Model of a network with central node.

Queueing network models are interesting tools when we want to take into account the effect of packet traffic or routing protocols on the performance of a real physical network (Boucherie & Dijk 2010, Medvediev 1978, Demirkol, Ersoy, Alagöz, & Deliç 2009, Qiu, Feng, Xia, Wu, & Zhou 2011, Senouci, Mellouk, & Aissani 2012). We consider a Closed Queueing Network of two single-server nodes (or stations) S1 and S2 in tandem in which a constant number N of requests (customers, packets, etc.) circulates. Such a model (see Figure 1) is used in many systems with multiple access in which we consider a Central Node (or Base Station) against the rest of the network, considered as a bloc of data transmission. For example, the work (Osman & Knottenbelt 2012) presents a categorization of queueing network performance models of database systems. It considers amongst others the transaction processing model, in which the central node or server represents the hardware components (for example the CPU) of a centralized database or a site in a distributed database.

A more elaborate example concerns modeling issues in Wireless Sensor Networks (WSNs) for performance evaluation purposes. WSNs are widely used to implement low-cost non-attended monitoring of different environments. WSNs operate in a complex real-time and real-world noisy environment, which gives rise to several modeling challenges regarding quantitative protocol evaluation for QoS (Quality of Service) goals. Recent applied and theoretical research focuses amongst others on: (i) deployment of the sensors (location and coverage issues) (Senouci, Mellouk, & Aissani 2014, Senouci, Mellouk, Oukhellou, & Aissani 2015); (ii) energy efficiency, due to low-power and low-cost devices with limited capabilities (sensing, data processing, transmission range, memory, communication); (iii) saturation throughput analysis; (iv) end-to-end delay, reflecting the time needed by a message to traverse one hop of its multi-hop path to the sink node (Qiu, Xia, Feng, Wu, & Jin 2011); such an aspect copes with the retrial behavior of the message-sending node and the active/sleep periods of the potential next-hop nodes (Phung-Duc 2012); (v) routing protocols (Qiu, Xia, Feng, Wu, & Jin 2011); (vi) reliability and maintainability issues, which can be understood in different ways according to the routing protocol used for the physical purpose (Senouci, Mellouk, & Aissani 2012); etc.

In this context of WSNs we assume that the deployment has already been achieved. Moreover, since the second node represents the rest of the network, the exponential and Poisson assumptions about arrivals and breakdowns are reasonable by virtue of the limit theorems of probability. The network lifetime strongly depends on the routing protocol used and can be defined in several ways. For example, the lifetime can be defined as the time elapsed until the first (or last) node depletes its energy (dies and cannot provide service). In some scenarios, such as intrusion or fire detection, it is necessary that all nodes stay alive as long as possible, since network


quality decreases as soon as one node dies. In these scenarios, it is important to know when the first node dies; the FND metric (First Node Dies) gives an estimated value for this event (resp. the LND metric, Last Node Dies). The HNA metric (Half of the Nodes Alive) gives an estimated value for the case when the loss of a single or a few nodes does not automatically reduce the QoS. Now, since the sensor nodes are placed at different distances from the base station, the network lifetime distribution can be studied using these three metrics from the spatio-temporal point of view (Senouci, Mellouk, & Aissani 2012).

In this paper, we introduce the probability distribution of the lifetime of the node, which fits any of the above metrics according to the modeling level. When a node dies, a random interruption period (for corrective maintenance) begins in order to renew the node to the state as-good-as-new.

We assume that the lifetime of the central node S1 is arbitrarily distributed with Probability Distribution Function (PDF) D1(t), while the lifetime of the node S2 is exponentially distributed with parameter α2. After a breakdown, the renewal of the node begins immediately and its duration is a random variable arbitrarily distributed with PDF R1(t) in S1 and exponentially distributed with parameter β2 in S2. The service time of a request is also exponentially distributed with parameter μi in the station Si, i = 1, 2. We assume that the three sequences of lifetimes, renewal times and service times are mutually independent and identically distributed sequences of random variables.

After service in the node Si, the served request joins immediately the node Sj, i, j = 1, 2, j ≠ i. If a new request finds in Sj the server available, i.e. free of requests and in functioning state, then its service begins immediately. Otherwise, it joins a First-In-First-Out queue and waits for the beginning of service without any constraint on the duration of the waiting time.

Concerning the evolution of a request whose service was interrupted, we consider the following two schemes:

i. After the renewal of the node, the service of the interrupted request continues from the point at which it was interrupted (Scheme 1).
ii. After renewal, a new service begins (Scheme 2).

The purpose of this paper is to provide a method for finding the probability distribution of the virtual waiting time of a request in such a Queueing Network. In the following section, we provide a technical remark which helps us to simplify the considered problem. In section 3, we describe the basic stochastic process describing the evolution of our network of queues. In section 4 we derive a partial differential system of equations for the probability distribution of the system state of the basic stochastic process describing the evolution of our queueing network. Section 5 is devoted to the resolution of this system of equations in stationary regime. In section 6, we show how this solution can be used to derive the probability distribution of the virtual waiting time of a request in the central node of the network. Finally, in section 7 we show some applications and numerical examples.

2 SIMPLIFICATION OF THE PROBLEM

Let G1(t) be the distribution function of the generalized service time (Gaver 1962, Taleb & Aissani 2010), also called completion time (Aissani & Artalejo 1998), of an arbitrary request in the base station S1, i.e. the time from the first access of the request to the server until it leaves the system with completed service.

Denote also by Hi(t) the distribution function of the service time in station Si, by Di(t) the distribution of the lifetime of the server in station Si, and by Ri(t) the distribution of the renewal time in Si, i = 1 or 2. The corresponding Laplace-Stieltjes transforms are denoted by gi(s), di(s), ri(s), hi(s), Re(s) > 0, i = 1 or i = 2.

It can be shown that the Laplace-Stieltjes transform of the distribution G1(t) is given as follows.

Scheme 1.

$$g_1(s) = \int_0^{\infty} e^{-st}\, P(r_1(s), t)\, dH_1(t),$$

$$\tilde{P}(x,s) = \int_0^{\infty} e^{-st} P(x,t)\, dt = \frac{1}{s}\,\frac{1 - d_1(s)}{1 - x\, d_1(s)}.$$

Here $p_k(t)$ is the probability that k breakdowns occur during the service of the marked customer and

$$P(x,t) = \sum_{k \ge 0} x^k p_k(t)$$

is the generating function (or z-transform) of this probability distribution relatively to the discrete variable k. So, $\tilde{P}(x,s)$ represents the Laplace transform relatively to the continuous real variable t.

Scheme 2.

$$g_1(s) = \frac{h_{12}(s)}{1 - h_{11}(s)\, r_1(s)}.$$
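The Scheme 2 formula can be sanity-checked numerically. Assume, purely for this check (the paper allows general H1, D1, R1), that service, lifetime and renewal times in S1 are exponential with rates mu1, a1, b1; then h12(s) = mu1/(s + mu1 + a1), h11(s) = a1/(s + mu1 + a1), r1(s) = b1/(s + b1), and g1(s) can be compared with a Monte-Carlo estimate of E[exp(-s C)] over simulated completion times C, with the service restarted from scratch after each repair as Scheme 2 prescribes:

```python
import math
import random

def g1(s, mu1, a1, b1):
    # Scheme 2 completion-time LST for the all-exponential special case.
    h12 = mu1 / (s + mu1 + a1)
    h11 = a1 / (s + mu1 + a1)
    r1 = b1 / (s + b1)
    return h12 / (1.0 - h11 * r1)

def completion_time(rng, mu1, a1, b1):
    # Scheme 2: every renewal is followed by a brand-new service attempt.
    c = 0.0
    while True:
        service = rng.expovariate(mu1)
        failure = rng.expovariate(a1)
        if service < failure:                    # attempt finishes first
            return c + service
        c += failure + rng.expovariate(b1)       # lifetime + renewal, retry

rng = random.Random(42)
mu1, a1, b1, s = 1.0, 0.5, 2.0, 1.0
n = 200_000
mc = sum(math.exp(-s * completion_time(rng, mu1, a1, b1)) for _ in range(n)) / n
assert abs(mc - g1(s, mu1, a1, b1)) < 5e-3
```

The assertion passes with a comfortable margin, since the Monte-Carlo standard error is well below the tolerance.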


In this formula,

$$h_{11}(s) = \int_0^{\infty} e^{-st}\,[1 - H_1(t)]\, dD_1(t), \qquad h_{12}(s) = \int_0^{\infty} e^{-st}\,[1 - D_1(t)]\, dH_1(t).$$

Recall that in section 1 we have assumed that the functions D2(t), R2(t), H2(t) correspond to exponential distribution functions. So, by elementary algebra, we show that

$$d_2(s) = \int_0^{\infty} e^{-st}\, dD_2(t) = \frac{\alpha_2}{s + \alpha_2}, \qquad r_2(s) = \int_0^{\infty} e^{-st}\, dR_2(t) = \frac{\beta_2}{s + \beta_2}, \qquad h_2(s) = \int_0^{\infty} e^{-st}\, dH_2(t) = \frac{\mu_2}{s + \mu_2}.$$

Now, according to the remark of several authors (Gaver 1962) (for FIFO queues), the waiting time of an arbitrary (but marked) request can be decomposed into the sum of the proper service time and all the renewal times of the breakdowns occurring during this service. From the request's point of view, it is not important what time has been devoted to the service itself or to the reparation of the breakdown. The only important thing is at what time the server will be available to accept a new request in service. So, we will assume that the server in the station S1 is absolutely reliable and that the service time in this station follows the probability distribution G1(t). Similar arguments can be provided from the point of view of the second node S2.

3 THE BASIC STOCHASTIC PROCESS

We introduce the following notations. Let e(t) = 0 if the server in S1 is free of requests and available; e(t) = 1 if the server is busy or out of order. Let ζ(t) = 0 if the server in S2 is free of requests; ζ(t) = 1 if the server is busy. Let ξ(t) = 0 if the server in S2 is available; ξ(t) = 1 if it is out of order.

We introduce also ω(t), a continuous random variable which is equal to:

– the period from t until the moment of breakdown of the server S1, if e(t) = 0 and after t there is no arrival in S1;
– the period from t until the beginning of service of a request which would arrive at time t, if e(t) = 1, ζ(t) = 0;
– the period from t until the server S1 achieves the service of all customers, if e(t) = 1, ξ(t) = 0.

Consider the following stochastic process X(t) = {e(t), ζ(t), ξ(t), ω(t)} defined on the state space E = {0,1} × {0,1} × {0,1} × R+. It is not difficult to show that the process {X(t)} is a linear Markov process with spontaneous variations of state in the sense of (Gnedenko & Kovalenko 1989). This process is visibly ergodic, since the embedded Markov chain is finite, irreducible and aperiodic.

4 SYSTEM STATE EQUATIONS

In this section we derive the system state equations. The functions

$$F_{ij}^{(k)}(x,t) = P\{e(t) = k,\ \zeta(t) = i,\ \xi(t) = j,\ \omega(t) < x\}, \qquad i, j, k \in \{0,1\},\ x \ge 0,$$

are solutions of the following system of partial differential equations:

$$\frac{\partial F_{01}^{(0)}(x;t)}{\partial t} = \frac{\partial F_{01}^{(0)}(x;t)}{\partial x} - \frac{\partial F_{01}^{(0)}(0;t)}{\partial x} + \frac{\partial F_{01}^{(1)}(0;t)}{\partial x}\, D_1(x) + \beta_2 F_{11}^{(0)}(x;t) - (\mu_2 + \alpha_2)\, F_{01}^{(0)}(x;t),$$

$$\frac{\partial F_{11}^{(0)}(x;t)}{\partial t} = \frac{\partial F_{11}^{(0)}(x;t)}{\partial x} - \frac{\partial F_{11}^{(0)}(0;t)}{\partial x} + \frac{\partial F_{11}^{(1)}(0;t)}{\partial x}\, D_1(x) + \alpha_2 F_{01}^{(0)}(x;t) - \beta_2 F_{11}^{(0)}(x;t),$$

$$\frac{\partial F_{01}^{(1)}(x;t)}{\partial t} = \frac{\partial F_{01}^{(1)}(x;t)}{\partial x} - \frac{\partial F_{01}^{(1)}(0;t)}{\partial x} + \frac{\partial F_{01}^{(0)}(0;t)}{\partial x}\, R_1(x) + \mu_2\, G_1(x)\, \frac{\partial F_{01}^{(0)}(0;t)}{\partial x} + \mu_1 F_{00}^{(1)} * G_1(x) - (\mu_2 + \alpha_2)\, F_{01}^{(1)}(x;t) + \mu_2 \int_0^x \frac{\partial F_{01}^{(1)}(y;t)}{\partial y}\, G_1(x-y)\, dy,$$

$$\frac{\partial F_{11}^{(1)}(x;t)}{\partial t} = \frac{\partial F_{11}^{(1)}(x;t)}{\partial x} - \frac{\partial F_{11}^{(1)}(0;t)}{\partial x} + \frac{\partial F_{11}^{(0)}(0;t)}{\partial x}\, R_1(x) + \alpha_2 F_{01}^{(1)}(x;t) - \beta_2 F_{11}^{(1)}(x;t) + \mu_1 F_{10}^{(1)} * G_1^{(N-1)}(x),$$

$$F_{00}^{(1)}(x;t) = p_1(t) \int_0^x G_1^{(N-1)}(x-y)\, dR_1(y) + p_0(t)\, G_1^{(N-1)}(x),$$

$$F_{10}^{(1)}(x;t) = q_1(t) \int_0^x G_1^{(N)}(x-y)\, dR_1(y) + q_0(t)\, G_1^{(N)}(x),$$


where

$$p_i(t) = P\{e(t) = 1,\ \zeta(t) = 0,\ \xi(t) = 1,\ \nu_1(t) = i\}, \qquad q_i(t) = P\{e(t) = 1,\ \zeta(t) = 1,\ \xi(t) = 0,\ \nu_1(t) = i\}, \qquad i = 0, 1.$$

Here ν1(t) is the number of unavailable servers in the system S2 and G1^(k)(·) is the k-th order convolution of the function G1(·).

These equations can be obtained in the usual way (Gnedenko & Kovalenko 1989). The idea is to observe the evolution of the basic stochastic process during an infinitesimally small interval of time (t, t+h). The random event e(t+h) = 0, ζ(t+h) = 0, ξ(t+h) = 1 with ω(t+h) < x (whose probability is F01^(0)(x, t+h)) occurs if and only if one of the following events holds:

– either we start from the state e(t) = 1, ζ(t) = 0, ξ(t) = 1 with 0 ≤ ω(t) < h (the probability of such a random event is F01^(1)(h, t)) and the duration to the next breakdown of the server in S1 is less than x (the probability of such an event is D1(x));
– either e(t) = 0, ζ(t) = 1, ξ(t) = 1 with h ≤ ω(t) < x + h (with probability F11^(0)(x, t)) and the breakdown of the server in S2 has been repaired (with probability R2(h));
– either e(t) = 0, ζ(t) = 0, ξ(t) = 1 with h ≤ ω(t) < x + h (with probability F01^(0)(x+h, t) − F01^(0)(h, t)), the service in course in S2 is not achieved and the server in S2 does not fail (with probability [1 − H2(h)][1 − D2(h)]).

So, we can write:

$$F_{01}^{(0)}(x, t+h) = F_{01}^{(1)}(h,t)\, D_1(x) + F_{11}^{(0)}(x,t)\, R_2(h) + [1 - H_2(h)][1 - D_2(h)]\,\big[F_{01}^{(0)}(x+h, t) - F_{01}^{(0)}(h, t)\big].$$

This is a simple application of the well-known formula of total probabilities. Now, we can divide both sides of the above equation by h and take the limit when h → 0; so we obtain the first equation. The other equations can be obtained by the same physical arguments.

4.1 Remark

It seems that these equations may hold for arbitrary distributions of the lifetimes and renewal times.

5 RESOLUTION OF THE SYSTEM IN STATIONARY REGIME

Consider the network in stationary regime, when the following limits exist:

$$F_{ij}^{k}(x) = \lim_{t\to\infty} F_{ij}^{k}(x,t), \qquad p_k = \lim_{t\to\infty} p_k(t), \qquad q_k = \lim_{t\to\infty} q_k(t).$$

Denote

$$\pi_{ij}^{k}(s) = \int_0^{\infty} e^{-sx}\, F_{ij}^{k}(x)\, dx, \qquad a_{ij}^{k} = \frac{dF_{ij}^{k}}{dx}(0), \qquad b_{ij}^{k} = F_{ij}^{k}(0).$$

Applying the Laplace transform to the system of section 4, we obtain

$$(s - \mu_2 - \alpha_2)\, \pi_{01}^{(0)}(s) - \beta_2\, \pi_{11}^{(0)}(s) = \frac{1}{s}\,\big[a_{01}^{(0)} - a_{01}^{(1)}\, d_1(s)\big],$$

$$(s - \beta_2)\, \pi_{11}^{(0)}(s) - \alpha_2\, \pi_{01}^{(0)}(s) = \frac{1}{s}\,\big[a_{11}^{(0)} - a_{11}^{(1)}\, d_1(s)\big],$$

$$\big(s - \beta_2 - \mu_2 (1 - g_1(s))\big)\, \pi_{01}^{(1)}(s) = \frac{1}{s}\,\big[a_{01}^{(1)} - a_{01}^{(0)}\, r_1(s) - \mu_2 b_{01}^{(0)}\, g_1(s) - \mu_1 b_{00}^{(1)}\, [g_1(s)]^{N-1}\big],$$

$$(s - \beta_2)\, \pi_{11}^{(1)}(s) + \alpha_2\, \pi_{01}^{(1)}(s) = \frac{1}{s}\,\big[a_{11}^{(1)} - a_{11}^{(0)}\, r_1(s) - \mu_1 b_{10}^{(1)}\, [g_1(s)]^{N-1}\big],$$

$$\pi_{00}^{(1)}(s) = \frac{1}{s}\,\big[p_0 + p_1\, r_1(s)\big]\, [g_1(s)]^{N-1},$$

$$\pi_{10}^{(1)}(s) = \frac{1}{s}\,\big[q_0 + q_1\, r_1(s)\big]\, [g_1(s)]^{N}.$$

From the two first equations above we get

$$\pi_{01}^{(0)}(s) = \frac{\big(a_{01}^{(0)} - a_{01}^{(1)} d_1(s)\big)(s - \beta_2) + \beta_2 \big(a_{11}^{(0)} - a_{11}^{(1)} d_1(s)\big)}{s\,\big[(s - \mu_2 - \alpha_2)(s - \beta_2) - \alpha_2 \beta_2\big]},$$

$$\pi_{11}^{(0)}(s) = \frac{\big(a_{11}^{(0)} - a_{11}^{(1)} d_1(s)\big)(s - \mu_2 - \alpha_2) + \alpha_2 \big(a_{01}^{(0)} - a_{01}^{(1)} d_1(s)\big)}{s\,\big[(s - \mu_2 - \alpha_2)(s - \beta_2) - \alpha_2 \beta_2\big]}.$$

Next, the third equation gives

$$\pi_{01}^{(1)}(s) = \frac{a_{01}^{(1)} - a_{01}^{(0)} r_1(s) - \mu_2 b_{01}^{(0)} g_1(s) - \mu_1 b_{00}^{(1)} [g_1(s)]^{N-1}}{s\,\big(s - \beta_2 - \mu_2 (1 - g_1(s))\big)},$$

$$\pi_{11}^{(1)}(s) = \frac{1}{s - \beta_2}\,\Big[\frac{1}{s}\,\big(a_{11}^{(1)} - a_{11}^{(0)} r_1(s) - \mu_1 b_{10}^{(1)} [g_1(s)]^{N-1}\big) - \alpha_2\, \pi_{01}^{(1)}(s)\Big].$$

In these formulas, the constants a_ij^(k) are still unknown. Note first that the equation


$$(s - \mu_2 - \alpha_2)(s - \beta_2) - \alpha_2 \beta_2 = 0$$

has a positive root

$$s_0 = \frac{1}{2}\Big[\mu_2 + \alpha_2 + \beta_2 + \sqrt{(\mu_2 + \alpha_2 - \beta_2)^2 + 4\,\alpha_2 \beta_2}\Big].$$

Since the functions π01^(0)(s) and π11^(0)(s) are analytical functions in the subplane Re(s) > 0, the numerator must also vanish wherever the denominator vanishes. So the unknown constants a_ij^k satisfy

$$\big(a_{01}^{(0)} - a_{01}^{(1)} d_1(s_0)\big)(s_0 - \beta_2) + \beta_2 \big(a_{11}^{(0)} - a_{11}^{(1)} d_1(s_0)\big) = 0,$$

$$\big(a_{11}^{(0)} - a_{11}^{(1)} d_1(s_0)\big)(s_0 - \mu_2 - \alpha_2) + \alpha_2 \big(a_{01}^{(0)} - a_{01}^{(1)} d_1(s_0)\big) = 0.$$

Similarly, since for s = β2 the denominator of the function π11^(1)(s) equals zero, we have

$$\mu_2\big(1 - g_1(\beta_2)\big)\,\big(a_{11}^{(1)} - a_{11}^{(0)} r_1(\beta_2) - \mu_1 b_{10}^{(1)} [g_1(\beta_2)]^{N-1}\big) + \alpha_2\,\big(a_{01}^{(1)} - a_{01}^{(0)} r_1(\beta_2) - \mu_2 b_{01}^{(0)} g_1(\beta_2) - \mu_1 b_{00}^{(1)} [g_1(\beta_2)]^{N-1}\big) = 0.$$

Next, consider the function

$$f(s) = s - \beta_2 - \mu_2\big(1 - g_1(s)\big).$$

It is not difficult to see that f(0) = −β2 < 0 and lim_{s→∞} f(s) = +∞. So the function f(s) has at least one root in the domain Re(s) > 0, say s1. Since π01^(1)(s) is analytic there, the numerator must vanish at s1, which yields a fourth equation:

$$a_{01}^{(1)} - a_{01}^{(0)} r_1(s_1) - \mu_2 b_{01}^{(0)} g_1(s_1) - \mu_1 b_{00}^{(1)} [g_1(s_1)]^{N-1} = 0.$$

Consequently, we have a system of four linear equations for the four unknown constants. Consider now this linear system of algebraic equations, which can be written in the form Ax = b, where

$$x = \big(a_{01}^{(0)},\, a_{11}^{(0)},\, a_{01}^{(1)},\, a_{11}^{(1)}\big)^{T}, \qquad b = (0,\, 0,\, b_1,\, b_2)^{T},$$

$$b_1 = \mu_2 b_{01}^{(0)} g_1(s_1) + \mu_1 b_{00}^{(1)} [g_1(s_1)]^{N-1},$$

$$b_2 = \mu_2\big(1 - g_1(\beta_2)\big)\,\mu_1 b_{10}^{(1)} [g_1(\beta_2)]^{N-1} + \alpha_2\big(\mu_2 b_{01}^{(0)} g_1(\beta_2) + \mu_1 b_{00}^{(1)} [g_1(\beta_2)]^{N-1}\big),$$

and the matrix A collects the coefficients of the four equations above:

$$A = \begin{pmatrix} s_0 - \beta_2 & \beta_2 & -(s_0 - \beta_2)\, d_1(s_0) & -\beta_2\, d_1(s_0) \\ \alpha_2 & s_0 - \mu_2 - \alpha_2 & -\alpha_2\, d_1(s_0) & -(s_0 - \mu_2 - \alpha_2)\, d_1(s_0) \\ -r_1(s_1) & 0 & 1 & 0 \\ -\alpha_2\, r_1(\beta_2) & -\mu_2\big(1 - g_1(\beta_2)\big)\, r_1(\beta_2) & \alpha_2 & \mu_2\big(1 - g_1(\beta_2)\big) \end{pmatrix}.$$

6 VIRTUAL WAITING TIME

Now we are able to derive the distribution of the virtual waiting time of an arbitrary request in the station S1. Denote by ω(t) the virtual waiting time of such a request, i.e. the waiting time of a request which would arrive at time t. It is the period between the time t and the departure of all requests arrived before t. If the server is available and free of requests, then ω(t) = 0.

Also denote by F(x) = lim_{t→∞} P{ω(t) < x} the limiting probability distribution of the virtual waiting time and by

$$\bar{\omega}(s) = \lim_{t\to\infty} \int_{0}^{\infty} e^{-sx}\, dP\{\omega(t) < x\}$$

its Laplace-Stieltjes transform.

The structure of such a stochastic process is the following. Let t1 < t2 < t3 < ... be the instants of requests of pure service (regular ones) and/or impure service (renewal of component failures, virus elimination, etc.). Then for tn ≤ t < tn+1 the process {ω(t)} can be defined as

$$\omega(t) = \begin{cases} 0, & \text{if } \omega(t_n) \le t - t_n, \\ \omega(t_n) - (t - t_n), & \text{if } \omega(t_n) > t - t_n. \end{cases}$$

For t = tn we have ω(tn) = ω(tn − 0) + σn, where σn is the service time of a regular customer and/or the renewal period of an interruption (due to a physical breakdown or a computer attack) which had occurred at time tn. Moreover, we assume the initial condition ω(0) = 0.

The process {ω(t)} has stepwise linearly decreasing paths, as shown in Figure 2. In the previous sections we have derived the joint distribution {F_ij^(k)(x)} of the server states in S1, S2 and of the variable ω(t) in stationary regime.
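The sawtooth dynamics just described (a jump of size sigma_n at each epoch t_n, slope −1 in between, reflection at zero) are straightforward to simulate. The sketch below uses, purely for illustration, Poisson epochs with rate lam and exponential jumps with rate mu, a special case of the general construction for which the zero-level probability F(0+) is known to equal 1 − lam/mu:

```python
import random

def simulate_vwt(lam, mu, horizon, rng):
    """Virtual waiting time with Poisson(lam) epochs and Exp(mu) jumps.

    Between epochs the path decreases linearly at slope -1;
    returns the long-run fraction of time the path spends at zero.
    """
    t, w, idle = 0.0, 0.0, 0.0
    while t < horizon:
        gap = rng.expovariate(lam)                   # time to the next epoch
        idle += max(gap - w, 0.0)                    # time spent at level zero
        w = max(w - gap, 0.0) + rng.expovariate(mu)  # drain, then jump sigma_n
        t += gap
    return idle / t

rng = random.Random(7)
lam, mu = 0.5, 1.0
p_zero = simulate_vwt(lam, mu, horizon=1_000_000.0, rng=rng)
# For this illustrative special case, F(0+) = 1 - lam/mu.
assert abs(p_zero - (1.0 - lam / mu)) < 5e-3
```

The long-run fraction of time the path spends at zero estimates the atom F(0+), i.e. the probability that an arriving request finds the central node free.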


Figure 2. Sample path of the stochastic process {ω(t)}.

Now, the Laplace-Stieltjes transform ω̄(s) of the virtual waiting time can be expressed through these functions as follows:

$$\bar{\omega}(s) = s \sum_{i=0}^{1} \sum_{j=0}^{1} \pi_{ij}^{(0)}(s) + \lim_{s' \to 0} s' \big(\pi_{01}^{(1)}(s') + \pi_{11}^{(1)}(s')\big). \qquad (1)$$

After tedious algebra, we obtain the following explicit expression for the above-defined function:

$$\bar{\omega}(s) = \frac{1}{\beta_2}\Big[a_{01}^{(0)} + a_{11}^{(0)} - a_{01}^{(1)} - a_{11}^{(1)} + \alpha_2\big(a_{01}^{(0)} - a_{01}^{(1)}\big) - \big(a_{11}^{(0)} - a_{11}^{(1)}\big)\big(\mu_2 + \alpha_2\big)\Big]$$
$$\quad + \frac{1}{s - \beta_2}\Big(a_{11}^{(1)} - a_{11}^{(0)}\, r_1(s) - \mu_1 b_{10}^{(1)}\, [g_1(s)]^{N-1}\Big) + [g_1(s)]^{N}\Big[p_0 + q_0 + (p_1 + q_1)\, r_1(s)\Big]$$
$$\quad + \frac{a_{01}^{(1)} - a_{01}^{(0)}\, r_1(s) - \mu_2 b_{01}^{(0)}\, g_1(s) - \mu_1 b_{00}^{(1)}\, [g_1(s)]^{N-1}}{s - \beta_2 - \mu_2\big(1 - g_1(s)\big)}. \qquad (2)$$

In the above computations we need to take into account the normalization condition.

7 APPLICATIONS AND NUMERICAL ILLUSTRATIONS

In this section, we give an application of the above results with some numerical illustrations.

Figure 3. Utilization of the central node.

Figure 4. Effect of ν on the utilization U.

An interesting performance metric is the utilization of the central node (base station) S1 against the rest of the network, U = 1 − F(0+). This metric is plotted as a function of N in Figure 3 for the following cases:

1. α1 = 0.5: short dashed line;
2. α1 = 2: long dashed line;
3. α1 = 10: gray level line;

where α1 is the breakdown rate in the central node S1. For this experiment, we set μ1 = 1, μ2 = 2, α2 = 1, β1 = 1, β2 = 2. We see how the utilization of the central node increases while N increases and α2 decreases.

Denote by mi the total sojourn time of a customer in the node Si (i = 1, 2). Figure 4 shows the effect of the ratio ν = m1/m2 on the utilization U for different values of N:

1. N = 10: short dashed line;
2. N = 5: long dashed line;
3. N = 15: gray level line.

For this experiment we take the same numerical values, while the breakdown rate in the central


node is fixed as α1 = 0.5. Here, we see in Figure 4 that increasing ν decreases the utilization when N decreases.

8 CONCLUSION

In this work, we have provided a method for finding some performance metrics, such as the utilization of the central node (base station), in some modern networks such as WSNs or database systems.

ACKNOWLEDGMENT

The author would like to thank the anonymous referees for pertinent and helpful comments that allowed improving the quality of the paper.

REFERENCES

Aissani, A. & Artalejo, J. (1998). On the single server retrial queue subject to breakdowns. Queueing Systems: Theory and Applications 30, 309–321.
Boucherie, R. & Dijk, N. (2010). Queueing Networks: A Fundamental Approach. Berlin: Springer.
Demirkol, I., Ersoy, C., Alagöz, F., & Deliç, H. (2009). The impact of a realistic packet traffic model on the performance of surveillance wireless sensor networks. Computer Networks 53, 382–399.
Gaver, D. (1962). A waiting line with interrupted services, including priorities. J. Roy. Stat. Soc. B24(1), 73–90.
Gnedenko, B. & Kovalenko, I. (1989). Introduction to Queueing Theory. London: Birkhäuser.
Medvediev, G. (1978). Closed queueing networks and their optimization. Cybernetics 6, 65–73.
Osman, R. & Knottenbelt, W. (2012). Database system performance evaluation models: A survey. Performance Evaluation 69, 471–493.
Phung-Duc, T. (2012). An explicit solution for a tandem queue with retrials and losses. Opsearch 12(2), 189–207.
Qiu, T., Feng, L., Xia, F., Wu, G., & Zhou, Y. (2011). A packet buffer evaluation method exploiting queueing theory for wireless sensor networks. ComSIS 8(4), 1027–1049.
Qiu, T., Xia, F., Feng, L., Wu, G., & Jin, B. (2011). Queueing theory-based path delay analysis of wireless sensor networks. Adv. in Electr. and Comput. Engng. 11(2), 3–8.
Senouci, M., Mellouk, A., & Aissani, A. (2012). Performance evaluation of network lifetime spatial-temporal distribution for WSN routing protocols. J. Network and Computer Applications 35(4), 1317–1328.
Senouci, M., Mellouk, A., & Aissani, A. (2014). Random deployment of wireless sensor networks: A survey and approach. Internat. J. Ad Hoc and Ubiquitous Computing 15(1–3), 133–146.
Senouci, M., Mellouk, A., Oukhellou, L., & Aissani, A. (2015). WSNs deployment framework based on the theory of belief functions. Computer Networks 88(6), 12–26.
Taleb, S. & Aissani, A. (2010). Unreliable M/G/1 retrial queue: monotonicity and comparability. Queueing Systems: Theory and Applications 64, 227–252.
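The central claim of section 7, that the utilization U of the central node grows with the population N, can also be reproduced with a compact numerical sketch. The model below is a deliberately simplified, fully exponential variant (service rates mu1, mu2; breakdown rate a1 and renewal rate b1 at S1 only, with breakdowns of S2 ignored), so that the pair (queue length at S1, state of its server) forms a finite Markov chain; its stationary distribution is obtained by a direct linear solve, and U is taken as 1 − P{S1 empty and up}. All rates are illustrative placeholders, not the paper's calibrated values:

```python
def utilization(N, mu1, mu2, a1, b1):
    # State (n, w): n requests at S1 (0..N), w = 1 if its server is up.
    idx = {(n, w): 2 * n + w for n in range(N + 1) for w in (0, 1)}
    m = 2 * (N + 1)
    Q = [[0.0] * m for _ in range(m)]

    def add(src, dst, rate):
        Q[idx[src]][idx[dst]] += rate
        Q[idx[src]][idx[src]] -= rate

    for n in range(N + 1):
        for w in (0, 1):
            if w == 1 and n >= 1:
                add((n, 1), (n - 1, 1), mu1)   # service completion at S1
            if n < N:
                add((n, w), (n + 1, w), mu2)   # arrival from S2
            if w == 1:
                add((n, 1), (n, 0), a1)        # breakdown of S1's server
            else:
                add((n, 0), (n, 1), b1)        # renewal of S1's server

    # Solve pi Q = 0 with sum(pi) = 1: transpose Q, replace one equation.
    A = [[Q[j][i] for j in range(m)] for i in range(m)]
    A[m - 1] = [1.0] * m
    rhs = [0.0] * (m - 1) + [1.0]
    for col in range(m):                        # Gaussian elimination
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        rhs[col], rhs[piv] = rhs[piv], rhs[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= f * A[col][c]
            rhs[r] -= f * rhs[col]
    pi = [0.0] * m
    for r in range(m - 1, -1, -1):
        s = rhs[r] - sum(A[r][c] * pi[c] for c in range(r + 1, m))
        pi[r] = s / A[r][r]
    return 1.0 - pi[idx[(0, 1)]]   # U = 1 - P{S1 empty and up}

U = {n: utilization(n, 1.0, 2.0, 0.5, 1.0) for n in (2, 5, 10)}
assert 0.0 < U[2] < U[5] < U[10] < 1.0
```

The monotone growth of U with N mirrors the behaviour of the Figure 3 curves, even though this sketch omits the general lifetime and renewal distributions of the full model.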


Risk and hazard analysis

Risk assessment of biogas plants

K. Derychova & A. Bernatik
Faculty of Safety Engineering, VSB–Technical University of Ostrava, Ostrava-Vyskovice, Czech Republic

ABSTRACT: Biogas represents an alternative source of energy with versatile utilization. This article summarizes information about the production and properties of biogas, storage possibilities and utilization of biogas. For the assessment of the risks of biogas, major-accident scenarios were established. These scenarios were complemented by an analysis of the hazardous manifestations of biogas, namely fire and explosion, because biogas is formed mainly of methane, which is a highly flammable and explosive gas. For the analysis of methane, the Fire & Explosion Index and the modelling program ALOHA were used.

1 INTRODUCTION

Due to the limited capacity of fossil fuels, there is a switch to alternative sources of energy. Among these sources is biogas, which belongs to the gaseous renewable fuels. In Europe there are currently over 14 thousand biogas plants; most of them are in Germany, Italy, Switzerland and France. The Czech Republic is in fifth place in the number of biogas plants, as can be seen in Figure 1 (EBA 2015). In the Czech Republic there were about 554 biogas plants at the end of 2015.

Biogas is nothing new; its history dates back to the late 19th century. However, its production in the past and nowadays is significantly different. Therefore, it can be said that anaerobic fermentation is a newly developing and promising technology. Biogas is basically a mixture of gases: among the major components belong methane and carbon dioxide, and the minor components are formed by hydrogen sulfide, water, hydrogen, nitrogen, ammonia, oxygen and optionally other substances. The representation of the individual components in the mixture and their amounts vary depending on the raw materials and the technological process (Derychova 2015).

Figure 1. Overview of the number of biogas plants in Europe, according to the EBA (EBA 2015).

2 BIOGAS – CLEAN ENERGY

Biogas is a widespread term for gas produced by anaerobic fermentation. Anaerobic fermentation takes place in the natural environment (e.g. in


wetlands, in the digestive tract of ruminants), in an agricultural environment (rice fields, the dunghill), in waste management, further on landfill sites (landfill gas), at sewage treatment plants and biogas plants (Straka 2010). In the European legislation (2009/28/EC), biogas is formulated as a fuel gas produced from biomass and/or from the biodegradable fraction of waste, that can be purified to natural gas quality, to be used as biofuel, or wood gas. According to (Kara 2007), the concept of biogas can be used for all kinds of gas mixtures produced by the activity of micro-organisms, e.g. below ground, in the digestive tract of animals, in landfill sites, lagoons or controlled anaerobic reactors. However, in technical practice, biogas is presented as the gaseous mixture produced by the anaerobic fermentation of wet organic substances in technical equipment (reactors, digesters, etc.).

Biogas production is a very complicated biochemical process, in which mixed cultures of organisms decompose organic material in the absence of air. Anaerobic fermentation takes place in four consecutive phases (hydrolysis, acidogenesis, acetogenesis and methanogenesis), where the last phase produces methane and carbon dioxide (Juchelkova et al. 2010, Rutz et al. 2012, Schulz et al. 2001). The process of biogas production takes place at a particular operating temperature (according to the type of bacteria: psychrophilic, mesophilic or thermophilic), at a pH from 6.5 to 7.5, and for a specific time (according to the type of bacteria, from 10 to 120 days) (Derychova 2014, Rutz et al. 2008). The outcome of this process is biogas and a digestate, which is a good-quality fertilizer (Schulz et al. 2001). Suitable materials for the production of biogas are substances of biological origin, such as plant biomass, animal biomass, organic by-products and organic waste (Kara 2007).

The composition of biogas is variable and depends on the raw materials supplied to the process; this is confirmed by the authors (Rasi et al. 2007). Ideally, biogas contains only two major gases, methane and carbon dioxide. However, raw biogas includes other minor gases, e.g. hydrogen sulfide, nitrogen, oxygen, water vapor, hydrogen, ammonia and siloxanes (Jönsson et al. 2003, Kara 2007). A comparison of the chemical composition of various biogases is shown in Table 1. The proportional representation of the two main components of biogas (methane, carbon dioxide), but also of the minor components, differs depending on the origin of the biogas and on the composition of the starting substrate.

The concentration of methane in the biogas is not constant, and it may change the density of the entire gas mixture. If the methane concentration falls below 60%, biogas becomes heavier than air and may accumulate in depressions at landfills and in reactor vessels. The presence of minor components in biogas can indicate the presence of some chemical elements in the material or a malfunction during the fermentation (Kara 2007, Straka 2010).

Storage tanks are built for the accumulation of biogas to reduce disparities between production and consumption. The daily cycle of gas consumption can vary independently. Biogas can be stored for long periods of time and then used without loss. Gas tanks can be divided according to material, function and arrangement. The publication (Schulz et al. 2001) divides biogas tanks into tanks designed as low-pressure, medium-pressure and high-pressure reservoirs. The characteristics of these reservoirs are shown in Table 2. According to the time of storage, the publication (Krich et al. 2005) distributes gas tanks

Table 1. Approximate composition of the biogas (Jönsson et al. 2003).

Component                  | Chemical formula | Agricultural biogas plant | Waste water treatment plant | Landfill plant
Methane [vol.%]            | CH4              | 60–70                     | 55–65                       | 45–55
Carbon dioxide [vol.%]     | CO2              | 30–40                     | balance                     | 30–40
Nitrogen [vol.%]           | N2               | <1                        | <1                          | 5–15
Hydrogen sulfide [ppm]     | H2S              | 10–2,000                  | 10–40                       | 50–300

Table 2. Design of biogas tanks (Schulz et al. 2001).

Pressure level  | Operation pressure | Bulk           | Storage facilities
Low pressure    | 20–50 mbar         | 50–200 m3      | gas tank with water seal
Low pressure    | 0.05–0.5 mbar      | 10–2,000 m3    | gas tank with foil cover
Medium pressure | 5–20 bar           | 1–100 m3       | steel storage tank
High pressure   | 200–300 bar        | 0.1–0.5 m3     | steel cylinder
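The 60% rule of thumb quoted above can be checked with a one-line mixture model: treat biogas as a CH4/CO2 binary mixture and interpolate the component densities by volume fraction. The density values below are approximate figures for 0 °C and atmospheric pressure and are an assumption of this sketch, so the exact crossover shifts with temperature and trace components:

```python
RHO_CH4, RHO_CO2, RHO_AIR = 0.72, 1.98, 1.29  # kg/m3, approx. at 0 degC, 1 atm

def biogas_density(methane_frac):
    # Ideal binary mixture of CH4 and CO2 by volume fraction.
    return methane_frac * RHO_CH4 + (1.0 - methane_frac) * RHO_CO2

# Bisection for the methane fraction at which biogas matches air density.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if biogas_density(mid) > RHO_AIR:   # still heavier than air
        lo = mid
    else:
        hi = mid
crossover = 0.5 * (lo + hi)

assert biogas_density(0.65) < RHO_AIR < biogas_density(0.45)
assert 0.50 < crossover < 0.60   # consistent with the ~60 % figure in the text
```

With these numbers the balance point comes out near 55% methane: methane-rich biogas rises and disperses, while CO2-rich biogas sinks and can accumulate in depressions, as described above.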


Figure 2. Various types of biogas tanks.

for short- and long-term storage. Various types of biogas tanks are shown in Figure 2.

The utilization of biogas energy is versatile. Biogas can be used to produce heat, cooling and electricity, and it can further be used for cogeneration and trigeneration (the latter is not often used). For the use of biogas in transport and for distribution into the natural gas grid, the biogas must first be upgraded; Petersson et al. (2009) present various types of biogas purification. Biogas plants use about 20-40% of the produced heat to heat up the digesters (process heat); the remaining 60-80% is so-called waste heat, which can further be used for additional electricity production (Rutz et al. 2012).

Most often, biogas is converted directly into electricity or heat in cogeneration units at the biogas plant. Surplus electricity can be delivered into the electric power grid, and heat or gas can be distributed through pipelines or transported by road.

2.1 Properties of biogas

Biogas is composed of majority and minority gases; the approximate composition of biogas is shown in Table 3. The characteristics of biogas depend on its methane content, and the physical and chemical properties of biogas depend on the feedstock and the process parameters (Kara 2007, Straka 2010).

Table 3. Representation of the gases in biogas and biomethane (Derychova 2014, Petersson et al. 2009).

Component           Raw biogas     Upgraded biogas
Methane             40-75 vol.%    95-99 vol.%
Carbon dioxide      25-55 vol.%    5 vol.%
Water vapor         0-10 vol.%     -
Nitrogen            0-5 vol.%      2 vol.%
Oxygen              0-2 vol.%      0.5 vol.%
Hydrogen            0-1 vol.%      0.1 vol.%
Ammonia             0-1 vol.%      3 mg/m3
Hydrogen sulfide    0-1 vol.%      5 mg/m3

It is generally known that biogas is unbreathable. The density of biogas is approximately 1.2 kg/m3; it is slightly lighter than air, which means that biogas rises rapidly and mixes with air (Derychova 2015). Biogas is a flammable gas which is, under certain conditions, also explosive. The conditions for an explosion, all of which must be met, are a concentration of the explosive gas between the lower and upper explosive limits (biogas consisting of 65% methane and 35% carbon dioxide has an explosive range of 6-12 vol.%), the presence of the explosive mixture in an enclosed space, and reaching the ignition temperature (620-700 °C for biogas). Below the lower explosion limit the biogas does not ignite, and above the upper explosion limit biogas can only burn with a flame (Schulz et al. 2001). The critical pressure of biogas is in the range of 7.5-8.9 MPa and the critical temperature is -82.5 °C. Biogas exhibits a very slow diffusion combustion; the maximum propagation speed of the flame in air is 0.25 m/s, because of the CO2 content (Rutz et al. 2012).

Biogas has the ability to separate into its components (thermodiffusion). It is therefore necessary to know the properties of the individual components of biogas, since each of these gases has its own characteristic physical-chemical properties. For example, carbon dioxide is heavier than air (1.53 kg/m3), which is why it sinks and adheres to the ground, whereas methane, which is lighter than air (0.55 kg/m3), rises into the atmosphere (Delsinne et al. 2010, Rutz et al. 2012).
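The density figures above can be cross-checked with a simple volume-weighted mixture estimate. The sketch below is an illustrative helper, not taken from the cited sources; it uses component densities at 0 °C and 101.325 kPa and assumes ideal-gas mixing:

```python
# Volume-weighted density estimate for a biogas mixture (illustrative only).
DENSITY = {"CH4": 0.717, "CO2": 1.977}  # component densities, kg/m3 at 0 degC
AIR = 1.293                             # density of air, kg/m3

def biogas_density(fractions):
    """Density of an ideal gas mixture from volume fractions (summing to 1)."""
    return sum(DENSITY[gas] * frac for gas, frac in fractions.items())

rho = biogas_density({"CH4": 0.65, "CO2": 0.35})
print(f"density = {rho:.2f} kg/m3, relative to air = {rho / AIR:.2f}")
```

For the 65% CH4 / 35% CO2 mixture used in the explosion-limit example above, this gives about 1.16 kg/m3, consistent with the quoted value of approximately 1.2 kg/m3 and with the statement that biogas is slightly lighter than air.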
3 HAZARDOUS MANIFESTATION OF BIOGAS

The increasing number of biogas plants increases the risk of incidents occurring at some of these stations. This is proven by events that have occurred in recent times. In January 2013, a massive explosion with detonation occurred inside the biogas plant in Chotetov; the station was still under construction and had not yet been put into operation. In November 2013 in Chric, an employee was found in



Graph 1. Statistic of incidents in biogas stations in the Czech Republic and Europe.

Figure 3. Scheme of scenarios for accidents caused by biogas (Derychova, 2015).

the shaft; he had most likely been intoxicated by methane. Accidents with casualties also occur abroad: in Germany in 2009, a biogas plant exploded, one worker was killed and two others were injured (Derychova 2015). Graph 1 compares the number of events at biogas stations in the Czech Republic and in Europe from 1995 to 2013 (Casson et al. 2015, Ministry of the Interior 2014). From the graph, the events that occurred at biogas stations in the given period can be read: incidents such as leaks, fires, explosions and other events (of unknown cause).

The possible scenarios for accidents caused by biogas are identified in the diagram in Figure 3 (Derychova 2015). The possible consequences of accidents caused by biogas are heat radiation in the case of fire, a blast (shock) wave with flying fragments in the case of explosion, and the toxic effects of gases scattered into the atmosphere (Derychova 2015).

4 RISK ANALYSIS OF BIOGAS PLANTS

The safety of biogas plants has to focus on the most commonly occurring risks, which are explosion (fire), leakage (poisoning, suffocation) and environmental pollution. As was mentioned, the composition of biogas depends on the type of biogas



plant, on the technological process of production and on the input feedstock (Derychova 2014).

Table 4. Characteristics of methane.

Parameter                        Value
CAS                              74-82-8
Molecular weight                 16 g/mol
Ignition temperature             595 °C
Flash point                      -188 °C
Minimum ignition energy          0.29 mJ
Temperature class                T1
Lower explosion limit            4.4 vol.%
Upper explosion limit            17 vol.%
Explosion group                  II A
Classification (1272/2008/ES)    H280, H220

Fire and explosion hazards related to the production, usage, storage and transport of biogas can cause considerable material damage and have an impact on the lives and health of people. Biogas is a mixture of flammable gases, consisting primarily of methane. The risk of fire and explosion is particularly high close to the digesters and gas reservoirs. Methane is an extremely flammable and non-toxic gas which is lighter than air. Its explosion limits are 5-15 vol.% and its autoignition temperature is 595 °C. A mixture of methane and air can explode; the mixture can be ignited by an electric spark or an open flame. Exposure to a high concentration of methane, even for a short duration, can lead to asphyxia due to lack of oxygen (Derychova 2015). The fire and explosion characteristics of methane are summarized in Table 4.

4.1 Fire and Explosion Index (FEI)

The FEI is an index method for assessing the risk of fire and explosion. The method is a tool for revealing the locations with the greatest potential losses and enables us to predict the extent of damage to equipment (Dow's F&EI, 1994).

The index was first determined for the bioreactor, where biogas is produced by microorganisms from varied feedstock. Biogas is formed mainly of methane, which is why methane was used for the calculation of the FEI. The material factor for methane is given in the table in the annex of the manual and is determined to be 21. The other factors were chosen with consideration of the physical-chemical parameters of methane and of the production process. The calculated FEI value was 57.33; according to the table in the manual, the bioreactor therefore falls into the light degree of hazard, whose range is from 1 to 60. A further calculation for the bioreactor was the radius of the affected area: the radius R was calculated to be 14.68 m. To this radius the radius of the considered bioreactor (r = 6.75 m) has to be added, which leads to an actual affected area of 1,442.76 m2.

The second index was determined for the biogas tank. The biogas tank stores biogas, whose main component is methane, so methane was again used for the calculation of the FEI. The material factor for methane is 21, as in the previous case; only the values of the other factors differed because of the different conditions during storage. The FEI calculated for the biogas tank was 25.2; it is lower than the FEI for the bioreactor and also belongs to the light degree of hazard. The calculated radius of the affected area R was 6.45 m, and adding the radius of the considered biogas tank (r = 7 m) leads to an actual affected area of 568.32 m2.
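The affected-area figures above follow from treating the affected area as a circle whose radius is the calculated radius of exposure R plus the radius r of the equipment itself. A minimal sketch reproducing the quoted values (the formula is inferred from the numbers in the text, not quoted from the Dow manual):

```python
import math

def affected_area(R, r):
    """Affected area (m2) as a circle of radius (R + r):
    radius of exposure R plus the radius r of the equipment."""
    return math.pi * (R + r) ** 2

print(round(affected_area(14.68, 6.75), 2))  # bioreactor: 1442.76 m2
print(round(affected_area(6.45, 7.0), 2))    # biogas tank: 568.32 m2
```

Both results match the 1,442.76 m2 and 568.32 m2 reported in Section 4.1.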

4.2 ALOHA

ALOHA (Areal Locations of Hazardous Atmospheres) is a software modelling program for the Windows operating system. It was developed in the 1980s by the US Environmental Protection Agency. The program allows the simulation of leakages of dangerous substances and the subsequent modelling of their effects in graphical form. The simulation involves the following steps: first, the site of the potential accident (leakage) is entered; then the type of hazardous substance is selected; and on the basis of the geographical and climatic conditions, the manner and type of release to the environment can be modelled. The outcome of the program is the radii of the threat zones in graphical form (Aloha, 2007).

4.2.1 Scenario 1

The first modelled situation was methane leakage from a stationary source (gas tank) through a short pipe with a diameter of 15 cm, with a pressure in the tank of 2 atm. The parameters of the tank and the atmospheric conditions are given in Table 5. The graphical output for the leaking tank, with the methane burning as a jet fire, can be seen in Figure 4. The potential hazards of this burning methane leakage are thermal radiation from the jet fire and downwind toxic effects of the fire by-products.

The burn duration is 8 minutes, the maximum flame length is 13 meters, the maximum burning rate is 274 kg/min and the total burned amount is 891 kg, which is 47% of the total amount of methane in the tank.

From the resulting graph it can be seen that the heat radiation is able to affect the surrounding area out to about 35 m. The lethal zone is the red dotted area, where the heat radiation is 10 kW/m2; it extends to a distance of 15 m from the source. The zone of 2nd degree burns is indicated by the orange thickly



Table 5. Parameters of the first modelled situation.

Bioreactor
Diameter             14 m
Length               9.78 m
Volume               1,500 m3
Mass                 1,995 kg

Atmospheric data
Wind                 2.1 m/s, SW
Cloud cover          partly cloudy
Ground roughness     urban
Stability class      E
Air temperature      22 °C
Relative humidity    50%

Table 6. Parameters of the second modelled situation.

Bioreactor
Diameter             13.5 m
Length               13.9 m
Volume               2,000 m3
Mass                 1,616 kg

Atmospheric data
Wind                 0.46 m/s, SW
Cloud cover          partly cloudy
Ground roughness     urban
Stability class      F
Air temperature      18 °C
Relative humidity    50%

Figure 4. Thermal radiation from jet fire.

dotted area with a heat flow of 5 kW/m2, and it extends to a distance of 22 m. The pain zone, at a distance of 35 m from the source, is colored yellow; the intensity of the heat flow there is 2 kW/m2.
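The three zone distances above scale roughly with the inverse square root of the threshold heat flux. The sketch below is a generic point-source estimate for illustration only, not ALOHA's internal jet-fire model; the radiant power Q is a hypothetical value (of the order of a ~10% radiant fraction of the peak burning rate times the heat of combustion of methane):

```python
import math

def zone_distance(q_kw_m2, radiant_power_kw):
    """Distance d at which a point source drops to the threshold flux q:
    q = Q / (4*pi*d**2)  ->  d = sqrt(Q / (4*pi*q))."""
    return math.sqrt(radiant_power_kw / (4 * math.pi * q_kw_m2))

Q = 28_000  # kW, hypothetical radiant output of the jet fire
for q in (10, 5, 2):  # lethal zone / 2nd-degree burns / pain zone thresholds
    print(f"{q} kW/m2 -> {zone_distance(q, Q):.0f} m")
```

With this assumed Q, the estimate gives roughly 15, 21 and 33 m, of the same order as the 15, 22 and 35 m zones reported by ALOHA above.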

4.2.2 Scenario 2

The second modelled situation was methane leakage from a stationary source (bioreactor) through a short pipe with a diameter of 8 inches. The parameters of the bioreactor and the atmospheric conditions are given in Table 6. The graphical output for the leaking tank, with the methane not burning as it escapes into the atmosphere, can be seen in Figure 5. The potential hazards of this methane leakage are downwind toxic effects, a vapor cloud flash fire, or overpressure from a vapor cloud explosion.

Figure 5. Flammable threat zone.

The graph shows two regions with the presence of a flammable vapor cloud in different concentrations. In the red dotted area, the concentration of methane is about 30,000 ppm; in the yellow area, the concentration of methane is 5,000 ppm. The explosion range of a mixture of methane and carbon dioxide in air is shown in Figure 6.

Figure 6. Range of explosion of a mixture of methane and carbon dioxide in air (Schroeder et al. 2014).



The danger of fire and explosion of biogas (flammable methane) was examined by the authors of the article (Schroeder et al. 2014). The authors point out the necessity of knowing the explosion limits of gases and gas mixtures mixed with air in order to prevent an explosion when handling biogas. They concluded that when Le Chatelier's equation is used for calculating the explosive limits of mixtures, the result, especially for the upper explosive limit, is wrong. Therefore, explosion diagrams were constructed with the help of measured data on the explosion limits of biogas, in order to determine the explosion limits of biogas exactly. The explosion diagram is shown in Figure 6.
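Le Chatelier's mixing rule referred to above can be sketched as follows. Note that the article's very point is that this rule misestimates the upper limit for biogas, so this is an illustration of the formula only: the 98/2 CH4/H2 split and the H2 limits are illustrative values, and the inert CO2 fraction is simply left out, which is a strong simplification:

```python
def le_chatelier(fractions, limits):
    """Le Chatelier mixing rule: L_mix = 1 / sum(y_i / L_i),
    where y_i are the mole fractions of the flammable components
    (renormalised so that they sum to 1) and L_i their limits in vol.%."""
    total = sum(fractions)
    return 1.0 / sum((f / total) / L for f, L in zip(fractions, limits))

# Flammable part of a raw biogas: 98% CH4, 2% H2 (illustrative values)
lel = le_chatelier([0.98, 0.02], [4.4, 4.0])    # lower limits of CH4, H2
uel = le_chatelier([0.98, 0.02], [17.0, 75.0])  # upper limits of CH4, H2
print(f"LEL = {lel:.1f} vol.%, UEL = {uel:.1f} vol.%")
```

A pure-methane input reproduces methane's own limits; as Schroeder et al. (2014) argue, measured data rather than this rule should be relied on near the upper explosive limit of biogas.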
5 CONCLUSION

Biogas production provides cheap energy from residues and wastes. The benefit of biogas is that it reduces the burden on the environment, and its advantage is its versatile utilization, depending on its purification. However, biogas plants and biogas production represent a certain hazard. Biogas is a dangerous flammable and toxic gas: its flammability is due to the presence of methane, while its toxicity is caused by the content of hydrogen sulfide or carbon dioxide. A further source of environmental risk is the feedstock placed in the reactors or stored for anaerobic fermentation (liquid manure, slurry).

The evaluation of the flammability and explosiveness of biogas plants was carried out with the FEI for two chosen scenarios: the production of biogas in the bioreactor and the storage of biogas in a tank. The final indexes were quite small and, according to the manual (Dow's F&EI, 1994), both the bioreactor and the gas tank fall into the light degree of hazard. The affected area has a value of 1,442.76 m2 in the case of the bioreactor, and the affected area for the gas tank has a value of 568.32 m2.

The analysis was complemented by modelling in the program ALOHA, in which two scenarios of methane leakage were modelled. The first scenario considered a leak from the tank through a pipeline, with the methane burning as a jet fire. For this leakage, ALOHA predicts an affected area reaching up to 35 m, containing a lethal zone, a zone of 2nd degree burns and a zone of injuries (pain). In the second scenario, methane leaked from the bioreactor through a leaking pipe; the leaking methane was not ignited and therefore it threatened the neighborhood with various concentrations of the gas up to 370 m from the bioreactor. These scenarios reveal possible leaks which may occur in biogas plants.

With an increasing number of biogas plants, the risk of accidents at these facilities is also increasing. The impact of such incidents will be mostly of a local character. As the examples above show, casualties may occur when operating biogas plants. For this reason, it is necessary to pay special attention to the safety of these facilities and, by appropriate technical and organizational measures, to reduce the risks in order to avoid damage to life, the environment and property.

ACKNOWLEDGEMENT

The article was prepared in the framework of the project SGS SP2016/115.

The article has been done in connection with the project Institute of Clean Technologies for Mining and Utilization of Raw Materials for Energy Use—Sustainability Program, identification code LO1406. The project is supported by the National Programme for Sustainability I (2013-2020), financed by the state budget of the Czech Republic.

REFERENCES

Aloha: User's Manual. 2007. Washington: US EPA, 195 p. <http://www.epa.gov/osweroe1/docs/cameo/ALOHAManual.pdf>.
Casson Moreno, V., Papasidero, S., Scarponi, G.M., Guglielmi, D. & Cozzani, V. 2015. Analysis of accidents in biogas production and upgrading. Renewable Energy, vol.
Delsinne, S. et al. 2010. Biogas Safety and Regulation: Workshop. Paris, 38 p.
Derychova, K. 2015. Biogas and scenarios of its major accidents. TRANSCOM 2015: 11th European Conference of Young Researchers and Scientists. Žilina, Slovak Republic: University of Žilina.
Derychova, K. 2014. Hazards associated with biogas. Safety, Reliability and Risks 2014: 11th International Conference of Young Researchers. Liberec: Technical University of Liberec.
Directive 2009/28/EC of 23 April 2009, published in the Official Journal of the European Union, L 140/16 of 5 June 2009, p. 16-62.
Dow's Fire & Explosion Index hazard classification guide. 1994. Seventh Ed. New York: American Institute of Chemical Engineers.
EBA Biogas Report. 2015. European Biogas Association.
Jönsson, O., Polman, E., Jensen, J.K., Eklund, R., Schyl, H. & Ivarsson, S. 2003. Sustainable gas enters the European gas distribution system. Danish Gas Technology Center.
Juchelkova, D. & Raclavska, H. 2010. Biogas. 9 p.
Kara, J. et al. 2007. Production and use of biogas in agriculture. Ed. 1. Praha: VÚZT. 120 p.
Krich, K. et al. 2005. Biomethane from Dairy Waste: A Sourcebook for the Production and Use of Renewable Natural Gas in California. 282 p.
Ministry of the Interior—Fire Rescue Service of the Czech Republic, 2004-2014. Statistical Yearbook.



Petersson, A. & Wellinger, A. 2009. Biogas upgrading technologies: Developments and innovations. IEA Bioenergy.
Rasi, S., Veijanen, A. & Rintala, J. 2007. Trace compounds of biogas from different biogas production plants. Energy.
Rutz, D., Al Seadi, T., Prassl, H., Köttner, M., Finsterwalder, S.V. & Janssen, R. 2008. Biogas handbook. Denmark: University of Southern Denmark Esbjerg.
Rutz, D., Ramanauskaite, R. & Janssen, R. 2012. Handbook on Sustainable Heat Use from Biogas Plants. Renewable Energies.
Schroeder, V., Schalau, B. & Molnarne, M. 2014. Explosion Protection in Biogas and Hybrid Power Plants. Procedia Engineering, Vol. 84, p. 259-272.
Schulz, H. & Eder, B. 2001. Biogas-Praxis: Grundlagen-Planung-Anlagenbau-Beispiele. 1. Ed. Staufen: Ökobuch. 167 p.
Straka, F. et al. 2010. Biogas: [Guide for Teaching, Design and Operation of Biogas Systems]. 3. ed. Praha: GAS. 305 p.



Applied Mathematics in Engineering and Reliability - Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

Verification of the design for forced smoke and heat removal from a sports hall

P. Kučera
VŠB-Technical University of Ostrava, Ostrava, Czech Republic

H. Dvorsk
OSH FM, Frýdek-Místek, Czech Republic

ABSTRACT: Many civil facilities commonly include active fire safety systems that help to create favourable conditions in the event of a fire. One of these active fire safety systems is equipment that removes smoke and heat. The article therefore focuses on a variant solution for forced fire ventilation in a specific sports hall and the use of mathematical modelling of a fire (Fire Dynamics Simulator) to verify the effectiveness of the designed forced fire ventilation system, including simulations of the logical consequences of the system under consideration.

1 INTRODUCTION

Covered sports halls, which are used not only for sports events but also for other purposes (e.g. concerts, exhibitions), attract a large number of people. For the sake of any eventual emergency, it is necessary to propose technical and organizational measures so that the design of such facilities minimizes the risk of panic situations. In the event of a fire, fire ventilation helps to remove combustible gases (products), smoke and heat, thus prolonging the time interval available to ensure the safe escape of persons.

Through mathematical modelling, this article aims to assess forced fire ventilation in sports halls. Regarding the design of the system for removing smoke and heat (hereinafter referred to as ZOKT), two options were chosen. The difference between these variants mainly consists in the number of fire fans, their performance, the division of the fire zone of the sports hall into smoke sections, and the decision whether or not to use smoke barriers.

2 DESCRIPTION OF THE OBJECT

For the model of the sports hall, built through mathematical modelling in order to verify two different methods of forced fire ventilation, we selected a characteristic sports facility whose parameters were chosen as a representative sample of facilities of a similar nature. These parameters were then used as input for the building's construction and the ZOKT design.

The multifunctional sports hall, which is used not only for sports but also for cultural events, has two usable aboveground floors and one usable underground floor. The third aboveground floor serves as a technical area and contains HVAC units. The back-of-house areas of the hall are located on the underground floor. The entrances to the hall for the spectators, restrooms, a refreshment seating area and souvenir shops are located on the first aboveground floor. VIP boxes and restaurants are located on the second aboveground floor.

For the purpose of verifying fire ventilation, we chose the main area of the sports hall, i.e. the playing surface with walkways and the seating area for spectators, which constitutes a separate fire compartment (Figure 1).

Figure 1. A view of the indoor sports hall facility (PyroSim program).

The vertical clearance of the assessed hall space is 22.5 m. The dimensions of the ice surface are



Figure 2. External layout of the sports hall (PyroSim program).

60 m × 30 m. The seating area reaches a height of 9.2 m above the ice surface. The maximum dimensions of the skin circumference above the seating area are 95 m × 65 m. The building is roofed with a steel framed structure. A rendering of the roof cladding and the hall's exterior is shown in Figure 2.

3 DESCRIPTION OF THE EQUIPMENT FOR FORCED SMOKE AND HEAT REMOVAL

According to the requirements for fire safety in construction, the sports hall must be fitted with equipment for removing smoke and heat. The structural system of the building is non-combustible; the structural components are of the DP1 type. The entire building is equipped with an electrical fire alarm system. Permanent fire extinguishing equipment is not considered.

The equipment for forced smoke and heat removal is designed with a forced gas outlet and a natural air intake.

The fire ventilation uses fire wall fans rated for operating temperatures of 200 °C for 120 min, i.e. fire resistance class F200 120.

The forced fire ventilation is designed as a forced self-acting ventilation system according to the requirements of (ČSN 73 0802, 2009) and (ČSN 73 0831, 2011) in connection with (ČSN EN 12101-5, 2008).

3.1 Option 1

For the purpose of fire ventilation, the hall's fire zone areas are divided into four smoke sections. The smoke barriers (partitions) separating the different smoke sections are building structures that meet the requirement of E 15 DP1, i.e. the D600 criterion for the properties of smoke barriers. The wall fans for forced fire ventilation are installed on the third aboveground floor. Details of the location of the fans and smoke barriers are shown in Figure 3.

Figure 3. Floorplan drawing of the ZOKT design, option 1.

The supply of fresh air is provided through inlets from the outside on the first underground floor. In the event of a fire, the air supply and the wall fans are activated automatically by the electrical fire alarm system. For the needs of the energy balance, we consider the possible simultaneous operation of two ventilated sections.

The occurrence of a fire is expected in only one smoke section; therefore, the calculation is performed for the representative smoke section no. 1.

In each smoke section, the suction power will be provided by twelve wall fans with fire resistance class F200 120 (200 °C/120 min.), Vo,l = 11.75 m3/s (= 42,300 m3/hour), Δp = 200 Pa. The E 15 DP1 smoke wall has a D600 30 fire resistance.

3.2 Option 2

For the purpose of smoke and heat removal, the hall areas consist of five smoke sections. The first four sections are designed identically to option number one. The fifth section is situated in the middle of the hall and forms a ring with a radius of about 13 m.

Assuming that the electrical fire alarm system activates the relevant group of ZOKT fans, the ZOKT design does not contain smoke barriers in the space under the roof structure.

Fire ventilation of sections nos. 1-4 is provided by fire wall fans installed on the third aboveground floor. Section no. 5 uses fire roof radial fans installed at the roof of the hall. Details of the location of the fans and smoke barriers are shown in Figure 4.

The supply of fresh air is provided through inlets from the outside on the first underground floor. In the event of a fire, the air supply and the wall fans, as well as the roof fans, are activated automatically by the electrical fire alarm system. For the needs of the energy balance, we consider the possible simultaneous operation of two ventilated sections.
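The per-fan flows quoted in Sections 3.1 and 3.2 convert between per-second and per-hour units as shown below; the section totals are simple bookkeeping from the fan counts (a sketch using the values above):

```python
def per_hour(m3_per_s):
    """Convert volumetric flow from m3/s to m3/h."""
    return m3_per_s * 3600

# Option 1: twelve wall fans of 11.75 m3/s in the activated smoke section
print(per_hour(11.75), "m3/h per fan;", 12 * 11.75, "m3/s per section")
# Option 2: five fans of 12.35 m3/s per smoke section
print(per_hour(12.35), "m3/h per fan;", 5 * 12.35, "m3/s per section")
```

The conversion confirms the quoted 42,300 and 44,460 m3/hour per-fan figures.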



Figure 4. Floorplan drawing of the ZOKT design, option 2.

The occurrence of a fire is expected in only one smoke section; therefore, the calculation is performed for the representative smoke section no. 1.

In smoke sections nos. 1-4, the suction power will be provided by five fire wall fans, each with fire resistance class F200 120 (200 °C/120 min.), Vo,l = 12.35 m3/s (= 44,460 m3/hour), Δp = 200 Pa. The E 15 DP1 smoke wall has a D600 30 fire resistance.

In smoke section no. 5, the suction power will be provided by five roof radial fire fans with fire resistance class F200 120 (200 °C/120 min.), Vo,l = 12.35 m3/s (= 44,460 m3/hour), Δp = 200 Pa.

4 BUILDING A MODEL OF THE MULTIPURPOSE SPORTS HALL FOR PERFORMING A MATHEMATICAL SIMULATION

For the mathematical simulation of the sports hall, we used the Fire Dynamics Simulator (FDS) program.

4.1 Model geometry

When compiling the geometry of the main hall space with the playing surface, seating area and walkways, constituting a separate fire zone, we made substantial modifications and simplifications regarding the choice of building materials. At the same time, we defined a large number of materials that mutually differ in their chemical composition, thermo-physical properties and spatial arrangement (Kučera, 2010). Another input variable that has a major impact on the fire simulation is the method of modelling material pyrolysis.

4.2 Definition of the design fire

Because the sports hall should also serve for cultural and similar purposes, the fire was designed for the worst-case scenario, in which the playing surface may contain stalls with a varied assortment of goods, 60 kg/m2 ((ČSN 73 0802, 2009), Annex A, Tab. A.1, paragraph 6.2.1b)).

In order to verify the fire ventilation and represent the amount of combustible gases, the fire was simplified and simulated using the model of free fire development presented in the methodological guideline for preparing fire-fighting documentation (Hanuška, 1996), with subsequent action taken by the fire brigades. The main time-dependent fire design parameters were the fire area (m2) and the heat release rate (kW/m2), determined in accordance with (ČSN EN 1991-1-2, 2004). The graph in Figure 5 shows the heat release rate per square meter (RHRf).

Figure 5. Graph of the heat release rate per square meter (RHRf).
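A design-fire heat release curve of the kind shown in Figure 5 can be sketched as a t-squared growth phase capped at RHRf times the fire area, in the spirit of EN 1991-1-2 Annex E; the growth time t_alpha and the areas below are illustrative assumptions, not the paper's actual inputs:

```python
def hrr_kw(t, t_alpha=300.0, rhr_f=250.0, fire_area_m2=100.0):
    """Heat release rate (kW) at time t (s): Q = 1000*(t/t_alpha)**2,
    capped at the plateau RHR_f (kW/m2) times the fire area (m2)."""
    q_growth = 1000.0 * (t / t_alpha) ** 2
    q_max = rhr_f * fire_area_m2
    return min(q_growth, q_max)

for t in (60, 300, 900, 1500):
    print(t, "s ->", hrr_kw(t), "kW")
```

The default t_alpha = 300 s corresponds to a medium fire growth rate in EN 1991-1-2 Annex E; 1,000 kW is reached at t = t_alpha.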
4.3 Fire detection and activation of the ZOKT

Although the two fire ventilation variants were designed differently, the activation of the fire fans and the logical consequences of the fire safety equipment are identical.

The sports hall is guarded by an electrical fire alarm system. The system reports fire using optical and acoustic alarms. It consists of automatic optical-smoke fire detectors with individual addressing, and call button points. These elements are connected to the electrical fire alarm system centre through circular lines. It is a two-stage alarm signalling system with day and night functions. Under the roof of the hall are suction devices consisting of a network of pipes, sucking in air samples from the protected space and bringing them to the laser detector.

For the purposes of the model, the function of the electrical fire alarm system was significantly reduced and the logical consequences of the system were simplified. Fire alarms work only using ceiling detectors which, in the model, are placed at a height of about 12 m above the floor of the



guarded area. In the FDS program, the optical-smoke fire detectors consisting of the pipe system are simulated using a beam detector (McGrattan, 2010). The doors in the escape routes are open and the air supply is assured throughout the fire simulation.

The model does not include the effects of the HVAC equipment. For the most realistic and reliable simulation, at most two sections are activated upon detection.

5 THE RESULTS OF THE EFFECTIVENESS OF THE FIRE VENTILATION DESIGNS

In both simulations, the crucial factors for making a comparison of the fire ventilation designs were the time required to evacuate persons from the hall area to the walkways (tu = 5.2 min ≈ 323 sec) and the time of filling the hall area with smoke (te = 11.49 min ≈ 690 sec), pursuant to (ČSN 73 0802, 2009). These values were determined from the original project and were fixed for both options.

5.1 The results of the first ZOKT design option

When simulating a fire on the playing surface, the detectors were first activated in section no. 1 and then in section no. 3. Additional triggering of fire fans was not possible due to the operational limits of the reserve source. The graph in Figure 6 shows the individual time intervals between the time of evacuation, the activation of the individual sections and the time of filling the space with smoke.

The time interval between the activation in section no. 1 and in section no. 3 is very narrow. The fire fans in section no. 1 and section no. 3 were activated after 457 seconds and 496 seconds, respectively. This demonstrates an early and intensive outflow of combustible gases outside the area under consideration.

After 60 seconds, the fire was in its beginning stages; after 323 seconds, the combustible gases had accumulated at a height of about 13 m above the floor of the hall, i.e. about 3.8 m above the floor of the highest walkway (the highest level of the seating area). In terms of the fire safety of the construction and for assessing the evacuation conditions, this height difference is safe. After 690 seconds, the smoke had already reached the lower border of the smoke barriers and gradually spread into the adjacent sections. This fact correlates with the calculation and confirms its accuracy. At the time of the expected intervention by the fire brigades, the hall was considerably filled with smoke; however, the performance and number of the fire fans are dimensioned to meet the conditions for evacuating persons, the conditions for an active fire-fighting intervention, and the reduction of the thermal stress on the building structures.

Figure 6. Graph of the time intervals in the first ZOKT design option.

5.2 The results of the second ZOKT design option

When simulating a fire on the playing surface, the detectors were first activated in section no. 5 and then in section no. 1. Additional triggering of fire fans was not possible due to the operational limits of the reserve source. The graph in Figure 7 again shows the individual time intervals between the time of evacuation, the activation of the individual sections and the time of filling the space with smoke.

The time interval between the activation of the central section no. 5 and of section no. 1 is fairly broad. The fire fans in section no. 5 and section no. 1 were started after 459 seconds and 690 seconds, the latter of which corresponds to the time of filling the considered space with smoke. It can therefore be concluded that the initial outflow of combustible gases

Figure 7. Graph of the time intervals in the second ZOKT design option.



through the fire fans in section no. 5, located directly in the dome of the hall, was timely; however, the prolonged time interval before starting the other section indicates the possibility of an increased speed of creation of an accumulated layer of smoke in the area under consideration.

After 60 seconds, the fire was in its beginning stages; after 323 seconds, the combustible gases were present over the entire surface of the assessed section due to their spreading. Smoke movement was not prevented by any barrier; this resulted in smoke accumulation just under the roof structure at a height of 16 m above the hall floor, which is about 6.8 m above the floor of the highest walkway (the highest level of the seating area). In terms of the fire safety of the construction and for assessing the evacuation conditions, this height difference is safe. After 690 seconds, the lower smoke layer reached the imaginary boundary of the original smoke barriers (12 m above the playing surface). By this time, however, the space was ventilated by only 5 ceiling fans. Due to the low intensity of the outflow of combustible gases, the space was considerably filled with smoke, and the combustible gases in particular spread into the other smoke sections. The layer of clean air and its height under the smoke boundary above the playing surface correlates with the calculation and confirms its accuracy, but the density of the smoke, its accumulation and spreading, and the thermal stresses on the building structures at that time were very troublesome. At the time of the intervention by the fire brigades, the space was already filled with so much smoke that the lower boundary of the smoke accumulation layer reached a height of about 5 m above the playing surface and reduced visibility. This confirmed the previous assumption of a very rapid process of filling the space with smoke due to the low power of the fire fans. This shortcoming resulted in a considerable restriction of active fire-fighting intervention and a higher thermal stress on the building structures.

6 A COMPARISON OF THE
ZOKT OPTIONS

6.1 Activation of fire fans


The results of the mathematical model clearly
show that fire fans in the first sections are in both
cases activated roughly at the same time interval,
approximately after 458 seconds (after 457 seconds
regarding section no. 1 in the first option and after
459 seconds regarding section no. 5 in the second
option ). In the first ZOKT design option, the fire
fans of the other section (no. 3) are activated after Figure 8. The accumulation of combustible gases
496 seconds, which is approximately at the same after 323 seconds (option 1 on the top, option 2 on the
time. Fire fans in both sections are started simul- bottom).



Figure 9. The accumulation of combustible gases after 900 seconds (option 1 on the top, option 2 on the bottom).

sufficient to ventilate the sports hall and allows the fire brigades to perform their fire-fighting activities. Regarding the second option, the functioning of the fire ventilation cannot be considered as satisfactory. Smoke in the space and strongly reduced visibility may limit fire-fighting activities.
This negative function of fire ventilation is caused by the missing installation of smoke barriers, the unsatisfactory calculated performance of the fire fans and their number in each given smoke section. Finally, the area of the sports hall is also thickly filled with smoke due to the wide time interval between starting individual sections in the second ZOKT design option.

Figure 10. Distribution of the representative temperature of combustible gases 40 °C (option 1 on the top, option 2 on the bottom).

6.3 Representative temperatures of the combustible gases in the area

In assessing temperatures, we created a simulation representing the lowest temperature of the combustible gases at the lower end of the smoke layer (40 °C). This representative temperature was chosen primarily because it is the temperature that negatively affects the human body. The simulation was assessed at the time after 900 seconds, when the accumulation of combustible gases was most intense. Figure 10 shows the level of this representative temperature. In the first ZOKT design option, the neutral plane occurs at a sufficient height. Temperatures below the neutral plane are lower and provide a sufficient area of clean air. In the second ZOKT design option, the neutral plane occurs nearly at the floor of the fire section. Above this plane, we can assume temperatures much higher than in the previous ZOKT design option; the thermal load acting on the building structures is therefore much stronger.

Figure 11. Distribution of the representative temperature of combustible gases 40 °C (option 1 on the top, option 2 on the bottom).

In order to have the clearest possible idea of achieving temperatures in the area, particularly above the level of the representative temperature



Table 1. The results of assessing the two different ZOKT designs for the sports hall.

Practical requirements ZOKT design option 1 ZOKT design option 2

Early activation of fire fans in the primary section YES YES


Early activation of fire fans in the secondary section YES NO
Optimum smoke density and smoke accumulation YES NO
Ensuring conditions for the evacuation of people YES YES
Ensuring conditions for fire-fighting activities YES NO
Sufficient dimensioning of the performance of fire fans YES NO
Quality location of fire fans NO YES

of 40 °C, we provided the following comparison images (see Figure 11). They demonstrate the distribution of temperatures in the area (i.e. temperatures of gases), in the middle of the sports hall and at the height (12 m above the playing surface) where the smoke barriers start to occur.
Figure 11 illustrates temperatures in the area along a longitudinal plane. Temperature distribution in other planes is comparable; therefore, only one demonstration was selected. Nevertheless, despite this one-sided view, the temperatures are very distinct. In the first ZOKT design option, the space temperature reaches 115 °C (in red). In the second ZOKT design option, the reached temperature is slightly higher, i.e. 120 °C.

7 DISCUSSION

Using the Fire Dynamics Simulator simulation model, we looked at two different ZOKT design options for the protected area of the sports hall. Table 1 presents the basic questions that were asked during the assessment of the two fire ventilation options.
The first ZOKT design option, where the guarded area is divided into four smoke sections using smoke barriers and where each section is ventilated by 12 fire fans, meets all the requirements of a proper project of equipment for smoke and heat removal. The only unsatisfactory aspect was the location of the 12 fire fans in the wall of the roof structure. The ZOKT system could achieve higher efficiency if the fire fans were installed in the ceiling of the roof structure, where the building structures are most thermally stressed. Such placement of fire fans is advisable to consult with experts through static drawings.
In the second ZOKT design option, the guarded area was divided into five smoke sections without using any smoke barriers; each section was ventilated by 5 fire fans; section 5 had a circular shape and its fire fans were installed in the ceiling of the roof structure, in the central highest point. Although this project meets the conditions of evacuation, it does not meet the conditions for effective fire-fighting activities. Thick smoke in the hall area at the time of the arrival of fire brigades would result in the slowing and hindering of such fire-fighting activities. It could also lead to changes in the material properties of steel structures due to their higher thermal stress.

8 CONCLUSION

Verification of the effectiveness of forced fire ventilation of a sports hall through mathematical modelling shows the first ZOKT design option to be optimistic, more practical, safer and more effective than the second ZOKT option.
Fire modelling is certainly a promising area, which will find its application in many practical situations and may also explain many ambiguities, especially in the case of interactions between fire safety systems. A combination of standard computational techniques and modelling seems to be the optimal approach that ultimately leads to financial savings and optimizations of project designs.

REFERENCES

ČSN 73 0802. 2009. Fire protection of buildings - Non-industrial buildings. Prague: Czech Office for Standards, Metrology and Testing, 122 p. (in Czech).
ČSN 73 0831. 2011. Fire protection of buildings - Assembly rooms. Prague: Czech Office for Standards, Metrology and Testing, 36 p. (in Czech).
ČSN EN 12101-1. 2006. Smoke and heat control systems - Part 1: Specification for smoke barriers. Prague: Czech Office for Standards, Metrology and Testing, 44 p. (in Czech).
ČSN EN 12101-3. 2003. Smoke and heat control systems - Part 3: Specification for powered smoke and heat exhaust ventilators. Prague: Czech Office for Standards, Metrology and Testing, 32 p. (in Czech).
ČSN EN 1991-1-2. 2004. Eurocode 1: Actions on structures - Part 1-2: General actions - Actions on structures



exposed to fire. Prague: Czech Office for Standards, Metrology and Testing, 56 p. (in Czech).
ČSN P CEN/TR 12101-5. 2008. Smoke and heat control systems - Part 5: Guidelines on functional recommendations and calculation methods for smoke and heat exhaust ventilation systems. Prague: Czech Office for Standards, Metrology and Testing, 100 p. (in Czech).
Hanuška, Z. 1996. Methodical instructions for preparing fire-fighting documentation (2nd Edition). Prague: Fire Rescue Service of the Czech Republic, FACOM, 78 p. (in Czech). ISBN 80-902121-0-7.
Kučera, P. & Pezdová, Z. 2010. Základy matematického modelování požárů. Association of Fire and Safety Engineering, 111 p. (in Czech). ISBN 978-80-7385-095-1.
McGrattan, K. et al. 2010. Fire Dynamics Simulator (Version 5) User's Guide. NIST Special Publication 1019-5, National Institute of Standards and Technology, Building and Fire Research Laboratory, Maryland, USA.



Stochastic reliability modelling, applications of stochastic processes

Applied Mathematics in Engineering and Reliability - Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

The methods of parametric synthesis on the basis


of acceptability region discrete approximation

Y. Katueva & D. Nazarov


Laboratory of Complex Systems Reliability Control, Institute of Automation and Control
Processes FEB RAS, Vladivostok, Russia

ABSTRACT: The methods of parametric synthesis (parameter sizing) for providing parametric reliability using acceptability region discrete approximation with a regular grid are discussed. These methods are based on parametric optimization using a deterministic criterion for the case of lack of information on parametric deviation trends and their distribution laws. A volume of a convex symmetrical figure inscribed into the acceptability region is considered as an objective function of this optimization task. The methods of inscribing these figures into a discrete approximation of the acceptability region, based on multidimensional implementations of the Moore and von Neumann neighbourhoods, are proposed in this work.
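The neighbourhood-based inscribing that the abstract describes can be sketched in a few lines. The following illustration is an editorial sketch, not the authors' implementation: it treats a discrete acceptability region as a boolean grid and computes, for every cell, the largest Moore neighbourhood (an r-cube, Chebyshev metric) or von Neumann neighbourhood (Manhattan metric) that stays inside the region, via a multi-source breadth-first search from the region's complement. The 7x7 toy region is a hypothetical example.

```python
from collections import deque
from itertools import product
import numpy as np

def max_inscribed_radius(inside, neighbourhood):
    """For each cell of a boolean grid 'inside', return the largest r such
    that the whole neighbourhood of radius r around the cell stays inside
    the region; boundary cells get r = 0, cells outside get r = -1."""
    padded = np.pad(inside, 1, constant_values=False)  # outside ring
    if neighbourhood == "von_neumann":   # Manhattan metric, r-neighbourhoods
        offsets = [d for d in product((-1, 0, 1), repeat=padded.ndim)
                   if sum(abs(s) for s in d) == 1]
    elif neighbourhood == "moore":       # Chebyshev metric, r-cubes
        offsets = [d for d in product((-1, 0, 1), repeat=padded.ndim)
                   if any(d)]
    else:
        raise ValueError(neighbourhood)
    # Multi-source BFS: every cell outside the region is a source at distance 0.
    dist = np.where(padded, -1, 0)
    queue = deque(map(tuple, np.argwhere(~padded)))
    while queue:
        cell = queue.popleft()
        for off in offsets:
            nb = tuple(c + o for c, o in zip(cell, off))
            if all(0 <= v < n for v, n in zip(nb, padded.shape)) and dist[nb] < 0:
                dist[nb] = dist[cell] + 1
                queue.append(nb)
    core = dist[(slice(1, -1),) * inside.ndim]
    return np.where(inside, core - 1, -1)  # radius = distance to outside - 1

# Toy 2-D region: a 7x7 square; the central cell admits r = 3 in both senses.
region = np.ones((7, 7), dtype=bool)
r_cube = max_inscribed_radius(region, "moore")
r_diamond = max_inscribed_radius(region, "von_neumann")
print(int(r_cube[3, 3]), int(r_diamond[3, 3]))  # -> 3 3
print(int(r_cube[0, 0]))                        # boundary cell -> 0
```

The cells attaining the maximal radius are the candidate optimal nominals in the sense of the paper's performance-reserve criterion; the two metrics correspond to the two figure shapes inscribed in the text.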

1 INTRODUCTION

The stage of parametric synthesis at analog engineering system design consists in the determination of nominal parameter values which yield system performances within their requirements. An optimal parametric synthesis procedure requires the account of parameter deviations under the influence of various factors, including manufacturing ones, to provide parametric reliability of an engineering system. Unfortunately, the information on probabilistic characteristics of parametric deviations, and consequent system parametric faults, is often unknown. Therefore, rational solutions on parameter sizing are required under the conditions of a lack of this information. For this purpose, the system should be designed robust to a maximum set of parametric deviations, focusing on the worst case. Thus, as an objective function of optimal parametric synthesis, it is proposed to use the system performance reserve (Abramov et al. 2007), (Abramov and Nazarov 2012). Geometric methods for finding nominal parameter values, which provide the maximum of the system's performance reserve, based on a discrete representation of an Acceptability Region (AR) with the use of a regular grid, are proposed.

2 THE PARAMETRIC SYNTHESIS PROBLEM

Suppose that we have a system which depends on a set of n parameters x = (x1, ..., xn)^T. We will say that the system is acceptable if y(x) satisfies the conditions (1):

y_min ≤ y(x) ≤ y_max,    (1)

where y(x), y_min and y_max are m-vectors of system responses (output parameters) and their specifications, e.g. y1(x) - average power, y2(x) - delay, y3(x) - gain.
Possible variations of internal parameters are defined by the conditions of components' physical realizability and their specifications. The constraints for these parameter variations represent a bounding box:

BT = {x ∈ R^n : x_i min ≤ x_i ≤ x_i max, i = 1, 2, ..., n}.    (2)

The inequalities (1) define a region Dx in the space of design parameters:

Dx = {x ∈ R^n : y_min ≤ y(x) ≤ y_max}.    (3)

Dx is called the Region of Acceptability (RA) for the system. Figure 1 illustrates such a region.
The engineering system parameters are subject to random variations (aging, wear, temperature variations) and the variations may be considered as stochastic processes:

X = X(x, t).    (4)

The stochastic processes of parameter variations X mean the random manufacturing realization of the system's components and their upcoming degradation. Therefore, the conditions (1) can be met only with a certain probability



P(x) = P(y_min ≤ y(X(t)) ≤ y_max, ∀t ∈ [0, T])    (5)

or

P(x) = P(X(x, t) ∈ Dx, ∀t ∈ [0, T]).    (6)

The probability (5), (6) is an estimation of the designed system reliability.
In general, the parametric optimization (optimal parametric synthesis) problem can be stated as follows. Given the characteristics of random processes X(t) of system parameter variations, a region of admissible deviation BT and a service time T, find such a deterministic vector of parameter ratings (nominals) x0 = (x1^0, ..., xn^0)^T that the reliability (5), (6) be maximized (Abramov, Katueva, & Nazarov 2007).
Unfortunately, both the degradation model (4) and the region Dx are unknown. The practical way in uncertainty conditions consists in replacing the original stochastic criterion with a certain deterministic one. One of them is a so-called minimal serviceability reserve.
The system performance reserve allows to estimate the distance of the system's parameter vector x from the AR boundary ∂Dx and, consequently, the parameter variation margins which keep the performances within their specifications (1). In this case, the optimal parametric synthesis problem is reduced to finding a point x which has maximum distance from the AR boundary (Abramov et al. 2007), (Abramov et al. 2008), (Abramov and Nazarov 2012), (Abramov and Nazarov 2015):

x* = arg max_{x ∈ Dx} dist(x, ∂Dx).    (7)

Figure 1. Region of acceptability Dx defined by system response functions.

The distance dist(x, ∂Dx) may be estimated as the shortest distance from point x to the AR boundary ∂Dx in every coordinate direction. The next practical algorithm of the criterion (7) calculation is based on the approximation of region Dx.
Let us note that the information about the Dx configuration and its boundary ∂Dx is unknown. Among various methods of AR approximation proposed (Xu et al. 2015), (Director et al. 1978), (Krishna and Director 1995), (Bernacki et al. 1989), (Grasso et al. 2009), in this paper we will use a discrete approximation, using a set of elementary hypercubes, based on a multidimensional probing method on a regular grid (Abramov and Nazarov 2012). In this work, methods for solving the optimal parametric synthesis problem (7) on the basis of a discrete representation of the AR using various objective functions are proposed.

3 THE ACCEPTABILITY REGION DISCRETE APPROXIMATION BASED ON A REGULAR GRID

A circumscribed box B0,

B0 = {x ∈ BT : a_i^0 ≤ x_i ≤ b_i^0, i = 1, 2, ..., n},
a_i^0 = min_{x ∈ Dx} x_i,  b_i^0 = max_{x ∈ Dx} x_i,    (8)

is usually constructed prior to AR construction in order to narrow the search area and can be considered as a zero-order estimation of the AR configuration. Usually the Monte-Carlo method is used to determine its borders a0, b0 (Abramov et al. 2007).
The discrete approximation of the AR is constructed on the basis of B0 with equidistant splitting of every i-th parameter's axis within [a_i^0, b_i^0], i = 1, ..., n into l_i, i = 1, ..., n ranges, which forms a regular grid inside this circumscribed box B0.
The grid nodes define the vertices of Elementary Boxes (EB) which are used for the AR approximation. Every single EB is identified with a set of indices (k1, k2, ..., kn), where 1 ≤ k_i ≤ l_i, i = 1, ..., n. All these EB comprise a set Bg = {e_{k1k2...kn} : k_i ≤ l_i, i = 1, ..., n}. The amount of EB can be calculated using (9):

L = |Bg| = ∏_{i=1}^{n} l_i.    (9)

The grid step h_i, i = 1, ..., n for every parameter is obtained using

h_i = (b_i^0 − a_i^0) / l_i,  i = 1, ..., n.    (10)

The bounds of an EB can be obtained using the following expressions:



a_i^{k_i} = a_i^0 + (k_i − 1) h_i,
b_i^{k_i} = a_i^{k_i} + h_i,  k_i = 1, ..., l_i,  i = 1, ..., n.    (11)

An EB consists of the points of the parameter space within its bounds:

e_{k1k2...kn} = {x ∈ B0 : a_i^{k_i} ≤ x_i ≤ b_i^{k_i}, i = 1, ..., n}.    (12)

Thus, the circumscribed box B0 can be considered as a union of EB, as expressed in (13):

B0 = Bg0 = ∪_{k1=1}^{l1} ∪_{k2=1}^{l2} ... ∪_{kn=1}^{ln} e_{k1k2...kn}.    (13)
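The grid bookkeeping above, together with the scalar enumeration and representative points introduced in the remainder of this section, can be sketched as follows. This is an editorial illustration, not the authors' code: the box bounds a0, b0 and grid counts l are hypothetical values, and the inverse index transformation uses an equivalent mixed-radix formulation of the sequential calculation given in the text.

```python
import numpy as np
from itertools import product

# Illustrative circumscribed box B0 and grid resolution (hypothetical values;
# the paper obtains a0, b0 by Monte-Carlo probing of the tolerance box).
a0 = np.array([0.0, -1.0])
b0 = np.array([2.0, 1.0])
l = (4, 5)                           # l_i: number of ranges per parameter

h = (b0 - a0) / np.array(l)          # grid steps, eq. (10)
L = int(np.prod(l))                  # number of elementary boxes, eq. (9)

def eb_bounds(k):
    """Bounds of elementary box e_k, eq. (11); indices k_i are 1-based."""
    lo = a0 + (np.array(k) - 1) * h
    return lo, lo + h

def multi_to_scalar(k):
    """Scalar enumeration p(k1, ..., kn), eq. (14)."""
    p, stride = 1, 1
    for ki, li in zip(k, l):
        p += (ki - 1) * stride
        stride *= li
    return p

def scalar_to_multi(p):
    """Inverse transformation k_i(p), eq. (15), as a mixed-radix decode."""
    q, k = p - 1, []
    for li in l:
        k.append(q % li + 1)
        q //= li
    return tuple(k)

def representative_point(k):
    """EB geometric centre, eq. (16): r_i = a_i^0 + h_i k_i - h_i / 2."""
    return a0 + h * np.array(k) - h / 2

# Round-trip check of the enumeration over the whole grid.
assert all(scalar_to_multi(multi_to_scalar(k)) == k
           for k in product(*(range(1, li + 1) for li in l)))
print(L, multi_to_scalar((4, 5)))    # -> 20 20 (the last EB has p = L)
print(eb_bounds((1, 1))[0], representative_point((1, 1)))
```

Classifying each representative point with the acceptability test (1) then yields the index set of the discrete AR approximation discussed next.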
Figure 2. Region of acceptability Dx discrete approximations.

Although every EB is uniquely identified with a set of indices (k1, k2, ..., kn), the most appropriate way of their enumeration is using a scalar index p ∈ {1, 2, ..., L} which is univalently associated with the EB indices, and can be calculated using (14):

p(k1, k2, ..., kn) = k1 + (k2 − 1) l1 + (k3 − 1) l1 l2 + ... + (kn − 1) l1 l2 ... l_{n−1}.    (14)

The inverse indices transformation k_i(p) can be obtained with sequential calculations using (15):

k_n = ⌊(p − 1) / (l1 l2 ... l_{n−1})⌋ + 1,
k_{n−1} = ⌊(p − (k_n − 1) l1 l2 ... l_{n−1} − 1) / (l1 l2 ... l_{n−2})⌋ + 1,
...
k_1 = p − Σ_{i=2}^{n} (k_i − 1) l1 ... l_{i−1},  n > 1.    (15)

Thus, every EB can be enumerated with a scalar index: e_p ∈ Bg, p = 1, ..., L.
Every EB is assigned a representative point x^r_{k1k2...kn} = (r1^{k1}, r2^{k2}, ..., rn^{kn})^T, usually located in its geometric centre:

r_i^{k_i} = a_i^0 + h_i k_i − h_i / 2,  i = 1, ..., n.    (16)

Figure 2 illustrates the AR discrete approximations for a 2-dimensional input parameter space.
According to the univalent relation between the indices (k1, k2, ..., kn) and p given in (14) and (15), let us denote x^r_p = x^r_{k1k2...kn}. A representative point can be referenced both by the indices (k1, k2, ..., kn) and by the corresponding scalar index p.
The membership characteristic function for the AR discrete approximation is defined in (17):

χ(x^r_{k1k2...kn}) = 1, if y_min ≤ y(x^r_{k1k2...kn}) ≤ y_max; 0, otherwise.    (17)

The RA discrete approximation is comprised of the EB whose representative points fulfil the performance design specifications (1):

D_x^g = {e_{k1k2...kn} ∈ Bg : χ(x^r_{k1k2...kn}) = 1},  k1 = 1, ..., l1, ..., kn = 1, ..., ln.    (18)

If an index set for all EB comprising the AR approximation is defined as

I_Dx = {p ≤ L : χ(x^r_p) = 1},    (19)

where x^r_p = x^r_{k1k2...kn} and k_i = k_i(p) according to (15), we can reformulate (18) into the following form (20):

D_x^g = ∪_{p ∈ I_Dx} e_p.    (20)

The amount of elementary boxes which approximate the AR is denoted by M = |D_x^g|, and it is also true that M = |I_Dx|.
Data structures for storing information on the grid and the set D_x^g are described in (Abramov et al. 2008) and (Abramov and Nazarov 2012). The essential difficulty of AR approximation consists in the high dimension of the parameter space, incomplete prior information, and the only pointwise exploration of the parameter space with system performances (1) calculation. The application of parallel computing significantly facilitates AR approximation (Abramov et al. 2009), (Abramov and Nazarov 2015).
The AR data structures can also be exploited for calculation of a centroid x^c = (x1^c, x2^c, ..., xn^c)^T during determination of the set D_x^g which approximates the AR. The centroid coordinates are calculated with the following expression:



x_i^c = (1 / M) Σ_{p ∈ I_Dx} x^r_{p,i},  i = 1, ..., n,    (21)

where the representative points are taken only from the EB indexed in the AR approximation index set (19). It was proven that for a convex AR this centroid x^c is a solution to the optimal parametric synthesis problem (Abramov et al. 2007).

4 THE CRITERION OF SYSTEM PERFORMANCE RESERVE

The criterion of system performance reserve (7) implies calculation of the minimal distance r to the AR boundary for every EB which belongs to the AR approximation, excluding boundary EB. Every EB e_{k1k2...kn} ∈ Bg is assigned a weight w(e_{k1k2...kn}) = r, the distance to the nearest boundary EB. After assigning weights, the EB with maximal weight are emphasized as optimal ones.
For a discrete approximation of the AR, the problem (7) of finding the optimal parameter vector consists in finding the EB which acts as the centre of a convex symmetrical figure comprised of EB from the AR approximation. The volume of this figure is calculated as the amount of EB it consists of. The measurement unit of distances between EB (in particular from any EB to the AR boundary) is an EB rib.
Let us consider various methods of measurement of the distance from any EB e_{k1k2...kn} ∈ D_x^g to the AR boundary ∂D_x^g. As the minimal distance to the AR boundary cannot exceed half of the bounding box range (for the most optimistic case, when the AR fully fills the bounding box), it is correct to suppose that

r ≤ min_{i=1,...,n} (k_i^c − 1, l_i − k_i^c),    (22)

where l_i, i = 1, 2, ..., n is the amount of grid steps for the i-th parameter. It is reasonable to say that r = 0 for every EB which belongs to the AR boundary.
One of the methods of determining the shortest distance from an EB to the AR boundary consists in constructing a maximal cube comprised of EB from the AR approximation with the centre in this EB. This multidimensional cube with range r is called an r-cube. The method of r-cube construction is based on the Moore neighborhood (Kier et al. 2005), (Schiff 2008) idea applied to multidimensional space, as illustrated in Figure 3.

Figure 3. Two-dimensional r-cubes. (Left) r = 1; (right) r = 2.

In addition to the expression (22) we can say that the distance r for every EB e_{k1k2...kn} which acts as a center of an r-cube has the following limitations:

r ≤ min_{i=1,...,n} (k_i − 1, l_i − k_i).    (23)

An example of application of the method of optimal EB selection via inscribing an r-cube with maximal r for a 2-dimensional parameter space is illustrated in Figure 4.
Let us define the r-neighborhood of an EB e_{k1k2...kn} ∈ Bg with indices (k1, k2, ..., kn) as a set E^r_{k1k2...kn} ⊂ Bg comprised of EB with indices (m1, m2, ..., mn) which fulfill (24):

Σ_{i=1}^{n} |m_i − k_i| ≤ r.    (24)

Thus, the set E^r_{k1k2...kn} is a figure defined by (25):

E^r_{k1k2...kn} = {e_{m1m2...mn} ∈ Bg : Σ_{i=1}^{n} |m_i − k_i| ≤ r}.    (25)

The optimal parametric synthesis problem for providing a maximal system performance reserve consists in searching for an EB e_{k1k2...kn} ∈ D_x^g which acts as the center of a maximal r-neighborhood E^r_{k1k2...kn} ⊂ D_x^g comprised of EB which belong to the AR approximation D_x^g. The value r for the r-neighborhood E^r_{k1k2...kn} ⊂ D_x^g of an EB e_{k1k2...kn} ∈ D_x^g expresses the minimal Manhattan distance (Kier et al. 2005), (Schiff 2008) from this EB to the AR approximation boundary. A set of EB which have maximal Manhattan distance to the AR boundary and act as centers of maximal r-neighborhoods in two-dimensional space is illustrated in Figure 5.
The determination of r-neighborhoods implements the method of narrowing areas, which requires an analytical description of the AR boundary with a finite set of hypersurfaces and consists in iterative narrowing of the region towards the optimal center. As for the discrete AR representation, the step unit per one iteration of the region narrowing is one EB (one grid step (10)). The procedure of



narrowing of the AR discrete approximation boundary performs moving the neighborhood boundary to the adjacent EB, backward from the nearest boundary ∂Bg of the EB set (25).
A zero-order narrowing boundary is represented by the AR discrete approximation boundary ∂D_x^g. A first-order narrowing border consists of EB with Manhattan distance to the AR boundary equal to 1, i.e. the first-order narrowing border is comprised of EB which have a maximal r-neighborhood with r = 1. The next level of narrowing boundary is comprised of EB with r-neighborhood for r = 2, and so on, until the narrowing boundary is merged into separate EB. These EB are considered as optimal ones in the sense of the system performance reserve (7).

Figure 4. The centers of maximal r-cubes inscribed.

Figure 5. Two-dimensional example of r-neighborhood with (left) r = 1; (right) r = 2.

Figure 6. The centers of maximal r-neighborhoods inscribed.

For every single EB, a maximal narrowing border level coefficient can be calculated. This coefficient equals the minimal Manhattan distance from this EB to the boundary ∂D_x^g of the discrete AR approximation. This maximal narrowing border level coefficient can be evaluated via constructing a maximal r-neighborhood around this EB. Thus, the method of narrowing boundaries in the case of a discrete AR approximation is reduced to the determination of EB with maximal r-neighborhood (see Figure 6).
Despite the high computational cost of EB enumeration, using the discrete RA approximation in the task of narrowing regions has the following advantages:
- There is no need to determine new points at every iteration of region narrowing;
- There is no need to approximate the boundary with hyper-surfaces (polynomial or hyper-spherical);
- There is no need to determine equidistant surfaces and sets of their contact points at every single iteration;
- No checks of the narrowing boundary degeneration into a point are required;
- The step of boundary narrowing is fixed and equal to the length of an EB rib (10).
The algorithm of determination of the maximal r-cube allows to obtain a global solution of the parametric synthesis problem (7) for an enough large and convex AR; the algorithm of determination of the maximal r-neighborhood yields a set of local optimal solutions of the problem (7).

5 CONCLUSIONS

It is evident that the discrete acceptability region approximation described in this work consumes computational resources during its construction and requires much resources for storing its data, and powerful computing facilities with parallel computing technologies should be widely involved in solving this task. Nevertheless, such an approximation provides the most complete and detailed description of the a priori unknown multidimensional acceptability region configuration, which is typical for actual complex systems with plenty of varying internal parameters.
The methods proposed in this work are based on the criterion of maximal performance reserve, and are aimed at facilitating the parametric synthesis task solution with the account of parametric reliability requirements and a lack of information on parametric deviation trends. Locating the system internal parameters into the center of a figure inscribed in an acceptability region implements the worst-case strategy of the parameter sizing task. Inscribing a figure of maximal volume in the acceptability region maximizes



system performance reserve within the scope of this worst-case strategy. The acceptability region can also be used in the task of evaluation of variation ranges within the region, which allows to estimate system sensitivity and reveal its vulnerability to deviations of a particular parameter.
The algorithms proposed in this work were tested on models of various analog circuits (transistor-transistor logic, multivibrators, amplifiers).

ACKNOWLEDGEMENT

This work is partially supported by grant 14-08-00149 of the Russian Foundation for Basic Research.

REFERENCES

Abramov, O., Y. Katueva, & D. Nazarov (2007). Reliability-directed distributed computer-aided design system. Proceedings of the IEEE International Conference on Industrial Engineering and Engineering Management, Singapore, 1171-1175.
Abramov, O., Y. Katueva, & D. Nazarov (2008). Construction of acceptability region for parametric reliability optimization. Reliability & Risk Analysis: Theory & Applications. 1, No.3, 20-28.
Abramov, O., Y. Katueva, & D. Nazarov (2009). Distributed computing environment for reliability-oriented design. Reliability & Risk Analysis: Theory & Applications. 2, No.1(12), 39-46.
Abramov, O. & D. Nazarov (2012). Regions of acceptability in reliability design. Reliability: Theory & Applications. 7, No.3(26), 43-49.
Abramov, O. & D. Nazarov (2015). Regions of acceptability using reliability-oriented design. Recent Developments on Reliability, Maintenance and Safety. WIT Transactions on Engineering Sciences 108, 376-387, doi:10.2495/QR2MSE140431.
Bernacki, R., J. Bandler, J. Song, & Q.J. Zhang (1989). Efficient quadratic approximation for statistical design. IEEE Transactions on Circuits and Systems. 36, no. 11, 1449-1454.
Director, S., G. Hatchel, & L. Vidigal (1978). Computationally efficient yield estimation procedures based on simplicial approximation. IEEE Transactions on Circuits and Systems. 25, no. 3, 121-130.
Grasso, F., S. Manetti, & M. Piccirilli (2009). A method for acceptability region representation in analogue linear networks. International Journal of Circuit Theory and Applications. 37, 1051-1061, doi:10.1002/cta.518.
Kier, L.B., P.G. Seybold, & C.-K. Cheng (2005). Modeling Chemical Systems using Cellular Automata. Netherlands: Springer.
Krishna, K. & S. Director (1995). The linearized performance penalty (LPP) method for optimization of parametric yield and its reliability. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. 14, no. 12, 1557-1568.
Schiff, J.L. (2008). Cellular Automata: A Discrete View of the World. University of Auckland: John Wiley & Sons.
Xu, X.B., Z. Liu, Y.W. Chen, D.L. Xu, & C.L. Wen (2015). Circuit tolerance design using belief rule base. Mathematical Problems in Engineering 2015, 12 p. doi:10.1155/2015/908027.




Inference for a one-memory Self-Exciting Point Process

E. Gouno & R. Damaj


Laboratoire de Mathématiques de Bretagne Atlantique, Université de Bretagne Sud, Vannes, France

ABSTRACT: We consider a one-memory self-exciting point process with a given intensity function. Properties of this point process are studied and the Mino distribution is introduced as the interarrival-times distribution of the process. This distribution has a hazard function which decreases or increases over a period of time before becoming constant. A maximum likelihood procedure is carried out to obtain the MLE of the Mino distribution parameters. The quality of the estimates is investigated through generated data.

1 INTRODUCTION

Introduced by Hawkes (1971), Self-Exciting Point Processes (SEPP) are counting processes whose intensity depends on all or part of the history of the process itself. These models find applications in many fields: seismology (Ogata 1999), neurophysiology (Johnson 1996), genetics, epidemiology, reliability (Ruggeri & Soyer 2008) and economics (Bowsher 2007). The intensity of a SEPP is not only a function of time but also a function of the number of jumps occurring in the process. In some situations, only a small number of the most recent events influence the evolution of the process; the process is then a self-exciting process with limited memory. In this work, we consider the case where the intensity process of the self-exciting point process depends only on the latest occurrence, i.e. it is a one-memory self-exciting point process. The motivation to investigate such a one-memory SEPP is a reliability study of electrical equipment exposed to thunderstorms; in this study, this process appears as a good candidate to describe the effect of lightning strikes on the reliability of the equipment. Therefore a method is required to make inference on one-memory SEPP. We assume that the impulse response function characterizing the intensity process is modeled as an exponential function having a constant coefficient that takes positive or negative values. This model has been considered by Mino (2001), who suggested a method using an EM algorithm to obtain the maximum likelihood estimates of the parameters without solving the nonlinear optimization problems usually involved. We introduce and define the Mino distribution to describe the distribution of the interarrival times. Then we show that the SEPP considered by Mino is a renewal process where the interarrival times have a Mino distribution. Maximum likelihood estimation of the process intensity parameters is considered. The method is applied on simulated data. The results are compared with those obtained by Mino (2001).

2 ONE-MEMORY SELF-EXCITING POINT PROCESSES

We focus our attention on the one-memory self-exciting point processes {N(t), t ≥ 0} with intensity process:

    λ(t) = μ [ 1 + α e^(−β(t − w_N(t))) ]   (1)

where w_N(t) is the occurrence time of the N(t)-th jump, μ > 0, α ≥ −1 and β > 0.

If α = 0, or if β goes to +∞, the process is a standard homogeneous Poisson process.

If α > 0, λ(t) increases after each jump of the process; the process is said to be excited.

If −1 ≤ α < 0, λ(t) decreases after each jump of the process; the process is said to be inhibited. Figures 1 and 2 show representations of the intensity for different values of α, β. These representations are obtained from simulations of the occurrence dates of the process w_1, ..., w_N(T) (see appendix A).

Proposition 1. Let {N(t), t ≥ 0} be a one-memory self-exciting point process with intensity process {λ(t), t ≥ 0} defined as in (1). Then the interarrival times T_i, i = 1, 2, ... form a sequence of statistically independent random variables with cumulative distribution function:

    Pr(T_i ≤ t) = 1 − exp{ −μ [ t + (α/β)(1 − e^(−βt)) ] }   (2)
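For concreteness, the survival function, density and hazard rate implied by (2) and (3) can be sketched numerically. The Python code below is our illustration only; the function and parameter names (mino_sf, mu, alpha, beta, and so on) are ours, not the authors'.

```python
import math

def mino_sf(t, mu, alpha, beta):
    """Survival function 1 - F(t) of the Mino distribution, from (2)."""
    return math.exp(-mu * (t + (alpha / beta) * (1.0 - math.exp(-beta * t))))

def mino_pdf(t, mu, alpha, beta):
    """Density f(t) from (3): hazard rate times survival function."""
    return mu * (1.0 + alpha * math.exp(-beta * t)) * mino_sf(t, mu, alpha, beta)

def mino_hazard(t, mu, alpha, beta):
    """Hazard rate mu * (1 + alpha * e^(-beta * t)); tends to mu as t grows."""
    return mu * (1.0 + alpha * math.exp(-beta * t))
```

With alpha = 0 the three functions collapse to those of an exponential distribution with rate mu; for alpha < 0 (inhibited case) the hazard increases towards mu in early life instead of decreasing, matching the behaviour described for Figure 5.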



and the probability density function is:

    f(t) = μ (1 + α e^(−βt)) exp{ −μ [ t + (α/β)(1 − e^(−βt)) ] }   (3)

Proof: The proof follows from Snyder and Miller (1991), theorem 6.3.4 p. 314 and its corollary p. 316. We have:

    Pr(T_i > t) = Pr[ N(w_{i−1}, w_{i−1} + t) = 0 ]
                = exp{ −∫_{w_{i−1}}^{w_{i−1}+t} λ(s) ds }
                = exp{ −μ [ t + (α/β)(1 − e^(−βt)) ] }

from which (2) and (3) are easily deduced.

In the sequel, we say that a random variable X with a probability distribution of the form (3) follows a Mino distribution with parameters (μ, α, β). We denote: X ~ Mino(μ, α, β). It follows from proposition 1 that a one-memory self-exciting point process with intensity (1) is equivalent to a renewal process with independent interarrival times having a Mino distribution. In the next section, we investigate some properties of the Mino distribution.

Figure 1. Intensity of an excited one-memory self-exciting point process with: α = 1, μ = 100, β = 250 and T = 0.1 ms.

Figure 2. Intensity of an inhibited one-memory self-exciting point process with: α = −0.5, μ = 100, β = 250 and T = 0.1 ms.

3 THE MINO DISTRIBUTION

Let X be a r.v. following a Mino distribution (3) with parameters (μ, α, β). When α = 0, or when β goes to +∞, X follows an exponential distribution with parameter μ. Figures 3 and 4 display the density for different parameter values. The hazard function of the Mino distribution is displayed in Figure 5. One can see that this distribution is convenient to model random variables with a decreasing or increasing hazard rate in early life that becomes constant in useful life. To express the expectation of a Mino r.v., we introduce the function γ~ defined by:

    γ~(a; x) = ∫_0^x z^(a−1) e^z dz.   (4)

Setting η = μα/β and assuming α > 0, the mean of X can be computed as:

    E{X} = γ~(μ/β; η) e^(−η) / (β η^(μ/β)).   (5)

If α < 0, the mean can be expressed as:

    E{X} = γ(μ/β, |η|) e^(|η|) / (β |η|^(μ/β)).   (6)

where γ(a, x) is the lower incomplete gamma function, that is:

    γ(a, x) = ∫_0^x u^(a−1) e^(−u) du.

Some examples of expectation values are given in Table 1.

Figure 3. Probability density function of a Mino distribution for μ = 100, β = 50 and different values of α.
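Expressions (5) and (6) can be cross-checked numerically against E{X} = ∫_0^∞ Pr(X > t) dt. The sketch below is our own verification code, not part of the paper; the series evaluations of the two incomplete-gamma-type integrals are our choice of implementation. It reproduces, for example, the Table 1 entries for μ = 100 with α = −1 and α = −0.5.

```python
import math

def gamma_lower(a, x, terms=80):
    """Lower incomplete gamma: integral_0^x u^(a-1) e^(-u) du, by series."""
    s = sum((-x) ** n / (math.factorial(n) * (a + n)) for n in range(terms))
    return x ** a * s

def gamma_tilde(a, x, terms=80):
    """gamma~(a; x) = integral_0^x z^(a-1) e^(+z) dz, as in (4), by series."""
    s = sum(x ** n / (math.factorial(n) * (a + n)) for n in range(terms))
    return x ** a * s

def mino_mean(mu, alpha, beta):
    """Closed-form mean via (5)-(6), with eta = mu * alpha / beta."""
    eta = mu * alpha / beta
    a = mu / beta
    if alpha > 0:
        return gamma_tilde(a, eta) * math.exp(-eta) / (beta * eta ** a)
    if alpha < 0:
        e = abs(eta)
        return gamma_lower(a, e) * math.exp(e) / (beta * e ** a)
    return 1.0 / mu  # alpha = 0: exponential case

def mino_mean_numeric(mu, alpha, beta, t_max=0.5, n=200000):
    """Trapezoidal integration of the survival function from (2)."""
    h = t_max / n
    total = 0.0
    for i in range(n + 1):
        t = i * h
        s = math.exp(-mu * (t + (alpha / beta) * (1.0 - math.exp(-beta * t))))
        total += s if 0 < i < n else 0.5 * s
    return total * h
```

The agreement between mino_mean and mino_mean_numeric provides a direct numerical check of the closed-form expressions.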

194

AMER16_Book.indb 194 3/15/2016 11:27:42 AM


Figure 4. Probability density function of a Mino distribution with μ = 100, α = 0.8 and different values of β.

Figure 5. Hazard rate function of a Mino distribution with μ = 100.

Table 1. Examples of expected value for β = 500.

            α       E(X)
μ = 100     −1      0.011828
            −0.5    0.010872
            1       0.007937
μ = 200     −1      0.006697
            −0.5    0.005777
            1       0.003770

4 PARAMETERS ESTIMATION

Mino (2001) proposes to use an EM algorithm to estimate the parameters. He artificially introduces two data models: one representing point event occurrences (observable) and the other one representing no point event occurrences (unobservable). The author claims that this approach provides simpler equations to realize the maximum likelihood estimates, avoiding the classical nonlinear optimisation problem. We suggest to consider the classical maximum likelihood method using the Mino distribution that we have defined previously. This approach is easy to implement and allows to check the existence and uniqueness of the MLE.

Let (t_1, ..., t_n) be an n-sample of independent random variables from a Mino distribution with parameters (μ, α, β). The log-likelihood is:

    log L(μ, α, β) = n log μ + Σ_{i=1}^{n} log(1 + α e^(−β t_i)) − μ Σ_{i=1}^{n} t_i − (μα/β) Σ_{i=1}^{n} (1 − e^(−β t_i))   (7)

Remark that this expression is similar to the expression obtained considering the logarithm of the sample function for a self-exciting point process given by theorem 6.2.2, p. 302, from Snyder and Miller (1991).

From (7), the likelihood equations are:

    n/μ − Σ_{i=1}^{n} t_i − (α/β) Σ_{i=1}^{n} (1 − e^(−β t_i)) = 0   (8)

    Σ_{i=1}^{n} e^(−β t_i) / (1 + α e^(−β t_i)) − (μ/β) Σ_{i=1}^{n} (1 − e^(−β t_i)) = 0   (9)

    (μα/β²) Σ_{i=1}^{n} (1 − e^(−β t_i)) − (μα/β) Σ_{i=1}^{n} t_i e^(−β t_i)   (10)
      − α Σ_{i=1}^{n} t_i e^(−β t_i) / (1 + α e^(−β t_i)) = 0   (11)

A Newton-Raphson algorithm is used to solve this system of equations, which does not admit an explicit solution. Existence and uniqueness of the MLE can be proven applying theorem 2.6, p. 761, from Mäkeläinen et al. (1981). One needs to prove that the gradient vector vanishes in at least one point and that the Hessian matrix is negative definite at every point where the gradient vanishes. The Hessian matrix is:

    H = [ ∂²log L/∂μ²    ∂²log L/∂μ∂α    ∂²log L/∂μ∂β
          ∂²log L/∂α∂μ   ∂²log L/∂α²    ∂²log L/∂α∂β
          ∂²log L/∂β∂μ   ∂²log L/∂β∂α   ∂²log L/∂β²  ]
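The score equations (8) and (9) are the partial derivatives of (7) with respect to μ and α, which can be verified by finite differences. The following sketch is ours (names and sample values assumed for illustration), not part of the original paper.

```python
import math

def log_lik(mu, alpha, beta, ts):
    """Log-likelihood (7) for a sample ts of interarrival times."""
    n = len(ts)
    s_log = sum(math.log(1.0 + alpha * math.exp(-beta * t)) for t in ts)
    s_t = sum(ts)
    s_e = sum(1.0 - math.exp(-beta * t) for t in ts)
    return n * math.log(mu) + s_log - mu * s_t - (mu * alpha / beta) * s_e

def score_mu(mu, alpha, beta, ts):
    """Left-hand side of (8): the derivative of log L with respect to mu."""
    s_e = sum(1.0 - math.exp(-beta * t) for t in ts)
    return len(ts) / mu - sum(ts) - (alpha / beta) * s_e

def score_alpha(mu, alpha, beta, ts):
    """Left-hand side of (9): the derivative of log L with respect to alpha."""
    s_frac = sum(math.exp(-beta * t) / (1.0 + alpha * math.exp(-beta * t)) for t in ts)
    s_e = sum(1.0 - math.exp(-beta * t) for t in ts)
    return s_frac - (mu / beta) * s_e
```

These functions can be plugged into any standard root-finding routine to solve the system (8)-(11) numerically.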



where L stands for L(μ, α, β). We have:

    ∂²log L/∂μ² = −n/μ²

    ∂²log L/∂α² = −Σ_{i=1}^{n} e^(−2β t_i) / (1 + α e^(−β t_i))²

    ∂²log L/∂β² = −(2μα/β³) Σ_{i=1}^{n} (1 − e^(−β t_i)) + (2μα/β²) Σ_{i=1}^{n} t_i e^(−β t_i)
                  + (μα/β) Σ_{i=1}^{n} t_i² e^(−β t_i) + α Σ_{i=1}^{n} t_i² e^(−β t_i) / (1 + α e^(−β t_i))²

    ∂²log L/∂μ∂α = −(1/β) Σ_{i=1}^{n} (1 − e^(−β t_i))

    ∂²log L/∂μ∂β = (α/β²) Σ_{i=1}^{n} (1 − e^(−β t_i)) − (α/β) Σ_{i=1}^{n} t_i e^(−β t_i)

    ∂²log L/∂α∂β = (μ/β²) Σ_{i=1}^{n} (1 − e^(−β t_i)) − (μ/β) Σ_{i=1}^{n} t_i e^(−β t_i)
                   − Σ_{i=1}^{n} t_i e^(−β t_i) / (1 + α e^(−β t_i)) + α Σ_{i=1}^{n} t_i e^(−2β t_i) / (1 + α e^(−β t_i))²

The Hessian is negative definite if the first upper-left minor is negative, the second upper-left minor is positive and the principal minor of order 3 is negative. The first upper-left minor is obviously negative. The second upper-left minor D₂ is:

    D₂ = (n/μ²) Σ_{i=1}^{n} A_i² − (1/β²) [ Σ_{i=1}^{n} (1 − e^(−β t_i)) ]²

where A_i = e^(−β t_i) / (1 + α e^(−β t_i)).

Equation (9) gives (1/β) Σ_{i=1}^{n} (1 − e^(−β t_i)) = (1/μ) Σ_{i=1}^{n} A_i. Thus

    D₂ = (n/μ²) Σ_{i=1}^{n} A_i² − (1/μ²) ( Σ_{i=1}^{n} A_i )²
       = (n²/μ²) [ (1/n) Σ_{i=1}^{n} A_i² − ( (1/n) Σ_{i=1}^{n} A_i )² ] ≥ 0

For the principal minor of order 3, the following expression can be obtained:

    D₃ = [ −(2μα/β³) Σ_{i=1}^{n} (1 − e^(−β t_i)) + (2μα/β²) Σ_{i=1}^{n} t_i e^(−β t_i)
           + (μα/β) Σ_{i=1}^{n} t_i² e^(−β t_i) + α Σ_{i=1}^{n} t_i² e^(β t_i) A_i² ] D₂
         + (μ²α²/β²) [ Σ_{i=1}^{n} t_i A_i Σ_{j=1}^{n} A_j − ( Σ_{i=1}^{n} t_i A_i )² ]

The sign of D₃ can be studied considering conditions on the parameters.

5 APPLICATIONS

The MLE is computed for different sets of parameter values and sample sizes identical to those chosen by Mino (2001). For each setting, 100 samples of interarrival times are generated using the inversion method described in appendix A. The means and the standard deviations of the MLE are displayed in Tables 2 and 3. One can see that the MLE are very close to the input parameters for μ and α. The results for β are slightly worse. The standard deviation is rather small in all cases and decreases as the sample size increases, as expected.

Table 2. Mean and S.D. of the MLE calculated from 100 Monte Carlo runs.

Sample size   True parameters   Estimates means   S.D.
20000         μ = 100           99.753            0.099
              α = 1             1.002             0.001
              β = 500           521.867           2.309
10000         μ = 100           99.820            0.137
              α = 1             1.020             0.001
              β = 500           517.082           3.047
5000          μ = 100           99.595            1.790
              α = 1             1.020             0.002
              β = 500           520.823           4.462
20000         μ = 100           100.261           0.202
              α = −0.5          −0.502            0.003
              β = 500           494.628           14.255
10000         μ = 100           100.170           0.241
              α = −0.5          −0.5003           0.002
              β = 500           488.221           17.603
5000          μ = 100           100.481           0.342
              α = −0.5          −0.498            0.001
              β = 500           486.176           22.988
20000         μ = 100           100.076           0.449
              α = −1            −1.009            0.013
              β = 500           498.702           13.193
10000         μ = 100           100.033           0.337
              α = −1            −1.035            0.014
              β = 500           523.564           14.525
5000          μ = 100           99.808            0.892
              α = −1            −1.010            0.021
              β = 500           500.517           18.356



Table 3. Mean and S.D. of the MLE calculated from
100 Monte Carlo runs.

Sample True Estimates


size parameters mean SD

20000 = 200 198.649 0.206


= 1 1.021 0.0008
= 500 527.448 1.983
10000 = 200 199.108 0.279
= 1 1.022 0.001
= 500 525.699 2.460
5000 = 200 198.979 0.464
= 1 1.021 0.001 Figure 6. Histogram obtained with sample size equal to
= 500 524.578 3.626 50000 for a Mino distribution with parameters = 100,
= 1 and = 500.
20000 = 200 200.347 0.271
= 0.5 0.501 0.001
= 500 501.277 4.894 1
x+ ( e x ) = (12)
10000 = 200 200.524 0.436 log(1 F ( x ))
= 0.5 0.502 0.002
= 500 496.981 8.153 This problem can be solved using an iterative
5000 = 200 200.202 0.515 scheme. One can suggests the following algorithm
= 0.5 0.499 0.003 to generate realisations of a Mino distribution:
= 500 502.108 10.306 1. generate a uniform number u on [0,1]
20000 = 200 200.065 0.328 2. Starting from an initial value t ( ) ,
=1 0.999 0.004
while | ( p++ ) t ( p ) | > ( ),
= 500 500.495 3.636
10000 = 200 199.868 0.527 compute t ( p+ ) t ( p ) (t ( ( p ) ) / (t ( p ) )
x 1
=1 1.007 0.006 where ( ) = x + ( )
log u
= 500 500.486 5.493 and ( ) = 1 + e x .
5000 = 200 199.868 0.650 The Figure 6 displays the exact pdf and the histo-
=1 1.007 0.008 gram of 50000 realisations obtained with the algo-
= 500 503.351 7.966 rithm previously described, for a Mino distribution
with parameters = 100, = 1 and = 500.

REFERENCES
6 CONCLUSION
Bowsher, C. (2007). Modeling security market events in
In this work we have investigated a particular self- continuous time: intensity based, multivariate point
process model. J. Econometrics 141, 876912.
exciting point process. We suggest to consider this proc-
Hawkes, A. (1971). Spectra of some self-exciting and
ess as a renewal process and we define the interarrival mutually exciting point process. Biometrika 58, 8390.
times distribution that we dename Mino distribution. Johnson, D. H. (1996). Point process models of single-
Some properties of this distribution are explored. Sta- neuron discharges. J. Comput. Neurosci. 3, 275299.
tistical inference is driven via the maximum likelihood Mkelinen, T., K. Schimdt, & G. Styan (1981). On the
approach. Results are obtained on simulation data. existence and uniqueness of the maximum likelihood
Further work will be conducted to develop a Bayesian estimate of vector-valued parameter in fixed-size sam-
approach and to consider goodness of fit test. ples. Ann. Statist. 9, 758767.
Mino, H. (2001). Parameter estimation of the intensity
process of self-exciting point processes using the EMal-
Appendix A: Simulation of a Mino distribution gorithm. IEEE Trans. Instrum. Meas. 50(3), 658664.
To obtain realisation of a r.v. having a Mino distribu- Ogata, Y. (1999). Seismicity analysis through pointprocess
modelling: a review. Pure Appl. Geophys. 155, 471507.
tion, we use the following well-known result: Let F be
Ruggeri, F. & R. Soyer (2008). Advances in Bayesian Soft-
a cumulative distribution function. Then the cdf of the ware Reliability Modeling. Advances in Mathematical
r.v. F1(U) where U is a uniform r.v. on [0,1], is F. For Modeling for Reliability, T. Bedford et al. (Eds), IOS
the Mino distribution the inverse of the cdf cannot Press, 165176.
be obtained in closed-form since it is supposed to be Snyder, L. & I. Miller (1991). Random Point Processes in
deduced expressing x with F(x) from equation (12). Time and Space. Springer.
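The Newton iteration described in Appendix A can be rendered in Python as follows. This is our sketch, not the authors' code; in particular, the starting value t(0) = −log(u)/μ (the exponential quantile) is our choice, not prescribed by the paper.

```python
import math
import random

def mino_sample(mu, alpha, beta, u=None, tol=1e-12, max_iter=100):
    """Draw one Mino(mu, alpha, beta) variate by Newton inversion of (12)."""
    if u is None:
        u = random.random()
    # Root of g(t) = t + (alpha/beta) * (1 - e^(-beta t)) + (1/mu) * log(u).
    t = -math.log(u) / mu  # exponential quantile as the starting value (our choice)
    for _ in range(max_iter):
        g = t + (alpha / beta) * (1.0 - math.exp(-beta * t)) + math.log(u) / mu
        dg = 1.0 + alpha * math.exp(-beta * t)  # g'(t); positive for t > 0 when alpha >= -1
        step = g / dg
        t -= step
        if abs(step) < tol:
            break
    return t

random.seed(1)
draws = [mino_sample(100, -1, 500) for _ in range(20000)]
mean = sum(draws) / len(draws)  # should be close to the Table 1 value 0.011828
```

For alpha = 0 the starting value already solves (12) exactly, so the loop terminates immediately with the usual exponential inversion.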



System reliability analysis



Applied Mathematics in Engineering and Reliability, Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

House events matrix for application in shutdown probabilistic


safety assessment

M. epin
Faculty of Electrical Engineering, University of Ljubljana, Ljubljana, Slovenia

ABSTRACT: The fault tree is a method for identification and assessment of combinations of the unde-
sired events that can lead to the undesired state of the system. The objective of the work is to present a
mathematical model of extended fault tree method with the house events matrix to enable integration of
several system models into one model. The mathematical model of the fault tree and of its extension with
the house events matrix is presented. The theory is supported by simple examples, which facilitates the
understanding of the approach.

1 INTRODUCTION

The fault tree is a widely used method for evaluation of reliability and safety (Vesely, 2002, Kumamoto, Henley, 1996). It is applied in various industries and fields of application (Čepin, 1997, Čepin, 2005). Its repute is gained primarily when integrated with the event tree analysis as a part of the probabilistic safety assessment related to nuclear safety (Čepin, 2002, Martorell et al. 2006) and related to aircraft safety.

1.1 Objective

The objective of the work is to present a mathematical model of the extended fault tree method with the house events matrix to enable integration of several system models into one model.

Integration of several system models into one model may facilitate the extension of probabilistic safety assessment, which was originally made for power plant full power operation, to consideration of other modes of operation (Kiper, 2002, Čepin, Mavko, 2002, Čepin, Prosen, 2008).

2 METHOD

The fault tree is a method for identification and assessment of combinations of the undesired events that can lead to the undesired state of the system, which is either a system fault or a failure of a specific function of the system (Kumamoto, Henley, 1996, PRA Guide, 1982). The undesired state of the system is represented by a top event. The fault tree can be represented in graphical form or in the form of Boolean equations, which integrate the top event with logical gates, and which integrate the logical gates with other logical gates and with the primary events (Ren, Dugan, 1998, Čepin, Mavko, 2002).

The primary events are the events which are not further developed and which represent the components and their failure modes or human error events. Not all the component failure modes are considered. The analysis is focused on those which can cause failure of the system; in other words, on those which can cause the top event.

The primary events can be basic events or house events. The basic events are the ultimate parts of the fault tree, which represent the undesired events on the level of the components, e.g. the component failures, the missed actuation signals, the human errors, the effects of the test and maintenance activities, the common cause analysis contributions.

The house events represent the logic switches. They represent conditions set either to true or to false, which support the modelling of connections between the gates and the basic events, and enable that the fault tree better represents the system operation and its environment.

The fault tree is mathematically represented by a set of Boolean equations.

    G_i = f(G_p, B_j, H_s);  i, p ∈ {1..P}, j ∈ {1..J}, s ∈ {1..S}   (1)

G_i ... logical gate i
G_p ... logical gate p
B_j ... basic event j
H_s ... house event s
P ... number of gates
J ... number of basic events
S ... number of house events



The qualitative analysis is a process of Boolean reduction, where all Boolean equations are inserted one into another. Then they are rearranged into a sum of products considering the laws of Boolean algebra.

    MCS_i = ∩_{j=1}^{m} B_j   (2)

MCS_i ... minimal cut set i
B_j ... basic event j
m ... number of basic events in minimal cut set i

The sum of products represents the minimal cut sets. Minimal cut sets are combinations of the smallest number of basic events which, if they occur simultaneously, may lead to the top event. In other words, the minimal cut set represents the smallest set of component failures which can cause failure of the system.

    TOP = Σ_{z=1}^{Z} MCS_z   (3)

Z ... number of minimal cut sets

House events disappear from the results equation, because their values, such as 0 or 1, are used in the Boolean reduction of equations. In theory, different house events values in a set of house events may change the model significantly.

This is the key of the idea behind the house events matrix. Namely, in probabilistic safety assessment as it was initially used, it is possible that for a single safety system several fault trees are needed. They may differ because of different success criteria. For example, in some configuration we can rely only on one out of two system trains, if the other is in maintenance. In the other configuration, we have two system trains available. Fault trees may differ due to different boundary conditions, as they are linked to different scenarios with different requirements.

For example: an auxiliary feedwater system with two motor driven pumps and one turbine driven pump is available in a nuclear power plant with a pressurized reactor. The complete system can be considered in the majority of conditions. In the conditions of station blackout, electrical power is not available and the motor driven pumps are not applicable, but the turbine pump and related equipment is applicable. So, the model is much more applicable if the motor driven pumps and their related equipment are cut off from the model for the station blackout condition.

The house events matrix is introduced to list the house events and their respective values for the complete set of conditions that may appear in the analysis.

The house events matrix represents the values of house events for all conditions of the analysis in one dimension (all columns of the matrix) and for all house events in the other dimension (all rows of the matrix). The house events matrix identifies values of house events for all house event names related to the system analysis in its rows and for all the conditions in its columns. Values, which are either 0 or 1 (or false or true), are assigned to each house event for all conditions in the matrix.

The quantification of a fault tree equipped with the house events matrix is mathematically similar to that of the fault tree without house events. The difference is in the number of results sets. One fault tree without house events has one set of results. A fault tree with a house events matrix has as many sets of results as is the number of different house events matrix columns.

Fault tree quantification is performed using the following equation.

    P_TOP = Σ_{z=1}^{Z} P_MCSz − Σ_{i<j} P_(MCSi ∩ MCSj) + Σ_{i<j<k} P_(MCSi ∩ MCSj ∩ MCSk)
            − ... + (−1)^(n−1) P_(MCS1 ∩ ... ∩ MCSn)   (4)

P_TOP ... probability of a top event (failure probability of a system or function, which is defined in a top event)
P_MCSi ... probability of minimal cut set i
Z ... number of minimal cut sets
n ... number of basic events in the largest minimal cut set (related to number of basic events representing minimal cut set)

The probabilities of minimal cut sets should be calculated considering their possible dependency.

    P_MCSi = P_B1 · P_(B2|B1) · P_(B3|B1∩B2) · ... · P_(Bm|B1∩B2∩...∩Bm−1)   (5)

Under the assumption that the basic events sets are mutually independent, the equation is simplified.

    P_MCSi = ∏_{j=1}^{m} P_Bj   (6)

P_MCSi ... probability of minimal cut set i
P_Bj ... probability of basic event B_j
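Equations (4) and (6) can be exercised on a toy example. The sketch below is our illustration (event names and probabilities are invented): a top event with two minimal cut sets is quantified by inclusion-exclusion and verified by brute-force enumeration of the basic event states.

```python
from itertools import product

# Toy model: TOP = MCS1 + MCS2 with MCS1 = {B1, B2} and MCS2 = {B2, B3}.
p = {"B1": 0.01, "B2": 0.02, "B3": 0.05}  # independent basic event probabilities
mcs = [{"B1", "B2"}, {"B2", "B3"}]

def prob_of_cut_sets(cut_sets):
    """Probability of the intersection of cut sets, eq. (6) under independence."""
    events = set().union(*cut_sets)
    result = 1.0
    for b in events:
        result *= p[b]
    return result

# Inclusion-exclusion, eq. (4), for two minimal cut sets:
p_top = prob_of_cut_sets([mcs[0]]) + prob_of_cut_sets([mcs[1]]) - prob_of_cut_sets(mcs)

# Brute-force check: enumerate all basic event states and sum the weights of
# those state vectors in which at least one minimal cut set occurs.
names = sorted(p)
check = 0.0
for states in product([0, 1], repeat=len(names)):
    st = dict(zip(names, states))
    if any(all(st[b] for b in cut) for cut in mcs):  # top event occurs
        w = 1.0
        for b in names:
            w *= p[b] if st[b] else 1.0 - p[b]
        check += w
```

Both routes give the same top event probability, which illustrates why (4) needs the intersection probabilities and not just the sum of the cut set probabilities.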



m ... number of basic events in minimal cut set i

The probability of a basic event depends on the nature of functioning of the component modeled in the basic event and its respective failure mode. The probability model is selected in relation to the failure mode of the component in the basic event. Parameters of the probabilistic model are obtained from a data base. A generic data base can be used, but a site specific data base for particular equipment is much more suitable.

2.1 Application of house events matrix

Figure 1 shows the simplest application of the house event under AND gate G1. Logical gates are represented by rectangles. House events are represented by pentagons. A logical gate defines the relation between the input events to the upper event and the upper event. A triangle is used for continuation.

If the value of the house event H1 is 0, then the equipment modeled in gate G2 does not apply in the model. The result of G1 is an empty set, because G1 happens if H1 and G2 both happen. If the value of the house event H1 is 1, then the equipment modeled in gate G2 represents the complete event G1.

Figure 1. Example of house event under AND gate.

Figure 2 shows example fault tree G. It integrates two variants G1 and G2 of the continued fault tree with house events H1 and H2. Gates G1A and G2A are in between. A circle above a house event represents the negation of this house event.

If house event H1 is switched on, which means its value is 1, and house event H2 is switched off, which means its value is 0 (and the value of its negation is 1), the gate G1 propagates to G1A and consequently to gate G. At the same time the gate G2 does not propagate up, because it is switched off by house event H2 set to 0 and by the negation of house event H1 (which is 0).

If house event H2 is switched on, which means its value is 1, and house event H1 is switched off, which means its value is 0 (and the value of its negation is 1), the gate G2 propagates to G2A and consequently to gate G. At the same time the gate G1 does not propagate up, because it is switched off by house event H1 set to 0 and by the negation of house event H2 (which is 0).

Figure 2. Example of house event under AND gate.

Figure 2 shows one fault tree G, where two variants G1 and G2 are represented in one model. At the same time it is assured that a wrong combination of values of house events results in an empty set of the fault tree analysis, which can represent an alarm that the model should be checked. Namely, both house events set to 1 or both set to 0 would give an empty set of results.

Figure 3 shows a fault tree example for the auxiliary feedwater system, where 5 system configurations are joined in a single fault tree.

2.2 Application of house events matrix for shutdown probabilistic safety assessment

The house events matrix can be increasingly important in shutdown probabilistic safety assessment, which is an important issue considered in the last years (Čepin, Prosen, 2008, NUREG/CR-6144, 1995, Papazoglou, 1998, Swaminathan, Smidts, 1999).

Probabilistic safety assessment of a nuclear power plant deals with a number of safety systems and a large number of components, and is complex. Its complexity increases significantly when modes other than power operation are considered.



Figure 3. Fault tree example for auxiliary feedwater system.

Table 1. House event matrix for 4 modes of operation of a nuclear power plant for human failure events included in
the fault trees of safety systems.

Mode 1 Mode 2 Mode 3 Mode 4

Power Operation Startup Hot Standby Hot Shutdown

Event Group Event Identification POS-M1-G1 POS-M2-G1 POS-M3-G1 POS-M4-G1

Human HFE01 1 1 0 0
Failure HFE01A 0 0 1 1
Events HFE02 1 1 0 0
HFE02A 0 0 1 1
HFE03 1 1 0 0
HFE03A 0 0 1 1
HFE04 1 1 0 0
HFE04A 0 0 1 1
HFE05 1 1 0 0
HFE05A 0 0 1 1
HFE06 1 1 0 0
HFE06A 0 0 1 1
HFE07 1 1 0 0
HFE07A 0 0 1 1
HFE08 1 1 0 0
HFE08A 0 0 1 1
HFE09 1 1 0 0
HFE09A 0 0 1 1
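The mode-dependent switching encoded in Table 1 can be sketched as follows. The code is our illustration of the house events matrix idea, not an existing tool; it reproduces the HFE01/HFE01A pattern of the table.

```python
# House events matrix: rows are house events, columns are plant operational modes.
modes = ["POS-M1-G1", "POS-M2-G1", "POS-M3-G1", "POS-M4-G1"]
matrix = {
    "HFE01":  [1, 1, 0, 0],   # full-power variant of the human failure event
    "HFE01A": [0, 0, 1, 1],   # shutdown variant with smaller error probability
}

def active_events(mode):
    """Basic events enabled by the house events matrix column for a mode."""
    col = modes.index(mode)
    return {name for name, row in matrix.items() if row[col] == 1}

# Exactly one variant of the event is active in every mode:
for m in modes:
    assert len(active_events(m) & {"HFE01", "HFE01A"}) == 1
```

Each column of the matrix thus yields its own set of fault tree results, one per plant operational state, as described in the text.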



Table 1 shows the house events matrix for 4 modes of operation of a nuclear power plant for the human failure events included in the fault trees of safety systems. It includes 9 human failure events, which appear in different fault trees or event trees and which need a change of their respective human error probabilities due to different plant conditions. The human error probability should change in the model due to several reasons. One of them is the time available for action, which is larger in shutdown, so the probability is smaller in such conditions.

Figure 4 shows one of the fault tree portions which were introduced to replace a single event HFE01 in the original model for power operation with a fault tree HFE01 including 2 house events representing 2 basic events, which replace basic event HFE01.

Basic event HFE01 remains in the new fault tree for the power operation, where the house event with the same name is needed for its activation. Basic event HFE01A represents an addition to the model together with both house events; it is applicable for the plant shutdown mode and includes a smaller human error probability than HFE01.

Gate HFE01 is split to gate HFE01- and gate HFE01A in order to keep the naming scheme and the transparency. Name HFE01- is selected because gates have to be uniquely named and name HFE01 exists in the fault tree as an upper event. HFE01 represents operator failure to establish emergency boration if automatic boration with the normally considered path fails.

Table 2 shows the house events matrix for 4 modes of operation of a nuclear power plant for initiating events only.
Figure 4. Example of fault tree replacing basic event.

Table 2. House events matrix for 4 modes of operation of a nuclear power plant for initiating events only.

Mode 1 Mode 2 Mode 3 Mode 4

Power Operation Startup Hot Standby Hot Shutdown

Event Group Event Identification POS-M1-G1 POS-M2-G1 POS-M3-G1 POS-M4-G1

Initiating Events ATWS- 1 1 0 0


CCWS- 1 1 1 1
ESWS- 1 1 1 1
IAIR- 1 1 1 1
ISLO- 1 1 1 0
ISLO1 0 0 0 1
LDC-- 1 1 1 1
LLOC- 1 1 1 0
LLOC1 0 0 0 1
LOSP- 1 1 1 1
MLOC- 1 1 1 0
MLOC1 0 0 0 1
SBO-- 1 1 1 1
SGTR- 1 1 0 0
SGTR1 0 0 1 0
SLB-- 1 1 0 1
SLOC- 1 1 1 0
SLOC1 0 0 0 1
TRMF- 1 1 0 1
TR--- 1 1 0 1
VESF- 1 1 1 0
VESF1 0 0 0 1



3 CONCLUSIONS

The objective of the work was to present a mathematical model of the extended fault tree method with the house events matrix to enable integration of several system models into one model, which is done.

The mathematical model of the fault tree and of its extension with the house events matrix is presented. The theory is supported by simple examples, which facilitates the understanding of the approach. Furthermore, the theory is supported by realistic examples from probabilistic safety assessment. The deficiency of the approach is the software support, because existing probabilistic safety assessment models are extremely complex, and if the software platform for evaluation does not support consideration of house events, the approach is not practical. If the house events are supported, their use and the application of the house events matrix may significantly contribute to reducing the complexity of the models in case they are expanded with consideration of modes other than full power operation.

REFERENCES

Čepin M., B. Mavko, 1997, Probabilistic Safety Assessment Improves Surveillance Requirements in Technical Specifications, Reliability Engineering and System Safety, 56, 69-77.
Čepin M., B. Mavko, 2002, A Dynamic Fault Tree, Reliability Engineering and System Safety, 75 (1), 83-91.
Čepin M., 2002, Optimization of Safety Equipment Outages Improves Safety, Reliability Engineering and System Safety, 77, 71-80.
Čepin M., 2005, Analysis of Truncation Limit in Probabilistic Safety Assessment, Reliability Engineering and System Safety, 87, 395-403.
Čepin M., R. Prosen, 2008, Probabilistic Safety Assessment for Hot Standby and Hot Shutdown, Proceedings of Nuclear Energy for New Europe.
Kiper K., 2002, Insights from an All-Modes PSA at Seabrook Station, Proceedings of PSA 2002, pp. 429-434.
Kumamoto H., E.J. Henley, 1996, Probabilistic Risk Assessment and Management for Engineers and Scientists, IEEE Press, New York.
Martorell, S., Carlos, S., Villanueva, J.F., Sánchez, A.I., Galván, B., Salazar, D., Čepin, M., 2006, Use of Multiple Objective Evolutionary Algorithms in Optimizing Surveillance Requirements, Reliability Engineering and System Safety, 91 (9), 1027-1038.
NUREG/CR-6144, 1995, Evaluation of Potential Severe Accidents During Low Power and Shutdown Operations at Surry Unit 1, US NRC.
Papazoglou I.A., 1998, Mathematical Foundations of Event Trees, Reliability Engineering and System Safety, 61, 169-183.
PRA Guide, 1982, Probabilistic Risk Assessment Procedures Guide, NUREG/CR-2300, Vol. 1-2, US NRC, Washington DC.
Ren Y., J.B. Dugan, 1998, Optimal Design of Reliable Systems Using Static and Dynamic Fault Trees, IEEE Transactions on Reliability, 234-244.
Swaminathan S., C. Smidts, 1999, The Mathematical Formulation for the Event Sequence Diagram Framework, Reliability Engineering and System Safety, 65, 103-118.
Vesely W., J. Dugan, J. Fragola, J. Minarick, J. Railsback, 2002, Fault Tree Handbook with Aerospace Applications, National Aeronautics and Space Administration, NASA.




Imprecise system reliability using the survival signature

Frank P.A. Coolen


Department of Mathematical Sciences, Durham University, Durham, UK

Tahani Coolen-Maturi
Durham University Business School, Durham University, Durham, UK

Louis J.M. Aslett


Department of Statistics, Oxford University, Oxford, UK

Gero Walter
School of Industrial Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands

ABSTRACT: The survival signature has been introduced to simplify quantification of reliability of
systems which consist of components of different types, with multiple components of at least one of
these types. The survival signature generalizes the system signature, which has attracted much interest in
the theoretical reliability literature but has limited practical value as it can only be used for systems with
a single type of components. The key property for uncertainty quantification of the survival signature,
in line with the signature, is full separation of aspects of the system structure and failure times of the
system components. This is particularly useful for statistical inference on the system reliability based on
component failure times.
This paper provides a brief overview of the survival signature and its use for statistical inference for
system reliability. We show the application of generalized Bayesian methods and nonparametric predictive
inference; both of these inference methods use imprecise probabilities to quantify uncertainty, where
imprecision reflects the amount of information available. The paper ends with a discussion of related
research challenges.

1 INTRODUCTION

In mathematical theory of reliability the main focus is on the functioning of a system given the functioning, or not, of its components and the structure of the system. The mathematical concept which is central to this theory is the structure function (Barlow & Proschan 1975). For a system with m components, the state vector is x = (x_1, x_2, ..., x_m) ∈ {0, 1}^m, with x_i = 1 if the i-th component functions and x_i = 0 if not. The labelling of the components is arbitrary but must be fixed to define x. The structure function φ: {0, 1}^m → {0, 1}, defined for all possible x, takes the value 1 if the system functions and 0 if the system does not function for state vector x. Most practical systems are coherent, which means that φ(x) is not decreasing in any of the components of x, so system functioning cannot be improved by worse performance of one or more of its components. The assumption of coherent systems is made throughout this paper and is convenient from the perspective of uncertainty quantification for system reliability. It is further logical to assume that φ(0) = 0 and φ(1) = 1, so the system fails if all its components fail and it functions if all its components function.

For larger systems, working with the full structure function may be complicated, and one may particularly only need a summary of the structure function in case the system has exchangeable components of one or more types. We use the term exchangeable components to indicate that the failure times of the components in the system are exchangeable (De Finetti 1974). Recently, we introduced such a summary, called the survival signature, to facilitate reliability analyses for systems with multiple types of components (Coolen & Coolen-Maturi 2012). In case of just a single type of components, the survival signature is closely related to the system signature (Samaniego 2007), which is well-established and the topic of many research papers during the last decade. However, generalization of the signature to systems with multiple types of components is extremely complicated (as it involves ordering order statistics of different distributions), so much so that it cannot

207

CH26_Coolen.indd 207 3/17/2016 8:45:30 PM


be applied to most practical systems. In addition to the possible use for such systems, where the benefit only occurs if there are multiple components of the same types, the survival signature is arguably also easier to interpret than the signature.

Consider a system with K ≥ 1 types of components, with m_k components of type k ∈ {1, ..., K} and \sum_{k=1}^{K} m_k = m. Assume that the random failure times of components of the same type are exchangeable (De Finetti 1974). Due to the arbitrary ordering of the components in the state vector, components of the same type can be grouped together, leading to a state vector that can be written as x = (x^1, x^2, ..., x^K), with x^k = (x^k_1, x^k_2, ..., x^k_{m_k}) the sub-vector representing the states of the components of type k.

The survival signature for such a system, denoted by Φ(l_1, ..., l_K), with 0 ≤ l_k ≤ m_k for k = 1, ..., K, is defined as the probability for the event that the system functions given that precisely l_k of its m_k components of type k function, for each k ∈ {1, ..., K} (Coolen & Coolen-Maturi 2012).

There are \binom{m_k}{l_k} state vectors x^k with \sum_{i=1}^{m_k} x^k_i = l_k. Let S^k_{l_k} denote the set of these state vectors for components of type k and let S_{l_1,...,l_K} denote the set of all state vectors for the whole system for which \sum_{i=1}^{m_k} x^k_i = l_k, k = 1, ..., K. We also introduce the notation l = (l_1, ..., l_K). Due to the exchangeability assumption for the failure times of the m_k components of type k, all the state vectors x^k ∈ S^k_{l_k} are equally likely to occur, hence (Coolen & Coolen-Maturi 2012)

\Phi(l) = \left[ \prod_{k=1}^{K} \binom{m_k}{l_k}^{-1} \right] \sum_{x \in S_{l_1,\ldots,l_K}} \phi(x)

Let C^k_t ∈ {0, 1, ..., m_k} denote the number of components of type k in the system that function at time t > 0. Then, for system failure time T_S,

P(T_S > t) = \sum_{l_1=0}^{m_1} \cdots \sum_{l_K=0}^{m_K} \Phi(l) \, P\left( \bigcap_{k=1}^{K} \{ C^k_t = l_k \} \right)

There are no restrictions on dependence of the failure times of components of different types, as the probability P(\bigcap_{k=1}^{K} \{C^k_t = l_k\}) can take any form of dependence into account; for example, one can include common-cause failures quite straightforwardly into this approach (Coolen & Coolen-Maturi 2015). However, there is a substantial simplification if one assumes that the failure times of components of different types are independent, and even more so if one assumes that the failure times of components of type k are conditionally independent and identically distributed with CDF F_k(t). With these assumptions, we get

P(T_S > t) = \sum_{l_1=0}^{m_1} \cdots \sum_{l_K=0}^{m_K} \Phi(l) \prod_{k=1}^{K} \binom{m_k}{l_k} [F_k(t)]^{m_k - l_k} [1 - F_k(t)]^{l_k}    (1)

The main advantage of the survival signature, in line with this property of the signature for systems with a single type of components (Samaniego 2007), is that the information about the system structure is fully separated from the information about functioning of the components, which simplifies related statistical inference as well as considerations of optimal system design. In particular for study of system reliability over time, with the structure of the system, and hence the survival signature, not changing, this separation also enables relatively straightforward statistical inferences where even the use of imprecise probabilistic methods (Augustin, Coolen, de Cooman, & Troffaes 2014, Coolen & Utkin 2011) is quite straightforward. Such methods have the advantage that imprecision for the system survival function reflects the amount of information available. The next two sections briefly discuss such methods of statistical inference for the system failure time. First we show an application of generalized Bayesian methods, with a set of prior distributions instead of a single prior distribution. This is followed by a brief discussion and application of nonparametric predictive inference (Coolen 2011), a frequentist statistical method which is based on relatively few assumptions, enabled through the use of imprecise probabilities, and which does not require the use of prior distributions. The paper ends with a brief discussion of research challenges, particularly with regard to upscaling the survival signature methodology for application to large-scale real-world systems and networks.

2 IMPRECISE BAYESIAN INFERENCE

The reliability of a system, for which the survival signature is available, is quite straightforwardly quantified through its survival function, as shown in the previous section. We briefly consider a scenario where we have test data that enable learning about the reliability of the components of different types in the system, where we assume independence of the failure times of components of different types. The numbers of components in the system, of each type, that are functioning at time t, denoted by C^k_t for k = 1, ..., K, are the random quantities of main interest. One attractive statistical method to learn about these random quantities from test data is provided by the Bayesian framework of statistics, which can be



applied with the assumption of a parametric distribution for the component failure times (Walter, Graham, & Coolen 2015) or in a nonparametric manner (Aslett, Coolen, & Wilson 2015). We briefly illustrate the latter approach.

Assume that there are m_k components of type k in the system, and we are interested in the probability distribution of C^k_t. Suppose that n_k components of the same type k were tested; these are not the components that are in the system, but their failure times are assumed to be exchangeable with those in the system. We assume that for all tested components the failure time has been observed; let s^k_t denote the number of these components that still functioned at time t. A convenient and basic model for C^k_t is the Binomial distribution, where the probability of success, that is a component still to be functioning at time t, can, in the Bayesian framework, be conveniently modelled as a random quantity with a Beta prior distribution. Different to the standard parameterization for the Beta distribution, we define a Beta prior distribution through parameters n^{(0)}_{k,t} and y^{(0)}_{k,t}, with as interpretations a pseudocount of components and the expected value of the success probability, respectively. Hence, these parameters can be interpreted in the sense that the prior distribution represents beliefs reflecting the same information as would result from observing n^{(0)}_{k,t} components of which n^{(0)}_{k,t} y^{(0)}_{k,t} still function at time t (Walter 2013). Doing this leads to straightforward updating, using the test information consisting of observations of n_k components of which s^k_t were still functioning at time t. The updating results in a similar Beta distribution as the prior, but now with parameter values n^{(n)}_{k,t} = n^{(0)}_{k,t} + n_k and y^{(n)}_{k,t} the weighted average of y^{(0)}_{k,t} and s^k_t / n_k, with weights proportional to n^{(0)}_{k,t} and n_k, respectively. This leads to the posterior predictive distribution (Walter, Aslett, & Coolen 2016)

P(C^k_t = l_k \mid s^k_t) = \binom{m_k}{l_k} \frac{B(l_k + n^{(n)}_{k,t} y^{(n)}_{k,t},\; m_k - l_k + n^{(n)}_{k,t}(1 - y^{(n)}_{k,t}))}{B(n^{(n)}_{k,t} y^{(n)}_{k,t},\; n^{(n)}_{k,t}(1 - y^{(n)}_{k,t}))}

This model can also relatively straightforwardly be used with a set of Beta prior distributions rather than a single one, a generalization fitting in the theory of imprecise probability (Augustin, Coolen, de Cooman, & Troffaes 2014). At each value of t one calculates the infimum and supremum of the probability P(C^k_t = l_k | s^k_t) over the set of prior parameters, with n^{(0)}_{k,t} ∈ [\underline{n}^{(0)}_{k,t}, \overline{n}^{(0)}_{k,t}] and y^{(0)}_{k,t} ∈ [\underline{y}^{(0)}_{k,t}, \overline{y}^{(0)}_{k,t}], with the bounds of these intervals chosen to reflect a priori available knowledge and its limitations. The use of such prior sets, with only an interval of possible values specified for each parameter, provides much flexibility for modelling prior beliefs and indeterminacy, together with interesting ways in which the corresponding sets of posterior (predictive) distributions and related inferences can vary. Most noticeably, this model enables conflict between prior beliefs and data to be shown through increased imprecision, that is a difference between upper and lower probabilities for an event of interest (Walter 2013). We illustrate the use of this model, together with the survival signature, for a small system in Example 1, without attention to such prior-data conflict; further details on this will be presented elsewhere (Walter, Aslett, & Coolen 2016).

Example 1

As a small example, consider the system with three types of components presented in Figure 1. The survival signature of this system is given in Table 1, where all cases with l_3 = 0 have been omitted as the system cannot function if the component of Type 3 does not function, hence Φ(l_1, l_2, 0) = 0 for all (l_1, l_2).
For component types 1 and 2, we consider a near-noninformative set of prior survival functions. For components of type 3, we consider an informative set of prior survival functions as given in Table 2. This set could result from eliciting prior survival probabilities at times t = 0, 1, 2, 3, 4, 5 only, and using those values to deduce such prior probabilities for all other values of t without further assumptions. These prior assumptions, together with sets of posterior survival functions, are illustrated in Figure 3 (presented at the end of the paper); test data for components of type 1 and 2 are taken as { } and { }, respectively. For components of type 3 test data are taken as { }, which are well in line with expectations according to the set of prior distributions.
The posterior sets of survival functions for each component type and for the whole system show considerably smaller imprecision than the corresponding prior sets.

Figure 1. System with 3 types of components.
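To make the updating rule above concrete, the following sketch (our illustration, not part of the original paper; all function names are ours) evaluates the posterior predictive distribution as a Beta-Binomial probability mass function, and scans the corner points of the prior-parameter intervals as a rough stand-in for the infimum and supremum, which in general require optimisation over the full intervals:

```python
from math import comb, exp, lgamma

def log_beta(a, b):
    # log of the Beta function B(a, b)
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def posterior_predictive(l, m, n0, y0, n, s):
    """P(C_t = l | data): probability that l of the m system components of
    one type function at time t, after testing n exchangeable components of
    which s still functioned at t; n0 is the prior pseudocount and y0 the
    prior expected success probability."""
    nn = n0 + n                      # n^(n): posterior pseudocount
    yn = (n0 * y0 + s) / nn          # y^(n): weighted average of y0 and s/n
    return comb(m, l) * exp(log_beta(l + nn * yn, m - l + nn * (1.0 - yn))
                            - log_beta(nn * yn, nn * (1.0 - yn)))

def predictive_bounds(l, m, n0_bounds, y0_bounds, n, s):
    # crude lower/upper predictive probabilities: evaluate at the corners of
    # [n0_lower, n0_upper] x [y0_lower, y0_upper]; the true infimum/supremum
    # may lie in the interior of the intervals, so this is only illustrative
    vals = [posterior_predictive(l, m, n0, y0, n, s)
            for n0 in n0_bounds for y0 in y0_bounds]
    return min(vals), max(vals)
```

For instance, with m = 4 system components, a single prior (n0, y0) = (2, 0.5) and test data (n, s) = (10, 7), the probabilities over l = 0, ..., 4 sum to one, as they must for a precise predictive distribution.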



This is mainly due to the low prior strength intervals we chose for this example, namely [\underline{n}^{(0)}_{1,t}, \overline{n}^{(0)}_{1,t}] = [\underline{n}^{(0)}_{2,t}, \overline{n}^{(0)}_{2,t}] = [1, 2] and [\underline{n}^{(0)}_{3,t}, \overline{n}^{(0)}_{3,t}] = [1, 4] for all t. We see that posterior lower and upper survival functions drop at those times t when there is a failure time in the test data, or a drop in the prior survival probability bounds. Note that the lower bound for the prior system survival function is zero for all t due to the prior lower bound of zero for type 1 components, and for the system to function at least two components of type 1 must function. A further reason why the imprecision reduces substantially in this example is that the data do not conflict with the prior beliefs. With these sets of prior distributions such prior-data conflict can only really occur for components of type 3, as such conflict logically requires at least reasonably strong prior beliefs to be taken into account through the set of prior distributions. If test failure times for the components of type 3 were unexpectedly small or large, the imprecision in the lower and upper posterior survival functions for this component would increase, with a similar effect on the corresponding overall lower and upper system survival functions. A detailed analysis illustrating this effect will be presented elsewhere (Walter, Aslett, & Coolen 2016).

Table 1. Survival signature for the system in Figure 1 for cases with l_3 = 1.

l_1  l_2  Φ(l_1, l_2, 1)      l_1  l_2  Φ(l_1, l_2, 1)
0    0    0                   0    1    0
1    0    0                   1    1    0
2    0    1/3                 2    1    2/3
3    0    1                   3    1    1
4    0    1                   4    1    1

Table 2. Lower and upper prior functioning probability bounds for component type 3 in the system of Figure 1.

t ∈                          (0, 1)   (1, 2)   (2, 3)   (3, 4)   (4, 5)
\underline{y}^{(0)}_{3,t}    0.625    0.375    0.250    0.125    0.010
\overline{y}^{(0)}_{3,t}     0.999    0.875    0.500    0.375    0.250

3 NONPARAMETRIC PREDICTIVE INFERENCE

Nonparametric Predictive Inference (NPI) (Coolen 2011) is a frequentist statistical framework based on relatively few assumptions and considering events of interest which are explicitly in terms of one or more future observations. NPI can be considered suitable if there is hardly any knowledge about the random quantity of interest, other than the data, which we assume consist of n observations, or if one does not want to use such further information, e.g. to study effects of additional assumptions underlying other statistical methods. NPI uses lower and upper probabilities, also known as imprecise probabilities, to quantify uncertainty (Augustin, Coolen, de Cooman, & Troffaes 2014) and has strong consistency properties from the frequentist statistics perspective (Augustin & Coolen 2004, Coolen 2011). NPI provides a solution to some explicit goals formulated for objective (Bayesian) inference, which cannot be obtained when using precise probabilities (Coolen 2006), and it never leads to results that are in conflict with inferences based on empirical probabilities. NPI for system survival functions, using the survival signature, was recently presented (Coolen, Coolen-Maturi, & Al-nefaiee 2014) and is briefly summarized here.

We now present NPI lower and upper survival functions for the failure time T_S of a system consisting of multiple types of components, using the survival signature combined with NPI for Bernoulli data (Coolen 1998). This enables the NPI method to be applied to, in principle, all systems. The failure times of components of different types are assumed to be independent. It must be emphasized that the NPI framework does not assume an underlying population distribution in relation to random quantities, and therefore also not that these are conditionally independent given some probability distribution. In fact, NPI explicitly takes the inter-dependence of multiple future observations into account. This requires a somewhat different approach for dealing with imprecise probabilities to that presented for the imprecise Bayesian approach in the previous section.

NPI will be used for learning about the components of a specific type in the system, from data consisting of failure times for components that are exchangeable with these. We assume therefore that such data are available, for example resulting from testing or previous use of such components. It is assumed that failure times are available for all tested components. As in the previous section, let n_k, for k ∈ {1, ..., K}, denote the number of components of type k for which test failure data are available, and let s^k_t denote the number of these components which still function at time t.

The NPI lower survival function is derived as follows. Remember that C^k_t denotes the number of components of type k in the system which function at time t, where it is assumed that failure ends the functioning of a component. Under the



assumptions for the NPI approach (Coolen 1998), we derive the following lower bound for the survival function

P(T_S > t) \geq \sum_{l_1=0}^{m_1} \cdots \sum_{l_K=0}^{m_K} \Phi(l) \prod_{k=1}^{K} \underline{D}(C^k_t = l_k)

where

\underline{D}(C^k_t = l_k) = \overline{P}(C^k_t \leq l_k) - \overline{P}(C^k_t \leq l_k - 1) = \binom{n_k + m_k}{n_k}^{-1} \binom{s^k_t + l_k - 1}{s^k_t - 1} \binom{n_k - s^k_t + m_k - l_k}{n_k - s^k_t}

In this expression, \overline{P} denotes the NPI upper probability for Bernoulli data (Coolen 1998). For each component type k, the function \underline{D} ensures that maximum possible probability, corresponding to NPI for Bernoulli data (Coolen 1998), is assigned to the event C^k_t = 0, so \underline{D}(C^k_t = 0) = \overline{P}(C^k_t = 0). Then, \underline{D}(C^k_t = 1) is defined by putting the maximum possible remaining probability mass, from the total probability mass available for the event C^k_t \leq 1, to the event C^k_t = 1. This is achieved by \underline{D}(C^k_t = 1) = \overline{P}(C^k_t \leq 1) - \overline{P}(C^k_t = 0). This argument is continued, by assigning for increasing l_k the maximum possible remaining probability mass \underline{D}(C^k_t = l_k). As the survival signature is increasing in l_k for coherent systems, as assumed in this paper, and the resulting \underline{D} is a precise probability distribution, the right-hand side of the inequality above is indeed a lower bound and it is the maximum possible lower bound. As such, it is the NPI lower probability for the event T_S > t, giving the NPI lower survival function for the system failure time (for t > 0)

\underline{P}(T_S > t) = \sum_{l_1=0}^{m_1} \cdots \sum_{l_K=0}^{m_K} \Phi(l) \prod_{k=1}^{K} \underline{D}(C^k_t = l_k)

The corresponding NPI upper survival function for T_S is similarly derived, using the upper bound

P(T_S > t) \leq \sum_{l_1=0}^{m_1} \cdots \sum_{l_K=0}^{m_K} \Phi(l) \prod_{k=1}^{K} \overline{D}(C^k_t = l_k)

where

\overline{D}(C^k_t = l_k) = \underline{P}(C^k_t \leq l_k) - \underline{P}(C^k_t \leq l_k - 1) = \binom{n_k + m_k}{n_k}^{-1} \binom{s^k_t + l_k}{s^k_t} \binom{n_k - s^k_t + m_k - l_k - 1}{n_k - s^k_t - 1}

In this expression, \underline{P} denotes the NPI lower probability for Bernoulli data (Coolen 1998). This construction ensures that minimum possible weight is given to small values of C^k_t, resulting in the NPI upper survival function for the system failure time

\overline{P}(T_S > t) = \sum_{l_1=0}^{m_1} \cdots \sum_{l_K=0}^{m_K} \Phi(l) \prod_{k=1}^{K} \overline{D}(C^k_t = l_k)

We illustrate this NPI method for system reliability using the survival signature in Example 2 (Coolen, Coolen-Maturi, & Al-nefaiee 2014).

Example 2

Consider the system with K = 2 types of components as presented in Figure 2. The survival signature for this system is presented in Table 3; it is easily verified by checking all possible combinations of the specific components of each type which function or not.

Figure 2. System with 2 types of components.

Table 3. Survival signature of the system in Figure 2.

l_1  l_2  Φ(l_1, l_2)      l_1  l_2  Φ(l_1, l_2)
0    0    0                2    0    0
0    1    0                2    1    0
0    2    0                2    2    4/9
0    3    0                2    3    6/9
1    0    0                3    0    1
1    1    0                3    1    1
1    2    1/9              3    2    1
1    3    3/9              3    3    1

To illustrate NPI for the system survival time, suppose that n_1 = 2 components exchangeable with those of type 1 and n_2 = 2 components exchangeable with those of type 2 were tested. First suppose that failure times t_1^2 ≤ t_1^1 < t_2^2 ≤ t_2^1 were observed, with t_j^k the j-th ordered failure time of a component of type k. The resulting NPI lower and upper survival functions for the system failure time T_S are specified in Table 4.

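As a numerical check of these formulas, the following sketch (ours, not part of the original paper) hard-codes the survival signature of Figure 2 as read from Table 3 and evaluates the NPI lower and upper system survival functions for given test data summaries (s^1_t, s^2_t):

```python
from math import comb

# Survival signature of the system in Figure 2, transcribed from Table 3:
# Phi[l1][l2], with m1 = m2 = 3 components of each type.
Phi = [
    [0, 0, 0,   0  ],
    [0, 0, 1/9, 3/9],
    [0, 0, 4/9, 6/9],
    [1, 1, 1,   1  ],
]

def binom(a, b):
    # binomial coefficient with the conventions needed by the NPI formulas:
    # C(-1, -1) = 1, and C(a, b) = 0 whenever b < 0, a < 0 or b > a otherwise
    if a == -1 and b == -1:
        return 1
    if b < 0 or a < 0 or b > a:
        return 0
    return comb(a, b)

def D_lower(l, m, n, s):
    # probability mass put on {C_t = l} for the NPI lower survival function
    return binom(s + l - 1, s - 1) * binom(n - s + m - l, n - s) / comb(n + m, n)

def D_upper(l, m, n, s):
    # probability mass put on {C_t = l} for the NPI upper survival function
    return binom(s + l, s) * binom(n - s + m - l - 1, n - s - 1) / comb(n + m, n)

def system_survival_bounds(s1, s2, m1=3, m2=3, n1=2, n2=2):
    lower = sum(Phi[l1][l2] * D_lower(l1, m1, n1, s1) * D_lower(l2, m2, n2, s2)
                for l1 in range(m1 + 1) for l2 in range(m2 + 1))
    upper = sum(Phi[l1][l2] * D_upper(l1, m1, n1, s1) * D_upper(l2, m2, n2, s2)
                for l1 in range(m1 + 1) for l2 in range(m2 + 1))
    return lower, upper
```

With no test failures observed yet (s^1_t = s^2_t = 2), `system_survival_bounds(2, 2)` reproduces the first-interval values (0.553, 1) of Table 4, and with s^1_t = 1, s^2_t = 0 it reproduces the fourth-interval values (0.100, 0.458) of the first data ordering.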


Figure 3. Prior and posterior sets of survival functions for the system in Figure 1 and its three component types. The
component failure times, that form the test data, are denoted with tick marks near the time axis.

Table 4 also gives the results for the case with the test failure times ordered as t_1^1 ≤ t_1^2 < t_2^1 ≤ t_2^2.

Table 4. Lower and upper survival functions for the system in Figure 2 and two data orderings.

Ordering t_1^2 ≤ t_1^1 < t_2^2 ≤ t_2^1:

t                   \underline{P}(T_S > t)   \overline{P}(T_S > t)
(0, t_1^2]          0.553                    1
(t_1^2, t_1^1]      0.458                    1
(t_1^1, t_2^2]      0.148                    0.553
(t_2^2, t_2^1]      0.100                    0.458
(t_2^1, ∞)          0                        0.148

Ordering t_1^1 ≤ t_1^2 < t_2^1 ≤ t_2^2:

t                   \underline{P}(T_S > t)   \overline{P}(T_S > t)
(0, t_1^1]          0.553                    1
(t_1^1, t_1^2]      0.230                    0.667
(t_1^2, t_2^1]      0.148                    0.553
(t_2^1, t_2^2]      0                        0.230
(t_2^2, ∞)          0                        0.148

For the ordering t_1^2 ≤ t_1^1 < t_2^2 ≤ t_2^1, in the first interval in Table 4 we have not yet seen a failure in the test data, so the NPI upper probability that the system will function is equal to one, which is logical as we base the inferences on the data with few additional assumptions. In the second interval, one failure of type 2 has occurred but we do not have any evidence from the data against the possibility that a component of type 1 will certainly function at times in this interval, so the NPI upper survival function remains one. In the fourth interval, both type 2 components have failed but only one component of type 1 has failed. In this interval, to consider the lower survival function the system is effectively reduced to a series system consisting of three components of type 1, with one success and one failure as data, denoted by (2, 1). As such a series system only functions if all three components function, the NPI lower survival function within this fourth interval is equal to \underline{S}_{T_S}(t) = (1/3)(2/4)(3/5) = 0.100, which follows by sequential reasoning, using that, based on n observations consisting of s successes and n - s failures, denoted as data (n, s), the NPI lower probability for the next observation to be a success is equal to s/(n + 1) (Coolen 1998). The NPI lower probability for the first component to function, given test data (2, 1), is equal to 1/3. Then the second component is considered, conditional on the first component functioning, which combines with the test data to two out of three components observed



(or assumed) to be functioning, so combined data (3, 2), hence this second component will also function with NPI lower probability 2/4. Similarly, the NPI lower probability for the third component to function, conditional on functioning of the first two components in the system, so with combined data (4, 3), is equal to 3/5. In the final interval, we are beyond the failure times of all the tested components, so we no longer have evidence in favour of the system to function, so \underline{S}_{T_S}(t) = 0, but the system might of course still function as reflected by \overline{S}_{T_S}(t) = 0.148.
For the second case in Table 4, with data ordering t_1^1 ≤ t_1^2 < t_2^1 ≤ t_2^2, we have \overline{S}_{T_S}(t) = 0.667 in the second interval, where one failure of type 1 has occurred in the test data. In the fourth interval, both tested components of type 1 have failed, leading to \underline{S}_{T_S}(t) = 0. Both of these values are directly related to the required functioning of the left-most component in Figure 2.

4 DISCUSSION

The survival signature is a powerful and quite basic concept. As such, further generalizations are conceptually easy; for example, one can straightforwardly generalize the survival signature to multi-state systems such that it again summarizes the structure function in a manner that is sufficient for a range of uncertainty quantifications for the system reliability. The survival signature can also be used with a generalization of the system structure function where the latter is a probability instead of a binary function, or even an imprecise probability. This enables uncertainty of system functioning for given states of its components to be taken into account, which may be convenient, for example, to take uncertain demands or environments for the system into consideration. In this paper, we only considered test data with observed failure times for all tested components. If test data also contain right-censored observations, this can also be dealt with, both in the imprecise Bayesian and NPI approaches (Walter, Graham, & Coolen 2015, Coolen & Yan 2004, Maturi 2010); more information about NPI is available from www.npi-statistics.com. This generalization is further relevant as, instead of assuming availability of test data, it allows us to take process data for the actual components in a system into account while this system is operating, hence enabling inference on the remaining time until system failure.
Upscaling the survival signature to large real-world systems and networks, consisting of thousands of components, is a major challenge. However, even for such systems the fact that one only needs to derive the survival signature once for a system is an advantage, and also the monotonicity of the survival signature for coherent systems is very useful if one can only derive it partially. For small to medium-sized systems and networks, the survival signature is particularly easy to compute using the ReliabilityTheory R package (Aslett 2016b), available from www.louisaslett.com. Using this package it is straightforward to express your system in terms of an undirected graphical structure, after which a single call to the function computeSystemSurvivalSignature suffices. The function will compute all of the cut sets of the system and perform the combinatorial analysis, returning a table which contains the survival signature just as in Tables 1 and 3. For example, computation of the survival signature for the system in Figure 1 is achieved with 3 simple commands. Full instructions and some worked examples are available within the package. There are numerous other functions in the package, enabling computation of: the legacy system signature (Samaniego 2007); the continuous-time Markov chain representation of repairable systems; as well as numerous inference algorithms for Bayesian inference on the system signature using only system-level data (Aslett 2013).
The survival signature enables some interesting applications which would otherwise be intractably difficult. For example, often a system designer may consider the design (structure) of their system to be a trade secret and so be unwilling to release it to component manufacturers, while at the same time component manufacturers are frequently unwilling to release anything more than summary figures for components, e.g. mean time between failures. These two opposing goals lead to a situation in which it would seem unrealistic to achieve a full probabilistic reliability assessment and to honour the privacy requirements of all parties. However,



recent work (Aslett 2016a) makes use of the survival signature to allow cryptographically secure evaluation of the system reliability function, where the functional form resulting from the survival signature decomposition in Equation (1) is crucial to enabling encrypted computation using so-called homomorphic encryption schemes (Aslett, Esperança, & Holmes 2015). The equivalent decomposition in terms of the structure function leads to difficulties in encrypted computation, so that this application may be intractable without use of the survival signature.

ACKNOWLEDGEMENTS

The authors wish to thank Professor Radim Briš for his kind invitation to present this work at the ICAMER 2016 conference. Louis Aslett was supported by the i-like project (EPSRC grant reference number EP/K014463/1). Gero Walter was supported by the DINALOG project "Coordinated Advanced Maintenance and Logistics Planning for the Process Industries" (CAMPI).

REFERENCES

Aslett, L. (2013). MCMC for Inference on Phase-type and Masked System Lifetime Models. Ph.D. thesis, Trinity College Dublin.
Aslett, L. (2016a). Cryptographically secure multiparty evaluation of system reliability. Pending journal submission.
Aslett, L. (2016b). ReliabilityTheory: Tools for structural reliability analysis. R package.
Aslett, L., F. Coolen, & S. Wilson (2015). Bayesian inference for reliability of systems and networks using the survival signature. Risk Analysis 35, 1640-1651.
Aslett, L., P. Esperança, & C. Holmes (2015). A review of homomorphic encryption and software tools for encrypted statistical machine learning. Technical report, University of Oxford.
Augustin, T. & F. Coolen (2004). Nonparametric predictive inference and interval probability. Journal of Statistical Planning and Inference 124, 251-272.
Augustin, T., F. Coolen, G. de Cooman, & M. Troffaes (2014). Introduction to Imprecise Probabilities. Chichester: Wiley.
Barlow, R. & F. Proschan (1975). Statistical Theory of Reliability and Life Testing. New York: Holt, Rinehart and Winston.
Coolen, F. (1998). Low structure imprecise predictive inference for Bayes' problem. Statistics & Probability Letters 36, 349-357.
Coolen, F. (2006). On nonparametric predictive inference and objective Bayesianism. Journal of Logic, Language and Information 15, 21-47.
Coolen, F. (2011). Nonparametric predictive inference. In M. Lovric (Ed.), International Encyclopedia of Statistical Science, pp. 968-970. Springer.
Coolen, F. & T. Coolen-Maturi (2012). On generalizing the signature to systems with multiple types of components. In W. Zamojski, J. Mazurkiewicz, J. Sugier, T. Walkowiak, and J. Kacprzyk (Eds.), Complex Systems and Dependability, pp. 115-130. Springer.
Coolen, F. & T. Coolen-Maturi (2015). Predictive inference for system reliability after common-cause component failures. Reliability Engineering and System Safety 135, 27-33.
Coolen, F., T. Coolen-Maturi, & A. Al-nefaiee (2014). Nonparametric predictive inference for system reliability using the survival signature. Journal of Risk and Reliability 228, 437-448.
Coolen, F. & L. Utkin (2011). Imprecise reliability. In M. Lovric (Ed.), International Encyclopedia of Statistical Science, pp. 649-650. Springer.
Coolen, F. & K. Yan (2004). Nonparametric predictive inference with right-censored data. Journal of Statistical Planning and Inference 126, 25-54.
De Finetti, B. (1974). Theory of Probability. Chichester: Wiley.
Maturi, T. (2010). Nonparametric Predictive Inference for Multiple Comparisons. Ph.D. thesis, Durham University.
Samaniego, F. (2007). System Signatures and their Applications in Engineering Reliability. New York: Springer.
Walter, G. (2013). Generalized Bayesian Inference under Prior-Data Conflict. Ph.D. thesis, Ludwig Maximilian University of Munich.
Walter, G., L. Aslett, & F. Coolen (2016). Bayesian nonparametric system reliability with sets of priors. In submission.
Walter, G., A. Graham, & F. Coolen (2015). Robust Bayesian estimation of system reliability for scarce and surprising data. In L. Podofillini, B. Sudret, B. Stojadinović, E. Zio, and W. Kröger (Eds.), Safety and Reliability of Complex Engineered Systems: ESREL 2015, pp. 1991-1998. CRC Press.



Applied Mathematics in Engineering and Reliability – Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

Parallel algorithms of system availability evaluation

M. Kvassay, V. Levashenko & E. Zaitseva
University of Žilina, Žilina, Slovakia
ABSTRACT: There are different methods for the calculation of indices and measures in reliability analysis. Some of the most used indices are system availability/reliability and importance measures. In this paper new algorithms for the calculation of system availability and some importance measures are developed based on parallel procedures. The principal step in the development of these algorithms is the construction of matrix procedures for the calculation of these indices and measures.

1 INTRODUCTION

The estimation of system reliability is based on different indices and measures. As a rule, the computational complexity of the calculation of these indices and measures depends on the system dimension. One way to decrease this computational complexity is the use of parallel procedures (Green et al. 2011, Lingfeng & Singh 2009). Kucharev et al. (1990) showed that a parallel procedure can be designed based on a matrix interpretation of the computational procedure. Therefore, the transformation of traditional computational procedures for the calculation of indices and measures into matrix form is an important step in the design of parallel algorithms. In this paper we consider such a transformation for the calculation of the system availability and of some Importance Measures (IMs). The initial representation and mathematical description of the investigated system in this case must be defined in matrix or vector form. There are several typical forms of system representation in reliability analysis: the structure function, the Markovian model, the Monte-Carlo model, etc. The structure function can be considered as a Boolean function (Barlow & Proschan 1975). This interpretation allows using a vector representation of a Boolean function for the structure function too. Therefore, in this paper a parallel algorithm for the calculation of the system availability is developed based on the structure function of the system. Structure-function-based algorithms are also used in the development of parallel algorithms for the calculation of IMs.

Importance analysis enables one to estimate the impact of a system element on the system failure or functioning. Consideration is given both to the structural distinctions of the system and to the failure/operability probabilities of its elements. By the system operability is meant its ability to function at a fixed time instant (Barlow & Proschan 1975). Analysis of element importance is used in system design, diagnosis, and optimization. Many IMs are used today to allow for various aspects of the impact of system elements on its failure or operability.

2 A SYSTEM REPRESENTATION BY STRUCTURE FUNCTION

2.1 The structure function

Consider a system of n components. Every component state is designated as xi (i = 1, ..., n), where xi = 1 is the working state of the component and xi = 0 indicates the component failure. The probability of failure is defined for every system component as qi = Pr{xi = 0}; therefore, the probability of the working state of the i-th component is pi = Pr{xi = 1} = 1 − qi.

The structure function of the system unambiguously defines the dependence of the system state on the system component states (Zaitseva 2012):

φ(x) = φ(x1, ..., xn): {0, 1}^n → {0, 1}. (1)

There are two groups of system types: coherent and non-coherent systems. A coherent system satisfies the following assumptions (Beeson & Andrews 2003, Fricks & Trivedi 2003):

a. The system and its components have two states: up (working) and down (failed);
b. All system components are relevant to the system;
c. The system structure function is monotone non-decreasing: φ(x1, ..., 1i, ..., xn) ≥ φ(x1, ..., 0i, ..., xn);
d. The failure and repair rates of the components are constant;
e. Repaired components are as good as new.

The system is non-coherent if one or more of these assumptions do not hold.
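Assumption (c) can be checked mechanically for small n by enumerating all state vectors. The sketch below (Python; the helper `is_monotone` is illustrative, not part of the paper) tests the structure function of the series-parallel system used in the later examples:

```python
from itertools import product

def is_monotone(phi, n):
    """Check coherence assumption (c): phi must not decrease when any
    single component state changes from 0 (failed) to 1 (working)."""
    for x in product((0, 1), repeat=n):
        for i in range(n):
            if x[i] == 0:
                repaired = x[:i] + (1,) + x[i + 1:]
                if phi(repaired) < phi(x):
                    return False
    return True

# Structure function of the simple series-parallel system: x1 AND (x2 OR x3)
phi = lambda x: x[0] & (x[1] | x[2])
print(is_monotone(phi, 3))                     # True
print(is_monotone(lambda x: 1 - x[0], 1))      # False (non-coherent)
```

The check costs n·2^n evaluations of φ, so it is only a diagnostic for small systems.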


CH27_Zaitseva.indd 215 3/15/2016 12:52:52 PM


Consider typical forms of the representation of the structure function (1). The function (1) is a Boolean function. This permits using the mathematical apparatus of Boolean algebra for the representation of this function and the investigation of its properties. There are several representations of the structure function (1) from the point of view of Boolean algebra: a truth table, a Binary Decision Diagram (BDD) and an analytical representation (formula) can be used for the initial description of the structure function according to (Brown & Vranesic 2000).

2.2 Table and matrix representation of the structure function

A truth table includes a list of the combinations of 1s and 0s assigned to the binary variables, and a column that shows the value of the function for each binary combination (Fig. 1). The number of rows in the truth table is 2^n, where n is the number of variables. The binary combinations of variables in the truth table can be ordered from 0 to 2^n − 1 (lexicographical order). This fixed order of variables allows considering the column of function values only (Fig. 1). Such a representation of a Boolean function is named the truth table column vector, or truth vector.

Therefore, the structure function (1) of a system can be represented by the truth table or by the truth vector unambiguously. For example, consider the trivial system of three components (n = 3) in Fig. 2. Its truth table is shown in Table 1 and the truth vector of this system is x = [0 0 0 0 0 1 1 1]^T. Consider the truth vector element x(5) = 1. The state vector for this function value is defined by the transformation of the parameter i = 5 into binary representation: i = 5 → (i1, i2, i3) = (1, 0, 1). Therefore, the truth vector element x(5) = 1 agrees with the function value φ(1, 0, 1) = 1.

2.3 Binary decision diagram

A BDD is a directed acyclic graph representation of a Boolean function. This graph has two terminal nodes, labelled 0 and 1. Each non-terminal node is labelled with a function variable xi and has two outgoing edges: the left edge is labelled 0 and the other outgoing edge is labelled 1. A BDD is a widely used tool in reliability analysis; some methods for reliability analysis based on this tool are discussed in the papers (Zaitseva et al. 2015, Chang et al. 2004). The terminal nodes of the BDD correspond to the system state, and the outgoing edges of non-terminal nodes are interpreted as component states. For example, the BDD of the structure function of the series-parallel system in Fig. 2 is shown in Fig. 3.

2.4 Analytical representation of the structure function

The analytical representation can have different forms. As a rule, the function is defined by a formula with the operators AND, OR and NOT. For example, the structure function of the system in Fig. 2 can be presented by the formula:

φ(x) = AND(x1, OR(x2, x3)). (2)

But there are other analytical representations of a Boolean function, and one of them is the arithmetic polynomial form A(x) (Kucharev et al. 1990):

φ(x) = A(x) = Σ_{k=0}^{2^n − 1} a(k) x1^k1 x2^k2 ... xn^kn
            = a(0) + a(1) xn + a(2) x_{n−1} + a(3) x_{n−1} xn + ... + a(2^n − 2) x1 ... x_{n−1} + a(2^n − 1) x1 ... xn, (3)

where a(k) are the polynomial coefficients; k1 ... ki ... kn is the binary representation of the parameter k (k = 0, 1, ..., 2^n − 1); and xi^ki = xi if ki = 1, while xi^ki = 1 if ki = 0.

Table 1. Truth table of the structure function.

Values of variables x1, x2, x3    Function value φ(x)
000                               0
001                               0
010                               0
011                               0
100                               0
101                               1
110                               1
111                               1

Figure 1. Truth vector of the structure function.
Figure 2. A simple series-parallel system.
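The truth table and truth vector of the example system, and the agreement between the element x(5) and the state vector (1, 0, 1), can be reproduced in a few lines (Python sketch; the helper name `truth_vector` is illustrative):

```python
from itertools import product

def truth_vector(phi, n):
    """List the values of phi over all state vectors taken in
    lexicographical order of (x1, ..., xn)."""
    return [phi(x) for x in product((0, 1), repeat=n)]

phi = lambda x: x[0] & (x[1] | x[2])   # system of Fig. 2, eq. (2)
tv = truth_vector(phi, 3)
print(tv)                              # [0, 0, 0, 0, 0, 1, 1, 1]

# Element x(5): the binary representation of i = 5 is (1, 0, 1),
# so x(5) must equal phi(1, 0, 1).
i, n = 5, 3
state = tuple((i >> (n - 1 - j)) & 1 for j in range(n))
print(state, tv[i] == phi(state))      # (1, 0, 1) True
```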



This arithmetic polynomial form (3) can be written in matrix form as (Kucharev et al. 1990):

x = A_n a, (4)

where x = [x(0) x(1) ... x(2^n − 1)]^T is the truth vector of the function (1); a = [a(0) a(1) ... a(2^n − 1)]^T is the vector of the coefficients a(k) of the polynomial (3); and A_n is a matrix calculated by the recurrent equation:

A_n = A_1 ⊗ A_{n−1},  A_1 = | 1 0 |
                            | 1 1 |, (5)

where ⊗ is the symbol of the Kronecker product and the elements of the matrix A_1 are calculated as:

a_st = s^t, (6)

for s, t ∈ {0, 1} and 0^0 = 1.

The polynomial coefficients a(k) can be calculated by the inverse matrix procedure (Kucharev et al. 1990):

a = Ã_n x, (7)

where Ã_n is the inverse matrix of A_n and:

Ã_n = Ã_1 ⊗ Ã_{n−1},  Ã_1 = |  1 0 |
                            | −1 1 |. (8)

For example, consider the trivial system of three components (n = 3) in Fig. 2. The truth vector of its structure function is x = [0 0 0 0 0 1 1 1]^T. Compute the coefficients of the polynomial (3) according to equations (7) and (8):

a = Ã_3 x = |  1  0  0  0  0  0  0  0 | |0|   | 0|
            | −1  1  0  0  0  0  0  0 | |0|   | 0|
            | −1  0  1  0  0  0  0  0 | |0|   | 0|
            |  1 −1 −1  1  0  0  0  0 | |0| = | 0|
            | −1  0  0  0  1  0  0  0 | |0|   | 0|
            |  1 −1  0  0 −1  1  0  0 | |1|   | 1|
            |  1  0 −1  0 −1  0  1  0 | |1|   | 1|
            | −1  1  1 −1  1 −1 −1  1 | |1|   |−1| (9)

Let us describe the arithmetic polynomial form A(x) for the structure function of the system in Fig. 2. The arithmetic polynomial form for n = 3 is:

φ(x) = A(x) = Σ_{k=0}^{7} a(k) x1^k1 x2^k2 x3^k3
            = a(0) x1^0 x2^0 x3^0 + a(1) x1^0 x2^0 x3^1 + a(2) x1^0 x2^1 x3^0 + a(3) x1^0 x2^1 x3^1
            + a(4) x1^1 x2^0 x3^0 + a(5) x1^1 x2^0 x3^1 + a(6) x1^1 x2^1 x3^0 + a(7) x1^1 x2^1 x3^1
            = a(0) + a(1) x3 + a(2) x2 + a(3) x2 x3 + a(4) x1 + a(5) x1 x3 + a(6) x1 x2 + a(7) x1 x2 x3. (10)

Using the vector of coefficients (9), a = [0 0 0 0 0 1 1 −1]^T, this form is:

φ(x) = A(x) = x1 x3 + x1 x2 − x1 x2 x3. (11)

Figure 3. BDD of the structure function of the simple series-parallel system of Fig. 2.
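Equations (4)-(9) translate directly into a few lines of numpy. The sketch below builds A_n and its inverse by repeated Kronecker products and recovers the coefficient vector a = [0 0 0 0 0 1 1 −1]^T of the example system (helper name `kron_power` is illustrative):

```python
from functools import reduce
import numpy as np

A1 = np.array([[1, 0], [1, 1]])          # eq. (5)
A1_inv = np.array([[1, 0], [-1, 1]])     # eq. (8)

def kron_power(M, n):
    """n-fold Kronecker power, as in the recurrences (5) and (8)."""
    return reduce(np.kron, [M] * n)

x = np.array([0, 0, 0, 0, 0, 1, 1, 1])   # truth vector of the Fig. 2 system

a = kron_power(A1_inv, 3) @ x            # eq. (7): a = A~3 x
print(a)                                 # [ 0  0  0  0  0  1  1 -1]

# Round trip through eq. (4): x = A3 a
print(np.array_equal(kron_power(A1, 3) @ a, x))   # True
```

Because A_n is a Kronecker power, the product A_n a can also be evaluated as n independent stages of 2×2 butterflies, which is exactly the structure the parallel flow diagrams of the paper exploit.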


3 THE SYSTEM AVAILABILITY

Every system component is characterized by the probability pi (the availability of component i) and the probability qi (its unavailability):

pi = Pr{xi = 1}, qi = Pr{xi = 0}, pi + qi = 1. (12)

When the system structure function and the availabilities of all system components are known, the system availability/unavailability can be computed as follows (Barlow & Proschan 1975, Schneeweiss 2009):

A = Pr{φ(x) = 1}, U = Pr{φ(x) = 0}, A + U = 1. (13)

The availability is one of the most important characteristics of any system. It can also be used to compute other reliability characteristics, e.g. the mean time to failure, the mean time to repair, etc. (Beeson & Andrews 2003, Schneeweiss 2009).

There is an interesting property of the arithmetic polynomial form: replacing the variables xi by the probabilities pi of the components working yields the probabilistic form of the structure function, which is the system availability.

Theorem 1. The system availability (the probability of the working state) is calculated by the arithmetical polynomial form (3) in which the Boolean variables xi are replaced by the relevant probabilities of the component working states:

A = Σ_{k=0}^{2^n − 1} a(k) p1^k1 p2^k2 ... pn^kn, (14)

where a(k) are the coefficients of the polynomial (3); pi (i = 1, ..., n) is the probability of the working state of the i-th component; and pi^ki = 1 if ki = 0, while pi^ki = pi if ki = 1.

Proof. According to (Kucharev et al. 1990), the arithmetical polynomial form is a canonical form of Boolean function representation. Therefore, all terms a(k) x1^k1 x2^k2 ... xn^kn of the polynomial form (3) correspond to mutually independent events. Variables of a Boolean function are interpreted as independent events according to (Kumar & Breuer 1981). Therefore, in the case of probabilistic analysis, the Boolean function variables can be replaced by the probabilities of these events.

For example, compute the availability of the system in Fig. 2 based on the arithmetical polynomial form (11) of its structure function. According to Theorem 1, the Boolean variables of this form are replaced by the probabilities pi of the system components functioning:

A = p1 p3 + p1 p2 − p1 p2 p3. (15)

For comparison, compute this system availability in the traditional way, based on the AND-OR representation (2) of the structure function:

A = Pr{AND(x1, OR(x2, x3))} = p1 (p2 + p3 − p2 p3) = p1 p2 + p1 p3 − p1 p2 p3. (16)

We can see that the availability of the system computed by the new method and in the traditional way are equal. But the calculation of the system availability based on the matrix procedure is a formal background for the development of parallel algorithms (Kucharev et al. 1990). For example, the flow diagram for the calculation of the coefficients of the probabilistic form used to compute the system availability is designed based on the transformation (9) and is presented in Fig. 4.

Figure 4. Calculation of the coefficients of the probabilistic form of the structure function.
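The substitution of Theorem 1 and the cross-check against direct enumeration of eq. (13) can be sketched as follows (Python; the probabilities are those used later in Table 3):

```python
from itertools import product

a = [0, 0, 0, 0, 0, 1, 1, -1]       # coefficients from eq. (9)
p = [0.90, 0.70, 0.65]              # component availabilities (Table 3)
n = len(p)

# Eq. (14): substitute pi for xi when ki = 1, and 1 when ki = 0
A = 0.0
for k, ak in enumerate(a):
    term = ak
    for j in range(n):
        if (k >> (n - 1 - j)) & 1:
            term *= p[j]
    A += term
print(round(A, 4))                  # 0.8055

# Cross-check by direct enumeration of eq. (13)
phi = lambda x: x[0] & (x[1] | x[2])
A_enum = 0.0
for x in product((0, 1), repeat=n):
    pr = 1.0
    for pj, xj in zip(p, x):
        pr *= pj if xj else 1 - pj
    A_enum += phi(x) * pr
print(abs(A - A_enum) < 1e-12)      # True
```

The polynomial route touches only the nonzero coefficients, while the enumeration touches all 2^n state vectors; this gap is what the matrix/parallel formulation exploits for larger n.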



4 IMPORTANCE ANALYSIS

The availability is one of the most important characteristics of a system, but it does not permit identifying the influence of individual system components on the proper work of the system. For this purpose, there exist other measures, known as Importance Measures (IMs), which are used in the part of reliability analysis known as importance analysis. The comprehensive study of these measures has been performed in work [4]. IMs have been widely used for identifying system weaknesses and supporting system improvement activities from the design perspective. With known values of the IMs of all components, proper actions can be taken on the weakest component to improve the system availability at minimal cost or effort. There exist a lot of IMs, but the most often used are the Structural Importance (SI), Birnbaum's Importance (BI) and the Criticality Importance (CI) (Table 2).

Table 2. Basic importance measures.

Importance measure    Meaning
SI    The SI concentrates only on the topological structure of the system. It is defined as the relative number of situations in which a given component is critical for the system activity.
BI    The BI of a given component is defined as the probability that the component is critical for the system work.
CI    The CI of a given component is calculated as the probability that the system failure has been caused by the component failure, given that the system is failed.

Different mathematical methods and algorithms can be used to calculate these indices. One of them is the Direct Partial Boolean Derivatives (DPBDs), which were introduced for importance analysis in the paper (Moret & Thomason 1984). In the paper (Zaitseva & Levashenko 2013), the mathematical background of the application of DPBDs has been considered, but an efficient algorithm for the computation of DPBDs has not been proposed. In this paper, a new parallel algorithm for the calculation of a DPBD is developed.

As an alternative, the algorithms of (Zaitseva et al. 2015) can be considered. The authors of that paper proposed algorithms for the calculation of a DPBD based on the representation of the structure function by a Binary Decision Diagram (BDD), and these algorithms include a parallel procedure too. However, they need a special transformation of the initial representation of the structure function into a BDD, and this increases the computational complexity.

4.1 Direct partial Boolean derivatives

A DPBD is a part of Logical Differential Calculus (Moret & Thomason 1984, Bochmann & Posthoff 1981). In the analysis of Boolean functions, a DPBD allows identifying situations in which the change of a Boolean variable results in a change of the value of the Boolean function. In reliability analysis, the system is defined by the structure function (1), which is a Boolean function; therefore, a DPBD can be used for the analysis of the structure function too. In terms of reliability analysis, a DPBD allows investigating the influence of a change of a structure function variable (= component state) on a change of the function value (= system state). Therefore, a DPBD of the structure function permits indicating the component states (state vectors) for which the change of one component state causes a change of the system state (availability). These vectors agree with the system boundary states (Moret & Thomason 1984, Zaitseva & Levashenko 2013).

The DPBD ∂φ(j→j̄)/∂xi(a→ā) of the structure function φ(x) with respect to the variable xi is defined as follows (Bochmann & Posthoff 1981):

∂φ(j→j̄)/∂xi(a→ā) = {φ(a_i, x) ≡ j} ∧ {φ(ā_i, x) ≡ j̄}, (17)

where φ(a_i, x) = φ(x1, x2, ..., x_{i−1}, a, x_{i+1}, ..., xn); a, j ∈ {0, 1}; and ≡ is the symbol of the equivalence operator (logical bi-conditional).

Clearly, there exist four DPBDs for every variable xi (Bochmann & Posthoff 1981, Zaitseva & Levashenko 2013):

∂φ(1→0)/∂xi(1→0), ∂φ(0→1)/∂xi(0→1), ∂φ(1→0)/∂xi(0→1), ∂φ(0→1)/∂xi(1→0).

In reliability analysis, the first two DPBDs can be used to identify the situations in which a failure (repair) of component i results in the system failure (repair). Similarly, the second two DPBDs identify the situations when the system failure (repair) is caused by the i-th component repair (failure); these two derivatives are nonzero only for non-coherent systems (Zaitseva & Levashenko 2013).

For example, consider the system of three components (n = 3) in Fig. 2. The influence of the failure of the first component on the system can be analyzed by the DPBD ∂φ(1→0)/∂x1(1→0). This derivative has three nonzero values, for the state vectors x = (x1, x2, x3): (1→0, 1, 1), (1→0, 0, 1) and (1→0, 1, 0). Therefore, the failure of the first component causes a system breakdown when both the second and the third component are working, or when one of them is working. If the second and the third components are failed, the system is not functioning anyway and, therefore, a failure of the first component does not influence the system availability.
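Definition (17) can be evaluated by brute force over the 2^(n−1) state vectors of the remaining components. A small sketch (Python; the function name is illustrative) that reproduces the three boundary states found above:

```python
from itertools import product

def dpbd_1to0(phi, i, n):
    """Truth vector of the DPBD d phi(1->0) / d xi(1->0), per eq. (17):
    entry 1 where phi = 1 with xi = 1 AND phi = 0 with xi = 0."""
    vec = []
    for rest in product((0, 1), repeat=n - 1):
        up = rest[:i] + (1,) + rest[i:]
        down = rest[:i] + (0,) + rest[i:]
        vec.append(int(phi(up) == 1 and phi(down) == 0))
    return vec

phi = lambda x: x[0] & (x[1] | x[2])   # system of Fig. 2
print(dpbd_1to0(phi, 0, 3))  # [0, 1, 1, 1]: (x2,x3) in {(0,1),(1,0),(1,1)}
```

This direct evaluation is the baseline against which the matrix-based parallel procedure of Section 4.3 is compared.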



4.2 Importance measures and Direct Partial Boolean Derivatives

In reliability analysis, the structure function and the system components are used instead of the Boolean function and the Boolean variables, respectively. Using this coincidence, the authors of the paper (Zaitseva & Levashenko 2013) have developed techniques for the analysis of the influence of individual system components on the system failure/functioning using DPBDs. Let us summarize the definitions of the IMs (Table 2) for the system failure based on DPBDs.

The SI of a component is defined as the relative number of situations in which the component is critical for the system failure. Therefore, the SI of a component can be defined through the DPBD ∂φ(1→0)/∂xi(1→0) as the relative number of state vectors for which this DPBD has nonzero values (Zaitseva & Levashenko 2013, Zaitseva 2012):

SI_i = ρ_i / 2^{n−1}, (18)

where ρ_i is the number of nonzero values of the DPBD ∂φ(1→0)/∂xi(1→0) and 2^{n−1} is the size of this DPBD.

Similarly, the modified SI, which takes into account the necessary condition for the component to be critical, can be defined as follows (Zaitseva & Levashenko 2013, Zaitseva 2012):

MSI_i = ρ_i / ρ_i(1), (19)

where ρ_i(1) is the number of state vectors for which φ(1_i, x) = 1.

The BI of component i defines the probability that the i-th system component is critical for the system failure. Using DPBDs, this IM can be defined as the probability that the DPBD is nonzero (Zaitseva & Levashenko 2013):

BI_i = Pr{∂φ(1→0)/∂xi(1→0) = 1}. (20)

A lot of IMs are based on the BI, e.g. the CI and the Barlow-Proschan, Bayesian and redundancy measures. For example, the CI is calculated as follows (Kuo & Zhu 2012):

CI_i = BI_i · q_i / U, (21)

where q_i is the failure probability of the component (12) and U is the system unavailability.

To illustrate the calculation of all these IMs using DPBDs, consider the system in Fig. 2. The values of the IMs for this system are computed in Table 3. According to these IMs, the first component has the most influence on the system failure from the point of view of the system structure, because the values of the SI, MSI and BI are greatest for this component. The CI is maximal for the second and third components and, therefore, indicates the first component as non-important once the probability of its failure is taken into account (this probability is minimal, q1 = 0.10): the second and third components contribute to the system failure with the greatest probability.

Table 3. IMs for the system in Fig. 2.

Component                             x1      x2      x3
Probability of component state, p_i   0.90    0.70    0.65
SI_i                                  0.75    0.25    0.25
MSI_i                                 1.00    0.50    0.50
BI_i                                  0.90    0.32    0.27
CI_i                                  0.46    0.49    0.49

So, DPBDs are one of the possible mathematical approaches that can be used in importance analysis, and they allow us to calculate all of the often-used IMs (Table 2). The mathematical background of their application for the definition of IMs has been considered in the papers (Zaitseva & Levashenko 2013, Zaitseva 2012). In this paper, a new algorithm for the calculation of DPBDs based on a parallel procedure is developed.
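Equations (18)-(21) can be evaluated directly from the critical state vectors of each component. The sketch below (Python, illustrative helper names) reproduces the values of Table 3 up to rounding (for example BI_1 = 0.895, CI_2 ≈ 0.486):

```python
from itertools import product

phi = lambda x: x[0] & (x[1] | x[2])   # structure function of Fig. 2
p = [0.90, 0.70, 0.65]                 # component availabilities (Table 3)
n = len(p)

def prob(x):
    """Probability of the state vector x under component independence."""
    out = 1.0
    for pj, xj in zip(p, x):
        out *= pj if xj else 1 - pj
    return out

A = sum(phi(x) * prob(x) for x in product((0, 1), repeat=n))
U = 1 - A                              # system unavailability, eq. (13)

rows = []
for i in range(n):
    rests = list(product((0, 1), repeat=n - 1))
    ins = lambda r, s: r[:i] + (s,) + r[i:]   # put state s at position i
    crit = [r for r in rests if phi(ins(r, 1)) and not phi(ins(r, 0))]
    SI = len(crit) / 2 ** (n - 1)                          # eq. (18)
    MSI = len(crit) / sum(phi(ins(r, 1)) for r in rests)   # eq. (19)
    BI = sum(prob(ins(r, 1)) / p[i] for r in crit)         # eq. (20)
    CI = BI * (1 - p[i]) / U                               # eq. (21)
    rows.append((SI, MSI, BI, CI))
    print(f"x{i+1}: SI={SI:.2f} MSI={MSI:.2f} BI={BI:.3f} CI={CI:.3f}")
```

Dividing `prob(ins(r, 1))` by p[i] leaves the probability of the other components' states only, which is exactly the event whose probability eq. (20) asks for.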



4.3 Parallel algorithm for the calculation of Direct Partial Boolean Derivatives

One possible way for the formal development of parallel algorithms is to transform the mathematical background into matrix algebra. Therefore, let us consider the DPBD (17) in a matrix interpretation. As the first step of such a transformation, the initial data (the structure function) has to be presented as a vector or a matrix. In the matrix algorithm for the calculation of a DPBD, the structure function is defined by its truth vector x (Fig. 1). The truth vector of the DPBD (the derivative vector) is calculated from the truth vector of the structure function as:

∂x(j→j̄)/∂xi(a→ā) = (P^(i,a) x ≡ j) ∧ (P^(i,ā) x ≡ j̄), (22)

where P^(i,l) is the differentiation matrix of size 2^{n−1} × 2^n, defined as:

P^(i,l) = M^(i−1) ⊗ l ⊗ M^(n−i), (23)

where M^(w) is the identity matrix of size 2^w × 2^w; l is the row vector that selects the entries of the truth vector with xi = l, i.e. l = [0 1] for l = 1 and l = [1 0] for l = 0 under the lexicographical ordering; and ⊗ is the Kronecker product (Kucharev et al. 1990).

Note that the calculation of (P^(i,a) x ≡ j) and (P^(i,ā) x ≡ j̄) in (22) agrees with the definition of the state vectors for which the function value is j and j̄, respectively. The matrices P^(i,a) and P^(i,ā) indicate the variable xi with the values a and ā, respectively, and the operation AND (∧) integrates these conditions. The DPBD ∂φ(j→j̄)/∂xi(a→ā) does not depend on the i-th variable (Bochmann & Posthoff 1981); therefore, the derivative vector (22) has size 2^{n−1}.

Consider an example of the calculation of the derivative vector ∂x(1→0)/∂x1(1→0) for the structure function with the truth vector x = [0 0 0 0 0 1 1 1]^T (the truth vector of the structure function of the system depicted in Fig. 2). According to (22), the rule for the calculation of this derivative is:

∂x(1→0)/∂x1(1→0) = (P^(1,1) x ≡ 1) ∧ (P^(1,0) x ≡ 0) = [0 1 1 1]^T, (24)

where the matrices P^(1,1) and P^(1,0) are defined based on the rule (23) as:

P^(1,1) = M^(0) ⊗ [0 1] ⊗ M^(2) and P^(1,0) = M^(0) ⊗ [1 0] ⊗ M^(2).

The derivative vector (24) indicates three state vectors x = (x1, x2, x3): (1→0, 1, 1), (1→0, 0, 1) and (1→0, 1, 0). Therefore, the failure of the first component causes a system breakdown when the second and the third components are working or when one of them is working. This result agrees with the result calculated by the definition (17) for the DPBD ∂φ(1→0)/∂x1(1→0).

A matrix procedure can be transformed into a parallel procedure according to (Kucharev et al. 1990); therefore, equation (22) can be interpreted as a parallel procedure. For example, the flow diagrams for the calculation of the derivative vectors ∂x(1→0)/∂x1(1→0), ∂x(1→0)/∂x2(1→0) and ∂x(1→0)/∂x3(1→0) for the structure function of the system in Fig. 2 according to (22) are presented in Fig. 5. These diagrams illustrate the possibility of using parallel procedures for the calculation of DPBDs.

Figure 5. Calculation of DPBDs based on parallel procedures.
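The matrix rule (22)-(23) is a one-liner with numpy Kronecker products. In the sketch below, the selection vectors are [0 1] for xi = 1 and [1 0] for xi = 0; this assignment is inferred from the lexicographical ordering of the truth vector and the example (24), so treat it as an assumption of the sketch rather than the paper's exact typography:

```python
import numpy as np

def M(w):
    """Identity matrix of size 2^w x 2^w (the M^(w) of eq. (23))."""
    return np.eye(2 ** w, dtype=int)

def P(i, l, n):
    """Differentiation matrix of eq. (23), size 2^(n-1) x 2^n.
    Selects the entries of the truth vector that have xi = l."""
    sel = np.array([[0, 1]]) if l == 1 else np.array([[1, 0]])
    return np.kron(np.kron(M(i - 1), sel), M(n - i))

x = np.array([0, 0, 0, 0, 0, 1, 1, 1])   # truth vector of the Fig. 2 system

# Eq. (22) for the derivative vector d x(1->0) / d x1(1->0):
d = ((P(1, 1, 3) @ x) == 1) & ((P(1, 0, 3) @ x) == 0)
print(d.astype(int))                     # [0 1 1 1], as in eq. (24)
```

The three branches of Fig. 5 correspond to evaluating this pair of matrix-vector products independently for i = 1, 2, 3; each branch needs only two products and one elementwise AND, with no data dependence between branches.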
5 CONCLUSION

In this paper, a new algorithm based on parallel procedures is proposed for the calculation of the system availability and of the most often-used IMs (Table 2). The algorithm for the computation of the system availability is based on the use of the probabilistic form of the structure function from the point of view of Boolean algebra; the parallel procedure is used for the calculation of the coefficients of this form (14). The algorithm for the calculation of the IMs is based on the use of the DPBDs (17); the parallel procedure allows computing the values of the derivative (22). The computational complexity of the proposed algorithm is lower in comparison with the algorithm based on the typical analytical calculation (Fig. 6).

The proposed algorithm for the calculation of IMs based on parallel procedures can be used in many practical applications. The principal step in these applications is the representation of the investigated object by the structure function. As a rule, the structure function is defined based on the analysis of the structure of the investigated object.

Figure 6. Computation time for the calculation of DPBDs based on analytical and parallel procedures.

ACKNOWLEDGEMENT

This work was partially supported by the grant VEGA 1/0498/14 and by the grant of the 7th RTD Framework Program No. 610425 (RASimAs).

REFERENCES

Barlow, R.E., Proschan, F. (1975). Importance of system components and fault tree events, Stochastic Processes and their Applications, 3(2): 153–173.



Beeson, S., Andrews, J.D. (2003). Importance measure for non-coherent-system analysis, IEEE Trans. Reliability, 52(3): 301–310.
Bochmann, D., Posthoff, C. (1981). Binary Dynamic Systems. Berlin: Akademie-Verlag.
Brown, S., Vranesic, Z. (2000). Fundamentals of Digital Logic with VHDL Design. McGraw-Hill.
Chang, Y., Amari, S., Kuo, S. (2004). Computing system failure frequencies and reliability importance measures using OBDD, IEEE Trans. Computers, 53(1): 54–68.
Fricks, R., Trivedi, K. (2003). Importance analysis with Markov chains, Proc. 49th IEEE Annual Reliability and Maintainability Symp., Tampa, USA.
Green, R.C., Lingfeng Wang, Alam, M., Singh, C. (2011). Intelligent and parallel state space pruning for power system reliability analysis using MPI on a multicore platform, Proc. IEEE PES Conf. on Innovative Smart Grid Technologies (ISGT).
Kucharev, G.A., Shmerko, V.P., Zaitseva, E.N. (1990). Multiple-Valued Data Processing Algorithms and Systolic Processors. Minsk: Nauka and Technica.
Kumar, S.K., Breuer, M. (1981). Probabilistic aspects of Boolean switching functions via a new transform, Journal of the Association for Computing Machinery, 28(3): 502–520.
Kuo, W., Zhu, X. (2012). Importance Measures in Reliability, Risk and Optimization. John Wiley & Sons, Ltd.
Lingfeng, W., Singh, C. (2009). Multi-deme parallel genetic algorithm in reliability analysis of composite power systems, Proc. IEEE PowerTech Conf.
Moret, B.M.E., Thomason, M.G. (1984). Boolean difference techniques for time-sequence and common-cause analysis of fault-trees, IEEE Trans. Reliability, R-33: 399–405.
Schneeweiss, W.G. (2009). A short Boolean derivation of mean failure frequency for any (also non-coherent) system, Reliability Engineering and System Safety, 94(8): 1363–1367.
Zaitseva, E. (2012). Importance analysis of a multi-state system based on multiple-valued logic methods. In: Lisnianski, A. and Frenkel, I. (eds), Recent Advances in System Reliability: Signatures, Multi-state Systems and Statistical Inference. London: Springer, pp. 113–134.
Zaitseva, E., Levashenko, V. (2013). Importance analysis by logical differential calculus, Automation and Remote Control, 74(2): 171–182.
Zaitseva, E., Levashenko, V., Kostolny, J. (2015). Application of logical differential calculus and binary decision diagram in importance analysis, Eksploatacja i Niezawodnosc – Maintenance and Reliability, 17(3): 379–388.



Advanced mathematical methods for engineering

Advanced methods to solve partial differential equations

AMER16_Book.indb 223 3/15/2016 11:28:17 AM


Applied Mathematics in Engineering and Reliability – Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

A note on the Cauchy-Riemann equation on a class of convex
domains of finite and infinite type in ℂ^2

L.K. Ha
Faculty of Mathematics and Computer Science, University of Science,
Vietnam National University, Ho Chi Minh City, Vietnam

ABSTRACT: In this paper, we provide an extension of the Hölder regularity result by Range in (Range 1978) to a certain class of convex domains of strict finite type and of infinite type in ℂ^2. A new notion of type is introduced for arbitrary convex domains in ℂ^2 with smooth boundaries. This type generalizes the notion of strict finite type in the original theory (Range 1978) and also covers many cases of infinite type in which Range's method fails to apply.

1 INTRODUCTION

Let (z1, ..., zn) be the complex Euclidean coordinates of ℂ^n, with n ≥ 1, and let Ω ⊂ ℂ^n be a bounded domain. The Cauchy-Riemann operator on C^1(Ω)-functions is defined to be

∂̄u = Σ_{j=1}^{n} (∂u/∂z̄_j) dz̄_j,

where ∂/∂z̄_j = (1/2)(∂/∂x_j + √−1 ∂/∂y_j) with z_j = x_j + √−1 y_j, j = 1, ..., n.

One of the most fundamental and important problems in multidimensional complex analysis is to solve the Cauchy-Riemann equation

∂̄u = φ

for a given (0, 1)-form φ = Σ_{j=1}^{n} φ_j dz̄_j. In the complex plane, this problem is trivial (Hörmander 1990). In higher dimensional spaces, the solution of the ∂̄-equation is explicitly constructed and the regularity theory is also well understood on the unit ball (Rudin 1980). Moreover, the study of this problem is completely established on strongly pseudoconvex domains, which are the most beautiful domains in several complex variables, see (Hörmander 1965), (Henkin 1969), (Henkin and Romanov 1971), (Romanov 1976). Recently, some existence and regularity results have been proved on certain analytic convex domains in ℂ^n, see (Bruna and Castillo 1984), (Ahn and Cho 2003), (Fornaess et al. 2011), (Khanh 2013), (Ha et al. 2014). On general domains, the solvability and regularity of the ∂̄-equation are still open.

In the lecture of Michael Range (Range 1978), given at the International Conference in Cortona, Italy, 1976-1977, he proved the following facts: on the smooth boundary convex domain of strict finite type m (m = 1, 2, ...)

Ω_m = {(z1, z2) ∈ ℂ^2 : |z1|^{2m} + |z2|^2 < 1}, (1)

the Cauchy-Riemann equation ∂̄u = φ is solvable. Moreover, the solution u is Hölder continuous of order 1/(2m) whenever φ is a (0, 1)-form with C^1(Ω̄_m) coefficients. Here, ∂̄u is defined in the sense of distributions.

Also in this important lecture, he showed that on the infinite type smooth boundary convex domain

Ω_∞ = {(z1, z2) ∈ ℂ^2 : exp(1 + 1/s)·exp(−1/|z1|^s) + |z2|^2 − 1 < 0}, (2)

for 0 < s < 1, the Cauchy-Riemann equation, although solvable, admits no solution which is Hölder continuous of any positive order. Hence, it is reasonable to conjecture that the ∂̄-equation is solvable on Ω_∞ in some other, weaker Hölder class.

Recently, sup-norm estimates for the solution to the ∂̄-equation on Ω_∞ have been established by Fornaess-Lee-Zhang in (Fornaess et al. 2011) and by Khanh in (Khanh 2013). The main purpose of this paper is to give a positive and general answer to the above conjecture. The main method of this paper is based on a new proof in (Khanh 2013), (Ha et al. 2014).
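Since the main results below assume the compatibility condition ∂̄φ = 0 on the datum, it is worth recording what this condition amounts to in ℂ^2; the following computation is standard and not specific to this paper:

```latex
\bar\partial \varphi
  = \bar\partial\bigl(\varphi_1\, d\bar z_1 + \varphi_2\, d\bar z_2\bigr)
  = \Bigl(\frac{\partial \varphi_2}{\partial \bar z_1}
        - \frac{\partial \varphi_1}{\partial \bar z_2}\Bigr)\,
    d\bar z_1 \wedge d\bar z_2 ,
```

so in ℂ^2 the condition ∂̄φ = 0 reduces to the single scalar equation ∂φ_2/∂z̄_1 = ∂φ_1/∂z̄_2, understood in the sense of distributions when the coefficients are merely bounded.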



2 MAIN RESULT

Let Ω be a bounded domain in ℂ^2 with smooth boundary bΩ. Let ρ be a defining function for Ω; that means, ρ is a real-valued C^∞-function defined on a neighborhood U of bΩ such that

Ω ∩ U = {(z1, z2) ∈ U : ρ(z1, z2) < 0}

and dρ ≠ 0 on bΩ. Then Ω is said to be (analytically) convex if

Σ_{j,k=1}^{2} (∂²ρ/∂x_j ∂y_k)(x, y) a_j a_k ≥ 0 on bΩ,

for every (a1, a2) ≠ 0 with Σ_{j=1}^{2} a_j (∂ρ/∂x_j)(x, y) = 0 on bΩ.

The Leray map on bΩ is defined by

Φ(ζ, z) = (∂ρ/∂ζ_1)(ζ)(ζ_1 − z_1) + (∂ρ/∂ζ_2)(ζ)(ζ_2 − z_2)

for ζ ∈ bΩ. By the convexity of Ω, Re Φ(ζ, z) ≥ 0 for ζ ∈ bΩ and z ∈ Ω̄. It is well-known that for each ζ ∈ bΩ, the complex hypersurface {z : Φ(ζ, z) = 0} and the complex tangent space to bΩ at ζ are actually the same. Moreover, the Leray map has the following properties:

1. Φ is of C^1-class in (ζ, z).
2. Φ(ζ, ·) is holomorphic on ℂ^2.
3. |Φ(ζ, z)| ≥ A > 0 for z ∈ Ω, |ζ − z| ≥ c, for some constant c > 0.

Definition 2.1. Let Type be the set of all smooth, increasing functions F : [0, ∞) → [0, ∞) such that

1. F(0) = 0;
2. ∫_0^δ |ln F(r^2)| dr < ∞ for some δ > 0;
3. F(r)/r is increasing.

The convex domain Ω is said to admit a maximal type F ∈ Type at P ∈ bΩ if on the neighborhood {0 < |ζ − P| < c'}, for some 0 < c' ≤ c, we have

|Φ(ζ, z)| ≳ |ρ(z)| + |Im Φ(ζ, z)| + F(|ζ_1 − z_1|^2), (3)

for every ζ ∈ bΩ ∩ B(P, c') and z with |ζ − z| < c'. Here and in what follows, the notations ≲ and ≳ denote inequalities up to a positive constant, and ≈ means the combination of ≲ and ≳.

Remark 2.2.

1. Definition 2.1 is independent of the choice of holomorphic coordinates in a neighborhood of P and of the particular defining function ρ.
2. The domain Ω is called convex of maximal type F if it has the above properties at every point P ∈ bΩ, with a common function F. Actually, it follows by the compactness of bΩ that a common function F can be chosen for all boundary points P ∈ bΩ.

Example 2.1.
Let Ω_m = {(z1, z2) ∈ ℂ^2 : |z1|^{2m} + |z2|^2 < 1}. Then Ω_m is convex of maximal type F(t) = t^m, see (Range 1978).
Let Ω_∞ = {(z1, z2) ∈ ℂ^2 : exp(1 + 1/s)·exp(−1/|z1|^s) + |z2|^2 − 1 < 0}. Then, for 0 < s < 1, Ω_∞ is convex of maximal type F(t) = exp(−1/t^{s/2}) (up to a constant), see (Verdera 1984). Note that condition 2 of Definition 2.1 holds for this F since |ln F(r^2)| = r^{−s} is integrable near 0 exactly when s < 1.

Let f be an increasing function such that lim_{t→∞} f(t) = +∞. Following (Khanh 2013), (Ha and Khanh 2015), we recall the weak f-Hölder space on Ω:

Λ^f(Ω) = {u : ||u||_f := ||u||_∞ + sup_{z, z+h ∈ Ω, h ≠ 0} f(|h|^{−1})·|u(z + h) − u(z)| < ∞}.

When f(t) = t^α, for 0 < α < 1, we obtain the standard Hölder space Λ^α(Ω).

The main results of this paper are as follows.

Theorem 2.3. Let Ω be a bounded convex domain in ℂ^2 with smooth boundary bΩ. Let F ∈ Type and assume that Ω is convex of maximal type F. Then, for every (0, 1)-form φ whose coefficients belong to L^∞(Ω) and with ∂̄φ = 0 on Ω in the weak sense, there exists a function u ∈ L^∞(Ω) such that

∂̄u = φ

in the weak sense and ||u||_∞ ≲ ||φ||_∞.

Moreover, we also have

Theorem 2.4. Let Ω be a domain as in Theorem 2.3 and define

f(d^{−1}) := (∫_0^d F*(t)/t dt)^{−1},

where F* is the inverse of F.



Then, for every (0, 1)-form φ whose coefficients belong to L^∞(Ω) and ∂̄φ = 0 on Ω in the weak sense, there exists a function u ∈ Λ^f(Ω) such that

∂̄u = φ

in the weak sense and ‖u‖_f ≲ ‖φ‖_∞.

The proof of Theorem 2.3 is actually contained in the proof of Theorem 2.4, with easier computations. Hence, we omit the details of the proof of Theorem 2.3.

3 PROOF OF THE MAIN RESULT

The proof of Theorem 2.4 is separated into two parts. The first one is to recall briefly Henkin's construction of solutions to the ∂̄-equation; for more general definitions and properties, we refer to the excellent book by Chen and Shaw (Chen and Shaw 2001). The second one is to estimate all integral terms in this construction.

In the following definitions, only the convexity of Ω is required for defining Φ.

Definition 3.1 (Homotopy kernel for the ∂̄-solution on convex domains). For λ ∈ [0, 1], let us define:

w_j⁰(ζ, z) = (ζ̄_j − z̄_j)/|ζ − z|²  and  w_j¹(ζ, z) = (∂ρ/∂ζ_j)(ζ)/Φ(ζ, z), for j = 1, 2;

w_j(λ, ζ, z) = (1 − λ) w_j⁰(ζ, z) + λ w_j¹(ζ, z), for j = 1, 2;

η_{2,0}(λ, ζ, z) = ( w₁ ∂̄w₂ − w₂ ∂̄w₁ ) ∧ dζ₁ ∧ dζ₂,
η_{2,1}(λ, ζ, z) = ( w₁ ∂̄_z w₂ − w₂ ∂̄_z w₁ ) ∧ dζ₁ ∧ dζ₂,

where ∂̄ := ∂̄_ζ + d_λ.

Let us choose a differentiable triangulation {S_k}_{k=1}^{N} of the boundary bΩ, in which the simplices S_k are so small that the above constructions can be carried out for each S_k, with the functions Φ and w_j¹ depending on the index k. Then, the forms η^k_{2,0} and η^k_{2,1} are defined as the restrictions of η_{2,0} and η_{2,1} to S_k.

Theorem 3.2 (Existence). Let us define the linear operator T : C⁰_{(0,1)}(Ω̄) → C¹(Ω) as follows:

Tφ = (1/4π²) Σ_{k=1}^{N} ∫_{S_k × [0,1]} φ ∧ η^k_{2,0}.   (4)

Then, if ∂̄φ = 0 on Ω, we have

∂̄(Tφ) = φ on Ω.

Moreover, Tφ satisfies

‖Tφ‖_f ≲ ‖φ‖_{L^∞(Ω)}

for any f with 0 < f(d⁻¹) ≲ d⁻¹.

For the proof of the above theorem, we refer the reader to (Range 1978), (Bruna and Castillo 1984), (Range 1986) or (Chen and Shaw 2001).

Now, in order to prove the f-Hölder estimate for the first integral in (4), we recall the following General Hardy–Littlewood Lemma proved by Khanh (Khanh 2013).

Lemma 3.3. Let Ω be a bounded smooth domain in ℝⁿ and let ρ be a defining function of Ω. Let G : ℝ⁺ → ℝ⁺ be an increasing function such that G(t)/t is decreasing and ∫₀^d G(t)/t dt < ∞ for d > 0 small enough. If u ∈ C¹(Ω) is such that

|∇u(x)| ≲ G(|ρ(x)|)/|ρ(x)| for every x ∈ Ω,

then

f(|x − y|⁻¹) |u(x) − u(y)| < ∞

uniformly in x, y ∈ Ω, x ≠ y, where

f(d⁻¹) := ( ∫₀^d G(t)/t dt )⁻¹.

Hence, to complete the proof of Theorem 2.4, it is enough to prove the following result.

Proposition 3.4. For the above definitions and notations, we have

∫_{S_k × [0,1]} |∇_z η^k_{2,0}(λ, ζ, z)| ≲ √(F*(|ρ(z)|)) / |ρ(z)|,

and G(t) := √(F*(t)) satisfies the hypothesis of Lemma 3.3, where F* is the inverse of F.

Proof. For simplicity, we can drop the index k. From Definition 3.1, integrating in λ ∈ [0, 1], we have

∫_{S × [0,1]} |∇_z η_{2,0}(λ, ζ, z)| ≲ ∫_{S ∩ bΩ} ( 1/(|Φ(ζ, z)| · |ζ − z|²) + 1/(|Φ(ζ, z)|² · |ζ − z|) ) dσ(ζ),   (5)

where dσ is the surface measure of bΩ.

Since |Φ(ζ, z)| ≥ A > 0 for z fixed in Ω and |ζ − z| ≥ c, for some constant c > 0, it is enough to estimate the integral over S ∩ B(z, c). Based on Henkin's techniques, we re-introduce the following real coordinate system t = (t₁, t₂, t₃) ∈ ℝ² × [0, ∞):

t₁(z*) = t₂(z*) = 0, where z* ∈ bΩ satisfies |z − z*| = dist(z, bΩ),
t₃ = |Im Φ(ζ, z)|,



such that S ∩ B(z*, c) ⊂ {t : |t| ≤ R} and dσ|_{S ∩ B(z*,c)} ≲ dt₁dt₂dt₃. The existence of such a coordinate system follows from the implicit function theorem and the convexity, for |ρ(z)| and c sufficiently small.

Since |ζ − z| ≳ |t| and |Φ(ζ, z)| ≳ t₃ + |ρ(z)| + F(|t|²), we obtain

∫_{(S ∩ B(z*,c)) × [0,1]} |∇_z η_{2,0}(λ, ζ, z)|
  ≲ ∫_{|t| ≤ R} dt₁dt₂dt₃ / ( (t₃ + |ρ(z)| + F(|t|²)) (|t| + |ρ(z)|)² )   =: I₁(|ρ(z)|)   (6)
  + ∫_{|t| ≤ R} dt₁dt₂dt₃ / ( (t₃ + |ρ(z)| + F(|t|²))² |t| )   =: I₂(|ρ(z)|).

Some simple computations imply that

I₁(|ρ(z)|) ≲ |ln(F(|ρ(z)|))|² ≲ G(|ρ(z)|)/|ρ(z)|,   (7)

for any G satisfying Lemma 3.3.

On the other hand, we also have

I₂(|ρ(z)|) ≲ ∫₀^R dr / (|ρ(z)| + F(r²))
  = ∫₀^{√(F*(|ρ(z)|))} dr / (|ρ(z)| + F(r²)) + ∫_{√(F*(|ρ(z)|))}^R dr / (|ρ(z)| + F(r²)).

Since F(r)/r is increasing, we have

F(r²)/|ρ(z)| ≥ r²/F*(|ρ(z)|) for all r ≥ √(F*(|ρ(z)|)).

Hence,

∫_{√(F*(|ρ(z)|))}^R dr / (|ρ(z)| + F(r²)) ≲ √(F*(|ρ(z)|)) / |ρ(z)|.

It is easy to see that

∫₀^{√(F*(|ρ(z)|))} dr / (|ρ(z)| + F(r²)) ≤ √(F*(|ρ(z)|)) / |ρ(z)|.

Thus,

I₂(|ρ(z)|) ≲ √(F*(|ρ(z)|)) / |ρ(z)|.

The last step in this part is to check that the function G(t) := √(F*(t)) satisfies the conditions in Lemma 3.3. Then, by (7) we have

I₁(|ρ(z)|) + I₂(|ρ(z)|) ≲ √(F*(|ρ(z)|)) / |ρ(z)|,

and so u ∈ Λ^f(Ω), in which f(d⁻¹) := ( ∫₀^d √(F*(t))/t dt )⁻¹, for small d > 0.

Now, since F(t)/t is increasing and F(t) is increasing, |ln F(t²)| is decreasing for all 0 < t ≤ R, for some small R > 0. Thus, by the hypothesis (2) of F, we have

∫₀^{R'} |ln F(t²)| dt ≤ ∫₀^δ |ln F(t²)| dt < ∞

for all 0 < R' ≤ δ. As a consequence, √(F*(t)) |ln t|² is finite for all 0 < t ≤ F(R²) and lim_{t→0} √(F*(t)) |ln t|² = 0. These facts and the second hypothesis of F imply

∫₀^d √(F*(t))/t dt = ∫₀^{√(F*(d))} y (ln F(y²))′ dy = √(F*(d)) ln d − ∫₀^{√(F*(d))} ln F(y²) dy < ∞

for d > 0 small enough. Hence, we have the conclusion that u ∈ Λ^f(Ω).

REFERENCES

Ahn, H. & H.R. Cho (2000). Optimal Hölder and Lᵖ estimates for ∂̄_b on boundaries of convex domains of finite type. J. Math. Anal. Appl. 286(1), 281–294.
Bruna, J. & J. del Castillo (1984). Hölder and Lᵖ-estimates for the ∂̄-equation in some convex domains with real-analytic boundary. Math. Ann. 296(4), 527–539.
Chen, S.C. & M.C. Shaw (2001). Partial Differential Equations in Several Complex Variables. AMS/IP Studies in Advanced Mathematics, AMS.
Fornaess, J.E., L. Lee & Y. Zhang (2011). On supnorm estimates for ∂̄ on infinite type convex domains in ℂ². J. Geom. Anal. 21, 495–512.
Ha, L.K., T.V. Khanh & A. Raich (2014). Lᵖ-estimates for the ∂̄-equation on a class of infinite type domains. Int. J. Math. 25, 1450106 [15 pages].
Ha, L.K. & T.V. Khanh (2015). Boundary regularity of the solution to the Complex Monge-Ampère equation on pseudoconvex domains of infinite type. Math. Res. Lett. 22(2), 467–484.
Henkin, G.M. & A.V. Romanov (1971). Exact Hölder estimates for the solutions of the ∂̄-equation. Math. USSR Izvestija 5, 1180–1192.



Henkin, G.M. (1969). Integral representations of functions holomorphic in strictly-pseudoconvex domains and some applications. Math. USSR Sbornik 7(4), 597–616.
Henkin, G.M. (1970). Integral representations of functions in strictly-pseudoconvex domains and applications to the ∂̄-problem. Math. USSR Sbornik 11, 273–281.
Hörmander, L. (1965). L² estimates and existence theorems for the ∂̄ operator. Acta Math. 113, 89–125.
Hörmander, L. (1990). An Introduction to Complex Analysis in Several Variables. Third edition, Van Nostrand, Princeton, N.J.
Khanh, T.V. (2013). Supnorm and f-Hölder estimates for ∂̄ on convex domains of general type in ℂ². J. Math. Anal. Appl. 430, 522–531.
Range, R.M. (1978). On Hölder estimates for ∂̄u = f on weakly pseudoconvex domains. Proc. Inter. Conf. Cortona, Italy 1976–1977. Scuola Norm. Sup. Pisa, 247–267.
Range, R.M. (1986). Holomorphic Functions and Integral Representations in Several Complex Variables. Springer-Verlag, Berlin/New York.
Romanov, A.V. (1976). A formula and estimates for solutions of the tangential Cauchy-Riemann equation. Math. Sb. 99, 58–83.
Rudin, W. (1980). Function Theory in the Unit Ball of ℂⁿ. Springer-Verlag, New York.
Verdera, J. (1984). L∞-continuity of Henkin operators solving ∂̄ in certain weakly pseudoconvex domains of ℂ². Proc. Roy. Soc. Edinburgh 99, 25–33.



Applied Mathematics in Engineering and Reliability, Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

A review on global and non-global existence of solutions of source types of degenerate parabolic equations with a singular absorption: Complete quenching phenomenon

D.N. Anh & K.H. Van


Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam

J.I. Díaz
Instituto de Matemática Interdisciplinar, Universidad Complutense de Madrid, Madrid, Spain

ABSTRACT: We study the global and non-global existence of solutions of degenerate singular parabolic equations with sources. In the case of global existence, we prove that any solution must vanish identically after a finite time if either the initial data, the source term, or the measure of the domain is small enough.

1 INTRODUCTION

This paper studies the nonnegative solutions of source types of one-dimensional degenerate parabolic equations with a singular absorption

∂_t u − (|u_x|^{p−2} u_x)_x + u^{−β} χ_{u>0} = f(u, x, t) in I × (0, ∞),
u(x₁, t) = u(x₂, t) = 0, t ∈ (0, ∞),   (1)
u(x, 0) = u₀(x) in I,

where I = (x₁, x₂) is an open bounded interval in ℝ, β ∈ (0, 1), p > 2, and χ_{u>0} denotes the characteristic function of the set of points (x, t) where u(x, t) > 0, i.e.

χ_{u>0} = 1 if u > 0, and χ_{u>0} = 0 if u ≤ 0.

Note that the absorption term u^{−β} χ_{u>0} becomes singular when u is near 0, and we impose u^{−β} χ_{u>0} = 0 whenever u = 0. Throughout this paper, we always assume that f : ℝ × I × [0, ∞) → ℝ is a nonnegative function satisfying the following hypothesis:

(H) f ∈ C¹(ℝ × I × [0, ∞)), and f(0, x, t) = 0 for all (x, t) ∈ I × (0, ∞). There is a nonnegative real function h ∈ C(ℝ) such that f(u, x, t) ≤ h(u), for all (x, t) ∈ I × (0, ∞).

In the case of N dimensions and p = 2, equation (1) becomes

∂_t u − Δu + u^{−β} χ_{u>0} = f(u, x, t) in Ω × (0, ∞),
u = 0 on ∂Ω × (0, ∞),   (2)
u(x, 0) = u₀(x) in Ω.

Problem (2) can be considered as a limit of mathematical models describing enzymatic kinetics (see (Banks 1975)), or the Langmuir–Hinshelwood model of the heterogeneous chemical catalyst (see, e.g., (W. Strieder 1973) p. 68, (Díaz 1985), (Phillips 1987) and references therein). This case was studied in (Phillips 1987), (Kawohl 1996), (Levine 1993), (Dávila & Montenegro 2004), (Winkler 2007), and so forth. These authors focused on the existence of solutions and on their behavior. For example, D. Phillips (Phillips 1987) proved the existence of a solution of the Cauchy problem associated with (2) in the case f = 0. He also showed that any solution must quench after a finite time.

In (Dávila & Montenegro 2004), J. Dávila and M. Montenegro proved the existence of a solution of equation (2) if the source term f(u) is sub-linear, i.e.: |f(u)| ≤ C(u + 1), for u ≥ 0. Moreover, they also showed that the measure of the set {(x, t) ∈ Ω × (0, ∞) : u(x, t) = 0} is positive. In other words, the solution may exhibit the quenching behavior. Still in the sub-linear case, M. Montenegro (Montenegro 2011) considered equation (2) with the source term λf(u) instead of f(u, x, t). He showed that there exists a positive real number λ₀ so that if λ ∈ (0, λ₀), then any solution must vanish identically after a finite time.

Recently, problem (1) in N dimensions was considered by Giacomoni et al. (Giacomoni, Sauvy, & Shmarev 2014), with the source term f(u, x) satisfying a natural growth condition, i.e.:



(H1) f(u, x) ≤ μ.u^{q−1} + v,

with μ, v ≥ 0, and q ≥ 1. These authors proved a local existence result. Unfortunately, their proof of local existence of solutions is not correct. Thus, our first purpose is to prove the local existence of solutions of equation (1), even for the more general class of functions f(u, x, t) satisfying (H) instead of (H1) in (Giacomoni, Sauvy, & Shmarev 2014). For example, the function f(u, x, t) = (|x|²/(t + 1))(e^u − 1) satisfies (H), but it does not satisfy any natural growth condition of the form (H1).

We note that the assumption f(0, x, t) = 0 in (H) is a necessary condition for the existence of solutions. If it is violated, then equation (1) may have no solution. For instance, we will show at the end that equation (1) has no weak solution if f(u) = μ.u^{q−1} + v, for v > 0. Thus, this assumption seems to be the sufficient condition for the existence of solutions for a particular class of functions f(u, x, t) satisfying a certain growth condition.

The second purpose of this article is to study the existence and nonexistence of global solutions of equation (1) for the case where f satisfies a natural growth condition (H1). Let us first recall some classical results on the global and non-global existence of solutions of equation (1) without the singular absorption term:

∂_t u − (|u_x|^{p−2} u_x)_x = f(u, x, t) in I × (0, ∞),
u(x₁, t) = u(x₂, t) = 0, t ∈ (0, ∞),   (3)
u(x, 0) = u₀(x) in I.

For a simple introduction, we only discuss the case f(u, x, t) = λ.u^{q−1}, q > 1, λ > 0. For a more general class of f, we refer to (Levine 1990), (Zhao 1993), (Galaktionov 1994), (Galaktionov & Vazquez 2002), and references therein.

In (Tsutsumi 1973), M. Tsutsumi proved that if q < p, then problem (3) has global nonnegative solutions whenever the initial data u₀ belongs to a suitable Sobolev space. The case q ≥ p is quite delicate, in that there are both nonnegative global solutions and solutions which blow up in a finite time. Indeed, J. N. Zhao (Zhao 1993) showed that when q ≥ p, equation (3) has a global solution if the measure of I is small enough, and it has no global solution if the measure of I is large enough. The fact that the first eigenvalue of −Δ_p (denoted λ_I) decreases with increasing domain can also be used as an intuitive explanation of Zhao's result. In the critical case q = p, Y. Li and C. Xie (Li & Xie 2003) showed that if λ_I > λ, then equation (3) has a unique global bounded solution, while the unique solution of equation (3) blows up in a finite time provided λ_I < λ. We also note that the unique solution is globally bounded provided that λ = λ_I and the initial data satisfies u₀(x) ≤ ε.φ_I(x), for some ε > 0, where φ_I is the first eigenfunction corresponding to λ_I.

Roughly speaking, a weak solution of equation (1) is a sub-solution of equation (3). Thus, the strong comparison theorem implies that the global existence result holds for equation (1) provided either q < p, or q ≥ p and u₀ (resp. λ, the measure of I) is small enough. Going beyond this observation, we will show that any weak solution of equation (1) exists globally provided that either

i. q < p, or
ii. q ≥ p and u₀ (resp. the measure of I, λ) is small enough, or
iii. q = p, and λ − λ_I is sufficiently small.

Note that the result of (iii) is new, because the solutions exist globally even for λ > λ_I, while the unique solution of equation (3) blows up whenever λ > λ_I (compare to Theorem 2.2 and Theorem 3.1, (Giacomoni, Sauvy, & Shmarev 2014)).

The conclusion (iii) can be explained as follows. As mentioned above, we will prove an estimate for |u_x| involving a certain power of u:

|u_x(x, t)|^p ≤ C.u^{1−β}(x, t), for a.e. (x, t) ∈ I × (0, ∞).   (4)

Intuitively, inequality (4) says that the absorption u^{−β} χ_{u>0} strengthens the diffusion term against the effect of the source term. For this reason, the global existence result can be extended to the case λ > λ_I with λ − λ_I small. At the end, we will provide some numerical experiments in order to illustrate the difference between the solutions of equations (1) and (3).

The final goal of this paper is to consider the quenching phenomenon for solutions of equation (1), that is, that nonnegative solutions become extinct after a finite time. As is already known, in the case p = 2, f ≡ 0, any weak nonnegative solution of equation (1) vanishes identically after a finite time, even when beginning with large initial data; see e.g. (Phillips 1987), (Dao, Diaz, & Sauvy 2016), (Winkler 2007), (Dávila & Montenegro 2004), and references therein. This property arises due to the presence of the singular term u^{−β} χ_{u>0}.

For the case f(u) = λ.u^{q−1}, Giacomoni et al. showed that the quenching phenomenon occurs if q ≥ p and λ_I > λ, see Theorem 2.2, (Giacomoni, Sauvy, & Shmarev 2014). Their argument is based on the observation that the diffusion term dominates the source term f(u) in these cases (see also (Montenegro 2011) for the case p = 2). However, this argument is no longer applicable to the remaining cases, such as q = p and λ_I ≤ λ, or q > p. Thus, we are interested in the question of



whether or not the quenching phenomenon occurs in the remaining cases. Our answer is positive under additional conditions on u₀, I, or λ. A brief summary of our quenching results is as follows. Any weak solution of equation (1) must vanish identically after a finite time provided either

a. q ≥ p, and λ is small enough (note that u₀ can be large in this case); or
b. q ≥ p, and ‖u₀‖_{L^∞(I)} (resp. the measure of I) is small enough; or
c. q = p, and λ − λ_I is small enough.

The conclusion (a) means that the source term λ.u^{q−1} is so small that this perturbation does not affect the quenching property of solutions of equation (1) very much. A simulation result at the end will illustrate this result.

2 PRELIMINARY AND MAIN RESULTS

At the beginning, let us introduce the notion of a weak solution of equation (1).

Definition 1. Given 0 ≤ u₀ ∈ W₀^{1,p}(I). A function u ≥ 0 is called a weak solution of equation (1) if f(u, x, t), u^{−β} χ_{u>0} ∈ L¹(I × (0, T)), and u ∈ L^p(0, T; W₀^{1,p}(I)) ∩ L^∞(I × (0, T)) ∩ C([0, ∞); L¹(I)) satisfies equation (1) in the sense of distributions D′(I × (0, ∞)), i.e.,

∫₀^∞ ∫_I ( −u ∂_t ϕ + |u_x|^{p−2} u_x ∂_x ϕ + u^{−β} χ_{u>0} ϕ ) dx dt = ∫₀^∞ ∫_I f(u, x, t) ϕ dx dt,  ∀ϕ ∈ C_c^∞(I × (0, ∞)).   (5)

Note that u₀ ∈ C^{0,α}(Ī), with α = 1 − 1/p, by the Sobolev embedding. Then, we have the local existence theorem.

Theorem 2. Let 0 ≤ u₀ ∈ W₀^{1,p}(I), and let f satisfy (H). Then, there exists a time T₀ > 0 so that equation (1) has a maximal weak solution u in I × (0, T₀). Moreover, u satisfies the following estimate

|u_x(x, t)| ≤ C u^{(1−β)/p}(x, t) ( Φ^{1/p}(T₀) + Φ^{(1+β)/p}(0) Λ₁(D_u f) + Φ^{(1+β)/p}(0) Λ₂(D_x f) + 1 ),   (6)

for a.e. (x, t) ∈ I × (0, T₀), where Φ is the flat solution satisfying the ordinary differential equation:

Φ_t = h(Φ), and Φ(0) = ‖u₀‖_∞.

And

Λ₁(D_u f) = max_{0 ≤ u ≤ Φ(T₀), (x,t) ∈ Ī × [0,T₀]} |D_u f(u, x, t)|^{1/p},
Λ₂(D_x f) = max_{0 ≤ u ≤ Φ(T₀), (x,t) ∈ Ī × [0,T₀]} |D_x f(u, x, t)|^{1/p}.

As a consequence of (6), for any T > 0 there is a positive constant C = C(β, p, T) such that

|u(x, t) − u(y, s)| ≤ C ( |x − y|^α + |t − s|^{α/3} ),   (7)

for all x, y ∈ Ī and t, s ∈ [0, T].

Next, let us denote by φ_J and λ_J the first nonnegative normalized eigenfunction and the first eigenvalue of the problem

−∂_x( |∂_x φ_J|^{p−2} ∂_x φ_J ) = λ_J φ_J^{p−1} in J = (l₁, l₂),
φ_J(l₁) = φ_J(l₂) = 0.

It is well known that the formula of the first eigenvalue (see (R. L. Biezuner & Martins 2009)) is

λ_J = ( π_p/(l₂ − l₁) )^p, with π_p = 2π (p − 1)^{1/p} / ( p sin(π/p) ).   (8)

As mentioned above, equation (1) has no solution if f(u) = μ.u^{q−1} + v, for some v > 0. For this reason, we only consider f(u) = λ.u^{q−1} for the theorems below. Then, we first have the global existence result when ‖u₀‖_{L^∞(I)} is small.

Theorem 3. Given λ > 0, and q ≥ p. Let f(u) = λ.u^{q−1}. Assume that u₀ ∈ W₀^{1,p}(I) is such that ‖u₀‖_{L^∞(I)} is small enough. Then, the weak solutions of equation (1) are globally bounded. Moreover, they vanish identically after a finite time.

Next, we have the global existence result if λ (resp. the measure of I) is small enough.

Theorem 4. Given u₀ ∈ W₀^{1,p}(I), and q ≥ p. Let f(u) = λ.u^{q−1}. Assume that λ (resp. the measure of I) is small enough. Then, the weak solutions of equation (1) are globally bounded. Moreover, they vanish identically after a finite time.

Particularly, we have the following result for the critical case q = p.

Theorem 5. Given u₀ ∈ W₀^{1,p}(I), and q = p. Let f(u) = λ.u^{p−1}. Assume that λ − λ_I is small enough. Then, the weak solutions of equation (1) are globally bounded. Moreover, they vanish identically after a finite time.
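The eigenvalue formula (8) is easy to check numerically. The short script below is an illustrative sketch (not part of the original paper); it evaluates π_p and λ_J for the values p = 2.3 and |I| = L = 3.1273 used in the simulations of Section 3:

```python
import math

def pi_p(p):
    """The constant pi_p of formula (8): 2*pi*(p-1)**(1/p) / (p*sin(pi/p))."""
    return 2.0 * math.pi * (p - 1.0) ** (1.0 / p) / (p * math.sin(math.pi / p))

def first_eigenvalue(p, l1, l2):
    """First eigenvalue of the one-dimensional p-Laplacian on J = (l1, l2), cf. (8)."""
    return (pi_p(p) / (l2 - l1)) ** p

p, L = 2.3, 3.1273
print(pi_p(p))                      # approximately 3.1273
print(first_eigenvalue(p, 0.0, L))  # approximately 1, consistent with lambda_I = 0.9999 in Section 3
```

For p = 2 the function reduces to the classical value π_p = π, and the eigenvalue decreases as the interval grows, in agreement with the monotonicity used in the discussion of Zhao's result.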



Remark 6. By the comparison principle, the conclusions of Theorem 3, Theorem 4 and Theorem 5 still hold if f satisfies (H) and f(u, x, t) ≤ λ.u^{q−1}.

Concerning the non-global existence of solutions of equation (1), we first recall a result of (Giacomoni, Sauvy, & Shmarev 2014).

Proposition 7. Let f(u) = λ.u^{q−1}. Let q > p, and u₀ ∈ W₀^{1,p}(I). Assume that E(0) < 0, with

E(t) = ∫_I ( (1/p) |u_x(t)|^p + (1/(1−β)) u^{1−β}(t) − (λ/q) u^q(t) ) dx.

Then every solution of equation (1) blows up in a finite time.

In the critical case q = p, we show that the maximal solution u cannot be globally bounded provided E(0) ≤ 0.

Theorem 8. Let u₀ ∈ W₀^{1,p}(I), and q = p. Let f(u) = λ.u^{p−1}. Assume that E(0) ≤ 0. Then, the solution u cannot be globally bounded.

Figure 2. Evolution of the maximal solution of equation (1).

Figure 3. Evolution of the maximal solution of equation (1).

3 SIMULATION RESULTS

In this part, we illustrate our theoretical results with some numerical experiments. In the sequel, we consider equation (1) and equation (3) for the case: q = p = 2.3, β = 0.8, I = (0, L), u₀(x) = x(L − x), and f(u) = λ.u^{q−1}.
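The experiments below can be reproduced qualitatively with a simple explicit finite-difference discretization of equation (1). The sketch below is our illustrative reconstruction (the paper does not specify which scheme was actually used); the singular absorption is switched off on the set {u = 0} via a small cut-off, and negative overshoots are truncated to model quenching:

```python
import numpy as np

def step(u, dx, dt, p, beta, lam, q):
    """One explicit Euler step for u_t = (|u_x|^{p-2} u_x)_x - u^{-beta} X{u>0} + lam*u^{q-1}."""
    ux = np.diff(u) / dx                      # gradients at cell interfaces
    flux = np.abs(ux) ** (p - 2) * ux         # |u_x|^{p-2} u_x
    diff = np.zeros_like(u)
    diff[1:-1] = np.diff(flux) / dx           # (|u_x|^{p-2} u_x)_x
    absorb = np.zeros_like(u)
    pos = u > 1e-10                           # X{u>0}, with a small numerical cut-off
    absorb[pos] = u[pos] ** (-beta)
    source = lam * u ** (q - 1)
    v = u + dt * (diff - absorb + source)
    v = np.maximum(v, 0.0)                    # truncate at 0 (complete quenching)
    v[0] = v[-1] = 0.0                        # homogeneous Dirichlet conditions
    return v

# Parameters of this section: q = p = 2.3, beta = 0.8, I = (0, L), u0(x) = x(L - x)
p_, q_, beta, L, lam = 2.3, 2.3, 0.8, 3.1273, 1.269
x = np.linspace(0.0, L, 65)
u = x * (L - x)
for _ in range(2000):                         # integrate up to t = 0.2 with dt = 1e-4
    u = step(u, x[1] - x[0], 1e-4, p_, beta, lam, q_)
```

The time step is chosen well below the explicit stability limit dx²/(2·(p−1)·max|u_x|^{p−2}); the grid size 65 and the horizon t = 0.2 are illustrative choices only.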
We fix L = 3.1273. It follows then from (8) that Aris, R. (1975). The Mathematical Theory of Diffusion
I = 0.9999. and Reaction in Permeable Catalysts. Oxford Univer-
With = 1 > I (just a little bit difference), the sity Press.
unique solution of equation (3) blows up after Bandle, C. & C.M. Brauner (1986). Singular perturbation
t = 4286, see Figure 1. While = 1.269, the maxi- method in a parabolic problem with free boundary.
mal solution of equation (1) vanishes after t = 7.6, Boole Press Conf. Ser. 8, 714.
see Figure 2. Banks, H.T. (1975). Modeling and control in the bio-
With = 1.270, the maximal solution of equa- medical sciences. lecture notes in biomathematics.
Springer-Verlag, Berlin-New York 6.
tion (1) blows up at t = 23, see Figure 3. Intuitively, Biezuner, R.L., G.E. & E.M. Martins (2009). Computing
the absorption u X{ } supports the nonlin- the first eigenvalue of the p-laplacian via the inverse
ear diffusion an amount 0.up1, with 0 = 1.269 power method. Funct. Anal. 257, 243270.
0.9999 = 0.2691. By this reason, for any (0, Boccardo, L. & F. Murat (1992). Almost everywhere con-
1.269), the solutions of equation (1) exist globally vergence of the gradients of solutions to elliptic and
and they vanish after a finite time. parabolic equations. Nonlinear Anal. Theory, Methods
and Applications 19(6), 581597.
Boccardo, L. & T. Gallouet (1989). Nonlinear elliptic and
parabolic equations involving measure data. Funct.
Anal. 87, 149169.
Coddington, E. & N. Levinson (1955). Theory of Ordi-
nary Differential Equations. New York: McGraw-Hill.
Dao, A.N. & J.I. Diaz. A gradient estimate to a degen-
erate parabolic equation with a singular absorption
term: global and local quenching phenomena. To
appear Jour. Math. Anal. Appl..
Dao, A.N., J.I. Diaz, & P. Sauvy (2016). Quenching phe-
nomenon of singular parabolic problems with l1 initial
data. In preparation.
Dvila, J. & M. Montenegro (2004). Existence and
Figure 1. Evolution of the unique solution of equation asymptotic behavior for a singular parabolic equation.
(3). Transactions of the AMS 357, 18011828.

234

AMER16_Book.indb 234 3/15/2016 11:28:41 AM


Daz, J.I. (1985). Nonlinear partial differential equations Levine, H.A. (1993). Quenching and beyond: a survey
and free boundaries, research notes in mathematics. of recent results. nonlinear mathematical problems in
Pitman 106. industry ii. Internat. Ser. Math. Sci. Appl. 2, 501512.
Fila, M. & B. Kawohl (1990). Is quenching in infinite Li, Y. & C. Xie (2003). Blow-up for p-laplacian parabolic
time possible. Q. Appl. Math. 48(3), 531534. equations. Electronic Jour. Diff. Equa. 20, 112.
Galaktionov, V.A. & J.L. Vazquez (2002). The problem Montenegro, M. (2011). Complete quenching for sin-
of blow-up in nonlinear parabolic equations. Discrete gular parabolic problems. J. Math. Anal. Appl. 384,
and continuous dynamical systems 8, 399433. 591596.
Galaktionov, V.A. (1994). Blow-up for quasilinear heat Ph. Benilan, J.I.D. (2004). Pointwise gradient estimates
equations with critical fujitas exponents. Proceedings of solutions of one dimensional nonlinear parabolic
of the Royal Society of Edinburgh 124A, 517525. problems. Evolution Equations 3, 557602.
Giacomoni, J., P. Sauvy, & S. Shmarev (2014). Complete Phillips, D. (1987). Existence of solutions of quenching
quenching for a quasilinear parabolic equation. J. problems. Appl. Anal. 24, 253264.
Math. Anal. Appl. 410, 607624. Strieder, W., R.A. (1973). Variational Methods Applied to
Herrero, M.A., J.L.V. (1982). On the propagation prop- Problems of Diffusion and Reaction. Berlin: Springer-
erties of a nonlinear degenerate parabolic equation. Verlag.
Comm. in PDE 7(12), 13811402. Tsutsumi, M. (19721973). Existence and nonexistence
Kawohl, B. & R. Kersner (1992). On degenerate diffusion of global solutions for nonlinear parabolic equations.
with very strong absorption. Mathematical Methods Publ. RIMS, Kyoto Univ. 8, 211229.
in the Applied Sciences 7(15), 469477. Winkler, M. (2007). Nonuniqueness in the quenching
Kawohl, B. (1996). Remarks on quenching. Doc. Math., problem. Math. Ann. 339, 559597.
J. DMV 1, 199208. Zh. Q. Wu, J. N. Zhao, J. X. Y. & H. L. Li (2001).
Ladyzenskaja, O.A., V.A.S. & N.N. Uralceva (1988). Nonlinear Diffusion Equations. World Scientific,
Linear and Quasi-Linear Equations of Parabolic Type. Singapore.
AMS 23. Zhao, J.N. (1993). Existence and nonexistence of solu-
Levine, H.A. (1990). The role of critical exponents in tions for ut = div(|u|p2u) + f(u, u, x, t). J. Math.
blowup theorems. SIAM Review 32(2), 262288. Anal. Appl. 172, 130146.

235

AMER16_Book.indb 235 3/15/2016 11:28:42 AM


This page intentionally left blank
Applied Mathematics in Engineering and Reliability Bri, Snel, Khanh & Dao (Eds)
2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

A spectral decomposition in vector spherical


harmonics for Stokes equations

M.-P. Tran
Faculty of MathematicsStatistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam

T.-N. Nguyen
Department of Mathematics, HCMC University of Education, Ho Chi Minh City, Vietnam

ABSTRACT: The main goal of this paper is to present a spectral decomposition of velocity and pres-
sure fields of the Stokes equations outside a unit ball. These expansions bases on the basis of vector
spherical harmonics. Moreover, we show that this basis diagonalises the Neumann to Dirichlet operator.

1 INTRODUCTION 1
G( ) =
8 r
(Id + er er ).
We consider the Stokes problem in the domain
0 B(0,1) where 0 := R3  B(0,1) . Given a
velocity field g defined on S 2 : B(0,1) , we seek We refer the readers to (Nguyen 2013) for more
the velocity and pressure fields ( ) satisfying detail properties of these operators. The main goal
of this paper is to give a spectral decomposition of
Neumann to Dirichlet operator in basis of vector
u + p = 0 i 0 B ( 0,1),
spherical harmonics.
u = 0 i 0 B ( 0,1), (1.1) The sequel of this paper is organized as follows.
u = g on S 2 . In the next section, we describe the basis of vec-
tor spherical harmonics. We follow the notation in
The well-posedness and regularity results of this (Ndlec 2001) where these objects are introduced
equation can be found in the book of Galdi (Galdi in the context of electromagnetism. In Section 3,
1994). If ( ) is sufficiently smooth, the surface we present the decomposition of the solution of the
density of forces applied by the boundary of B(0, Stokes problem. Eventually, using the decomposi-
1) is defined by tion of the velocity and pressure field, we obtain an
expansion of the corresponding Dirichlet to Neu-
(
f = u + u ) n. mann operator DN in vector spherical harmonics.

The Dirichlet to Neumann operator DN can 2 VECTOR SPHERICAL HARMONICS


be defined as
Let us recall the definition and some properties of
DNg
N f W 1/ 2,2 (S 2 , 3
) g W 1/2 2 (S 2 , R3 ), vector spherical
p harmonics. We consider the unit
sphere S 2 in R3. The case of a sphere of arbitrary
where W 1/2 2 (S 2 , R3 ), W 1/2 2 (S 2 , R3 ) define the radius follows by a change of scale. In this geometry,
fractional Sobolev and its dual space. This opera- it is natural to define a point of R3 by its spherical
tor is a continuous linear isomorphism. Its inverse coordinates ( ) , where r is the radius and
is called the Neumann to Dirichlet operator ND the two Euler angles. These coordinates are related
which is defined by the convolution between the to the euclidean coordinates ( 1 2 3 ) by
Green function with the surface force density
x1 = r sin cos ,
ND f: = G * f,
x2 = r sin sin i ,
The Green function G is given by x3 = r cos .

237

AMER16_Book.indb 237 3/15/2016 11:28:42 AM


In these coordinates, the surface gradient of the The associated Legendre functions Plm are
function u, denoted S 2 u , is defined as given by
1 u u 
S 2 = sin e + , (2.1) dm
Plm ( )m ( x 2 )m/ 2 Pl ( x ).
   dx m
where e r, e and e are the unitary vectors. Let
H 1(S 2 ) denotes the Hilbert space The spherical harmonics of order l are the 2l 1
functions as follows: for l , lm l
H1( 2
) { L2 ( 2
) : S 2 L2 ( 2 3
}
) ,
Yl m ( ) 2CClm Pl m ((cos ))cos(( m ), if m > 0,
with its hermitian product Yl ( )
m
2Clm Pl|m| (cos ) i (| | ), if < 0,
Yl ( ) = Clm Pl 0 (cos ), if m = 0,
m
( )H ( )
= 1

4 S
uv + 2 S u S 2 vd .
S
where
We will denote by S 2 the Laplace-Beltrami
operator on the unit sphere S2, defined as ( l ) (l | m |)!
Clm = .
4 (l | m |)!
1 2 u 1 u
S2 u = + sin .
sin2 2 sin For x S2 , we respectively define
Tl ,m , I l ,m , Nl ,m as the traces on S 2 of the har-
The Laplace-Betrami operator is self-adjoint monic polynomials
in the space L2((S2) and it is coercive on the space
H 1(S 2 ) L20 (S 2 ) . It admits a family of eigenfunc- Tl ,m ( x ) = S 2Yl m ( x ) x : S2Yl m ( x ) TS
T 2,
tions which constitutes an orthogonal Hilbert basis
of the space L2(S2). This basis is also orthogonal for I l ,m ( x ) = S 2Yl m+1( x ) + (l + 1) m
+1 (
l +1 ) ,
the scalar product in H1(S2). These eigenfunctions Nl ,m ( ) = S 2 m
l 1 + lY
Yl m1( )x.
are called spherical harmonics. They are described
in Theorem 2.1. Notice that by construction the components of
Let Hl be the space of homogeneous polynomi- Tl ,m , I l ,m , Nl ,m belong to Yl , that is
als of degree l in three variables that are moreover
harmonic in R3. Let Yl be the space of the restric- S 2Y + l (l + )Y = 0, for Y = Tl ,m , I l ,m , Nl ,m .
tions to the unit sphere S 2 of polynomials in Hl .
Theorem 2.1. ((Ndlec 2001)). Let Yl m ,| m | l , Using the tangential gradient defined by (2) and
denote an orthonormal basis of Yl for the hermitian the Euler relation for the normal derivatives, we
product of L2 (S 2 ) . The functions Yl m , for l 0 obtain
and | m || l , form an orthogonal basis in L2 (S 2 ) ,
which is also orthogonal in H 1(S 2 ) . Moreover, Yl Theorem 2.2 ((((Ndlec 2001)). )) For each l 0 , the
coincides with the subspace spanned by the eigen- {
family ( l ,m )|m| l ( l ,m )|m|
m | l + ( l , m )|m
|m| l forms }
functions of the Laplace-Beltrami operator associ- an orthogonal basis of H (S 2 ) and of L (S 2 3 ) .
ated with the eigenvalue l (l + ) , i.e., Further, they satisfy

S 2Yl m + l (l + )Yl m = 0. S 2
| Tl ,m ( x ) |2 d l (l + 1),

S | I l ,m ( x ) | d
2
2
(l + 1)(2l + 3),
The eigenvalue l (l + ) has multiplicity 2l 1 .
By Theorem 2.1 and the Greens formula, we have S 2
| Nl ,m ( x ) |2 d = l( l ).

S 2 m 2
= l (l + 1). Let u L
L2 (S 2 , 3
) , then u decomposes as
l
L2
l l +1
We consider the Legendre polynomial Pl : u( ) il ,mTl ,m ( ) + jl ,m I l ,m ( )
l 1 m = l l 0 m = l 1
l 1
( )l d l
Pl ( x ) = ( x 2 )l , x [[ 1 1]. + kl ,m Nl ,m ( x ).
2 l! dx l
l
l m =
= l +1

238

AMER16_Book.indb 238 3/15/2016 11:28:45 AM


For simplicity, we use the symbol \sum instead of using \sum_{l\ge 1}\sum_{m=-l}^{l}, \sum_{l\ge 0}\sum_{m=-l-1}^{l+1} and \sum_{l\ge 1}\sum_{m=-l+1}^{l-1}.

3 DECOMPOSITION OF VELOCITY AND PRESSURE FIELD

The regularity results for the Stokes equation are classical and can be found in the book of (Galdi 1994). The main point of this section is to establish the decomposition of the velocity and pressure field in vector spherical harmonics.

Theorem 3.1. Let g \in W^{1/2,2}(S^2, R^3) be such that \int_{S^2} g \cdot n = 0 and let (u, p) be the variational solution of (1.1) (u \in D^{1,2}(R^3 \setminus B(0,1)), p \in L^2(R^3 \setminus B(0,1)) and \int_{B} p = 0). If the decomposition of g in the basis of vector spherical harmonics reads

g(x) = \sum g^T_{l,m}\, T_{l,m}(x) + g^I_{l,m}\, I_{l,m}(x) + g^N_{l,m}\, N_{l,m}(x),   (3.1)

then we obtain the decomposition of the velocity field u and of the pressure field p in vector spherical harmonics for r > 1, as follows,

u(x) = \sum g^T_{l,m}\, r^{-(l+1)}\, T_{l,m} + g^I_{l,m}\, r^{-(l+2)}\, I_{l,m} + \big( (\cdots)\, g^I_{l-2,m}\,(r^2 - 1) + g^N_{l,m} \big)\, r^{-(l+1)}\, N_{l,m},

p(x) = \sum \frac{l\,(2l+(\cdots))\, g^I_{l,m}}{(l+(\cdots))\, r^{l+1}}\, \big( I_{l,m}(x/r) - N_{l+(\cdots),m}(x/r) \big)\cdot e_r.   (3.2)

Proof. We recall that by the regularity results in (Galdi 1994), the velocity and pressure field can be decomposed in the basis of vector spherical harmonics. Since \Delta p = 0, we put

p(x) = \sum_{l \ge 0} \sum_{m=-l}^{l} \gamma_{l,m}\, r^{-(l+1)}\, Y_{l,m}(x/r).

We decompose u in the form

u(x) = \sum i_{l,m}(r)\,\big( r^{-(l+1)}\, T_{l,m}(x/r) \big) + j_{l,m}(r)\,\big( r^{-(l+2)}\, I_{l,m}(x/r) \big) + k_{l,m}(r)\,\big( r^{-(l+1)}\, N_{l,m}(x/r) \big).

This form is chosen because r^{-(l+1)} T_{l,m}(x/r), r^{-(l+2)} I_{l,m}(x/r) and r^{-(l+1)} N_{l,m}(x/r) are harmonic. Using vector spherical harmonics and the formula

div(a\, e_r) = \partial_r a + (2/r)\, a,

we obtain

div\, u(x) = \sum (\cdots)\, r^{(\cdots)} \big( r\, j'_{l,m} + (\cdots)\, j_{l,m} \big)\, Y_{l+1,m} + (\cdots)\, r^{(\cdots)} \big( r\, k'_{l,m} + (\cdots)\, k_{l,m} \big)\, Y_{l-1,m}.

Since div u = 0, we deduce

k'_{1,0} = 0,   (3.3)

(\cdots)\, r\, j'_{l,m} + (\cdots)\, j_{l,m} + (\cdots)\, k_{l,m} = 0.   (3.4)

We now decompose the first relation of equations (1.1). We have

\nabla p(x) = \sum \gamma_{l,m}\, \nabla\big( r^{-(l+1)}\, Y_{l,m}(x/r) \big) = -\sum \gamma_{l,m}\, r^{-(l+2)}\, N_{l+1,m}(x/r),

and

\Delta u(x) = \sum r^{-(l+2)}\big( r\, i''_{l,m} - 2l\, i'_{l,m} \big)\, T_{l,m} + r^{-(l+3)}\big( r\, j''_{l,m} - 2(l+1)\, j'_{l,m} \big)\, I_{l,m} + r^{-(l+2)}\big( r\, k''_{l,m} - 2l\, k'_{l,m} \big)\, N_{l,m}.

Identifying these expansions in vector spherical harmonics, we obtain

r\, i''_{l,m} - 2l\, i'_{l,m} = 0,   l \ge 1,\ |m| \le l,   (3.5)

r\, j''_{l,m} - 2(l+1)\, j'_{l,m} = 0,   l \ge 0,\ |m| \le l+1,   (3.6)

r\, k''_{l,m} - 2l\, k'_{l,m} = -\gamma_{l-1,m}\, r,   l \ge 1,\ |m| \le l-1.   (3.7)

We deduce from (3.5), (3.6), the boundary condition u = g on \partial B(0,1) and the condition of decay at infinity that

i_{l,m}(r) = g^T_{l,m},   l \ge 1,\ |m| \le l,
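The divergence formula div(a e_r) = ∂_r a + (2/r) a invoked in the proof can be sanity-checked by central finite differences; in this small sketch (ours, with an arbitrary test profile a(r) = r³) the identity predicts div = 3r² + 2r² = 5r²:

```python
import math

def field(x, y, z):
    # the radial field a(r) e_r with a(r) = r^3, i.e. r^2 * (x, y, z)
    r2 = x * x + y * y + z * z
    return (r2 * x, r2 * y, r2 * z)

def divergence(x, y, z, h=1e-5):
    # central finite differences for the three diagonal derivatives
    fx = (field(x + h, y, z)[0] - field(x - h, y, z)[0]) / (2 * h)
    fy = (field(x, y + h, z)[1] - field(x, y - h, z)[1]) / (2 * h)
    fz = (field(x, y, z + h)[2] - field(x, y, z - h)[2]) / (2 * h)
    return fx + fy + fz

x, y, z = 1.0, 2.0, 2.0                 # a point with r = 3
r = math.sqrt(x * x + y * y + z * z)
predicted = 3 * r**2 + (2 / r) * r**3   # a'(r) + (2/r) a(r) = 5 r^2
assert abs(divergence(x, y, z) - predicted) < 1e-4
print("div(a e_r) = a' + (2/r) a verified at r =", r)
```

The same check works for any smooth profile a(r); the quadratic truncation error of the central differences is far below the tolerance used here.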



j_{l,m}(r) = g^I_{l,m},   l \ge 0,\ |m| \le l+1.

The relations (3.3), (3.4) lead to

k_{1,0} = g^N_{1,0},   \gamma_{0,0} = 0,

k_{l,m}(r) = \frac{(2l+3)(l+1)}{2\,(\cdots)}\, g^I_{l-2,m}\,(r^2 - 1) + g^N_{l,m}.

From (3.7) we get

\gamma_{l,m} = \frac{(\cdots)\, l}{l+1}\, g^I_{l-1,m},   l \ge 1,\ 0 \le |m| \le l.

Eventually, with the convention g^I_{-1,0} = 0 and all of the above equalities, we obtain the decomposition of p and u in vector spherical harmonics as in (3.2).

4 DECOMPOSITION OF NEUMANN TO DIRICHLET OPERATOR

In this section, we obtain an expansion of the Dirichlet to Neumann operator DN in vector spherical harmonics. We refer the readers to (Halpern 2001) for a similar result.

Theorem 4.1. Let g \in W^{1/2,2}(S^2, R^3) and let (u, p) be a solution of (1.1). Then the vector spherical harmonic basis diagonalizes the Neumann to Dirichlet operator ND. In particular, if the decomposition of g in the basis of vector spherical harmonics is given by (3.1), then we have

ND\, g = \sum \frac{g^T_{l,m}}{2l+1}\, T_{l,m} + \frac{(l+2)\, g^I_{l,m}}{4l^2+8l+3}\, I_{l,m} + \frac{(l-1)\, g^N_{l,m}}{4l^2-1}\, N_{l,m}.   (4.1)

Proof. Let us first decompose the following operator

DN_{jump}\, g = \big[ -p\, e_r + (\nabla u + \nabla u^t)\, e_r \big]_{|S^2}.

We have

DN_{jump}\, g = DN_{ext}\, g + DN_{int}\, g,   (4.2)

where DN_{ext} and DN_{int} correspond to the exterior and interior solutions.

Let us first decompose DN_{ext}. We have

DN_{ext}\, g = \big[ -p\, e_r + (\nabla u + \nabla u^t)\, e_r \big]_{|S^2_{ext}}.

For x \in S^2, we compute

(\nabla u + \nabla u^t)(x)\, e_r = \sum \big[ (\cdots)\, g^T_{l,m}\,(\cdots) + (\cdots)\, g^I_{l,m}\,(\cdots) + (\cdots)\, g^N_{l,m}\,(\cdots) \big].   (4.3)

To reduce the three first terms, we use vector spherical harmonics and obtain T_{l,m}\cdot e_r = 0 and

I_{l,m}\cdot e_r = (l+1)\, Y_{l+1,m},   N_{l,m}\cdot e_r = l\, Y_{l-1,m}.

For the three remaining terms, we remark that for a regular vector field V of TS^2 and a spherical harmonic Y of S^2 in R^3, we have

\{ \nabla_{S^2}\,(\cdots) \}\cdot e_r = V,   \{ \nabla_{S^2}\,(\cdots) \}\cdot e_r = \nabla_{S^2} Y.

Using vector spherical harmonics, it leads

\{ \nabla_{S^2}\,(\cdots) \}\cdot e_r = T_{l,m},
\{ \nabla_{S^2}\,(\cdots) \}\cdot e_r = \nabla_{S^2} Y_{l+1,m},
\{ \nabla_{S^2}\,(\cdots) \}\cdot e_r = (l+(\cdots))\, \nabla_{S^2} Y_{l-1,m}.

The equality (4.3) then becomes

(\nabla u + \nabla u^t)(x)\, e_r = \sum (\cdots)\, g^I_{l,m}\, Y_{l+1,m}\, e_r + (\cdots)\, g^N_{l,m}\, Y_{l-1,m}\, e_r + g^T_{l,m}\, T_{l,m} + (\cdots)\, g^I_{l,m}\, \nabla_{S^2} Y_{l+1,m} + (\cdots)\, g^N_{l,m}\, \nabla_{S^2} Y_{l-1,m}.

After simplifying and using again the formula of vector spherical harmonics, we obtain,

(\nabla u + \nabla u^t)(x)\, e_r = \sum (\cdots)\, g^T_{l,m}\, T_{l,m} - (\cdots)\, l\, g^I_{l,m}\, I_{l,m} + (\cdots)\, g^N_{l,m}\, N_{l,m}.

With the same kind of computation, we obtain (see L. Halpern in (Halpern 2001))



\tilde g(x) = \sum (\cdots)\, g^T_{l,m}\, T_{l,m} + \frac{3\,(\cdots)}{l+2}\, g^I_{l,m}\, I_{l,m} + (\cdots)\, g^N_{l,m}\, N_{l,m},

where \tilde g(x) = \big[ (\cdots)\, \partial_r u + (\cdots)\, p\, e_r \big]_{|S^2_{ext}}. Eventually, we get

DN_{ext}\, g(x) = \sum (l+2)\, g^T_{l,m}\, T_{l,m} + \frac{2l^2+4l+3}{l+2}\, g^I_{l,m}\, I_{l,m} + 2(l+1)\, g^N_{l,m}\, N_{l,m}.   (4.4)

To decompose DN_{int}, we solve the interior problem (1.1) in the unit ball B(0,1) with g = T_{l,m}, g = I_{l,m} and then g = N_{l,m}.

For g = T_{l,m}, since

x \mapsto r^l\, T_{l,m}(x/r)

is harmonic and divergence free, we have a solution of the form u = r^l\, T_{l,m}(x/r) and p = 0. Using the above formulas, it is easy to check that

\big( -p\, e_r + (\nabla u + \nabla u^t)\, e_r \big) = (l-1)\, T_{l,m}.

For g = I_{l,m}, we still have a solution of the form u = r^l\, I_{l,m}(x/r) and p = 0. Similarly, we calculate

\big( -p\, e_r + (\nabla u + \nabla u^t)\, e_r \big) = 2l\, I_{l,m}.

For g = N_{l,m}, the mapping

x \mapsto r^{l-1}\, N_{l,m}(x/r)

is not divergence free. Proceeding as in the case of the exterior domain, we look for a solution of the form

u = r^{l-1}\, N_{l,m} + \beta\, r^{l-1}\,(r^2 - 1)\, I_{l-2,m},   p = \gamma\, r^{l-1}\, Y_{l-1,m}.

The condition div u = 0 yields

\beta = \frac{l\,(l+1)}{2\,(\cdots)}.

Using the first equation of (1.1), we get

\gamma = \frac{(\cdots)}{(\cdots)}.

Then we compute

\big( -p\, e_r + (\nabla u + \nabla u^t)\, e_r \big) = \frac{2l^2+1}{l-1}\, N_{l,m}.

We remark that the coefficient of N_{1,0} is zero. Eventually, DN_{int} writes as:

DN_{int}\, g(x) = \sum (l-1)\, g^T_{l,m}\, T_{l,m} + 2l\, g^I_{l,m}\, I_{l,m} + \frac{2l^2+1}{l-1}\, g^N_{l,m}\, N_{l,m}.   (4.5)

Finally, the decomposition of the Dirichlet to Neumann operator is obtained by (4.2), (4.4) and (4.5),

DN_{jump}\, g(x) = \sum (2l+1)\, g^T_{l,m}\, T_{l,m} + \frac{4l^2+8l+3}{l+2}\, g^I_{l,m}\, I_{l,m} + \frac{4l^2-1}{l-1}\, g^N_{l,m}\, N_{l,m}.   (4.6)

This is the desired decomposition.

REFERENCES

Galdi, G.P. (1994). An introduction to the mathematical theory of the Navier-Stokes equations. Vol. I, Volume 38 of Springer Tracts in Natural Philosophy. New York: Springer-Verlag. Linearized steady problems.
Halpern, L. (2001). A spectral method for the Stokes problem in three-dimensional unbounded domains. Math. Comp. 70(236), 1417–1436 (electronic).
Nédélec, J.C. (2001). Acoustic and electromagnetic equations, Volume 144 of Applied Mathematical Sciences. New York: Springer-Verlag. Integral representations for harmonic problems.
Nguyen, T.N. (2013). Convergence to equilibrium for discrete gradient-like flows and An accurate method for the motion of suspended particles in a Stokes fluid. Dissertation. Ecole Polytechnique.
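Reading the per-mode eigenvalues off (4.4), (4.5) and (4.1) — our own reading of this derivation, stated here as an assumption — the additivity (4.2) and the fact that ND inverts DN_jump on each family T, I, N can be checked with exact rational arithmetic:

```python
from fractions import Fraction as F

def dn_ext(l):
    # exterior eigenvalues on (T, I, N), as read from (4.4)
    return (F(l + 2), F(2 * l * l + 4 * l + 3, l + 2), F(2 * (l + 1)))

def dn_int(l):
    # interior eigenvalues, as read from (4.5)
    return (F(l - 1), F(2 * l), F(2 * l * l + 1, l - 1))

def nd(l):
    # Neumann-to-Dirichlet eigenvalues, as read from (4.1)
    return (F(1, 2 * l + 1), F(l + 2, 4 * l * l + 8 * l + 3), F(l - 1, 4 * l * l - 1))

for l in range(2, 50):
    jump = tuple(e + i for e, i in zip(dn_ext(l), dn_int(l)))
    # (4.6): the jump eigenvalues are 2l+1, (4l^2+8l+3)/(l+2), (4l^2-1)/(l-1)
    assert jump == (F(2 * l + 1), F(4 * l * l + 8 * l + 3, l + 2), F(4 * l * l - 1, l - 1))
    # ND is the mode-by-mode inverse of DN_jump
    assert all(j * v == 1 for j, v in zip(jump, nd(l)))
print("(4.2) additivity and ND o DN = id hold for l = 2..49")
```

Exact fractions avoid any floating-point ambiguity, so a single failed assertion would immediately flag an inconsistent coefficient.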



Applied Mathematics in Engineering and Reliability – Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

On convergence result for a finite element scheme of Landau-Lifschitz-Gilbert equation

M.-P. Tran
Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam

ABSTRACT: In this paper, we prove a convergence result for the discrete solutions of the Landau-Lifschitz-Gilbert equation using a precise finite element scheme which was proposed by F. Alouges et al. in (Alouges, Kritsikis, Steiner, & Toussaint 2014). The convergence result is established for both the space and the time-space discretizations.

1 INTRODUCTION

Recently, some convergence results for solutions of gradient-like systems have been studied by many authors in both the continuous and the discrete setting, such as (Haraux & Jendoubi 1998, Haraux 2012, Haraux & Jendoubi 2015) for continuous problems and (Grasselli & Pierre 2012, Alaa & Pierre 2013, Merlet & Pierre 2010, Merlet & Nguyen 2013) for discrete problems. These results have many applications in partial differential equations. Continuing the work in (Merlet & Nguyen 2013), we apply some convergence results in this paper to a discretization of the Landau-Lifschitz-Gilbert equations using a precise finite element scheme which was proposed by F. Alouges et al. in (Alouges, Kritsikis, Steiner, & Toussaint 2014).

The Landau-Lifschitz-Gilbert equations were first proposed by Landau and Lifschitz in (Landau & Lifschitz 1935). These equations describe the evolution of the magnetization m : \Omega \times (0,\infty) \to S^2 inside a ferromagnetic body occupying an open region \Omega \subset R^3. This system of equations reads

\alpha\, \partial_t m + m \times \partial_t m = H_{eff} - (m \cdot H_{eff})\, m,   in \Omega,   (1.1)

where \alpha > 0 is a damping parameter and \times denotes the three dimensional cross product. The so-called effective magnetic field H_{eff} is given by the functional derivative of the micromagnetic energy D, more precisely

H_{eff}(m) = -\frac{\partial D}{\partial m} = d^2\, \Delta m + H_d(m) + H_{ext} + Q\,(e \cdot m)\, e,

where the energy D is given by

D(m) = \frac{1}{2}\Big( d^2 \int_\Omega |\nabla m|^2\, dx - \int_\Omega H_d(m)\cdot m\, dx - 2\int_\Omega H_{ext}\cdot m\, dx - Q \int_\Omega (e\cdot m)^2\, dx \Big).

We use the same notations as in (Alouges, Kritsikis, Steiner, & Toussaint 2014), i.e., the vector field H_{ext} models an applied magnetic field, H_{aniso} = Q(e\cdot m)e denotes the anisotropy field, the stray field H_d is the magnetic field, d is the exchange constant and Q is the anisotropy constant. It is supplemented with initial and boundary conditions

\frac{\partial m}{\partial n} = 0 on \partial\Omega,   m(x, 0) = m_0(x) \in S^2.

Notice that, at least formally, this evolution system preserves the constraint |m(x,t)| = 1, \forall x \in \Omega. We will consider a discretization of the following variational formulation of (1.1),

\alpha \int_\Omega \partial_t m \cdot \varphi + \int_\Omega (m \times \partial_t m)\cdot \varphi = -d^2 \int_\Omega \nabla m \cdot \nabla\varphi + \int_\Omega (H_d(m) + H_{ext} + H_{aniso}(m))\cdot \varphi,   (1.2)

for every \varphi \in H^1(\Omega, R^3) which furthermore satisfies \varphi(x)\cdot m(x) = 0 a.e. in \Omega. It is known that for every initial data m_0 \in H^1(\Omega, S^2), this variational formulation admits a solution for all time (see (Alouges 2008)).

The main idea comes from the fact that the Dirichlet energy function D of the Landau-Lifschitz-Gilbert equation is a Lyapunov function for (1.2). Indeed, considering a smooth solution m(x,t), we compute,



\frac{d}{dt} D(m(\cdot,t)) = -\int_\Omega H_{eff}(m)\cdot \partial_t m\,(x,t)\, dx.

Since, for every x \in \Omega, t \mapsto \|m(x,t)\|^2 is constant, we have \partial_t m(x,t)\cdot m(x,t) = 0. So, we can choose \varphi = \partial_t m(\cdot,t) in (1.2) and deduce,

\frac{d}{dt} D(m(\cdot,t)) = -\alpha \int_\Omega \|\partial_t m\|^2(x,t)\, dx \le 0,

as claimed.

The sequel of this paper is organized as follows. In the next section, we recall some convergence results for gradient-like systems. These results will be applied to our work. The convergence of the numerical solution of the Landau-Lifschitz-Gilbert equation is established in both cases, space and time-space discretization. The first case is presented in Section 3 and the second one in Section 4.

2 CONVERGENCE RESULTS FOR GRADIENT-LIKE SYSTEMS

In this section, we recall some abstract convergence results from the recent study (Nguyen 2013). Let M be a Riemannian manifold embedded in R^d; the inner product on every tangent space T_u M is the restriction of the Euclidean inner product on R^d. We consider a tangent vector field G \in C(M, TM) and a function F \in C^1(M). We say that G and F satisfy the angle and comparability condition if there exists a real number \varepsilon > 0 such that for all u \in M,

\langle G(u), -\nabla F(u) \rangle \ge \varepsilon\,\big( \|G(u)\|^2 + \|\nabla F(u)\|^2 \big).   (2.1)

We assume that F is a strict Lyapunov function for the gradient-like system

\dot u(t) = G(u(t)),   u(t) \in M.   (2.2)

Theorem 2.1. (Łojasiewicz 1971) If F : R^d \to R is real analytic in some neighborhood of a point \varphi then F satisfies the Łojasiewicz inequality at \varphi, that means: there exist \sigma > 0 and \theta \in [0, 1/2) such that

|F(u) - F(\varphi)|^{1-\theta} \le \|\nabla F(u)\|,   \forall u \in B(\varphi, \sigma).   (2.3)

Theorem 2.2. (Nguyen 2013) Assume that G and F satisfy the angle and comparability condition (2.1), let u be a global solution of (2.2), and suppose there exists \varphi such that F satisfies the Łojasiewicz inequality (2.3) at \varphi. Then u(t) converges to \varphi as t goes to infinity.

We consider the \theta-scheme for the gradient-like system as follows

u_{n+1} - u_n = \Delta t\,\theta\, G(u_{n+1}) + \Delta t\,(1-\theta)\, G(u_n).   (2.4)

Theorem 2.3. (Nguyen 2013) Let \theta \in [0,1] and u_n be the sequence defined by the \theta-scheme (2.4). If F is one-sided Lipschitz and G is Lipschitz then the sequence u_n converges to \varphi.

For a more general result, the authors studied the convergence of a projected \theta-scheme. Let u_0 \in M and n = 0, 1, 2, \ldots; the projected \theta-scheme has two steps:

Step 1: find v_n \in T_{u_n} M such that
v_n = \theta\, G_{u_n}(u_n + \Delta t\, v_n) + (1-\theta)\, G_{u_n}(u_n).   (2.5)
Step 2: set u_{n+1} := \Pi_M(u_n + \Delta t\, v_n).

We assume that the family G_u satisfies the following conditions for all u, u' \in M and v, v' \in T_u M:

G_u(u) = G(u),   \|G_u(u+v)\| \le C,   \|G_u(u+v) - G_u(u+v')\| \le K\,\|v - v'\|.   (2.6)

Moreover, we also assume that the projection acts only at second order, that is there exist \rho, R > 0 such that

\|\Pi_M(u+v) - (u+v)\| \le R\,\|v\|^2,   if \|v\| < \rho.   (2.7)

Theorem 2.4. (Nguyen 2013) Let u_n be the sequence defined by the projected \theta-scheme (2.5) and assume that the above conditions (2.6) are satisfied. Then the sequence u_n converges to \varphi.

3 SPACE DISCRETIZATION

We discretize the problem in space using P1-Finite Elements. Let us introduce some notation. Let (\tau_h)_h be a regular family of conformal triangulations of the domain \Omega parameterized by the space step h. Let (x_i^h)_i be the vertices of \tau_h and (\varphi_i^h)_{1 \le i \le N_h} the set of associated basis functions of the so-called P1(\tau_h) discretization. That is to say the functions (\varphi_i^h)_i are globally continuous and linear on each triangle (or tetrahedron in 3D) and satisfy \varphi_i^h(x_j^h) = \delta_{ij}. We define

V_h := \Big\{ m = \sum_{i=1}^{N_h} m_i\, \varphi_i^h : \forall i,\ m_i \in R^3 \Big\},   M_h := \{ m \in V_h : \forall i,\ m_i \in S^2 \}.

Notice that M_h is a manifold isomorphic to (S^2)^{N_h}. For any m = \sum_{i=1}^{N_h} m_i\, \varphi_i^h \in M_h, we introduce the tangent space



T_{m^h} M_h = \Big\{ v = \sum_{i=1}^{N_h} v_i^h\, \varphi_i^h : \forall i,\ m_i^h \cdot v_i^h = 0 \Big\}.

The space discretization of the variational formulation (1.2) reads: m^h(0) = m_0^h \in M_h and, for every \varphi^h \in T_{m^h(t)} M_h,

\alpha \int_\Omega \partial_t m^h \cdot \varphi^h + \sum_{i=1}^{N_h} (m_i^h \times \partial_t m_i^h)\cdot \varphi_i^h \int_\Omega \varphi_i^h
= -d^2 \int_\Omega \nabla m^h \cdot \nabla \varphi^h + \int_\Omega (H_d(m^h) + H_{ext} + H_{aniso}(m^h))\cdot \varphi^h.   (3.1)

Remark 3.1. We have replaced the term \int_\Omega (m^h \times \partial_t m^h)\cdot \varphi^h in the original scheme of (Alouges, Kritsikis, Steiner, & Toussaint 2014) by

\sum_{i=1}^{N_h} (m_i^h \times \partial_t m_i^h)\cdot \varphi_i^h \int_\Omega \varphi_i^h.

This modification is equivalent to using the quadrature formula:

\int_\Omega f\, dx \approx \sum_{i=1}^{N_h} f(x_i^h) \int_\Omega \varphi_i^h,

for the computation of this integral. The convergence to equilibrium results below are still true with an exact quadrature formula, but the proof is slightly more complicated, see Remark 4.2.

We now interpret this variational formulation as a gradient-like differential system of the form (2.2). For this we introduce the Lyapunov functional F : M_h \subset H^1(\Omega, R^3) \to R defined by

F(m^h) = \frac{1}{2}\Big( d^2 \int_\Omega |\nabla m^h|^2\, dx - \int_\Omega H_d(m^h)\cdot m^h\, dx - 2\int_\Omega H_{ext}\cdot m^h\, dx - Q \int_\Omega (e\cdot m^h)^2\, dx \Big).

As usual, the gradient of this functional is q^h = \nabla F(m^h) = A^h m^h, where A^h is the rigidity matrix associated to the P1-FE discretization:

\langle q^h, \varphi^h \rangle_{L^2} = d^2 \int_\Omega \nabla m^h \cdot \nabla \varphi^h - \int_\Omega (H_d(m^h) + H_{ext} + H_{aniso}(m^h))\cdot \varphi^h
= d^2 \sum_{i,j} m_i^h \cdot \varphi_j^h \int_\Omega \nabla\varphi_i^h \cdot \nabla\varphi_j^h - \int_\Omega (H_d(m^h) + H_{ext} + H_{aniso}(m^h))\cdot \varphi^h
=: \langle A^h m^h, \varphi^h \rangle_{L^2}.   (3.2)

We also introduce the section G : M_h \to TM_h defined by G(m^h) := p^h where p^h \in T_{m^h} M_h solves: \forall \varphi^h \in T_{m^h} M_h,

\alpha \int_\Omega p^h \cdot \varphi^h + \sum_{i=1}^{N_h} (m_i^h \times p_i^h)\cdot \varphi_i^h \int_\Omega \varphi_i^h
= -d^2 \int_\Omega \nabla m^h \cdot \nabla \varphi^h + \int_\Omega (H_d(m^h) + H_{ext} + H_{aniso}(m^h))\cdot \varphi^h.   (3.3)

The function G is well defined. Indeed, it is sufficient to check that the bilinear form b_{m^h} defined on T_{m^h} M_h \times T_{m^h} M_h by

b_{m^h}(p^h, \varphi^h) := \alpha \int_\Omega p^h \cdot \varphi^h + \sum_{i=1}^{N_h} (m_i^h \times p_i^h)\cdot \varphi_i^h \int_\Omega \varphi_i^h   (3.4)

has a positive symmetric part. Using (m_i^h \times p_i^h)\cdot p_i^h = 0, we see that b_{m^h}(p^h, p^h) = \alpha \|p^h\|^2_{L^2(\Omega)} and b_{m^h} is coercive on T_{m^h} M_h \times T_{m^h} M_h. So, by definition, m^h \in C^1(R_+, M_h) solves the variational formulation (3.1) if and only if

\frac{d}{dt} m^h = G(m^h),\ \forall t > 0,   m^h(0) = m_0^h.

We now check that the hypotheses of Theorem 2.2 hold.

Theorem 3.2. The functions G and F defined above satisfy the angle and comparability condition (2.1). Moreover, the Lyapunov function F satisfies a Łojasiewicz inequality (2.3) in the neighborhood of any point m^h of the manifold M_h.

Proof. For the first point, let us fix m^h \in M_h and write p^h = G(m^h) and q^h = \nabla F(m^h). Choosing \varphi^h = q^h in (3.3) and using (3.2), we obtain

\alpha \int_\Omega p^h \cdot q^h + \sum_{i=1}^{N_h} (m_i^h \times p_i^h)\cdot q_i^h \int_\Omega \varphi_i^h = -\|q^h\|^2_{L^2}.

We use the classical estimate from elliptic regularity theory, namely

\|H_d(m)\|_{L^2(\Omega)} \le C\, \|m\|_{L^2(\Omega)}.

Moreover, we can obtain the same estimate for the applied field H_{ext} and the anisotropy field H_{aniso}, where the constant C depends on Q and |\Omega|. Then the Cauchy-Schwarz inequality, the identities



\|m_i^h\| = 1 and the equivalence of norms in finite dimension yield

\|q^h\|_{L^2} \le C\, \|p^h\|_{L^2}.

On the other hand, choosing \varphi^h = p^h in (3.3), we get

\alpha\, \|p^h\|^2_{L^2} = -d^2 \int_\Omega \nabla m^h \cdot \nabla p^h + \int_\Omega (H_d(m^h) + H_{ext} + H_{aniso}(m^h))\cdot p^h = -\langle q^h, p^h \rangle_{L^2}.

So, we have

-\langle q^h, p^h \rangle_{L^2} \ge \mu\,\big( \|p^h\|^2_{L^2} + \|q^h\|^2_{L^2} \big),

with \mu depending on Q, |\Omega|, \alpha and d: i.e. the pair (G, F) satisfies the tangential angle condition and comparability condition (2.1).

For the second point, F(m^h) is a polynomial function of (m_i^h)_{1 \le i \le N_h} \in (S^2)^{N_h}, hence it is analytic. The manifold M_h \cong (S^2)^{N_h} being analytic, we can use an analytic chart \psi (for example a product of stereographic projections) defined in a neighborhood of m^h. We apply Theorem 2.1 to the analytic function F \circ \psi^{-1} and deduce that it satisfies a Łojasiewicz inequality in the neighborhood of \psi(m^h).

We deduce from Theorem 3.2:

Theorem 3.3. Assume m^h(t) is a solution of (3.1). Since M_h is compact, \omega(m^h) is not empty. Consequently there exists \varphi \in M_h such that u = m^h satisfies the conclusion of Theorem 2.2.

4 TIME-SPACE DISCRETIZATION

We now consider the \theta-scheme proposed by F. Alouges in (Alouges, Kritsikis, Steiner, & Toussaint 2014). Given an initial m^0 \in M_h, choose \theta \in [0,1] and a time step \Delta t. For n = 0, 1, 2, \ldots, the algorithm has two steps:

Find p^n \in T_{m^n} M_h such that \forall \varphi^h \in T_{m^n} M_h,

\alpha \int_\Omega p^n \cdot \varphi^h + \sum_{i=1}^{N_h} (m_i^n \times p_i^n)\cdot \varphi_i^h \int_\Omega \varphi_i^h
= -d^2 \int_\Omega \nabla(m^n + \theta \Delta t\, p^n)\cdot \nabla \varphi^h + \int_\Omega (H_d(m^n) + H_{ext} + H_{aniso}(m^n))\cdot \varphi^h.

Set m^{n+1} := \sum_{i=1}^{N_h} \frac{m_i^n + \Delta t\, p_i^n}{|m_i^n + \Delta t\, p_i^n|}\, \varphi_i^h, and iterate.   (4.1)

Let us rewrite this scheme as a projected \theta-scheme of the form (2.5). For this we introduce the family of mappings \{ G_{m^h} : m^h + T_{m^h} M_h \to T_{m^h} M_h \} defined by G_{m^h}(u^h) = p^h where p^h \in T_{m^h} M_h solves the variational formulation: \forall \varphi^h \in T_{m^h} M_h,

\alpha \int_\Omega p^h \cdot \varphi^h + \sum_{i=1}^{N_h} (u_i^h \times p_i^h)\cdot \varphi_i^h \int_\Omega \varphi_i^h
= -d^2 \int_\Omega \nabla u^h \cdot \nabla \varphi^h + \int_\Omega (H_d(m^h) + H_{ext} + H_{aniso}(m^h))\cdot \varphi^h.

Notice that G_{m^h} only depends on m^h through the space of test functions T_{m^h} M_h. As above, we see that p^h is well defined and uniquely defined by this variational formulation through the coercivity of the bilinear form b_{m^h} (see (3.4)).

Lemma 4.1. Let m^n, p^n be defined in the scheme (4.1). Then,

p^n = \theta\, G_{m^n}(m^n + \Delta t\, p^n) + (1-\theta)\, G_{m^n}(m^n).   (4.2)

Proof. Let us set q^h = G_{m^n}(m^n + \Delta t\, p^n), r^h = G_{m^n}(m^n). By definition of G_{m^n} and linearity, we see that the function \tilde p^h = \theta q^h + (1-\theta) r^h satisfies

\alpha \int_\Omega \tilde p^h \cdot \varphi^h + \sum_{i=1}^{N_h} (m_i^n \times \tilde p_i^h)\cdot \varphi_i^h \int_\Omega \varphi_i^h + \theta \Delta t \sum_{i=1}^{N_h} (p_i^n \times q_i^h)\cdot \varphi_i^h \int_\Omega \varphi_i^h
= -d^2 \int_\Omega \nabla(m^n + \theta \Delta t\, p^n)\cdot \nabla \varphi^h + \int_\Omega (H_d(m^n) + H_{ext} + H_{aniso}(m^n))\cdot \varphi^h,   \forall \varphi^h \in T_{m^n} M_h.

We see that in the third term of the left hand side, the triple product (p_i^n \times q_i^h)\cdot \varphi_i^h vanishes. Indeed, the three vectors p_i^n, q_i^h, \varphi_i^h belong to the two dimensional tangent space \{ v_i^h : v_i^h \cdot m_i^n = 0 \}. So, it turns out that \tilde p^h and p^n solve the same (well-posed) variational formulation. We conclude that \tilde p^h = p^n as claimed.

Remark 4.2. If we had used the original variational formulation, with obvious changes in the definition of G_{m^h}, then the term \theta \Delta t \int_\Omega (p^n \times q^h)\cdot \varphi^h would not vanish in general and the identity (4.2) would be wrong. In this case, we can not link the scheme of (Alouges, Kritsikis, Steiner, & Toussaint 2014) to our projected \theta-scheme. However, this term is of small magnitude and using the present ideas, it is not difficult to establish that Theorems 2.3 and 2.4 apply to this scheme and conclude to the convergence to equilibrium of the sequence (m^n).



Theorem 4.3. The functions F, G and \{ G_{m^h} \} satisfy hypotheses (2.6). Moreover, the projection \Pi_{M_h}(z^h) := \sum_{i=1}^{N_h} \frac{z_i^h}{|z_i^h|}\, \varphi_i^h satisfies (2.7).

Proof. The first identity in (2.6) is obvious. Next, for m^h \in M_h and p^h = G(m^h), using \varphi^h = p^h in (3.3), we obtain

\alpha\, \|p^h\|^2_{L^2} \le (d^2 + C)\, \|m^h\|_{L^2}\, \|p^h\|_{L^2},

and we conclude from the equivalence of the norms in finite dimensional spaces, that G is bounded on the compact manifold M_h. The Lipschitz estimate in (2.6) is also a consequence of this fact and of the uniform coercivity of the bilinear forms b_{m^h}. The Lipschitz estimate on \nabla F is also obvious since F is smooth on the compact manifold M_h.

Eventually, we easily see that (2.7) holds. Indeed, if v^h \in T_{m^h} M_h, then |m_i^h + v_i^h|^2 = |m_i^h|^2 + |v_i^h|^2 \ge 1, so \Pi_{M_h}(m^h + v^h) is just the L^2-projection of (m^h + v^h) on the product of balls (B(0,1))^{N_h} \subset (R^3)^{N_h}.

The previous Lemma 4.1 and Theorem 4.3 show that the sequence (m^n) satisfies all the hypotheses of Theorem 2.3. Hence, we have:

Theorem 4.4. There exists \Delta t_0 > 0 such that if \Delta t \in (0, \Delta t_0) and (m^n) \subset M_h is a sequence that complies to the scheme (4.1), then there exists \varphi \in M_h such that (m^n) converges to \varphi.

REFERENCES

Alaa, N.E. & M. Pierre (2013). Convergence to equilibrium for discretized gradient-like systems with analytic features. IMA J. Numer. Anal. 33(4), 1291–1321.
Alouges, F. (2008). A new finite element scheme for Landau-Lifchitz equations. Discrete Contin. Dyn. Syst. Ser. S 1(2), 187–196.
Alouges, F., E. Kritsikis, J. Steiner, & J.C. Toussaint (2014). A convergent and precise finite element scheme for Landau-Lifschitz-Gilbert equation. Numer. Math. 128(3), 407–430.
Grasselli, M. & M. Pierre (2012). Convergence to equilibrium of solutions of the backward Euler scheme for asymptotically autonomous second-order gradient-like systems. Commun. Pure Appl. Anal. 11(6), 2393–2416.
Haraux, A. (2012). Some applications of the Łojasiewicz gradient inequality. Commun. Pure Appl. Anal. 11(6), 2417–2427.
Haraux, A. & M.A. Jendoubi (1998). Convergence of solutions of second-order gradient-like systems with analytic nonlinearities. J. Differential Equations 144(2), 313–320.
Haraux, A. & M.A. Jendoubi (2015). The convergence problem for dissipative autonomous systems. SpringerBriefs in Mathematics. Springer, Cham; BCAM Basque Center for Applied Mathematics, Bilbao. Classical methods and recent advances, BCAM SpringerBriefs.
Landau, L. & I. Lifschitz (1935). On the theory of the dispersion of magnetic permeability in ferromagnetic bodies. Phys. Zeitsch. der Sow. 8, 153–169.
Łojasiewicz, S. (1971). Sur les ensembles semi-analytiques. In Actes du Congrès International des Mathématiciens (Nice, 1970), Tome 2, pp. 237–241. Gauthier-Villars, Paris.
Merlet, B. & T.N. Nguyen (2013). Convergence to equilibrium for discretizations of gradient-like flows on Riemannian manifolds. Differential Integral Equations 26(5–6), 571–602.
Merlet, B. & M. Pierre (2010). Convergence to equilibrium for the backward Euler scheme and applications. Commun. Pure Appl. Anal. 9(3), 685–702.
Nguyen, T.N. (2013). Convergence to equilibrium for discrete gradient-like flows and An accurate method for the motion of suspended particles in a Stokes fluid. Dissertation. Ecole Polytechnique.




Some results on the viscous Cahn-Hilliard equation in R^N

L.T.T. Bui
Faculty of Mathematics and Computer Science, University of Science, Vietnam National University,
Ho Chi Minh City, Vietnam

N.A. Dao
Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam

ABSTRACT: We study the existence and uniqueness of solutions for the viscous Cahn-Hilliard equation under weak growth assumptions on the nonlinearity and in the whole domain R^N. We also address a priori estimates which are sufficient to investigate the singular passage to the limit over different small parameters.

1 INTRODUCTION

Forward-backward parabolic equations arise in a variety of applications, such as edge detection in image processing (Perona & Malik (1990)), aggregation models in population dynamics (Padrón (1998)), stratified turbulent shear flow (Barenblatt & Bertsch & Dal Passo & Prostokishin & Ughi (1993)), and the theory of phase transitions (Brokate & Sprekels (1996), Bellettini & Fusco & Guglielmi (2006)). A well-known equation of this type is the Perona-Malik equation,

w_t = div\Big( \frac{\nabla w}{1 + |\nabla w|^2} \Big),   (1)

which is parabolic if |\nabla w| < 1 and backward parabolic if |\nabla w| > 1. Similarly, the equation

u_t = \Delta\Big( \frac{u}{1 + u^2} \Big)   (2)

is parabolic if |u| < 1 and backward parabolic if |u| > 1. Observe that in one space dimension the above equations are formally related setting u = w_x. A different well-known equation of application in the theory of phase transitions is

u_t = \Delta \varphi(u),   (3)

where the famous choice of nonlinearity is \varphi(u) = u^3 - u.

Clearly, forward-backward parabolic equations lead to ill-posed problems. Often a higher order term is added to the right-hand side to regularize the equation. Two main classes of additional terms are encountered in the mathematical literature, which, e.g. in case of equations (2), (3), reduce to:

i. \delta\,[\Delta u]_t with \delta > 0, leading to third order pseudo-parabolic equations (\delta > 0 being a small parameter; for example, see (Novick-Cohen & Pego (1991), Plotnikov (1994), Smarrazzo & Tesei (2012), Smarrazzo & Tesei (2013), Bui et al. 2014a));
ii. -\varepsilon\,\Delta^2 u, leading to fourth-order Cahn-Hilliard type equations (for example, see (Bellettini & Fusco & Guglielmi (2006), Plotnikov (1997), Slemrod (1991)) and references therein).

Remarkably, when \varphi(u) = u^3 - u either of the above regularizations can be regarded as a particular case of the viscous Cahn-Hilliard equation,

v\, u_t = \Delta\big[ \varphi(u) - \varepsilon \Delta u + \delta u_t \big],   (\varepsilon, \delta, v > 0),   (4)

choosing either \delta = 0 or \varepsilon = 0; here \varphi(u) = u^3 - u or \varphi(u) = \frac{u}{1+u^2} for equation (2), whereas in general it denotes a non-monotonic function.

Equation (4) has been derived by several authors using different physical considerations (in particular, see (Gurtin (1996), Jäckle & Frisch (1986), Novick-Cohen (1988))). It is worth mentioning the wide literature concerning both the relationship between the viscous Cahn-Hilliard equation and phase field models, and generalized versions of the equation suggested in (Gurtin (1996)).

Concerning equation (4) with v = 1, existence results were obtained under suitable assumptions on the nonlinearity in bounded smooth domains of R^N (see (Carvalho & Dłotko (2007)), (Elliott & Stuart (1996)), (Bui et al. 2014b)). Moreover, in the latter reference the authors give the rigorous proof of convergence to solutions of either the Cahn-Hilliard equation, or of the Allen-Cahn



equation, or of the Sobolev equation, depending on the choice of the parameters \varepsilon, \delta. Recently, in (Dłotko et al. 2012) the authors give the analysis of equation (4) in R^N under some restrictive assumptions on the growth of the nonlinearity \varphi.

In the light of the above considerations, we study the following viscous Cahn-Hilliard parabolic problem

u_t = \Delta\big[ \varphi(u) - \varepsilon \Delta u + \delta u_t \big]   in R^N \times (0,T)
\lim_{|x|\to\infty} |u| = \lim_{|x|\to\infty} |\Delta u| = 0   for t \in (0,T)   (5)
u = u_0   in R^N \times \{0\},

where the nonlinearity \varphi satisfies the following assumptions:

(H1) there exists K > 0 such that

|\varphi'(u)| \le K\,(1 + |u|^{q-1})   (6)

for some q \in (1, \infty) if N = 1, 2, or q \in \big(1, \frac{N+2}{N-2}\big] if N \ge 3.

We obtain the existence results with a more extensive class of nonlinearities which includes the critical growth in (Dłotko et al. 2012). By the same way but more technical, we also give the analysis of singular limits of problem (5) as in (Bui et al. 2014b).

Here is the description of our main method. Firstly, we state and prove the existence of a weak solution of the viscous Cahn-Hilliard problem in a ball B_n which has center at the origin and radius n:

u_t = \Delta\big[ \varphi(u) - \varepsilon \Delta u + \delta u_t \big]   in B_n \times (0,T)
u = \Delta u = 0   on \partial B_n \times (0,T)   (7)
u = u_0   in B_n \times \{0\}.

It is worth mentioning that this result (see Theorem 3) is also an improvement of that in (Bui et al. 2014b). Second, we establish a family of uniformly bounded estimates on those solutions independent of n (see Lemma 4). Then we can pass to the limit as n \to \infty to get the desired result (see Theorem 5). Finally, by taking advantage of the set of uniformly bounded estimates on the solution of problem (5) with respect to appropriate parameters, we investigate its singular limits to get solutions of the Sobolev equation or the Cahn-Hilliard equation. This is a different way to get solutions of well-known equations which are extensively investigated in the literature.

This paper is organized as follows: Section 1 is the introduction of our problem. Our main results are presented in Section 2.

2 MAIN RESULTS

In this paper, let \Omega = B_n and Q_n = B_n \times (0,T), Q = R^N \times (0,T).

Definition 1. Let \varepsilon \in (0,1), \delta \in (0,1), and let u_0 \in H^2(\Omega) \cap H^1_0(\Omega). By a strict solution of problem (7) we mean any function u \in C([0,T]; H^2(\Omega) \cap H^1_0(\Omega)) \cap C^1([0,T]; L^2(\Omega)) such that \varphi(u) \in C([0,T]; L^2(\Omega)), and

u_t = \Delta v   in Q_n,   u = u_0   in \Omega \times \{0\}   (8)

in the strong sense. Here v \in C([0,T]; H^2(\Omega) \cap H^1_0(\Omega)) and for every t \in [0,T] the function v(\cdot,t) is the unique solution of the elliptic problem

-\delta \Delta v + v = \varphi(u) - \varepsilon \Delta u   in \Omega,   v(\cdot,t) = 0   on \partial\Omega.   (9)

The function v is called the chemical potential.

Definition 2. Let \varepsilon \in (0,1), \delta \in (0,1), and let u_0 \in H^2(R^N). By a strict solution of problem (5) we mean any function u \in C([0,T]; H^2(R^N)) \cap C^1([0,T]; L^2(R^N)) such that \varphi(u) \in C([0,T]; L^2(R^N)), and

u_t = \Delta v   in Q,   u = u_0   in R^N \times \{0\}   (10)

in the strong sense. Here v \in C([0,T]; H^2(R^N) \cap H^1_0(R^N)) and for every t \in [0,T] the function v(\cdot,t) is the unique solution of the elliptic problem

-\delta \Delta v + v = \varphi(u) - \varepsilon \Delta u   in R^N,   \lim_{|x|\to\infty} v(x,t) = 0.   (11)

The function v is called the chemical potential.

A well-posedness result for problem (7) under assumption (H1) is the content of the following theorem.

Theorem 3. Let \varepsilon \in (0,1), \delta \in (0,1), and let \varphi satisfy assumption (H1). Then for every u_0 \in H^2(\Omega) \cap H^1_0(\Omega) there exists a unique strict solution of problem (7).

Lemma 4. Let \varepsilon \in (0,1), \delta \in (0,1), u_0 \in H^2(\Omega) \cap H^1_0(\Omega) and let \varphi satisfy assumption (H1). Let \{u_n\} be the sequence of solutions to problems (7) given by Theorem 3, with u_{0n} \to u_0. Then for every t \in (0,T] there holds

\int_{B_n} \Phi(u_n)(x,t)\, dx + \frac{\varepsilon}{2} \int_{B_n} |\nabla u_n|^2(x,t)\, dx + \delta \int_0^t \int_{B_n} u_{nt}^2\, dx\, ds + \int_0^t \int_{B_n} |\nabla v_n|^2\, dx\, ds
= \int_{B_n} \Phi(u_{0n})(x)\, dx + \frac{\varepsilon}{2} \int_{B_n} |\nabla u_{0n}|^2\, dx,   (12)

where \Phi(u) = \int_0^u \varphi(s)\, ds.
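For intuition, the splitting into (10) and (11) can be imitated discretely: advance u_t = Δv and recover the chemical potential v from (I − δΔ)v = φ(u) − εΔu at every time level. The following 1-D periodic sketch is our own illustration (the paper works on R^N with decay at infinity; grid size, parameters and time step below are arbitrary choices):

```python
import numpy as np

N, eps, delta, dt = 64, 1e-2, 1e-1, 5e-4
x = 2 * np.pi * np.arange(N) / N
k = np.fft.fftfreq(N, d=1.0 / N)     # integer wavenumbers
u = 0.1 * np.cos(x)
mass0 = u.mean()

def phi(s):
    return s**3 - s                   # the classical nonlinearity

for _ in range(200):
    # (11): solve (I - delta*Laplacian) v = phi(u) - eps*Laplacian u spectrally
    rhs_hat = np.fft.fft(phi(u)) + eps * k**2 * np.fft.fft(u)
    v_hat = rhs_hat / (1 + delta * k**2)
    # (10): explicit step of u_t = Laplacian v
    u = u + dt * np.real(np.fft.ifft(-(k**2) * v_hat))

assert abs(u.mean() - mass0) < 1e-12  # the mass of u is conserved
assert np.max(np.abs(u)) < 1.0        # the solution stays bounded
print("viscous Cahn-Hilliard sketch: mass conserved, solution bounded")
```

The viscous term is what keeps the effective high-frequency symbol of order k² rather than k⁴ here, mirroring the regularizing role that δ plays in the a priori estimates below.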



By using the above uniform estimates, we can state and prove our main theorem as follows:

Theorem 5. Let \varepsilon \in (0,1), \delta \in (0,1), and let \varphi satisfy assumption (H1). Then for every u_0 \in H^2(R^N) there exists a unique strict solution of problem (5). Moreover, for every T > 0 there exists M > 0 (only depending on the norm \|u_0\|_{H^2(R^N)}) such that for any \varepsilon \in (0,1) and \delta \in (0,1)

\| \Phi(u) \|_{L^\infty((0,T); L^1(R^N))} \le M,   (13)

where \Phi(u) = \int_0^u \varphi(s)\, ds;

\| u \|_{L^\infty((0,T); H^1(R^N))} \le M;   (14)

\| u_t \|_{L^2(Q)} \le M;   (15)

\| \varphi(u) \|_{L^2(Q)} \le M;   (16)

\varepsilon^{3/2}\, \| \nabla^2 u \|_{L^2(Q)} \le M;   (17)

\| v \|_{L^2((0,T); H^1(R^N))} \le M;   (18)

\| v \|_{L^\infty((0,T); H^1(R^N))} \le M;   (19)

\| v \|_{L^2((0,T); H^2(R^N))} \le M.   (20)

Further estimates of the solution given by Theorem 5 are the content of the following theorem.

Theorem 6. Let \varepsilon \in (0,1), \delta \in (0,1) and u_0 \in H^2(R^N). Let \varphi satisfy (H1) and the following one:

(H2) there exists \bar u_0 > 0 such that \varphi'(u) > 0 if |u| \ge \bar u_0.

Let u be the solution of problem (5) given by Theorem 5. Then for every \varepsilon \in (0,1) and \delta \in (0,1)

\| u \|_{L^\infty((0,T); L^2(R^N))} \le \sqrt{\frac{1 + e^{2LT}}{(\cdots)}}\, \| u_0 \|_{H^1(R^N)};   (21)

\| u \|_{L^\infty((0,T); H^1(R^N))} \le \sqrt{\frac{2\,(1 + e^{2LT})}{(\cdots)}}\, \| u_0 \|_{H^1(R^N)};   (22)

\| u \|_{L^\infty(Q)} \le \sqrt{\frac{1 + e^{2LT}}{(\cdots)}}\, \| u_0 \|_{H^1(R^N)}.   (23)

Moreover, for every T > 0 and \delta \in (0,1) there exists M > 0 (only depending on the norm \|u_0\|_{H^1(R^N)} and on \delta, and diverging as \delta \to 0^+) such that for any \varepsilon \in (0,1) and n \in N

\| \varphi(u_n) \|_{L^2(Q)} \le M.   (24)

REFERENCES

Barenblatt, G.I., M. Bertsch, R. Dal Passo, V.M. Prostokishin & M. Ughi (1993). A mathematical problem of turbulent heat and mass transfer in stably stratified turbulent shear flow. J. Fluid Mech. 253, 341–358.
Bellettini, G., G. Fusco & N. Guglielmi (2006). A concept of solution for forward-backward equations of the form u_t = \frac{1}{2}(\varphi(u_x))_x and numerical experiments for the singular perturbation u_t = -\varepsilon u_{xxxx} + \frac{1}{2}(\varphi(u_x))_x. Discrete Cont. Dyn. Syst., 16, 259–274.
Brokate M. & J. Sprekels (1996). Hysteresis and Phase Transitions. Applied Mathematical Sciences, 121 (Springer, 1996).
Bui, L.T.T., F. Smarrazzo & A. Tesei (2014a). Sobolev regularization of a class of forward-backward parabolic equations. Journal of Differential Equations, 257(5), 1403–1456.
Bui, L.T.T., F. Smarrazzo & A. Tesei (2014b). Passage to the limit over small parameters of a viscous Cahn-Hilliard equation. J. Math. Analysis App., 420(2), 1265–1300.
Carvalho A.N. & T. Dłotko (2007). Dynamics of the viscous Cahn-Hilliard equation. Cadernos de Matemática, 8, 347–373.
Elliott, C.M. & A.M. Stuart (1996). Viscous Cahn-Hilliard Equation, II. Analysis. Journal of Differential Equations, 128, 387–414.
Gurtin M. (1996). Generalized Ginzburg-Landau and Cahn-Hilliard equations based on a microforce balance. Physica D 92, 178–192.
Jäckle, J. & H.L. Frisch (1986). Properties of a generalized diffusion equation with memory. J. Chem. Phys. 85, 1621–1627.
Novick-Cohen, A. & R.L. Pego (1991). Stable patterns in a viscous diffusion equation. Trans. Amer. Math. Soc. 324, 331–351.
Novick-Cohen, A. (1988). On the viscous Cahn-Hilliard equation. In Material Instabilities in Continuum Mechanics and Related Mathematical Problems (J.M. Ball, Ed.), pp. 329–342, Clarendon Press.
Padrón, V. (1998). Sobolev regularization of a nonlinear ill-posed parabolic problem as a model for aggregating populations. Comm. Partial Differential Equations, 23, 457–486.
Perona, P. & J. Malik (1990). Scale space and edge detection using anisotropic diffusion. IEEE Trans. Pattern Anal. Mach. Intell. 12, 629–639.
Plotnikov, P.I. (1994). Passing to the limit with respect to viscosity in an equation with variable parabolicity direction. Diff. Equ. 30, 614–622.
Plotnikov, P.I. (1997). Passage to the limit over a small parameter in the Cahn-Hilliard equations. Siberian Math. J. 38, 550–566.
Slemrod, M. (1991). Dynamics of measure-valued solutions to a backward-forward heat equation. J. Dynam. Differential Equations 3, 1–28.
Smarrazzo F. & A. Tesei (2012). Degenerate regularization of forward-backward parabolic equations: The regularized problem. Arch. Rational Mech. Anal. 204, 85–139.
Smarrazzo, F. & A. Tesei (2013). Degenerate regularization of forward-backward parabolic equations: The vanishing viscosity limit. Math. Ann. 355, 551–584.
Dłotko, Tomasz, Maria B. Kania & Chunyou Sun (2012). Analysis of the viscous Cahn-Hilliard equation in R^N. Journal of Differential Equations, 252, 2771–2791.


CH32_27.indd 251 3/15/2016 12:28:54 PM


Inverse problems

AMER16_Book.indb 253 3/15/2016 11:29:33 AM


Applied Mathematics in Engineering and Reliability – Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

On a multi-dimensional initial inverse heat problem with a time-dependent coefficient

C.D. Khanh & N.H. Tuan
Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam

ABSTRACT: In this paper, we solve an initial inverse problem for an inhomogeneous heat equation. The problem is ill-posed: the solution depends unstably on the given data functions. Up to now, most studies have focused on the homogeneous problem with constant coefficients. Recently, we solved the heat problem with a time-dependent coefficient in one dimension for a homogeneous heat equation. This work is a continuation of previous results (see (Quan 2011), (N.H. Tuan & Triet 2013)). Herein we introduce two efficient regularization methods, the quasi-boundary-type and high-frequency truncation methods. Error estimates between the regularized solutions and the exact solution are obtained, even in the Sobolev spaces $H^1$ and $H^2$.

Keywords: ill-posed problems; boundary value method; truncation method; heat equation; regularization

1 INTRODUCTION

The forward heat conduction problem for the heat equation is to predict the temperature field of a medium at a subsequent time from knowledge of an initial temperature and boundary conditions. An inverse heat conduction problem, on the other hand, is to recover the temperature distribution at a certain time from knowledge of the temperature and boundary conditions at a later time. Inverse problems for the heat equation are of great importance in engineering applications, and aim to detect a previous state of a physical field from its present information. They can be applied in several practical areas, such as image processing, mathematical finance, physics and mechanics of continuous media, etc. In this paper, we consider the problem of finding the temperature $u(x,t)$ such that

$$\begin{cases} u_t = b(t)\Delta u + f(x,t), & (x,t) \in \Omega \times (0,T),\\ u|_{\partial\Omega} = 0, & t \in (0,T),\\ u(x,T) = g(x), & x \in \Omega, \end{cases} \tag{1}$$

where $\Omega$ is an open, bounded and connected domain in $\mathbb{R}^n$ with sufficiently smooth boundary, $\Delta$ is the Laplace operator on $\mathbb{R}^n$, $\partial\Omega$ is the boundary of $\Omega$, and $b(t)$, $g(x)$, $f(x,t)$ are given. It is well known that the backward problem is ill-posed: its solution does not always exist, and in the case of existence it does not depend continuously on the given datum. In fact, a small noise in the contaminated physical measurement may produce a large error in the corresponding solution. This makes numerical computation difficult, hence a regularization is needed.

Many papers are devoted to special cases of the problem (1) in one dimension. For instance, when $b(t) = 1$ and $f(x,t) = 0$, the problem (1) has been investigated by many authors, such as John (John 1960), who introduced a fundamental concept to prescribe a bound on the solution at $t = T$ with relaxation of an initial datum $g$; Lattès and Lions (Lattès & J.L. Lions 1967), Showalter (Showalter 1974), and Ewing (Ewing 1975) used the quasi-reversibility method. Other approaches, including least squares methods with Tikhonov-type regularization, were introduced by Ames and Epperson (Ames & Epperson 1997) and Miller (Miller 1970). A parallel method for backward parabolic problems is proposed by Lee and Sheen (J. Lee 2006, J. Lee 2009). This problem was also investigated by many other authors, such as Clark and Oppenheimer (G.W. Clark & S.F. Oppenheimer 1994), Ames et al. (Ames & Epperson 1997), Denche and Bessila (Denche & Bessila 2005), Tautenhahn et al. (T. Schroter 1996), Melnikova et al. (I.V. Melnikova 1993b, I.V. Melnikova 1993a), Fu (X.L. Feng 2008, C.L. Fu 2007), and Yildiz et al. (B. Yildiz 2000, B. Yildiz 2003). When $b(t) = 1$ and $f(x,t) \neq 0$, the problem (1) has been studied by Trong et al. (Trong & Tuan 2006, Trong & Tuan 2008).
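The instability just described can be made concrete in one dimension. The following sketch is our illustration, not part of the original paper; it assumes $\Omega = (0,\pi)$, $b(t) \equiv 1$ and $f = 0$, so that $\lambda_p = p^2$ and a perturbation of size $\delta$ in the $p$-th Fourier mode of the final datum $g$ is amplified by $e^{\lambda_p T}$ when the problem is solved backward to $t = 0$.

```python
import math

# Backward heat problem u_t = u_xx on (0, pi), b(t) = 1, f = 0:
# a final-datum perturbation delta * X_p becomes e^{lambda_p T} * delta * X_p at t = 0.
def backward_amplification(p, T):
    lam = p ** 2          # Dirichlet eigenvalue lambda_p = p^2 on (0, pi)
    return math.exp(lam * T)

T = 1.0
delta = 1e-8              # tiny measurement noise in the p-th Fourier coefficient
for p in (1, 3, 5):
    err0 = backward_amplification(p, T) * delta
    print(f"mode p={p}: noise {delta:.0e} at t=T grows to {err0:.3e} at t=0")
```

Even noise of size $10^{-8}$ in the fifth mode produces an error of order $10^2$ at $t = 0$; this amplification is exactly what the existence condition (5) below has to compensate.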



Up to now, the backward heat problem with a time-dependent coefficient in the main equation is still actively investigated. This kind of equation, $u_t - b(t)\Delta u = f(x,t)$, has many applications in groundwater pollution. It is a simple form of the advection–convection equation, which appears in groundwater pollution source identification problems (see (Atmadja & Bagtzoglou 2003)).

This work is a continuation of our previous results (Quan 2011). We solve the heat equation in the multi-dimensional case by two regularization methods, the modified quasi-boundary value method and the truncation method. The first is a perturbation method, whereby we modify the source term $f$ and the final datum $g$. The main idea of the quasi-boundary value method is to replace the boundary value problem with an approximate well-posed one, and then to construct approximate solutions of the given boundary value problem. This method has been applied to many problems, such as the evolution operator differential equation (G.W. Clark & S.F. Oppenheimer 1994), the hyper-parabolic partial differential equation (Showalter 1983), elliptic equations (Feng 2010), etc.

The second method is based on ideas of the paper (C.L. Fu 2007). Moreover, with the truncation method an error estimate achieving a better convergence rate is easily obtained. This fact has been confirmed in (X.L. Feng 2010, C.L. Fu 2007). The truncated regularization method is an effective method for solving some ill-posed problems, and it has been successfully applied to some inverse heat conduction problems (Berntsson 1999).

The outline of the rest of the paper is as follows. In the next section, we briefly analyze the ill-posedness of the problem (1). In Sections 3 and 4, we introduce the two regularization methods and error estimates between the exact solution and the regularized solutions, respectively.

2 MATHEMATICAL INITIAL INVERSE PROBLEM OF HEAT CONDUCTION

Let $b: [0,T] \to \mathbb{R}$ be a continuous function on $[0,T]$ satisfying $0 < B_1 \le b(t) \le B_2$ for $t \in [0,T]$; it is assumed to be differentiable for every $t$ and to satisfy $0 < b'(t) \le C_1$ for $t \in (0,T)$.

Throughout this article, we denote the $L^2(\Omega)$-norm by $\|\cdot\|$ and the inner product on $L^2(\Omega)$ by $\langle\cdot,\cdot\rangle$. We also suppose that $f \in L^2((0,T);L^2(\Omega))$ and $g \in L^2(\Omega)$. First, we state a few properties of the eigenvalues of the operator $-\Delta$ on the open, bounded and connected domain $\Omega$ with Dirichlet boundary conditions, for which we refer also to Section 6.5 in (Evans 1997).

Known facts (eigenvalues of the Laplace operator):

1. Each eigenvalue of $-\Delta$ is real. The family of eigenvalues $\{\lambda_p\}_{p=1}^\infty$ satisfies $0 < \lambda_1 \le \lambda_2 \le \lambda_3 \le \dots$, and $\lambda_p \to \infty$ as $p \to \infty$.
2. There exists an orthonormal basis $\{X_p\}_{p=1}^\infty$ of $L^2(\Omega)$, where $X_p \in H_0^1(\Omega)$ is an eigenfunction corresponding to $\lambda_p$:

$$-\Delta X_p(x) = \lambda_p X_p(x), \quad x \in \Omega; \qquad X_p(x) = 0, \quad x \in \partial\Omega, \tag{2}$$

for $p = 1, 2, \dots$

Let $0 \le q < \infty$. By $S^q(\Omega)$ we denote the space of all functions $g \in L^2(\Omega)$ with the property

$$\sum_{p=1}^\infty (1+\lambda_p)^{2q}\,|g_p|^2 < \infty, \tag{3}$$

where $g_p = \int_\Omega g(x)X_p(x)\,dx$. We define $\|g\|_{S^q(\Omega)} = \big(\sum_{p=1}^\infty (1+\lambda_p)^{2q}|g_p|^2\big)^{1/2}$. If $q = 0$ then $S^q(\Omega)$ is $L^2(\Omega)$ (see [?], Chapter V, and (X.L. Feng 2008), page 179).

As we know, the forward heat problem

$$\begin{cases} u_t = b(t)\Delta u + f(x,t), & (x,t) \in \Omega \times (0,T),\\ u|_{\partial\Omega} = 0, & t \in (0,T),\\ u(x,0) = g(x), & x \in \Omega, \end{cases} \tag{4}$$

where $f \in L^2((0,T);L^2(\Omega))$ and $g \in L^2(\Omega)$, has a unique solution. However, for the backward problem (1) with $f \in L^2((0,T);L^2(\Omega))$, $g \in L^2(\Omega)$, there is no guarantee that the solution exists. In the following theorem, we give an existence condition for the solution of problem (1) in terms of $f$ and $g$.

Theorem 2.2. The problem (1) has a unique solution $u$ if and only if

$$\sum_{p=1}^\infty \exp\!\left(2\lambda_p \int_0^T b(s)\,ds\right)\left| g_p - \int_0^T \exp\!\left(-\lambda_p \int_s^T b(\tau)\,d\tau\right) f_p(s)\,ds \right|^2 < \infty, \tag{5}$$

where $g_p = \int_\Omega g(x)X_p(x)\,dx$ and $f_p(s) = \int_\Omega f(x,s)X_p(x)\,dx$.

Proof: Suppose the problem (1) has an exact solution $u \in C([0,T];H_0^1(\Omega)) \cap C^1((0,T);L^2(\Omega))$. We have

$$\left\langle \frac{\partial u}{\partial t}, X_p \right\rangle = b(t)\,\langle \Delta u, X_p \rangle + \langle f(\cdot,t), X_p \rangle. \tag{6}$$



Integrating by parts, we have

$$\langle \Delta u, X_p \rangle = \int_\Omega u(x,t)\,\Delta X_p(x)\,dx = -\lambda_p\,\langle u(\cdot,t), X_p \rangle. \tag{7}$$

Combining (6) and (7), we obtain $u_p'(t) + b(t)\lambda_p u_p(t) = f_p(t)$, which is equivalent to

$$\frac{d}{dt}\left[\exp\!\left(-\lambda_p\int_t^T b(\tau)\,d\tau\right) u_p(t)\right] = \exp\!\left(-\lambda_p\int_t^T b(\tau)\,d\tau\right) f_p(t), \tag{8}$$

whereupon

$$\int_t^T \frac{d}{ds}\left[\exp\!\left(-\lambda_p\int_s^T b(\tau)\,d\tau\right) u_p(s)\right] ds = \int_t^T \exp\!\left(-\lambda_p\int_s^T b(\tau)\,d\tau\right) f_p(s)\,ds, \tag{9}$$

or

$$g_p - \exp\!\left(-\lambda_p\int_t^T b(\tau)\,d\tau\right) u_p(t) = \int_t^T \exp\!\left(-\lambda_p\int_s^T b(\tau)\,d\tau\right) f_p(s)\,ds. \tag{10}$$

Hence

$$u_p(t) = \langle u(\cdot,t), X_p \rangle = \exp\!\left(\lambda_p\int_t^T b(\tau)\,d\tau\right)\left( g_p - \int_t^T \exp\!\left(-\lambda_p\int_s^T b(\tau)\,d\tau\right) f_p(s)\,ds \right). \tag{11}$$

Letting $t = 0$ in (11), we have

$$u_p(0) = \exp\!\left(\lambda_p\int_0^T b(s)\,ds\right)\left( g_p - \int_0^T \exp\!\left(-\lambda_p\int_s^T b(\tau)\,d\tau\right) f_p(s)\,ds \right). \tag{12}$$

Then, since $u(\cdot,0) \in L^2(\Omega)$,

$$\|u(\cdot,0)\|^2 = \sum_{p=1}^\infty \exp\!\left(2\lambda_p\int_0^T b(s)\,ds\right)\left| g_p - \int_0^T \exp\!\left(-\lambda_p\int_s^T b(\tau)\,d\tau\right) f_p(s)\,ds \right|^2 < \infty. \tag{13}$$

If (5) holds, then we define

$$v(x) = \sum_{p=1}^\infty \exp\!\left(\lambda_p\int_0^T b(s)\,ds\right)\left( g_p - \int_0^T \exp\!\left(-\lambda_p\int_s^T b(\tau)\,d\tau\right) f_p(s)\,ds \right) X_p(x). \tag{14}$$

It is easy to see that $v \in L^2(\Omega)$. Then we consider the problem of finding $u$ from the forward heat problem

$$\begin{cases} u_t = b(t)\Delta u + f(x,t),\\ u|_{\partial\Omega} = 0, \quad t \in (0,T),\\ u(x,0) = v(x), \quad x \in \Omega. \end{cases} \tag{15}$$

The problem (15) is a forward problem, so it has a unique solution $u$ (see (Evans 1997)). We have

$$u(x,t) = \sum_{p=1}^\infty \left[\exp\!\left(-\lambda_p\int_0^t b(\tau)\,d\tau\right)\langle v, X_p\rangle + \int_0^t \exp\!\left(-\lambda_p\int_s^t b(\tau)\,d\tau\right) f_p(s)\,ds\right] X_p(x). \tag{16}$$

Thus

$$u(x,T) = \sum_{p=1}^\infty \left[\exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)\langle v, X_p\rangle + \int_0^T \exp\!\left(-\lambda_p\int_s^T b(\tau)\,d\tau\right) f_p(s)\,ds\right] X_p(x). \tag{17}$$

Combining (14), (17) and a simple computation, we get

$$u(x,T) = \sum_{p=1}^\infty g_p X_p(x) = g(x). \tag{18}$$

Hence, $u$ is the unique solution of the problem (1).

Remark 1:
1. When $b(t) = 1$, $f(x,t) = 0$, the problem (1) has a unique solution if and only if $g$ satisfies the following strong regularity assumption:

$$\sum_{p=1}^\infty e^{2\lambda_p T}\,|\langle g(\cdot), X_p(\cdot)\rangle|^2 < \infty. \tag{19}$$

This assumption is also given by Lemma 1 in (G.W. Clark & S.F. Oppenheimer 1994).
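The computation (6)–(11) reduces problem (1) to the scalar ODE $u_p'(t) + \lambda_p b(t)u_p(t) = f_p(t)$ for each Fourier coefficient. As a sanity check (our sketch, with the arbitrary choices $b(t) = 1 + t/2$, $\lambda_p = 4$ and $f_p(t) = \cos t$, all hypothetical), one can integrate this ODE forward from the value that formula (11) predicts at $t = 0$ and recover the prescribed final value $g_p$ at $t = T$:

```python
import math

lam, T, gp = 4.0, 1.0, 0.7
b  = lambda t: 1.0 + 0.5 * t                 # hypothetical coefficient b(t)
B  = lambda t: t + 0.25 * t * t              # antiderivative: B(t) = int_0^t b(s) ds
fp = lambda t: math.cos(t)                   # hypothetical source mode f_p(t)

def u_p(t, n=4000):
    # formula (11): u_p(t) = e^{lam (B(T)-B(t))} (g_p - int_t^T e^{-lam (B(T)-B(s))} f_p(s) ds)
    dt = (T - t) / n
    inner = 0.0
    for i in range(n):                       # midpoint rule for the inner integral
        s = t + (i + 0.5) * dt
        inner += math.exp(-lam * (B(T) - B(s))) * fp(s) * dt
    return math.exp(lam * (B(T) - B(t))) * (gp - inner)

# Check: integrating u' = -lam b(t) u + f_p(t) forward from u_p(0) must land on g_p at t = T.
n = 100000
dt = T / n
u = u_p(0.0)
for i in range(n):
    t = i * dt
    u += dt * (-lam * b(t) * u + fp(t))
print(round(u, 3), gp)
```

The forward integration returns (to discretization accuracy) the final value $g_p$, confirming that (11) inverts the forward evolution mode by mode.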



2. Some examples of $f$ and $g$ which satisfy (5) are given in the Numerical Experiments section.

Theorem 2.3. The problem (1) has at most one solution in $C([0,T];H_0^1(\Omega)) \cap C^1((0,T);L^2(\Omega))$. If (1) has a solution $u$, then it is given by

$$u(x,t) = \sum_{p=1}^\infty \exp\!\left(\lambda_p\int_t^T b(\tau)\,d\tau\right)\left( g_p - \int_t^T \exp\!\left(-\lambda_p\int_s^T b(\tau)\,d\tau\right) f_p(s)\,ds \right) X_p(x). \tag{20}$$

In spite of the uniqueness, the problem (1) is still ill-posed and some regularization methods are necessary. In the next sections, we propose two approximating problems.

Proof: The proof of this theorem is divided into two steps.

Step 1. The problem (1) has at most one solution. Let $u(x,t)$, $v(x,t)$ be two solutions of the problem (1) such that $u, v \in C([0,T];H_0^1(\Omega)) \cap C^1((0,T);L^2(\Omega))$. Put $w(x,t) = u(x,t) - v(x,t)$; then $w$ satisfies the problem

$$\begin{cases} w_t - b(t)\Delta w = 0,\\ w|_{\partial\Omega} = 0, \quad t \in (0,T),\\ w(x,T) = 0, \quad x \in \Omega. \end{cases} \tag{21}$$

Now, setting $G(t) = \int_\Omega w^2(x,t)\,dx$ $(0 \le t \le T)$ and taking the derivative of $G(t)$, we have

$$G'(t) = 2\int_\Omega w(x,t)\,w_t(x,t)\,dx = 2b(t)\int_\Omega w(x,t)\,\Delta w(x,t)\,dx. \tag{22}$$

Using the Green formula, we obtain

$$G'(t) = -2b(t)\int_\Omega (\nabla w(x,t))^2\,dx. \tag{23}$$

Hence

$$G''(t) = -4b(t)\int_\Omega \nabla w(x,t)\cdot\nabla w_t(x,t)\,dx - 2b'(t)\int_\Omega (\nabla w(x,t))^2\,dx. \tag{24}$$

Moreover, using integration by parts and $w_t(x,t) = b(t)\Delta w(x,t)$, we get

$$-4b(t)\int_\Omega \nabla w\cdot\nabla w_t\,dx = 4b(t)\int_\Omega \Delta w\,w_t\,dx = 4b^2(t)\int_\Omega (\Delta w(x,t))^2\,dx. \tag{25}$$

So

$$G''(t) = 4b^2(t)\int_\Omega (\Delta w(x,t))^2\,dx - 2b'(t)\int_\Omega (\nabla w(x,t))^2\,dx = 4\int_\Omega w_t^2(x,t)\,dx - 2b'(t)\int_\Omega (\nabla w(x,t))^2\,dx. \tag{26}$$

By (23), $-2b'(t)\int_\Omega(\nabla w)^2\,dx = \frac{b'(t)}{b(t)}G'(t) \ge \frac{C_1}{B_1}G'(t)$, since $G'(t) \le 0$ and $0 < \frac{b'(t)}{b(t)} \le \frac{C_1}{B_1}$. Therefore

$$G''(t)G(t) - (G'(t))^2 \ge \left[4\int_\Omega w_t^2\,dx + \frac{C_1}{B_1}G'(t)\right]G(t) - (G'(t))^2 = 4\left[\int_\Omega w_t^2\,dx\int_\Omega w^2\,dx - \left(\int_\Omega w\,w_t\,dx\right)^2\right] + \frac{C_1}{B_1}G'(t)G(t). \tag{27}$$

Using the Hölder inequality,

$$4\left[\int_\Omega w_t^2\,dx\int_\Omega w^2\,dx - \left(\int_\Omega w\,w_t\,dx\right)^2\right] \ge 0. \tag{28}$$

Then

$$G''(t)G(t) - (G'(t))^2 - \frac{C_1}{B_1}G'(t)G(t) \ge 0. \tag{29}$$

We define the function $m(t) = e^{\frac{C_1}{B_1}t}$ and then regard $G$ as a function of $m$. Let us introduce an auxiliary function

$$F(m) = \ln[G(t(m))]. \tag{30}$$

Since $t = \frac{B_1}{C_1}\ln m$, we have

$$F'(m) = \frac{G'(t(m))\,t'(m)}{G(t(m))} = \frac{B_1}{C_1 m}\,\frac{G'(t(m))}{G(t(m))}, \tag{31}$$

and

$$F''(m) = \frac{B_1^2}{C_1^2 m^2}\,\frac{G''(t(m))G(t(m)) - [G'(t(m))]^2}{G^2(t(m))} - \frac{B_1}{C_1 m^2}\,\frac{G'(t(m))}{G(t(m))}. \tag{32}$$



Using (29) and (32), we obtain $F''(m) \ge 0$. Hence $F$ is a convex function on the interval $1 \le m \le m_1$ with $m_1 = e^{C_1 T/B_1}$. According to the convexity of $F(m)$, we have

$$F(m) \le \frac{m-1}{m_1-1}\,F(m_1) + \frac{m_1-m}{m_1-1}\,F(1). \tag{33}$$

In addition, from (30), inequality (33) is equivalent to

$$G(t) \le [G(T)]^{\frac{m-1}{m_1-1}}\,[G(0)]^{\frac{m_1-m}{m_1-1}}. \tag{34}$$

Since $G(T) = 0$, we conclude that $G(t) = 0$ for $0 \le t \le T$. This implies that $u(x,t) = v(x,t)$. The proof of Step 1 is completed.

Step 2. The problem (1) has a solution, which is defined by (20). Using (11), we have (20).
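The log-convexity at the heart of Step 1 can be observed numerically. The sketch below is our illustration (not from the paper): for a spectral solution of $w_t = b(t)\Delta w$ with the hypothetical choices $\Omega = (0,\pi)$ and $b(t) = 1 + t$ (so $B_1 = 1$, $C_1 = 1$), the function $F(m) = \ln G(t(m))$ with $m = e^{(C_1/B_1)t}$ is convex, exactly as (29)–(32) assert.

```python
import math

# Spectral solution of w_t = b(t) w_xx on (0, pi), b(t) = 1 + t, modes lambda = 1 and 4.
Bint = lambda t: t + 0.5 * t * t                  # int_0^t b(s) ds

def G(t, c1=1.0, c2=0.5):
    # G(t) = ||w(., t)||^2 for w = c1 e^{-Bint} X_1 + c2 e^{-4 Bint} X_2
    return c1**2 * math.exp(-2 * Bint(t)) + c2**2 * math.exp(-8 * Bint(t))

def F(m):                                         # F(m) = ln G(t(m)),  m = e^{(C1/B1) t}
    return math.log(G(math.log(m)))

# Discrete convexity check of F on 1 <= m <= e^T (equally spaced grid in m).
T, n = 1.0, 200
ms = [1 + (math.e**T - 1) * i / n for i in range(n + 1)]
second_diffs = [F(ms[i-1]) - 2 * F(ms[i]) + F(ms[i+1]) for i in range(1, n)]
print(min(second_diffs))
```

Here $G(T) \neq 0$, so the demo only exhibits convexity; in the uniqueness proof, convexity combined with $G(T) = 0$ forces $G \equiv 0$ through (34).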
3 A MODIFIED QUASI-BOUNDARY VALUE METHOD AND ERROR ESTIMATES IN $L^2$

In practice, the datum $g$ is obtained by measurement at discrete points. Hence, instead of $g$, we have an inexact datum $g^\varepsilon \in L^2(\Omega)$ satisfying

$$\|g^\varepsilon - g\| \le \varepsilon. \tag{35}$$

In this section, we regularize the problem (1) by the following one:

$$\frac{\partial u^\varepsilon}{\partial t} = b(t)\Delta u^\varepsilon + \sum_{p=1}^\infty \frac{\exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)}{\varepsilon\lambda_p^k + \exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)}\,f_p(t)\,X_p(x), \quad (x,t) \in \Omega\times(0,T), \tag{36}$$

$$u^\varepsilon|_{\partial\Omega} = 0, \quad t \in (0,T), \qquad u^\varepsilon(x,T) = \sum_{p=1}^\infty \frac{\exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)}{\varepsilon\lambda_p^k + \exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)}\,g_p^\varepsilon\,X_p(x), \quad x \in \Omega, \tag{37}$$

where $0 < \varepsilon < 1$, $k \ge 1$, and $f_p(t)$, $g_p^\varepsilon$ are defined by

$$f_p(t) = \int_\Omega f(x,t)X_p(x)\,dx, \qquad g_p^\varepsilon = \int_\Omega g^\varepsilon(x)X_p(x)\,dx. \tag{38}$$

The idea behind the problem (36) has a long mathematical history, going back to (Showalter 1983, G.W. Clark & S.F. Oppenheimer 1994, Denche & Bessila 2005). Adding an appropriate corrector to the given datum $u(x,T)$ is the key idea of the quasi-boundary value method (or modified quasi-boundary value method). Using this method, Clark and Oppenheimer (G.W. Clark & S.F. Oppenheimer 1994), and Denche and Bessila (Denche & Bessila 2005), regularized a similar backward problem by replacing the given condition by

$$u(T) + \varepsilon u(0) = g \tag{39}$$

and

$$u(T) - \varepsilon u'(0) = g, \tag{40}$$

respectively. Tuan and Trong (Trong & Tuan 2008) presented a different perturbation of $g$, replacing it by a new term $u(x,T) + A(\varepsilon,T)g$, where $A(\varepsilon,T)$ satisfies some suitable conditions. The problem (36) is a generalized version of the regularized problem given in (Trong & Tuan 2008).

In the next theorem, we study the existence, uniqueness and stability of a (weak) solution of the problem (36).

Theorem 3.1. The problem (36) has a unique solution $u^\varepsilon \in W := C([0,T];L^2(\Omega)) \cap L^2((0,T);H_0^1(\Omega)) \cap C^1((0,T);H_0^1(\Omega))$. The solution depends continuously on $g^\varepsilon$ in $L^2(\Omega)$.

Proof: The proof is divided into two steps. In Step 1, the existence and the uniqueness of a solution of (36) are shown; the (unique) solution $u^\varepsilon$ of problem (36) is given by



$$u^\varepsilon(x,t) = \sum_{p=1}^\infty \frac{\exp\!\left(-\lambda_p\int_0^t b(\tau)\,d\tau\right)}{\varepsilon\lambda_p^k + \exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)}\left( g_p^\varepsilon - \int_t^T \exp\!\left(-\lambda_p\int_s^T b(\tau)\,d\tau\right) f_p(s)\,ds \right) X_p(x). \tag{41}$$

In Step 2, the stability of the solution is given. First, we state the following lemma.

Lemma 3.1. For $M, \varepsilon, x > 0$ and $k \ge 1$ (with $M^k > k\varepsilon$), we have the inequality

$$\frac{1}{\varepsilon x^k + e^{-Mx}} \le \frac{(kM)^k}{\varepsilon\,\ln^k\!\left(\frac{M^k}{k\varepsilon}\right)}. \tag{42}$$

Proof: Let $f(x) = \frac{1}{\varepsilon x^k + e^{-Mx}}$; we have

$$f'(x) = -\frac{k\varepsilon x^{k-1} - Me^{-Mx}}{(\varepsilon x^k + e^{-Mx})^2}. \tag{43}$$

The equation $f'(x) = 0$ has a unique solution $x_0$ such that $k\varepsilon x_0^{k-1} - Me^{-Mx_0} = 0$, which means that $x_0^{k-1}e^{Mx_0} = \frac{M}{k\varepsilon}$. Thus the function $f$ achieves its maximum at the unique point $x = x_0$. Hence

$$f(x) \le \frac{1}{\varepsilon x_0^k + e^{-Mx_0}} \le \frac{1}{\varepsilon x_0^k}. \tag{44,45}$$

By using the inequality $e^{Mx_0} \ge Mx_0$, we get

$$\frac{M}{k\varepsilon} = x_0^{k-1}e^{Mx_0} \le \left(\frac{e^{Mx_0}}{M}\right)^{k-1} e^{Mx_0} = \frac{e^{kMx_0}}{M^{k-1}}. \tag{46}$$

This gives $e^{kMx_0} \ge \frac{M^k}{k\varepsilon}$, or $kMx_0 \ge \ln\!\left(\frac{M^k}{k\varepsilon}\right)$. Therefore $x_0 \ge \frac{1}{kM}\ln\!\left(\frac{M^k}{k\varepsilon}\right)$. Hence, we obtain

$$f(x) \le \frac{1}{\varepsilon x_0^k} \le \frac{(kM)^k}{\varepsilon\,\ln^k\!\left(\frac{M^k}{k\varepsilon}\right)}. \tag{47}$$

The lemma is completely proved. Now we pass to the proof of Theorem 3.1. Denote $W = C([0,T];L^2(\Omega)) \cap L^2((0,T);H_0^1(\Omega)) \cap C^1((0,T);H_0^1(\Omega))$.

Step 1. The existence and the uniqueness of a solution of the problem (36). We divide this step into two parts.

Part A. If $u^\varepsilon \in W$ satisfies (41), then $u^\varepsilon$ is a solution of the problem (36). One can verify directly that $u^\varepsilon \in W$. Differentiating the $p$-th coefficient in (41), we have

$$\langle u_t^\varepsilon(\cdot,t), X_p(\cdot)\rangle = -\lambda_p b(t)\,\langle u^\varepsilon(\cdot,t), X_p(\cdot)\rangle + \frac{\exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)}{\varepsilon\lambda_p^k + \exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)}\,f_p(t) = b(t)\,\langle \Delta u^\varepsilon(\cdot,t), X_p(\cdot)\rangle + \frac{\exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)}{\varepsilon\lambda_p^k + \exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)}\,f_p(t). \tag{48}$$

This implies that

$$u_t^\varepsilon = b(t)\Delta u^\varepsilon + \sum_{p=1}^\infty \frac{\exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)}{\varepsilon\lambda_p^k + \exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)}\,f_p(t)\,X_p(x). \tag{49}$$

By letting $t = T$ in (41), we get

$$u^\varepsilon(x,T) = \sum_{p=1}^\infty \frac{\exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)}{\varepsilon\lambda_p^k + \exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)}\,g_p^\varepsilon\,X_p(x).$$

Therefore, $u^\varepsilon$ is a solution of the problem (36).

Part B. The problem (36) has at most one solution in $W$. This part can be proved in a similar way as Step 1 of the proof of Theorem 2.3. This ends Step 1.

Step 2. The solution of the problem (36) depends continuously on $g^\varepsilon$ in $L^2(\Omega)$. Let $w$ and $v$ be two solutions of (36) corresponding to the given values $g$ and $h$.

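Lemma 3.1 can be sanity-checked by brute force. The sketch below is ours (not part of the paper): it maximizes $f(x) = 1/(\varepsilon x^k + e^{-Mx})$ on a fine grid and compares the result with the right-hand side of (42).

```python
import math

def lemma_bound(eps, M, k):
    # right-hand side of (42); requires M**k > k*eps so the logarithm is positive
    return (k * M) ** k / (eps * math.log(M ** k / (k * eps)) ** k)

def f_max(eps, M, k, hi=100.0, n=200000):
    # grid search for the maximum of f(x) = 1/(eps x^k + e^{-Mx}) on (0, hi]
    best = 0.0
    for i in range(1, n + 1):
        x = hi * i / n
        best = max(best, 1.0 / (eps * x ** k + math.exp(-M * x)))
    return best

for eps, M, k in [(1e-3, 2.0, 1), (1e-4, 3.0, 2), (1e-2, 1.0, 1)]:
    assert f_max(eps, M, k) <= lemma_bound(eps, M, k)
print("Lemma 3.1 inequality verified on sample parameters")
```

Since the grid maximum never exceeds the true supremum, a passing check is consistent with (42); the bound is what converts the $1/\varepsilon$-type stability factor into the logarithmic factors $B_3$, $B_4$ used below.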


From (41), we have

$$w(x,t) = \sum_{p=1}^\infty \frac{\exp\!\left(-\lambda_p\int_0^t b(\tau)\,d\tau\right)}{\varepsilon\lambda_p^k + \exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)}\left( g_p - \int_t^T \exp\!\left(-\lambda_p\int_s^T b(\tau)\,d\tau\right) f_p(s)\,ds \right) X_p(x), \tag{50}$$

$$v(x,t) = \sum_{p=1}^\infty \frac{\exp\!\left(-\lambda_p\int_0^t b(\tau)\,d\tau\right)}{\varepsilon\lambda_p^k + \exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)}\left( h_p - \int_t^T \exp\!\left(-\lambda_p\int_s^T b(\tau)\,d\tau\right) f_p(s)\,ds \right) X_p(x), \tag{51}$$

where $g_p = \int_\Omega g(x)X_p(x)\,dx$ and $h_p = \int_\Omega h(x)X_p(x)\,dx$. It follows that

$$\|w(\cdot,t)-v(\cdot,t)\|^2 = \sum_{p=1}^\infty \left[\frac{\exp\!\left(-\lambda_p\int_0^t b(\tau)\,d\tau\right)}{\varepsilon\lambda_p^k + \exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)}\right]^2 (g_p - h_p)^2 \le \sum_{p=1}^\infty \left[\frac{1}{\varepsilon\lambda_p^k + e^{-B_2T\lambda_p}}\right]^2 (g_p - h_p)^2. \tag{52}$$

Using the inequality (Lemma 3.1 with $M = B_2T$)

$$\frac{1}{\varepsilon x^k + e^{-B_2Tx}} \le \frac{(kTB_2)^k}{\varepsilon\,\ln^k\!\left(\frac{(B_2T)^k}{k\varepsilon}\right)} = B_4\,\varepsilon^{-1}\left(\ln\frac{B_3}{\varepsilon}\right)^{-k}, \tag{53}$$

where $B_3 = \frac{(B_2T)^k}{k}$ and $B_4 = (kB_2T)^k$, we conclude that

$$\|w(\cdot,t)-v(\cdot,t)\|_{L^2} \le B_4\,\varepsilon^{-1}\left(\ln\frac{B_3}{\varepsilon}\right)^{-k}\|g-h\|_{L^2}. \tag{54}$$

This ends the proof of Step 2 and the proof of Theorem 3.1.

Theorem 3.2. Let $g \in S^k(\Omega)$ for some $k > 0$. Then we have

$$\|u^\varepsilon(\cdot,T) - g(\cdot)\|_{L^2} \le B_4\left(\ln\frac{B_3}{\varepsilon}\right)^{-k}\|g\|_{S^k}.$$

Proof: Let $\eta > 0$. Since $g(x) = \sum_{p=1}^\infty g_p X_p(x)$, there exists a positive integer $N$ for which $\sum_{p \ge N+1} g_p^2 < \eta/2$. We have

$$\|u^\varepsilon(\cdot,T) - g(x)\|_{L^2}^2 = \sum_{p=1}^\infty \frac{\varepsilon^2\lambda_p^{2k}\,g_p^2}{\left(\varepsilon\lambda_p^k + \exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)\right)^2}. \tag{55}$$

Using the estimate

$$\left(\varepsilon\lambda_p^k + \exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)\right)^2 > \exp\!\left(-2\lambda_p\int_0^T b(s)\,ds\right), \tag{56}$$

we get, splitting the sum at $N$,

$$\|u^\varepsilon(\cdot,T) - g(x)\|_{L^2}^2 \le \varepsilon^2\sum_{p=1}^N \lambda_p^{2k}\,g_p^2\,\exp\!\left(2\lambda_p\int_0^T b(s)\,ds\right) + \frac{\eta}{2}. \tag{57}$$

By taking $\varepsilon$ such that

$$\varepsilon^2\sum_{p=1}^N \lambda_p^{2k}\,g_p^2\,\exp\!\left(2\lambda_p\int_0^T b(s)\,ds\right) < \frac{\eta}{2},$$

we obtain $\|u^\varepsilon(\cdot,T) - g(x)\|_{L^2}^2 \le \eta$; this proves convergence for $g \in L^2(\Omega)$. By using (53), we have the following error estimate:

$$\|u^\varepsilon(\cdot,T) - g(x)\|_{L^2}^2 = \sum_{p=1}^\infty \frac{\varepsilon^2\lambda_p^{2k}\,g_p^2}{\left(\varepsilon\lambda_p^k + \exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)\right)^2} \le \sum_{p=1}^\infty \frac{\varepsilon^2\lambda_p^{2k}\,g_p^2}{(\varepsilon\lambda_p^k + e^{-B_2T\lambda_p})^2} \le B_4^2\left(\ln\frac{B_3}{\varepsilon}\right)^{-2k}\sum_{p=1}^\infty \lambda_p^{2k}\,g_p^2 \le B_4^2\left(\ln\frac{B_3}{\varepsilon}\right)^{-2k}\|g\|_{S^k}^2,$$

whereupon

$$\|u^\varepsilon(\cdot,T) - g(\cdot)\|_{L^2} \le B_4\left(\ln\frac{B_3}{\varepsilon}\right)^{-k}\|g\|_{S^k}.$$

Theorem 3.3. Let $g \in L^2(\Omega)$ be as in Theorem 3.2, and suppose that $\int_0^T \|f(\cdot,s)\|_{S^k}^2\,ds < \infty$. If $u^\varepsilon(x,0)$ converges in



$L^2(\Omega)$, then the problem (1) has a unique solution $u$. Furthermore, the regularized solution $u^\varepsilon(x,t)$ converges to $u(\cdot,t)$ uniformly in $t$ as $\varepsilon$ tends to zero.

Proof: Assume that $u_0(x) := \lim_{\varepsilon\to 0} u^\varepsilon(x,0)$ exists. Let

$$u(x,t) = \sum_{p=1}^\infty \left[\exp\!\left(-\lambda_p\int_0^t b(\tau)\,d\tau\right) u_{0p} + \int_0^t \exp\!\left(-\lambda_p\int_s^t b(\tau)\,d\tau\right) f_p(s)\,ds\right] X_p(x), \tag{58}$$

where $u_{0p} = \int_\Omega u_0(x)X_p(x)\,dx$ and $f_p(s) = \langle f(\cdot,s), X_p(\cdot)\rangle$. It is easy to check that $u$ satisfies $u_t = b(t)\Delta u + f(x,t)$ and $u(x,t) = 0$ for $x \in \partial\Omega$. We will prove that $u(x,T) = g(x)$. Using the inequality $(a+b)^2 \le 2(a^2+b^2)$, together with (53), we obtain an estimate of the form

$$\|u^\varepsilon(\cdot,t)-u(\cdot,t)\|_{L^2}^2 \le 2\,\|u^\varepsilon(\cdot,0)-u_0(\cdot)\|_{L^2}^2 + B_4^2\left(\ln\frac{B_3}{\varepsilon}\right)^{-2k}\int_0^T \|f(\cdot,s)\|_{S^k}^2\,ds. \tag{59}$$

Hence $\lim_{\varepsilon\to 0}\|u^\varepsilon(\cdot,t)-u(\cdot,t)\|_{L^2} = 0$. Using Theorem 3.2, we have $\lim_{\varepsilon\to 0}\|u^\varepsilon(\cdot,T)-g(\cdot)\|_{L^2} = 0$. Therefore $u(x,T) = g(x)$. Hence $u(x,t)$ is the unique solution of the problem (1). From (59), we also conclude that $u^\varepsilon(x,t)$ converges to $u(x,t)$ uniformly in $t$.

Theorem 3.4. Let $f \in L^2(0,T;L^2(\Omega))$ and $g \in L^2(\Omega)$. Suppose that the problem (1) has a unique solution $u(x,t)$ in $C([0,T];H_0^1(\Omega)) \cap C^1((0,T);L^2(\Omega))$ which satisfies $u(\cdot,t) \in S^k(\Omega)$ for every $t \in [0,T]$. Let $g^\varepsilon \in L^2(\Omega)$ be measured data such that $\|g^\varepsilon - g\| \le \varepsilon$. Then there exists a function $v^\varepsilon$ satisfying

$$\|u(\cdot,t) - v^\varepsilon(\cdot,t)\|_{L^2} \le (C+B_4)\left(\ln\frac{B_3}{\varepsilon}\right)^{-k} \tag{60}$$

for every $t \in [0,T]$, where $C = B_4\sup_{t\in[0,T]}\|u(\cdot,t)\|_{S^k(\Omega)}$.

Proof: Let $u^\varepsilon$ be the solution of the problem (36) corresponding to $g$, and let $v^\varepsilon$ be the solution of the problem (36) corresponding to $g^\varepsilon$, where $g$ and $g^\varepsilon$ appear on the right-hand side of (36)–(37). Using (20) and (41), we get

$$|\langle u^\varepsilon(\cdot,t)-u(\cdot,t), X_p(\cdot)\rangle| = \frac{\varepsilon\lambda_p^k}{\varepsilon\lambda_p^k + \exp\!\left(-\lambda_p\int_0^T b(\tau)\,d\tau\right)}\,|\langle u(\cdot,t), X_p(\cdot)\rangle| \le B_4\left(\ln\frac{B_3}{\varepsilon}\right)^{-k}\lambda_p^k\,|\langle u(\cdot,t), X_p(\cdot)\rangle|.$$

It follows that

$$\|u^\varepsilon(\cdot,t)-u(\cdot,t)\|_{L^2}^2 = \sum_{p=1}^\infty |\langle u^\varepsilon(\cdot,t)-u(\cdot,t), X_p(\cdot)\rangle|^2 \le B_4^2\left(\ln\frac{B_3}{\varepsilon}\right)^{-2k}\sum_{p=1}^\infty \lambda_p^{2k}\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2 \le C^2\left(\ln\frac{B_3}{\varepsilon}\right)^{-2k}. \tag{61}$$

Hence

$$\|u^\varepsilon(\cdot,t)-u(\cdot,t)\|_{L^2} \le C\left(\ln\frac{B_3}{\varepsilon}\right)^{-k}. \tag{62}$$

Using (62) and Step 2 of Theorem 3.1, we get

$$\|v^\varepsilon(\cdot,t)-u(\cdot,t)\|_{L^2} \le \|v^\varepsilon(\cdot,t)-u^\varepsilon(\cdot,t)\|_{L^2} + \|u^\varepsilon(\cdot,t)-u(\cdot,t)\|_{L^2} \le B_4\,\varepsilon^{-1}\left(\ln\frac{B_3}{\varepsilon}\right)^{-k}\|g^\varepsilon-g\| + C\left(\ln\frac{B_3}{\varepsilon}\right)^{-k} \tag{63}$$

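The logarithmic rate in (60) can be observed in a toy one-dimensional computation. The sketch below is our illustration (hypothetical setting: $\Omega = (0,\pi)$, $b \equiv 1$, $f = 0$, $k = 1$, exact initial coefficients $c_p = 1/p$): the regularized final value (37) filters each mode by $e^{-\lambda_p T}/(\varepsilon\lambda_p + e^{-\lambda_p T})$, and the error of the reconstructed initial value decreases only slowly as $\varepsilon \to 0$.

```python
import math

T = 1.0
lams = [p * p for p in range(1, 6)]          # Dirichlet eigenvalues on (0, pi)
c = [1.0 / p for p in range(1, 6)]           # coefficients of the exact u(x, 0)
g = [cp * math.exp(-lam * T) for cp, lam in zip(c, lams)]   # exact final datum

def reconstruct_error(eps, k=1, noise=0.0):
    # L2 error at t = 0 of the regularized solution (41) with perturbed datum g^eps
    err2 = 0.0
    for cp, lam, gp in zip(c, lams, g):
        gp_eps = gp + noise
        up0 = gp_eps / (eps * lam ** k + math.exp(-lam * T))   # (41) at t = 0
        err2 += (up0 - cp) ** 2
    return math.sqrt(err2)

errs = [reconstruct_error(eps, noise=eps / 10) for eps in (1e-2, 1e-4, 1e-6, 1e-8)]
print(errs)
```

The error sequence decreases but stays far from zero even at $\varepsilon = 10^{-8}$, which is the practical face of the $(\ln(B_3/\varepsilon))^{-k}$ convergence rate.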


$$\le (C+B_4)\left(\ln\frac{B_3}{\varepsilon}\right)^{-k}$$

for every $t \in [0,T]$, where $C$ is defined in Theorem 3.4. This ends the proof of Theorem 3.4.

Remark 2. If $k = 1$ then the condition $u(\cdot,t) \in S^k(\Omega)$ is equivalent to the condition $\Delta u(\cdot,t) \in L^2(\Omega)$. Hence, this condition is natural and acceptable. To estimate the error in higher Sobolev spaces such as $H^1$ and $H^2$, we cannot continue to use the modified quasi-boundary value method. We present a truncation method in the next section.

4 REGULARIZATION BY THE TRUNCATION METHOD AND ERROR ESTIMATES IN $L^2$, $H^1$, $H^2$

Suppose that the problem (1) has an exact solution $u \in C([0,T];H_0^1(\Omega)) \cap C^1((0,T);L^2(\Omega))$; according to (20), we have

$$u(x,t) = \sum_{p=1}^\infty \exp\!\left(\lambda_p\int_t^T b(\tau)\,d\tau\right)\left( g_p - \int_t^T \exp\!\left(-\lambda_p\int_s^T b(\tau)\,d\tau\right) f_p(s)\,ds \right) X_p(x). \tag{64}$$

Let

$$R_M = \{p \in \mathbb{N} : \lambda_p \le M\}, \qquad Q_M = \{p \in \mathbb{N} : \lambda_p > M\}. \tag{65}$$

In spite of the uniqueness, the problem is still ill-posed, and a regularization is necessary. For each $\varepsilon > 0$, we introduce the truncation mapping $P_\varepsilon: L^2(\Omega) \to C^\infty(\Omega) \cap H_0^1(\Omega)$,

$$P_\varepsilon w(x) = \sum_{p \in R_M} \langle w, X_p(\cdot)\rangle\,X_p(x). \tag{66}$$

In fact, $P_\varepsilon$ is a finite-dimensional orthogonal projection on $L^2(\Omega)$. We shall approximate the original problem by the following well-posed problem.

Theorem 4.1. For each $f \in L^2((0,T);L^2(\Omega))$ and $g \in L^2(\Omega)$, let $w \in L^2((0,T);L^2(\Omega))$ be defined by

$$\langle w(\cdot,t), X_p(\cdot)\rangle = \exp\!\left(\lambda_p\int_t^T b(\tau)\,d\tau\right)\left( \langle P_\varepsilon g, X_p(\cdot)\rangle - \int_t^T \exp\!\left(-\lambda_p\int_s^T b(\tau)\,d\tau\right) \langle P_\varepsilon f(\cdot,s), X_p(\cdot)\rangle\,ds \right) \tag{67}$$

for any $p \ge 1$. Then $w = P_\varepsilon w$ and it depends continuously on $g$: if $w_i$ is the solution with respect to $g_i$, $i = 1, 2$, then

$$\|w_1(t) - w_2(t)\|_{L^2(\Omega)} \le e^{B_2(T-t)M}\,\|g_1 - g_2\|_{L^2(\Omega)}.$$

Proof: Note that $w(t)$ is well-defined because $\langle w(t), X_p(x)\rangle = 0$ if $p \in Q_M$. This fact also implies that $w = P_\varepsilon w$. Now, for two solutions $w_1$, $w_2$ we have

$$\|w_1(\cdot,t)-w_2(\cdot,t)\|_{L^2(\Omega)}^2 = \sum_{p\in R_M} |\langle w_1(t)-w_2(t), X_p(x)\rangle|^2 = \sum_{p\in R_M} \exp\!\left(2\lambda_p\int_t^T b(\tau)\,d\tau\right)|\langle g_1-g_2, X_p(x)\rangle|^2 \le e^{2B_2(T-t)M}\,\|g_1(\cdot)-g_2(\cdot)\|_{L^2(\Omega)}^2. \tag{68,69}$$

Recalling the value of $M$, we have the desired estimate.

Theorem 4.2. Assume that the problem (1) has at most one (weak) solution $u \in C([0,T];L^2(\Omega)) \cap C^1((0,T);L^2(\Omega))$ corresponding to $f \in L^2((0,T);L^2(\Omega))$ and $g \in L^2(\Omega)$. Let $g^\varepsilon$ be measured data such that

$$\|g^\varepsilon - g\|_{L^2(\Omega)} \le \varepsilon.$$

Define the regularized solution $u^\varepsilon \in L^2((0,T);L^2(\Omega))$ from $g^\varepsilon$ as in (67). Then, for each $t \in [0,T]$, $u^\varepsilon(t) \in C^\infty(\Omega) \cap H_0^1(\Omega)$ and $\lim_{\varepsilon\to 0} u^\varepsilon(t) = u(t)$ in $L^2(\Omega)$ if we choose $M = \frac{\ln(1/\varepsilon)}{2TB_2}$.

Proof: Note that $u^\varepsilon(t) = P_\varepsilon u^\varepsilon(t) \in C^\infty(\Omega) \cap H_0^1(\Omega)$, as in Remark 1. Moreover, using the stability in Theorem 4.1, we find that

$$\|u^\varepsilon(\cdot,t)-u(\cdot,t)\|_{L^2(\Omega)} \le \|P_\varepsilon u^\varepsilon(t) - P_\varepsilon u(t)\|_{L^2(\Omega)} + \|P_\varepsilon u(t) - u(t)\|_{L^2(\Omega)} \le \sqrt{\varepsilon}\,\varepsilon^{t/(2T)} + \left(\sum_{p\in Q_M} |\langle u(\cdot,t), X_p(\cdot)\rangle|^2\right)^{1/2}. \tag{71}$$

The first term converges to zero as $\varepsilon \to 0$. To obtain the convergence of the second term on the right-hand side of (71), we note that $\sum_{p\in Q_M} |\langle u(\cdot,t), X_p(\cdot)\rangle|^2 \le \|u(\cdot,t)\|_{L^2(\Omega)}^2 < \infty$ and $M \to \infty$ as $\varepsilon \to 0$.

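The cut-off choice $M = \ln(1/\varepsilon)/(2TB_2)$ of Theorem 4.2 can be exercised in the same toy setting as before (our sketch, hypothetical data: $\Omega = (0,\pi)$, $b \equiv 1$ so $B_2 = 1$, $f = 0$): modes with $\lambda_p \le M$ are inverted exactly, modes with $\lambda_p > M$ are discarded, and the $L^2$ error at $t = 0$ decreases as $\varepsilon \to 0$.

```python
import math

T, B2 = 1.0, 1.0
lams = [p * p for p in range(1, 9)]
c = [1.0 / p ** 2 for p in range(1, 9)]                  # exact coefficients of u(x, 0)
g = [cp * math.exp(-lam * T) for cp, lam in zip(c, lams)]

def truncated_error(eps):
    M = math.log(1.0 / eps) / (2 * T * B2)               # cut-off from Theorem 4.2
    err2 = 0.0
    for cp, lam, gp in zip(c, lams, g):
        gp_eps = gp + eps                                # worst-case datum noise
        up0 = math.exp(lam * T) * gp_eps if lam <= M else 0.0
        err2 += (up0 - cp) ** 2
    return math.sqrt(err2)

errs = [truncated_error(10.0 ** (-j)) for j in (2, 6, 10, 14)]
print(errs)
```

Two error sources are visible in the code: amplified noise $e^{\lambda_p T}\varepsilon$ on the kept modes, and the truncated tail $\sum_{p\in Q_M} c_p^2$; the choice of $M$ balances them, which is exactly the trade-off behind estimate (71).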


In the above theorem, we have not given an error estimate, because the assumption on the exact solution $u$ is very weak (we did not even require $u(t) \in H_0^1(\Omega)$), and the error estimate at $t = 0$ is useless. However, in practical applications we may expect the exact solution to be smoother. In such cases, many explicit error estimates are shown below. An essential point to be stressed is that the regularized solution is the same in every case. This is substantially useful for practical applications: even if we do not know how smooth the exact solution is, we are always assured that the regularized solution still works without any further adjustment.

From the usual viewpoint of variational methods, it is natural to assume that $u(\cdot,t) \in H_0^1(\Omega)$ for all $t \in [0,T]$. Moreover, if $f$ is smooth and $u$ is a classical solution of the heat equation (1), then $u(\cdot,t) \in H^2(\Omega) \cap H_0^1(\Omega)$ for all $t \in [0,T]$. For these two situations we have the following explicit error estimates.

Theorem 4.3. Let $u$, $u^\varepsilon$ be as in Theorem 4.2, and let $t \in [0,T]$. Let us choose $M = \frac{\ln(1/\varepsilon)}{2TB_2}$.

i. Assume that $u(\cdot,t) \in H_0^1(\Omega)$. Then $\lim_{\varepsilon\to 0} u^\varepsilon(\cdot,t) = u(\cdot,t)$ in $H_0^1(\Omega)$ and

$$\|u^\varepsilon(\cdot,t)-u(\cdot,t)\|_{L^2(\Omega)} \le \sqrt{\varepsilon}\,\varepsilon^{t/(2T)} + \sqrt{\frac{2TB_2}{\ln(1/\varepsilon)}}\,\|u(\cdot,t)\|_{H_0^1(\Omega)}. \tag{72}$$

ii. Assume that $u(\cdot,t) \in H^2(\Omega)\cap H_0^1(\Omega)$. Then $\lim_{\varepsilon\to 0} u^\varepsilon(t) = u(t)$ in $H^2(\Omega)$ and

$$\|u^\varepsilon(\cdot,t)-u(\cdot,t)\|_{L^2(\Omega)} \le \sqrt{\varepsilon}\,\varepsilon^{t/(2T)} + \frac{2TB_2}{\ln(1/\varepsilon)}\,\|u(\cdot,t)\|_{H^2(\Omega)}, \tag{73}$$

$$\|u^\varepsilon(\cdot,t)-u(\cdot,t)\|_{H_0^1(\Omega)} \le \sqrt{\frac{\ln(1/\varepsilon)}{2TB_2}}\,\sqrt{\varepsilon}\,\varepsilon^{t/(2T)} + \sqrt{\frac{2TB_2}{\ln(1/\varepsilon)}}\,\|u(\cdot,t)\|_{H^2(\Omega)}. \tag{74}$$

Here we use the norms

$$\|w\|_{H_0^1}^2 = \|\nabla w\|_{L^2}^2, \tag{75}$$

$$\|w\|_{H^2}^2 = \|w\|_{L^2}^2 + \|w\|_{H_0^1}^2 + \|\Delta w\|_{L^2}^2. \tag{76}$$

Proof:
i. By using integration by parts and Parseval's equality, it is straightforward to check that if $u(t) \in H_0^1(\Omega)$ then

$$\sum_{p=1}^\infty \lambda_p\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2 = \|\nabla u(\cdot,t)\|_{L^2(\Omega)}^2. \tag{77}$$

Using (77), we have

$$\sum_{p\in Q_M} |\langle u(\cdot,t), X_p(\cdot)\rangle|^2 \le \frac{1}{M}\sum_{p=1}^\infty \lambda_p\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2 = \frac{1}{M}\,\|\nabla u(\cdot,t)\|_{L^2(\Omega)}^2. \tag{78}$$

This implies that $\big(\sum_{p\in Q_M}|\langle u(\cdot,t), X_p(\cdot)\rangle|^2\big)^{1/2} \le \frac{1}{\sqrt{M}}\,\|\nabla u(\cdot,t)\|_{L^2(\Omega)}$. Substituting the latter inequality into the estimate (71) in the proof of Theorem 4.2, we obtain the error estimate in $L^2$.

To prove the convergence in $H_0^1$, we use the identity (77) and the stability of Theorem 4.1 again:

$$\|u^\varepsilon(\cdot,t)-u(\cdot,t)\|_{H_0^1}^2 = \sum_{p=1}^\infty \lambda_p\,|\langle u^\varepsilon(\cdot,t)-u(\cdot,t), X_p(\cdot)\rangle|^2 = \sum_{p\in R_M} \lambda_p\,|\langle P_\varepsilon u^\varepsilon(\cdot,t)-P_\varepsilon u(\cdot,t), X_p(\cdot)\rangle|^2 + \sum_{p\in Q_M} \lambda_p\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2$$
$$\le M\,\|P_\varepsilon u^\varepsilon(\cdot,t)-P_\varepsilon u(\cdot,t)\|_{L^2(\Omega)}^2 + \sum_{p\in Q_M} \lambda_p\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2 \le \frac{\ln(1/\varepsilon)}{2TB_2}\,\varepsilon^{1+t/T} + \sum_{p\in Q_M} \lambda_p\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2. \tag{79}$$

The second term on the right-hand side of (79) converges to $0$ as $\varepsilon \to 0$ because of the convergence of the series in (77). Thus the convergence in $H_0^1$ has been proved.

ii. We now assume that $u(\cdot,t) \in H^2(\Omega)\cap H_0^1(\Omega)$. We have an identity similar to (77):

$$\sum_{p=1}^\infty \lambda_p^2\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2 = \|\Delta u(\cdot,t)\|_{L^2(\Omega)}^2. \tag{80}$$

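Identities (77) and (80) are Parseval's equality applied to $\nabla u$ and $\Delta u$. In one dimension (our sketch, assuming $\Omega = (0,\pi)$, $X_p(x) = \sqrt{2/\pi}\sin(px)$, $\lambda_p = p^2$) they can be verified directly for a finite expansion $u = \sum c_p X_p$: one should find $\|u'\|^2 = \sum \lambda_p c_p^2$ and $\|u''\|^2 = \sum \lambda_p^2 c_p^2$.

```python
import math

PI = math.pi
cs = [0.9, -0.4, 0.25]                     # coefficients c_p of a finite expansion

def norm2(h, n=20000):                     # midpoint rule for int_0^pi h(x)^2 dx
    dx = PI / n
    return sum(h((i + 0.5) * dx) ** 2 for i in range(n)) * dx

X   = lambda p, x: math.sqrt(2 / PI) * math.sin(p * x)
du  = lambda x: sum(c * p * math.sqrt(2 / PI) * math.cos(p * x) for p, c in enumerate(cs, 1))
d2u = lambda x: sum(-c * p * p * X(p, x) for p, c in enumerate(cs, 1))

lhs_grad = norm2(du)                                                  # ||u'||^2
rhs_grad = sum(p * p * c * c for p, c in enumerate(cs, 1))            # sum lambda_p c_p^2
lhs_lap  = norm2(d2u)                                                 # ||u''||^2
rhs_lap  = sum((p * p) ** 2 * c * c for p, c in enumerate(cs, 1))     # sum lambda_p^2 c_p^2
print(lhs_grad, rhs_grad, lhs_lap, rhs_lap)
```

The two quadrature values agree with the eigenvalue-weighted sums to discretization accuracy, which is all that (77) and (80) claim.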


The error estimate in $L^2(\Omega)$ follows from (71) and the inequality

$$\sum_{p\in Q_M} |\langle u(\cdot,t), X_p(\cdot)\rangle|^2 \le \frac{1}{M^2}\sum_{p=1}^\infty \lambda_p^2\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2 \le \frac{1}{M^2}\,\|u(\cdot,t)\|_{H^2(\Omega)}^2. \tag{81}$$

Similarly, from (79) and the estimate

$$\sum_{p\in Q_M} \lambda_p\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2 \le \frac{1}{M}\sum_{p=1}^\infty \lambda_p^2\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2 \le \frac{1}{M}\,\|u(\cdot,t)\|_{H^2(\Omega)}^2, \tag{82}$$

we find that

$$\|u^\varepsilon(\cdot,t)-u(\cdot,t)\|_{H_0^1}^2 \le \frac{\ln(1/\varepsilon)}{2TB_2}\,\varepsilon^{1+t/T} + \frac{2TB_2}{\ln(1/\varepsilon)}\,\|u(\cdot,t)\|_{H^2(\Omega)}^2. \tag{83}$$

Using the inequality $\sqrt{a+b} \le \sqrt{a}+\sqrt{b}$, we obtain the error estimate in $H_0^1$.

Finally, we prove the convergence in $H^2(\Omega)$. Similarly to (79), we have

$$\|\Delta(u^\varepsilon-u)(\cdot,t)\|_{L^2}^2 = \sum_{p=1}^\infty \lambda_p^2\,|\langle u^\varepsilon(\cdot,t)-u(\cdot,t), X_p(\cdot)\rangle|^2 \le M^2\,\|P_\varepsilon u^\varepsilon(\cdot,t)-P_\varepsilon u(\cdot,t)\|_{L^2(\Omega)}^2 + \sum_{p\in Q_M} \lambda_p^2\,|\langle u(x,t), X_p(x)\rangle|^2$$
$$\le \left(\frac{\ln(1/\varepsilon)}{2TB_2}\right)^2 \varepsilon^{1+t/T} + \sum_{p\in Q_M} \lambda_p^2\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2 \to 0 \tag{84}$$

as $\varepsilon \to 0$, due to the convergence of the series in (80).

Remark 3. In part (ii) of Theorem 4.3, an error estimate in $H^2(\Omega)$ is not given, because we only know $u(t) \in H^2(\Omega)\cap H_0^1(\Omega)$ and do not have enough information on the exact solution. However, when $u$ is smoother, an explicit error estimate in $H^2(\Omega)$ may be derived. In the last theorem, we shall give error estimates in some special cases, when more regularity of the exact solution is known. The proof of Theorem 4.3 shows that, in fact, $u(t) \in H_0^1(\Omega)$ and $u(t) \in H^2(\Omega)\cap H_0^1(\Omega)$ are equivalent to

$$\sum_{p\ge 1} \lambda_p^k\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2 < \infty \tag{85}$$

with $k = 1, 2$, respectively. We shall see that from a condition of this type with larger exponents we may improve the estimates and, in particular, give an error estimate in $H^2(\Omega)$. We also consider a stronger, exponential condition; although it is quite strict, if it holds then we obtain a better convergence rate.

Theorem 4.4. Let $u$, $u^\varepsilon$, $M$ be as in Theorem 4.2 and let $t \in [0,T]$.

i. Assume that

$$\|u(\cdot,t)\|_{S^k(\Omega)}^2 = \sum_{p=1}^\infty \lambda_p^{2k}\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2 < \infty \tag{86}$$

for some constant $k > 2$. Then

$$\|u^\varepsilon(\cdot,t)-u(\cdot,t)\|_{L^2(\Omega)} \le \sqrt{\varepsilon}\,\varepsilon^{t/(2T)} + \left(\frac{2B_2T}{\ln(1/\varepsilon)}\right)^{k}\|u(\cdot,t)\|_{S^k(\Omega)}, \tag{87}$$

$$\|u^\varepsilon(\cdot,t)-u(\cdot,t)\|_{H_0^1(\Omega)} \le \sqrt{M}\,\sqrt{\varepsilon}\,\varepsilon^{t/(2T)} + \left(\frac{2B_2T}{\ln(1/\varepsilon)}\right)^{k-\frac{1}{2}}\|u(\cdot,t)\|_{S^k(\Omega)}, \tag{88}$$

$$\|u^\varepsilon(\cdot,t)-u(\cdot,t)\|_{H^2(\Omega)} \le 3M\,\sqrt{\varepsilon}\,\varepsilon^{t/(2T)} + 3\left(\frac{2B_2T}{\ln(1/\varepsilon)}\right)^{k-1}\|u(\cdot,t)\|_{S^k(\Omega)}. \tag{89}$$

Here we assume $\varepsilon \le e^{-1}$ for the estimate in $H^2(\Omega)$.

ii. Assume that

$$F_r(t) = \sum_{p\ge 1} e^{2r\lambda_p}\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2 < \infty$$

for some constant $r > 0$. Then



$$\|u^\varepsilon(\cdot,t)-u(\cdot,t)\|_{L^2(\Omega)} \le \sqrt{\varepsilon}\,\varepsilon^{t/(2T)} + \sqrt{F_r(t)}\,\varepsilon^{\frac{r}{2TB_2}}, \tag{90}$$

$$\|u^\varepsilon(\cdot,t)-u(\cdot,t)\|_{H_0^1(\Omega)} \le \sqrt{\frac{\ln(1/\varepsilon)}{2TB_2}}\left(\sqrt{\varepsilon}\,\varepsilon^{t/(2T)} + \sqrt{F_r(t)}\,\varepsilon^{\frac{r}{2TB_2}}\right), \tag{91}$$

$$\|u^\varepsilon(\cdot,t)-u(\cdot,t)\|_{H^2(\Omega)} \le 3\,\frac{\ln(1/\varepsilon)}{2TB_2}\left(\sqrt{\varepsilon}\,\varepsilon^{t/(2T)} + \sqrt{F_r(t)}\,\varepsilon^{\frac{r}{2TB_2}}\right). \tag{92}$$

Here we assume $\varepsilon \le e^{-2T}$ for the estimate in $H_0^1(\Omega)$, and $\varepsilon \le e^{-4T}$ for the estimate in $H^2(\Omega)$.

Proof:
i. We proceed in the same way as in the proof of Theorem 4.3, and prove the error estimate in $H^2(\Omega)$ (the other ones are similar and easier). From

$$\sum_{p\in Q_M} \lambda_p^2\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2 \le \frac{1}{M^{2k-2}}\sum_{p\ge 1} \lambda_p^{2k}\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2 = \frac{\|u(\cdot,t)\|_{S^k(\Omega)}^2}{M^{2k-2}} \tag{93}$$

and (84), we find that

$$\|\Delta(u^\varepsilon-u)(\cdot,t)\|_{L^2} \le M\,\sqrt{\varepsilon}\,\varepsilon^{t/(2T)} + \frac{\|u(\cdot,t)\|_{S^k(\Omega)}}{M^{k-1}}.$$

Using

$$\|w\|_{H^2} \le \|w\|_{L^2} + \|w\|_{H_0^1} + \|\Delta w\|_{L^2} \le 3\,\|\Delta w\|_{L^2} \tag{94}$$

and $M \ge 1$, we conclude the desired estimate in $H^2(\Omega)$.

ii. From (71) and

$$\sum_{p\in Q_M} |\langle u(\cdot,t), X_p(\cdot)\rangle|^2 \le e^{-2rM}\sum_{p\ge 1} e^{2r\lambda_p}\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2 = F_r(t)\,e^{-2rM}, \tag{95}$$

we get the error estimate in $L^2(\Omega)$:

$$\|u^\varepsilon(\cdot,t)-u(\cdot,t)\|_{L^2(\Omega)} \le \sqrt{\varepsilon}\,\varepsilon^{t/(2T)} + \left(\sum_{p\in Q_M}|\langle u(\cdot,t), X_p(\cdot)\rangle|^2\right)^{1/2} \le \sqrt{\varepsilon}\,\varepsilon^{t/(2T)} + \sqrt{F_r(t)}\,\varepsilon^{\frac{r}{2TB_2}}. \tag{96}$$

Note that the function $\lambda \mapsto e^{2r(\lambda-M)}/\lambda$ is increasing for $\lambda \ge \frac{1}{2r}$; thus $\lambda_p\,e^{-2r(\lambda_p-M)} \le M$ for $\lambda_p > M \ge \frac{1}{2r}$. It implies that

$$\sum_{p\in Q_M} \lambda_p\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2 \le M\,e^{-2rM}\sum_{p\ge 1} e^{2r\lambda_p}\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2 = M\,F_r(t)\,\varepsilon^{\frac{r}{TB_2}}. \tag{97}$$

The error estimate in $H_0^1(\Omega)$ follows from the above estimate and (79). Similarly, because the function $\lambda \mapsto e^{2r(\lambda-M)}/\lambda^2$ is increasing for $\lambda \ge \frac{1}{r}$, we find that

$$\sum_{p\in Q_M} \lambda_p^2\,|\langle u(\cdot,t), X_p(\cdot)\rangle|^2 \le M^2\,F_r(t)\,\varepsilon^{\frac{r}{TB_2}}. \tag{98}$$

Thus (84) reduces to

$$\|\Delta(u^\varepsilon-u)(\cdot,t)\|_{L^2}^2 \le M^2\,\varepsilon^{1+t/T} + M^2\,F_r(t)\,\varepsilon^{\frac{r}{TB_2}}. \tag{99}$$

Hence

$$\|\Delta(u^\varepsilon-u)(\cdot,t)\|_{L^2} \le M\,\sqrt{\varepsilon}\,\varepsilon^{t/(2T)} + M\,\sqrt{F_r(t)}\,\varepsilon^{\frac{r}{2TB_2}} = \frac{\ln(1/\varepsilon)}{2TB_2}\left(\sqrt{\varepsilon}\,\varepsilon^{t/(2T)} + \sqrt{F_r(t)}\,\varepsilon^{\frac{r}{2TB_2}}\right). \tag{100}$$



Using

  ‖w‖_{H²} ≤ (‖w‖²_{L²} + ‖w‖²_{H¹₀} + ‖Δw‖²_{L²})^{1/2} ≤ √3 ‖Δw‖_{L²},  (101)

we have

  ‖u_ε(·,t) − u(·,t)‖_{H²(Ω)} ≤ √3 ‖Δ(u_ε − u)(·,t)‖_{L²(Ω)}
    ≤ √3 ε^{t/(2T)} ln(1/ε)/(2TB₂) + √3 √(F_r(t)) (ln(1/ε)/(2TB₂))^{1−r}.  (102)

5 CONCLUSION

In this paper, we solved a backward in time problem for the heat equation with time-dependent coefficients and an inhomogeneous source. We suggested two new methods: the truncation of high frequencies and the quasi-boundary-type method. In the theoretical results, we obtained error estimates of Hölder type in the L², H¹ and H² norms, based on some assumptions on the exact solution. The numerical results show that both methods are stable and converge to the exact solution at t = 0. In a comparison between the two regularized methods, the MQBV shows a better numerical performance in terms of error estimation and convergence rate. However, it should be stated that the regularized solutions of our methods are based on the series expression of the solution, which may lead to a limitation of the methods for applications in a general domain where a solution is described by any physical meaning or interesting problem.

REFERENCES

Ames, K.A. & J. Epperson (1997). A kernel-based method for the approximate solution of backward parabolic problems. SIAM J. Numer. Anal. 34(8), 127–145.
Atmadja, J. & A. Bagtzoglou (2003). Marching-jury backward beam equation and quasi-reversibility methods for hydrologic inversion: Application to contaminant plume spatial distribution recovery. WRR 39.
Berntsson, F. (1999). A spectral method for solving the sideways heat equation. Inverse Problems 15, 891–906.
Burmistrova, V. (2005). Regularization method for parabolic equation with variable operator. J. Appl. Math., no. 4, 382–392.
Clark, G.W. & S.F. Oppenheimer (1994). Quasireversibility methods for non-well posed problems. Elect. J. Diff. Eqns. 301, 1–9.
Denche, M. & K. Bessila (2005). A modified quasi-boundary value method for ill-posed problems. J. Math. Anal. Appl. 301.
Eldén, L., F. Berntsson & T. Regińska (2000). Wavelet and Fourier methods for solving the sideways heat equation. SIAM J. Sci. Comput. 21(6), 2187–2205.
Evans, L.C. (1997). Partial Differential Equations. American Mathematical Society, Providence, Rhode Island 19.
Ewing, R. (1975). The approximation of certain parabolic equations backward in time by Sobolev equations. SIAM J. Math. Anal. 6, 283–294.
Feng, X.L., L. Eldén & C.L. Fu (2008). Numerical approximation of solution of nonhomogeneous backward heat conduction problem in bounded region. Math. Comput. Simulation 79(2), 177–188.
Feng, X.L., L. Eldén & C.L. Fu (2010). Stability and regularization of a backward parabolic PDE with variable coefficient. J. Inverse and Ill-Posed Problems 18, 217–243.
Feng, X.L., L. Eldén & C.L. Fu (2010). A quasi-boundary-value method for the Cauchy problem for elliptic equations with nonhomogeneous Neumann data. J. Inverse Ill-Posed Probl. 18, 617–645.
Fu, C.L., X.T. Xiong & Z. Qian (2007). Fourier regularization for a backward heat equation. J. Math. Anal. Appl. 331, 472–480.
Isakov, V. (1998). Inverse Problems for Partial Differential Equations. Springer-Verlag, New York.
John, F. (1960). Continuous dependence on data for solutions of partial differential equations with a prescribed bound. Comm. Pure Appl. Math. 13, 551–585.
Lattès, R. & J.L. Lions (1967). Méthode de quasi-réversibilité et applications. Dunod, Paris.
Lee, J. & D. Sheen (2006). A parallel method for backward parabolic problems based on the Laplace transformation. SIAM J. Numer. Anal. 44.
Lee, J. & D. Sheen (2009). F. John's stability conditions versus A. Carasso's SECB constraint for backward parabolic problems. Inverse Problems.
Melnikova, I.V. & A. Filinkov (1993a). The Cauchy Problem. Three Approaches. Monographs and Surveys in Pure and Applied Mathematics 120. London–New York: Chapman and Hall.
Melnikova, I.V. & S.V. Bochkareva (1993b). Dokl. Akad. Nauk 329, 270–273.
Miller, K. (1970). Least squares methods for ill-posed problems with a prescribed bound. SIAM J. Math. Anal. 1, 52–74.
Nam, P.T., D.D. Trong & N.H. Tuan (2010). The truncation method for a two-dimensional nonhomogeneous backward heat problem. Appl. Math. Comput. 216, 3423–3432.
Payne, L. (1973). Some general remarks on improperly posed problems for partial differential equations. In Symposium on Non-Well Posed Problems and Logarithmic Convexity, Lecture Notes in Mathematics, 1–30.
Quan, P.H., D.D. Trong, L.M. Triet & N.H. Tuan (2011). A modified quasi-boundary value method for regularizing of a backward problem with time-dependent coefficient. Inverse Probl. Sci. Eng. 19, 409–423.
Schröter, T. & U. Tautenhahn (1996). On optimal regularization methods for the backward heat equation. Z. Anal. Anw. 15, 475–493.
Shidfar, A. & A. Zakeri (2005). A numerical technique for backward inverse heat conduction problems in one-dimensional space. Appl. Math. Comput. 171, 1016–1024.
Showalter, R. (1974). The final value problem for evolution equations. J. Math. Anal. Appl. 47, 563–572.
Showalter, R. (1983). Cauchy problem for hyper-parabolic partial differential equations. In Trends in the Theory and Practice of Non-Linear Analysis.



Trong, D. & N. Tuan (2006). Regularization and error estimates for nonhomogeneous backward heat problem. pp. 1–10.
Trong, D. & N. Tuan (2008). A nonhomogeneous backward heat problem: Regularization and error estimates. Electron. J. Diff. Eqns. 33, 1–14.
Tuan, N. & D. Trong (2010). A nonlinear parabolic equation backward in time: regularization with new error estimates. Nonlinear Anal. 73, 1842–1852.
Tuan, N.H., P.H. Quan, D.D. Trong & L.M. Triet (2013). On a backward heat problem with time-dependent coefficient: Regularization and error estimates. Appl. Math. Comp. 219, 6066–6073.
Yildiz, B., H. Yetis & A. Sever (2003). A stability estimate on the regularized solution of the backward heat problem. Appl. Math. Comp. 135, 561–567.
Yildiz, B. & M. Özdemir (2000). Stability of the solution of backward heat equation on a weak compactum. Appl. Math. Comput. 111, 1–6.



Advanced numerical methods



Applied Mathematics in Engineering and Reliability – Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

A study of Boundary Element Method for 3D homogeneous
Helmholtz equation with Dirichlet boundary condition

M.-P. Tran & V.-K. Huynh


Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam

ABSTRACT: In this paper, we study the numerical solution to the 3D homogeneous Helmholtz equation with Dirichlet boundary condition. The discretization of the problem is considered using the Boundary Element Method (BEM). The problem in the whole domain is first established in terms of the interior and exterior boundary integral equations. Based on the Green formula, the analytical solution to the Helmholtz equation is represented in terms of the boundary data. Finally, we also apply the Finite Difference Method (FDM) in order to give comparative results. Numerical performances are then presented to indicate the validity of our method.

1 INTRODUCTION

The Helmholtz equation, which carries the name of the physicist Hermann Ludwig Ferdinand von Helmholtz, has contributed to mathematical acoustics and electromagnetics. It is an important equation to be solved in numerical electromagnetic problems, for instance the waveguide problems in physical phenomena, acoustic radiation, heat conduction, wave propagation and electrolocation/echolocation problems. The Helmholtz equation in 2D and 3D is studied in many works, such as (Colton 1998), (E.A. Spence & Fokas 2009), (Olaf & Sergej 2007), (Burton & Miller 1971), (Ihlenburg & Babuška 1997), (Liu, Nakamura, & Potthast 2007), (Goldstein 1982), (Jan. 2009) and the references therein. A variety of problems require solutions to the Helmholtz equation in both interior and exterior domains. In the case of no source, the three-dimensional (3D) time-independent linear Helmholtz equation is considered as

  Δu + κ²u = 0,  x ∈ Ω ⊂ R³,  (1.1)

where κ > 0 is the wave number given by

  κ = 2πf/c = ω/c.  (1.2)

We introduce here λ = c/f, the wavelength of plane waves of frequency f. The governing Dirichlet boundary condition is considered on the bounded Lipschitz domain Ω ⊂ R³ as

  u = g on Γ,  (1.3)

where g is a continuous function defined on Γ ⊂ R³.
The full 3D Helmholtz problem (1.1) is divided into two subproblems, called the interior and exterior problems accordingly.
The interior Helmholtz problem with Dirichlet boundary condition is presented as

  Δu + κ²u = 0 in Ω,
  u = g on Γ,  (1.4)

in which, let us note that κ² cannot be a Dirichlet eigenvalue for −Δ on Ω; for frequencies f in (1.2) small enough the continuous problem therefore has a unique solution. Without this assumption, the Helmholtz operator is singular and there is either no solution or an infinite set of solutions to (1.4). Thus, it is important to construct sufficiently accurate discrete approximations.
The exterior Helmholtz problem, also called the scattering problem, is posed on the unbounded domain Ω_ext = R³ \ Ω̄, with Dirichlet boundary condition and the additional Sommerfeld radiation condition holding at infinity, as follows:

  Δu + κ²u = 0 in Ω_ext,
  u = g on Γ,
  ∂u/∂n_x(x) − iκ u(x) = O(1/|x|²), |x| → ∞,  (1.5)



where Γ = ∂Ω, and n_x represents the exterior normal vector. In scattering problems, typically the function g is smooth restricted to the boundary. In addition, the extra Sommerfeld condition must be added to ensure the uniqueness of the solution to the problem (1.5).
Numerical solutions to the Helmholtz equation play a vital role in applications, like mechanics, acoustics, electromagnetics, etc. Numerical methods for solving the Helmholtz problem have been an active research area in recent years, where the Finite Element Method (FEM) and FDM have been applied successfully. Reliable numerical methods for the 3D homogeneous Helmholtz equation (1.1) are discussed in (Jan. 2009) and several references therein. In this paper, we also study the numerical solution to the Helmholtz equation, and give some discrete schemes for the 3D problems. For instance, one proposes the boundary element method for the boundary integral equations and gives a comparison with the classical finite difference method.
The rest of this paper is organized as follows. In Section 2, we study the solution representation formulas of the interior and exterior problems. In the next section, we develop the variational framework to propose boundary integral value problems (BIPs), with respect to single layer and double layer potentials. In Section 4, we employ some numerical methods applied to the previous BIPs. For instance, both the BEM and the classical FDM discretization are proposed therein. Some numerical experiments are then presented in Section 5. In the last section, some relevant conclusions are drawn; we also discuss some remarks, open questions and future work.

2 SOLUTION REPRESENTATION FORMULA

In this section, we give a brief review of the solution representation formulas for both the interior and exterior Helmholtz equations (1.4), (1.5), following the results given in (Jan. 2009), respectively.
Let us consider the fundamental solution to the Helmholtz equation. The study of the fundamental solution is used to formulate the solution to our problem. Let the function G_κ : R³ × R³ → C be the fundamental solution to the Helmholtz equation in R³, defined by

  −(Δ + κ²) G_κ = δ₀,  (2.1)

where δ₀ represents the Dirac delta distribution. One also has

  G_κ(x − y) = (1/(4π)) e^{iκ|x−y|} / |x − y|,  x, y ∈ R³, x ≠ y.  (2.2)

The two following theorems allow the construction of the boundary element method, whose formulas design the solution calculation of the boundary value problems in the next section.

2.1 Representation formula for the interior problem

Theorem 2.1. (Representation Formula for Bounded Domains, (Jan. 2009)) Let Ω ⊂ R³ be a bounded C¹ domain and let the boundary Γ = ∂Ω be smooth enough so that the integration by parts formula (in multi-dimensional space) holds. Let G_κ denote the fundamental solution for the Helmholtz equation in R³ and let n denote the outward normal vector to Γ. Then for u ∈ C²(Ω̄) we have the representation formula

  u(x) = ∫_Γ (∂u/∂n)(y) G_κ(x − y) ds(y) − ∫_Γ u(y) (∂G_κ(x, y)/∂n_y) ds(y)  for x ∈ Ω.  (2.3)

2.2 Representation formula for the exterior problem

Theorem 2.2. (Representation Formula for Unbounded Domains, (Jan. 2009)) Let Ω ⊂ R³ be a bounded C¹ domain and let the boundary Γ = ∂Ω be smooth enough so that the integration by parts formula (in multi-dimensional space) holds. Let G_κ denote the fundamental solution for the Helmholtz equation in R³ and let n denote the outward normal vector to Γ. Let us define Ω_ext = R³ \ Ω̄. Then for u ∈ C²(Ω_ext) satisfying

  Δu + κ²u = 0 in Ω_ext,

and the Sommerfeld radiation condition

  ∂u/∂n_x(x) − iκ u(x) = O(1/|x|²), |x| → ∞,

we have the representation formula

  u(x) = ∫_Γ u(y) (∂G_κ(x, y)/∂n_y) ds(y) − ∫_Γ (∂u/∂n)(y) G_κ(x − y) ds(y)  for x ∈ Ω_ext.  (2.4)

3 BOUNDARY INTEGRAL EQUATIONS

This section aims at providing the boundary integral equations derived from the representation formulas



in the previous section. It allows us to find the solution to (1.1) in the form of a single layer potential and a double layer potential. This leads to the indirect boundary element method for resolving boundary value problems.
Let us introduce the integral operators

  S̃_κ : H^{−1/2}(Γ) → H¹_loc(R³),

such that for x ∈ R³ \ Γ,

  (S̃_κ φ)(x) = ∫_Γ G_κ(x − y) φ(y) ds(y),  (3.1)

and

  D̃_κ : H^{1/2}(Γ) → H¹_loc(R³ \ Γ),

such that for x ∈ R³ \ Γ,

  (D̃_κ ψ)(x) = ∫_Γ (∂G_κ(x − y)/∂n_y) ψ(y) ds(y),  (3.2)

where φ, ψ : Γ → R are density functions. Then K_κ denotes the double layer potential operator

  (K_κ ψ)(x) = ∫_Γ (∂G_κ(x, y)/∂n_y) ψ(y) ds(y),  for x ∈ Γ.  (3.3)

The adjoint double layer potential operator K'_κ is defined by

  (K'_κ φ)(x) = ∫_Γ (∂G_κ(x, y)/∂n_x) φ(y) ds(y),  for x ∈ Γ.  (3.4)

As a result, let us recall the definitions of the Dirichlet and Neumann trace operators γ₀, γ₁, which were proposed in (Jan. 2009). The Dirichlet trace operator γ₀ is

  γ₀ : H¹_loc(Ω) → H^{1/2}(Γ).

Combining the operator γ₀ with S̃_κ one has the single layer potential

  S_κ : H^{−1/2}(Γ) → H^{1/2}(Γ),  S_κ = γ₀ S̃_κ.  (3.5)

And the Neumann trace operator is given as

  γ₁ : H¹_loc(Ω) → H^{−1/2}(Γ).

Combining the operator γ₁ with the single layer potential one has the linear continuous mapping

  γ₁ S̃_κ : H^{−1/2}(Γ) → H^{−1/2}(Γ).

Theorem 3.1. Let C(Γ) be the space of continuous functions. Then the representation formulas (2.3), (2.4) can be rewritten as

  u(x) = S̃_κ γ₁^int u(x) − D̃_κ γ₀^int u(x),  x ∈ Ω,  (3.6)

  u(x) = −S̃_κ γ₁^ext u(x) + D̃_κ γ₀^ext u(x),  x ∈ Ω_ext,  (3.7)

where γ₀^int, γ₀^ext are the interior and exterior Dirichlet trace operators, defined as

  γ₀^int : H¹_loc(Ω) → H^{1/2}(Γ),
  γ₀^ext : H¹_loc(Ω_ext) → H^{1/2}(Γ),

such that

  γ₀^int v = v|_Γ for v ∈ C(Ω̄),
  γ₀^ext v = v|_Γ for v ∈ C(Ω̄_ext),  (3.8)

and γ₁^int, γ₁^ext are the interior and exterior Neumann trace operators, defined respectively as

  γ₁^int : H¹_loc(Ω) → H^{−1/2}(Γ),
  γ₁^ext : H¹_loc(Ω_ext) → H^{−1/2}(Γ),

such that

  γ₁^int v = ∂v/∂n for v ∈ C¹(Ω̄),
  γ₁^ext v = ∂v/∂n for v ∈ C¹(Ω̄_ext).  (3.9)

Theorem 3.2. Let γ₀ and γ₁ be the Dirichlet and Neumann trace operators. Then we have on H^{−1/2}(Γ) and H^{1/2}(Γ):

  γ₀^ext S̃_κ − γ₀^int S̃_κ = 0,
  γ₁^ext S̃_κ − γ₁^int S̃_κ = −I,
  [γ₀ D̃_κ] = γ₀^ext D̃_κ − γ₀^int D̃_κ = I,
  [γ₁ D̃_κ] = γ₁^ext D̃_κ − γ₁^int D̃_κ = 0.

Theorem 3.3. (Jan. 2009) For φ ∈ H^{−1/2}(Γ) there holds

  γ₁^int (S̃_κ φ)(x) = (1/2) φ(x) + (K'_κ φ)(x),  x ∈ Γ,  (3.10)

  γ₁^ext (S̃_κ φ)(x) = −(1/2) φ(x) + (K'_κ φ)(x),  x ∈ Γ.  (3.11)

In addition, let us also introduce here the hypersingular integral operator, which is defined as the negative Neumann trace of the double layer potential D̃_κ, denoted by E_κ.
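All of the operators above reduce to boundary integrals of the closed-form kernel (2.2), which is cheap to evaluate pointwise. A small Python sketch (the paper's own experiments are in Matlab; the function name here is ours):

```python
import numpy as np

def helmholtz_kernel(x, y, kappa):
    """Fundamental solution (2.2): G_kappa(x - y) = exp(i*kappa*|x - y|) / (4*pi*|x - y|)."""
    r = np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    if r == 0.0:
        raise ValueError("the kernel is singular at x = y")
    return np.exp(1j * kappa * r) / (4.0 * np.pi * r)

# Sanity checks: |G| = 1/(4*pi*r), and G depends only on the difference x - y.
x, y, kappa = np.array([1.0, 0.0, 0.0]), np.zeros(3), 2.0
G = helmholtz_kernel(x, y, kappa)
print(abs(G))                              # = 1/(4*pi), about 0.0795775
print(helmholtz_kernel(y, x, kappa) == G)  # True: symmetric in x and y
```

For κ → 0 the kernel degenerates to the Laplace kernel 1/(4π|x − y|), which gives a quick consistency check for any implementation.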



  E_κ : H^{1/2}(Γ) → H^{−1/2}(Γ),  (3.12)

such that, for a smooth density function ψ, one has

  (E_κ ψ)(x) = −(γ₁(D̃_κ ψ))(x),  (3.13)

where the Neumann trace of the double layer potential term γ₁(D̃_κ ψ) is given on the boundary as

  γ₁(D̃_κ ψ) := γ₁^ext(D̃_κ ψ) = γ₁^int(D̃_κ ψ),  ψ ∈ H^{1/2}(Γ).  (3.14)

3.1 Interior boundary value problem

Theorem 3.4. If u is a solution to the interior Dirichlet Helmholtz problem

  Δu + κ²u = 0 in Ω,
  γ₀^int u = g on Γ,  (3.15)

with a bounded Lipschitz domain Ω and Dirichlet boundary condition g ∈ H^{1/2}(Γ), then the Neumann trace γ₁^int u satisfies the boundary integral equation

  S_κ(γ₁^int u(x)) = (1/2) g(x) + (K_κ g)(x),  x ∈ Γ,  (3.16)

and u has the representation formula (3.6).
Conversely, if γ₁^int u satisfies the boundary integral equation (3.16), then the representation formula (3.6) defines a solution u to the interior Dirichlet problem (3.15).

Proof. The solution to the equation (3.15) is given by the representation formula (3.6):

  u(x) = S̃_κ γ₁^int u(x) − D̃_κ γ₀^int u(x),  x ∈ Ω,

with unknown Neumann trace data γ₁^int u ∈ H^{−1/2}(Γ).
Applying the interior Dirichlet trace operator γ₀^int to both sides of (3.6) for x ∈ Γ and following Theorem 3.3, one gets the boundary integral equation for x ∈ Γ:

  (γ₀^int u)(x) = (S_κ γ₁^int u)(x) + (1/2)(γ₀^int u)(x) − (K_κ γ₀^int u)(x),  (3.17)

and since γ₀^int u = g for x ∈ Γ, we get the Fredholm boundary integral equation of the first kind:

  S_κ(γ₁^int u(x)) = (1/2) g(x) + (K_κ g)(x),  x ∈ Γ.

Respectively, applying the interior Neumann trace operator γ₁^int to both sides of (3.6) gives

  (γ₁^int u)(x) = (1/2)(γ₁^int u)(x) + (K'_κ γ₁^int u)(x) + (E_κ γ₀^int u)(x),  (3.18)

which yields the Fredholm boundary integral equation of the second kind:

  (1/2)(γ₁^int u)(x) − (K'_κ γ₁^int u)(x) = (E_κ g)(x),  x ∈ Γ.  (3.19)

According to the above equations, this gives the variational problems

  ⟨S_κ γ₁^int u, φ⟩_Γ = ⟨((1/2) I + K_κ) g, φ⟩_Γ,  ∀φ ∈ H^{−1/2}(Γ),  (3.20)

and

  ⟨((1/2) I − K'_κ) γ₁^int u, ψ⟩_Γ = ⟨E_κ g, ψ⟩_Γ,  ∀ψ ∈ H^{1/2}(Γ).  (3.21)

3.2 Exterior boundary value problem

Theorem 3.5. If u is a solution to the exterior Dirichlet problem

  Δu + κ²u = 0 in Ω_ext,
  γ₀^ext u = g on Γ,
  ∂u/∂n_x(x) − iκ u(x) = O(1/|x|²), |x| → ∞,  (3.22)

with a bounded Lipschitz domain Ω and Dirichlet boundary condition g ∈ H^{1/2}(Γ), then the Neumann trace γ₁^ext u satisfies the boundary integral equation

  S_κ(γ₁^ext u(x)) = −(1/2) g(x) + (K_κ g)(x),  x ∈ Γ,  (3.23)

and u has the representation formula (3.7).
Conversely, if γ₁^ext u satisfies the boundary integral equation (3.23), then the representation formula (3.7) defines a solution u to the exterior Dirichlet problem (3.22).



Proof. The solution can be represented as in (3.7):

  u(x) = −S̃_κ γ₁^ext u(x) + D̃_κ γ₀^ext u(x),  x ∈ Ω_ext,

with unknown Neumann trace data γ₁^ext u.
Applying the Dirichlet and Neumann exterior trace operators γ₀^ext and γ₁^ext to both sides of (3.7), following Theorem 3.3, and using the fact that γ₀^ext u = g on Γ = ∂Ω, we have the following Fredholm boundary integral equation of the first kind:

  S_κ(γ₁^ext u)(x) = −(1/2) g(x) + (K_κ g)(x),  x ∈ Γ.  (3.24)

Similarly, taking the exterior Neumann trace operator γ₁^ext on both sides of (3.7), one obtains the second kind Fredholm boundary integral equation:

  (1/2)(γ₁^ext u)(x) + (K'_κ γ₁^ext u)(x) = −(E_κ g)(x),  x ∈ Γ.  (3.25)

Furthermore, the boundary integral equations (3.23) and (3.25) are equivalent to the variational problems

  ⟨S_κ γ₁^ext u, φ⟩_Γ = ⟨(−(1/2) I + K_κ) g, φ⟩_Γ,  ∀φ ∈ H^{−1/2}(Γ),  (3.26)

and

  ⟨((1/2) I + K'_κ) γ₁^ext u, ψ⟩_Γ = −⟨E_κ g, ψ⟩_Γ,  ∀ψ ∈ H^{1/2}(Γ).  (3.27)

4 NUMERICAL METHODS FOR HELMHOLTZ INTERIOR AND EXTERIOR EQUATIONS

4.1 Boundary element method

In this section, we describe the discretization of the boundary integral equations from the previous section. In addition, we present a boundary element method for the parametrization of a triangular mesh in 3D. The solutions to the boundary value problems (3.16) and (3.23) are then approximated by the numerical techniques below.
Let us assume here that Ω is a polyhedral bounded Lipschitz domain in R³. First, one decomposes the boundary Γ ⊂ R³ into a finite set of boundary elements:

  Γ ≈ Γ_h = ∪_{i=1}^M τ_i,  (4.1)

where τ_i represents a boundary element for i = 1, 2, …, M and M, large enough, denotes the number of elements. Each element τ_k in (4.1) represents a discretized triangular element with three vertices X₁^k = (x₁^k, y₁^k, z₁^k), X₂^k = (x₂^k, y₂^k, z₂^k), and X₃^k = (x₃^k, y₃^k, z₃^k). Let us define the reference triangular domain τ̂ ⊂ R² as

  τ̂ = {ξ = (ξ₁, ξ₂) ∈ R² : 0 < ξ₁ < 1; 0 < ξ₂ < 1 − ξ₁}.  (4.2)

4.1.1 Piecewise constant basis functions

For every element τ_k we define the piecewise constant function ψ_k as

  ψ_k(X) = 1 for X ∈ τ_k, 0 elsewhere,  k = 1, 2, …, M.

The function ψ_k restricted to τ_k can be identified with the function

  ψ̂(ξ) = 1 for ξ ∈ τ̂, 0 elsewhere.

Let T_ψ(Γ) be the space used for the approximation of the Neumann data, i.e., of the normal derivatives on Γ. The linear space T_ψ(Γ) is defined as

  T_ψ(Γ) = span{ψ_k}_{k=1}^M,

and every complex-valued function g ∈ T_ψ(Γ) can be represented as

  g = Σ_{k=1}^M g_k ψ_k.

Let N denote the number of nodes of a given triangular mesh. We define the family of functions {φ_l}_{l=1}^N, continuous over the whole discretized boundary, as

  φ_l = 1 at x = x_l, 0 at the other nodes, piecewise affine otherwise.

Let us define the linear space T_φ(Γ) as

  T_φ(Γ) = span{φ_l}_{l=1}^N.

A function φ_l restricted to an element τ_k ⊂ supp φ_l can be identified with one of the functions

  φ̂₁ = 1 − ξ₁ − ξ₂,  φ̂₂ = ξ₁,  φ̂₃ = ξ₂,

defined on the reference triangle τ̂.
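In practice, integrals over a physical element τ_k with vertices X₁^k, X₂^k, X₃^k are pulled back to the reference triangle τ̂ of (4.2) through the standard affine (barycentric) parametrization x(ξ) = (1 − ξ₁ − ξ₂)X₁^k + ξ₁X₂^k + ξ₂X₃^k. A minimal Python sketch of this map (function names are ours; the paper's own code is in Matlab):

```python
import numpy as np

def to_physical(xi1, xi2, V):
    """Affine map from the reference triangle (4.2) onto the element with 3x3 vertex array V."""
    lam = np.array([1.0 - xi1 - xi2, xi1, xi2])  # barycentric weights, summing to 1
    return lam @ V

def element_area(V):
    """Area of the flat triangular element, i.e. the surface Jacobian factor |tau_k|."""
    return 0.5 * np.linalg.norm(np.cross(V[1] - V[0], V[2] - V[0]))

# The map reproduces the vertices and the centroid of the element.
V = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 1.0]])
print(to_physical(0.0, 0.0, V))          # first vertex
print(to_physical(1.0 / 3, 1.0 / 3, V))  # centroid of the element
print(element_area(V))
```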



Every function g ∈ T_φ(Γ) can be represented as

  g = Σ_{l=1}^N g_l φ_l.

The linear space T_φ(Γ) is applied for the approximation of the Dirichlet data, i.e., the values of the solution on Γ.

4.1.2 Discrete solution to the interior problem

Let us denote by u_h the approximated solution to the interior problem (1.4), and the discrete unknown Neumann data as

  W := γ₁^int u_h.  (4.3)

From (3.20), for all φ_h ∈ H_h, we have

  ⟨S_κ W, φ_h⟩ = ⟨((1/2) I + K_κ) g_h, φ_h⟩.  (4.4)

We find the approximate forms in terms of the functional bases as

  W = Σ_{k=1}^M W_k ψ_k ∈ T_ψ(Γ),  g_h = Σ_{l=1}^N g_l φ_l ∈ T_φ(Γ).

For every i = 1, 2, …, M, from (4.4) we obtain the linear system for the approximate solution W:

  V W = ((1/2) R + P) g_h,  (4.5)

or, simply,

  C W = D,  (4.6)

where for i, j = 1, …, M and m = 1, …, N:

  C_ij = V(τ_i, τ_j) = ∫_{τ_i} ∫_{τ_j} G_κ(x, y) ds(x) ds(y),
  D_j = Σ_m [(1/2) R + P]_{jm} (g_h)_m,  (4.7)

and

  R(τ_j, m) = ∫_{τ_j} φ_m(x) ds(x),
  P(τ_j, m) = ∫_{τ_j} ∫_Γ (∂G_κ(x, y)/∂n(y)) φ_m(y) ds(y) ds(x).

One gets the approximate solution u_h to the interior Dirichlet boundary value problem (1.4) given by the discrete representation formula for x ∈ Ω:

  u(x) ≈ u_h(x) = Σ_{k=1}^M W_k ∫_{τ_k} G_κ(x − y) ds(y) − Σ_{l=1}^N (g_h)_l ∫_Γ φ_l(y) (∂G_κ(x, y)/∂n(y)) ds(y).  (4.8)

4.1.3 Discrete solution to the exterior problem

Similarly, we denote by u_h the approximated solution to the exterior problem (1.5), and the unknown exterior Neumann data W̄ := γ₁^ext u_h. From (3.26), for all φ_h ∈ H_h, we have

  ⟨S_κ W̄, φ_h⟩ = ⟨(−(1/2) I + K_κ) g_h, φ_h⟩.  (4.9)

We find the approximate forms in terms of the functional bases as

  W̄ = Σ_{k=1}^M W̄_k ψ_k ∈ T_ψ(Γ),  g_h = Σ_{l=1}^N g_l φ_l ∈ T_φ(Γ).

For every i = 1, 2, …, M, from (4.9) this gives the discrete variational problem

  Σ_{k=1}^M W̄_k ⟨S_κ ψ_k, ψ_i⟩ = Σ_{l=1}^N g_l ⟨(−(1/2) I + K_κ) φ_l, ψ_i⟩.  (4.10)

Finally, we obtain the linear system for the approximation of the Neumann data W̄:

  V W̄ = (−(1/2) R + P) g_h,  (4.11)

where V, R, P and g_h are discretized as in the previous section.
One gets the approximate solution u_h to the exterior Dirichlet boundary value problem (1.5) given by the discrete representation formula for x ∈ Ω_ext:

  u(x) ≈ u_h(x) = −Σ_{k=1}^M W̄_k ∫_{τ_k} G_κ(x − y) ds(y) + Σ_{l=1}^N (g_h)_l ∫_Γ φ_l(y) (∂G_κ(x, y)/∂n(y)) ds(y).  (4.12)



4.2 Algorithm

We finally present the algorithm for the BEM that follows the description in Section 4. The algorithm is first proposed for the interior problem (1.4). Then, the scheme can be easily applied to the exterior problem (1.5) in 3D.
The BEM algorithm is described by the following steps:

1. Set the initial data: g, κ (not too large), Γ, number of nodes and elements;
2. Initialize the basis functions ψ_k, φ_l for the Neumann and Dirichlet data, respectively;
3. Calculate the matrices C and D following (4.7);
4. Solve the sparse linear system (4.6);
5. Calculate the numerical integrals on the boundary in (4.8); the numerical solution is then obtained.

It is also remarkable that in the discretized integral calculations, one can use some recent numerical quadrature rules, especially in the case of evaluating singular integrals along surfaces in three dimensions. One effective method is the Quadrature By Expansion (QBX) method, proposed in (O'Neil, Klöckner, Barnett, & Greengard 2013).

4.3 Finite Difference Method (FDM)

In this section, we describe the simplest approximation of the interior Helmholtz equation (1.4) by using the finite difference method. In this contribution, we look for the solution in the 3D rectangular domain Ω, where the preliminary values u(x, y, z) are known at the points of ∂Ω.
Let us discretize the computational domain with a 3D uniform grid with the spatial steps h₁ = Δx, h₂ = Δy, h₃ = Δz, respectively. Using centered finite difference approximations for the partial derivatives and the Laplace operator in (1.4), the following second order accurate system of simultaneous equations is obtained for all interior nodes (written for a uniform step h = h₁ = h₂ = h₃):

  u(x, y, z) = 1/(6 − (κh)²) [u(x + h₁, y, z) + u(x − h₁, y, z)
    + u(x, y + h₂, z) + u(x, y − h₂, z)
    + u(x, y, z + h₃) + u(x, y, z − h₃)] + O(h₁², h₂², h₃²).  (4.13)

Special equations are typically required for the boundary nodes, depending on the boundary condition (1.3); we refer to (Hegedus 2009). The result can be written as a large sparse linear system

  Au = b,  A ∈ C^{N×N},  u, b ∈ C^N,  (4.14)

where A is a complex matrix, as the boundary condition contains complex values, N = N_x N_y N_z is the total number of unknowns u in Ω, and N_x (N_y, N_z) are the numbers of discretized points along the x (y, z) direction. Following the discretized FDM scheme, the approximate solution to the homogeneous model is then obtained.
The convergence theory and the order of convergence of BEM and FDM are studied in (Sauter & Schwab 2010), (Erlangga 2005).

5 NUMERICAL EXPERIMENTS

In this section, some numerical performances are presented to confirm the efficiency of the BEM for the interior Boundary Value Problem (BVP), in a comparison with the classical FDM of Section 4.3. We consider the interior homogeneous Dirichlet boundary Helmholtz problem on a domain Ω such as a unit sphere (centered at the origin, radius 1) and a cube, as in Figure 1. In this contribution, a numerical algorithm has been developed that effectively solves (1.4).

Figure 1. Domain Ω for numerical experiments.

The following numerical simulations are performed entirely within the Matlab environment. It has been noticed that, with the piecewise constant basis functions of Section 4.1.1 and even for coarser triangular meshes, a large amount of RAM and computing time is needed.
The discrete solution is computed for the interior problem (1.4) with κ = 2π. On the sphere, the discrete solutions by FDM and BEM are presented in Figures 2 and 3, respectively. Our results are validated by numerical examples for the sphere. It is noticed that, differing from the FDM mesh, on a triangular mesh one uses a specific function to display numerical results. Moreover, since it is difficult to display the results in four dimensions, in these figures we fixed one of the three directions (x or y or z).
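The interior update (4.13) can also be applied directly as a fixed-point sweep on the uniform grid; this simple iteration is only usable for modest κh, and in practice one solves the sparse system (4.14) instead, e.g. with the robust iterative methods of (Erlangga 2005). A toy Python sketch with the Dirichlet data held fixed on the cube's boundary (the grid size and names are ours):

```python
import numpy as np

def helmholtz_jacobi_step(u, kappa, h):
    """One sweep of stencil (4.13): interior u = (sum of 6 neighbours) / (6 - (kappa*h)**2).
    Boundary values of u are kept fixed (Dirichlet data)."""
    nb = (np.roll(u, 1, 0) + np.roll(u, -1, 0)
          + np.roll(u, 1, 1) + np.roll(u, -1, 1)
          + np.roll(u, 1, 2) + np.roll(u, -1, 2))
    v = u.copy()
    v[1:-1, 1:-1, 1:-1] = nb[1:-1, 1:-1, 1:-1] / (6.0 - (kappa * h) ** 2)
    return v

# Toy run on a 9^3 grid of the unit cube with Dirichlet data u = 1 on the boundary.
n, kappa = 9, 2.0
h = 1.0 / (n - 1)
u = np.zeros((n, n, n))
u[0], u[-1] = 1.0, 1.0
u[:, 0], u[:, -1] = 1.0, 1.0
u[:, :, 0], u[:, :, -1] = 1.0, 1.0
for _ in range(200):
    u = helmholtz_jacobi_step(u, kappa, h)
print(u.shape)  # (9, 9, 9)
```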



The numerical solutions u(x, y, z) are then displayed in a three-dimensional coordinate system. On the cube, Figures 4 and 6 show the simulation results for the numerical solutions. We also present the solution inside the cube in Figure 5. It can be seen that the obtained waveform in these figures is very clear, and it is easy to observe the values of the solution under the color bar.

Figure 2. Solution to the interior problem by FDM on the sphere.
Figure 3. Solution to the interior BVP by BEM on the sphere.
Figure 4. Solution to the interior problem by FDM on the cube.
Figure 5. Solution to the interior problem by FDM inside the cube.
Figure 6. Solution to the interior BVP by BEM on the cube.

6 CONCLUSION

In this paper we focus on the study of the numerical solution to the homogeneous Helmholtz equation in 3D under the Dirichlet boundary condition. The boundary element method is presented. We first give the representation formulas with respect to the internal problem in a finite domain and the external problem in an infinite domain, where the solution should satisfy the radiation condition at infinity. Then, these problems can be reformulated as boundary integral equations in terms of single and double layer potentials. We then propose the discrete formulation of the variational Dirichlet boundary value problems. Using piecewise constant basis functions for the approximation of the solution, we obtain a sparse system of equations. The discretized problem is solved by the BEM iterative method. In addition, the classical FDM scheme is also given to provide comparative results. Many numerical computational examples on standard domains are implemented, which validate the correctness and effectiveness of the algorithm. This work gives an idea for solving the 3D non-homogeneous Helmholtz equation with different boundary conditions numerically, which will be analysed in future research.

REFERENCES

Burton, A.J. & G.F. Miller (1971). The application of integral equation methods to the numerical solution of some exterior boundary-value problems. Proceedings of the Royal Society of London, Series A, Mathematical and Physical Sciences 323(1553), 201–210.
Colton, D. (1998). Inverse Acoustic and Electromagnetic Scattering Theory, second edition.



Erlangga, A. (2005). A Robust and Efficient Iterative Method for the Numerical Solution of the Helmholtz Equation.
Goldstein, C.I. (1982). A FEM for solving Helmholtz type equations in waveguides and other unbounded domains. Mathematics of Computation 39(160), 309–324.
Hegedus, G. (2009). The numerical solution of the three dimensional Helmholtz equation with Sommerfeld boundary conditions. PIERS Proceedings, Moscow, Russia.
Ihlenburg, F. & I. Babuška (1997). Solution of Helmholtz problems by knowledge-based FEM. Computer Assisted Mechanics and Engineering Sciences 4, 397–415.
Jan., Z. (2009). The boundary element method for the Dirichlet–Neumann boundary value problem. Ostrava, 31.
Liu, J., G. Nakamura & R. Potthast (2007). A new approach and error analysis for reconstructing the scattered wave by the point source method. Journal of Computational Mathematics 25(2), 113–130.
O'Neil, M., A. Klöckner, A. Barnett & L. Greengard (2013). Quadrature by expansion: A new method for the evaluation of layer potentials. J. Comput. Phys. 252, 332–349.
Olaf, S. & R. Sergej (2007). The Fast Solution of Boundary Integral Equations. New York: Springer-Verlag, 279.
Sauter, S. & C. Schwab (2010). Boundary Element Methods. Springer-Verlag, Heidelberg.
Spence, E.A. & A. Fokas (2009). A new transform method I: domain-dependent fundamental solutions and integral representations. Proceedings of the Royal Society, Series A, 1–23.




A study of stochastic FEM method for porous media flow problem

R. Blaheta
Institute of Geonics of the CAS, Ostrava, Czech Republic

M. Béreš & S. Domesová
VŠB-Technical University of Ostrava, Ostrava-Poruba, Czech Republic
Institute of Geonics of the CAS, Ostrava, Czech Republic

ABSTRACT: The paper provides an overview of the stochastic Finite Element Method (FEM) for the investigation of the flow in heterogeneous porous materials with a microstructure being a Gaussian random field. Quantities characterizing the flow are random variables and the aim is to estimate their probability distribution. The integral mean of the velocity over the domain is one of these quantities, which is numerically analyzed for a described model problem. The estimation of those quantities is realized using the standard Monte Carlo method and the multilevel Monte Carlo method. The paper also concerns the use of the mixed finite element method for the solution of the Darcy flow and the efficient assembling and solving of the arising linear systems.

1 INTRODUCTION

Many natural materials, like geomaterials and biomaterials, possess a high level of heterogeneity which has to be properly treated for understanding and reliable modelling of processes in these materials. As a special case, we shall consider groundwater flow, which is important in many applications, as e.g. filtration and waste isolation. The groundwater flow can be further completed by transport of chemicals and pollutants or connected with deformation of the porous matrix.

The groundwater flow can be described by the boundary value problem

  -div(k grad p) = f  in Omega,
  p = p_hat  on Gamma_D,                                  (1)
  (k grad p) . n = 0  on Gamma_N,

where p is the pore (water) pressure, k is the permeability, u = -k grad p is the Darcy velocity, p_hat is a given Dirichlet type boundary condition on Gamma_D, no flow is assumed as the Neumann type boundary condition on Gamma_N = boundary(Omega) \ Gamma_D, and n is the unit outer normal to boundary(Omega).

We shall consider a two-field form of the above boundary value problem with two basic variables p: Omega -> R^1 and u: Omega -> R^n:

  k^{-1} u + grad p = g  in Omega,
  div(u) = f,
  p = p_hat  on Gamma_D,                                  (2)
  u . n = u_n = 0  on Gamma_N.

We will assume that k = k(x, omega) is a random field, x in Omega and omega in S. Here S is a sample space equipped with a suitable probability model with given parameters. Then the model outputs such as p, u and other quantities J(p, u), e.g. the averages

  <p> = (1/|Omega|) Int_Omega p dx                        (3)

and

  <u> = (1/|Omega|) Int_Omega u dx = -(1/|Omega|) Int_Omega k grad p dx   (4)

will also be random variables and we will be interested in their characteristics such as the mean (expectation) E and the variance V.

2 STOCHASTIC MICROSTRUCTURE

The permeability k = k(x, omega) can be considered as a random field in the domain Omega or in selected points within Omega. Especially, we shall assume that

  ln(k(x, .)) = c_1 + X,  X ~ N(mu, sigma^2),             (5)

where N(mu, sigma^2) denotes the normal distribution with the mean mu and variance sigma^2. This lognormal character of permeability is supported by experimental tests on rock as well as an experimentally




found logarithmic relation between permeability and porosity, see (Nelson et al. 1994, Freeze 1975). Thus mu in (5) could be interpreted as the porosity, which gives (5) a physical meaning.

If X in R^n is a random field such that X_i ~ N(mu, sigma^2), then the random field k connected with the selected points x^(i), i = 1, ..., n, can be generated as

  ln(k) = c_1 + X,                                        (6)

i.e. k = e^{c_1} e^X. For the numerical experiments we shall use c_1 = 0 and mu = 0, i.e.

  k = e^X.                                                (7)

In this case the components of k have lognormal distribution with the mean e^{sigma^2/2} and the variance (e^{sigma^2} - 1) e^{sigma^2}.

The random field X can be further smoothed by correlation, which provides the correlated random field X_c. The correlation is frequently described as an exponential expression involving a correlation length lambda, e.g.

  c(x, y) = cov(X_c(x), X_c(y)) = sigma^2 exp(-||x - y|| / lambda).   (8)

Different methods can be used to generate the correlated random field. The Choleski factorization of the correlation matrix C is probably the most straightforward one and will be used within the experiments in this paper. Further methods, such as a technique based on the discrete Fourier transform, can be found in the literature, see e.g. (Lord, Powell, & Shardlow 2014, Powell 2014).

Given the set of selected points x^(i), i = 1, ..., n, we can define the correlation matrix C by

  C = E((X - E(X))(X - E(X))^T) = E(X X^T) - E(X) E(X^T).   (9)

In the case of E(X) = 0, it provides

  C = E(X X^T),  C_ij = c(x^(i), x^(j)).                  (10)

Theorem 1. (Generation of the correlated random field). Let C = L L^T be the Choleski factorization of C and let X be an uncorrelated random field with X_i ~ N(0, 1). Then X_c = L X is the correlated random field with the correlation matrix C. We can write X_c ~ N(0, C).

Proof. If the X_i are uncorrelated and have zero mean and unit variance for any i, then E(X_i X_j) = delta_ij and therefore E(X X^T) = I. The correlation matrix is SPD and therefore the Choleski factorization exists. For X_c = L X, where L is the Choleski factor, it holds

  E(X_c X_c^T) = E(L X (L X)^T) = L E(X X^T) L^T = L L^T = C.   (11)

Note that the identity E(L X X^T L^T) = L E(X X^T) L^T follows from the linearity of the expectation operator E.

2.1 Model problem

As a model problem, we shall consider the groundwater flow given by equation (1) on the unit square Omega = <0,1> x <0,1> with specific boundary conditions, namely the pressure difference in the x_1 direction, see Figure 1. We shall be interested in different quantities, e.g.

  <u(omega)>,  k_eff = Int_0^1 u_1(1, x_2) dx_2,  u(x_hat) = -(k grad p)(x_hat).

For the realization of these calculations the mixed FEM method can be used, see Section 4. If the permeability k is a random field in Omega, then these quantities are also random variables and we shall compute their characteristics like the expectation and variance.

Figure 1. Test problem with pressure difference in the x_1 direction.

2.2 Visualization of the generated fields

For numerical experiments with the model problem, we use the different values lambda in {1, 2, 4} and sigma in {0.1, 0.3}. The following figures show the visualization of the generated random field k for the six combinations of the parameters lambda and sigma. All of the Gaussian
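The generation procedure of Theorem 1 can be sketched in a few lines. The fragment below is an illustrative NumPy sketch, not the authors' code; the grid size and the values of sigma and lambda are arbitrary choices. It assembles the exponential covariance (8) at a set of grid points, computes the Choleski factor, and maps an uncorrelated N(0,1) vector to a lognormal permeability sample as in (7).

```python
import numpy as np

# Grid of n = nx*nx sample points x^(i) in the unit square.
nx = 10
xs = np.linspace(0.0, 1.0, nx)
pts = np.array([(x1, x2) for x1 in xs for x2 in xs])

sigma, lam = 0.3, 1.0                       # field std deviation and correlation length
dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
C = sigma**2 * np.exp(-dist / lam)          # covariance matrix, eq. (8)

L = np.linalg.cholesky(C)                   # C = L L^T, Theorem 1
rng = np.random.default_rng(0)
X = rng.standard_normal(pts.shape[0])       # uncorrelated sample, X_i ~ N(0,1)
Xc = L @ X                                  # correlated field, X_c = L X
k = np.exp(Xc)                              # lognormal permeability, eq. (7)
```

Each call with a fresh X produces one microstructure realization; the Choleski factor L is computed once and reused across all samples.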




Figure 2. Random field for parameters: lambda = 1, sigma = 0.1.
Figure 3. Random field for parameters: lambda = 2, sigma = 0.1.
Figure 4. Random field for parameters: lambda = 4, sigma = 0.1.
Figure 5. Random field for parameters: lambda = 1, sigma = 0.3.
Figure 6. Random field for parameters: lambda = 2, sigma = 0.3.
Figure 7. Random field for parameters: lambda = 4, sigma = 0.3.




random fields were created from the same random vector X, where X_i ~ N(0,1), so we can observe the effect of the changes of the parameters lambda and sigma on the material microstructure.

The Figures 2, 3, 4, 5, 6 and 7 show that the changes of the parameter sigma affect only the logarithmic scale of the values, which is caused by the linear relation between ln k and sigma. The influence of the parameter lambda can be observed in a smoother material with growing lambda.

3 MONTE CARLO METHODS

Consider the Darcy flow model problem. We are interested in the estimation of the quantities

  <u(omega)>,  k_eff  and  u(x_hat).                      (12)

In the case of the Monte Carlo (MC) simulations, we consider these quantities as random variables.

3.1 Standard Monte Carlo method

Using the standard MC method, the expectation E(Y) of a random variable Y is estimated as the sample average

  E_hat(Y) = (1/N) Sum_{n=1}^{N} Y^(n),                   (13)

where the Y^(n) for n in {1, ..., N} are random samples of Y. The estimated probability distribution of the random variables is also described by the sample standard deviation, the estimated probability density function (pdf) and the cumulative distribution function (cdf). The variance of the MC estimator is calculated as

  V_MC = s^2 / N,                                         (14)

where s is the sample standard deviation.

The experiments were performed with the following parameters: grid size 200 x 200, lambda in {1, 2, 4}, sigma in {0.1, 0.3}, number of experiments 2*10^4. The following tables show the estimated sample average and sample standard deviation for the random variables u_x1(0.5, 0.5), u_x2(0.5, 0.5), <u_x1> and <u_x2>. The values after the +/- symbol correspond to the 95% confidence interval for the estimated value. For the random variable k_eff the same estimation as for <u_x1> was obtained. The graphs in Figure 8 show the pdf and cdf estimation for the random variable <u_x1>.

3.2 Multilevel Monte Carlo method

For the mean value E(Y_L) of a random variable Y_L we can write

  E(Y_L) = E(Y_0) + Sum_{l=1}^{L} E(Y_l - Y_{l-1}).       (15)

This leads to the multilevel Monte Carlo (MLMC) estimator

  E_hat(Y_L) = (1/N_0) Sum_{n=1}^{N_0} Y_0^(n) + Sum_{l=1}^{L} (1/N_l) Sum_{n=1}^{N_l} (Y_l^(n) - Y_{l-1}^(n)),   (16)

see (Cliffe, Giles, Scheichl, & Teckentrup 2011, Barth, Schwab, & Zollinger 2011). For different levels l in {1, ..., L} the values Y_l^(n) - Y_{l-1}^(n) are independent. However, the values Y_l^(n) and Y_{l-1}^(n) for a specific n in {1, ..., N_l} are correlated.

The variance of the MLMC estimator can be calculated as

  V_MLMC = Sum_{l=0}^{L} s_l^2 / N_l,                     (17)

where s_l is the sample standard deviation on the level l.

This approach was applied to the model problem with the grid size d x d. We were interested in the random variable Y_L = <u_x1>^(d), i.e. the integral mean of the velocity over the <0,1> x <0,1> domain calculated for the grid size d x d. Samples of the random variable Y_{L-1} = <u_x1>^(d/2) are calculated as the integral mean of the velocity for the grid (d/2) x (d/2), etc.

There are different ways of calculating the coarse grid approximation Y_{l-1} of Y_l in order to achieve a strong correlation between these two random variables (a high correlation between Y_{l-1} and Y_l leads to a low variance on the MLMC level l). In this paper we describe two possible procedures for the coarse grid approximation.

Procedure 1: Coarse grid approximation preserving the Gaussian random field distribution

The samples Y_l^(n) and Y_{l-1}^(n) should be correlated, therefore it is necessary to determine the way of the Y_{l-1}^(n) calculation. The value of Y_l^(n) corresponds to a specific sample k^(d) of the Gaussian random field, which was obtained for a random vector X, where X_i ~ N(0,1), i in {1, ..., d^2}. To obtain the value Y_{l-1}^(n) we first create a coarse material k^(d/2) from a




vector Y of length (1/4) d^2, which is calculated from the vector X values. For example

  Y_1 = (X_1 + X_2 + X_{d+1} + X_{d+2}) / 2               (18)

Figure 8. Estimated pdf and cdf of <u_x1> for sigma = 0.3 (left) and sigma = 0.1 (right).

Table 1. Sample average of u_x1(0.5, 0.5).

            sigma = 0.3           sigma = 0.1
lambda = 1  1.135 +/- 0.0132      1.0409 +/- 0.0096
lambda = 2  1.7244 +/- 0.055      1.1649 +/- 0.0251
lambda = 4  11.9402 +/- 1.8383    1.9334 +/- 0.1447

Table 2. Sample standard deviation of u_x1(0.5, 0.5).

            sigma = 0.3           sigma = 0.1
lambda = 1  0.9504 +/- 0.0093     0.6902 +/- 0.0068
lambda = 2  3.9656 +/- 0.0389     1.8104 +/- 0.0177
lambda = 4  132.6336 +/- 1.2999   10.4422 +/- 0.1023

Table 3. Sample average of u_x2(0.5, 0.5).

            sigma = 0.3           sigma = 0.1
lambda = 1  0.0008 +/- 0.0062     0.0006 +/- 0.0058
lambda = 2  0.0276 +/- 0.03       0.002 +/- 0.0169
lambda = 4  0.512 +/- 1.1175      0.0173 +/- 0.0968

Table 4. Sample standard deviation of u_x2(0.5, 0.5).

            sigma = 0.3           sigma = 0.1
lambda = 1  0.4492 +/- 0.0044     0.4201 +/- 0.0041
lambda = 2  2.1625 +/- 0.0212     1.2215 +/- 0.012
lambda = 4  80.627 +/- 0.7902     6.9851 +/- 0.0685

Table 5. Sample average of <u_x1>.

            sigma = 0.3           sigma = 0.1
lambda = 1  1.1228 +/- 0.0083     1.018 +/- 0.0032
lambda = 2  1.6821 +/- 0.0324     1.0983 +/- 0.0079
lambda = 4  10.0822 +/- 0.9339    1.7447 +/- 0.0409
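The averaging step (18) can be stated compactly. The sketch below is a hypothetical NumPy illustration (the function name and grid size are ours, not the paper's): each coarse cell combines the four underlying fine cells and divides by 2 rather than by 4, so that each coarse value again has unit variance and the coarse material remains a Gaussian random field.

```python
import numpy as np

def coarsen_preserving_normality(X):
    """Map a d x d field of iid N(0,1) values to a (d/2) x (d/2) field.

    Each coarse value is (X[2i,2j] + X[2i,2j+1] + X[2i+1,2j] + X[2i+1,2j+1]) / 2,
    cf. eq. (18): the sum of four iid N(0,1) variables has variance 4,
    so dividing by 2 restores unit variance.
    """
    return 0.5 * (X[0::2, 0::2] + X[0::2, 1::2] + X[1::2, 0::2] + X[1::2, 1::2])

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 200))     # fine grid sample, as in the experiments
Y = coarsen_preserving_normality(X)     # 100 x 100 coarse sample, still N(0,1)
```

By construction the coarse field is strongly correlated with the fine field it was derived from, which is exactly what the MLMC correction terms require.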




Table 6. Sample standard deviation of <u_x1>.

            sigma = 0.3           sigma = 0.1
lambda = 1  0.6024 +/- 0.0059     0.2335 +/- 0.0023
lambda = 2  2.34 +/- 0.0229       0.5695 +/- 0.0056
lambda = 4  67.3806 +/- 0.6604    2.9495 +/- 0.0289

Table 7. Sample average of <u_x2>.

            sigma = 0.3           sigma = 0.1
lambda = 1  0.0005 +/- 0.0015     0.0001 +/- 0.001
lambda = 2  0.003 +/- 0.0062      0.0002 +/- 0.0022
lambda = 4  0.02 +/- 0.1623       0.0023 +/- 0.0109

Table 8. Sample standard deviation of <u_x2>.

            sigma = 0.3           sigma = 0.1
lambda = 1  0.1098 +/- 0.0011     0.069 +/- 0.0007
lambda = 2  0.4499 +/- 0.0044     0.1596 +/- 0.0016
lambda = 4  11.7075 +/- 0.1147    0.7865 +/- 0.0077

(weighted arithmetic mean), etc. This approach ensures that the values Y_i follow the N(0,1) distribution, therefore the obtained material k^(d/2) is also a Gaussian random field. The value of Y_{l-1}^(n) is then calculated on the coarse grid and remains correlated with the value of Y_l^(n).

The MLMC method was tested on the model problem with the grid size 200 x 200, therefore it was possible to use three coarser grids of dimensions 100 x 100, 50 x 50 and 25 x 25. The numbers of samples N_l to be performed on the specific levels were calculated from a preliminary simulation run. In this run the same number of samples was performed on each level and then the values of the computation time T_l and the sample standard deviation s_l were estimated for each level. The values of N_l were then calculated according to (Cliffe, Giles, Scheichl, & Teckentrup 2011) as N_l = N sqrt(s_l^2 / T_l), where N is a constant common to all the levels.

The Table 9 presents the results of the MLMC method that can be compared with the MC method results (Table 5). The MLMC results were calculated with a different number of samples (i.e. a different computation time) than the MC results, therefore we propose the following indicator for the comparison of the efficiency. The efficiency of the MLMC estimator in comparison to the MC estimator will be calculated as

  (V_MC T_MC) / (V_MLMC T_MLMC),                          (19)

where T_MC is the total time of the MC simulation and T_MLMC the time of the MLMC simulation, see the Table 10.

The value 1 for lambda = 4 and sigma = 0.3 in the Table 10 is caused by the fact that in this case it was evaluated in the preliminary run that only one level should be used, i.e. it is the standard MC method. In the remaining cases all of the 4 levels were used.

The Table 11 shows the values of s_l^2 on each of the levels l in {1, ..., 4} calculated in the preliminary run (the level l = 1 corresponds to the coarsest grid, while the remaining values present the difference between the fine and the coarse grid on the given level). We used these values to calculate the numbers of samples to be executed on each of the MLMC levels. The Table 12 shows the ratios of the numbers of samples that were used on the different levels.

Table 9. MLMC method results for <u_x1>.

            sigma = 0.3           sigma = 0.1
lambda = 1  1.1302 +/- 0.0039     1.0189 +/- 0.0007
lambda = 2  1.6744 +/- 0.0189     1.1003 +/- 0.0021
lambda = 4  9.6647 +/- 0.5259     1.745 +/- 0.0152

Table 10. MLMC/MC efficiency calculated via (19).

            sigma = 0.3  sigma = 0.1
lambda = 1  1.9382       7.7148
lambda = 2  1.1841       5.7974
lambda = 4  1            3.0011

Table 11. Variance on each MLMC level.

                          l = 4      l = 3      l = 2      l = 1
lambda = 1  sigma = 0.3   4.4*10^-2  4.1*10^-2  3.6*10^-2  0.35
            sigma = 0.1   1.4*10^-3  1.2*10^-3  1*10^-3    5.4*10^-2
lambda = 2  sigma = 0.3   1.4        0.84       1          4.5
            sigma = 0.1   1.1*10^-2  1*10^-2    1.1*10^-2  0.29
lambda = 4  sigma = 0.3   1*10^5     5.1*10^3   9.3*10^3   2.3*10^4
            sigma = 0.1   0.63       0.76       0.91       4.5

Table 12. Ratios of N_l/N_4 values for the six combinations of parameters.

                          N_4  N_3   N_2   N_1
lambda = 1  sigma = 0.3   1    2.23  4.70  33.86
            sigma = 0.1   1    2.12  4.53  75.15
lambda = 2  sigma = 0.3   1    1.73  4.35  20.97
            sigma = 0.1   1    2.19  5.04  60.55
lambda = 4  sigma = 0.3   1    -     -     -
            sigma = 0.1   1    2.51  6.21  32.24

Procedure 2: Coarse grid approximation as an arithmetic mean of the correlated random field

In this case we use a similar approach as in the procedure 1, but the key difference is that the smoothing is applied to the correlated values,

  Y_1^c = (X_1^c + X_2^c + X_{d+1}^c + X_{d+2}^c) / 4     (20)

(arithmetic mean). A disadvantage is that this coarse grid approximation is not the same random field, so on the lower MLMC levels we always need to construct a new covariance matrix and its Choleski factorization. The new covariance matrix is created by averaging of the elements of the fine grid covariance matrix according to the fine grid to coarse grid elements mapping; this construction comes from the linearity of the covariance. This disadvantage is compensated by the very high correlation between the fine grid and the coarse grid approximation.

In the Figure 9 we show an example of the coarse grid approximations for both procedures.

The Table 13 presents the results obtained for <u_x1> (including the 95% confidence interval). The calculated efficiency compared to the MC estimator via the formula (19) can be seen in the Table 14. The Table 15 shows the values of s_l^2 on each of the levels l in {1, ..., 4} calculated in the preliminary run. The Table 16 shows the ratios of the numbers of samples that were used on the different levels. In all the six cases at least three levels were used.

Table 13. MLMC method results for <u_x1>.

            sigma = 0.3           sigma = 0.1
lambda = 1  1.1298 +/- 0.0011     1.0189 +/- 0.0004
lambda = 2  1.6945 +/- 0.0044     1.1001 +/- 0.0012
lambda = 4  10.3715 +/- 0.2517    1.7434 +/- 0.0087

Table 14. MLMC efficiency calculated via (19).

            sigma = 0.3  sigma = 0.1
lambda = 1  101.2564     90.7969
lambda = 2  87.3758      72.5988
lambda = 4  23.1090      38.6652

Table 15. Variance on each MLMC level.

                          l = 4      l = 3      l = 2      l = 1
lambda = 1  sigma = 0.3   4.4*10^-8  1.3*10^-6  6.2*10^-6  0.36
            sigma = 0.1   1.1*10^-7  1.9*10^-6  9.6*10^-6  5.4*10^-2
lambda = 2  sigma = 0.3   7.5*10^-6  2.0*10^-4  1.0*10^-3  5.5
            sigma = 0.1   5.1*10^-6  9.9*10^-5  4.1*10^-4  3.1*10^-1
lambda = 4  sigma = 0.3   1.4*10^-1  1.7        6.2*10^1   1.1*10^4
            sigma = 0.1   2.1*10^-3  2.0*10^-2  1.2*10^-1  7.2

Table 16. Ratios of N_l/N_4 values for the six combinations of parameters.

                          N_4  N_3    N_2    N_1
lambda = 1  sigma = 0.3   1    12.20  60.45  31410.66
            sigma = 0.1   1    9.66   49.56  9127.92
lambda = 2  sigma = 0.3   1    11.56  47.65  10186.95
            sigma = 0.1   1    9.66   47.36  2942.09
lambda = 4  sigma = 0.3   1    11.32  41.83  2550.90
            sigma = 0.1   1    8.04   38.23  906.28

Figure 9. Comparison of the coarse grid approximations.
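The estimator (16) and its variance (17) can be written down generically. The following Python fragment is an illustrative sketch only; the level hierarchy is a synthetic toy, not the Darcy flow solver. Correlated pairs (Y_l, Y_{l-1}) share a common random component, the level-0 sample average is combined with the correction terms, and the MLMC variance is accumulated as in (17).

```python
import numpy as np

def mlmc_estimate(sample_pair, N, L):
    """MLMC estimator (16): mean of Y_0 plus corrections Y_l - Y_{l-1}.

    sample_pair(l, rng) returns a correlated pair (Y_l, Y_{l-1});
    for l = 0 only Y_0 is used.  N[l] is the sample count on level l.
    """
    rng = np.random.default_rng(42)
    est, var = 0.0, 0.0
    for l in range(L + 1):
        if l == 0:
            d = np.array([sample_pair(0, rng)[0] for _ in range(N[0])])
        else:
            d = np.array([np.subtract(*sample_pair(l, rng)) for _ in range(N[l])])
        est += d.mean()
        var += d.var(ddof=1) / N[l]        # contribution s_l^2 / N_l, eq. (17)
    return est, var

# Toy hierarchy: all levels share the component z, level noise shrinks as 2^-l,
# so E(Y_l) = 5 for every l and the corrections have small variance.
def sample_pair(l, rng):
    z = rng.standard_normal()
    return (5.0 + z + 2.0**(-l) * rng.standard_normal(),
            5.0 + z + 2.0**(-(l - 1)) * rng.standard_normal())

est, var = mlmc_estimate(sample_pair, N=[4000, 1000, 250], L=2)
```

The telescoping identity (15) guarantees that the combined estimate targets E(Y_L), while most of the work is spent on the cheap level-0 samples.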




4 MIXED FEM DISCRETIZATION AND SOLUTION

The groundwater flow (1) can be implemented by the mixed finite element method, e.g. in the way described in (Cliffe, Graham, Scheichl, & Stals 2000, Blaheta, Hasal, Domesová, & Béreš 2014). The first advantage of the mixed formulation is the more accurate approximation of both the pressures and the velocities. The random permeability field sampling then requires repeated assembling and solving of the mixed FEM system, which has the following saddle point structure

  M u + B^T p = G,
  B u = F.                                                (21)

Note that only the velocity mass matrix M depends on the realization omega in S,

  M_ij = M_ij(omega) = Int_Omega k(omega)^{-1} phi_j . phi_i dx,   (22)

where phi_j, phi_i are the basis functions in the lowest order Raviart-Thomas space. The repeated assembling of the matrix

  A = [ M  B^T ; B  0 ]                                   (23)

is therefore restricted to the pivot block. A fast assembling of both M and B is implemented in the RT1 code, see (Blaheta, Hasal, Domesová, & Béreš 2014).

As a solution of the system, the discretized pressure p and velocity u are obtained. The graphs in the Figures 10, 11 and 12 show the visualization of the solution for an example given by the Gaussian random field from Figure 3.

When repeatedly solving the system (21) by a direct method, the benefit of B not being dependent on the sampling is not exploited. The use of an iterative solution method, such as MINRES or GMRES, with a block preconditioner provides the chance to save some effort, as only the block corresponding to M is changing. It is the case of the following preconditioners

  P_1 = [ M_hat + B^T W^{-1} B   B^T ; 0   -W ],          (24)

with M_hat being a suitable approximation to M and W being a block independent of the sampling, e.g. W = (1/r) I, where r is a (large) regularization parameter. Special cases are M_hat being the mass matrix for the mean value of the permeability k, M_hat = (trace(M)/trace(I)) I and W = B B^T, when B^T W^{-1} B becomes a projection.

Other possibilities are preconditioners for the transformed system with the matrix

  A_tilde = [ M  B^T ; -B  0 ],                           (25)

as the HSS preconditioner

  P_2 = [ M + alpha I  0 ; 0  alpha I ] [ alpha I  B^T ; -B  alpha I ]   (26)

or the relaxed HSS preconditioner

  P_3 = [ M  0 ; 0  alpha I ] [ I  B^T ; -B  0 ]          (27)

with a suitable parameter alpha.

Figure 10. Discretized pressure p.
Figure 11. Discretized velocity u (first coordinate).
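The remark that only M changes between samples can be illustrated without any FEM machinery. The sketch below is a hypothetical NumPy example on a small random saddle-point system; it is not the RT1 code and does not implement the preconditioners (24)-(27). B is assembled once, and for each permeability sample only the SPD block M is rebuilt; the system (21) is then solved by eliminating u through the Schur complement B M^{-1} B^T.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 12, 5
B = rng.standard_normal((m, n))         # assembled once, sample-independent
G, F = rng.standard_normal(n), rng.standard_normal(m)

def solve_saddle(M):
    """Solve [M B^T; B 0] [u; p] = [G; F] by eliminating u (Schur complement)."""
    Minv_BT = np.linalg.solve(M, B.T)
    Minv_G = np.linalg.solve(M, G)
    S = B @ Minv_BT                     # Schur complement B M^{-1} B^T
    p = np.linalg.solve(S, B @ Minv_G - F)
    u = Minv_G - Minv_BT @ p
    return u, p

# Two permeability "samples": only the SPD mass-type block M is reassembled.
for _ in range(2):
    A0 = rng.standard_normal((n, n))
    M = A0 @ A0.T + n * np.eye(n)       # SPD stand-in for the mass matrix (22)
    u, p = solve_saddle(M)
```

In the real mixed FEM setting the same structure lets an iterative solver reuse the factorizations or preconditioner blocks that involve only B.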




Figure 12. Discretized velocity u (second coordinate).

5 CONCLUSIONS

The article presents the first results of the authors in the field of the stochastic partial differential equations (PDEs) and the stochastic FEM methods. The simple and multilevel Monte Carlo methods are used as tools for stochastic simulations.

We study the mixed FEM calculation of the Darcy flow problem with stochastic material coefficients. We focused on the characterizations of the velocity, especially on the integral average of the velocity over the domain and the velocity in the middle of the domain.

The MC approach was used for the estimation of the expected value, variance and distribution of the studied random variables.

The MLMC method was used for the more efficient estimation of the expected value of the random variable <u_x1>. We presented two approaches to the coarse grid approximation: the first one is straightforward and preserves the Gaussian random distribution on the coarse grid, but was inefficient due to the low correlation between the fine and coarse grid approximations. The second one suffers from the more difficult sample generation on the coarse grids. Nevertheless, the second approach was more efficient than the first one, according to the Tables 10 and 14. Depending on the problem parameters lambda and sigma, we achieved a variance reduction from about 23 to 101.

The work is in progress; we plan to use a different approach to the Gaussian random field generation based on the Karhunen-Loève (K-L) decomposition. This will allow us to solve the problem on larger grids and it also provides a different way of using the MLMC method (the MLMC levels will correspond to the levels of the K-L decomposition). The K-L decomposition also provides a different approach to the stochastic PDEs solving by e.g. the collocation method or the stochastic Galerkin method.

ACKNOWLEDGEMENT

This work was supported by The Ministry of Education, Youth and Sports from the National Programme of Sustainability (NPU II) project "IT4Innovations excellence in science" (LQ1602).

REFERENCES

Barth, A., C. Schwab, & N. Zollinger (2011). Multi-level Monte Carlo finite element method for elliptic PDEs with stochastic coefficients. Numerische Mathematik 119(1), 123–161.
Blaheta, R., M. Hasal, S. Domesová, & M. Béreš (2014). RT1-code: A mixed RT0-P0 Raviart-Thomas finite element implementation. http://www.ugn.cas.cz/publish/software/RT1-code/RT1-code.zip.
Cliffe, K., M. Giles, R. Scheichl, & A.L. Teckentrup (2011). Multilevel Monte Carlo methods and applications to elliptic PDEs with random coefficients. Computing and Visualization in Science 14(1), 3–15.
Cliffe, K., I.G. Graham, R. Scheichl, & L. Stals (2000). Parallel computation of flow in heterogeneous media modelled by mixed finite elements. Journal of Computational Physics 164(2), 258–282.
Freeze, R.A. (1975). A stochastic-conceptual analysis of one-dimensional groundwater flow in nonuniform homogeneous media. Water Resources Research 11(5), 725–741.
Lord, G.J., C.E. Powell, & T. Shardlow (2014). An Introduction to Computational Stochastic PDEs. Cambridge University Press.
Nelson, P.H. et al. (1994). Permeability-porosity relationships in sedimentary rocks. The Log Analyst 35(03), 38–62.
Powell, C.E. (2014). Generating realisations of stationary Gaussian random fields by circulant embedding. https://www.nag.co.uk/doc/techrep/pdf/tr1_14.pdf.





Parallel resolution of the Schur interface problem using


the Conjugate gradient method

M.-P. Tran
Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam

ABSTRACT: In numerical analysis, the Schur complement is at the heart of domain decomposition methods, and a lot of promising results have been derived to present its mathematical basis. In this paper, we propose a numerical method for solving the Schur interface problem using the conjugate gradient method. The parallel STD algorithm is also described to give results comparable with the proposed method. Numerical experiments have been performed to show the good parallel efficiency and convergence of our method. The efficient parallel computations require only a few minutes to complete, and the subproblems are much less coupled than in the direct solvers. In the rest of this paper, some numerical results are presented to show the convergence properties, and some open problems are also discussed.

1 INTRODUCTION

Domain decomposition methods are important techniques for numerical simulation. Domain decomposition can be used in the framework of several discretization methods for Partial Differential Equations (PDEs) to get a more efficient solution on parallel computers. The basic idea in domain decomposition methods is to split the domain of study into non-overlapping subdomains, on which the discretized problems are simple and convenient to solve. Many variants of the domain decomposition method have been proposed and investigated in (Milyukova 2001, B. Smith 1996) and the references therein. In numerical analysis, the Schur complement method is the basic non-overlapping domain decomposition method and one of the most popular linear solvers. The Schur complement is a directly parallel method that can be applied to solve any sparse linear equation system. For instance, the parallel Schur complement method was followed by (Mansfield 1990, S. Kocak 2010) and a lot of recent literature. In many practical applications, the preconditioned conjugate gradient method is used because of its simplicity; one can refer to (Meyer 1990). Therefore, it is a convenient framework for the solution of our sparse matrix systems.

In this paper, we present a simple Schur complement method using the conjugate gradient method for solving the one-dimensional Poisson's equation. We only consider this classical PDE in order to demonstrate the parallel efficiency of the proposed method on a simple linear equation system. The idea of solving other large equation systems using the proposed method turns out to be very successful in the same way.

Let Omega in R^d be an open bounded domain with the boundary Gamma = boundary(Omega). Suppose that we want to solve the following Poisson's equation:

  -Delta u = f  in Omega,                                 (1.1)

with the Dirichlet boundary condition:

  u = g  on Gamma.                                        (1.2)

The Schur complement method splits up the linear system into subproblems. To do so, let us divide Omega into p subdomains Omega_1, Omega_2, ..., Omega_p with the shared interfaces Gamma_1, Gamma_2, .... One divides the entire problem into smaller non-overlapping subdomain problems, then solves the subdomain problems to form the interface problem and solves it. This paper will discuss the Schur complement while proposing the parallel implementations for a general sparse linear system. One considers the parallel solution of the one dimensional case (1D), where the Schur complement system on the subdomain interfaces is solved by the conjugate gradient method.

The problem (1.1)-(1.2) is discretized to get the system:

  A U = F,                                                (1.3)

where the stiffness matrix A, the load vector F and the approximate solution U can be decomposed into p groups, corresponding to the subdomains Omega_1, Omega_2, ..., Omega_p. In this study, we have just treated the following 1D problem. For a general elliptic problem, the Schur complement is more complicated, so that




it is difficult to find an approximate solution by parallel computation.

Suppose that we need to solve numerically the one dimensional PDE with the inhomogeneous Dirichlet boundary condition as follows:

  -u_xx = F(x)  on (a_1, a_2),
  u(a_1) = alpha,                                         (1.4)
  u(a_2) = beta.

Let v(x) = u((1 - x) a_1 + x a_2) and make the substitution in (1.4); one obtains:

  -v_xx = G(x)  on Omega = (0, 1),
  v(0) = alpha,                                           (1.5)
  v(1) = beta,

where G(x) = (a_2 - a_1)^2 F((1 - x) a_1 + x a_2). Let u_tilde(x) = v(x) - ((1 - x) alpha + x beta); it gives the 1D problem with the homogeneous boundary condition as follows:

  -u_tilde_xx = G(x)  on Omega = (0, 1),
  u_tilde(0) = u_tilde(1) = 0.                            (1.6)

Therefore, without loss of generality, in this paper we only study the following problem with the homogeneous Dirichlet boundary condition, the same as (1.6):

  -u_xx = F(x)  on Omega = (0, 1),
  u(0) = u(1) = 0,                                        (1.7)

where u is unknown and F represents a continuous source term. This problem (1.7) can be rewritten in terms of the linear system:

  A u = B,                                                (1.8)

where A is the sparse matrix. By using the Schur method proposed in this paper, the Schur complement matrix S is introduced and the system (1.8) is rewritten in the abbreviated form:

  S U = G,                                                (1.9)

where the matrix S in the interface problem is related to the entire problem (1.8). This paper also presents a proposed method that allows Matlab users to take advantage of the Message Passing Interface (MPI) to design parallel programs. In particular, only the basic send and receive operations MPI_Send and MPI_Recv, which are both blocking calls, are used. These calls are used to implement programs across multiple processors for parallel computation.

The rest of this paper is organized as follows. In Section 2, we consider the Schur complement method for solving the problem (1.7) and propose to solve the problem using the parallel preconditioned conjugate gradient method, which is currently one of the most popular domain decomposition methods. The mathematical description is then established by a simple coding example with the Matlab MPI (Message Passing Interface) standard, where the programs are implemented on multiple processors. In the next section, one presents the parallel STD method in order to compare it with the previous Schur complement method, and some Matlab MPI calls are also provided. Section 4 indicates some numerical examples testing both parallel computational methods. Some conclusions are then discussed in the last section to give the validity of the method.

2 THE SCHUR INTERFACE PROBLEM

In this section, we present the Schur interface complement method that is applied to solve the problem (1.7). One refers to (Mansfield 1990) for the Schur complement method given in the following steps:

1. The domain Omega of the problem (1.7) is subdivided into non-overlapping subdomains using parallel graph partitioning,
2. Rewrite the stiffness matrix A in the linear system (1.8) in each subdomain and interface,
3. Solve the subdomain problems to calculate the Schur matrix S from each known submatrix,
4. Solve the Schur complement system S U = G,
5. Solve the subdomain systems to obtain the solution in the whole domain by the parallel algorithm.

In (1.8), one subdivides the problem into p parts (p >= 2). The vector solution U can be decomposed into p groups, that is U = (U_1, U_2, ..., U_p), where the U_i (i = 1, ..., p) correspond to the subdomains Omega_1, Omega_2, ..., Omega_p, respectively. It is important to notice that the decomposition of Omega into the subdomains Omega_i does not have any cross point. In this study, we also discuss two models of the domain decomposition and propose the parallel schemes for solving the problem (1.8) numerically, for the case p = 2 and for general p >= 2.

Suppose that the approximation to the weak formulation results of (1.1) and (1.2) is of the form (1.8); the stiffness matrix A is given as the following sparse matrix:




  A = (1/h^2) [  2 -1  0 ...  0 ;
                -1  2 -1 ...  0 ;
                 0 -1  2 ...  0 ;
                 ...             ;
                 0  0 ... -1  2 ].                        (2.1)

2.1 The case p = 2

The case of two subdomains, i.e. p = 2, is first considered.

Let k > 0; the domain Omega = (0, 1) is then divided into 2k + 2 grid cells as follows:

  0 = x_0 < x_1 < ... < x_k < x_{k+1} < ... < x_{2k+1} < x_{2k+2} = 1.   (2.2)

We partition the domain into three non-overlapping subdomains denoted by Omega_1, Omega_2, Omega_3, respectively. Apparently, one has:

  Omega_1 = {x_1, x_2, ..., x_k},
  Omega_2 = {x_{k+1}},
  Omega_3 = {x_{k+2}, x_{k+3}, ..., x_{2k+1}}.

Let us rewrite the stiffness matrix A in (2.1) in terms of the block matrix as follows:

  A = [ K_11  K_12  0 ;
        K_21  K_22  K_23 ;
        0     K_32  K_33 ],                               (2.3)

where the K_ij are defined for i, j = 1, 2, 3 as follows. K_11 and K_33 are matrices of order k, K_11, K_33 in R^{k x k}:

  K_11 = K_33 = (1/h^2) [  2 -1  0 ...  0 ;
                          -1  2 -1 ...  0 ;
                           0 -1  2 ...  0 ;
                           ...            ;
                           0  0  0 ...  2 ].              (2.4)

K_12 and K_32 in R^{k x 1}:

  K_12 = (1/h^2) (0, ..., 0, -1)^T;  K_32 = (1/h^2) (-1, 0, ..., 0)^T.   (2.5)

K_21 and K_23 in R^{1 x k}, and K_22 is a scalar:

  K_21 = (1/h^2) [0 0 ... -1];  K_23 = (1/h^2) [-1 0 ... 0];  K_22 = 2/h^2.   (2.6)

The problem (1.8) can be rewritten in terms of the block system:

  [ K_11  K_12  0 ;
    K_21  K_22  K_23 ;
    0     K_32  K_33 ] (U_1, U_2, U_3)^T = (F_1, F_2, F_3)^T,   (2.7)

where U_1, U_3 and U_2 are the vector solutions associated with each subdomain Omega_1, Omega_3 and with the interface:

  U_1 = (x_1, ..., x_k),  U_2 = x_{k+1},  U_3 = (x_{k+2}, ..., x_{2k+1}),   (2.8)

and the F_i are the components of the load vector in each region:

  F_1 = F(1 : k),  F_2 = F(k + 1),  F_3 = F(k + 2 : 2k + 1).   (2.9)

Then, the original linear system (1.8) is divided into three subproblems given as:

  K_11 U_1 + K_12 U_2 = F_1,                              (2.10)
  K_21 U_1 + K_22 U_2 + K_23 U_3 = F_2,                   (2.11)
  K_32 U_2 + K_33 U_3 = F_3.                              (2.12)

From the first and the third equations (2.10) and (2.12), one obtains that:

  U_1 = K_11^{-1} (F_1 - K_12 U_2),
  U_3 = K_33^{-1} (F_3 - K_32 U_2).                       (2.13)

Making the substitution into the second equation (2.11), we arrive at the Schur complement equation:

  S U_2 = G,                                              (2.14)

where the introduced Schur complement matrix is

  S = K_22 - K_21 K_11^{-1} K_12 - K_23 K_33^{-1} K_32    (2.15)

and

  G = F_2 - K_23 K_33^{-1} F_3 - K_21 K_11^{-1} F_1.      (2.16)




In the Schur complement method, the solution of the linear system can be approximated by first solving the Schur system and then solving the interior system. Here, the equation (2.14) above can be solved in a parallel scheme with two processors (p = 2). It is noticed that the computation of the inverse terms in (2.15) and (2.16) can be done in parallel. Let us describe the parallel schemes in the Figures 1 and 2. In these figures, equal works are implemented on the two processors to calculate the vectors G and Sx, respectively.

Then, the equation (2.14) is solved by the parallel preconditioned conjugate gradient method because of its simplicity and efficiency. The numerical scheme is presented in Figure 3, in which the implemented program has been developed under the Matlab computing environment.

In the Schur complement domain decomposition method, each subdomain is handled by a different processor. More precisely, the proposed parallel solver with two processors is introduced.

Figure 1. Update G by the parallel scheme with two processors.
Figure 2. Update Sx by the parallel scheme.
Figure 3. Algorithm to solve the equation Sx = G by the conjugate gradient method.

2.2 The case for general p

Similarly to the previous section for p = 2, when the domain decomposition method is used, the problem domain Omega is to be divided into p subdomains Omega_j (j = 1, 2, 3, ..., p) as in Figure 4, in which the unknown quantities can be calculated simultaneously in parallel. For instance,

  Omega_1 = {x_1, x_2, ..., x_k},
  Omega_2 = {x_{k+1}},
  Omega_3 = {x_{k+2}, x_{k+3}, ..., x_{2k+1}},
  ...,
  Omega_{2p-1} = {x_{(p-1)k+p}, ..., x_{pk+(p-1)}}.

Figure 4. Omega is divided into p non-overlapping subdomains.

The general form of a linear algebraic problem (for general p > 2) defined on the domain Omega can also be rewritten in terms of (1.8), where the matrix A on the whole domain is presented as:

  A = [ K_11  K_12  0     ...                0 ;
        K_21  K_22  K_23  ...                0 ;
        0     K_32  K_33  ...              ... ;
        ...                                ... ;
        ...   K_{(2p-2)(2p-2)}  K_{(2p-2)(2p-1)} ;
        0  ...  0  K_{(2p-1)(2p-2)}  K_{(2p-1)(2p-1)} ],   (2.17)

where the K_ij are defined as follows. The K_{(i)(i)} in R^{k x k}
are given by:



K(2i-1)(2i-1) = K2 = (1/h^2) [  2  -1   0  ...   0 ]
                             [ -1   2  -1  ...   0 ]
                             [  0  -1   2  ...   0 ]
                             [ .................... ]
                             [  0   0   0  ...   2 ]      (2.18)

K(2i-1)(2i) and K(2i+1)(2i) ∈ R^{k x 1}:

K(2i-1)(2i) = K3 = (1/h^2) [0, ..., 0, -1]^T;      (2.19)
K(2i+1)(2i) = K5 = (1/h^2) [-1, 0, ..., 0]^T.      (2.20)

K(2i)(2i-1) and K(2i)(2i+1) ∈ R^{1 x k}:

K(2i)(2i-1) = K4 = (1/h^2) [0, 0, ..., -1];
K(2i)(2i+1) = K6 = (1/h^2) [-1, 0, ..., 0];      (2.21)

and the interface entries are K(2i)(2i) = 2/h^2.

It can be noticed that, since the subdomains Ω_i are disconnected from each other (non-overlapping), the corresponding block matrices Kij are also disconnected from each other. This allows us to make an easy parallelization.

The linear sparse system (1.8) is split into several particular blocks:

K11 U1 + K12 U2 = F1,      (2.22)
K21 U1 + K22 U2 + K23 U3 = F2,      (2.23)
K32 U2 + K33 U3 + K34 U4 = F3,      (2.24)
...
K(2p-1)(2p-2) U_{2p-2} + K(2p-1)(2p-1) U_{2p-1} = F_{2p-1}.      (2.25)

The discrete solution is obtained in terms of:

U1 = K11^{-1} (F1 - K12 U2),
U_{2p-1} = K(2p-1)(2p-1)^{-1} (F_{2p-1} - K(2p-1)(2p-2) U_{2p-2}),      (2.26)
Uj = Kjj^{-1} (Fj - K_{j(j-1)} U_{j-1} - K_{j(j+1)} U_{j+1}),

for the remaining odd indices j = 3, 5, ..., 2p-3.

Substituting this into the remaining equations, we finally get the linear system:

S U = G,      (2.27)

where S is defined as:

S = [ A11   A12    0     ...    0
      A21   A22    A23   ...    0
      0     A32    A33   ...    ...
      ...   ...    ...   ...    A(p-2)(p-1)
      0     ...    0     A(p-1)(p-2)   A(p-1)(p-1) ]      (2.28)

where the matrices Aij are given by:

Aii = K(2i)(2i) - K(2i)(2i-1) K(2i-1)(2i-1)^{-1} K(2i-1)(2i)
               - K(2i)(2i+1) K(2i+1)(2i+1)^{-1} K(2i+1)(2i),      (2.29)
A_{i(i+1)} = - K(2i)(2i+1) K(2i+1)(2i+1)^{-1} K(2i+1)(2i+2),      (2.30)
A_{(i+1)i} = - K(2i+2)(2i+1) K(2i+1)(2i+1)^{-1} K(2i+1)(2i).      (2.31)

For i = 1, 2, ..., p-1 one has

Gi = F_{2i} - K(2i)(2i-1) K(2i-1)(2i-1)^{-1} F_{2i-1} - K(2i)(2i+1) K(2i+1)(2i+1)^{-1} F_{2i+1},      (2.32)

and U = (U2, U4, ..., U_{2(p-1)}).

The problem of solving (2.27) is called the Schur interface problem, and the assembly and solution of the sub-matrices in (2.26) can be performed in parallel by different processors. The parallel implementation of the Schur conjugate gradient method can be presented in three steps:

1. Step 1: Calculate the matrix G by the parallel scheme in Figure 5;
2. Step 2: Calculate Sx by the parallel scheme as in Figure 6;
3. Step 3: Solve the equation SU = G by the preconditioned conjugate gradient method. A similar procedure is applied to the independent subproblems. Figure 3 also shows the pseudocode to solve it numerically.

3 PARALLEL STD

3.1 Setting of the problem

In this section, based on the idea of dividing a large system of equations into many small ones in order to solve them efficiently, the parallel STD is also introduced. This method uses MPI parallelization to distribute the systems of equations and solve each subproblem. In the world of parallel computing, MPI is the standard for implementing programs on multiple processors. The MPI scheme consists of



several libraries with a set of routines to send and receive data messages (MPI_Send and MPI_Recv, respectively). MPI can be configured to execute one or several processes and run them in parallel. In addition, it is possible to implement MPI entirely within the Matlab environment, which is handled by the following algorithm.

Figure 5. Parallel scheme to calculate G.
Figure 6. Parallel scheme to calculate Sx.

Similar to the previous section, let us also divide the domain Ω into p non-overlapping subdomains as in Figure 4. One notices that in this MPI implementation, each partition is assigned to one processor. The stiffness matrix A is rewritten as:

A = [ A11   A12    0     ...    0
      A21   A22    A23   ...    0
      0     A32    A33   ...    0
      ...   ...    ...   ...    A(m-1)m
      0     ...    0     Am(m-1)   Amm ]      (3.1)

where m = 2p - 1. Each of the subdomains Ω_1, Ω_3, Ω_5, ..., Ω_{2p-1} consists of k points, and they do not have any cross points, as in the previous section. Then, one has:

Ω_1 = {x_1, x_2, ..., x_k},
Ω_2 = {x_{k+1}},
Ω_3 = {x_{k+2}, x_{k+3}, ..., x_{2k+1}},
...
Ω_{2p-1} = {x_{(p-1)k+p}, ..., x_{pk+p-1}}.

This yields the linear algebraic equations AU = B, which can be rewritten in block form as follows:

       [ A11 U1 + A12 U2                                  ]
AU =   [ A21 U1 + A22 U2 + A23 U3                         ]      (3.2)
       [ ...                                              ]
       [ A(2p-1)(2p-2) U_{2p-2} + A(2p-1)(2p-1) U_{2p-1} ]

where U = (U1, U2, ..., U_{2p-1}). Suppose that we have comm = p; that means the work is shared among p processors. Computation of the quantities of the system in each subdomain can be done by the parallel scheme in Figure 7.

Figure 7. Parallel scheme to calculate AU.
Figure 8. Parallel scheme to solve the problem.
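Figure 7's scheme assigns each block row of the product AU in (3.2) to its own MPI process. As a rough thread-based stand-in for those MPI ranks (with a tiny 3 x 3 scalar matrix standing in for the blocks, purely for illustration), the row evaluations can be run concurrently:

```python
# Thread-pool sketch of Figure 7: each "processor" evaluates one row of
# the product A*U from (3.2).  The scalar 3x3 system is illustrative only.
from concurrent.futures import ThreadPoolExecutor

def block_row(i, A, U):
    """Row i of the product: sum_j A[i][j] * U[j], skipping zero blocks."""
    return sum(A[i][j] * U[j] for j in range(len(U)) if A[i][j] != 0.0)

A = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 2.0]]
U = [1.0, 2.0, 3.0]

with ThreadPoolExecutor(max_workers=len(A)) as pool:
    AU = list(pool.map(lambda i: block_row(i, A, U), range(len(A))))
```

Replacing the thread pool with MPI_Send/MPI_Recv pairs, as in the paper, distributes the same row-wise work across separate processes.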



In the parallel STD, all communications were handled with MPI. Once we obtain the values B{i}, processor 0 sends them to the others. It is then sufficient to calculate the p terms of the linear system (1.8) on each subdomain by the parallel strategy.

3.2 Solving the problem by the STD algorithm

The parallel STD algorithm is then given in Figure 8 using the Matlab codes, where the parallel algorithm to calculate the inner product d = <r, r> is presented in Figure 9.

Figure 9. Parallel scheme to calculate d = <r, r>.

4 NUMERICAL EXAMPLES

In this section, some numerical examples are provided in order to demonstrate the good performance of the proposed method. It is also important that such results are inherently valid for parallel computing. We survey our recent research on parallel solvers for the one-dimensional problem (1.4). In the future, the problem in higher dimensions could be considered, and we will focus on a generic parallel implementation framework for thousands of processors.

4.1 Example 1

Let us consider the function

u(x) = e^{-2x} + 4 sin(πx/3) + cos(πx/4) + 2x,      (4.1)

the analytical solution to the equation (1.4). Then, the Schur interface conjugate gradient algorithm and the parallel STD scheme are applied to get an approximate solution to this problem. The numerical results have been tested for the specified error tolerance tol = 10^{-6} and the maximum number of iterations maxit = 10^2. The approximate solutions are obtained by the Schur complement conjugate gradient method, where the spatial domain of the problem is decomposed into p = 2, 3 subdomains. Otherwise, the parallel STD scheme is also applied together with p processors (p > 2). Some numerical results represented in Figure 10 demonstrate the more effective use of the proposed Schur conjugate gradient method rather than the parallel STD algorithm. One can see that the Schur conjugate gradient method gives a well-convergent solution (very close to the exact solution) in both cases p = 2 and p = 3 within the first two iterations, meanwhile the STD does not give convergence after maxit iterations.
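The conjugate gradient loop behind these Schur interface results (Step 3, Figure 3) can be sketched compactly. The version below is unpreconditioned Python rather than the paper's preconditioned Matlab code, with `matvec` standing in for the parallel Sx evaluation of Figure 6, and a tiny symmetric positive definite test matrix invented for illustration:

```python
# Unpreconditioned conjugate gradient sketch for S*x = G; `matvec`
# abstracts the (possibly parallel) evaluation of S times a vector.

def conjugate_gradient(matvec, G, tol=1e-6, maxit=100):
    n = len(G)
    x = [0.0] * n
    r = list(G)                          # residual r = G - S*0
    p = list(r)
    rs_old = sum(ri * ri for ri in r)
    for _ in range(maxit):
        Sp = matvec(p)
        alpha = rs_old / sum(pi * spi for pi, spi in zip(p, Sp))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * spi for ri, spi in zip(r, Sp)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new ** 0.5 < tol:
            break
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

# Tiny SPD system standing in for the Schur matrix (an assumption made
# purely for illustration):
S = [[4.0, 1.0], [1.0, 3.0]]
G = [1.0, 2.0]
x = conjugate_gradient(lambda v: [sum(Sij * vj for Sij, vj in zip(row, v))
                                  for row in S], G)
```

For this 2 x 2 system, CG reaches the exact solution (1/11, 7/11) within two iterations, which mirrors the fast interface convergence reported above.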

Figure 10. Numerical solution to Example 1, for different cases of p and k. (a): By STD algorithm, (b) and (c): By
Schur conjugate gradient method. Red: the exact solution. Blue: The approximate solution.



Figure 11. Numerical solution to Example 2, for the case p = 4 and k = 3. (a) By the conjugate gradient method, (b) By the STD algorithm. Red: the exact solution. Blue: the approximate solution.

Figure 12. Numerical solution for the case p = 3 and k = 30. (a) By the conjugate gradient method, (b) By the STD algorithm. Red: the exact solution. Blue: the approximate solution.

4.2 Example 2

Let us consider the function

u(x) = sin(πx/2) + x^2,      (4.2)

the exact solution to (1.4). The numerical results have been implemented also for the tolerance tol = 10^{-6} and maxit = 200 iterations. The approximate solutions by the conjugate gradient method and the parallel STD implementation are represented in Figure 11. One remarks that in this case four processors have been used, that is, p = 4.

As in Example 1, the parallel Schur conjugate gradient method gives a convergent solution in only the first two iterations. Nevertheless, in Figures 10 and 11, we give numerical evidence that with the STD algorithm the convergence is still not achieved after maxit = 200 iterations. Let us consider an additional example, where the exact solution to (1.4) is u(x) = 3x^3 - 5x^2 + 2x; the numerical simulation of both parallel schemes is also presented in Figure 12 to give a comparison.



5 CONCLUSION

In this study, we have presented the non-overlapping Schur complement method, in which the linear system is solved using the parallel conjugate gradient method. The basic idea is to split a large system of equations into smaller systems that can be solved independently on parallel processors. The parallel STD algorithm is also described to give comparable results with the proposed method. Some numerical experiments have been performed to show the good parallel efficiency and convergence of our method. At this point, it should be noted that in our computational program, MPI libraries can be called under the Matlab environment. From this result, one can recognize that this is a promising method that could be well adapted to solving large sparse matrix systems under a parallel implementation. Furthermore, it can be seen that the computational time is minimal and the required memory is optimal when subdomains are used. The method also works well in more general settings of problems in many applications. Nevertheless, a lot of work is still open from this study. Some other important topics in the field of domain decomposition method development and applications to higher-dimensional problems will be analyzed in future research.

REFERENCES

Kocak, S., H.A. (2010). Parallel Schur complement method for large-scale systems on distributed memory computers. Applied Mathematical Modelling 25(10), 873-886.
Mansfield, L. (1990). On the conjugate gradient solution of the Schur complement system obtained from domain decomposition. SIAM Journal on Numerical Analysis 27(6), 1612-1620.
Meyer, A. (1990). A parallel preconditioned conjugate gradient method using domain decomposition and inexact solvers on each subdomain. Computing 45(3), 217-234.
Milyukova, O.Y. (2001). Parallel approximate factorization method for solving discrete elliptic equations. Parallel Computing 27(10), 1365-1379.
Smith, B., P. Bjorstad, & W. Gropp (1996). Domain Decomposition: Parallel Multilevel Methods for Elliptic Partial Differential Equations. Cambridge: Cambridge University Press.



Statistics and Applied Statistics



Applied Mathematics in Engineering and Reliability - Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

Determining results of the Phadiatop test using logistic regression and contingency tables based on a patient's anamnesis

P. Kurová
VŠB - Technical University of Ostrava, Ostrava, Czech Republic

ABSTRACT: This article models the Phadiatop test. This study was created with support from the Clinic of Occupational and Preventive Medicine in order to avoid unnecessary and costly testing. The estimation used statistical methods, specifically logistic regression to predict a patient's membership in a particular group, and contingency tables to verify further dependences between the patient's other characteristics. Patients were categorized only on the basis of their personal and family anamnesis, age and sex. Patients were put into the correct group (healthy or sick) with 64% probability. Testing based on age groups of the patients was also done using this database. The presence of a positive Phadiatop test was most common for people born between 1972 and 1981, where the genetic predisposition for a positive Phadiatop test result is about 55%.

1 INTRODUCTION

The knowledge of the results of the Phadiatop test is very important, especially for the diagnosis of allergic dermatitis and also for the professional medical care of travellers (Hajduková et al. 2005, 2009, Williams et al. 2001). The Phadiatop test is used as a measure of atopy. The atopy rate of the inhabitants of the Czech Republic is increasing. Atopy can be understood as a personal or family predisposition to become, mostly in childhood or adolescence, hyper-sensitive to normal exposure to allergens, usually proteins. These individuals are more sensitive to typical symptoms of asthma, eczema, etc.

According to disease severity, results of the Phadiatop test are divided into the six following groups: Groups 0 and I indicate no or a weak form of atopy, and the remaining groups (II, III, IV, V, and VI) indicate increasingly severe forms of atopic symptoms. Unfortunately, the Phadiatop test is expensive, so we try to predict the results of the test on the basis of a detailed family and personal anamnesis (Wüthrich et al. 1995, 1996, Sigurs 2010).

Information obtained from the personal and family anamneses of each patient was used for detecting the presence of asthma, allergic rhinitis, eczema or other forms of allergy (contact allergy, food, etc.). The family and personal anamnesis of each patient was evaluated by a medical expert. Furthermore, other characteristics were available for each patient: age and sex. Then, we created and verified a mathematical model for the accurate classification of patients into one of the two groups of the Phadiatop test.

In this paper we discuss the logistic regression approach for obtaining the results of the Phadiatop test based only on family and personal anamnesis and other characteristics. Besides, it also examines the mutual relationship between a positive Phadiatop test result and the sex and age of the patient. In this paper we deal with the prediction of each patient into one of the two groups of the Phadiatop test based on logistic regression (Kurová & Hajduková 2013, Briš et al. 2015). Next, we describe the connection of genetic predispositions for atopy with the age group of the inhabitants. The database of patients comes from 2010-2012.

2 THE METHODS USED

2.1 Logistic regression as a tool for discrimination

The logistic regression was not originally created for the purpose of discrimination, but it can be successfully applied to this kind of analysis (Hosmer & Lemeshow 2004, Menard 2009, Miner 2009). A logistic regression model, which is modified for the purpose of discrimination, is defined as follows. Let Y1, ..., Yn be a sequence of independent random variables with alternative distributions, whose parameters satisfy:

P(Yi = 1 | Xi = xi) = e^{β0 + β'xi} / (e^{β0 + β'xi} + 1),
P(Yi = 0 | Xi = xi) = 1 / (e^{β0 + β'xi} + 1),      (1)



for i = 1, ..., n, where β = (β1, ..., βp) is an unknown p-dimensional parameter and X1, ..., Xn are (p+1)-dimensional random vectors. This model can be called a learning phase, in which both values Xi and Yi are known for each object (i.e. it is known to which group each object belongs). Based on this knowledge, we try to estimate the parameters β1, ..., βp and thus the function π(x), where

π(x) = P(Y = 1 | X = x) = e^{β0 + β'x} / (e^{β0 + β'x} + 1).      (2)

Another object, for which the classification is unknown, is assigned to one of the two groups according to the value of the decision function π(x). The object will be included in the first group if π(x) > 0.5. Otherwise, the object will be included in the second group. The main advantage of this model is that it does not require conditions on the distributions of the random vectors X1, ..., Xn. However, the model assumes a very specific form of the probability P(Y = 1 | X = x), and we should verify the significance of the relationship

π(x) = P(Y = 1 | X = x) = e^{β0 + β'x} / (e^{β0 + β'x} + 1).      (3)

2.2 Variable dependence analysis in a contingency table for 3 variables

Independence of two dichotomous variables contingent on another categorical variable can be tested by using the Cochran or Mantel-Haenszel statistics. In the case of the Cochran statistic, we used the formula

Q_C = [ Σ_{l=1}^{L} (n_{l11} - m_{l11}) ]^2 / Σ_{l=1}^{L} v_{l11},      (4)

where L represents the number of categories of the conditioning variable; the total number of subjects included in the l-th table (l = 1, 2, ..., L) is denoted by n_l, the joint frequencies by n_{lij}, the row marginal frequencies by n_{li+} and the column marginal frequencies by n_{l+j}. If the null hypothesis about the independence of the dichotomous variables is true, the expected frequency (average frequency) in the cell in the l-th table, i-th row and j-th column is given by the relation

m_{lij} = n_{li+} n_{l+j} / n_l,      (5)

and the variance of this frequency by the relation

v_{l11} = n_{l1+} n_{l2+} n_{l+1} n_{l+2} / n_l^3.      (6)

To determine the rate of association, it is possible to use the Mantel-Haenszel common odds ratio estimate for L fourfold tables, which is given by the relation

OR_MH = ( Σ_{l=1}^{L} n_{l11} n_{l22} / n_l ) / ( Σ_{l=1}^{L} n_{l12} n_{l21} / n_l ).      (7)

This rate takes the value 1 in case of independence; the independence testing is based on the natural logarithm of the calculated value (Agresti 2003, Simonoff 2003).

3 OBTAINED RESULTS

Our database comes from the University Hospital of Ostrava, the Department of Work and Preventive Medicine. The database includes a total of 1,132 records of patients who underwent the Phadiatop test examination. For the purposes of our comparison, we consider only the patients in the control group who filled in the records completely (personal anamnesis, family anamnesis, gender and year of birth). The control group is a group of patients who were on preventive examinations without previously known diseases, or travellers. The number of complete records is 274.

The database contained these pieces of information about individual patients: the Phadiatop test result - patients in Group 0 have Phadiatop test 0 or I (no visible symptoms), so no treatment was necessary. The remaining patients, with Phadiatop test II-VI, are members of Group 1. Medical treatment is necessary for these patients.

Logistic regression does not have any requirements on the data arrangement, but we need a specific format of the data for the logistic regression. For this particular case we have one dependent variable Y, Phadiatop (Ph), which depends on two independent variables of Personal Anamnesis (PA) and Family Anamnesis (FA). Variable Y can be 0 or 1, according to the membership of a patient in Group 0 or Group 1, respectively. The illnesses which, according to doctors, influence the Phadiatop test result the most were established as independent variables. These are asthma, allergic rhinitis, eczema and others. The category Others represents the score of various kinds of allergies (food allergies, etc.). Each patient's family and personal anamnesis was examined for all these illnesses.


For testing purposes, we define a dependent variable Phadiatop test result and a total of 10 independent variables (4 variables of personal anamnesis, 4 of family anamnesis, year of birth and gender). An example of the database is shown in Table 1.

Table 1. Evaluation and verification of the independent variables for logistic regression.

Number of patient      5      16     35     40     41
Sex                    m      m      m      m      w
Year of birth          1973   1970   1974   1986   1991
PA_asthma              0      1      1      0      0
PA_allergic rhinitis   1      1      0      0      0
PA_eczema              1      0      0      0      0
PA_others              0      0      0      1      0
FA_asthma              0      1      0      0      0
FA_allergic rhinitis   1      0      0      0      0
FA_eczema              1      0      0      0      0
FA_others              1      0      0      1      0

3.1 Results obtained by logistic regression

For testing using logistic regression, we considered all 10 independent variables and one dependent variable Phadiatop test, coded as 0 for healthy patients (test result 0 and I) and 1 for patients with a disease (test result II to VI). Results of the statistical significance of the individual independent variables are given in Table 2.

Table 2. Evaluation and verification of the independent variables for logistic regression.

Independent variables   Estimate   Wald's   Significance
PA_asthma               0.428      2.480    0.115
PA_allergic rhinitis    1.111      15.780   0.000
PA_eczema               0.993      8.774    0.003
PA_others               0.422      2.338    0.126
FA_asthma               0.351      1.373    0.241
FA_allergic rhinitis    0.208      0.398    0.528
FA_eczema               0.411      1.429    0.232
FA_others               0.070      0.043    0.836
Sex                     0.006      0.000    0.984
Year of birth           0.002      0.018    0.895
Constant                1.651      0.005    0.943

On the basis of Wald's test and the test of statistical significance, we see that the only statistically significant variables in this case are PA_allergic rhinitis, PA_eczema and PA_asthma. The other variables are statistically insignificant and could be excluded from the model.

The predictive qualities of the logistic model are shown in Table 3. Here, it is obvious that the model predicts better into group 0, that is, the group of healthy patients.

Table 3. Classification table by logistic regression.

                  Predicted value
Observed value    Phadiatop 0   Phadiatop 1   Percentage correct
Phadiatop 0       136           31            81.4
Phadiatop 1       68            39            36.4
                                              63.9

Figure 1. Depiction of age groups contingent on the Phadiatop test result.
Figure 2. Depiction of patients' sex contingent on the Phadiatop test result.
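The percentages in a classification table such as Table 3 follow directly from its four cell counts; a quick arithmetic check:

```python
# Table 3 percentages recomputed from the four confusion-matrix cells.
correct = 136 + 39               # observed 0 predicted 0, observed 1 predicted 1
total = 136 + 31 + 68 + 39       # all 274 complete records
accuracy = 100.0 * correct / total       # overall percentage correct
group0_rate = 100.0 * 136 / (136 + 31)   # percentage correct within Phadiatop 0
group1_rate = 100.0 * 39 / (68 + 39)     # percentage correct within Phadiatop 1
```

The overall rate is (136 + 39)/274, i.e. about 63.9%, with 81.4% within the healthy group and 36.4% within the diseased group.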



The total value of the prediction model is 63.9%.

3.2 In-depth evaluation of the Phadiatop test dependence on the sex and age variables

Based on the previous testing, where several variables proved to be statistically insignificant, it seems necessary to analyse the sex and age variables separately. The variables regarding a patient's anamneses are Personal Anamnesis (PA) and Family Anamnesis (FA). The other variables, Age and Sex, are variables which are not related to a patient's anamnesis; therefore, their further analysis in relation to the Phadiatop test result is important.

The age of the patients ranges from 17 to 69 years (year of birth 1943-1995). For the purpose of a clearer analysis, we will divide the age variable into 4 groups. The representation of patients in each of the groups by sex and age is stated in Figures 1 and 2. We are examining dependence for groups of patients who differ in age. Table 4 contains the overall summary of the patients in the groups. Based on the stated data, we can see that the number of patients is larger in some groups than in others.

Table 4. Phadiatop test results by sex and year of birth of the patients.

                         Phadiatop
Year of birth   Sex      Negative   Positive   Total
<= 1951         Woman    3          4          7
                Man      5          5          10
                Total    8          9          17
1952-1961       Woman    9          6          15
                Man      24         11         35
                Total    33         17         50
1962-1971       Woman    18         11         29
                Man      26         15         41
                Total    44         26         70
1972-1981       Woman    18         7          25
                Man      36         26         62
                Total    54         33         87
1982+           Woman    9          8          17
                Man      19         14         33
                Total    28         22         50

Based on the data stated in Table 4, we can see that the most examined patients were born in the years 1972 to 1981, 87 in total. Next is the group of patients born between the years 1962 and 1971, with a total of 70 patients who underwent the examination. The most patients with a positive Phadiatop test result are in the age group 1972-1981, 33 patients in total, of which 26 are men.

3.3 Results obtained by testing contingency tables

Based on the results stated in Table 4, we can compare three groups of results based on the Pearson
Table 5. Results of the Chi-Square Test by patient year of birth groups.

Year of birth Value df Asymp. Sig. (2-sided)

<= 1951 Pearson Chi-Square 0.084 1 0.772


Continuity Correction 0.000 1 1.000
Likelihood Ratio 0.084 1 0.771
N of Valid Cases 17
19521961 Pearson Chi-Square 0.344 1 0.558
Continuity Correction 0.068 1 0.794
Likelihood Ratio 0.339 1 0.560
N of Valid Cases 50
19621971 Pearson Chi-Square 0.013 1 0.909
Continuity Correction 0.000 1 1.000
Likelihood Ratio 0.013 1 0.909
N of Valid Cases 70
19721981 Pearson Chi-Square 1.470 1 0.225
Continuity Correction 0.937 1 0.333
Likelihood Ratio 1.510 1 0.219
N of Valid Cases 87
1982 + Pearson Chi-Square 0.098 1 0.754
Continuity Correction 0.000 1 0.990
Likelihood Ratio 0.098 1 0.755
N of Valid Cases 50
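Both the per-stratum Table 5 values and the pooled Table 6 conclusion can be cross-checked from the Table 4 counts. The sketch below recomputes the Pearson statistic for the 1972-1981 stratum with the 2x2 shortcut formula chi2 = n(ad - bc)^2 / ((a+b)(c+d)(a+c)(b+d)), and pools all five strata into the Mantel-Haenszel common odds ratio (7); the dictionary is just an illustrative transcription of Table 4.

```python
# Each stratum is ((neg_women, pos_women), (neg_men, pos_men)), from Table 4.
strata = {
    "<=1951":    ((3, 4),   (5, 5)),
    "1952-1961": ((9, 6),   (24, 11)),
    "1962-1971": ((18, 11), (26, 15)),
    "1972-1981": ((18, 7),  (36, 26)),
    "1982+":     ((9, 8),   (19, 14)),
}

def pearson_chi2(t):
    """Pearson chi-square of a 2x2 table via the shortcut formula."""
    (a, b), (c, d) = t
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

chi2_7281 = pearson_chi2(strata["1972-1981"])   # ~1.470, as reported in Table 5

num = den = 0.0                                  # Mantel-Haenszel ratio, eq. (7)
for (a, b), (c, d) in strata.values():
    n = a + b + c + d
    num += a * d / n
    den += b * c / n
or_mh = num / den                                # close to 1 => near independence
```

The pooled ratio comes out close to 1, which is consistent with the Table 6 conclusion that sex and the Phadiatop result are independent given age.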



Table 6. Conditional dependence tests.

                   Chi-Squared   df   Asymp. Sig. (2-sided)
Cochran's          0.020         1    0.886
Mantel-Haenszel    0.000         1    0.992

Chi-Square statistics, Table 5. If we are examining dependence for groups of patients by their year of birth, we can see from the stated values that in none of the groups do we reject the null hypothesis on the independence of the Phadiatop test result and sex at a 5% level of significance.

To determine a more accurate result of the Phadiatop test testing contingent on a patient's sex, depending on whether we consider a patient's age as well, we perform a conditional dependence test by means of the Cochran and Mantel-Haenszel statistics, Table 6. Based on the performed tests, we do not reject the null hypothesis on the independence of the Phadiatop result and sex at a 5% level of significance.

According to the above calculated statistics, it was verified that a patient's sex and age have no influence on the Phadiatop test result. The illnesses influencing the Phadiatop test result are Asthma, Allergic rhinitis and Eczema in a patient's personal anamnesis.

4 THE PHADIATOP TEST RESULTS EVALUATION BASED ON THE AGE AND GENETIC PREDISPOSITIONS

Based on the above stated information about a patient's age and the categorization of patients into age groups, we will perform the search for genetic predispositions to a positive Phadiatop test. The positive (II to VI) and the negative (0 and I) Phadiatop test results are stated.

Based on the mentioned age groups and the division into positive and negative Phadiatop tests, it is clear that the proportional representation of the diseased patients is more or less equal. The only group that differs and has the most positive patients is the age group born 1972-1981, thus, the young patients. According to the available information, the genetic predisposition for atopy (positive Phadiatop test II to VI) should be about 30% for the inhabitants of the Czech Republic. For this analysis, a patient with genetic predispositions was one who filled in the statistically significant positive diseases of the Phadiatop test results into his family anamnesis. Thus, a patient with genetic predispositions included into his/her family anamnesis:

- positive asthma or allergic rhinitis - other records then did not need to be taken into account;
- positive eczema and other diseases at once, if the asthma and rhinitis were negative.

Table 7. Age proportional representation of the inhabitants based on the Phadiatop test.

                Phadiatop           Predisposition
Year of birth   0 + I   II to VI   Genetic   For the positive   Ratio of the positive
To 1982         28      22         26        11                 0.21
1981 to 1972    54      33         45        18                 0.31
1962 to 1971    44      26         27        11                 0.24
1952 to 1961    33      17         23        9                  0.16
Before 1952     8       9          9         4                  0.08
Sum             167     107        130       53                 1

Based on Table 7, it is obvious that the most patients with a genetic predisposition are from the age group born between 1972 and 1981: 45 patients out of the total of 130 patients who were found to have a genetic predisposition to a positive result of the Phadiatop test, Table 7 (the column Predisposition-Genetic). The proportional representation of the patients with a genetic predisposition is 130/274 = 0.47.

This proves the division of the data file, where there are 107 positive patients out of 274, about 39%. Based on the family predispositions, 47% of all patients should be in the positive Phadiatop test group. Nevertheless, there are 53/107 = 0.49 genetically predisposed patients with the positive Phadiatop test (107 records).

5 CONCLUSION

Knowledge of the Phadiatop test is very significant, both for patient examination in offices of occupational and preventive medicine and for correct patient care (e.g. travellers). Unfortunately, performing the Phadiatop test is expensive; therefore, there is an effort to model its result as accurately as possible using characteristics that are easy to discover, such as a patient's personal and family anamnesis, and also e.g. age, sex, etc.

The tested database came from the years 2010-2012 from the University Hospital of Ostrava; there were 274 patient entries available for the so-called control group of patients, i.e. patients who have no specific illness, where the test is performed preventively (e.g. travellers etc.). The testing used logistic regression; on the basis of the performed



tests, the characteristics influencing the Phadiatop REFERENCES
test result were identified to be Asthma, Allergic
rhinitis and eczema in a patient's personal anamnesis. Family anamnesis proved to be statistically insignificant. Characteristics which are not related to any illnesses, in our case a patient's age and sex, were tested separately by means of contingency tables. Even here, it was confirmed that these characteristics do not influence the Phadiatop test result in any way. A model constructed on the basis of logistic regression categorizes a patient into the correct group (healthy or sick) with 63.9% reliability. This means that every third or fourth patient is classified incorrectly. The model works better for patients who belong to Group 0, healthy patients, where the model predicts with 81% reliability.

Another interesting result concerns the test for genetic predispositions to a positive Phadiatop test. The group which was the most predisposed to a positive Phadiatop test was the one with the year of birth from 1972 to 1981; here 31% of patients have a positive test result. In contrast, patients born before 1961 have a lower genetic predisposition. Our conclusion supports the presumption that about 30% of the population has the genetic predisposition for a positive Phadiatop test; based on our calculation, the proportional representation is about 47%.

ACKNOWLEDGMENT

This paper was done thanks to cooperation with The University Hospital of Ostrava, Department of Clinic of Occupational and Preventive Medicine.

This work was supported by The Ministry of Education, Youth and Sports from the National Programme of Sustainability (NPU II) project "IT4Innovations excellence in science" LQ1602 and by the Operational Programme Education for Competitiveness, Project No. CZ.1.07/2.3.00/20.0296.
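The reliability figures quoted above are simple functions of the classifier's confusion matrix (overall accuracy and per-class accuracy). A minimal sketch of that bookkeeping; the counts below are hypothetical round numbers chosen to mimic the reported 63.9%/81% figures, not the study's data:

```python
# Overall and per-class "reliability" (accuracy) of a two-class
# classifier, computed from its confusion matrix.

def accuracies(cm):
    """cm[i][j] = number of patients of true class i predicted as class j."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    per_class = [cm[i][i] / sum(cm[i]) for i in range(len(cm))]
    return correct / total, per_class

# Hypothetical counts: rows are true classes (0 = healthy, 1 = sick).
cm = [[81, 19],   # healthy patients: 81% classified correctly
      [53, 47]]   # sick patients: classified much worse
overall, per_class = accuracies(cm)
print(overall, per_class)  # → 0.64 [0.81, 0.47]
```

The asymmetry between the two classes mirrors the observation above that the model predicts the healthy Group 0 far more reliably than the sick group.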

Applied Mathematics in Engineering and Reliability – Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

On the distortion risk measure using copulas

S. Ly
Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam

U.H. Pham
Faculty of Economic Mathematics, University of Economics and Law, Ho Chi Minh City, Vietnam

R. Briš
VŠB – Technical University of Ostrava, Ostrava, Czech Republic

ABSTRACT: Distortion risk measure is a very effective tool for quantifying losses in finance and insurance while copulas play an important role in modeling dependence structure of random vectors. In this paper, we propose a new method to estimate distortion risk measures and use copulas to find the distribution of a linear combination of two dependent continuous random variables. As a result, partial risks as well as aggregate risk are definitely estimated via distortion risk measure using the copula approach.

1 INTRODUCTION

Suppose that we have a portfolio Y consisting of two assets X₁ and X₂ as follows:

  Y = w₁X₁ + w₂X₂,  (1)

where wᵢ denotes the weight of asset i, i = 1, 2. Let F_{X₁}, F_{X₂} and F_Y be the distribution functions of X₁, X₂ and Y, respectively, where X₁ and X₂ are not independent. Here, our goal is to calculate the risk of the portfolio Y under the distortion risk measure, see Wang (2000), given by

  R_g[Y] = ∫₀^{+∞} g(F̄_Y(y)) dy + ∫_{−∞}^{0} [g(F̄_Y(y)) − 1] dy,  (2)

where g is a distortion function and F̄_Y(y) = 1 − F_Y(y) is the survival function of Y. The risk measure R_g is formed using the Choquet integral, see Wang (2000). In some cases Y denotes a non-negative loss; then the distortion risk measure has only the first part in (2). As we can see, the important thing is that we have to derive the distribution of Y.

Recall that if Y = X₁ + X₂ and X₁, X₂ are independent, then it is well known that the solution can be obtained through the convolution product of the two density functions f_{X₁} and f_{X₂}, given by

  f_Y(y) = (f_{X₁} ∗ f_{X₂})(y) = ∫_{−∞}^{+∞} f_{X₁}(x) f_{X₂}(y − x) dx.  (3)

In Cherubini et al. (2011), the authors consider the case where X₁ and X₂ are not independent. In their approach, a copula is used to define a C-convolution given by

  F_{X₁+X₂}(y) = (F_{X₁} ∗_C F_{X₂})(y) = ∫₀¹ (∂/∂u) C(u, F_{X₂}(y − F_{X₁}⁻¹(u))) du,  (4)

where C is a copula capturing the dependence structure of X₁ and X₂.

In this article, we consider a more general case in the sense that we use a copula to find the distribution of the portfolio Y as a combination of two dependent continuous variables. After that, we conduct an estimation of the risk of Y using the distortion risk measure.

The paper is organized as follows. The introduction is presented in section 1. The preliminaries about distortion risk measures and copulas are briefly recalled in section 2 and section 3. After that, in section 4, we propose a new formula for estimating the risk. In section 5, a copula-based method for finding the distribution of a linear combination of random variables is established. Next, we show the applications in section 6 and the conclusions are stated in the last section.

2 DISTORTION RISK MEASURE

Suppose Y is a non-negative loss random variable with distribution function F_Y. Then it is well known that the expectation of Y can be written in the form

  E(Y) = ∫₀^{∞} (1 − F_Y(y)) dy.

However, the expectation is not used as a risk measure. Instead of using this quantity, one prefers to transform it with a function g, leading to the distortion risk measure defined as

  R_g[Y] = ∫₀^{∞} g(F̄_Y(y)) dy,  (5)

where g: [0; 1] → [0; 1] is such that g(0) = 0, g(1) = 1 and g is a non-decreasing function. Such a g is called a distortion function.

A number of risk measures found in the finance and insurance literature are special cases of the distortion risk measure, see Sereda et al. (2010):

i. VaR: g(u) = 1_{(α;1]}(u), for some α ∈ (0; 1).
ii. ES (TVaR): g(u) = min{u/α, 1}, α ∈ (0; 1).
iii. Proportional hazard transform: g(u) = u^{1/γ}, for some γ > 1.
iv. Wang's transform: g(u) = Φ(Φ⁻¹(u) + λ), where Φ is the standard normal distribution function and λ is often chosen as λ = Φ⁻¹(α), 0 < α < 1.

For more classes of distortion functions, one can see Wang (1996).

3 COPULAS AND MEASURES OF DEPENDENCE

Let I = [0; 1] be the closed unit interval and I² = [0; 1] × [0; 1] be the closed unit square.

Definition 1. (Copula) A 2-copula (two-dimensional copula) is a function C: I² → I satisfying the conditions:

i. C(u, 0) = C(0, v) = 0, for any u, v ∈ I.
ii. C(u, 1) = u and C(1, v) = v, for any u, v ∈ I.
iii. For any u₁, u₂, v₁, v₂ ∈ I such that u₁ ≤ u₂ and v₁ ≤ v₂,
  C(u₂, v₂) − C(u₂, v₁) − C(u₁, v₂) + C(u₁, v₁) ≥ 0.

The most important role in copula theory is played by Sklar's theorem (1959). In fact, let X₁ and X₂ be random variables with continuous marginal distribution functions F_{X₁} and F_{X₂}, respectively, and a joint distribution function H; then by Sklar's theorem, see Nelsen (2006), there exists a unique copula C such that

  H(x₁, x₂) = C(F_{X₁}(x₁), F_{X₂}(x₂)).  (6)

This copula C captures the dependence structure of X₁ and X₂. In particular, X₁ and X₂ are independent if and only if C(u, v) = Π(u, v) = uv; X₁ and X₂ are comonotonic (i.e. X₂ = f(X₁) a.s., where f is strictly increasing) if and only if C(u, v) = M(u, v) = min(u, v); and X₁ and X₂ are countermonotonic (i.e. X₂ = f(X₁) a.s., where f is strictly decreasing) if and only if C(u, v) = W(u, v) = max(u + v − 1, 0).

Note (see Nelsen (2006)): for any copula C and any (u, v) ∈ I², we have the bound property for copulas:

  W(u, v) ≤ C(u, v) ≤ M(u, v).  (7)

Since a copula can model the dependence structure of random variables, one can construct measures of dependence using copulas with suitable metrics. In fact, some well-known measures can be written in terms of copulas, see Nelsen (2006).

Kendall's τ(X₁, X₂), or τ(C), is

  τ(C) = 4 ∫∫_{I²} C(u, v) dC(u, v) − 1.  (8)

Spearman's ρ(X₁, X₂), or ρ(C), is

  ρ(C) = 12 ∫∫_{I²} C(u, v) du dv − 3.  (9)

The upper and lower tail dependence are

  λ_U(C) = lim_{t→1⁻} (1 − 2t + C(t, t))/(1 − t),  (10)

  λ_L(C) = lim_{t→0⁺} C(t, t)/t.  (11)

In Tran et al. (2015), we also proposed a new non-parametric measure of dependence for two continuous random variables X₁ and X₂ with copula C, defined by

  ω(C) = (3‖C‖²_S − 2)^{1/2},  (12)

where ‖C‖_S denotes a modified Sobolev norm for the copula C, given by

  ‖C‖_S = ( ∫∫_{I²} [ (∂C(u, v)/∂u)² + (∂C(u, v)/∂v)² ] du dv )^{1/2}.  (13)

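The copula form of Spearman's ρ in (9) is easy to evaluate numerically for any copula surface; a minimal illustrative sketch using a midpoint rule on I², checked on the two boundary cases named above (Π for independence, M for comonotonicity):

```python
# Midpoint-rule evaluation of the copula form of Spearman's rho, eq. (9):
#   rho(C) = 12 * double integral of C(u, v) over I^2, minus 3.
# Check: the independence copula gives rho = 0, the upper bound M gives 1.

def spearman_rho(C, n=400):
    h = 1.0 / n
    s = sum(C((i + 0.5) * h, (j + 0.5) * h)
            for i in range(n) for j in range(n))
    return 12.0 * s * h * h - 3.0

Pi = lambda u, v: u * v       # independence copula
M = lambda u, v: min(u, v)    # comonotonicity (upper Frechet-Hoeffding bound)

print(round(spearman_rho(Pi), 3))  # ≈ 0 (independence)
print(round(spearman_rho(M), 3))   # ≈ 1 (perfect positive dependence)
```

The same grid evaluation applies unchanged to any parametric copula C(u, v), which is convenient for cross-checking fitted models against their theoretical ρ.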


The measure ω(C) could be used as a measure of monotone dependence because it attains its extreme value of 1 if and only if X₁ and X₂ are comonotonic (or countermonotonic).

4 ESTIMATION OF DISTORTION RISK MEASURE

In this section, we are going to establish an expression to approximate the distortion risk measure given by (5). Notice that we only consider Y as a non-negative loss variable. This is because, for Y < 0, we can always add a constant number m (large enough) such that Ỹ = Y + m ≥ 0. Then the distortion risk measure R_g[Y] = R_g[Ỹ] − m.

Then, to deal with an integral over an infinite interval, we first change variables to obtain a finite interval. In particular, one can take y = t/(1 − t), and the risk R_g[Y] becomes

  R_g[Y] = ∫₀¹ g(1 − F_Y(t/(1 − t))) · 1/(1 − t)² dt.

Let k(t) = g(1 − F_Y(t/(1 − t))) · 1/(1 − t)² and apply the composite trapezoidal rule; we have the approximation

  R_g[Y] = ∫₀¹ k(t) dt ≈ (1/n) [ k(0)/2 + k(1)/2 + Σ_{i=1}^{n−1} k(i/n) ].

It is straightforward to check that

  k(0) = g(1) = 1,
  k(1) = lim_{t→1⁻} g(1 − F_Y(t/(1 − t))) · 1/(1 − t)² = 0,
  k(i/n) = g(1 − F_Y(i/(n − i))) · n²/(n − i)².

Therefore, we obtain a formula to approximate the risk R_g[Y] as follows:

  R_g[Y] ≈ 1/(2n) + Σ_{i=1}^{n−1} g(1 − F_Y(i/(n − i))) · n/(n − i)².  (14)

5 DISTRIBUTIONS OF A SUM OF TWO DEPENDENT RANDOM VARIABLES USING COPULAS

We now turn to the main theorem deriving the distribution of a sum of random variables using copulas.

Theorem 1. Suppose that (X₁, X₂) is a continuous random vector having the marginal distributions F₁ and F₂, respectively, and that X₁ and X₂ are not independent. Let C be an absolutely continuous copula modeling the dependence structure of the random vector (X₁, X₂), and define Y as

  Y = w₁X₁ + w₂X₂,  (15)

where w₁, w₂ ∈ ℝ \ {0}. Then the density and distribution function of Y are given as follows:

  f_Y(y) = 1/|w₂| ∫₀¹ c(u, F₂((y − w₁F₁⁻¹(u))/w₂)) · f₂((y − w₁F₁⁻¹(u))/w₂) du,  (16)

  F_Y(y) = sgn(w₂) ∫₀¹ (∂/∂u) C(u, F₂((y − w₁F₁⁻¹(u))/w₂)) du,  (17)

where c denotes the density of the copula C and sgn(x) is the sign function of x,

  sgn(x) = 1 if x > 0, −1 if x < 0.

Proof. Firstly, we set up

  Y₁ = w₁X₁ + w₂X₂,
  Y₂ = X₁.  (18)

Let F and f be the joint distribution and joint density of (X₁, X₂). Then, due to Sklar's theorem (1959), there exists a unique copula C such that

  F(x₁, x₂) = C(F₁(x₁), F₂(x₂)).  (19)

In terms of the joint density function this reads

  f(x₁, x₂) = c(F₁(x₁), F₂(x₂)) f₁(x₁) f₂(x₂),  (20)

where c denotes the density of the copula C, given by

  c(u₁, u₂) = ∂²C(u₁, u₂)/(∂u₁∂u₂).  (21)

From (18), the inverse transform is



  X₁ = Y₂,
  X₂ = (Y₁ − w₁Y₂)/w₂.

The Jacobian of the transform is

  J = det [ ∂X₁/∂Y₁  ∂X₁/∂Y₂ ; ∂X₂/∂Y₁  ∂X₂/∂Y₂ ] = det [ 0  1 ; 1/w₂  −w₁/w₂ ] = −1/w₂ ≠ 0.

Then, substituting into (20), we get the joint density of (Y₁, Y₂), denoted by h, as follows:

  h(y₁, y₂) = f(y₂, (y₁ − w₁y₂)/w₂) |J|
            = 1/|w₂| · c(F₁(y₂), F₂((y₁ − w₁y₂)/w₂)) f₁(y₂) f₂((y₁ − w₁y₂)/w₂).  (22)

Therefore, one can derive the density of Y₁ as follows:

  f_{Y₁}(y₁) = ∫_{−∞}^{+∞} h(y₁, y₂) dy₂
             = 1/|w₂| ∫_{−∞}^{+∞} c(F₁(y₂), F₂((y₁ − w₁y₂)/w₂)) f₁(y₂) f₂((y₁ − w₁y₂)/w₂) dy₂  (23)
             = 1/|w₂| ∫₀¹ c(u, F₂((y₁ − w₁F₁⁻¹(u))/w₂)) f₂((y₁ − w₁F₁⁻¹(u))/w₂) du,  (24)

using the substitution u = F₁(y₂).

Next, the computation of Y₁'s distribution is straightforward:

  F_{Y₁}(t) = ∫_{−∞}^{t} f_{Y₁}(y₁) dy₁
            = 1/|w₂| ∫₀¹ ∫_{−∞}^{t} c(u, F₂((y₁ − w₁F₁⁻¹(u))/w₂)) f₂((y₁ − w₁F₁⁻¹(u))/w₂) dy₁ du.  (25)

By taking v = F₂((y₁ − w₁F₁⁻¹(u))/w₂), the formula (25) becomes

  F_{Y₁}(t) = (w₂/|w₂|) ∫₀¹ ∫₀^{F₂((t − w₁F₁⁻¹(u))/w₂)} c(u, v) dv du
            = sgn(w₂) ∫₀¹ (∂/∂u) C(u, F₂((t − w₁F₁⁻¹(u))/w₂)) du.

The proof is completed.

Remark: Due to the fact that the two variables are exchangeable in the argument, it is possible to obtain formulas analogous to (16) and (17), given by

  f_Y(y) = 1/|w₁| ∫₀¹ c(F₁((y − w₂F₂⁻¹(v))/w₁), v) · f₁((y − w₂F₂⁻¹(v))/w₁) dv,  (26)

  F_Y(y) = sgn(w₁) ∫₀¹ (∂/∂v) C(F₁((y − w₂F₂⁻¹(v))/w₁), v) dv.  (27)

Let us consider the special case w₁ = w₂ = 1. Then, from (23), we obtain the expression

  f_{Y₁}(y₁) = ∫_{−∞}^{+∞} c(F₁(y₂), F₂(y₁ − y₂)) f₁(y₂) f₂(y₁ − y₂) dy₂.  (28)

This formula can be seen as a general convolution product (called C-convolution) of two dependent density functions. In fact, when X₁ and X₂ are independent, their copula is C(u, v) = uv; thus its copula density is c(u, v) = 1. Again, we get the usual convolution product

  f_{Y₁}(y₁) = ∫_{−∞}^{+∞} f₁(y₂) f₂(y₁ − y₂) dy₂.  (29)

6 APPLICATIONS

Let us consider a portfolio consisting of two assets as follows:

  Y = w₁X₁ + w₂X₂,  (30)

where X₁ presents the return of Exxon Mobil's stock, X₂ presents the return of JP Morgan's stock, and Y denotes the return of the portfolio.



The data are selected from the New York Stock Exchange during the years 2013 and 2014. Assume that the portfolio investment would start in 2014. To build an optimal portfolio, one can use the Markowitz rule for two assets by computing the weights w₁ and w₂ as follows:

  w₁ = [E(X₁)σ₂² − E(X₂)σ₁₂] / [E(X₁)σ₂² + E(X₂)σ₁² − (E(X₁) + E(X₂))σ₁₂],
  w₂ = 1 − w₁,  (31)

where σ₁², σ₂² are the variances of X₁ and X₂, respectively, and their covariance is σ₁₂. Here, the weights are calculated without a risk-free asset.

Using the data of 2013, we obtain sample means and variances given by x̄₁ = 0.0006, x̄₂ = 0.0012, σ̂₁² = σ̂₂² = 0.0001 and σ̂₁₂ = 4.754297 × 10⁻⁵. Thus, the optimal weights are determined by

  w₁ ≈ 3% and w₂ ≈ 97%.

First of all, the descriptive summary of the portfolio is shown in Figure 1 and Table 1.

Table 1. Descriptive statistics for the portfolio Y.

Statistics          Exxon      JP         Portfolio
Observations        260.00     260.00     260.00
Minimum             −0.0417    −0.0424    −0.0406
Quartile 1          −0.0056    −0.0047    −0.0049
Median              0.0002     0.0005     0.0006
Arithmetic Mean     −0.0003    0.0003     0.0003
Geometric Mean      −0.0003    0.0003     0.0002
Quartile 3          0.0053     0.0069     0.0069
Maximum             0.0302     0.0352     0.0331
SE Mean             0.0006     0.0007     0.0007
LCL Mean (0.95)     −0.0015    −0.0010    −0.0010
UCL Mean (0.95)     0.0010     0.0017     0.0016
Variance            0.0001     0.0001     0.0001
Stdev               0.0102     0.0111     0.0108
Skewness            −0.4977    −0.2753    −0.2819
Kurtosis            2.3042     1.3998     1.3119

Figure 1. The Portfolio Performance.

The important thing is that one has to make a sketch of the association between the returns X₁ and X₂, and this can be done using a scatter plot, as shown in Figure 2. Clearly, they are not independent (see the Pearson's correlation coefficient r(X₁, X₂)). In addition, one can find that although the weight of Exxon Mobil's return is very small, w₁ = 0.03, it has a quite large effect on the portfolio values, r(X₁, Y) = 0.48. Also, X₂ and Y have a perfect linear dependence, r(X₂, Y) = 1.

Figure 2. The relationship among X₁, X₂ and Y.

Next, we will construct a joint distribution as well as a copula C modeling the dependence structure of X₁ and X₂. Firstly, the marginal distributions are estimated using the maximum likelihood estimation method (MLE). As a result, F₁, F₂ and F_Y are approximated by normal cumulative distribution functions (CDF), see Figures 3, 4, and 5.

To estimate a copula Ĉ of X₁ and X₂, one can use the copula package from R, see Yan et al. (2007). As we can see in Table 2, the Student copula (with parameter ρ = 0.46 and degrees of freedom ν = 9) could be the best fit for the dependence structure of X₁ and X₂, due to the fact that its maximized log-likelihood value is the highest among the common copulas such as the normal, Student, Gumbel, Frank and Clayton families, as shown in Table 2. To verify this fact, we apply a goodness-of-fit test using the Cramér–von Mises statistic and then



present the result in Table 3. It is clear that the P-value is 0.6658, which is higher than the significance level α, say α = 1%. Hence, there is enough evidence to conclude that the Student copula can be used to model the dependence structure of the two returns, and that the degree of dependence is moderate, see Table 4. Note: the tail indices λ_U = λ_L.

Table 2. Results for estimating copula Ĉ.

Copula    Param.     Std.    z value   P-value         Loglik.
Normal    0.46       0.045   10.26     2 × 10⁻¹⁶       28.9
Student   0.46; 9    0.053   8.68      2 × 10⁻¹⁶       30.17
Gumbel    1.38       0.064   21.41     2 × 10⁻¹⁶       25.70
Frank     2.89       0.426   6.79      1.15 × 10⁻¹¹    26.44
Clayton   0.70       0.094   7.39      1.45 × 10⁻¹³    26.66

Table 3. Goodness-of-fit test for the Student copula with ν = 9.

Copula   Statistic   Parameter   P-value
Ĉ        0.015919    0.45947     0.6658

Table 4. Measures of dependence for the Student copula Ĉ.

Copula   Kendall's τ   Spearman's ρ   Tail index λ   ω(C)
Ĉ        0.3043        0.4432         0.0834         0.4377

Figure 3. F̂₁ approximates a normal CDF, N(−0.03%, 1.02²%).
Figure 4. F̂₂ approximates a normal CDF, N(0.03%, 1.11²%).
Figure 5. F̂_Y approximates a normal CDF, N(0.02%, 1.08²%).

Applying Sklar's theorem, one can definitely construct a joint distribution for X₁ and X₂ as follows:

  Ĥ(x₁, x₂) = Ĉ(F̂₁(x₁), F̂₂(x₂)),  (32)

where F̂₁ and F̂₂ approximate normal distributions as shown in Figures 3 and 4, and Ĉ is a Student copula with the parameters ρ = 0.46 and ν = 9, given by

  Ĉ(u, v) = t_{ρ,ν}(t_ν⁻¹(u), t_ν⁻¹(v)),

where t_{ρ,ν} is the cumulative distribution function of a bivariate Student distribution, ρ is the correlation coefficient and ν is the degrees of freedom.

In section 5, we have shown a new method to establish the distribution of the return portfolio Y, which is a linear combination of the two dependent asset returns X₁ and X₂. In the above arguments, the dependence structure has been determined by



Student copula. Therefore, the density and distribution of Y come out naturally from (16) and (17). The numerical results are plotted in Figures 6 and 7. Furthermore, the graphs seem to follow an approximately normal distribution, which is consistent with the results of the maximum likelihood estimation method as in Figure 5.

Finally, we can apply the formula (14) with several distortion functions to estimate risks for the portfolio Y, as shown in Table 5. Note: n = 1000, λ₁ = Φ⁻¹(0.05) and λ₂ = Φ⁻¹(0.01).

Table 5. Risk measures for X₁, X₂ and Y.

Risk      Exxon Mobil   JP Morgan   Portfolio
VaR 5%    1.70%         1.79%       1.74%
VaR 1%    2.39%         2.53%       2.47%
ES 5%     2.12%         2.24%       2.19%
ES 1%     2.74%         2.91%       2.84%
Wang 1    1.66%         1.88%       1.77%
Wang 2    2.25%         2.46%       2.48%
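The estimates in Table 5 come from the finite sum (14); the machinery is a few lines once F_Y and g are available. A minimal sketch checked on an illustrative Exp(1) loss (not the fitted portfolio distribution): g(u) = u must recover E[Y] = 1, and the ES distortion g(u) = min{u/α, 1} must recover ES = 1 − ln α:

```python
# Trapezoidal estimator (14) of the distortion risk measure:
#   R_g[Y] ~ 1/(2n) + sum_{i=1..n-1} g(1 - F_Y(i/(n-i))) * n/(n-i)^2
import math

def distortion_risk(g, F, n=1000):
    total = 0.5 / n
    for i in range(1, n):
        total += g(1.0 - F(i / (n - i))) * n / (n - i) ** 2
    return total

F_exp = lambda y: 1.0 - math.exp(-y)      # Exp(1) distribution function
alpha = 0.05
es = lambda u: min(u / alpha, 1.0)        # Expected Shortfall distortion

print(round(distortion_risk(lambda u: u, F_exp), 3))  # → 1.0
print(round(distortion_risk(es, F_exp), 2))           # 1 - ln(0.05) ≈ 4.0
```

For the portfolio itself, F would be the C-convolution distribution (17) evaluated numerically, and g any of the distortions listed in section 2.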

7 CONCLUSIONS

We have proposed a new method to estimate distortion risk measures and a copula-based procedure to approach the distribution of a portfolio consisting of dependent assets. The latter is our main focus, since all the information on dependence is used properly. For further research, we are going to study the optimization problem of a general portfolio using the copula approach as well as its distributional behavior.

ACKNOWLEDGEMENT

This work was supported by The Ministry of Education, Youth and Sports from the National Programme of Sustainability (NPU II) project "IT4Innovations excellence in science" LQ1602.

Figure 6. Density of Y using C-convolution.
Figure 7. Cumulative distribution of Y using C-convolution.

REFERENCES

Aas, K. (2004). Modelling the dependence structure of financial assets: A survey of four copulas.
Balbás, A., J. Garrido, & S. Mayoral (2009). Properties of distortion risk measures. Methodology and Computing in Applied Probability 11(3), 385–399.
Cherubini, U., E. Luciano, & W. Vecchiato (2004). Copula methods in finance. John Wiley & Sons.
Cherubini, U., S. Mulinacci, & S. Romagnoli (2011). A copula-based model of speculative price dynamics in discrete time. Journal of Multivariate Analysis 102(6), 1047–1063.
Embrechts, P., F. Lindskog, & A. McNeil (2001). Modelling dependence with copulas. Rapport technique, Département de mathématiques, Institut Fédéral de Technologie de Zurich, Zurich.
Frees, E.W. & E.A. Valdez (1998). Understanding relationships using copulas. North American Actuarial Journal 2(1), 1–25.
Joe, H. (1997). Multivariate models and multivariate dependence concepts. CRC Press.
Kim, J.H. (2010). Bias correction for estimated distortion risk measure using the bootstrap. Insurance: Mathematics and Economics 47(2), 198–205.
Nelsen, R.B. (2006). An introduction to copulas. Springer Science & Business Media.


On the performance of sequential procedures for detecting a change, and Information Quality (InfoQ)

Ron S. Kenett
The KPA Group, Raanana, Israel, Department of Mathematics G. Peano, University of Turin, Italy
Institute for Drug Development, The Hebrew University of Jerusalem, Israel
Faculty of Economics, Ljubljana, Slovenia, Center for Risk Engineering, NYU Tandon School of Engineering,
New York, USA

ABSTRACT: The literature on statistical process control has focused on the Average Run Length
(ARL) to an alarm, as a performance criterion of sequential schemes. When the process is in control,
ARL0 denotes the ARL to false alarm and represents the in-control operating characteristic of the proce-
dure. The average run length from the occurrence of a change to its detection, typically denoted by ARL1,
represents the out-of-control operating characteristic. These indices however do not tell the whole story.
The concept of Information Quality (InfoQ) is defined as the potential of a dataset to achieve a specific
(scientific or practical) goal using a given empirical analysis method. InfoQ is derived from the Utility (U)
of applying an analysis (f) to a data set (X) for a given purpose (g). Formally, the concept of Information
Quality (InfoQ) is defined as: InfoQ(f, X, g) = U(f(X | g)). These four components are deconstructed into
eight dimensions that help assess the information quality of empirical research in general. In this paper,
we suggest that the use of Probability of False Alarm (PFA) and Conditional Expected Delay (CED) as an
alternative to ARL0 and ARL1 enhances the Information Quality (InfoQ) of statistical process control
methods. We then review statistical process control methods from a perspective of the eight InfoQ dimen-
sions. As an extension, we discuss the concept of a system for statistical process control.

1 INTRODUCTION

Change point detection and process control sequential methods are designed to detect change. This paper discusses how performance indicators such as Conditional Expected Delay (CED) and Probability of False Alarm (PFA) enhance the information quality of statistical process control methods. As an extension, a System for Statistical Process Control (SSPC), in the context of a life cycle view of statistics, is presented. A main point provided by this approach is that ARL0 and ARL1 are not sufficiently informative and therefore not adequate for determining or comparing performance of alternative process control sequential methods.

Kenett and Shmueli (2014, 2016) formulate the concept of information quality (InfoQ). InfoQ is derived from the Utility (U) of applying an analysis (f) to a data set (X) for a given purpose (g). Eight dimensions help assess the level of InfoQ of a study. These are: Data Resolution, Data Structure, Data Integration, Temporal Relevance, Generalizability, Chronology of Data and Goal, Operationalization, and Communication. These eight dimensions represent a deconstruction of the four InfoQ components: U, f, X and g. In this paper we review and discuss change point detection performance indicators using an InfoQ perspective. Section 2 is an introduction to InfoQ, Sections 3, 4 and 5 are about detection of change, false alarms and detection delay. Section 6 is a detailed analysis of change detection using InfoQ, section 7 presents a system for statistical process control and section 8 presents a summary and discussion.

2 INFORMATION QUALITY

Information Quality (InfoQ) is the potential of a dataset to achieve a specific (scientific or practical) goal using a given empirical analysis method (Kenett and Shmueli, 2014, 2016). InfoQ is different from data quality and data analysis quality, but is dependent on these components and on the relationship between them. InfoQ is derived from the utility of applying an analysis (f) to a data set (X) for a given purpose (g). Formally the concept of Information Quality (InfoQ) is defined as:

  InfoQ(f, X, g) = U(f(X | g))

InfoQ is therefore affected by the quality of its components g (quality of goal definition), X (data quality), f (analysis quality), and U



(utility measure) as well as by the relationships between X, f, g and U. Expanding on the four InfoQ components provides some additional insights.

Analysis Goal (g): Data analysis is used for various purposes. Three general classes of goals are causal explanations, predictions, and descriptions. Causal explanations include questions such as "Which factors cause the outcome?". Descriptive goals include quantifying and testing for population effects using data summaries, graphical visualizations, statistical models, and statistical tests. Prediction goals include forecasting future values of a time series and predicting the output value of new observations given a set of input variables.

Data (X): The term "data" includes any type of data to which empirical analysis can be applied. Data can arise from different collection tools such as surveys, laboratory tests, field and computer experiments, simulations, web searches, observational studies and more. Data can be univariate or multivariate and of any size. It can contain semantic, unstructured information in the form of text or images with or without a dynamic time dimension. Data is the foundation of any application of empirical analysis.

Data Analysis Method (f): The term "data analysis" refers to statistical analysis and data mining. This includes statistical models and methods (parametric, semi-parametric, non-parametric), data mining algorithms, and machine learning tools. Operations research methods, such as simplex optimization, where problems are modelled and parametrized, also fall into this category.

Utility (U): The extent to which the analysis goal is achieved, as measured by some performance measure or utility. For example, in studies with a predictive goal, a popular performance measure is predictive accuracy. In descriptive studies, common utility measures are goodness-of-fit measures. In explanatory models, statistical power and goodness-of-fit measures are common utility measures.

Eight dimensions are used to deconstruct InfoQ and thereby provide an approach for assessing it. These are: Data Resolution, Data Structure, Data Integration, Temporal Relevance, Chronology of Data and Goal, Generalizability, Operationalization and Communication. We proceed with a description of these dimensions.

i. Data Resolution: Data resolution refers to the measurement scale and aggregation level of X. The measurement scale of the data needs to be carefully evaluated in terms of its suitability to the goal, the analysis methods to be used, and the required resolution of U. Given the original recorded scale, the researcher should evaluate its adequacy. It is usually easy to produce a more aggregated scale (e.g., two income categories instead of ten), but not a finer scale. Data might be recorded by multiple instruments or by multiple sources. To choose among the multiple measurements, supplemental information about the reliability and precision of the measuring devices or data sources is useful. A finer measurement scale is often associated with more noise; hence the choice of scale can affect the empirical analysis directly. The data aggregation level must also be evaluated in relation to the goal.
ii. Data Structure: Data structure relates to the type of data analysed and data characteristics such as corrupted and missing values due to the study design or data collection mechanism. Data types include structured numerical data in different forms (e.g., cross-sectional, time series, network data) as well as unstructured, non-numerical data (e.g., text, text with hyperlinks, audio, video, and semantic data). The InfoQ level of a certain data type depends on the goal at hand.
iii. Data Integration: With the variety of data sources and data types, there is often a need to integrate multiple sources and/or types. Often, the integration of multiple data types creates new knowledge regarding the goal at hand, thereby increasing InfoQ. For example, in online auction research, the integration of temporal bid sequences with cross-sectional auction and seller information leads to more precise predictions of final prices as well as to an ability to quantify the effects of different factors on the price process.
iv. Temporal Relevance: The process of deriving knowledge from data can be put on a time line that includes the data collection, data analysis, and study deployment periods as well as the temporal gaps between the data collection, the data analysis, and the study deployment stages. These different durations and gaps can each affect InfoQ. The data collection duration can increase or decrease InfoQ, depending on the study goal, e.g. studying longitudinal effects vs. a cross-sectional goal. Similarly, if the collection period includes uncontrollable transitions, this can be useful or disruptive, depending on the study goal.
v. Chronology of Data and Goal: The choice of variables to collect, the temporal relationship between them, and their meaning in the context of the goal at hand also affect InfoQ. For example, in the context of online auctions, classic auction theory dictates that the number of bidders is an important driver of auction price. Models based on this theory are useful for explaining the effect of the number of bidders on price. However, for the purpose of predicting the price of ongoing online auctions, where the number of bidders is unknown until the auction ends, the variable "number of bidders", even if available in the data, is useless. Hence, the level of InfoQ contained



in number of bidders for models of auction price depends on the goal at hand.
vi. Generalizability: The utility of f(X|g) is dependent on the ability to generalize f to the appropriate population. There are two types of generalization, statistical and scientific generalizability. Statistical generalizability refers to inferring from a sample to a target population. Scientific generalizability refers to applying a model based on a particular target population to other populations. This can mean either generalizing an estimated population pattern/model f to other populations, or applying f estimated from one population to predict individual observations in other populations using domain specific knowledge.
vii. Operationalization: Operationalization relates to both construct operationalization and action operationalization. Constructs are abstractions that describe a phenomenon of theoretical interest. Measurable data is an operationalization of underlying constructs. The relationship between the underlying construct and its operationalization can vary, and its level relative to the goal is another important aspect of InfoQ. The role of construct operationalization depends on the goal, and especially on whether the goal is explanatory, predictive, or descriptive. In explanatory models, based on underlying causal theories, multiple operationalizations might be acceptable for representing the construct of interest. As long as the data is assumed to measure the construct, the variable is considered adequate. In contrast, in a predictive task, where the goal is to create sufficiently accurate predictions of a certain measurable variable, the choice of operationalized variables is critical. Action operationalization is characterizing the practical implications of the information provided.
viii. Communication: Effective communication of the analysis and its utility directly impacts InfoQ. There are plenty of examples where miscommunication of valid results has led to disasters, such as the NASA shuttle Challenger disaster (Kenett and Thyregod, 2006). Communication media are visual, textual, and verbal in the form of presentations and reports. Within research environments, communication focuses on written publications and conference presentations. Research mentoring and the refereeing process are aimed at improving communication and InfoQ within the research community.

3 DETECTION OF CHANGE

Change happens and, invariably, its early detection is of importance. Usually we are not given advance notice of the occurrence of change, and detection must rely on observations made on the system being monitored. Generally, post-change observations differ stochastically from pre-change ones, and a single observation or a finite set of observations does not clearly differentiate the pre- and post-change regimes. Consequently, a trigger-happy detection scheme will give rise to many false alarms, whereas a conservative procedure will be too slow to react. For comparison between different methods, operating characteristics of a detection scheme must be formulated.
As an example, consider the four run charts in Figure 1. The series Y1-Y4 were generated with a change point at the 10th observation using MINITAB version 16.0. The data is a realization of a normal distribution with mean = 10 and standard deviation = 3, with shifts in the mean after the 10th observation to 11.5, 13, 14.5 and 16, respectively.
Just as in other situations where statistical methods are applied, the approach to change point detection may be frequentist or Bayesian. The problem has the flavour of testing hypotheses. At each stage, one must decide whether a change is in effect (and raise an alarm) or whether the process is in control (and continue the monitoring). The frequentist approach calls for separate operating characteristics for the in-control and the out-of-control situations. In the Bayesian context, an operating characteristic combines the in- and out-of-control scenarios by means of the prior distribution on the change point, ν. A partial list of the vast literature on these topics includes Page (1954), Shiryaev (1963), Lorden (1971), Lucas (1976), Zacks (1981), Kenett and Pollak (1983, 2012), Pollak (1985), Yashchin (1985), Zacks and Kenett (1994), Kenett and Pollak (1996), Woodall and Montgomery (1999), Frisén (2003) and Box and Luceño (2006).

Figure 1. Four time series with change point at the 10th observation.
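The simulation setup described above (four normal series with mean 10, standard deviation 3, and a mean shift after the 10th observation) was produced with MINITAB; it can be reproduced in any environment. A minimal Python sketch follows, with the series length (30) and the random seed chosen arbitrarily for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed; the paper used MINITAB 16.0

n, change_point = 30, 10             # assumed series length; shift is in effect after the 10th observation
baseline_mean, sigma = 10.0, 3.0
shifted_means = [11.5, 13.0, 14.5, 16.0]

series = {}
for i, post_mean in enumerate(shifted_means, start=1):
    # per-observation means: baseline up to the change point, shifted afterwards
    means = np.where(np.arange(1, n + 1) <= change_point, baseline_mean, post_mean)
    series[f"Y{i}"] = rng.normal(loc=means, scale=sigma)
```

Plotting the four resulting series as run charts reproduces the qualitative picture of Figure 1: the larger the shift, the more visible the change point.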



4 FALSE ALARMS

The popular index for false alarms, when viewing a problem from a frequentist point of view, is ARL0, the Average Run Length to false alarm. Another quantity of interest is P_in-control(T = n | T ≥ n), the probability that a stopping time T will raise a false alarm at time n, conditional on its not having raised a false alarm previously. This may depend on n, so the worst-case scenario is indexed by sup_{n<∞} P_in-control(T = n | T ≥ n), and one may want to keep this quantity low.
The ARL to false alarm has been criticized for not fully considering the skewness of the distribution of the run length (Lai, 1995, Zacks, 2004, Mei, 2008, Frisén, 2003, 2006, 2011). It should be noted, however, that very often the run length to false alarm has approximately an exponential distribution and, for discrete time, a geometric distribution (Gold, 1989). Since an exponential distribution is fully characterized by its mean, in such cases the ARL to false alarm fully describes the false alarm behaviour of a procedure (Knoth, 2015). In surveillance applications, processes cannot be reset like in typical industrial applications where a machine can be stopped. In such cases an investigation is triggered by an alarm and the state of alarm can be persistent. Kenett and Pollak (1983) account for the frequency of false alarms when a process cannot be reset in an application of monitoring congenital malformations.
When considering the problem from a Bayesian point of view, where there is a prior on the change point ν, the quantity of interest for describing the possibility of a false alarm is the probability of false alarm, PFA = P(T < ν). A common constraint on false alarms is P(T < ν) ≤ α.
Shiryaev (1961, 2010) considers that the error to be minimized over all stopping times of X is expressed as the linear combination of the probability of the false alarm and the expected detection delay. His change point scenario is defined as multi-cyclic stationary and, in that context, the Shiryaev-Roberts (SR) procedure is optimal (Kenett and Pollak, 1986, 1996). Shiryaev's multi-cyclic version of the change-point detection problem is equivalent to the generalized Bayesian setup, so that the SR procedure is exactly optimal in the generalized Bayesian sense as well. It should be noted that neither the Cumulative Sum (CUSUM) nor the Exponentially Weighted Moving Average (EWMA) possess such strong optimality properties (Kenett and Pollak, 2012).
Related results, in the frequentist framework, are presented in Hillier (1969), which discusses the False Alarm Rate (FAR), and Chakraborti et al. (2008), which introduces the False Alarm Probability (FAP). The value of FAR is the probability of false alarm at every sampling stage (with every subgroup or with every measurement result), i.e. FAR = 1 − Pr{LCL < Yi < UCL}, where Yi is the statistic being tracked for the ith sample (individual value Xi, or average value X̄, or range R, or standard deviation S, etc.); LCL and UCL being the lower and upper control limits used for process monitoring. The FAP is the probability of at least one false alarm during the established period of time, i.e. FAP = 1 − [1 − Pr{LCL < Yi < UCL}]^m = 1 − (1 − FAR)^m, where m is the number of subgroups included in the process capability analysis phase and the i.i.d. assumption is implied. The value of FAP is used to calculate the limits for verifying the stability of the process when establishing control limits. FAP corrects for a multiple comparison effect during process capability analysis by applying a Bonferroni correction. Another approach, different from this family-wise error consideration, is to apply a False Discovery Rate (FDR) correction, which considers not the number of comparisons (data points) but the ratio of false alarms relative to the number of alarms. For more on this topic see Kenett and Zacks (2014). Because of the dynamic aspect of process control, both PFA and FDR have adequate performance in retrospective data analysis (the process capability analysis phase) but limited relevance in future-looking process monitoring.

5 DELAY TO DETECTION

From a frequentist point of view, the post-change ARL is often characterized by ARL1, the ARL to detection assuming that the change is in effect at the very start. However, letting ν denote the serial number of the first post-change observation, the conditional expected delay of detection, conditional on a false alarm not having been raised before the (unknown) time of change ν, is CED(ν) = E(T − ν + 1 | T ≥ ν). CED may depend on ν, and there is no guarantee that CED, as a function of ν, is represented well by ARL1. If one has no anticipation of the time of change, sup_{ν<∞} CED(ν) can be considered an appropriate index. In some sequential procedures, such as the Shewhart control chart, sup_{ν<∞} P_out-of-control(T = ν | T ≥ ν) = (ARL1)^(−1). Moreover, CED(ν) is constant for Shewhart charts. If there is a good chance that a change will be in effect right at the start, one may be interested in a fast initial response scheme, where CED(0) is made to be low, at the expense of a higher CED at later ν, so that ARL1 does not tell the whole story (Lucas and Crosier, 1982). If a change is likely to take place in a distant future, then lim_{ν→∞} CED(ν), the conditional steady-state ARL, may be of interest. An alternative index is P_out-of-control(T = ν | T ≥ ν), and if one has no



anticipation of the time of change, sup_{ν<∞} P_out-of-control(T = ν | T ≥ ν) can be considered an appropriate index.
In principle, the CED may not be the primary characteristic of interest. For example, consider the case of monitoring for the outbreak of an epidemic. For illustration's sake, suppose simplistically that each infected person infects k others (all within the next time unit). Thus, if the epidemic starts with one person, at the second time unit k+1 are infected, at the third time unit k²+k+1 have been infected, etc.; after n time units the number of infected people adds up to n(n+1)(2n+1)/6 ≈ n³/3. Hence the primary object of interest would be E((T − ν + 1)³ | T ≥ ν). Or, if each infected person subsequently infects one other person every time unit, n time units after the start of the epidemic the number of infected people will be 1+1+2+4+8+…+2^(n−2) = 2^(n−1); hence the primary object of interest would be E(2^(T−ν) | T ≥ ν).
Even if the price for the delay in detection is linear in (T − ν + 1)^+, ARL1 may not be a meaningful operating characteristic. For example, consider monitoring for a change of a mean θ₀ to a mean θ₁, when the baseline and the post-change parameter are unknown. For example, suppose one wants to monitor a change in the average daily water flow in a river, where one has no historic data and only Bayesian priors. Obviously, if the change occurs at the onset, no surveillance system will be able to differentiate between pre-change and post-change, so that the expected delay to detection will equal the ARL to false alarm. Approximately the same will happen if the change takes place within a few observations after the onset of surveillance. If the change occurs later on, the pre-change observations may constitute a learning sample of sufficient size to reduce the CED to the proportions of the CED of a procedure in a situation where the baseline parameters are known. Hence, ARL1 is not a good index. Figure 2 shows simulated run lengths and the respective values of PFA, CED and ARL for the four process scenarios shown in Figure 1 when applying a two-sided CUSUM procedure set up to detect the specific change in the scenario. This assumes the CUSUM is specified optimally, during process capability analysis, with exact knowledge of the process state before and after change. The parameters used for the four simulations are:

θ    θ+    K+     h+       θ−   K−    h−
10   11.5  10.75  17.9744  8.5  9.25  17.9744
10   13.0  11.50  8.9872   7.0  8.50  8.9872
10   14.5  12.25  5.9915   5.5  7.75  5.9915
10   16.0  13.00  4.4936   4.0  7.00  4.4936

For the R code to run these simulations see Kenett and Zacks (2014) and the mistat R application available for download at https://cran.r-project.org/web/packages/mistat/index.html. For usability and meaning of measures for evaluating detection schemes and general R code for computing performance indicators of sequential methods, see Knoth (2006) and http://cran.r-project.org/web/packages/spc. For assessment of surveillance schemes see: http://economics.handels.gu.se/english/Units+and+Centra/statistical_research_unit/software.
The skewed run length distributions render interpretations of ARL low in information quality, and PFA and CED with higher information quality, especially in terms of operationalization and communication.
For more literature on CED see Kenett and Pollak, 1983, 1986, 1996, 2012, Zacks and Kenett, 1994, Kenett and Zacks, 1998, 2014, Luceño and Cofiño, 2006 and Frisén, 2011.
Another situation where ARL1 is not an appropriate index is when one is willing to tolerate many false alarms. As an example, consider checking for an intruder, where it is of utmost importance that the intrusion be detected even at the price of making many false alarms. Here, the characteristic of interest is the expected delay and, again, this is different from ARL1.
When considering the problem from a Bayesian point of view, the quantity of interest for describing the possibility that a change is in effect is P(ν ≤ n). Usually, the speed of detection of a method defined by a stopping time T is embodied by E(T − ν + 1 | T ≥ ν). Note that although this looks like the CED, there is a subtle difference: the CED regards ν as an unknown constant, whereas the Bayesian expression is, in effect, a weighted average of delay times. Considerations, as in the frequentist case, of E((T − ν + 1)³ | T ≥ ν) or E(2^(T−ν) | T ≥ ν) apply here, too.

Figure 2. Run length distributions with CUSUM for the four series in Figure 1, where the change occurred at the 10th observation.
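The quantities PFA = P(T < ν) and CED = E(T − ν + 1 | T ≥ ν) can be estimated by straightforward Monte Carlo. The paper's own code is the R mistat package referenced above; the sketch below is an independent Python illustration using the first row of the parameter table (shift from mean 10 to 11.5 at the 10th observation, with K+ = 10.75, h+ = 17.9744, K− = 9.25, h− = 17.9744); the number of replications, the horizon, and the seed are arbitrary choices.

```python
import numpy as np

def two_sided_cusum_rl(x, k_plus, h_plus, k_minus, h_minus):
    """Return the 1-based alarm time of a two-sided CUSUM, or len(x)+1 if no alarm."""
    s_hi = s_lo = 0.0
    for t, xt in enumerate(x, start=1):
        s_hi = max(0.0, s_hi + xt - k_plus)   # upper CUSUM, detects upward shifts
        s_lo = max(0.0, s_lo + k_minus - xt)  # lower CUSUM, detects downward shifts
        if s_hi > h_plus or s_lo > h_minus:
            return t
    return len(x) + 1

rng = np.random.default_rng(2016)
theta0, theta1, sigma, nu = 10.0, 11.5, 3.0, 10   # first scenario in the table above
k_plus, h_plus, k_minus, h_minus = 10.75, 17.9744, 9.25, 17.9744
n_sim, horizon = 2000, 2000                       # arbitrary simulation sizes

run_lengths = []
for _ in range(n_sim):
    x = rng.normal(theta0, sigma, horizon)
    x[nu - 1:] += theta1 - theta0                 # shift in effect from the 10th observation on
    run_lengths.append(two_sided_cusum_rl(x, k_plus, h_plus, k_minus, h_minus))

rl = np.array(run_lengths)
pfa = np.mean(rl < nu)                            # PFA = P(T < nu)
ced = np.mean(rl[rl >= nu] - nu + 1)              # CED = E(T - nu + 1 | T >= nu)
arl = rl.mean()
```

A histogram of `rl` exhibits the skewed run length distribution discussed above; reporting (pfa, ced) alongside the mean run length makes that skewness visible rather than hiding it in a single average.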



6 INFOQ ASSESSMENT OF CHANGE DETECTION

In this section we review the above considerations from an InfoQ perspective. We begin with a discussion of the four InfoQ components. As introduced in Section 2, InfoQ is derived from the utility (U) of applying an analysis (f) to a data set (X) for a given goal (g).
The goal of change point detection is typically economically motivated. If we are able to design a process with acceptable capability, and we want to avoid reliance on mass inspection, it is essential to keep the process under control (AT&T, 1956). Statistical process control consists of an alarm triggering mechanism and proactive management actions that trigger corrective actions. This is an economically optimal combination. In some cases, the control limits can be determined by considering various cost elements (Kenett and Zacks, 2014), but in most cases the specific economic costs are not used to set up the process control system. In identifying the goal of a change point detection method it is critical to distinguish between processes that can be reset, such as industrial machines, and situations where an alarm triggers an investigation with delayed impact, such as in surveillance of health related epidemics.
The utility of a change point detection method is assessed by various performance indicators such as those discussed above. Traditionally these are ARL0 and ARL1; however, as suggested, it appears that PFA and CED are more informative, see also Kenett and Pollak (2012).
The data used in process control is univariate or multivariate. In some cases the data is grouped in rational samples that represent inherent local variability. Such local variability is used to determine control limits for ongoing process monitoring.
The analysis of process control data is based on a conceptual framework that is different from the classical hypothesis testing framework. In fact, Shewhart's view on statistical control, presented in his 1931 book, is connected to predictability (Di Bucchianico and Van den Heuvel, 2015). Note that Shewhart's view, in his 1939 book, is described in terms of exchangeability and has an almost Bayesian flavour. The process control perspective is that the data analysed is generated by a process under investigation. The objective of change point detection is to identify a change from an underlying condition which was used to determine the process capability. Unlike classical statistical modelling, if a condition of change is detected, especially in processes that can be reset, the process is changed. The implication being that if the data does not fit the model, you do not fit a new model to the data but only observe if the control intervention has been successful and the process is back under control. This aspect of process control is not always appreciated, and many textbooks and papers consider process control as a standard application of hypothesis testing. Part of this confusion might be due to the relatively recent nomenclature of Phase I, where control limits are set, and Phase II, where monitoring is performed. In fact, Phase I and Phase II are typically iterated. A preferred convention would be to name these phases the process capability analysis and monitoring phases. For more on this fundamental difference see AT&T (1956) and Hawkins et al. (2003).
As mentioned in Section 2, eight dimensions are used to assess the level of InfoQ of a study. We proceed to review these dimensions in the context of change point detection procedures in process control.
i. Data resolution is related to the concept of rational samples. The frequency and extent of the data sample used to control a process is a reflection of the process characteristics. For example, multi stream processes require a sample with representations of individual streams. Generally stable processes do not require data at the microsecond level and can probably achieve proper control with quarterly or even hourly data.
ii. Data structure is about the available data types such as time series, cross-sectional, panel data, geographic, spatial, network, text, audio, video, semantic, structured, semi or non-structured data. These can include output quality or process data, including video images or textual inputs by operators.
iii. Data integration considers how process monitoring data from different sources is integrated. Such methods include Extract-Transform-Load (ETL) methods, Bayesian networks, data fusion and general machine learning methods (Goeb, 2006, Weeze et al., 2015, Kenett, 2016).
iv. Temporal relevance is relevant to both the data used for the process capability analysis stage and data used for ongoing monitoring.
v. Chronology of Data and Goal is the dimension determining effectiveness of sequential methods. The signals should be produced in a timely and informative manner. This is why Conditional Expected Delay (CED) is such an essential performance measure.
vi. Generalizability is at the core of statistical process control. The information generated from the process control procedure should be used by operators, engineers and managers in a broader context than the specific sample points. The first generalization is statistical in scope, deriving insights on the process from



the rational samples. Further generalization consists of considering other similar processes, impact of raw materials or management effects, like shifts or training methods. The point here is to generalize the change point detection signals and data driven statistics to various application domains (for more on generalizability see Chapter 11 in Kenett and Shmueli, 2016).
vii. Operationalization is again a critical dimension of process control. Control charts that are ignored, or looked at retrospectively with considerable time delays, are not informative.
viii. Communication. The simple display of data over time, with an annotation scheme pointing out alarms based on diverse triggering mechanisms, is an essential element of process control. Combining the display with mathematical calculations has made these methods so popular.

The next section discusses an expanded view of statistical process control, adding a system perspective that integrates elements included in the InfoQ perspective.

7 A SYSTEM FOR STATISTICAL PROCESS CONTROL

A System for Statistical Process Control (SSPC) is an infrastructure, mostly technological, that enhances the impact of change point detection methods. In terms of functionality, an SSPC provides features for data acquisition, data integration, reporting, filtering and visualisation. An outline of such a system is presented in Figure 3. Such systems integrate with ERP systems so that data from work orders is automatically linked to the statistical process control procedures, including a meta-tagging of critical parameters and their specification limits. An additional feature of an SSPC is its ability to handle data with high volume, velocity and variety, so-called big data. Integrating structured and unstructured data leads to improved diagnostic and troubleshooting capabilities, for example using Bayesian networks (Kenett, 2016).
In designing, or evaluating, an SSPC, one can apply the eight InfoQ dimensions: Data Resolution, Data Structure, Data Integration, Temporal Relevance, Chronology of Data and Goal, Generalizability, Operationalization and Communication. These dimensions will help the system designers and implementers cover the scope of functionalities needed by an SSPC. Specifically, the data collected by the system needs to have the right resolution, structure and temporal relevance. This includes, for example, the on line and off line measurements of process outputs, in-process parameters, work order requirements, traceability of parts across revisions, trouble shooting and corrective actions information in the form of text and images, etc. All this data needs of course to be properly integrated and analyzed.

Figure 3. High level design of an SSPC.

An additional capability of an SSPC is the application of multivariate statistical process control methods and advanced root cause analysis tools involving machine learning algorithms. In general, the application of multivariate process control has been delayed because of the complexity in deployment. Within an SSPC, methods proposed in the past, such as multivariate tolerance regions (Fuchs and Kenett, 1987), can now be easily implemented. For more on multivariate process control and machine learning methods see Goeb (2006), Kenett and Zacks (2014) and Weeze et al. (2015). For an example of an SSPC see www.spclive365.com.

8 SUMMARY AND DISCUSSION

Process control is a major element in the body of knowledge analysed and developed in industrial statistics and applied statistics in general. In this paper we review several considerations of process control methods and discussions in the literature from a perspective of Information Quality (InfoQ). The main point is that an InfoQ approach broadens the scope of much of the work on the subject, as currently presented in books and journal articles. With similar considerations, Dennis Lindley writes about the question of what is meant by statistics by referring to whom he considers as the founding fathers: Harold Jeffreys, Bruno de Finetti, Frank Ramsey and Jimmie Savage: Both Jeffreys and de Finetti developed probability as the coherent appreciation of uncertainty, but Ramsey and Savage looked at the world rather differently. Their starting point was not the concept of uncertainty but rather decision-making in the face of uncertainty. They thought in terms of action, rather than in the passive contemplation of the



uncertain world. Coherence for them was not so much a matter of how your beliefs hung together but of whether your several actions, considered collectively, make sense. … If one looks today at a typical statistical paper that uses the Bayesian method, copious use will be made of probability, but utility, or maximum expected utility, will rarely get a mention. … When I look at statistics today, I am astonished at the almost complete failure to use utility. … Probability is there but not utility. This failure has to be my major criticism of current statistics; we are abandoning our task half-way, producing the inference but declining to explain to others how to act on that inference. The lack of papers that provide discussions on utility is another omission from our publications. (Lindley, 2004). The four InfoQ components and the eight InfoQ dimensions are proposed as an antidote to the issues raised by Lindley. We focus here on an evaluation of process control methods and change point detection procedures. By taking an InfoQ perspective we re-evaluate the performance indicators used in the literature to compare procedures and emphasize the application of CED and PFA, instead of the commonly used ARL0 and ARL1. By considering the growing role and impact of technology on analytic methods, we describe a System for Statistical Process Control (SSPC) that can help enhance the impact and relevance of statistical process control in modern business and industry. We also show how InfoQ dimensions can be used to design and evaluate such systems. The challenges we describe are not specific to process control and are relevant to modern applied statistics and quality management systems in general. In this sense, SSPC is a special case of the integrated quality management systems envisioned by Juran in the 1950s (Godfrey and Kenett, 2007).

ACKNOWLEDGEMENTS

In preparing this paper several people contributed comments, suggestions and perspectives. These helped enhance the scope and completeness of the paper. I would like to particularly thank Alessandro di Bucchianico, Murat Testik, Sven Knoth, Stefan Steiner and Rainer Goeb for such contributions.

REFERENCES

AT&T (1956). Statistical Quality Control Handbook, Western Electric Company.
Box, G.E.P. and Luceño, A. (1997). Statistical Control by Monitoring and Feedback Adjustment, John Wiley and Sons, New York.
Chakraborti, S., Human, S. and Graham, M. (2008). Phase I Statistical Process Control Charts: An Overview and Some Results. Quality Engineering, 21:52–62.
Champ, C.W. and Woodall, W.H. (1987). Exact Results for Shewhart Control Charts with Supplementary Runs Rules. Technometrics, 29(4):393–399.
Di Bucchianico, A. and Van den Heuvel, E. (2015). Shewhart's Idea of Predictability and Modern Statistics, in Frontiers in Statistical Quality Control 11 (S. Knoth, W. Schmid, eds.). Springer International Publishing, Switzerland, pp. 237–248.
Frisén, M. (2003). Statistical surveillance. Optimality and methods. International Statistical Review, 71:403–434.
Frisén, M. (2011). Methods and evaluations for surveillance in industry, business, finance, and public health. Quality and Reliability Engineering International, 27:611–621.
Frisén, M. and de Maré, J. (1991). Optimal Surveillance. Biometrika, 78:271–280.
Frisén, M. and Sonesson, C. (2006). Optimal surveillance based on exponentially weighted moving averages. Sequential Analysis, 25:379–403.
Fuchs, C. and Kenett, R.S. (1987). Multivariate Tolerance Regions and F-tests. Journal of Quality Technology, 19:122–131.
Godfrey, A.B. and Kenett, R.S. (2007). Joseph M. Juran, a perspective on past contributions and future impact. Quality and Reliability Engineering International, 23(6):653–663.
Goeb, R. (2006). Data Mining and Statistical Control – A Review and Some Links, in Frontiers in Statistical Quality Control 8 (H.-J. Lenz, P.-Th. Wilrich, eds.). Springer-Verlag, Heidelberg, pp. 285–308.
Gold, M. (1989). The Geometric Approximation to the Cusum Run Length Distribution. Biometrika, 76(4):725–733.
Hawkins, D., Qiu, P. and Kang, C. (2003). The Changepoint Model for Statistical Process Control. Journal of Quality Technology, 35(4):355–65.
Hillier, F.S. (1969). X̄- and R-Chart Control Limits Based on a Small Number of Subgroups. Journal of Quality Technology, 1:17–26.
Kenett, R.S. (2016). On Generating High InfoQ with Bayesian Networks. Quality Technology and Quantitative Management, 13(3), in press.
Kenett, R.S. and Pollak, M. (1983). On Sequential Detection of a Shift in the Probability of a Rare Event. Journal of the American Statistical Association, 78:389–395.
Kenett, R.S. and Pollak, M. (1986). A Semi-Parametric Approach to Testing for Reliability Growth, with Application to Software Systems. IEEE Transactions on Reliability, R-35:304–311.
Kenett, R.S. and Pollak, M. (1996). Data-Analytic Aspects of the Shiryaev–Roberts Control Charts: Surveillance of a Non-Homogenous Poisson Process. Journal of Applied Statistics, 23:125–137.
Kenett, R.S. and Pollak, M. (2012). On Assessing the Performance of Sequential Procedures for Detecting
Kenett, R.S. and Thyregod, P. (2006). Aspects of statistical consulting not taught by academia. Statistica Neerlandica, 60(3):396–412.



a Change. Quality and Reliability Engineering International, 28:500–507.
Kenett, R.S. and Shmueli, G. (2014). On Information Quality. Journal of the Royal Statistical Society, Series A (with discussion), 177(1):3–38.
Kenett, R.S. and Shmueli, G. (2016). Information Quality: The Potential of Data and Analytics to Generate Knowledge, John Wiley and Sons.
Kenett, R.S. and Zacks, S. (1998). Modern Industrial Statistics: Design and Control of Quality and Reliability, Duxbury Press: Pacific Grove, CA; Spanish edition 2002, 2nd paperback edition 2002, Chinese edition 2004.
Kenett, R.S. and Zacks, S., with contributions by D. Amberti (2014). Modern Industrial Statistics with Applications in R, MINITAB and JMP, John Wiley and Sons.
Knoth, S. (2006). The art of evaluating monitoring schemes – How to measure the performance of control charts? In H.-J. Lenz and P.-T. Wilrich, editors, Frontiers in Statistical Quality Control 8: 74–99. Physica-Verlag, Heidelberg; http://cran.r-project.org/web/packages/spc.
Knoth, S. (2015). Run length quantiles of EWMA control charts monitoring normal mean or/and variance. International Journal of Production Research, 53:4629–4647.
Lai, T.L. (1995). Sequential Change-Point Detection in Quality Control and Dynamical Systems (with discussions). Journal of the Royal Statistical Society, Series B, 57:613–658.
Lindley, D. (2004). Some reflections on the current state of statistics, in Applied Bayesian Statistics Studies in Biology and Medicine, di Bacco, M., d'Amore, G., Scalfari, F. (editors), Springer Verlag.
Lorden, G. (1971). Procedures for Reacting to a Change in Distribution. Annals of Mathematical Statistics, 42:1897–1908.
Lucas, J. (1976). The Design and Use of V-Mask Control Schemes. Journal of Quality Technology, 8:1–12.
Lucas, J. and Crosier, R.B. (1982). Fast Initial Response for CUSUM Quality-Control Schemes: Give Your CUSUM a Head Start. Technometrics, 24:199–205.
Luceño, A. and Cofiño (2006). The Random Intrinsic Fast Initial Response of Two-Sided CUSUM Charts. Sociedad de Estadística e Investigación Operativa, 15:505–524.
Mei, Y. (2008). Is Average Run Length to False Alarm Always an Informative Criterion? (with discussions). Sequential Analysis, 27:354–419.
Page, E.S. (1954). Continuous inspection schemes. Biometrika, 41:100–114.
Pollak, M. (1985). Optimal detection of a change in distribution. Annals of Statistics, 13:206–227.
Poor, V. (1998). Quickest detection with exponential penalty for delay. Annals of Statistics, 26:2179–2205.
Shiryaev, A.N. (1961). The problem of the most rapid detection of a disturbance in a stationary process. Soviet Math. Dokl., 2:795–799.
Shiryaev, A.N. (1963). On optimum methods in quickest detection problems. Theory of Probability and Its Applications, 8:22–46.
Shiryaev, A.N. (2010). Quickest detection problems: Fifty years later. Sequential Analysis, 29:345–385.
Weeze, M., Martinez, W., Megahed, F. and Jones-Farmer, L.A. (2015). Statistical Learning Methods Applied to Process Monitoring: An Overview and Perspective. Journal of Quality Technology, 48:4–27.
Woodall, W.H. and Montgomery, D.C. (1999). Research Issues and Ideas in Statistical Process Control. Journal of Quality Technology, 31:376–386.
Yashchin, E. (1985). On the analysis and design of CUSUM-Shewhart control schemes. IBM Journal of Research and Development, 29:377–391.
Zacks, S. (1981). The probability distribution and the expected value of a stopping variable associated with one-sided CUSUM procedures for non-negative integer valued random variables. Communications in Statistics A, 10:2245–2258.
Zacks, S. (2004). Exact Determination of the Run Length Distribution of a One-Sided CUSUM Procedure Applied on an Ordinary Poisson Process. Sequential Analysis, 23:159–178.
Zacks, S. and Kenett, R.S. (1994). Process Tracking of Time Series with Change Points, in Recent Advances in Statistics, Proceedings of the 4th international meeting of statistics in the Basque Country, San Sebastián, Spain, 4–7 August, 1992. Utrecht: VSP, pp. 155–171.



Applied Mathematics in Engineering and Reliability – Briš, Snášel, Khanh & Dao (Eds)
© 2016 Taylor & Francis Group, London, ISBN 978-1-138-02928-6

Two-factor hypothesis testing using the Gamma model

Nabendu Pal
Department of Mathematics, University of Louisiana at Lafayette, Louisiana, USA
Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam

Nuong Thi Thuy Tran & Minh-Phuong Tran


Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam

ABSTRACT: We are familiar with the two-way Analysis Of Variance (ANOVA) using the normal
model where we assume the additivity of the factor effects on the mean of the response variable assuming
homoscedasticity (i.e., variances are all equal across the factor levels). But this type of normal set-up is
not applicable in many problems, especially in engineering and biological studies where the observations
are non-negative to begin with and likely to be positively skewed. In such situations one may use the
Gamma model to fit the data, and proceed with further inferences. However, a normal type inference
based on the decomposition of total Sum of Squares (SS) is not possible under the Gamma model, and
further sampling distributions of the SS components are intractable. Therefore, we have looked into this
problem from scratch, and developed a methodology where one can test the effects of the factors.
Our approach to tackle this interesting problem depends heavily on computations and simulation, which
brings a host of other challenges.

1 INTRODUCTION

Analysis Of Variance (ANOVA) with two factors is an important as well as powerful tool which is used to study the effects of two factors on the response variable. Suppose we have observations in the form of {X_ijk} which indicates the kth observation under the influence of the ith level of Factor 1 and the jth level of Factor 2, where k = 1, 2, ..., n_ij. The standard statistical theory for two-factor ANOVA assumes the following linear additive model

X_ijk = μ + α_i + β_j + γ_ij + ε_ijk,   (1.1)

where the ε_ijk's are assumed to be independent and identically distributed as N(0, σ²). The remaining terms in (1.1) are the general effect (μ), the effect of the ith level of Factor 1 (α_i), the effect of the jth level of Factor 2 (β_j), and the corresponding interaction term (γ_ij). Due to the above normality and homoscedasticity (i.e., equality of variances for all i and j) assumptions, the total Sum of Squares (SS), which measures the overall variability among the observations, can be decomposed into several independent components. These components can be used to construct the F-statistics to test relevant hypotheses regarding factor and/or interaction effects.

The classical two-way ANOVA theory does not work if the above assumptions fail to hold, although practitioners may keep using it if the departures from the assumptions are not significant. But in many engineering and biological studies the datasets tend to be positively skewed as well as nonnegative, and as a result the two standard assumptions may not hold. As a possible remedy one can transform the data using the well-known Box-Cox transformation. However, the standard interpretation of the variables gets lost under such a transformation, and as a result, the statistical inferences become difficult to relate to the original problem. To explain the situation further we use two real-life datasets given as follows.

Example 1.1: McDonald ((2014), Handbook of Biological Statistics, 3rd ed., pages 173-179) presented enzyme activity data for the amphipod crustacean (Platorchestia platensis), classified in terms of gender and genotype, as given in Table 1.

Note that in Table 1 there are only 4 (= n_ij, for all i and j) observations for each combination of Factor 1 (genotype) and Factor 2 (gender). Therefore, the standard tests for normality and/or homoscedasticity, which are asymptotic in nature, are not very effective. It would be prudent to consider the Gamma model, instead of the normal



model, for a dataset with nonnegative observations. A two-parameter Gamma distribution is a more versatile model for nonnegative observations since the distribution can be extremely skewed (for small shape parameters) to almost symmetric (for large shape parameters).

Table 1. Enzyme activity data according to gender and genotype.

Genotype   Male                          Female
FF         1.884; 2.283; 4.939; 3.486    2.838; 4.216; 2.889; 4.198
FS         2.396; 2.956; 3.105; 2.649    3.550; 4.556; 3.087; 1.943
SS         2.801; 3.421; 4.275; 3.110    3.620; 3.079; 3.586; 2.669

Example 1.2: BOD (Biological Oxygen Demand) samples were taken from the Dongnai river basin in Vietnam. The BOD is being described by two factors: Location and Season. The Season factor has two levels, WET and DRY, and the Location factor has three levels, DN (Dongnai River), SG (Saigon River) and HL (Estuary). The dataset is given in Table 2.

In Table 2, the values of the n_ij's are different for every level of Factor 1 (Location). Here we ignore the year (we combine the data for the years 2009 and 2010 into a single time period).

We are going to assume that the independent observations X_ijk, (k = 1, ..., n_ij), follow a Gamma(α_ij, β_ij) distribution, 1 ≤ i ≤ a and 1 ≤ j ≤ b, with the pdf

f(x_ijk | α_ij, β_ij) = [1 / (Γ(α_ij) β_ij^α_ij)] exp(−x_ijk/β_ij) x_ijk^(α_ij − 1).   (1.2)

The mean and variance of X_ijk are given as

E(X_ijk) = μ_ij (say) = α_ij β_ij, and V(X_ijk) = σ_ij² (say) = α_ij β_ij².   (1.3)

To study whether the factor levels, individually or jointly, have any significant influence on the means, or whether the two factors have any interaction, we are going to consider the following four hypothesis testing problems.

Problem-1: Test H0(1): μ_ij = μ_i for all j vs. HA(1): μ_ij ≠ μ_ij' for some j ≠ j', 1 ≤ j, j' ≤ b.
Problem-2: Test H0(2): μ_ij = μ_j for all i vs. HA(2): μ_ij ≠ μ_i'j for some i ≠ i', 1 ≤ i, i' ≤ a.
Problem-3: Test H0(3): μ_ij = μ for all (i, j) vs. HA(3): μ_ij ≠ μ_i'j' for some i ≠ i' and/or j ≠ j', 1 ≤ i, i' ≤ a, 1 ≤ j, j' ≤ b.
Problem-4: Test H0(4): μ_ij = μ_i μ_j for all (i, j) vs. HA(4): μ_ij ≠ μ_i μ_j for some combinations of (i, j).

If we fail to reject the null hypothesis (i) in Problem-1, then it implies that Factor 2 has no influence on the mean response; (ii) in Problem-2, then it implies that Factor 1 has no influence on the mean response; (iii) in Problem-3, then both the factors have no influence on the mean response when they act simultaneously; (iv) in Problem-4, the joint interaction effect of the two factors is of a multiplicative nature.

In this study, we are going to consider only Problem 1 and Problem 3, since by interchanging the roles of i and j (as well as a and b) we convert Problem 2 to Problem 1. Problem 4 will be taken up in a later study.

To keep the theory somewhat simpler we further assume that the scale parameters are all equal (but unknown), i.e.,

β_ij = β for all (i, j).   (1.4)

The most general case, i.e., where all scale parameters are unknown and possibly unequal, will be considered in a later phase of our study. The above assumption (1.4) helps us in developing the ideas and concepts as well as the computational tools, which will be generalized rather easily for the next phase of our study.

Under the assumption of equal scale (i.e., (1.4)), the four hypothesis testing problems essentially boil down to studying the effects of the two factors on the shape parameters α_ij only.

In Section 2, we develop the test procedures for Problem 1 (i.e., testing the significance of Factor 2), followed by a comprehensive simulation. In Section 4, we consider the test procedures for Problem 3 (i.e., testing the joint significance of Factor 1 and Factor 2). In each of the above two problems we first derive the Asymptotic Likelihood Ratio Test (ALRT). As we will see from our simulation, the ALRT performs poorly in maintaining the nominal level for small sample sizes, and hence an improvement is presented in terms of a Parametric Bootstrap (PB) version of the test based on the likelihood ratio statistic, henceforth called PBLRT.

In a series of recent papers (see Pal et al. (2007), Chang et al. (2008), Chang et al. (2010), Lin et al. (2015)), it has been shown that the PBLRT works much better than the ALRT for many other problems where an exact optimal test either does not exist or is hard to find due to a complicated sampling distribution. Although the PBLRT performs better than the ALRT in terms of maintaining the level (i.e., probability of type I error) condition, especially for small to moderate sample sizes, it is heavily



Table 2. Data of BOD (Biological Oxygen Demand).

Season DRY WET

Location 2009 2010 2009 2010

DN 7.0; 7.8; 13.0; 12.9; 8.0; 7.7; 11.3; 13.2; 5.9; 6.4; 9.0; 8.2; 6.0; 6.5; 8.9; 8.1;
14.7; 7.8; 8.0; 12.6 14.8; 7.9; 8.1; 12.7 9.6; 6.7; 6.8; 9.0 9.6; 5.8; 5.9; 8.1
SG 8.0; 9.5; 13.2; 13.0; 8.0; 9.5; 13.2; 13.0; 7.0; 8.1; 9.8; 12.4; 7.3; 8.3; 10.1; 12.8;
13.8; 26.8; 140.0; 17.2; 13.8; 26.9; 141.0; 17.3; 12.5; 21.9; 140.0; 15.5; 12.9; 19.6; 122.0; 14.0;
56.4; 23.8; 20.7; 162.0; 56.5; 23.8; 18.9; 149.6; 53.5; 20.8; 19.1; 132.0; 47.3; 21.7; 18.2; 134.0;
67.4; 17.5; 17.0; 11.3 59.9; 15.7; 18.8; 11.3 58.0; 15.6; 14.9; 9.2 59.5; 14.2; 15.5; 9.2
HL 10.4; 15.0; 8.0; 7.5; 9.3; 15.2; 7.9; 7.6 7.8; 8.7; 7.0; 6.2; 6.8; 9.4; 6.5; 6.4;
7.1; 6.0; 7.6; 7.4; 7.4; 5.4; 7.7; 7.4; 5.7; 4.4; 5.5; 5.7; 5.8; 4.5; 5.6; 5.7;
6.3; 27.7; 22.9; 11.2 6.3; 27.7; 29.9; 11.6 4.5; 19.1; 18.9; 9.4 4.5; 19.1; 18.9; 9.4
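As a quick numerical check of the Gamma parameterization in (1.2)-(1.3), the following sketch (ours, not part of the paper's analysis; the parameter values are illustrative) verifies the mean and variance formulas and the skewness behaviour, heavy skew for small shape and near symmetry for large shape, against SciPy's gamma distribution:

```python
# Check E(X) = alpha*beta and V(X) = alpha*beta^2 from (1.3), plus the
# skewness 2/sqrt(alpha) that drives the "extremely skewed to almost
# symmetric" behaviour of the Gamma family. Parameter pairs are illustrative.
from scipy.stats import gamma

for alpha, beta in [(0.5, 2.0), (1.0, 1.0), (10.0, 0.3)]:
    dist = gamma(a=alpha, scale=beta)      # beta enters SciPy as the scale
    mean, var, skew = (float(m) for m in dist.stats(moments="mvs"))
    assert abs(mean - alpha * beta) < 1e-10
    assert abs(var - alpha * beta ** 2) < 1e-10
    assert abs(skew - 2.0 / alpha ** 0.5) < 1e-10

print("moment formulas (1.3) verified")
```

For α = 0.5 the skewness is about 2.83 (strongly right-skewed, like the SG cells of Table 2), while for α = 10 it drops to about 0.63, which is why the shape parameter alone can mimic both skewed and near-normal data.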

dependent on computations. But given the computational resources available today, implementation of the PBLRT should not be any difficulty.

2 TESTING THE SIGNIFICANCE OF FACTOR 2 (PROBLEM 1)

Our goal in this section is to address the hypothesis testing problem H0(1): μ_ij = μ_i for all j against the alternative which negates it. The null hypothesis is stating that Factor 2 has no effect on the mean response, which under the assumption of equality of scales (i.e., (1.4)) can be written as

H0(1): α_ij = α_i for all j, for some suitable α_i;   (2.1)

where α_i can be thought of as (μ_i / β).

To test (2.1) we derive the classical Likelihood Ratio Test (LRT) statistic given as

Λ = −2 ln [ sup_{H0(1)} L / sup L ],   (2.2)

where L stands for the likelihood function of the combined data, the numerator inside the logarithm in (2.2) stands for the restricted supremum of L under H0(1), and the denominator represents the global supremum of L.

The standard asymptotic theory says that for all n_ij moderately large, the sampling distribution of Λ under H0(1) can be approximated as

Λ ≈ χ²_ν,   (2.3)

where the degrees of freedom ν is the difference between the number of free parameters in the global parameter space and that under H0(1). So, the LRT rejects H0(1) if Λ > χ²_(ν, 1−α) = the (1−α)100th percentile value of the χ²_ν-distribution. But one must note that this test based on the Chi-square distribution is not very good (or accurate) when the n_ij's are small. But first we are going to see the details of this LRT method.

Given the independent observations X_ijk ~ Gamma(α_ij, β), the likelihood function L is given as

L = L(α_ij, β, 1 ≤ i ≤ a, 1 ≤ j ≤ b | X_ijk, (i, j, k))
  = ∏_{i=1}^a ∏_{j=1}^b ∏_{k=1}^{n_ij} [1 / (Γ(α_ij) β^α_ij)] exp(−X_ijk/β) (X_ijk)^(α_ij − 1).   (2.4)

Thus, the log-likelihood function lnL = ln L can be written as

lnL = Σ_{i=1}^a Σ_{j=1}^b { −n_ij ln Γ(α_ij) − n_ij α_ij ln β − (1/β) Σ_{k=1}^{n_ij} X_ijk + (α_ij − 1) Σ_{k=1}^{n_ij} ln X_ijk }.   (2.5)

We use the notation X̄_ij and X̃_ij to denote the Arithmetic Mean (AM) and Geometric Mean (GM) of the observations in the (i, j)th cell, i.e.,

X̄_ij = Σ_{k=1}^{n_ij} X_ijk / n_ij;  X̃_ij = ( ∏_{k=1}^{n_ij} X_ijk )^(1/n_ij).   (2.6)

Then lnL can be simplified as

lnL = Σ_{i=1}^a Σ_{j=1}^b n_ij { −ln Γ(α_ij) − α_ij ln β − (1/β) X̄_ij + (α_ij − 1) ln X̃_ij }.   (2.7)

By differentiating lnL in (2.7) w.r.t. α_ij and β, and then setting them equal to zero, one obtains the following system of equations



ψ(α_ij) = −ln β + ln X̃_ij, for all (i, j);
and Σ_{i=1}^a Σ_{j=1}^b n_ij α_ij β = Σ_{i=1}^a Σ_{j=1}^b n_ij X̄_ij;   (2.8)

where ψ(c) = (d/dc){ln Γ(c)} is the di-gamma function defined at c > 0. Define the total sample size n and the grand mean X̄ as follows

n = Σ_{i=1}^a Σ_{j=1}^b n_ij and X̄ = Σ_{i=1}^a Σ_{j=1}^b n_ij X̄_ij / n.   (2.9)

Then, solving the system of equations in (2.8) yields the MLEs of α_ij and β, say α̂_ij and β̂, as follows. First obtain α̂_ij by solving the following system of (ab) equations

ψ(α̂_ij) − ln ( Σ_{i'=1}^a Σ_{j'=1}^b n_i'j' α̂_i'j' ) = ln ( X̃_ij / (n X̄) ), for all (i, j);   (2.10)

and then obtain β̂ as

β̂ = (n X̄) / Σ_{i=1}^a Σ_{j=1}^b n_ij α̂_ij.   (2.11)

Thus,

sup L = L(α̂_ij, β̂, 1 ≤ i ≤ a, 1 ≤ j ≤ b | X_ijk, (i, j, k)).   (2.12)

The log-likelihood function under H0(1), henceforth denoted by lnL0(1), is

lnL0(1) = Σ_{i=1}^a Σ_{j=1}^b n_ij { −ln Γ(α_i) − α_i ln β − (1/β) X̄_ij + (α_i − 1) ln X̃_ij }.   (2.13)

Differentiating lnL0(1) w.r.t. α_i and β, and then setting them equal to zero yields

Σ_{j=1}^b n_ij ψ(α_i) = Σ_{j=1}^b n_ij { ln X̃_ij − ln β }, 1 ≤ i ≤ a;
and Σ_{i=1}^a Σ_{j=1}^b n_ij α_i β = Σ_{i=1}^a Σ_{j=1}^b n_ij X̄_ij.   (2.14)

Define the total sample size subject to the ith level of Factor 1, and the corresponding sampling proportion, as

n_i = Σ_{j=1}^b n_ij, and v_ij = n_ij / n_i.   (2.15)

The MLEs of α_i and β under H0(1), denoted by α̂_i^0 and β̂^0, are obtained as follows. First obtain α̂_i^0 by solving the following system of a equations

ψ(α̂_i^0) − ln ( Σ_{q=1}^a Σ_{l=1}^b n_ql α̂_q^0 ) = ln ( ∏_{j=1}^b (X̃_ij)^(v_ij) / Σ_{q=1}^a Σ_{l=1}^b n_ql X̄_ql ), 1 ≤ i ≤ a;   (2.16)

and then obtain β̂^0 as

β̂^0 = (n X̄) / Σ_{i=1}^a Σ_{j=1}^b n_ij α̂_i^0.   (2.17)

Thus,

sup_{H0(1)} L = L(α̂_i^0, β̂^0, 1 ≤ i ≤ a | X_ijk, (i, j, k)).   (2.18)

As stated earlier, for moderately large n_ij values, Λ follows χ²_ν under H0(1), with ν = a(b − 1). We will see later that for small n_ij's, the size of the ALRT is higher than α, whereas the proposed PBLRT keeps it within α. The beauty of the PBLRT is that it is a purely computational technique where one does not need to know the sampling distribution of the test statistic (which is the LRT statistic in this case), and the critical value is derived through a simulation. Before discussing further the pros and cons of the PBLRT, we first describe how it is implemented through a series of steps as given below.

Steps of the proposed PBLRT

Step 1: Given the original data {X_ijk, (i, j, k)}, obtain the unrestricted MLEs (α̂_ij, β̂) as well as the restricted MLEs (α̂_i^0, β̂^0) (under H0(1)). Compute Λ using (2.12) and (2.18).

Step 2:
i. Assuming that H0(1) is true, generate artificial (bootstrap) observations in an internal loop of M replications. In the mth replication we generate X_ijk^(m) from Gamma(α̂_i^0, β̂^0), 1 ≤ i ≤ a, 1 ≤ j ≤ b, 1 ≤ k ≤ n_ij.
ii. With the artificial observations {X_ijk^(m), (i, j, k)} compute (α̂_ij, β̂) and (α̂_i^0, β̂^0) as done in Step 1, and call them (α̂_ij^(m), β̂^(m)) and (α̂_i^0(m), β̂^0(m)), respectively. Then obtain the Λ value as done in Step 1, and call it Λ_m.
iii. By repeating the above (i)-(ii) for m = 1, 2, ..., M, we have Λ_1, Λ_2, ..., Λ_M. Order these Λ_m values as Λ_(1) ≤ Λ_(2) ≤ ... ≤ Λ_(M).



Step 3: The critical value for the Λ statistic (in Step 1) is obtained as Λ_((1−α)M), where α is the level of the test. If Λ > Λ_((1−α)M), then reject H0(1); retain H0(1) otherwise. Alternatively, the p-value of the PBLRT is approximated by Σ_{m=1}^M I(Λ_(m) > Λ) / M.

Remark 2.1: Why the PBLRT might work better than the ALRT is not counter-intuitive. The ALRT approximates the true distribution of Λ under H0(1) by the Chi-square distribution, which may be far from reality when the n_ij's are not large. On the other hand, the proposed PBLRT tries to replicate the true distribution of Λ under H0(1) by drawing samples from Gamma(α̂_i^0, β̂^0), which is an approximation to the distribution Gamma(α_ij, β) under H0(1). Thus, the relative frequency histogram of Λ_(m), 1 ≤ m ≤ M, comes very close to that of Λ under H0(1), for large M, and it appears to be a better fit, as the simulation results indicate, than the χ²_ν distribution.

Remark 2.2: In order to compare the proposed PBLRT with the ALRT in terms of size and power, we have undertaken a comprehensive simulation study. In our simulation, we generate the dataset {X_ijk, (i, j, k)} a large number (say, Q) of times. In each replication we observe whether the test under consideration rejects the null hypothesis or not. Then the size or power of the test is approximated by the proportion of times (out of Q) it rejects H0(1). When our input parameters (α_ij, β), i.e., the parameters used to generate the observations {X_ijk, (i, j, k)}, obey H0(1), then the proportion of times a test rejects the null hypothesis becomes the estimated size of that test. When the input parameters do not obey the null hypothesis, then we obtain the estimated power of the test. To be specific, in every replication of the data {X_ijk, (i, j, k)}, say in the qth replication, we define I_ALRT^(q) and I_PBLRT^(q) as: I_ALRT^(q) = 1 if the ALRT rejects H0(1), I_ALRT^(q) = 0 otherwise; and I_PBLRT^(q) = 1 if the PBLRT rejects H0(1), I_PBLRT^(q) = 0 otherwise. Then, depending on the input parameters, (size or power of ALRT) = Σ_{q=1}^Q I_ALRT^(q) / Q, and (size or power of PBLRT) = Σ_{q=1}^Q I_PBLRT^(q) / Q.

Remark 2.3: The utility of the proposed PBLRT lies in its simplicity. One does not need to know the true sampling distribution of the test statistic Λ. However, it is very computation intensive. In our simulation study, while the size (or power) of the ALRT is computed through a single loop (of Q replications), that of the PBLRT is done through a double loop (of Q replications in the outer loop and M replications in the inner loop). As a result, running the simulation study becomes a challenge in terms of computational time. But in real-life applications where a decision has to be made, based on the PBLRT, whether to reject the null hypothesis or not, that decision-making process is not that time consuming, since it is done through a single loop (the inner loop of M replications only, to find the critical value for the test statistic). In Section 5, where four datasets have been analyzed, we have used M = 10,000.

The next section is devoted to the size comparison of the two tests mentioned above.

3 COMPARISON OF ALRT AND PBLRT IN TERMS OF SIZE

For size comparison, the datasets are generated under the null hypothesis, i.e., α_ij = α_i for all j, for some α_i > 0. Not only are we going to vary α_i, but also the n_ij's as well as the nominal level α. Three widely used α values will be used, which are 0.01, 0.05, 0.10. We have noted that M = Q = 5000 gives quite

Table 3. Simulated size of two tests with a = b = 2, α_1 = β = 1.0 under H0(1).

           α_2 = 0.5        α_2 = 1.0        α_2 = 2.0        α_2 = 5.0        α_2 = 10.0
α    n_ij  P_ALRT  P_PBLRT  P_ALRT  P_PBLRT  P_ALRT  P_PBLRT  P_ALRT  P_PBLRT  P_ALRT  P_PBLRT
0.01   5   0.019   0.011    0.018   0.011    0.021   0.013    0.022   0.014    0.024   0.016
      10   0.015   0.009    0.015   0.008    0.014   0.007    0.017   0.009    0.015   0.006
      25   0.011   0.006    0.012   0.007    0.013   0.005    0.017   0.006    0.011   0.006
      50   0.012   0.005    0.012   0.006    0.010   0.005    0.011   0.006    0.010   0.005
0.05   5   0.076   0.047    0.075   0.047    0.079   0.053    0.091   0.058    0.088   0.054
      10   0.076   0.045    0.071   0.046    0.069   0.039    0.070   0.042    0.070   0.040
      25   0.051   0.029    0.055   0.031    0.053   0.030    0.064   0.035    0.054   0.029
      50   0.059   0.035    0.055   0.032    0.050   0.025    0.054   0.028    0.049   0.026
0.10   5   0.142   0.097    0.142   0.093    0.144   0.100    0.154   0.110    0.156   0.113
      10   0.133   0.093    0.129   0.086    0.127   0.081    0.129   0.082    0.126   0.082
      25   0.106   0.063    0.110   0.071    0.104   0.063    0.118   0.072    0.111   0.061
      50   0.109   0.071    0.110   0.068    0.106   0.061    0.108   0.060    0.098   0.054
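Each entry of Table 3 is a Monte Carlo proportion of rejections over Q replications. The following sketch (our own illustration, not the authors' code; the function names and the small Q are ours) estimates one ALRT entry for a = b = 2, n_ij = 10 at nominal level 0.05, solving the MLE systems (2.10) and (2.16) with SciPy:

```python
import numpy as np
from scipy.optimize import fsolve
from scipy.special import digamma, gammaln
from scipy.stats import chi2

rng = np.random.default_rng(1)
a, b, n, level = 2, 2, 10, 0.05          # problem size as in Table 3, n_ij = n
alpha_true, beta_true = 1.0, 1.0          # H0(1) holds: every alpha_ij = 1

def loglik(alpha, beta, nij, am, lgm):
    # log-likelihood (2.7); am = cell AMs, lgm = cell ln(GM)s
    return np.sum(nij * (-gammaln(alpha) - alpha * np.log(beta)
                         - am / beta + (alpha - 1.0) * lgm))

def lrt_stat(x):
    # x[i][j] = observations of cell (i, j); returns Lambda of (2.2)
    nij = np.array([[len(c) for c in row] for row in x], float)
    am = np.array([[np.mean(c) for c in row] for row in x])
    lgm = np.array([[np.mean(np.log(c)) for c in row] for row in x])
    ntot, xbar = nij.sum(), np.sum(nij * am) / nij.sum()
    # unrestricted MLEs, system (2.10); log-parameterized to keep alpha > 0
    def g1(t):
        al = np.exp(t).reshape(a, b)
        return (digamma(al) - np.log(np.sum(nij * al))
                - lgm + np.log(ntot * xbar)).ravel()
    al1 = np.exp(fsolve(g1, np.zeros(a * b))).reshape(a, b)
    b1 = ntot * xbar / np.sum(nij * al1)                       # (2.11)
    # restricted MLEs under H0(1), system (2.16)
    v = nij / nij.sum(axis=1, keepdims=True)
    def g0(t):
        al = np.exp(t)
        return (digamma(al) - np.log(np.sum(nij * al[:, None]))
                - np.sum(v * lgm, axis=1) + np.log(ntot * xbar))
    al0 = np.exp(fsolve(g0, np.zeros(a)))
    b0 = ntot * xbar / np.sum(nij * al0[:, None])              # (2.17)
    return -2.0 * (loglik(al0[:, None], b0, nij, am, lgm)
                   - loglik(al1, b1, nij, am, lgm))

nu = a * (b - 1)                          # ALRT degrees of freedom
crit = chi2.ppf(1.0 - level, df=nu)
Q = 300                                   # small outer loop; the paper uses Q = 5000
rejections = sum(lrt_stat([[rng.gamma(alpha_true, beta_true, size=n)
                            for _ in range(b)] for _ in range(a)]) > crit
                 for _ in range(Q))
size_alrt = rejections / Q
print(f"estimated ALRT size at nominal level {level}: {size_alrt:.3f}")
```

Per Table 3, one expects an estimate somewhat above the nominal 0.05 for n = 10, which is exactly the liberal behaviour of the ALRT discussed in Remark 3.1.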



stable results, with standard error bounded above by 0.003. For convenience we use n_ij = n for all (i, j), and n is taken as 5, 10, 25, 50. The overall two-factor problem size, identified with (a, b), will be varied as (2,2), (5,5), (10,10); but in Table 3 we report the results for a = b = 2 only. We use the notation P_ALRT and P_PBLRT to denote the simulated size of the ALRT and the PBLRT respectively.

Remark 3.1: It is noted that the size of the ALRT is much higher than the nominal level α when n < 25, making the test very liberal. For n ≥ 25, the ALRT's size stays very close to α, albeit a bit higher. On the other hand, the proposed PBLRT's size is always within α. In fact, for n ≥ 25, the PBLRT behaves like a more conservative test. Therefore, as a rule of thumb, we suggest that the PBLRT be used for small sample sizes, and the ALRT be used for large sample sizes.

4 TESTING THE JOINT SIGNIFICANCE OF FACTOR 1 AND FACTOR 2

Our goal in this section is to consider Problem-3, with the hypothesis testing of H0(3): μ_ij = μ for all (i, j) against the alternative which negates it. The null hypothesis is stating that both the factors have no influence on the mean response when they act simultaneously, which under the assumption of equality of scales (β_ij = β for all (i, j)) can be written as

H0(3): α_ij = α for all (i, j), for some suitable α;   (4.1)

where α can be thought of as (μ / β). Similar to Section 2, the likelihood ratio test statistic is given as

Λ = −2 ln [ sup_{H0(3)} L / sup L ].   (4.2)

With all n_ij moderately large, we can approximate the sampling distribution of Λ under H0(3) as

Λ ≈ χ²_ν,

where ν = (ab − 1) is the degrees of freedom. Next we are going to see the details of the LRT method for Problem 3. We also consider the log-likelihood function under H0(3), denoted by lnL0(3), as

lnL0(3) = Σ_{i=1}^a Σ_{j=1}^b n_ij { −ln Γ(α) − α ln β − (1/β) X̄_ij + (α − 1) ln X̃_ij }.   (4.3)

By differentiating lnL0(3) w.r.t. α and β, and then setting them equal to zero, one obtains

n ψ(α) = Σ_{i=1}^a Σ_{j=1}^b n_ij (ln X̃_ij − ln β); and α β = X̄.   (4.4)

The MLEs of α and β under H0(3), denoted by α̂^0 and β̂^0, are obtained as follows. First obtain α̂^0 by solving the following equation

ψ(α̂^0) − ln(α̂^0) = ( Σ_{i=1}^a Σ_{j=1}^b n_ij ln X̃_ij ) / n − ln(X̄),   (4.5)

and then obtain β̂^0 as

β̂^0 = X̄ / α̂^0.   (4.6)

Thus,

sup_{H0(3)} L = L(α̂^0, β̂^0 | X_ijk, (i, j, k)).   (4.7)

Similar to Section 2, a PBLRT version of the test can be constructed based on the LRT statistic. A comprehensive simulation has been carried out, and it has been noted that the PBLRT adheres to the level α much more closely than the ALRT. Table 4 shows the simulated size values of the two tests (ALRT and PBLRT).

Table 4. Simulated size of two tests with a = b = 2; α_1 = α_2 = β = 1.0 under H0(3).

       α = 0.01          α = 0.05          α = 0.10
n_ij   P_ALRT  P_PBLRT   P_ALRT  P_PBLRT   P_ALRT  P_PBLRT
  5    0.020   0.004     0.082   0.034     0.140   0.056
 10    0.012   0.005     0.061   0.029     0.115   0.063
 25    0.010   0.007     0.053   0.030     0.106   0.067
 50    0.011   0.008     0.052   0.033     0.096   0.065

5 ANALYSIS OF DATASETS

In this section we revisit the two datasets presented in Section 1 and see two other datasets. These datasets have been used for demonstration purposes for our proposed gamma distribution based analysis of two factors.
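Before turning to the datasets, the machinery above can be sketched end to end for Problem 3, where the restricted MLE solves the single equation (4.5). This is our own minimal illustration (function names, the simulated data and the small M are ours, not the authors' code): the unrestricted MLEs solve (2.10)-(2.11), Λ is the statistic of (4.2), and a parametric bootstrap supplies the PBLRT p-value next to the χ²_(ab−1) approximation (the paper uses M = 10,000):

```python
import numpy as np
from scipy.optimize import brentq, fsolve
from scipy.special import digamma, gammaln
from scipy.stats import chi2

rng = np.random.default_rng(7)
a, b, n = 2, 2, 8

def summaries(x):
    nij = np.array([[len(c) for c in row] for row in x], float)
    am = np.array([[np.mean(c) for c in row] for row in x])    # X-bar_ij
    lgm = np.array([[np.mean(np.log(c)) for c in row] for row in x])  # ln GM
    return nij, am, lgm

def loglik(alpha, beta, nij, am, lgm):
    # cell-wise log-likelihood (2.7) / (4.3)
    return np.sum(nij * (-gammaln(alpha) - alpha * np.log(beta)
                         - am / beta + (alpha - 1.0) * lgm))

def restricted_mle(nij, am, lgm):
    # (4.5): psi(a0) - ln(a0) = sum n_ij ln GM_ij / n - ln X-bar; monotone in a0
    ntot, xbar = nij.sum(), np.sum(nij * am) / nij.sum()
    rhs = np.sum(nij * lgm) / ntot - np.log(xbar)   # < 0 since GM < AM
    a0 = brentq(lambda t: digamma(t) - np.log(t) - rhs, 1e-8, 1e8)
    return a0, xbar / a0                            # (4.6)

def lam_stat(x):
    nij, am, lgm = summaries(x)
    ntot, xbar = nij.sum(), np.sum(nij * am) / nij.sum()
    a0, b0 = restricted_mle(nij, am, lgm)
    # unrestricted MLEs, system (2.10), log-parameterized for positivity
    def g(t):
        al = np.exp(t).reshape(a, b)
        return (digamma(al) - np.log(np.sum(nij * al))
                - lgm + np.log(ntot * xbar)).ravel()
    al1 = np.exp(fsolve(g, np.zeros(a * b))).reshape(a, b)
    b1 = ntot * xbar / np.sum(nij * al1)            # (2.11)
    return -2.0 * (loglik(a0, b0, nij, am, lgm) - loglik(al1, b1, nij, am, lgm))

# "Observed" data, simulated here under H0(3) with alpha = beta = 1 (illustrative).
x = [[rng.gamma(1.0, 1.0, size=n) for _ in range(b)] for _ in range(a)]
lam = lam_stat(x)
p_alrt = chi2.sf(lam, df=a * b - 1)                 # chi-square approximation

# PBLRT: bootstrap Lambda under H0(3) from the restricted MLEs (Steps 1-3).
a0, b0 = restricted_mle(*summaries(x))
M = 200
lam_boot = np.array([lam_stat([[rng.gamma(a0, b0, size=n) for _ in range(b)]
                               for _ in range(a)]) for _ in range(M)])
p_pblrt = np.mean(lam_boot > lam)
print(f"Lambda = {lam:.3f}, ALRT p = {p_alrt:.3f}, PBLRT p = {p_pblrt:.3f}")
```

The same skeleton handles Problem 1 by swapping in the restricted system (2.16)-(2.17); only the null fit and the degrees of freedom change.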



Example 5.1: Recall the dataset of McDonald (2014) presented in Example 1.1. The following Table 5 provides the results of the usual normal distribution based ANOVA with interaction. The following Table 6 presents the analysis of the dataset in Example 1.1 under the gamma model.

Table 5. Results for the Enzyme data (normal model).

Source       F-value  p-value
Genotype     0.332    0.722
Gender       0.489    0.493
Interaction  0.351    0.709

Table 6. Results for the Enzyme data (gamma model).

Factor    P_ALRT  P_PBLRT
Genotype  0.613   0.847
Gender    0.647   0.781

Remark 5.1: Note that under the gamma model both the ALRT as well as the PBLRT retain the null hypothesis negating the effect of the two factors on enzyme activity. The results are consistent with the findings under the normal model, though the corresponding p-values are slightly different.

Example 5.2: Recall the dataset presented in Example 1.2. Similar to the previous example, the following two tables (Table 7 and Table 8) show the results under the normal as well as the gamma models. Here also the final inferences are consistent with each other.

Table 7. Results for the BOD data (normal model).

Source       F-value  p-value
Location     14.460   0.000
Season       0.558    0.456
Interaction  0.011    0.992

Table 8. Results for the BOD data (gamma model).

Factor    P_ALRT  P_PBLRT
Location  0.000   0.000
Season    0.660   0.693

In the following sequel we present two new datasets which show the divergence in outcomes of the two approaches (one based on the normal model, and the other based on the gamma model).

Example 5.3: Montgomery ((2005), Design and Analysis of Experiments, 6th ed., page 201) cites an experiment, similar to a study reported in an article in the IEEE Transactions on Electronic Devices (Nov. 1986, page 1754), where the response variable, that is Base Current (BC), was observed subject to various levels of two factors: Polysilicon Doping (ions) and Anneal Temperature (degree Centigrade), as shown in Table 9. The following Table 10 and Table 11 show the normal based ANOVA results as well as those based on the gamma model.

Table 9. BC dataset according to factors PD and AT.

Polysilicon     Anneal Temperature (AT)
Doping (PD)     900           950            1000
1 × 10^20       4.60, 4.40    10.15, 10.20   11.01, 10.58
2 × 10^20       3.20, 3.50    9.38, 10.02    10.81, 10.60

Table 10. Results for the BC data (normal model).

Source       F-value  p-value
PD           10.216   0.019
AT           648.906  0.000
Interaction  3.474    0.0995

Table 11. Results for the BC data (gamma model).

Factor  P_ALRT  P_PBLRT
PD      0.000   0.000
AT      0.000   0.000

Remark 5.2: Take a look at Table 9. While the BC values clearly differ greatly for the levels of AT, they do not vary much for the levels of PD. If one uses the level α = 0.01, then the factor PD is not significant under the normal model. However, under the gamma model, PD is clearly significant (along with AT).

Example 5.4: Hogg and Ledolter ((1987), Engineering Statistics, page 238) reported a study done by a textile engineer regarding the effect of temperature (degree Fahrenheit) and time (in cycles) on the brightness of a synthetic fabric which uses a particular dye. Brightness was measured on a 50-point scale, and three observations were taken at each combination of temperature and time, as shown in the following Table 12.



Table 12. Brightness dataset according to Time and Temp.

Time          Temperature (Temp) (in Fahrenheit)
(in cycles)   350          375          400
40            38, 32, 30   37, 35, 40   36, 39, 43
50            40, 45, 36   39, 42, 46   39, 48, 47

As before, Table 13 and Table 14 summarize the findings under the normal as well as the gamma models.

Table 13. Results for the Brightness data (normal model).

Source       F-value  p-value
Time         9.692    0.009
Temperature  2.606    0.115
Interaction  0.111    0.896

Table 14. Results for the Brightness data (gamma model).

Factor       P_ALRT  P_PBLRT
Time         0.027   0.030
Temperature  0.131   0.300

Remark 5.3: Interestingly, while the normal model indicates Time as a significant factor for brightness for any α, the gamma model infers it as insignificant using α = 0.01. Also, under the gamma model, the PBLRT differs greatly from the ALRT in terms of the p-value for the factor Temperature.

Concluding Remark: This work sheds some light on an approach alternative to the normal model, which is widely used to analyze a dataset subject to two factors. Our proposed gamma model, and the corresponding PBLRT, can be used effectively for nonnegative datasets, especially when the sample sizes are small and/or the normality assumption fails to hold.

ACKNOWLEDGMENT

We sincerely thank Mr. Huynh Van Kha of the Faculty of Mathematics and Statistics, Ton Duc Thang University, for his generous help with the numerical computations of this work.

We are indebted to our colleague Dr. Pham Anh Duc of the Faculty of Environment and Labour Safety, Ton Duc Thang University, for providing the BOD dataset.

REFERENCES

Chang, C.-H., J.-J. Lin, & N. Pal (2011). Testing the equality of several gamma means: a parametric bootstrap method with applications. Computational Statistics 26(1), 55-76.
Chang, C.-H. & N. Pal (2008). Testing on the common mean of several normal distributions. Computational Statistics & Data Analysis 53(2), 321-333.
Chang, C.-H., N. Pal, & J.-J. Lin (2010). A note on comparing several Poisson means. Communications in Statistics-Simulation and Computation 39(8), 1605-1627.
Lin, J.-J., C.-H. Chang, & N. Pal (2015). A Revisit to Contingency Table and Tests of Independence: Bootstrap is Preferred to Chi-Square Approximations as well as Fisher's Exact Test. Journal of Biopharmaceutical Statistics 25(3), 438-458.
Pal, N., W.K. Lim, & C.-H. Ling (2007). A computational approach to statistical inferences. Journal of Applied Probability & Statistics 2(1), 13-35.




Author index

Abramov, O. 91 Gouno, E. 3, 193 Nguyen, T.-N. 237


Aissani, A. 159 Grall, A. 81 Nguyen, T.-Q. 133
Anh, D.N. 231 Nguyen, T.-Q.T. 139
Aslett, L.J.M. 207 Ha, C.N. 35, 51 Nguyen, X.-X. 133
Ha, L.K. 225 Nguyen-Thoi, T. 59
Barros, A. 73 Hoang, T.N. 139
Brenguer, C. 81 Ho-Huu, V. 59 Pal, N. 327
Bre, M. 107, 281 Ho-Nhat, L. 59 Pham, D.-T. 133
Bernatik, A. 169 Hung, L.Q. 67 Pham, U.H. 309
Blaheta, R. 281 Huynh, K.T. 81 Phan, T.T.D. 43
Briš, R. 9, 19, 107, Huynh, V.-K. 271
119, 309 Tai, V.V. 35, 51
Bui Quang, P. 27 Katueva, Y. 187 Thach, T.T. 9
Bui, L.T.T. 249 Kenett, R.S. 317 Thao, N.T. 35, 51
epin, M. 201 Khanh, C.D. 255 Tran, D.T. 95
Kieu, T.N. 139 Tran, M.-P. 237, 243, 271,
Coolen, F.P.A. 207 Kuera, P. 177 291, 327
Coolen-Maturi, T. 207 Kurov, P. 303 Tran, N.T.T. 327
Kvassay, M. 215 Tran, V.-D. 27
Damaj, R. 193 Tuan, N.H. 255
Dao, N.A. 249 Levashenko, V. 215
Ut, N.D. 145
Dao, P. 67 Long, P.T.H. 67
Derychova, K. 169 Lundteigen, M.A. 73 Van, K.H. 231
Daz, J.I. 231 Ly, S. 309 Vo-Duy, T. 59
Do, D.-T. 133, 139, 145, 153 Vozk, M. 139
Do, P. 73 Nazarov, D. 91, 187 Vu, H.C. 73
Do, V.C. 3 Nguyen, B.M. 95
Domesov, S. 119, 281 Nguyen, K.T.P. 95 Walter, G. 207
Duy, H.H. 139 Nguyen, T.D. 43
Dvorsk, H. 177 Nguyen, T.-L. 153 Zaitseva, E. 215
