
Abstract: Phenol, one of the major organic pollutants from the paper
and pulp, pharmaceutical, iron-steel, coke-petroleum, and paint
industries, was degraded by the heterotrophic bacterium Pseudomonas
putida (ATCC: 11172). In a batch reactor, four parameters, namely
temperature, pH, RPM and phenol dosage, were varied systematically
to generate three time series data sets. These were used for
process identification with an ARX model and for univariate and
multivariate statistical monitoring of the phenol degradation process.
Different SPC (Statistical Process Control) charts and PCA (Principal
Component Analysis) were used to monitor the process and hence to
identify abnormal process conditions leading to faulty situations.

Keywords: ARX, PCA, Phenol, Pseudomonas putida, SPC
I. INTRODUCTION
PROCESS identification helps in developing efficient
monitoring and control systems for any process. It is about
understanding the dynamics present in a process from its
historical data. Different machine learning algorithms can be
used effectively for this purpose.
One of the major objectives of the present work is the
identification of phenol bio-degradation using an ARX
(autoregressive with exogenous inputs) model, in view of the
complexity of the metabolic pathways and the lack of
knowledge of the rate-limiting step of the process.
The detection of a fault, followed by its diagnosis, is extremely
important for the effective, economic, safe and successful
operation of a process. Efforts to manufacture a higher
proportion of within-specification product have led to the use
of Statistical Process Control (SPC). SPC refers to a
collection of statistical techniques and charting methods that
have been found useful in ensuring consistent,
on-specification production. However, most modern industrial
processes have frequent on-line measurements available on many
process variables and, in some instances, on several properties
of raw materials and final product.





Dr. M. Kundu is with the National Institute of Technology, Rourkela,
Orissa, India (phone: 91-0661-246-2263; fax: 91-0661-246-2999; e-mail:
mkundu@nitrkl.ac.in).
C. Kavuri is with the National Institute of Technology, Rourkela, Orissa,
India. Presently he is with the Department of Civil Engg. (e-mail:
biochaitanya@gmail.com).



Furthermore, there are measurements of characteristics related
to product quality that are usually made infrequently and
off-line. Industrial quality problems are therefore multivariate.
Univariate SPC methods provide little information about the
interactions between characteristics and are therefore not
appropriate for modern processes. Most of the limitations of
univariate SPC can be addressed through Multivariate
Statistical Process Monitoring (MSPM), which considers all
the characteristics of interest simultaneously and can extract
information on the behavior of each characteristic relative to
the others. Some recent advances in process monitoring are
worth mentioning. Chen and Liao (2002) integrated two
data-driven techniques, neural networks (NN) and principal
component analysis (PCA), into a method called NNPCA for
process monitoring [1]. In this method, the NN summarized the
operating process information into a nonlinear dynamic
mathematical model, and PCA generated simple monitoring
charts based on the multivariable residuals, i.e. the differences
between the process measurements and the neural network
predictions.
Examples from recent industrial monitoring practice and from
the large-scale Tennessee Eastman process problem were
presented. Zhao et al. (2007) introduced a new STMPCA
(soft-transition multiple PCA) modeling method to avoid the
misclassification problems associated with simple stage-based
sub-PCA when monitoring batch processes [2]. The method was
based on the idea that process transitions can be detected by
analyzing changes in the loading matrices, which reveal the
evolution of the underlying process behavior. They proposed
setting up a series of multiple PCA models with time-varying
covariance structures, which reflect the diversity of
transitional characteristics and can better solve the
stage-transition monitoring problem in multistage batch
processes. The method was applied to a real three-tank system
and to a simulation of the benchmark fed-batch penicillin
fermentation process, giving more reliable monitoring charts.
Both the experimental and the simulation results demonstrated
the effectiveness and feasibility of the proposed method.
Jyh-Cheng Jeng (2010) presented the use of both recursive PCA
(RPCA) and moving-window PCA (MWPCA) for online updating of
the PCA model and the corresponding control limits of the
monitoring statistics [3]. He derived an efficient algorithm
based on rank-one matrix updates of the covariance matrix,
tailored for RPCA and MWPCA computations. He demonstrated the
complete monitoring system through simulation examples, and the
results showed the effectiveness of the proposed method.

C. Kavuri, M. Kundu

Bio-degradation of Phenol: Identification and
Monitoring

Bin-Shams et al. (2011) used a CUSUM-based statistical
monitoring scheme to monitor a particular set of faults in the
Tennessee Eastman Process (TEP) that had earlier been monitored
using contribution plots [4]. Contribution plots were found to be
inadequate when similar variable responses were associated
with different faults. Abnormal situations from the process
historical database were then used in combination with the
proposed CUSUM-chart-based PCA model to unambiguously
characterize the different fault signatures. A family of PCA
models trained with CUSUM transformations of all the available
measurements, collected during individual or simultaneous
occurrence of the faults, was found effective in correctly
diagnosing these faults.
For the phenol degradation process, three experimental runs,
each taken over 36 hours, were used to produce time series data
for the univariate and multivariate statistical monitoring of
three continuous process variables, namely temperature, pH and
RPM. X-bar, CUSUM and Range charts were used for univariate
monitoring, and Principal Component Analysis (PCA) was used for
multivariate monitoring.
II. EXPERIMENTATION
An IIC-LABEAST bench-top bio-fermentor of 2.5 litre
capacity was used for the degradation experiments. The
reactor was connected to a high- and low-temperature
thermostatic water bath to maintain a constant temperature. An
aeration rate of 1 LPM was maintained to supply oxygen to
the reactor. The fermentor was fully equipped with a pH
electrode, stirrer and baffles. The strain used for phenol
degradation was the heterotrophic bacterium
Pseudomonas putida (ATCC: 11172), obtained from the
National Collection of Industrial Microorganisms
(NCIM), Pune, India. The strain was supplied in the form of
dry spores. The spores were cultured in the media specified by
NCIL for the preparation of the inoculum. The strain was
sub-cultured every two weeks for maintenance. The media
composition was chosen so that it supplies all the
nutrients necessary for the growth of P. putida. The
constituents are those of a basic mineral salt medium, which is
widely used for growing microorganisms. The
experiments were designed systematically by varying
temperature, pH, RPM and phenol loading at four levels each.
A total of 16 combinations were run, which can effectively
show the effect of each parameter on the process. All the
experiments were run for an incubation period of 36 h.
Table 1 shows the combinations of the four parameters at
four levels each and the corresponding phenol
degradation percentages.
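
The design described above can be checked numerically. The following is a minimal sketch, not the authors' procedure: it copies the 16 runs of Table 1 and tests whether every pair of factor levels occurs equally often, which is the balance property one would expect from a systematic four-factor, four-level design. The function names and the main-effect summary are illustrative assumptions and do not appear in the paper.

```python
from collections import Counter
from itertools import combinations

# Rows of Table 1: (temperature, RPM, pH, phenol loading, % degradation)
RUNS = [
    (34.0, 210, 6.0, 100, 62.22), (34.0, 150, 7.0, 300, 54.76),
    (28.0, 150, 6.0, 400, 49.22), (28.0, 210, 7.0, 200, 59.68),
    (31.0, 240, 6.0, 300, 53.72), (25.0, 150, 5.5, 100, 61.34),
    (28.0, 240, 6.5, 100, 62.70), (25.0, 240, 7.0, 400, 51.26),
    (31.0, 150, 6.5, 200, 56.42), (25.0, 210, 6.5, 300, 54.46),
    (25.0, 180, 6.0, 200, 56.08), (31.0, 180, 7.0, 100, 64.56),
    (34.0, 240, 5.5, 200, 55.20), (28.0, 180, 5.5, 300, 51.80),
    (31.0, 210, 5.5, 400, 48.86), (34.0, 180, 6.5, 400, 51.54),
]
N_FACTORS = 4  # the first four columns are the design factors


def is_pairwise_balanced(runs):
    """Return True if, for every pair of factors, each combination of
    their levels appears in the same number of runs."""
    for a, b in combinations(range(N_FACTORS), 2):
        counts = Counter((run[a], run[b]) for run in runs)
        levels_a = {run[a] for run in runs}
        levels_b = {run[b] for run in runs}
        if len(counts) != len(levels_a) * len(levels_b):
            return False  # some level combination never occurs
        if len(set(counts.values())) != 1:
            return False  # combinations occur unequally often
    return True


def mean_response_by_level(runs, factor_index):
    """Average % degradation at each level of one factor
    (a simple main-effect summary, added here for illustration)."""
    totals, counts = Counter(), Counter()
    for run in runs:
        totals[run[factor_index]] += run[-1]
        counts[run[factor_index]] += 1
    return {level: totals[level] / counts[level] for level in sorted(totals)}


if __name__ == "__main__":
    print("pairwise balanced:", is_pairwise_balanced(RUNS))
    print("mean degradation by phenol loading:", mean_response_by_level(RUNS, 3))
```

Because each level pair appears equally often, the per-level averages of one factor are not confounded with the levels of the others, which is what allows 16 runs to describe four factors.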

III. PROCESS IDENTIFICATION
Time series identification with the ARX model used the time
series data produced in the 4th run. The ARX model considers
only one data channel (run) when identifying the dynamics of the
process, so only one run (the 4th) was used here; either the
6th or the 16th run could have been used instead.
In the ARX model the disturbance term is neglected. An
autoregressive model with exogenous inputs (ARX) can be
written as:

$$A(q)\,y(t) = B(q)\,u(t) + e(t) \tag{1}$$

where
A(q) is the transfer function developed from the output
data in correspondence to the given input data,
B(q) is the transfer function developed from the input data,
y(t) is the output, i.e. the phenol degradation percentage,
u(t) is the input continuous variable, i.e. temperature, RPM or pH,
e(t) is the error, and
q is the shift operator, given by y(t)/y(t-1), i.e. q y(t-1) = y(t).
Both A(q) and B(q) are polynomials whose coefficients
($a_1, a_2, a_3, \dots, a_n$ and $b_1, b_2, b_3, \dots, b_n$)
are determined from the time series data. Three
SISO transfer functions were developed by
taking each of the three continuous variables
(temperature, pH and RPM) as input, with percentage phenol
degradation as output. The coefficients of each ARX model
are given in Table 2. The model parameters and coefficients are
determined by fitting candidate models to the data and minimizing
criteria based on the reduction of prediction error and the
parsimony of the model. The cross-correlation coefficient was
calculated to check whether the selected process inputs had
sufficient impact on the process output, i.e. whether the
input and output time series were correlated. The
time series identification was done using the System
Identification Toolbox in MATLAB. The percentage fit values
(phenol degradation) with temperature, RPM and pH as inputs were
90.22%, 91.51% and 85.22%, respectively. Thus the ARX
models identified the dynamics of the phenol
biodegradation process with respect to temperature, pH and
RPM with reasonable accuracy.
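
To make the identification step concrete, here is a minimal Python sketch of a least-squares ARX fit for one input and one output. It is not the MATLAB toolbox routine used in the paper. It assumes the common convention $A(q) = 1 + a_1 q^{-1} + \dots + a_{n_a} q^{-n_a}$ and $B(q) = b_1 q^{-1} + \dots + b_{n_b} q^{-n_b}$, and a normalised fit measure of the form $100\,(1 - \lVert y - \hat{y}\rVert / \lVert y - \bar{y}\rVert)$; both are assumptions, since the paper does not spell them out. The function names are hypothetical.

```python
import numpy as np


def fit_arx(y, u, na, nb):
    """Least-squares ARX fit under the assumed convention
    y(t) + a1*y(t-1) + ... + a_na*y(t-na) = b1*u(t-1) + ... + b_nb*u(t-nb) + e(t).
    Returns the coefficient arrays (a, b)."""
    y = np.asarray(y, dtype=float)
    u = np.asarray(u, dtype=float)
    start = max(na, nb)
    rows = []
    for t in range(start, len(y)):
        past_y = [-y[t - k] for k in range(1, na + 1)]  # regressors for a_k
        past_u = [u[t - k] for k in range(1, nb + 1)]   # regressors for b_k
        rows.append(past_y + past_u)
    phi = np.array(rows)
    theta, *_ = np.linalg.lstsq(phi, y[start:], rcond=None)
    return theta[:na], theta[na:]


def predict_one_step(y, u, a, b):
    """One-step-ahead predictions; the first max(na, nb) entries are NaN."""
    y = np.asarray(y, dtype=float)
    u = np.asarray(u, dtype=float)
    na, nb = len(a), len(b)
    y_hat = np.full(len(y), np.nan)
    for t in range(max(na, nb), len(y)):
        ar_part = -sum(a[k - 1] * y[t - k] for k in range(1, na + 1))
        x_part = sum(b[k - 1] * u[t - k] for k in range(1, nb + 1))
        y_hat[t] = ar_part + x_part
    return y_hat


def fit_percent(y, y_hat):
    """Normalised fit in percent, ignoring samples without a prediction."""
    y = np.asarray(y, dtype=float)
    mask = ~np.isnan(y_hat)
    y, y_hat = y[mask], y_hat[mask]
    return 100.0 * (1.0 - np.linalg.norm(y - y_hat) / np.linalg.norm(y - y.mean()))


if __name__ == "__main__":
    # Synthetic data from a known first-order system, for illustration only.
    rng = np.random.default_rng(1)
    u = rng.normal(size=200)
    y = np.zeros(200)
    for t in range(1, 200):
        y[t] = 0.5 * y[t - 1] + 2.0 * u[t - 1]
    a, b = fit_arx(y, u, na=1, nb=1)
    print(a, b, fit_percent(y, predict_one_step(y, u, a, b)))
```

In practice one would repeat the fit for several candidate orders and keep the smallest model whose prediction error is acceptable, which is the parsimony criterion mentioned in the text.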

IV. PROCESS MONITORING
A. Monitoring with SPC Charts
The goal of statistical process monitoring (SPM) is to detect
the existence, magnitude, and time of occurrence of changes
that cause a process to deviate from its desired operation. The
methodology for detecting changes is based on statistical
techniques that deal with the collection, classification,
analysis, and interpretation of data. Traditional statistical
process control (SPC) has focused on monitoring quality
variables at the end of a batch and, if the quality variables are
outside the range of their specifications, making adjustments
(hence controlling the process) in subsequent batches. An
improvement on this approach is to monitor quality variables
during the progress of the batch and make adjustments if they
deviate from their expected ranges. Monitoring quality
variables usually delays the detection of abnormal process
operation because the appearance of the defect in the quality
variable takes time. Information about quality variations is
encoded in process variables. The measurement of process
variables is often highly automated and more frequent,
enabling speedy refinement of measurement information and
inference about product quality. Monitoring of process
variables is useful not only for assessing the status of the
process, but also for controlling the product quality. When
process monitoring indicates abnormal process operation,
diagnosis is initiated to determine the source
cause of the abnormal behaviour. In this framework, each
quality variable is treated as a single independent variable. The
abnormal operating conditions in the phenol degradation
process were detected using traditional univariate statistical
control charts such as X-bar charts, R charts, Moving Range
charts and CUSUM charts. The "3σ" control limit (σ denoting
the standard deviation of the variable) is the most popular
control limit.
Range Charts: Development of the X-bar chart starts with the R
chart. Since the control limits of the X-bar chart depend on
process variability, its limits are not meaningful before R is
in control. The range is the difference between the maximum and
minimum observations in a sample:

$$R_i = \max(x_i) - \min(x_i), \qquad \bar{R} = \frac{1}{m}\sum_{i=1}^{m} R_i \tag{2}$$

The random variable $R/\sigma$ is called the relative range. The
parameters of its distribution depend on the sample size n, with
the mean being $d_2$. An estimate of $\sigma$ (estimates are
denoted by a hat) can be computed from the range data by
using

$$\hat{\sigma} = \frac{\bar{R}}{d_2} \tag{3}$$

where $d_2$ is called Hartley's constant. The standard deviation
of R is estimated by using the standard deviation of $R/\sigma$,
$d_3$:

$$\hat{\sigma}_R = d_3\,\hat{\sigma} = d_3\,\frac{\bar{R}}{d_2} \tag{4}$$

The control limits of the R chart are:

$$UCL,\ LCL = \bar{R} \pm 3\,d_3\,\frac{\bar{R}}{d_2} \tag{5}$$

Defining

$$D_3 = 1 - 3\,\frac{d_3}{d_2} \quad\text{and}\quad D_4 = 1 + 3\,\frac{d_3}{d_2} \tag{6}$$

the control limits become

$$LCL = D_3\,\bar{R} \quad\text{and}\quad UCL = D_4\,\bar{R} \tag{7}$$
X-bar Charts: One or more observations may be made at each
sampling instant. The collection of all observations at a
specific sampling time is called a sample.

$$\bar{X}_i = \frac{1}{n}\sum_{j=1}^{n} x_{ij}, \qquad \bar{\bar{X}} = \frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n} x_{ij} \tag{8}$$

where m is the number of samples and n is the number of
observations in a sample (sample size). The estimator for the
mean process level (centerline) is $\bar{\bar{X}}$.

Since the estimate of the standard deviation of the mean
process level is $\bar{R}/(d_2\sqrt{n})$,

$$UCL,\ LCL = \bar{\bar{X}} \pm A_2\,\bar{R} \tag{9}$$

where

$$A_2 = \frac{3}{d_2\sqrt{n}} \tag{10}$$

Here n is the number of readings and $d_2$ is Hartley's constant.
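
As an illustration of equations (2) to (10), the sketch below computes R-chart and X-bar-chart limits from a list of equally sized samples. The $d_2$ and $d_3$ values are the usual tabulated constants for normally distributed data and subgroup sizes 2 to 5; clipping a negative $D_3$ to zero is a common convention that the paper does not state, so it is an assumption here. The function name and the returned keys are hypothetical.

```python
import math

# Tabulated constants of the relative range for subgroup sizes 2 to 5.
D2_TABLE = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}
D3_TABLE = {2: 0.853, 3: 0.888, 4: 0.880, 5: 0.864}


def xbar_r_limits(samples):
    """Center lines and 3-sigma limits for the R and X-bar charts.

    samples: list of m samples, each a list of n observations (2 <= n <= 5).
    """
    m = len(samples)
    n = len(samples[0])
    means = [sum(s) / n for s in samples]
    ranges = [max(s) - min(s) for s in samples]
    grand_mean = sum(means) / m   # centerline of the X-bar chart
    r_bar = sum(ranges) / m       # centerline of the R chart

    d2, d3 = D2_TABLE[n], D3_TABLE[n]
    a2 = 3.0 / (d2 * math.sqrt(n))
    big_d3 = max(0.0, 1.0 - 3.0 * d3 / d2)  # assumed: negative values clipped to 0
    big_d4 = 1.0 + 3.0 * d3 / d2

    return {
        "r_center": r_bar,
        "r_lcl": big_d3 * r_bar,
        "r_ucl": big_d4 * r_bar,
        "xbar_center": grand_mean,
        "xbar_lcl": grand_mean - a2 * r_bar,
        "xbar_ucl": grand_mean + a2 * r_bar,
        "sigma_hat": r_bar / d2,
    }


if __name__ == "__main__":
    # Three triplicate readings, made up for illustration only.
    demo = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
    for key, value in xbar_r_limits(demo).items():
        print(f"{key}: {value:.4f}")
```

The R chart is evaluated first: if a range falls outside its limits, the estimate of variability is unreliable and the X-bar limits derived from it should not be trusted.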

CUSUM Charts: The cumulative sum (CUSUM) chart
incorporates all the information in a data sequence to highlight
changes in the process average level. The values to be plotted
on the chart are computed by subtracting the overall mean
$\mu_0$ from the data and then accumulating the differences. For a
sample size $n \ge 1$, denote the average of the j-th sample by
$\bar{x}_j$. The quantity

$$S_i = \sum_{j=1}^{i} (\bar{x}_j - \mu_0) \tag{11}$$

is plotted against sample number i. CUSUM charts are very
effective in detecting small process shifts, since they combine
information from several samples. They are also very
effective with samples of size 1. The CUSUM values can be
computed recursively:

$$S_i = (\bar{x}_i - \mu_0) + S_{i-1} \tag{12}$$

If the process is in control at the target value $\mu_0$, the
CUSUM $S_i$ should meander randomly in the vicinity of 0. If the
process mean is shifted, an upward or downward trend will develop
in the plot. Visual inspection of changes of slope indicates the
sample number (and consequently the time) of the process
shift. Even when the mean is on target, the CUSUM $S_i$ may
wander far from the zero line and give the appearance of a
signal of change in the mean. When CUSUM charts were first
proposed, control limits in the form of a V-mask were employed
to decide whether a statistically significant change in slope
had occurred and whether the trend of the CUSUM plot differed
from that of a random walk. Computer-generated CUSUM plots have
become more popular in recent years, and the V-mask has been
replaced by the upper and lower confidence limits of one-sided
CUSUM charts. One-sided CUSUM charts are developed by plotting

$$S_i = \sum_{j=1}^{i} \left[\bar{x}_j - (\mu_0 + K)\right] \tag{13}$$

where K is the reference value to detect an increase in the
mean level. If $S_i$ becomes negative for $\mu_1 > \mu_0$, it is
reset to zero. When $S_i$ exceeds the decision interval H, a
statistically significant increase in the mean level is declared.
Values for K and H can be computed from the relations:

$$K = \frac{\Delta}{2}, \qquad H = \frac{d\,\Delta}{2} \tag{14}$$

Given the probabilities of type 1 ($\alpha$) and type 2 ($\beta$)
errors, the size of the shift in the mean to be detected
($\Delta$), and the standard deviation of the average value of the
variable x ($\sigma_{\bar{x}}$), the parameters in the above
equation are as follows:

$$d = \frac{2}{\delta^2}\,\ln\!\left(\frac{1-\beta}{\alpha}\right), \quad\text{where}\quad \delta = \frac{\Delta}{\sigma_{\bar{x}}} \tag{15}$$
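
The one-sided CUSUM can be sketched in a few lines of Python. This is an illustrative reading of the scheme, not the authors' code: it takes the relations as $K = \Delta/2$, $H = d\,\Delta/2$ and $d = (2/\delta^2)\ln((1-\beta)/\alpha)$ with $\delta = \Delta/\sigma_{\bar{x}}$, and it applies the reset to zero at every step, as the text describes. The function names and the demo data are hypothetical.

```python
import math


def cusum_parameters(alpha, beta, shift, sigma_xbar):
    """Reference value K and decision interval H, assuming
    K = shift/2, H = d*shift/2, d = (2/delta^2) * ln((1 - beta)/alpha),
    delta = shift/sigma_xbar."""
    delta = shift / sigma_xbar
    d = (2.0 / delta ** 2) * math.log((1.0 - beta) / alpha)
    return shift / 2.0, d * shift / 2.0


def one_sided_cusum(x, mu0, k, h):
    """Upper one-sided CUSUM for detecting an increase in the mean.

    Returns the CUSUM values and the indices where the value exceeds h.
    """
    s = 0.0
    values, alarms = [], []
    for i, xi in enumerate(x):
        # Accumulate deviations from mu0 + k; reset to zero when negative.
        s = max(0.0, s + (xi - (mu0 + k)))
        values.append(s)
        if s > h:
            alarms.append(i)
    return values, alarms


if __name__ == "__main__":
    # Made-up sequence with a step increase after the third sample.
    data = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
    values, alarms = one_sided_cusum(data, mu0=0.0, k=0.5, h=1.2)
    print(values, alarms)
```

A decrease in the mean can be monitored with the mirror-image statistic, accumulating $(\mu_0 - K) - \bar{x}_j$ instead.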

Moving Range Charts: In a moving-range chart, the range of
two consecutive sample groups of size a is computed and
plotted. For $a \ge 2$,

$$MR_t = \max(i) - \min(i) \tag{16}$$

where i is the subgroup containing the samples from t-a+1
to t.

The computation procedure is as follows:
1. Select the moving range size a. Often a = 2.
2. Obtain the estimates of $\overline{MR}$ and
$\hat{\sigma} = \overline{MR}/d_2$ by using the moving ranges
$MR_t$ of length a. For a total of m samples:

$$\overline{MR} = \frac{1}{m-a+1}\sum_{t=1}^{m-a+1} MR_t \tag{17}$$

3. Compute the control limits with the center line at
$\overline{MR}$:

$$LCL = D_3\,\overline{MR}, \qquad UCL = D_4\,\overline{MR} \tag{18}$$
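
The three steps above can be sketched as follows. The defaults $d_2 = 1.128$ and $d_3 = 0.853$ are the tabulated constants for a window of a = 2; for other window sizes the matching constants must be supplied. Clipping a negative $D_3$ to zero is an assumed convention, and the function names are hypothetical.

```python
def moving_ranges(x, a=2):
    """Range of each window of a consecutive observations (m - a + 1 values)."""
    return [max(x[t - a + 1:t + 1]) - min(x[t - a + 1:t + 1])
            for t in range(a - 1, len(x))]


def moving_range_limits(x, a=2, d2=1.128, d3=0.853):
    """Center line, sigma estimate and control limits of the moving-range chart.

    d2 and d3 must correspond to the window size a (defaults are for a = 2).
    """
    mr = moving_ranges(x, a)
    mr_bar = sum(mr) / len(mr)
    big_d3 = max(0.0, 1.0 - 3.0 * d3 / d2)  # assumed: negative values clipped to 0
    big_d4 = 1.0 + 3.0 * d3 / d2
    return {
        "center": mr_bar,
        "sigma_hat": mr_bar / d2,
        "lcl": big_d3 * mr_bar,
        "ucl": big_d4 * mr_bar,
    }


if __name__ == "__main__":
    # Made-up individual readings, for illustration only.
    print(moving_ranges([1.0, 3.0, 2.0, 5.0]))
    print(moving_range_limits([1.0, 3.0, 2.0, 5.0]))
```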

B. Principal component Analysis
PCA, a multivariate statistical technique, was used to extract
the essential features from the time series data of the phenol
biodegradation process. PCA reduces the dimensionality of a
dataset while retaining most of its information. Projecting
the original data along the selected principal component
directions (which are mutually orthogonal) produces the score
data, which are uncorrelated. PCA decomposition was applied to
the auto-scaled data matrix. The PCA model can be used for
outlier detection, data reconciliation, and detection of
deviations from normal operating conditions that indicate
excessive variation from the normal target or unusual patterns
of variation.
The process data matrix was constructed so that each row
represents one measurement and the number of columns m is
equal to the length of the measurement sequence, i.e. the
number of features. In the present study, the features
considered were temperature, pH and RPM and the number of
measurements was 180, which resulted in a (180 × 3) data
matrix X. The covariance matrix $C = \mathrm{cov}(X)$ and its
eigenvalues were calculated. Its eigenvectors $u_j$ form an
orthonormal basis $U = [u_1, u_2, u_3, \dots, u_m]$; that is,
$U^T U = I$. The original data set can be represented in the
new basis using the relation $z = U^T x$ for each measurement
vector x (a row of X), i.e. $Z = XU$ for the whole data matrix.
After this transformation, a new set of PCs was
obtained. In the present work, the abnormal operating
conditions were the outliers to the cluster of normal process
operating conditions projected in biplots.
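
A minimal NumPy sketch of this decomposition is given below. It auto-scales the columns, forms the covariance matrix, and projects the data onto the eigenvectors ordered by decreasing eigenvalue. The function name and the synthetic 180 × 3 matrix are illustrative assumptions; the measured data of the paper are not reproduced here.

```python
import numpy as np


def pca_scores(X):
    """PCA of an (n_measurements x n_features) matrix via the covariance matrix.

    Returns (scores, U, eigenvalues) with components sorted by decreasing variance.
    """
    X = np.asarray(X, dtype=float)
    # Auto-scaling: zero mean and unit standard deviation per column.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    C = np.cov(Z, rowvar=False)            # covariance of the scaled data
    eigenvalues, U = np.linalg.eigh(C)     # eigh handles symmetric matrices
    order = np.argsort(eigenvalues)[::-1]  # largest variance first
    eigenvalues, U = eigenvalues[order], U[:, order]
    scores = Z @ U                         # projection onto the new basis
    return scores, U, eigenvalues


if __name__ == "__main__":
    # Synthetic stand-in for the 180 x 3 process matrix.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(180, 3))
    X[:, 1] += 0.8 * X[:, 0]               # make two columns correlated
    scores, U, eigenvalues = pca_scores(X)
    print(eigenvalues / eigenvalues.sum()) # fraction of variance per component
```

Plotting pairs of score columns against each other gives the kind of two-dimensional projections used for outlier detection.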

V. RESULTS AND DISCUSSION
A. Univariate Monitoring
Figs. 1, 2 and 3 show the CUSUM and Moving Range charts
of temperature, pH and RPM, respectively. Figs. 4-6 show the
X-bar and Range charts of temperature, pH and RPM,
respectively. The data produced in the 4th run were used
for monitoring the process parameters. The control limits
for these charts were customized by altering the multiples
of the σ values; the multiples were chosen so that the
control limits represent 95% confidence intervals. The
figures show a prominent deviation between the R charts and
the Moving Range charts. This is due to fluctuations within
the subsets, i.e. the triplicate values taken as readings.
Only the X-bar and CUSUM charts were considered for
process fault detection. The X-bar and CUSUM charts
indicated certain common operating points where the
parameters were out of control in both charts.
The CUSUM chart for temperature showed 3 instances where the
temperature went out of control, whereas the X-bar chart
showed 8 such instances, with one in common. The CUSUM and
X-bar charts for pH each showed 4 deviations from normal
operating conditions, two of them in common. The
CUSUM chart for RPM produced 5 outliers whereas the X-bar
chart produced 2, with none in common. In total, 23
faulty situations were found across the monitored
process parameters. All the deviation points were noted and
checked to see whether multivariate statistical process
control could identify the same abnormalities.

B. Multivariate Monitoring
The original data matrix (180 × 3) was decomposed by PCA
and projected onto the principal component directions
PC1, PC2 and PC3 to produce score data, and three
biplots were generated. As the principal components are the
directions in which maximum variance is present, all the data
points pertaining to normal operating conditions should
fall within a cluster. Points that deviate from the normal
operating conditions should fall apart from it. Figs. 7-9
present the projections of all the process variables/features
onto PC1 & PC2, PC1 & PC3 and PC2 & PC3, respectively. The
ellipses in the figures represent the 95% confidence levels
of the axes values. The centroid and the major and minor axes
of the ellipse are defined as:

Centroid = $(\mu_x, \mu_y)$
Length of major axis (a) = $\mu_x \pm 0.95\,R_x/2$
Length of minor axis (b) = $\mu_y \pm 0.95\,R_y/2$

where $\mu$ is the mean value of the corresponding coordinates
and R is their range. The points outside the ellipse
represent the outliers, or abnormal operating conditions. The
confidence level can be altered to obtain more stringent
control limits. Projections on PC1 & PC2, PC1 & PC3 and
PC2 & PC3 produced 11, 15 and 12 outliers, respectively,
giving a total of 26 outliers when common points are counted
once. In comparison with the traditional SPM, the MSPM yielded
2 new outliers or abnormal operating situations, namely at the
3rd and 102nd instants. Thus the multivariate approach to
process monitoring helps in detecting abnormal conditions
where individual process variables/features may seem
to be under control but their combination produces an
abnormal situation that affects the process adversely.
One of the major characteristics of multivariate data is that the
variables being measured are almost never independent;
rather, they are often highly correlated with one another at any
given time. Hence multivariate process monitoring is well
suited to processes where many interrelated
process variables are involved.
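
The ellipse test can be sketched as below. This is one possible reading of the definitions above, stated as an assumption: the ellipse is axis-aligned, centered at the mean of the two score coordinates, with semi-axes equal to 0.95 times half the range of each coordinate. A point is flagged when it lies outside that ellipse. The function name and the demo points are hypothetical.

```python
import numpy as np


def ellipse_outliers(px, py, level=0.95):
    """Indices of points outside an axis-aligned ellipse centered at the mean,
    with semi-axes level * range / 2 along each coordinate (assumed reading)."""
    px = np.asarray(px, dtype=float)
    py = np.asarray(py, dtype=float)
    cx, cy = px.mean(), py.mean()
    a = level * (px.max() - px.min()) / 2.0  # semi-axis along the first score
    b = level * (py.max() - py.min()) / 2.0  # semi-axis along the second score
    distance = ((px - cx) / a) ** 2 + ((py - cy) / b) ** 2
    return np.flatnonzero(distance > 1.0)


if __name__ == "__main__":
    # Made-up scores: one central point and four points at the extremes.
    x = [-1.0, 0.0, 1.0, 0.0, 0.0]
    y = [0.0, 0.0, 0.0, -1.0, 1.0]
    print(ellipse_outliers(x, y))
```

Lowering `level` shrinks the ellipse and flags more points, which corresponds to the more stringent limits mentioned in the text.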
VI. CONCLUSION
Phenol degradation dynamics were identified using an ARX
model, which could be immensely useful for on-line process
monitoring and control when process knowledge is inadequate
for developing a first-principles model. Different SPC charts
and PCA were used for monitoring the phenol degradation
process and detecting abnormal operating conditions that may
lead to process faults. Projecting the process data along the
principal component directions distinguished normal process
operating conditions from abnormal ones more precisely.

REFERENCES
[1] J. Chen & C. Liao, "Dynamic process fault monitoring based on neural
network and PCA," Journal of Process Control, vol. 12, pp. 277-289,
2002.
[2] C. Zhao, F. Wang, N. Lu, & M. Jia, "Stage-based soft-transition
multiple PCA modeling and on-line monitoring strategy for batch
processes," Journal of Process Control, vol. 17, pp. 728-741, 2007.
[3] J.-C. Jeng, "Adaptive process monitoring using efficient recursive
PCA and moving window PCA algorithms," Journal of the Taiwan
Institute of Chemical Engineers, vol. 41, pp. 475-481, 2010.
[4] M. A. Bin Shams, H. M. Budman, & T. A. Duever, "Fault detection,
identification and diagnosis using CUSUM based PCA," Chemical
Engineering Science, vol. 66, pp. 4488-4498, 2011.















Fig 1 CU-SUM and Moving R charts of temperature
TABLE 1
COMBINATIONS OF INPUT PARAMETERS AND THEIR OUTPUTS

Run  Temperature  RPM  pH   Phenol Loading  % Degradation
1    34.0         210  6.0  100             62.22
2    34.0         150  7.0  300             54.76
3    28.0         150  6.0  400             49.22
4    28.0         210  7.0  200             59.68
5    31.0         240  6.0  300             53.72
6    25.0         150  5.5  100             61.34
7    28.0         240  6.5  100             62.7
8    25.0         240  7.0  400             51.26
9    31.0         150  6.5  200             56.42
10   25.0         210  6.5  300             54.46
11   25.0         180  6.0  200             56.08
12   31.0         180  7.0  100             64.56
13   34.0         240  5.5  200             55.2
14   28.0         180  5.5  300             51.8
15   31.0         210  5.5  400             48.86
16   34.0         180  6.5  400             51.54




Fig.2 CU-SUM and Moving R charts of pH


Fig.3 CU-SUM and Moving R charts of RPM


Fig.4 X-bar and Range chart of temperature

Fig.5 X-bar and Range chart of pH


Fig.6 X-bar and Range chart of RPM


Fig.7 Projections of score along PC-1 and PC-2 with ellipse
representing 95% confidence



Fig.8 Projections of score along PC-1 and PC-3 with ellipse
representing 95% confidence


Fig.9 Projections of score along PC-2 and PC-3 with ellipse
representing 95% confidence


























Madhusree Kundu was born in Kolkata on 21 January 1965. She received her
B.Sc. in Chemistry from the University of Calcutta, India. Dr. Kundu received
her B.Tech. and M.Tech. in Chemical Engineering from the same university in
1990 and 1992, respectively. She completed her doctoral study at the Indian
Institute of Technology, Kharagpur, India, and was awarded her Ph.D. in 2005.
She served as a Process Engineer at Simon Carves India Ltd. during
1993-1998. She served as a Lecturer and then Assistant Professor at the Birla
Institute of Technology & Science, Pilani, India, during 2004-2006. Presently,
she is an Associate Professor in the Department of Chemical Engineering, NIT,
Rourkela, Orissa, India. She has published a couple of research articles and
book chapters. Her current research interests are Fluid Phase Equilibria,
Advanced Process Control, Fault Detection and Diagnosis, and Process
Monitoring. Dr. Kundu is a life member of the Indian Institute of
Chemical Engineers (IIChE).
Naga C. Kavuri is an M.Tech. student in the Department of Chemical
Engineering, NIT, Rourkela, Orissa, India. Presently, he is associated with
the Department of Civil Engg., NIT, Rourkela. E-mail: biochaitanya@gmail.com






TABLE 2
COEFFICIENTS OF THREE SISO TRANSFER FUNCTIONS FOR
THREE ARX MODELS DEVELOPED

                    Variable considered for ARX
Term  Coefficient   Temperature        RPM        pH
q^1   a1            -0.8985            -0.8985    -0.8931
      b1            -4.247 × 10^-15    --         --
q^2   a2            -0.1227            -0.109     -0.1227
      b2            -4.247 × 10^-15    --         --
q^3   a3            -0.03465           -0.02249   -0.03465
      b3            -4.247 × 10^-15    --         --
q^4   a4            -0.1638            -0.1856    -0.1638
      b4            -4.247 × 10^-15    --         --
q^5   a5            -0.08709           -0.0995    -0.08709
      b5            -4.247 × 10^-15    --         --
q^6   a6            -0.02045           -0.01641   -0.02045
      b6            -4.247 × 10^-15    -0.009252  --
q^7   a7            0.2004             0.2055     0.2004
      b7            --                 --         --
q^8   a8            0.03953            0.04895    0.03953
      b8            --                 --         --
q^9   a9            0.08341            0.07851    0.08341
      b9            --                 --         --
