Machine Learning Applied to Multiwell Test Analysis and Flow Rate Reconstruction

Chuan Tian and Roland N. Horne, Stanford University

Copyright 2015, Society of Petroleum Engineers

This paper was prepared for presentation at the SPE Annual Technical Conference and Exhibition held in Houston, Texas, USA, 28-30 September 2015.

This paper was selected for presentation by an SPE program committee following review of information contained in an abstract submitted by the author(s). Contents
of the paper have not been reviewed by the Society of Petroleum Engineers and are subject to correction by the author(s). The material does not necessarily reflect
any position of the Society of Petroleum Engineers, its officers, or members. Electronic reproduction, distribution, or storage of any part of this paper without the written
consent of the Society of Petroleum Engineers is prohibited. Permission to reproduce in print is restricted to an abstract of not more than 300 words; illustrations may
not be copied. The abstract must contain conspicuous acknowledgment of SPE copyright.

Abstract
Permanent downhole gauges (PDGs) can provide a continuous record of flow rate and pressure, which
provides rich information about the reservoir and makes PDG data a valuable source for reservoir
analysis. In previous work, it has been shown that kernel ridge regression based machine learning is a
promising tool to interpret pressure transients from a single PDG. Kernel ridge regression denoises and
deconvolves the pressure signal efficiently and recovers the full reservoir behaviors. In this work, the
machine learning framework was extended to two applications: multiwell testing and flow rate reconstruction.
The multiwell testing was formulated into the machine learning algorithm using a feature-coefficient-
target model. The features were nonlinear functions of flow rate histories of all the wells. For each well,
the features and the pressure target of this well were used to train its coefficients, which implicitly contain
the information about the reservoir model and well interactions. The reservoir model can then be revealed
by predicting the pressure corresponding to a simple rate history with the trained model. The multiwell
machine learning model was demonstrated to be useful in several different ways, including artificial
interference testing and reservoir pressure prediction at various well locations based on flow rate data.
Flow rate reconstruction aims at estimating any missing flow rate history by using available pressure
history. This is a very useful capability in practical applications in which individual well rates are not
recorded continuously. A set of new features was developed as functions of pressure to model the flow
rate. Coupled with kernel ridge regression, the developed features were tested on both synthetic and real
data sets and demonstrated high prediction accuracy. The success of the rate reconstruction modeling also
illustrates the flexibility of machine learning to different kinds of modeling, by adapting features and targets.
The models for both applications maintained the advantages of the machine learning based single well
pressure interpretation in terms of the accuracy of prediction, computational efficiency and tolerance to
noise. This work further demonstrates machine learning as a promising technique for PDG data analysis.
2 SPE-175059-MS

Introduction
A permanent downhole gauge (PDG) is a device installed permanently in a well to provide a continuous
record of pressure, temperature and sometimes also flow rate during production. Recently, PDGs have
become widely used in the industry. However, analyzing PDG data is challenging because of the inherent
characteristics of the data, including continuously variable flow rate, noise, and the large data volume.
Up to now, most effort on PDG data analysis has been for pressure transient analysis on single wells,
although there have also been some studies on temperature transient analysis and multiwell data analysis.
Recently, there have been some attempts to apply machine learning techniques for PDG data analysis, for
example Liu and Horne (2012, 2013a, 2013b). The fundamental idea is to learn the patterns behind PDG
data, where the patterns contain the reservoir information implicitly. Our previous study on single well
pressure analysis (Tian and Horne, 2015) showed that machine learning has the potential to handle the
complexities in PDG data analysis, and learn the reservoir model successfully.
In this work, we continued to utilize machine learning as the tool for investigation, but addressed two
different problems, namely, multiwell testing (multiwell pressure transient analysis) and flow rate
reconstruction. Both topics are important in PDG data analysis in practical engineering. We developed a
machine learning model for each problem and tested the models on different real and synthetic data sets.
The test results validated our developed approach, and illustrated the flexibility of the machine learning
framework to apply to different applications by adapting the features and the targets.

Literature Review
Machine Learning for PDG Data Analysis. Machine learning has been applied widely in different
areas in computer science to extract the useful patterns behind data. Recently, there have been some
studies to adapt machine learning to reservoir engineering, to interpret data recorded by PDGs. Liu and
Horne (2013a, 2013b) developed the convolution kernel method to interpret flow rate and pressure data
from PDGs. The convolution method was shown to be successful in denoising and deconvolving the
pressure signal without explicit breakpoint detection. Based on the work of Liu and Horne (2013a, 2013b),
Tian and Horne (2015) proposed kernel ridge regression for pressure deconvolution. Kernel ridge
regression was shown to speed up the computation and recover the early transient behaviors, while
maintaining the advantages of the convolution kernel method described by Liu and Horne (2013a, 2013b),
which was effective but expensive computationally. Tian and Horne (2015) also tested the kernel ridge
regression approach on pressure-temperature interpretation. These works demonstrated machine learning
as a promising tool for PDG data analysis. The current study expanded their use to multiwell testing and
flow rate reconstruction problems.
Multiwell Testing. As a field starts development, multiple active wells are expected and the behaviors of
those wells are likely to be impacted by each other. This is particularly true for PDG measurements,
because PDGs are installed permanently and record the well behaviors over the full life of the reservoir
development. Thus an extended approach to interpret PDG data for multiwell systems is required.
Mathematically, multiwell testing can be treated by superposition. Just as a multirate test
generates superposition in time, a multiwell test leads to superposition in space (Earlougher, 1977).
Superposition in both time and space makes multiwell test data even more complicated, and makes their
interpretation a very challenging research topic.
The simplest multiwell testing is the interference test, where one well (the active well) is put on
injection or production, and another well (the observation well) is shut in to monitor the pressure
changes caused by the first. The advantage of interference testing is that a greater volume of the reservoir
can be tested. In addition, both reservoir transmissivity ($kh$) and storativity ($\phi c_t h$) can be estimated from the
pressure response (Horne, 1995). The disadvantage is that pressure drops can be very small due to the
separation distance and may be hidden by other operational variations. To interpret interference tests, the
data are usually matched to some form of the exponential integral type curve. In reality, the true reservoir
model is likely to have a much more complex physical structure.
Flow Rate Reconstruction. Incomplete flow rate history is a common phenomenon in PDG measurements.
For the many PDGs in which only pressure is measured, it would be very useful if we could
back-calculate flow rate using the available pressure data, and estimate a rate history that gives us more
insight into the operation.
Nomura (2006) treated flow rate as a variable in his smoothing algorithm, which allowed for
uncertainty in both flow rate and pressure measurements. The algorithm was tested on a synthetic data set
where the flow rate was 10% off from the truth. After running the algorithm until convergence, Nomura
(2006) obtained a rate profile very close to the true flow rate. Although it looks promising, this method
required a good initial guess of the rate history for the algorithm to converge. In other words, the measured
flow rate should be similar to the true flow rate. However, if part of the rate history is completely missing,
the measured rate (which is zero) would be totally different from the true unknown rate. In that case,
Nomura's (2006) method could not produce an accurate rate estimation.
Ouyang and Sawiris (2003) proposed a theoretical development of calculating in-situ flow rate based
on permanent downhole pressure, where they assumed the steady-state flow of either a single phase or
well-mixed multiphase along the wellbore. They performed a field study applying such a calculation to
an offshore well, with a short and simple production profile. The estimated rate history was compared with
a PLT survey and showed a high consistency. However, the assumptions and simplifications in their
derivations limit the application to more complicated multitransient analysis.
The challenge of reconstructing flow rate based on pressure data is that most models treat flow rate as
the cause of pressure response, not the outcome. However, from a purely mathematical modeling standpoint,
estimating flow rate from pressure is straightforward, because the roles of feature and target can simply be
exchanged. That is why we propose to apply machine learning here, as it is a powerful tool for finding
patterns between variables.
Problem Statement
Based on the background information and literature review, the objectives of this research were set to be:
1. Extend the machine learning framework for pressure analysis on a single well to multiwell
systems. The framework should capture the well interference accurately, and be able to test a
greater area of the reservoir.
2. Develop a machine learning model to reconstruct the flow rate history by using pressure data.
3. Both models should maintain the advantages of the machine learning based single-well pressure
interpretation in terms of the accuracy of prediction, computational efficiency and tolerance to noise.

Revisit of Machine Learning Concepts
First, let us revisit the machine learning model developed for single well pressure transient analysis (Tian
and Horne, 2015), to become familiar with the machine learning context and some of the important concepts
that will be used in this paper.
In single well pressure transient analysis, the goal is to model the pressure $p$ (more accurately, the
pressure difference $\Delta p$ between the query point and the start point) using flow rate $q$ and time $t$. So pressure
is labeled as target $y = \Delta p$, and flow rate and time data are formulated into feature $x = [q, t]$. The goal in
machine learning is to find an appropriate model $f$ to describe the pattern between $x$ and $y$, namely, $y = f(x)$.
However, the intuitive feature $x = [q, t]$ was shown to be inadequate for the modeling, and more effort in
feature construction is required. We will review three machine learning techniques: linear regression,
kernel method and model regularization.
Linear Regression. First we redefine feature $x$ and target $y$ as follows:



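where $q^{(i)}$, $t^{(i)}$ and $\Delta p^{(i)}$ are the flow rate, time and pressure difference at the $i$-th data point, and $n$ is
the number of observations from the PDG. The feature formulation is constructed to reflect the physical
properties of pressure response as a function of flow rate and time (details in Tian and Horne, 2015). In
linear regression, the relation between the feature and the target is assumed to be linear, namely,
$f(x^{(i)}) = \theta^T x^{(i)}$. The best model is defined in terms of minimizing the squared error (Hastie et al.,
2009):

$$J(\theta) = \frac{1}{2}\sum_{i=1}^{n}\left(\theta^T x^{(i)} - y^{(i)}\right)^2 \qquad (3)$$

The value of $\theta$ that minimizes $J(\theta)$ is given in closed form (Hastie et al., 2009):

$$\theta = \left(X^T X\right)^{-1} X^T y \qquad (4)$$

where $X$ is an $n$-by-$p$ matrix:

$$X = \left[x^{(1)}, x^{(2)}, \ldots, x^{(n)}\right]^T \qquad (5)$$

and $y$ is an $n$-by-1 vector:

$$y = \left[y^{(1)}, y^{(2)}, \ldots, y^{(n)}\right]^T \qquad (6)$$
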
Having the solution in closed form is a considerable advantage, as the computation of the unknown
coefficients is extremely fast.
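As a minimal sketch of this closed-form solution, the pure-Python example below forms $X^T X$ and $X^T y$ and solves the normal equations by Gaussian elimination. The function names and the toy data are illustrative assumptions and do not come from the original work; a numerical library would normally replace the hand-written solver.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_linear_regression(X, y):
    """Return theta = (X^T X)^{-1} X^T y for an n-by-p feature matrix X."""
    n, p = len(X), len(X[0])
    xtx = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
           for r in range(p)]
    xty = [sum(X[i][r] * y[i] for i in range(n)) for r in range(p)]
    return solve_linear_system(xtx, xty)


def predict(X, theta):
    """Evaluate theta^T x for every row x of X."""
    return [sum(t * v for t, v in zip(theta, row)) for row in X]


# Toy, noise-free data (not from the paper): y = 2 + 3*x1 - 0.5*x2.
X = [[1.0, 0.0, 1.0], [1.0, 1.0, 0.0], [1.0, 2.0, 3.0],
     [1.0, 3.0, 1.0], [1.0, 4.0, 2.0]]
y = [2.0 + 3.0 * row[1] - 0.5 * row[2] for row in X]
theta = fit_linear_regression(X, y)
```

On this toy data the recovered `theta` should be close to the generating coefficients, because the data contain no noise and the three feature columns are linearly independent.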
Kernel Method. The kernel method can be applied as a tool for feature expansion when we want the
model to learn in a higher-dimensional space than $x$. The kernel on two features $x$ and $z$ is defined as the
inner product of the corresponding expanded features $\Phi(x)$ and $\Phi(z)$ (Shawe-Taylor and Cristianini, 2004):

$$K(x, z) = \Phi(x)^T \Phi(z) \qquad (7)$$

Now we apply the kernel method to map $x^{(i)}$ to $\Phi(x^{(i)})$ and $\theta$ to $\sum_{j=1}^{n} \beta_j \Phi(x^{(j)})$ (Shawe-Taylor and
Cristianini, 2004):

$$y^{(i)} = \sum_{j=1}^{n} \beta_j \Phi(x^{(j)})^T \Phi(x^{(i)}) = \sum_{j=1}^{n} \beta_j K(x^{(j)}, x^{(i)}) \qquad (8)$$

Define the kernel matrix $K_M$ such that $K_M(i,j) = K(x^{(i)}, x^{(j)})$, $i, j = 1, \ldots, n$ (Shawe-Taylor and Cristianini,
2004). Then we can further summarize our model into the matrix form:

$$y = K_M \beta \qquad (9)$$

The solution $\beta$ can be obtained easily using $\beta = K_M^{-1} y$. The kernel helps the algorithm to learn in the
space of $\Phi(x)$ without explicit representation of $\Phi(x)$. This property allows us to add more features
into our model with little cost. Once $\beta$ is obtained after training, the pressure prediction $y^{pred}$ can be calculated
given $x^{pred}$ calculated from the test data:

$$y^{pred} = \sum_{j=1}^{n} \beta_j K(x^{(j)}, x^{pred}) \qquad (10)$$

Model Regularization. Model regularization is a technique that reduces the prediction variance by
shrinking the parameter estimates (in our case, $\theta$ or $\beta$) to reduce the parameter space to only that
necessary. Generally there are two ways of model regularization: ridge regression and lasso (Hastie et al.,
2009). Both methods add some penalty to the cost function shown in Equation 3. Ridge regression adds
the penalty on the L2 norm of parameters:

$$J(\theta) = \frac{1}{2}\sum_{i=1}^{n}\left(\theta^T x^{(i)} - y^{(i)}\right)^2 + \frac{\lambda}{2}\|\theta\|_2^2 \qquad (11)$$

In Equation 11, the tuning parameter $\lambda$ controls the relative weights on a better fit and smaller
parameters (better generalization). Usually a ten-fold cross-validation is used to select an appropriate
value of $\lambda$ (Hastie et al., 2009). After choosing a value for $\lambda$, the parameters $\theta$ that minimize the cost function
in Equation 11 can be calculated (Hastie et al., 2009):

$$\theta = \left(X^T X + \lambda I\right)^{-1} X^T y \qquad (12)$$

Ridge regression can also be applied on the kernel framework, which is called kernel ridge regression.
In kernel ridge regression, the cost function is:

$$J(\beta) = \frac{1}{2}\|K_M \beta - y\|_2^2 + \frac{\lambda}{2}\beta^T K_M \beta \qquad (13)$$

The solution of parameters $\beta$ is given by (Shawe-Taylor and Cristianini, 2004):

$$\beta = \left(K_M + \lambda I\right)^{-1} y \qquad (14)$$

Lasso changes the penalty term from L2 norm to L1 norm of the parameters (Hastie et al., 2009). Lasso
can also be applied on both linear regression and kernel frameworks.
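Kernel ridge regression as described above can be sketched in a few lines. The example below builds the regularized kernel matrix, solves for the dual coefficients with a Cholesky factorization (valid because the kernel matrix plus a positive multiple of the identity is symmetric positive definite), and predicts at a new point. The Gaussian kernel, its `gamma` value, the tuning parameter and the toy data are all assumptions chosen for illustration; the original work does not specify them here.

```python
import math


def gaussian_kernel(x, z, gamma=1.0):
    """K(x, z) = exp(-gamma * ||x - z||^2); an assumed kernel choice."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def cholesky_solve(a, b):
    """Solve a @ x = b for a symmetric positive-definite matrix a."""
    n = len(b)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    # Forward substitution: low @ z = b.
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    # Back substitution: low^T @ x = z.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(low[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (z[i] - tail) / low[i][i]
    return x


def fit_kernel_ridge(X, y, lam, kernel=gaussian_kernel):
    """Return the dual coefficients (K_M + lam * I)^{-1} y."""
    n = len(X)
    k_reg = [[kernel(X[i], X[j]) + (lam if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    return cholesky_solve(k_reg, y)


def predict_kernel_ridge(X_train, beta, X_pred, kernel=gaussian_kernel):
    """Prediction: sum_j beta_j * K(x_j, x) for each query point x."""
    return [sum(b * kernel(x_j, x) for b, x_j in zip(beta, X_train))
            for x in X_pred]


# Toy data (not from the paper): samples of a smooth curve.
X_train = [[0.25 * i] for i in range(9)]
y_train = [math.sin(x[0]) for x in X_train]
beta = fit_kernel_ridge(X_train, y_train, lam=1e-3)
y_new = predict_kernel_ridge(X_train, beta, [[1.1]])
```

In practice the tuning parameter would be selected by cross-validation rather than fixed by hand as it is here.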

Multiwell Testing
In this section, we describe a machine learning framework to model pressure using flow rate data for
multiwell systems. Here we have Well $w$, with $w$ from 1 to $N$. The pressure of each Well $w$ is affected by its own
production (or injection) $q_w$ as well as the production (or injection) of the other wells $q_{\tilde{w}}$, where
$\tilde{w} = 1, \ldots, N$ and $\tilde{w} \neq w$. Our goal is to model the pressure response of a given well $p_w$ using the flow
rate data of all the wells in the system ($q_w$ and $q_{\tilde{w}}$).
The main challenge of multiwell testing is to model the well interactions caused by pressure superposition
in space. To address this issue, we applied feature expansion on the basis features developed for
single well pressure transient analysis (Equation 1). The expanded features are written as the combinations
of the basis features of all the wells in the system:

$$x^{(i)} = \left[\left(x_1^{(i)}\right)^T, \left(x_2^{(i)}\right)^T, \ldots, \left(x_N^{(i)}\right)^T\right]^T \qquad (15)$$

where $x^{(i)}$ represents the expanded features, and $x_w^{(i)}$ (given by Equation 1) represents the basis features
of Well $w$, $w = 1, \ldots, N$. $x_w^{(i)}$ describes the effect of the production (or injection) of Well $w$ on the pressure
change at the observation well. Thus, we expect the well interactions to be captured by using those
expanded features. For a system of $N$ wells, the dimension of the feature space expands from $p$ (feature
dimension for single well) to $N \cdot p$. Accordingly, the dimension of coefficients $\theta$ also expands to $N \cdot p$.
Because of such explicit feature expansion, model regularization is always recommended for solving
multiwell systems even without applying kernels. It should also be noted that the expanded features $x^{(i)}$ are
the same for each well.
A distinct machine-learning model is trained for each well, by using the expanded features and the
pressure of that well. Namely, a different $\theta_w$ is trained using $x^{(i)}$ and $y_w^{(i)}$:

$$y_w^{(i)} = \theta_w^T x^{(i)} \qquad (16)$$

where $\theta_w$ and $y_w^{(i)}$ are the coefficients and pressure of Well $w$ respectively. The intuition is that $x_w^{(i)}$
should have a small p-value (the probability of an observed or more extreme result assuming that there
is no relationship between the feature and the target) if Well w strongly affects the pressure at the
observation well. In other words, the well interaction is learned implicitly as we train the coefficients.
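The feature expansion can be illustrated with a short sketch. The `basis_features` function below is a hypothetical stand-in for Equation 1 (simple superposition sums over rate changes); the actual basis features are those of Tian and Horne (2015). What the sketch is meant to show is only the concatenation across wells, the resulting dimension $N \cdot p$, and the fact that the same expanded feature matrix is shared by all wells while each well has its own coefficients. The rate histories and the coefficient vector are made-up values.

```python
import math


def basis_features(times, rates, i):
    """Hypothetical single-well basis features at observation i.

    Assumes strictly increasing, positive times; rates[j] is the rate
    held up to times[j]. This is a stand-in for Equation 1, not the
    authors' exact feature set.
    """
    total, log_sum, lin_sum = 0.0, 0.0, 0.0
    for j in range(i + 1):
        dq = rates[j] - (rates[j - 1] if j > 0 else 0.0)  # rate change
        start = times[j - 1] if j > 0 else 0.0            # when it began
        elapsed = times[i] - start
        total += dq
        log_sum += dq * math.log(elapsed)
        lin_sum += dq * elapsed
    return [total, log_sum, lin_sum]


def expanded_features(times, rates_by_well, i):
    """Concatenate the basis features of every well (dimension N * p)."""
    row = []
    for rates in rates_by_well:
        row.extend(basis_features(times, rates, i))
    return row


def feature_matrix(times, rates_by_well):
    """Stack the expanded features of all observations into a matrix."""
    return [expanded_features(times, rates_by_well, i)
            for i in range(len(times))]


def predict_pressure(X, theta_w):
    """Pressure of one well from the shared features and its own theta_w."""
    return [sum(t * v for t, v in zip(theta_w, row)) for row in X]


times = [1.0, 2.0, 3.0, 4.0]
rates_by_well = [
    [100.0, 100.0, 50.0, 50.0],  # Well 1
    [0.0, 80.0, 80.0, 80.0],     # Well 2
]
X = feature_matrix(times, rates_by_well)  # 4 observations, 2 * 3 features

# Made-up coefficients for one well; in practice they would be trained,
# for example by ridge regression on that well's pressure record.
theta_1 = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
p_1 = predict_pressure(X, theta_1)
```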
With a new set of flow rate histories, a new feature matrix $X^{pred}$ can be developed using Equation 5.
Then the pressure prediction $y_w^{pred}$ for each Well $w$ is given by:

$$y_w^{pred} = X^{pred} \theta_w \qquad (17)$$

Flow Rate Reconstruction

The absence of flow rate measurements is commonly seen in modern PDGs, for at least part (and
sometimes all) of the time. It would be very helpful if we could reconstruct the missing rates using the
available pressure data, to provide a more complete description of the operation. Because our goal is to
estimate the flow rate based on pressure data, flow rate is treated as the target and pressure is used as the
feature. As discussed in the Literature Review, the challenge here is that the current approaches treat pressure as
a function of flow rate (and time) rather than the reverse. To deal with this, we propose to apply
machine learning to establish a reverse relationship from pressure to flow rate. We constructed the features
mapping pressure to flow rate based on the features mapping flow rate to pressure in Equation 1. The
details of the mathematical derivations are not covered here, but the principal idea is that flow rate can be
expressed using features in the form of pressure convolution:

After defining the features, we may apply kernel ridge regression following the same solving procedure
as described under Model Regularization.

Results and Analysis

Multiwell Testing
Case 1: Artificial Interference Testing. In this case, we have two wells producing in a homogeneous
reservoir. The flow rate and pressure of Well 1 are shown in Figure 1 (a) and (b), and the flow rate and
pressure of Well 2 are shown in Figure 1 (c) and (d). The pressure response at each well was designed to be
affected by the production of the other well. However, due to the nature of interference, the small
magnitude of pressure change caused by the other well is mostly hidden by the dominant pressure change
at the well itself. That makes it difficult for the machine-learning algorithm to differentiate the pressure
contributions from other wells.
Figure 1. Machine learning results on Case 1. (a) and (c) show the true flow rate (blue) and the noisy training rate (circle) of Well 1 and
Well 2; (b) and (d) show the true pressure (blue), noisy training pressure (circle) and the pressure prediction (red) corresponding to true
flow rate of Well 1 and Well 2; (e) shows the flow rate of Well 1 during the interference test (Well 2 is shut in); (f) shows the comparison
of true pressure (blue) and pressure prediction (red) of Well 2 during the interference test.
The flow rate and pressure of both wells were augmented with artificial noise before training. After
training the model, we created an artificial interference test by computing the response that would occur
if we were to shut in Well 2 while keeping Well 1 producing at constant rate, as shown in Figure 1 (e).
Namely, Well 1 was used as an active well and Well 2 was used as an observation well. Figure 1 (f) shows
the comparison of the pressure prediction at Well 2 and the true data. Although the pressure change in the
interference test is much smaller in comparison with the training case, the algorithm still captured the
trend of this small pressure interference.
The example of interference testing demonstrates the ability of our multiwell machine learning model
to learn the well interactions, even though the magnitude of pressure interference is small. The idea of
artificial interference testing is also useful because it adds no cost to field
operations: the observation well is shut in only in the computer program, not in the field. If we have PDG data
from multiple wells, an artificial interference test can be implemented easily by picking one active well
and observing the pressure response at other wells.
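The procedure for an artificial interference test can be sketched as follows: build rate histories in which one active well produces at constant rate and all other wells are shut in, then query the trained model for the pressure at the observation well. In this sketch `predict_fn` is an assumed wrapper around a trained multiwell model, and `toy_predict` is a made-up stand-in used only so that the example runs; neither is part of the original work.

```python
import math


def interference_schedule(n_steps, n_wells, active_well, active_rate):
    """Rate histories with one well at constant rate and the others shut in."""
    return [
        [active_rate if w == active_well else 0.0 for _ in range(n_steps)]
        for w in range(n_wells)
    ]


def run_artificial_interference_test(predict_fn, times, n_wells,
                                     active_well, observation_well,
                                     active_rate):
    """Predict the observation-well pressure for a simulated test.

    predict_fn(times, rates_by_well, well) is assumed to wrap a trained
    multiwell model and return the pressure change history of `well`.
    """
    rates_by_well = interference_schedule(
        len(times), n_wells, active_well, active_rate)
    return predict_fn(times, rates_by_well, observation_well)


def toy_predict(times, rates_by_well, well):
    """Made-up stand-in for a trained model: a logarithmic response
    scaled by the summed rates of the other wells."""
    return [
        0.01 * math.log(1.0 + t)
        * sum(r[i] for w, r in enumerate(rates_by_well) if w != well)
        for i, t in enumerate(times)
    ]


times = [0.0, 1.0, 2.0, 4.0]
response = run_artificial_interference_test(
    toy_predict, times, n_wells=2,
    active_well=0, observation_well=1, active_rate=100.0)
```

Because only the input rate histories change, the same trained model can be reused with any well chosen as the active well.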
Case 2: Pressure Forecast. Here we add further complexities to the machine-learning task by introducing
a greater number of wells and the presence of two-phase flow. Our goal is to forecast the pressure
given flow rate controls at various well locations. Four production wells are located at the corners of a
homogeneous reservoir and an injection well sits in the center. The flow rate control and pressure response
of two producers and the injector are shown in Figure 2.
Figure 2. Machine learning results on Case 2. (a) and (b) show the flow rate and pressure of Producer 1; (c) and (d) show the flow rate
and pressure of Producer 2; (e) and (f) show the flow rate and pressure of the injector. Data to the left of the dashed line were used for
training and the data to the right were used for validation.
As discussed earlier, the key idea in multiwell testing is to expand the feature dimension to include the
contributions from multiple wells. Adding more wells only means adding more features. Besides feature
expansion, no other treatment was required to account for the additional number of wells. To address the
two-phase issue, we replaced the oil rate by total liquid rate when constructing the features. That reflects
the physics that the pressure response is caused by the flow of both phases. Hold-out validation was
applied by using data on the left of the dashed line in Figure 2 for training, and the remaining part of the
data for validation. The results show an accurate match to the training set as well as a good prediction on
the validation set, although we did observe a deviation in pressure prediction for the injector (Figure 2 (f)).
Thus we conclude that our machine-learning framework does have the flexibility to work on multiwell
systems with two-phase flow.

Flow Rate Reconstruction

Case 3: Synthetic Data. In this case, we trained the machine learning model using pressure and flow rate
data with artificial noise as shown in Figure 3 (a) and (b). It should be noted that the direction of the
modeling was different although the training data were still pressure and flow rate. After training, two
pressure histories were used as input to generate the flow rate predictions. The comparison of the
predictions and the true flow rate that was used to generate the input pressure is shown in Figure 3 (c) and
(d). We observe a high agreement between predictions and true data in both cases, although the zig-zag
flow rate looks very different from the training rate. That demonstrates the ability of our method to
generalize well to unknown data.
Figure 3. Machine learning results on Case 3. (a) and (b) show the noisy training pressure and flow rate data respectively; (c) shows
the flow rate reconstruction corresponding to stepwise pressure in (a); (d) shows the flow rate reconstruction corresponding to zigzag

Case 4: Real Data. Our machine-learning approach for flow rate reconstruction was further tested on a
real PDG data set. In the training step, part of the flow rate data was hidden to mimic the situation of
missing measurements (Figure 4 (a) and (b)). After training, the complete pressure history was used as
input to reconstruct the missing flow rate. Figure 4 (d) shows the comparison of reconstructed flow rate
(red) and the true measurements. We observe a good agreement between the two. Case 4 illustrates how
the developed model can be useful in real life. Usually we have continuous pressure measurements from
the PDG, but parts of the flow rate measurement are missing or may be unreliable. In that case, as long
as we have at least one period of consistent pressure and flow rate measurements to learn the pattern
between the two, we are able to apply the flow rate reconstruction technique to create an estimate of the
missing flow rates.
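The workflow described in Case 4 (train on the period where both pressure and rate are available, then predict the rate wherever it is missing) can be sketched as below. The model here is a deliberately simple stand-in: a single-feature least-squares line from pressure drop to rate, fitted in closed form. It is not the pressure-convolution feature set or the kernel ridge regression used in this work, and the data are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def reconstruct_rates(pressure_drop, measured_rate):
    """Fill gaps (None) in measured_rate with a model trained on the overlap.

    Measured values are kept as they are; only missing entries are
    replaced by model predictions.
    """
    pairs = [(dp, q) for dp, q in zip(pressure_drop, measured_rate)
             if q is not None]
    intercept, slope = fit_line([dp for dp, _ in pairs],
                                [q for _, q in pairs])
    return [q if q is not None else intercept + slope * dp
            for dp, q in zip(pressure_drop, measured_rate)]


# Made-up record: continuous pressure, rate missing in the middle.
pressure_drop = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
measured_rate = [40.0, 80.0, None, None, 200.0, 240.0]
rates = reconstruct_rates(pressure_drop, measured_rate)
```

Swapping `fit_line` for a richer model changes only the training and prediction calls; the masking of missing measurements stays the same.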
Figure 4. Machine learning results on Case 4. (a) and (b) show the training pressure and flow rate with part of history missing; (c)
shows the complete pressure history; (d) shows the reconstructed flow rate using pressure input in (c).

In short, a new set of features was developed to model flow rate using pressure measurements.
Coupled with the kernel ridge regression framework, the features were tested on both synthetic and real
data sets, and showed promising performance. The results also indicate another property of machine
learning: the flexibility in direction of modeling by adapting features and targets.
Conclusions
This work leads to three main conclusions.
1. The machine-learning approach for PDG pressure analysis was extended from single-well to
multiwell systems. The multiwell model was shown to be able to capture the well interactions
accurately, and differentiate the pressure contributions from various wells. The multiwell model
was tested in two promising applications: artificial interference test creation without affecting field
operation, and pressure forecasting for multiwell systems with given rate control.
2. A machine-learning model was developed for flow-rate reconstruction with new definitions of
features and targets. The model was tested on both synthetic and real data, and showed promising
performance. Our machine-learning model provides an effective alternative approach to estimate
the missing flow rate compared with earlier optimization or analytical solution approaches,
because those approaches require an assumption of the reservoir model.
3. This work demonstrates the flexibility of machine learning in solving different applications.
Pressure transient analysis, multiwell testing and flow rate reconstruction each have their own
complex physical processes. Machine learning treats these different applications using the same
solution procedure, by adapting features and targets for each application. This work further
demonstrates machine learning as a promising technique for PDG data analysis.

Acknowledgments
We would like to thank the Smart Fields Consortium for the financial support that made this research possible.

Nomenclature
$K_M$ = kernel matrix
$K(x,z)$ = kernel on two vectors $x$ and $z$
$n$ = number of observations
$N$ = number of wells
$p$ = dimension of features
$q$ = flow rate
$q_w$ = flow rate of well $w$
$q_{\tilde{w}}$ = flow rate of well $\tilde{w}$
$t$ = time
$w$ = well number
$x^{(i)}$ = features for observation $i$
$x_w^{(i)}$ = features for observation $i$ that come from well $w$
$X$ = feature matrix
$X^{pred}$ = feature matrix for prediction
$y$ = target vector
$y_w^{(i)}$ = target for observation $i$ that comes from well $w$
$y^{pred}$ = target vector for prediction
$\beta$ = parameters (coefficients) for features $\Phi(x)$
$\theta$ = parameters (coefficients) for features $x$
$\lambda$ = tuning parameter
$\Phi(x)$ = expanded features on original features $x$
$\Delta p$ = pressure change

References
Earlougher, R. C. Jr. 1977. Advances in Well Test Analysis. Society of Petroleum Engineers.
Hastie, T., Tibshirani, R., and Friedman, J. 2009. The Elements of Statistical Learning, second edition. Springer.
Horne, R. N. 1995. Modern Well Test Analysis, second edition. Petroway.
Liu, Y. and Horne, R. N. 2012. Interpreting Pressure and Flow-Rate Data From Permanent Downhole
Gauges by Use of Data-Mining Approaches. SPE Journal, 18 (1): 69-82.
Liu, Y. and Horne, R. N. 2013a. Interpreting Pressure and Flow Rate Data from Permanent Downhole
Gauges Using Convolution-Kernel-Based Data Mining Approaches. Presented at the SPE Western
Regional & AAPG Pacific Section Meeting 2013 Joint Technical Conference, Monterey, California,
USA, 19-25 April. SPE-165346-MS.
Liu, Y. and Horne, R. N. 2013b. Interpreting Pressure and Flow Rate Data from Permanent Downhole
Gauges with Convolution-Kernel-Based Data Mining Approaches. Presented at the SPE Annual
Technical Conference and Exhibition, New Orleans, Louisiana, USA, 30 September-2 October.
Nomura, M. 2006. Processing and Interpretation of Pressure Transient Data from Permanent Downhole
Gauges. PhD thesis, Stanford University.
Ouyang, L. B. and Sawiris, R. 2003. Production and Injection Profiling: A Novel Application of
Permanent Downhole Gauges. Presented at the SPE Annual Technical Conference and Exhibition,
Denver, Colorado, USA, 5-8 October. SPE-84399-MS.
Shawe-Taylor, J. and Cristianini, N. 2004. Kernel Methods for Pattern Analysis. Cambridge University
Press.
Tian, C. and Horne, R. N. 2015. Applying Machine Learning Techniques to Interpret Flow Rate,
Pressure and Temperature Data from Permanent Downhole Gauges. Presented at the SPE Western
Regional Meeting, Garden Grove, California, USA, 27-30 April. SPE-174034-MS.