Hegde 2017

Accepted Manuscript
Use of machine learning and data analytics to increase drilling efficiency for nearby
wells
Chiranth Hegde, K.E. Gray
PII: S1875-5100(17)30064-1
DOI: 10.1016/j.jngse.2017.02.019
Reference: JNGSE 2072
To appear in: Journal of Natural Gas Science and Engineering
Received Date: 19 July 2016

Revised Date: 2 February 2017
Accepted Date: 6 February 2017
Please cite this article as: Hegde, C., Gray, K.E., Use of machine learning and data analytics to increase
drilling efficiency for nearby wells, Journal of Natural Gas Science & Engineering (2017),
doi: 10.1016/j.jngse.2017.02.019.
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to
our customers we are providing this early version of the manuscript. The manuscript will undergo
copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please
note that during the production process errors may be discovered which could affect the content, and all
legal disclaimers that apply to the journal pertain.
Use of Machine Learning and Data Analytics to Increase Drilling Efficiency for Nearby Wells
Chiranth Hegde and K.E. Gray
Petroleum and Geosystems Engineering, The University of Texas at Austin
Abstract:
Data-driven models can be used as an efficient proxy to model complex concepts in engineering. It is
common engineering practice to optimize controllable input parameters in a model to increase the
efficiency of operations. Machine learning can be used to predict the rate of penetration (ROP) during
drilling with high accuracy, as shown by Hegde, Wallace, and Gray (2015). This paper illustrates the use
of machine learning to predict and increase ROP. The machine learning model is first used to predict
ROP from input parameters such as weight on bit (WOB), rotations per minute of the drill bit (RPM),
and flow rate of the drilling mud. The input parameters are then modified to increase ROP. This
process has been applied to field drilling data from a vertical well consisting of different rocks and
formations. The procedure can be used to determine the maximum achievable ROP in each formation
and to map out operational guidelines for drilling pad wells. A post-drilling analysis can be conducted
for pad wells to cut costs and save time while drilling. The model is novel in that only
surface-measured parameters are used, without a priori requirements for geological, laboratory, or
drilling data.
Keywords: Machine Learning, Drilling Parametrics, ROP, Data Analytics, Drilling Optimization
1. Introduction
Drilling accounts for a significant part of oil and gas budgets; hence any time-saving measure directly
reduces costs. Rate of penetration (ROP) is a direct measure of the time taken to drill a well, apart
from other time components such as trips, bit changes, and downtime. Controlling ROP can therefore be
extremely important in drilling, and maximizing ROP is one form of drilling optimization that reduces
drilling time. The ultimate form of drilling optimization would be to optimize costs, minimizing all
contributing elements. This paper is a step towards that optimization, covering prediction and
maximization of ROP during drilling.
1.1 Traditional ROP Models
Prediction of ROP and its improvement have been the subject of much past research.
Drilling models have been improved to incorporate advances such as bit technology, drilling in
unconventional reservoirs, or the introduction of more parameters. Most of these models are based on
the physics of drilling, with empirical coefficients to incorporate changes in lithology, geology, and
other factors not readily measured. Empirical coefficients are determined and adjusted as the well is
drilled, so the required data are acquired while drilling individual or pad wells. For some models the
upper and lower bounds of the empirical coefficients are based on physical reasoning, but for other
models this range must be determined by engineering judgement.
An ROP model developed by Maurer (1962) applied a rock cratering approach to develop an
ROP formula for roller-cone bits. The parameters included weight-on-bit (WOB), rotary speed (RPM),
bit diameter, and rock strength (unconfined compressive strength, UCS). In addition to a theoretical
basis for this model, an empirical coefficient was adopted to account for the formation being drilled.
An important concept used by Maurer was rock floundering: beyond some WOB there was no improvement in
ROP, owing to a reduction in hole cleaning. The accumulation of cuttings makes it harder to clean the
bit, thereby reducing ROP.
An early model for ROP prediction was introduced by Bingham (1965), using only parameters of
weight-on-bit (WOB), rotations per minute (RPM), and bit diameter. Eckel’s model (Eckel, 1967)
incorporated the effects of drilling mud. He used a Reynolds number function to correlate ROP with mud
properties, and ROP was shown to increase as mud viscosity is reduced.
Bourgoyne and Young (1974) introduced one of the most comprehensive ROP models with
additional parameters to include several physical and geological aspects involved in drilling. The model
contains eight parameters: formation strength, normal compaction trend, under compaction, differential
pressure, bit diameter and bit weight, rotary speed, tooth wear, and bit hydraulics.
A model introduced by Walker et al. (1986) utilized tri-axial rock strength tests and a Mohr-
Coulomb failure criterion to develop a roller cone ROP equation dependent on WOB, borehole pressure,
rock porosity, average grain size, and formation compressive strength. Warren (1987) developed a model
which separated the effects of drilling into physical breakage of the rock and hole cleaning. Winters et al.
(1987) added a fourth term to the Warren (1987) equation to include rock ductility.
Hareland and Rampersad (1994) introduced a drag bit model which was later modified by
Motahari et al. (2010). Motahari et al. (2010) discussed a PDC (polycrystalline diamond compact) bit
model in which positive displacement motors (PDMs) were taken into account. Drag bit models are of
increased interest since most drilling today utilizes PDC bits.
These traditional models contain empirical constants which are formation dependent. The empirical
constants have to be determined for a formation using field data. The accuracy of these models is fairly
low. Moreover, a major limitation is that they are not adaptive, i.e., they cannot adapt quickly to new
lithology and other changes (they need large amounts of field data for training compared to machine
learning models). For example, if the lithology changes, both the traditional models and the machine
learning models require some training data, but the traditional models always require more. In the case
of a recurring lithology, neither the traditional nor the machine learning models require retraining for
ROP predictions. However, it is recommended to retrain the models (especially the traditional models) to
ensure higher accuracy. The traditional models generally require retraining because the results obtained
are not very accurate. In the case of machine learning models, the older model (based on a similar
lithology) will give fairly accurate ROP predictions and retraining is not strictly required (although
retraining will provide better results in both cases). A recurring lithology, a change in wellbore
trajectory, or a change in rock lithology, for example, severely reduces accuracy in the case of
traditional models. Empirical parameters cannot be generalized for these models over an entire well,
since this would lead to extremely high prediction errors (Wallace, Hegde and Gray, 2015).
1.2 Predictive Models

Some ROP models have included machine learning and nonlinear mathematical models, which are
more general than traditional models. Neural networks, a class of nonlinear statistical models, were
used for predicting ROP by Bilgesu et al. (1997). This technique was explored further with different
input parameters by Jahanbakhshi and Keshavarzi (2012). Dunlop et al. (2011) created a model
with two input parameters, RPM and WOB, as the basis of an optimization algorithm which resulted in
increased ROP. Hegde, Wallace, and Gray (2015a) used several machine learning techniques to predict ROP
during drilling in a given formation. Other work includes simple statistical methods by Hegde, Wallace,
and Gray (2015b) to infer parameters rather than predict them. Wallace, Hegde, and Gray (2015) developed
a roadmap to incorporate this statistical model into real-time drilling operations. The present work
uses a similar framework to mathematically illustrate ROP optimization.
2. Data Management, Visualization and Validation
Data from a vertical well are used to validate the Wider Windows Statistical Learning Model (WWSLM) 1,
first for ROP prediction and then for ROP optimization. WWSLM 1 uses predictors such as WOB, RPM, flow
rate, and UCS of the rock as input variables, which are then utilized to ‘train’ a machine learning based
predictor for ROP. Input parameters are user-selected, and the resulting accuracy will depend on the
specific input parameters used. Increasing the number of relevant input parameters may yield a model
with higher accuracy. The examples shown in this paper are based on models created using surface data
collected while drilling. Other variables such as mud properties, drill string configuration, logs, and
bottom-hole assemblies were not included, but they could be. Basic requirements for WWSLM 1 are minimal,
user-friendly, and rig adaptable (Hegde, Wallace, and Gray, 2015a).
2.1 Data Exploration and Feature Selection

Since this project’s aim is to predict ROP, this section introduces field data used for prediction and
improvement of ROP. Field collected drilling data from one vertical well was utilized for validation of the
models (shown in Figure 1). The data contains some anomalies in ROP measurements, such as the abrupt
change in ROP at depths of 7200 ft, 8550 ft and 9150 ft. These outliers were removed.
Data exploration is important because the machine learning models depend on the data. A pairs plot
(Figure 2) can be used to determine correlations between different parameters in the data. These
correlations between input parameters to the model will facilitate model construction, selection of
important features, and feature engineering. Figure 2 shows a pairs plot for data collected in limestone
rock. The advantage of a pairs plot is the simultaneous plotting of the ROP against some of the input
features in the model. This provides a bird’s eye view of the correlation between the input and output
parameters.
Figure 1: ROP vs Depth Plot for Field Data

Figure 2: Pairs Plot for Limestone Rock

Each plot in Figure 2 occupies a “window”, which can be numbered for easy evaluation. Numbering is
similar to matrix indexing, i.e., window (i,j) represents the window in the ith row and jth column. Each
window plots two parameters against each other, and its X and Y axes carry the units of the
corresponding parameters. For example, window (1,2) plots depth on the x-axis and ROP on the y-axis.
Window (2,1) displays the correlation of the variables plotted in window (1,2). One can thus inspect the
input features and their pairwise correlations, which assists in feature engineering (covered in
section 3.2). An analysis of the pairs plot can result in discarding some input features because of low
correlation to the target or because of redundancy. For example, a perfect or very high correlation
between two variables would be sufficient evidence to drop one of the two features on account of
redundancy.
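The redundancy check described above can be sketched numerically. The snippet below is only an illustration: the Pearson coefficient, the 0.95 threshold, the feature names, and the values are all assumptions made for this example and are not taken from the field data or from the authors' workflow.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def redundant_pairs(features, threshold=0.95):
    """Return feature-name pairs whose absolute correlation meets the threshold."""
    names = sorted(features)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(features[a], features[b])) >= threshold:
                pairs.append((a, b))
    return pairs

# Made-up illustrative values, not field data.
features = {
    "wob": [10.0, 12.0, 14.0, 16.0, 18.0],
    "rpm": [60.0, 80.0, 70.0, 90.0, 65.0],
    "wob_scaled": [20.0, 24.0, 28.0, 32.0, 36.0],  # exactly 2 * wob, hence redundant
}
print(redundant_pairs(features))
```

Any pair returned by `redundant_pairs` is a candidate for dropping one of its two members; which one to keep remains an engineering decision.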
2.2 Data Management

Data have been divided into different sets to avoid overfitting. The training set is the data used to
fit the algorithm, the validation set is the data required for fine tuning the model, and the test
set is the blind set used for evaluation of errors. In real-time drilling applications, machine
learning is used to predict ROP across different formations. Earlier, machine learning was used to
predict drilling parameters such as ROP (Hegde, Wallace, and Gray, 2015a) and torque (Hegde, Wallace,
and Gray, 2015c) inside a given formation. The work presented here, however, tunes the machine learning
model such that predictions of ROP can be made continuously across different formations as depth
increases. In these cases (a real-time scenario) the training set would be the data up to a given depth
and the test set would be a finite depth interval thereafter. For example, a model trained on data from
0-2500 ft would be tested against data from 2500-2600 ft, which would constitute the test set.
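A minimal sketch of the depth-based split from the 0-2500 ft example follows. The function name `real_time_split`, the 100 ft look-ahead default, and the 50 ft sampling interval are assumptions made for illustration only.

```python
def real_time_split(depths, bit_depth, lookahead=100.0):
    """Split sample indices into a training set (all data drilled so far)
    and a test set (the next `lookahead` feet below the bit)."""
    train = [i for i, d in enumerate(depths) if d <= bit_depth]
    test = [i for i, d in enumerate(depths)
            if bit_depth < d <= bit_depth + lookahead]
    return train, test

# Hypothetical depth log sampled every 50 ft from surface to 3000 ft.
depths = [float(d) for d in range(0, 3001, 50)]
train_idx, test_idx = real_time_split(depths, bit_depth=2500.0)
print(len(train_idx), [depths[i] for i in test_idx])
```

As the bit advances, calling the function again with a deeper `bit_depth` grows the training set and moves the test interval down the hole.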
2.3 Model Assessment
Accuracy of the model is measured using the root mean squared error (RMSE). This serves as a
measurement of the error when the model is tested against actual data. RMSE is a good method of model
evaluation as the unit of the error remains the same as the unit of the measured value. This facilitates
engineering interpretation.
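\[
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}
\]
where n is the number of test points, y_i is the measured value, and \hat{y}_i is the predicted value at point i.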
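As a small worked illustration of the RMSE used for model assessment, the sketch below computes it for three made-up ROP values in ft/hr. The numbers are assumptions for the example, not results from the paper.

```python
import math

def rmse(measured, predicted):
    """Root mean squared error; the result has the same unit as the inputs."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up ROP values in ft/hr; errors are 3, 4 and 0 ft/hr.
measured_rop = [50.0, 60.0, 70.0]
predicted_rop = [53.0, 56.0, 70.0]
print(rmse(measured_rop, predicted_rop))  # sqrt(25 / 3), roughly 2.89 ft/hr
```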
2.4 Cross Validation

Overfitting is a common phenomenon associated with statistical and machine learning
models. An example of overfitting would be a model that returns a seemingly perfect result. This happens
when the model is tested on the same data that was used to build it (i.e., the training data). To avoid
overfitting, common practice is to use a test set along with the training set to evaluate the results
of the model. Cross validation is also used in the selection of tuning parameters. K-fold cross
validation splits the training set into K parts. Assuming K=5, the training set would be split into
five parts, with one part used as a validation set for the remaining four, i.e., 4 parts training and
1 part validation. This process is repeated until all parts of the data have been used for training
and validation.
This process can be computationally intensive. If the training set is large, this approach can be
avoided to increase the computational efficiency of the model. Another similar method commonly
employed is leave-one-out cross validation (LOOCV), where all but one data point are used for
training and the model is tested on the single left-out data point. This process is repeated until all
points have been tested (i.e., all points have been left out once).
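The splitting logic of K-fold cross validation can be sketched as follows. This is a generic illustration under assumed names (`k_fold_indices`, a fixed shuffle seed) and is not the authors' implementation; setting `k` equal to the number of samples gives the leave-one-out case.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, validation) index lists for K-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # randomize fold membership
    folds = [indices[i::k] for i in range(k)]  # k roughly equal parts
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

# K = 5 on 20 samples: each round trains on 16 points and validates on 4.
splits = list(k_fold_indices(20, k=5))
# Leave-one-out is the special case k == n_samples.
loocv = list(k_fold_indices(6, k=6))
print(len(splits), len(loocv))
```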
3. Machine Learning Techniques
This section describes the machine learning techniques applied for prediction. Linear
and several nonlinear techniques for prediction have been extensively covered by Hegde (2016), with
applications of various regression techniques for predicting ROP. This section discusses the application
of random forests to the entire well, irrespective of lithology or formation. The basics of the random
forest algorithm and the process of building a model are covered in depth by Hegde, Wallace and Gray
(2015a) and Hegde (2016).

3.1 Random Forests

Decision trees (Quinlan, 1986) are the building blocks of nonlinear prediction techniques. They capture
non-linearity in the data but suffer from high error, high variance, and overfitting. These problems can
be mitigated by using random forests. Random forests implement bootstrapping (Efron and Efron, 1982) to
create a large number of samples B, with one decision tree built on each sample. In addition, at each
node of a tree only a random sample of the features is considered for the split. This has the effect of
de-correlating the trees, which reduces variance and improves prediction accuracy. Because each tree is
forced to use a small number of predictors, features which may not have a high bearing on
the end result are also forced to contribute to the tree.
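The two sources of randomness described above (bootstrap sampling of rows and random selection of candidate features at a node) can be illustrated with a short sketch. This is not a full random forest; the function names, the number of trees, and the choice of two candidate features per node are assumptions for the example.

```python
import random

def bootstrap_sample(n_rows, rng):
    """Draw n_rows row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]

def random_feature_subset(feature_names, n_candidates, rng):
    """Pick, without replacement, the features considered at one tree node."""
    return rng.sample(feature_names, n_candidates)

rng = random.Random(42)
feature_names = ["wob", "rpm", "flow_rate", "ucs"]
n_trees = 3  # plays the role of B in the text; practical forests use far more
for tree in range(n_trees):
    rows = bootstrap_sample(10, rng)                          # data seen by this tree
    candidates = random_feature_subset(feature_names, 2, rng)  # features tried at one node
    print(tree, rows, candidates)
```

In a complete implementation each tree would be grown on its bootstrap sample, drawing a fresh feature subset at every node, and the forest prediction would be the average of the tree predictions.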
A random forest is built on the training data as a function of selected input
parameters. These input parameters have a two-fold requirement: they have to be correlated to the ROP,
and they have to be physically meaningful. Correlation of input parameters can be assessed using a pairs
plot (Figure 2). The physical effect of the input parameters on ROP can be derived from traditional
models. The process of applying this domain expertise in the selection of input parameters is referred
to as feature engineering in the machine learning world, and is covered in greater depth in the
following section.
ROP is predicted using RPM of the bit, WOB, UCS of the rock, and flow rate as input features. Inside a
given formation, random forests predict ROP with low error, as shown in Figure 3. Figure 3a compares ROP
predictions of random forests to linear regression in a given sandstone formation. It is evident that
the random forest prediction fits the data better. Fit to the data is generally described with an R²
value; R² was 0.96 for random forests and 0.42 for linear regression. RMSE using the random forest
algorithm was 7.36 ft/hr, less than half of the RMSE for linear regression (18.43 ft/hr). Figure 3b
consists of a box-plot which summarizes the accuracy of random forests and linear regression. The
box-plot has been used to compare the normalized error (error percentage) at each point. The mean error
for random forests is around 5%, compared with a normalized error of 14% for linear regression.
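The two accuracy measures quoted above can be computed as in the sketch below. The R² formula is the standard coefficient of determination; the normalized error is assumed here to be the absolute error divided by the measured value, which is one plausible reading of "error percentage" and may differ from the authors' exact definition. The data are made up.

```python
def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

def mean_normalized_error(measured, predicted):
    """Mean absolute error as a percentage of the measured value (assumed definition)."""
    ratios = [abs(m - p) / m for m, p in zip(measured, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

# Made-up ROP values in ft/hr, not the paper's data.
measured = [40.0, 50.0, 60.0]
predicted = [44.0, 50.0, 57.0]
print(r_squared(measured, predicted))              # 1 - 25/200 = 0.875
print(mean_normalized_error(measured, predicted))  # about 5 percent
```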
Figure 3: Comparison of random forests and linear regression for ROP prediction. Figure 3a (left) plots
Depth against ROP prediction for both methods. Figure 3b (right) is a box-plot which summarizes the
normalized errors of random forest and linear regression for ROP predictions in Tyler sandstone.
The same algorithm with the same input features was used to predict ROP throughout the depth of the
well, as illustrated in Figure 4. The prediction accuracy varies with depth; however, the error rarely
exceeds 10% of the average ROP in a given formation.
Figure 4: ROP Prediction using Random Forests for the entire length of the well
3.2 Feature Selection and Engineering

Feature selection is the selection of input variables which are then fed into a machine learning
algorithm for prediction of a target. In this case, the input variables, i.e., surface-measured
parameters, are the features of the model. These surface-measured parameters are fed into the random
forest algorithm as described in the previous section for prediction of ROP. Assembling features from
raw data in this way is termed feature selection, and it requires manual input. The selection of
features, and the features themselves, have a huge bearing on the success of the model in question.
Selection of features can decrease computational time and increase accuracy if performed correctly.
Details of feature selection are given in the detailed guide by Guyon and Elisseeff (2003).
Feature engineering is a term associated with crafting existing features in a manner such that maximum
accuracy is obtained by the machine learning algorithm. In simple terms, it ensures that data are encoded
such that it is relatively easy for a machine learning algorithm to achieve good results. Feature
engineering is highly domain specific since it requires specific knowledge of the domain. It is also
model dependent and may work differently with different algorithms, and it depends on the data
involved, since different data might not be based on the same physical concept. For
example, prediction of torque using neural nets can call for different engineered features or parameters
than prediction of ROP using random forests. Good feature engineering should increase the performance of
the machine learning algorithm. This section applies feature engineering to the
available data and features used in the previous subsection in an attempt to increase accuracy and speed
of prediction using random forests.
The input features in this paper (RPM, WOB, flow rate and UCS) have been chosen keeping the physics
of the wellbore in mind. These parameters have been actively used in traditional models. Feature
engineering in this paper utilizes a small set of input parameters, rather than all the parameters available
on the surface (after consulting a pairs plot and looking at traditional physics-based models).
4. ROP Optimization using Data Analytics

The machine learning models described are used hereafter as the default models for ROP, since they have
proved to be accurate and reliable. Data analytics, i.e., insights and inferences drawn from data, is
employed to increase ROP while drilling. An increase in ROP generally correlates with higher drilling
efficiency. A reduction in mechanical specific energy (MSE) generally correlates with an increase in
ROP; hence, for the sake of simplicity, it can be assumed that drilling cost is reduced when ROP is
increased. The application of data analytics in this manner, i.e., using a machine learning algorithm
for prediction, after which an algorithm is used to maximize ROP at each point, is called Wider Windows
Data Analytics Optimizer 1 (WWDAO) (Hegde and Gray, 2016).
4.1 Features and Spread of Data
For ROP optimization, any feature used in WWSLM can be varied across a large range of values until an
optimum solution is reached. There are two kinds of limitations which are incurred while trying to solve
this problem: engineering and data limitation.
For example, it may not be possible to drill with an RPM over 1000 revolutions per minute due to
manufacturing, downhole conditions, or rig constraints. Hence each feature has a limited operating
range. Since this paper is concerned with increasing the efficiency of drilling, only values of features
which lie within a threshold of 100 ft of the point of interest will be used for optimization purposes.
If the drill bit is at a depth of 3000ft, values of WOB and RPM in the operating range of 2800-3000ft
will be used for optimization. Values of the features will be varied between the ranges of operable
parameters to assess the optimum value of each feature for an increase in ROP. Other engineering
constraints such as increased vibrations and borehole stability, beyond the scope of this paper, are
considered in current work in progress.
Data limitation refers to a shortage of data, where adequate data do not exist in the training set for
reliable prediction. Since the machine learning predictor is built on the training set, it can also be
optimized only within the range of the training set. Extrapolation of predictions beyond the range of the
training data is unreliable and results in predictions with high uncertainty. The data limitation can be
overcome with some exploration work by the driller. The driller should be motivated to try different
ranges of ROP, RPM, WOB, and flow rate on entering a new formation, thereby populating the training set
with an extensive range of input parameters. This will enable the machine learning predictor (WWSLM) to
increase its breadth or range of operation. For example, since WWSLM predictions should not be
extrapolated beyond the training range, the optimization range for RPM can be extended to 450 rev/min
only if the training set contains a datum with an RPM of 450 rev/min. The optimization is bounded by the
ranges of values of the input parameters in the training set.
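One way to enforce this data limitation is to derive the optimization bounds directly from the training set, as in the sketch below. The function names and the three training rows are hypothetical; only the idea of bounding the search by the observed ranges comes from the text.

```python
def optimization_bounds(training_rows, features):
    """Minimum and maximum of each controllable feature seen in the training data."""
    return {
        f: (min(row[f] for row in training_rows),
            max(row[f] for row in training_rows))
        for f in features
    }

def within_bounds(candidate, bounds):
    """True only if every feature value lies inside its observed training range."""
    return all(lo <= candidate[f] <= hi for f, (lo, hi) in bounds.items())

# Hypothetical training rows; RPM reaches 450 only because one datum contains it.
training_rows = [
    {"wob": 20.0, "rpm": 120.0},
    {"wob": 35.0, "rpm": 450.0},
    {"wob": 28.0, "rpm": 200.0},
]
bounds = optimization_bounds(training_rows, ["wob", "rpm"])
print(bounds)
print(within_bounds({"wob": 30.0, "rpm": 450.0}, bounds))  # inside the observed range
print(within_bounds({"wob": 30.0, "rpm": 500.0}, bounds))  # would require extrapolation
```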
4.2 Multi-Dimensional Feature Optimization

Formations are analyzed for a predicted increase in ROP with a change in certain features, namely weight
on bit (WOB), rotary speed (RPM), or flow rate, which can be manipulated by the engineer to change the
ROP. ROP has been modeled with a machine learning model as a function of specified input features,
some of which can be controlled by the drilling engineer at the surface. Commonly, variation of WOB,
RPM, and flow rate relies heavily on the driller’s experience and experience-based rules of thumb. This
section defines values for these features mathematically such that they may be changed to obtain the
maximum possible ROP at a given depth.
Since the prediction model, WWSLM 1, has numerous input parameters which can be controlled at the
surface, one or more of these features can be used for optimization. Optimizing only one feature
yields one-dimensional optimization, two features yield two-dimensional optimization, and three
features yield three-dimensional optimization. The algorithm used for optimization was a brute force
algorithm, to ensure that in each case the global maximum was indeed reported. Although this is
an engineering problem, the model is statistical in nature, so the search for the global maximum is not
simple and a local maximum can easily be mistaken for the global maximum. Given this condition, it is
safer to use a brute force algorithm to calculate optimum values of the given features to ensure maximum
ROP in the given formation. This will yield an improved ROP, given that each feature is set at its
‘best’ value.
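A brute force search of this kind can be sketched as an exhaustive evaluation over a grid of feature values. In the example below, `toy_rop_model` is a hypothetical stand-in for a trained predictor and the grids are invented; neither reflects the paper's model or data.

```python
from itertools import product

def brute_force_maximize(predict, grids):
    """Evaluate `predict` on every combination of grid values and keep the best."""
    names = list(grids)
    best_setting, best_rop = None, float("-inf")
    for values in product(*(grids[n] for n in names)):
        setting = dict(zip(names, values))
        rop = predict(setting)
        if rop > best_rop:
            best_setting, best_rop = setting, rop
    return best_setting, best_rop

def toy_rop_model(s):
    # Hypothetical stand-in for a trained ROP predictor; not the paper's model.
    return 100.0 - (s["wob"] - 30.0) ** 2 - 0.01 * (s["rpm"] - 150.0) ** 2

# Two grids give a 2D optimization; adding a flow-rate grid would make it 3D.
grids = {"wob": [20.0, 25.0, 30.0, 35.0], "rpm": [100.0, 150.0, 200.0]}
best_setting, best_rop = brute_force_maximize(toy_rop_model, grids)
print(best_setting, best_rop)
```

Because every grid point is evaluated, the reported maximum is global over the grid, at a cost that grows with the product of the grid sizes.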
The number of features which are controllable at the surface is limited by the input parameters.
Furthermore, given the change in geology over different formations, different features may have
different effects on ROP while drilling each formation. A safer approach is to optimize more than one
input feature. This helps ensure that at least one of the features used will be crucial in
determining the ROP in that formation as per WWSLM 1.
4.3 Methodology
ROP, the rate of penetration, measures how fast or slow a well is being drilled. A post-drill analysis
is very beneficial in drilling pad wells. A given well can be analyzed and changes can be made to improve
ROP on subsequent wells and reduce drilling time. In this section, drilling time saved is computed using
field data from a vertical well drilled across various formations, as noted in previous sections.
Machine learning is used for ROP prediction and data analytics is used to calculate optimum parameters
thereafter.
This section evaluates feature optimization to increase ROP in a given formation. The Tyler
sandstone formation is used for reference. The formation is divided into splits of 100 ft each. For
example, if the bit is at 5000 ft, the training set is composed of data collected from 4900ft-5000ft.
These data would be used for training and validation of a machine learning model. That model is then
used to analyze predicted ROP over the length of the test data, i.e., 5000ft-5100ft. An optimization
algorithm is applied to change WOB and RPM of the test data to find the maximum attainable ROP, as
described in the previous section. For evaluation of ROP, either 1D, 2D, or 3D optimization can be
applied. WOB and RPM are parameters which are generally changed while drilling. Here WOB and
RPM are used as optimization parameters in 2D feature optimization, and WOB, RPM, and flowrate in 3D
feature optimization. Figures 5 and 6 illustrate plots where ROP has been optimized by
manipulating the aforementioned parameters. It can also be important to look at the change in surface
parameters, i.e., the change in values of WOB and RPM which would lead to a corresponding increase in
ROP.
Figure 5: Original ROP vs Predicted increased ROP while optimizing WOB & RPM (left); Change in
WOB for increase in ROP (middle); Change in RPM for increase in predicted ROP (right)
Figure 6: Original ROP vs Predicted increased ROP while optimizing WOB & RPM (left); Change in
RPM for increase in ROP (left middle); Change in Pump pressure for increase in predicted ROP (right
middle); Change in WOB for increase in ROP (right)
Figure 5 shows three plots pertaining to ROP optimization using WWDAO. The left plot illustrates the
increase in ROP along the formation obtained by changing WOB & RPM. The middle and right plots show the
change in WOB and in RPM as WWDAO selects the best values for ROP optimization in the
formation. Figure 6 parallels Figure 5, but more features are optimized:
RPM, WOB, and pump pressure are optimized in Figure 6 for increased ROP. The second, third and
fourth sub plots in Figure 6 show the change of features with a corresponding increase in ROP. It is
important to monitor the change in features so that they can be assessed. There might not always be an
increase in WOB or RPM as the predicted ROP increases. This has been denoted as rock floundering,
where hole cleaning or some other physical limiter overrides ROP.
4.4 ROP Optimization for Intervals
This section outlines a method to evaluate ROP and calculate the best possible ROP that can be
achieved over the course of drilling a well. Practically, it is not possible to change surface parameters
every second unless an advanced automated system is used. Hence this section applies the
aforementioned concepts to ROP optimization over 25ft, 50ft and 100ft intervals along the well, as
opposed to the previous sections where only one formation was evaluated. WWSLM 1 has been used with a 3D
optimization method where WOB, flowrate, and RPM are optimized. This not only shows the power and
applicability of machine learning, but also illustrates an application for increasing efficiency in
drilling.
Alongside optimum ROP, the time saved with faster drilling can also be computed. The time to drill
through 100ft may be calculated in hours by dividing 100 by the average ROP (in ft/hr) in that section.
In the same manner it is possible to calculate the time required to drill the same 100ft if ROP were
optimized using machine learning and data analytics (WWSLM and WWDAO). The difference between the two
is the time saved by using WWDAO. Figures 7, 8, and 9 show the increase in drilling efficiency using
machine learning and data analytics. The left plot shows the predicted increase in ROP over the entire
length of the drilled well. The right plot shows the improved time for the depth of the well.
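The time-saved arithmetic can be written out in a few lines. The interval lengths and ROP values below are invented for illustration and are not taken from the field data.

```python
def drilling_hours(interval_ft, avg_rop_ft_per_hr):
    """Hours needed to drill an interval at a given average ROP."""
    return interval_ft / avg_rop_ft_per_hr

def time_saved(intervals):
    """Total hours saved over (length_ft, original_rop, optimized_rop) tuples."""
    return sum(
        drilling_hours(length, original) - drilling_hours(length, optimized)
        for length, original, optimized in intervals
    )

# Two hypothetical 100 ft intervals:
# 100/50 - 100/80 = 0.75 hr and 100/40 - 100/50 = 0.50 hr.
intervals = [(100.0, 50.0, 80.0), (100.0, 40.0, 50.0)]
print(time_saved(intervals))  # 1.25 hours in total
```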
Figures 8 and 9 plot ROP against the predicted increase in ROP obtained by using WWDAO 1 over specified
interval lengths. Interval lengths are at the discretion of a given operator, depending on various
factors such as lithology, thickness of formations, etc. In theory, with a decrease in interval length,
the amount of time saved should increase, since drilling can be optimized at each foot of depth rather
than over an interval. This methodology has real-time drilling applications since it is also
computationally feasible. Here, however, it is evaluated for post drilling analysis (PDA). PDA can be
performed on data collected from one well (using WWSLM & WWDAO) to improve drilling efficiency in nearby
wells. This can improve the ability to drill pad wells more efficiently. Figures 6 through 9 clearly show
application of the concept explained in this section. It can lead to almost 30 hours of saved time while
drilling about 4500ft of a vertical well.
Figure 7: (Left) Predicted ROP improvement over the length of the well with ROP optimization over 25 ft
intervals; (Right) amount of time saved when WWDAO is used over 25 ft intervals
D
TE
C EP
AC
17
ACCEPTED MANUSCRIPT
PT
RI
U SC
AN
M
Figure 8: (Left) Predicted ROP improvement over the length of the well with ROP Optimization over 50ft
intervals, (Right) Amount of time saved with WWDAO over 50 ft intervals is used
D
TE
C EP
AC
18
ACCEPTED MANUSCRIPT
PT
RI
U SC
AN
M
Figure 9: (Left) Predicted ROP improvement over the length of the well with ROP Optimization over
100ft intervals, (Right) Amount of time saved with WWDAO over 100 ft intervals is used
D
5. Results and Discussion

This paper shows practical applications of machine learning in drilling engineering. ROP is predicted
over the entire well using machine learning algorithms, and random forest predictions of ROP across all
formations and lithologies are shown. The prediction error was low enough for the predictions to be
used for post-drilling analysis. Feature engineering was introduced as a method of improving the
performance of the machine learning algorithm, and it resulted in predictions with higher accuracy.
Feature engineering is an essential tool that enables machine learning algorithms to better predict ROP
in drilling. It also allows engineering knowledge to be combined with machine learning.
ROP prediction was followed by the use of data analytics to increase ROP. Since ROP was modeled as a
function of the input data using machine learning, it is possible to change the values of some features
in order to increase the predicted ROP. Some features (drilling parameters), such as WOB, RPM, and
flowrate, can be controlled at the surface by the drilling engineer. These controllable parameters can
be changed while drilling to give a higher ROP. A brute force algorithm was used to search for the
changed feature values that increase ROP. When applied to the data from the Tyler sandstone formation,
the increase in predicted ROP was 40%. When the same optimization concept was applied throughout the
length of the well to evaluate the increase in ROP on a larger scale, an overall improvement in ROP was
also achieved. Practical methods for applying the same technique over the length of the well were
discussed.
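A brute force search of this kind can be sketched as below. This is an illustration under stated assumptions, not the authors' algorithm: `toy_rop` is a hypothetical stand-in for the trained model, and the grids, units, and recorded values are invented for the example. The sketch varies one, two, or all three controllable parameters and counts the model evaluations needed in each case.

```python
import itertools


def toy_rop(wob, rpm, flow):
    """Stand-in for a trained ROP model (ft/hr); NOT the paper's model."""
    return 50.0 - 0.04 * (wob - 28.0) ** 2 - 0.001 * (rpm - 145.0) ** 2 + 0.01 * flow


# Hypothetical candidate values for each controllable surface parameter.
GRIDS = {
    "wob": [10.0, 15.0, 20.0, 25.0, 30.0],
    "rpm": [80.0, 100.0, 120.0, 140.0, 160.0],
    "flow": [400.0, 450.0, 500.0, 550.0, 600.0],
}


def brute_force(current, free, model=toy_rop):
    """Try every grid combination of the `free` parameters while holding the
    others at their current values.

    Returns (best settings, best predicted ROP, number of evaluations).
    """
    best_settings, best_rop, evaluations = dict(current), model(**current), 0
    for combo in itertools.product(*(GRIDS[name] for name in free)):
        trial = {**current, **dict(zip(free, combo))}
        rop = model(**trial)
        evaluations += 1
        if rop > best_rop:
            best_settings, best_rop = trial, rop
    return best_settings, best_rop, evaluations


current = {"wob": 15.0, "rpm": 100.0, "flow": 450.0}  # hypothetical recorded values
results = [
    brute_force(current, free)
    for free in (["wob"], ["wob", "rpm"], ["wob", "rpm", "flow"])
]
for settings, rop, n in results:
    print(f"best {settings}: predicted ROP {rop:.2f} ft/hr after {n} evaluations")
```

With five candidate values per parameter, the number of evaluations grows from 5 to 25 to 125 as more parameters are freed, while the best predicted ROP can only stay the same or improve. This is the trade-off between achievable ROP and computational time.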
Modifying surface parameters over the length of an interval, rather than at each point in depth, would
be a realistic application of WWDAO. Surface parameters were therefore optimized over interval lengths
of 25, 50, and 100 ft. Modification of surface parameters such as WOB, RPM, and flowrate shows an
overall improvement in ROP while drilling. This improvement can be expressed in units of time to
calculate the drilling time saved using WWDAO over a drilled length of 4500 ft. With these constraints
on the parameters, the analysis determined that drilling time could be improved by about 12.56 %, since
the time saved was 30.12 hours and the total time to drill the well, neglecting non-productive time
(NPT), was 238 hours.
6. Conclusions
Machine learning can be used effectively to predict ROP while drilling over the length of the well.
These predictions can be made purely with surface-measured parameters (such as RPM, weight on bit, flow
rate, and UCS of rock). The accuracy of the predictions can be improved using feature engineering
(selecting the correct input parameters), where engineering knowledge is used to assist the machine
learning algorithm. Predictions from machine learning algorithms can be applied to change surface
parameters on the rig so as to increase ROP while drilling. Surface parameters are modified using a
brute force algorithm to find the values that result in maximum predicted ROP. WOB, RPM, and flowrate
are optimized over the length of the well to achieve an increase in ROP. One, two, or all three input
parameters may be modified to increase ROP. As the number of modified input parameters increases, the
ROP increases, but at the cost of increased computational time. Optimization of surface parameters is
performed over 25, 50, and 100 ft intervals as a practical method of application in drilling. The time
saved can be computed from the increase in ROP. When WWDAO was applied to the field data used for
validation in this paper, it was estimated to save around 30 hours of drilling, or about 12.5 % of
total drilling time.
Acknowledgement
The authors would like to thank sponsors of the Wider Windows Industrial Affiliate Program: British
Petroleum, Chevron, ConocoPhillips, Halliburton, Marathon, National Oilwell Varco, Occidental Oil and
Gas, and Shell. Special thanks are due to Marathon Oil Company for providing field data.
References
Aadnoy, B. S., Fazaelizadeh, M., & Hareland, G. 2010. A 3D analytical model for wellbore
friction. Journal of Canadian Petroleum Technology, 49(10), 25-36.
Bingham, M. G. 1965. A New Approach to Interpreting Rock Drillability. Oil & Gas Journal.
Bilgesu, H. I., et al. 1997. A new approach for the prediction of rate of penetration (ROP) values. SPE
Eastern Regional Meeting. Society of Petroleum Engineers.
Bourgoyne Jr, A. T., and F. S. Young Jr. 1974. A multiple regression approach to optimal drilling and
abnormal pressure detection. Society of Petroleum Engineers Journal 14(04): 371-384.
Breiman, L. (1996). Bagging predictors. Machine learning, 24(2), 123-140.
Buntine, W. (1992). Learning classification trees. Statistics and computing, 2(2), 63-73.
Davison, A. C., & Hinkley, D. V. (1997). Bootstrap methods and their application (Vol.1). Cambridge
university press.
Dashevskiy, D., Macpherson, J. D., Dubinsky, V., & McGinley, P. (2007). U.S. Patent No. 7,172,037.
Washington, DC: U.S. Patent and Trademark Office.
Dunlop, J., Isangulov, R., Aldred, W., Arismendi Sanchez, H., Sanchez Flores, J.L., Alarcon Herdoiza, J.,
Belaskie, J., Luppens, J.C. 2011. Increased Rate of Penetration Through Automation. SPE/IADC
139897.
Eckel, J. R. (1967). Microbit Studies of the Effect of Fluid Properties and Hydraulics. Journal of
Petroleum Technology, pp. 541-546.
Efron, B. (1982). The jackknife, the bootstrap and other resampling plans (Vol. 38).
Philadelphia: Society for Industrial and Applied Mathematics.
Gjelstad, G., Hareland, G., Nikolaisen, K. N., and Bratli, R. K. (1998). The Method of Reducing Drilling
Costs More Than 50 Percent. SPE/ISRM Eurock. Trondheim, Norway, July 8-10.
Guyon, I., & Elisseeff, A. (2003). An introduction to variable and feature selection. The Journal of
Machine Learning Research, 3, 1157-1182.
Hareland, G., and P. R. Rampersad. 1994. Drag-bit model including wear. SPE Latin America/Caribbean
Petroleum Engineering Conference. Society of Petroleum Engineers.
Hegde, C.M., Wallace, S.P. and Gray, K.E. (2015a). Using Trees, Bagging and Random Forests to Predict
Rate of Penetration during Drilling. Presented at SPE Middle East Intelligent Oil & Gas
Conference & Exhibition, Abu Dhabi, United Arab Emirates, 15-16 September. SPE-176792.
Hegde, C.M., Wallace, S.P. and Gray, K.E. (2015b). Use of Regression and Bootstrapping in Drilling:
Inference and Prediction. Presented at SPE Middle East Intelligent Oil & Gas Conference &
Exhibition, Abu Dhabi, United Arab Emirates, 15-16 September. SPE-176791.
Hegde, C.M., Wallace, S.P. and Gray, K.E. (2015c). Real time Prediction and Classification of Torque and
Drag during Drilling using Statistical Learning Methods. Presented at SPE Eastern Regional
Conference, Morgantown, West Virginia, USA, 13-15 October. SPE-177313.
James, G., Witten, D., Hastie, T., & Tibshirani, R. 2013. An introduction to statistical learning. New
York: Springer.
Johancsik, C.A., D.B. Friesen, and Rapier Dawson. 1984. Torque and Drag in Directional Wells-
Prediction and Measurement. Journal of Petroleum Technology 36(6): 987-992.
Jahanbakhshi, R., R. Keshavarzi, and A. Jafarnezhad. 2012. Real-time prediction of rate of penetration
during drilling operation in oil and gas wells. 46th US Rock Mechanics/Geomechanics
Symposium. American Rock Mechanics Association.
Lesage, M., I.G. Falconer, and C.J. Wick. 1988. Evaluating Drilling Practice in Deviated Wells with
Torque and Weight Data. SPE Drilling Engineering 3(03): 248-252.
Maidla, E.E., Wojtanowicz, A. K. 1987. Field Method of Assessing Borehole Friction for Directional
Well Casing. Presented at the Middle East Oil Show, Manama, Bahrain, March. SPE 15696.
Maurer, W. C. (1962). The “Perfect-Cleaning” Theory of Rotary Drilling. Journal of Petroleum
Technology, pp. 1270-1274.
Mevik, B. H., & Wehrens, R. (2007). The pls package: principal component and partial least squares
regression in R. Journal of Statistical Software, 18(2), 1-24.
Motahhari, H. R., Hareland, G., and James, J. A. (2010). Improved Drilling Efficiency Technique Using
Integrated PDM and PDC Bit Parameters. Journal of Canadian Petroleum Technology, v. 49, no.
10, pp. 45-52.
Newman, K.R, and R Procter. 2009. Analysis of Hook Load Forces During Jarring. IADC/SPE Drilling
Conference and Exhibition. Amsterdam, Netherlands, March.
Newman, K. R. Finite Element Analysis of Coiled Tubing Forces. Society of Petroleum Engineers, SPE
89502, SPE/ICoTA Coiled Tubing Conference and Exhibition, Houston, Texas, March 2004.
Nygaard, R., Hareland, G., Budiningsih, Y., Terjesen, H. E., and Stene, F. (2002). Eight Years Experience
with a Drilling Optimization Simulator in the North Sea. IADC/SPE Asia Pacific Drilling
Technology. Jakarta, Indonesia, September 11.
Quinlan, J. R. (1986). Induction of decision trees. Machine learning, 1(1), 81-106.
Rampersad, P. R., Hareland, G., and Boonyapaluk, P. (1994). Drilling Optimization Using Drilling Data
and Available Technology. III Latin American/Caribbean Petroleum Engineering Conference.
Buenos Aires, Argentina, April 27-29.
Saldivar, B., Boussaada, I., Mounier, H., Mondie, S., & Niculescu, S. I. (2014, August). An overview on
the modeling of oilwell drilling vibrations. In World Congress (Vol. 19, No. 1, pp. 5169-5174).
Soares, C. (2015). Development and Applications of a New System to Analyze Field Data and Compare
Rate of Penetration (ROP) Models. M.S Thesis, The University of Texas at Austin.
Walker, B. H., Black, A. D., Klauber, W. P., Little, T., and Khodaverdian, M. (1986). Roller-Bit
Penetration Rate Response as a Function of Rock Properties and Well Depth. 61st Annual
Technical Conference and Exhibition of the Society of Petroleum Engineers. New Orleans, LA,
USA, October 5-8.
Wallace, Hegde and Gray (2015). System for Real Time Drilling Performance Optimization and
Automation Based on Statistical Learning Methods. Presented at SPE Middle East Intelligent Oil
& Gas Conference & Exhibition, Abu Dhabi, United Arab Emirates, 15-16 September. SPE-176804.
Warren, T. M. (1987). Penetration-Rate Performance of Roller-Cone Bits. SPE Drilling Engineering, pp.
9-18.
Winters, W. J., Warren, T. M., and Onyia, E. C. (1987). Roller Bit Model With Rock Ductility and Cone
Offset. 62nd Annual Technical Conference and Exhibition of the Society of Petroleum Engineers.
Dallas, TX, USA, September 27-30.
• Machine learning is proposed to predict ROP while drilling, using surface-measured
parameters as inputs
• Machine learning provides a means to predict ROP during drilling with high accuracy from
minimal data
• The machine learning predictor can be used for simulation optimization, so that ROP is
maximized by adjusting surface parameters
• The method was estimated to save around 30 hours of drilling, about 12.5 % of total drilling
time
• Data exploration by the driller can result in saving much more time during drilling