
An alternative for score prediction in rain-interrupted cricket matches

Abhiram Eswaran
Department of Computer Science
University of Massachusetts
Amherst, MA 01002
aeswaran@umass.edu

Akul Swamy
Department of Computer Science
University of Massachusetts
Amherst, MA 01002
aeswaran@umass.edu

May 3, 2017
Abstract

We propose an alternative to the state-of-the-art Duckworth-Lewis-Stern (DLS) method [1] used to predict scores in a rain-interrupted cricket match. Factors such as the average score at the venue and the number of power-play balls remaining strongly affect the performance of any team, yet the DLS method fails to take them into consideration. We use these factors, together with the DLS features, as our feature set, evaluate our model with several regression algorithms, and compare the performance of our model and the DLS method against the ground truth, i.e. the actual score of every match. We show that considering these additional features in the calculation of the target score gives a better approximation of the predicted score.
1 Introduction

The Duckworth-Lewis method was introduced in 1997 as the standard way of setting target scores in a rain-interrupted cricket match, and was renamed DLS in 2014. Over the years the game of cricket has evolved, with higher scores being posted and chased down, yet the DLS method has remained almost the same. The method depends heavily on two features: the wickets in hand and the overs left. It has been criticized because wickets are a more heavily weighted resource than overs [1]. Another shortcoming is that the DLS method does not account for changes in the proportion of the innings for which field restrictions are in place compared to a completed match. In short, the DLS method has failed to adapt to the evolved game of modern cricket.

In our project, we built a predictive model that considers features such as wickets lost, overs left, power-play balls left, target score, venue average, and the total overs of the game, thereby giving an unbiased prediction of the target score for a team. By considering the historical statistics of the venue and the field restrictions, we address the existing shortcomings of the DLS method.

We built our model using the above-mentioned features together with regression algorithms such as Linear Regression, Decision Trees, Logistic Regression, and a KNN regressor, and ensemble methods such as Random Forests, Gradient Boosting, and AdaBoost.

Since score prediction differs between the 1st and 2nd innings, we consider two scenarios to evaluate our model: rain interrupting the 1st innings, with the innings curtailed at that point, and rain interrupting the 2nd innings, with the match abandoned. To measure our success, we propose two evaluation metrics:

• The RMSE of our model against the ground-truth score, compared with the RMSE of the DLS method against the ground-truth score.
• The RMSE of the model when considering only the DLS features, compared with the RMSE of the model when considering all features.

We found that the model outperformed the DLS method for the first innings, with Decision Trees performing best. We also found that the model performed better with the full feature set than when considering only overs and wickets, as the DLS method does. These findings underpin the fact that additional features such as the average score at the venue and the number of power-play balls left are critical to a better approximation of the target to be chased.

2 Related Work

The Duckworth-Lewis method [2] is the one widely accepted in this scenario. It takes into account wickets and overs remaining as the only resources of a team. The general formula used to compute the target score is

    Team 2's par score = Team 1's score × (Team 2's resources / Team 1's resources)

This, however, might not always result in a fair score adjustment. With the advent of T20 cricket and power plays, large totals have been chased down with ease in recent international cricket, and the D/L method fails to take this acceleration into account. We address this by considering power-play balls as a feature in our model. The venue also plays a key role in determining the final score, and we investigate this by employing ML algorithms that make use of these features. To improve on the Duckworth-Lewis-Stern method, Mankad et al. [10] discuss an alternative to the Duckworth-Lewis table that uses a Gibbs sampling scheme related to isotonic regression to provide a nonparametric resource table. This approach reflects the relative run-scoring resources available to the two teams, that is, overs and wickets in combination, but it still fails to consider other important factors affecting score prediction.
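As a quick illustration, the par-score formula above can be computed directly. The resource percentages below are made-up values for illustration, not entries from the real D/L resource table:

```python
def dl_par_score(team1_score: float, team1_resources: float,
                 team2_resources: float) -> float:
    """Duckworth-Lewis par score: Team 1's score scaled by the ratio
    of the two teams' available resource percentages."""
    return team1_score * (team2_resources / team1_resources)

# Illustrative values: Team 1 used all of its resources and scored 250;
# rain leaves Team 2 with only 70% of its resources.
print(dl_par_score(250, 100.0, 70.0))  # 175.0
```

The whole criticism of the method is visible here: the adjustment is a single linear rescaling, with no room for venue, field restrictions, or scoring acceleration.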
Sethuraman et al. [3] discuss using momentum, venue, player ratings, and player history to predict scores with a machine learning model. Their project uses correlation-based subset feature selection to pick the relevant set of features and runs a linear regression model to predict the score of a cricket match. While momentum and venue have a positive impact, modeling player history and player rating hurts the model's performance, since the form of a player is not easily quantifiable. We address these problems by considering just the average score at the venue as a feature. Another work along similar lines is by Kampakis et al. [4], who perform ML-based prediction of outcomes of T20 games in English county cricket, considering relevant team- and player-based features. Theirs is a classification problem, predicting the end result of the game, with relevant applications to betting. Ours, on the other hand, is a regression problem in which we try to predict the target for the second innings. Here we avoid team-specific features, as our method aims to perform unbiased score adjustment in the event of a rain interruption in a 50-over match.
Beyond score prediction in cricket, there has been related work in other sports on predicting the outcome of a game. Albina Yezus [5] studies machine learning methods for predicting the outcome of soccer matches, using features such as match information, season information, and the league table at each moment in the season, modeled with algorithms such as Random Forests and k-nearest neighbors; the models achieve a prediction accuracy of 60%. Along similar lines, Liang et al. [6] predict the results of European football games by building a model based on data from previous matches and relevant analysis. The model demonstrates an ability to deal with big data and beats bookmakers on game results using classification; the best-performing model was logistic regression, with an accuracy of 54.5%.

3 Dataset

The dataset was scraped from Cricinfo [7] using BeautifulSoup. It contains the ball-by-ball details of all One Day International matches from 2006 onwards: around 30,000 test cases over roughly 1,500 matches, for the 1st and 2nd innings separately. The scraped data consists of match metadata, including the venue, umpires, result, and day of the match, and ball-by-ball details listing the striker, non-striker, runs, extras, and bowler for each ball. From this data we picked relevant features such as wickets, overs, and venue. The average scores for each ground were scraped separately and linked to the venue. The following features were taken from the scraped data:

• Balls played
• Wickets lost
• Cumulative runs scored up to the i-th ball
• Power-play balls remaining
• Average score at the venue
• Total number of overs in the match

For the second innings we used the target the team has to chase as an additional feature. All features were integer-valued.

4 Methodology

Ours is an application project in which we use ML algorithms in the domain of cricket to predict target scores in a rain-interrupted game. As part of the project, we built a pipeline comprising data collection and pre-processing, feature engineering, feature selection, hyperparameter optimization, model learning, and regression.

4.1 Data collection and pre-processing

The data was scraped from Cricinfo [7] using BeautifulSoup, as mentioned before, and obtained in CSV format. It contains some metadata for each match, such as ground and umpire information, plus information at the ball level: the result of each ball played in the match is listed as one row, including the runs scored, the batsman, and the extras conceded for that specific delivery. The CSV was then processed using pandas [8] to extract only the necessary fields into a dataframe.
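A minimal sketch of this extraction step, assuming hypothetical column names for the scraped CSV (the real schema may differ):

```python
import pandas as pd

# Toy stand-in for the scraped ball-by-ball CSV; column names are assumptions.
raw = pd.DataFrame({
    "match_id": [1, 1, 1],
    "over":     [0, 0, 0],
    "ball":     [1, 2, 3],
    "batsman":  ["A", "A", "B"],
    "bowler":   ["X", "X", "X"],
    "runs":     [1, 0, 4],
    "extras":   [0, 1, 0],
    "wicket":   [0, 0, 0],
})

# Keep only the fields the model needs; drop the rest of the metadata.
fields = ["match_id", "over", "ball", "runs", "extras", "wicket"]
balls = raw[fields].copy()
print(balls.shape)  # (3, 6)
```

The resulting dataframe is what the feature-engineering step below operates on.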

4.2 Feature Engineering

Once the necessary fields were extracted into a dataframe, several further features were engineered from these base features. The raw features are just the ball number and the runs conceded off that ball; since we need to capture the rate of scoring, we compute the cumulative result for each ball, i.e. the state of the game at the end of the i-th ball. Further, the number of power-play balls left at every state of the game is computed from the elapsed-over information while processing each row. Finally, we include the average ground score alongside the aforementioned features. Once all of these are engineered, we obtain the final feature vector containing balls, runs, wickets, ground average, power-play balls left, and total overs.
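The cumulative features described above can be sketched as follows; the 60-ball power-play window and the column names are illustrative assumptions:

```python
import pandas as pd

PP_BALLS = 60  # assumed mandatory power play: first 10 overs of a 50-over innings

# One innings of toy ball-by-ball data.
df = pd.DataFrame({
    "runs":   [4, 0, 1, 6, 2],
    "wicket": [0, 0, 1, 0, 0],
})

df["balls"] = range(1, len(df) + 1)           # balls played so far
df["cum_runs"] = df["runs"].cumsum()          # state of the game after the i-th ball
df["cum_wickets"] = df["wicket"].cumsum()     # wickets lost so far
df["pp_balls_left"] = (PP_BALLS - df["balls"]).clip(lower=0)

print(df[["balls", "cum_runs", "cum_wickets", "pp_balls_left"]].iloc[-1].tolist())
# [5, 13, 1, 55]
```

The ground average and total overs are constants per match and are simply joined on afterwards.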

4.3 Feature Selection

Using the SelectKBest method from sklearn [9] to perform feature selection, the complete set of features was returned as the best set. This is not surprising, as these are indeed features that heavily determine the scoring in a match and were hand-engineered in the first place.
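A sketch of how SelectKBest ranks features, here on synthetic data where only the first two of six features carry signal:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))          # six features, as in our feature vector
y = 3 * X[:, 0] + 2 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Score each feature against the target and keep the k best.
selector = SelectKBest(score_func=f_regression, k=2)
selector.fit(X, y)
print(sorted(np.flatnonzero(selector.get_support())))  # [0, 1]
```

In our case, running the equivalent with k set to "all but one" and below never beat the full set, which is why feature selection was later dropped from the pipeline.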

4.4 Hyperparameter Optimization

For hyperparameter optimization, we use sklearn's GridSearchCV method with default parameters, passing it the regressor object along with the parameters we wish to optimize. We then store the optimized object for later testing, thereby avoiding re-computation.
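A minimal sketch of this tuning step using a decision tree on synthetic data; the parameter grid here is illustrative, not the exact one we used:

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for our feature matrix (six features).
X, y = make_regression(n_samples=200, n_features=6, noise=10.0, random_state=0)

# Ranking by (negated) MSE orders candidates the same way RMSE does.
param_grid = {"max_depth": [3, 5, 10], "min_samples_split": [2, 10]}
search = GridSearchCV(DecisionTreeRegressor(random_state=0), param_grid,
                      scoring="neg_mean_squared_error", cv=5)
search.fit(X, y)

# The fitted best estimator can be stored for later testing, avoiding re-computation.
best_model = search.best_estimator_
print(sorted(search.best_params_))  # ['max_depth', 'min_samples_split']
```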

4.5 Learning

As part of the learning process for this problem, we explored numerous regressors, trained them, and evaluated their performance on the prediction task at hand. Below we present the details of the models we employed.

4.5.1 Linear Regression

Since ours is a regression problem, we chose Linear Regression as our first model, owing to its simplicity and straightforwardness. We also set it up as our simple baseline, establishing an upper bound on the RMSE. Mathematically, Linear Regression fits the best straight line through the set of data points in n dimensions. The model equation is

    y_i = β_0 + β_1 x_{i1} + ... + β_p x_{ip} + ε_i = x_i^T β + ε_i,   i = 1, ..., n

where β_j is the weight on the j-th feature and β_0 is the intercept.

4.5.2 Decision Trees

The next model we explored was Decision Trees, for a couple of strong reasons. First, the intuition behind the model's approach to learning is easy to comprehend. Second, prediction with Decision Trees is fast owing to the structure of the model. At a high level, learning proceeds by choosing the best attribute and the best threshold for splitting, where the best split is the one that minimizes the residual sum of squares. How long this process continues can be controlled through the maximum depth or the number of data points required at a node for it to be split.

4.5.3 Random Forests

Moving on, we experimented with ensemble models owing to their better fit and generalization accuracy. Random forests are essentially multiple decision trees, each built on a subset of the samples, with additional de-correlation induced by considering a random subset of features at every split. The results of all the trees are then combined, e.g. by averaging for regression or majority voting for classification.

4.5.4 Comparison

Comparing the three models above: the linear regression model, though simple and straightforward, fails to capture complex non-linear relationships and hence performs poorly in complex scenarios like this one. A decision tree does much better in such cases, since it can capture complex decision boundaries and also deal with missing values, which our data has. However, since it is a model of high capacity, it may also fit noise and overfit. To avoid this, we switch to ensembles, which strike a good balance between variance and bias, thereby achieving greater accuracy than the other models.

4.5.5 Other regressors

Apart from the regressors mentioned above, we explored others, including Lasso, Gradient Boosting, and KNN. KNN is a non-parametric model that memorizes the training set and derives predictions from neighboring data points in feature space. Lasso is linear regression with l1 regularization, which performs feature selection alongside learning: it drives some weights to zero, thereby giving less importance to the corresponding features. The gradient boosting regressor produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees; it builds the model in a stage-wise fashion like other boosting methods, and generalizes them by allowing optimization of an arbitrary differentiable loss function. AdaBoost, Logistic Regression, and support vector regression were also tried.

4.6 Testing

The testing procedure was straightforward for all the models. We selected the best-performing model parameters using GridSearchCV and then ran prediction on a held-out test set. The RMSE of these predictions against the ground truth was compared with the RMSE of the DLS method against the ground truth. We also compared the RMSE of the model using all features with that using only the DLS features. More details on these experiments and results are given in the next two sections.
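The RMSE comparison can be sketched as follows; the scores are made-up values for illustration:

```python
from math import sqrt

def rmse(predicted, actual):
    """Root mean square error between equal-length score sequences."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

# Made-up scores for three matches.
actual     = [210, 250, 180]   # ground-truth scores
model_pred = [205, 260, 175]   # our model's predictions
dls_pred   = [190, 270, 200]   # DLS targets

print(round(rmse(model_pred, actual), 2))  # 7.07
print(round(rmse(dls_pred, actual), 2))    # 20.0
```

The same computation is run once per method and per scenario, always against the same ground-truth vector.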

5 Experiments and Results

The scraped dataset was split into training and test sets in an 80:20 ratio. The test set was held out for final evaluation, and the training set was used with grid search to tune the hyperparameters of the individual models.

5.1 Layout of the pipeline

The pipeline starts with hyperparameter optimization via grid search, after which the best hyperparameters found are used for model learning. We fit the model with the best hyperparameters, record the training error, and save the fitted model as a pickle to avoid repeated training. We eliminated feature selection from the pipeline, since the best number of features returned by statistical-dependence filtering was the maximum number of features in the dataset.
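The save-and-restore step can be sketched with pickle, using a toy model in place of our fitted regressors:

```python
import pickle

from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

# Toy stand-in for a model fitted with the best hyperparameters.
X, y = make_regression(n_samples=100, n_features=6, noise=5.0, random_state=0)
model = DecisionTreeRegressor(max_depth=5, random_state=0).fit(X, y)

# Save the fitted model so later evaluation runs skip retraining.
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# Restore it in a later run and predict without refitting.
with open("model.pkl", "rb") as f:
    restored = pickle.load(f)

print(bool((restored.predict(X) == model.predict(X)).all()))  # True
```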

5.2 Hyperparameter Optimization

We used GridSearchCV to optimize our hyperparameters. The hyperparameters tuned for each model were:

• Lasso: alpha
• Decision Tree: min_samples_split, max_depth, min_samples_leaf, max_leaf_nodes
• Logistic Regression: C
• SVR: kernel, C, gamma
• Random Forests: n_estimators, max_features, max_depth, min_samples_split, bootstrap
• Gradient Boosting: learning_rate, max_depth
• AdaBoost: n_estimators

5.3 Training

A main objective of the project was to show that a linear dependency between the predicted score and the features, as in the DLS method, gives poorer results, so we set Linear Regression as our baseline. Every other model trained yielded better results than Linear Regression.

Using the best hyperparameters obtained from grid search, we fitted each model on the training set and measured the error on the validation set. We did this for the two scenarios: rain interruption in the 1st innings and rain interruption in the 2nd innings. The training errors for both scenarios are provided in Table 1. Decision Trees and Gradient Boosting gave the minimum training errors for the 1st and 2nd innings respectively; Figures 1 and 2 show the RMSE training errors for each of the regressors selected.

In terms of speed, Linear Regression and Decision Trees took the least time to train, while SVR, due to the number of hyperparameters involved, took the longest in hyperparameter optimization.

Method                         RMSE for 1st innings   RMSE for 2nd innings
Baseline (Linear Regression)   24.503                 21.715
Lasso                          24.406                 21.506
Logistic Regression            23.945                 19.301
SVR                            23.633                 18.768
Decision Trees                 22.699                  5.911
Random Forests                 22.763                  5.166
Gradient Boosting              22.913                  4.519
AdaBoost                       23.168                 16.847

Table 1: Training errors (RMSE) for the 1st and 2nd innings

Figure 1: Training RMSE for 1st innings

Figure 2: Training RMSE for 2nd innings

Innings       RMSE for ML prediction   RMSE for DLS method
1st innings   34.335                   36.073
2nd innings   36.273                   22.392

Table 2: Comparison of the ML prediction model with the state-of-the-art DLS method (test RMSE)

Innings       RMSE with all features   RMSE with DLS features
1st innings   34.335                   36.616
2nd innings   36.273                   37.2674

Table 3: Comparison of the ML prediction model with all features vs. with DLS features only

5.4 Testing

Using the models above, we ran predictions on the test set for two scenarios:

• Rain interrupts the 1st innings after the batting team has played 40 overs, and the innings is terminated.
• Rain interrupts the 2nd innings after the chasing team has played 40 overs, and the match is abandoned.

To test the effectiveness of our predictions, we took the final score of each team's innings and scaled it down to 40 overs in both scenarios; this scaled score serves as the ground truth for the team's score at 40 overs. We then used these scores to compute the root mean square error of our predictions and of the DLS method. Table 2 presents the performance of the best models against the DLS method for the 1st and 2nd innings; we discuss the importance of these results in the next section.
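The pro-rating of a completed innings down to 40 overs can be sketched as below; simple linear scaling is our reading of the procedure, since the exact rule is not spelled out:

```python
def scale_to_overs(final_score: float, full_overs: int = 50,
                   cut_overs: int = 40) -> float:
    """Pro-rate a completed-innings score down to a shorter innings.

    Linear scaling is an assumption here; it ignores any end-of-innings
    acceleration in the last ten overs.
    """
    return final_score * cut_overs / full_overs

print(scale_to_overs(300))  # 240.0
```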
In Table 3 we also present the performance of the best models using all features versus using only the features of the DLS method in computing the predicted scores. It is clear from the results that the suggested features give better generalization performance; we discuss this in depth in the next section.

In terms of speed, decision trees were the fastest at prediction. Decision trees also gave the best generalization error for the 1st-innings scenario, while Gradient Boosting performed best for 2nd-innings predictions.

6 Discussion and Conclusion

We set out with this project to show that the DLS method's reliance on just two features for score prediction is archaic, as the game has evolved, and the results presented in Tables 2 and 3 reflect exactly this.

Table 3 supports the hypothesis that it is important to take more features, especially power-play balls left and ground average, into consideration when predicting the score in a rain-interrupted match. This addresses the criticism the DLS method has perennially faced.

Table 2 validates the machine learning model we used to predict scores in comparison with the DLS method. For the 1st innings, the best-performing model was the Decision Tree regressor. We attribute this to the fact that a decision tree regressor uses the training data to pick variables and thresholds that optimize a local performance heuristic at each node of the tree. Further, since we are trying to predict scores at abrupt points of a match, we may not have seen such values in training; this reduces to the problem of handling missing values, at which decision trees are quite effective. Hence our model gives better generalization error than the state-of-the-art DLS method for the first innings. The model for the 2nd innings falls short because it relies heavily on 2nd-innings data and uses only the target as an additional feature to predict the chasing team's score, unlike what the DLS method does. Hence DLS remains better for the 2nd innings.

In a nutshell, the project was a successful one. We achieved our core objective of showing the bias introduced by the DLS method's smaller feature set. For the first innings, the model we developed gave better results than the existing state-of-the-art method. We tested the performance of regression algorithms on the features we built and found that decision trees and ensemble methods worked best. We also found that the lack of proper features impacts a model negatively, as in the case of the 2nd innings, where the ML model fell short of the DLS method. Future work for this project includes building a robust model for the 2nd innings that considers more features and gives a better approximation of the score.

7 References

[1] Srinivas Bhogle. The Duckworth Lewis Factor. March 6, 2003.
[2] Duckworth-Lewis method. Wikipedia.
[3] Sethuraman, Parameswaran Raman, and Vijay Ramakrishna. A Learning Algorithm for Prediction in the Game of Cricket.
[4] Stylianos Kampakis and William Thomas. Using Machine Learning to Predict the Outcome of English County Twenty Over Cricket Matches. 2015.
[5] Albina Yezus. Predicting Outcome of Soccer Matches Using Machine Learning. Saint Petersburg State University, 2014.
[6] Xiaowei Liang, Zhuodi Liu, and Rongqi Yan. Result Prediction for European Football Games. 2015.
[7] ESPN Cricinfo. www.espncricinfo.com
[8] pandas. pandas.pydata.org
[9] scikit-learn. scikit-learn.org
[10] Sapan H. Mankad, Anuj Chaudhary, Nikunj Dalsaniya, and Vivek Mandir. Study and Analysis of Duckworth Lewis Method. Nirma University, 2014.