Glucose Prediction Data Analytics for Diabetic Patients Monitoring
Geshwaree Huzooree
Department of IT, Charles Telfair Institute, Moka, Mauritius
geshwaree.huzooree@telfair.ac.mu

Kavi Kumar Khedo
Department of Computer Science, University of Mauritius, Reduit, Mauritius
k.khedo@uom.ac.mu

Noorjehan Joonas
Ministry of Health & Quality of Life, Victoria Hospital, Candos, Mauritius
njoonas@intnet.mu

Abstract— Diabetes Mellitus (DM) is one of the leading health complications around the world, causing a national economic burden and low quality of life. This increases the need to focus on prevention and early detection to improve the management and treatment of diabetes. The aim of this paper is to present a comprehensive critical review of recent glucose prediction models and to propose a best-fit model, based on that evaluation, to perform data analytics in a wireless body area network system. The proposed glucose prediction algorithm is based on an autoregressive exogenous (ARX) model which considers exogenous inputs such as CGM data, blood pressure (BP), total cholesterol (TC), low-density lipoprotein cholesterol (LDL) and high-density lipoproteins (HDL). A dataset of 442 diabetic patients is used to evaluate the performance of the algorithm through mean absolute error (MAE), root-mean-square error (RMSE), and coefficient of determination (R2). The experimental results demonstrate that the proposed prediction algorithm can improve the prediction accuracy of glucose. Potential research work and challenges are pointed out for further development of glucose prediction models.

Keywords— Diabetes; glucose prediction; type 1 diabetes; type 2 diabetes; autoregressive exogenous algorithm

I. INTRODUCTION

Continuous glucose monitoring (CGM) devices are becoming the new state-of-the-art method, providing a tidal wave of real-time information because glucose is read every five minutes. Consequently, both patients and physicians have to process around 9000 readings [1] and interpret the huge amount of data to adjust the insulin doses and to keep the blood glucose level (BGL) as close to normal as possible [2]. This exponential increase in data highlights a high need to mine the data and to predict the patients’ BGL for effective monitoring. A recent review on data mining technologies for diabetes [3] pointed out that mined data will bring value in (i) interpreting and predicting the long-term glycemic status of the patients, (ii) identifying important predictors of glucose control and diabetic complications and (iii) generating alerts whenever hypo- and hyperglycemic events are forecasted [4]. Moreover, the patients can also use these data to change or improve their daily lifestyles [5].

Deviating from the defined range of BGL can lead to short-term effects (for example, nausea, fever and coma) and long-term effects (for example, heart attack, kidney failure, blindness and stroke) on the body [6] [7] [2]. The ability to predict impending hypo- and hyperglycemic events before they occur would (i) enable preventive intervention through self-disease management such as healthy nutrition and physical activity, (ii) greatly improve overall BGL control through monitoring of BGL and (iii) be a key step in improving health outcomes and quality of life of patients [2] [7].

Big data, or huge-volume data processing, technologies include machine learning algorithms, natural language processing algorithms, predictive modeling and other artificial-intelligence-based techniques [8]. Within this context, data-driven modeling techniques, along with different combinations of data acquired from various sources, can be used to provide clinically meaningful output [9].

Glucose prediction models can be classified in different ways, such as Compartmental Models (CMs) and Data-Driven Models [10]. Recently, data-driven models have gained popularity due to their simpler structure, ability to process significant amounts of data in real time, and ease of personalization [11] [12]. The majority of proposed methods have been based on autoregressive (AR) [13], [14], [15], [16] or AR moving average (ARMA) models of various orders [15] and, more recently, on artificial neural networks (ANNs). ARMA models compute current values as a linear combination of past values and past prediction errors. Several versions [17], [18] exist where additional inputs can also be taken into account (AR model with external insulin input [ARX]). Although AR and ARMA models are linear, they are used for glucose prediction because of their simple structure and relatively comprehensive behavior. ANNs, on the other hand, are nonlinear models that have been successfully applied in complex, nonlinear problems. Depending on their architecture, ANNs are mainly classified into feed-forward and feedback (recurrent) networks. Other ANN techniques, such as multilayer perceptron (MLP) [19], [20], neural networks (NN) [21], radial basis function (RBF) NN [22], wavelet NN [23] and neurofuzzy techniques [12], have been successfully applied for the simulation of glucose dynamics. Furthermore, glucose prediction models based on Gaussian processes [2] [24] and support vector regression (SVR) [2], [6] have been developed. Moreover, hybrid models based on the combined use of CM and data-driven modelling techniques such as AR [11], cARX [4], SVR [25], and Self-Organizing Map (SOM) [9] have produced promising results.

An accurate glucose prediction model that warned people up to an hour in advance of imminent changes in their blood glucose levels would allow plenty of time for them to take preventive
978-1-5386-3831-6/17/$31.00 ©2017 IEEE

action. Several glucose prediction models have been developed in the past, and comparing them is quite challenging due to prominent differences concerning the adopted CGM sensors, sampling time, preprocessing algorithms possibly used to denoise the data, and size of the datasets.

Consequently, this paper conducts a literature survey on the recent glucose prediction models and evaluates them to select the most suitable one for the proposed diabetic patient monitoring system.

II. RELATED WORK

In order to prevent, detect, and manage diabetes, many models have been developed. Real-time algorithms have been developed for calibration, filtering of noisy signals and glucose predictions for hypoglycemic and hyperglycemic alarms [26]. Risk prediction models and early diagnosis models are developed for the management of T2DM. Similarly, glucose prediction models and closed-loop glucose controllers are developed for the management of T1DM. Models for risk predictions of long-term complications are developed to manage both T1DM and T2DM. An exhaustive review to evaluate all the models is outside the scope of this paper and, in the remainder of this section, the focus is restricted to new technologies to monitor glucose in pervasive systems for diabetics rather than traditional glucose monitoring. Thus, the glucose prediction models reviewed here are those useful for proposing the data analytics algorithm for the proposed WBAN system, which is described in Section VI.

A. Autoregressive Models

Sparacino et al., [13] predicted glucose levels ahead of time using continuous monitoring data. The first algorithm was based on a first-order polynomial and the other one on autoregressive (AR) models with time-varying parameters. In a more recent study, Zanderigo et al., [27] evaluated the models using continuous glucose error grid analysis (CG-EGA), showing that their performance is similar, yet the AR model appeared to be better at predicting hypoglycemia in time for prevention.

Reifman et al., [14] proposed a predictor based on the data-driven AR(10) model, with time-invariant and subject-invariant parameters identified by regularized least squares. The CGM measurement rate was one sample per minute. The results demonstrated that the predictions for a prediction horizon (PH) of 30 minutes were quite accurate.

Finan et al., [17] proposed ARX models using information on past glucose, insulin, and carbohydrates (CHOs) consumed. Batch and recursive methods of identification were examined, and the models were assessed under normal conditions and conditions of reduced insulin sensitivity.

Eren-Oruklu et al., [15] proposed prediction algorithms based on AR(3) and on autoregressive moving average (ARMA(3,1)) models exploiting past CGM readings solely, with time-varying parameters identified by recursive least squares, using a forgetting factor which could be changed according to the glucose trend. The model resulted in average relative absolute deviation (RAD) values of 2.6% and 3.7% for healthy and diabetic patients, respectively, at a PH of 30 min. As a continuation of this work, Eren-Oruklu et al., [28] tested the ability of this method to provide early hypoglycemic alarms and reduce false alarms, and thus achieve increased specificity.

Gani et al., [16] developed AR(30) models using three approaches: (1) raw data and ordinary least squares; (2) smoothed data and ordinary least squares; and (3) smoothed data and regularized least squares with time-invariant parameters identified. The model is used to make short-term, 30-min-ahead BGL predictions. As a continuation of this work [5], a universal glucose model that allows for clinical use of predictive algorithms and CGM devices for proactive therapy of diabetic patients was shown to be feasible. In a more recent study, Lu et al., [29] presented a real-time implementation of the previously developed [16] offline data-driven algorithm. The implementation consisted of a Kalman filter for real-time filtering of CGM data and a data-driven AR model for BGL prediction.

Fanmao and Youqing [18] developed an expectation maximization (EM) algorithm based on the ARX model to identify the diabetic model of two inpatients with T1DM, whereby the delay of insulin absorption can vary at every sample.

B. Artificial Neural Network (ANN) Models

A time-lagged feedforward ANN trained with backpropagation was proposed by Pappada et al., [30]. Despite efficient prediction of euglycemia and hyperglycemia, hypoglycemia tended to be overestimated. Eskaf et al., [31] conducted a pilot study on a limited dataset consisting of only one patient, using a feedforward ANN model and modeling the glycemic level as a dynamic system. Another feedforward ANN trained with the backpropagation Levenberg–Marquardt algorithm was presented by Pérez-Gandía et al., [32] for the online prediction of future glucose concentration levels from CGM data for T1DM patients. It proved more accurate than another algorithm based on an AR model. Zarkogianni et al., [12] presented a personalized glucose prediction model based on neuro-fuzzy techniques. Moreover, wavelets are applied as activation functions to enhance the prediction performance and avoid local minima during the training stage. The model was able to capture the metabolic behavior of a patient and handle intra- and inter-patient variability.

C. Support Vector Regression (SVR) and Gaussian Models

Bunescu et al., [2] described an SVR model to predict the blood glucose level using generic physiological models. These models required up to seven specific input features to make reliable predictions. Reymann et al., [6] presented an algorithm for mobile platforms to predict the BGL of a patient with a CGM device. The algorithm can easily be integrated into existing CGM readers or process the data transmitted by a CGM sensor. The algorithm was designed with data from an online simulation tool, followed by an evaluation with real patient data on a mobile platform. Efendic et al., [24] used Gaussian Mixture Models (GMM) to make short-term predictions of BGL.
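
The AR(p) predictors reviewed in Section II-A share one core step: coefficients are fitted by least squares on past CGM samples and the one-step predictor is then iterated until the prediction horizon is reached. The following is a minimal sketch of that step only, under stated assumptions: the function names (`fit_ar`, `predict_ahead`, `solve`), the synthetic noise-free series and the 5-minute sampling interval are illustrative and are not taken from any of the cited studies, whose regularization, smoothing and time-varying parameter schemes are not reproduced here.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_ar(series, p):
    """Ordinary least squares fit of y[t] = sum_i a[i] * y[t-1-i], i = 0..p-1."""
    rows = [[series[t - 1 - i] for i in range(p)] for t in range(p, len(series))]
    targets = series[p:]
    # Normal equations: (X^T X) a = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(p)]
    return solve(xtx, xty)


def predict_ahead(series, coeffs, steps):
    """Iterate the one-step AR predictor `steps` times.

    With an assumed 5-minute sampling interval, steps=6 is a 30-minute horizon.
    """
    history = list(series)
    for _ in range(steps):
        nxt = sum(a * history[-1 - i] for i, a in enumerate(coeffs))
        history.append(nxt)
    return history[-1]


# Synthetic, noise-free AR(2) series (deviations from a baseline); illustration only.
true_coeffs = [1.5, -0.7]
cgm = [10.0, 12.0]
for _ in range(60):
    cgm.append(true_coeffs[0] * cgm[-1] + true_coeffs[1] * cgm[-2])

coeffs = fit_ar(cgm, p=2)
forecast = predict_ahead(cgm[:50], coeffs, steps=6)  # predicts sample index 55
```

On real CGM data the series would be noisy, so the fitted coefficients would only approximate the underlying dynamics, which is why several of the reviewed studies add smoothing or regularization.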
TABLE I. CLASSIFICATION PERFORMANCE OF GLUCOSE PREDICTING MODELS

Author | Models | Input Parameters | Patients (Period) | Evaluation [PH/RMSE; TL]

Autoregressive Models
Sparacino et al. (2007) | AR(1) models with time-varying parameters [13] | CGM data | 28 T1DM (2 days) | [30/18.78; TL of 3.79 min], [45/34.64]
Zanderigo et al. (2007) | AR(1) models with time-varying parameters [27] | CGM data | 28 T1DM (2 days) | [30/18.78; TL of 3.79 min]
Reifman et al. (2007) | AR(10) model [14] | CGM data | 9 T1DM (5 days) | [30/22.3], [60/35.0], [120/53.8]
Finan et al. (2009) | Autoregressive exogenous input (ARX) models [17] | CGM data, Insulin, CHO ingested | 9 T1DM (2 to 8 days) | Both approaches resulted in RMSEs: [30/26], [45/45], [60/60]
Eren-Oruklu et al. (2009) | AR(3) and ARMA(3,1) models [15] | CGM data | 14 T2DM (2 days) | 30/3.83a
Gani et al. (2009) | AR(3) models [16] | CGM data | 9 T2DM (5 days) | [30/1.8; TL of 0.2 min], [60/14.4; TL of 12.3 min], [90/28.8; TL of 38.4 min]
Gani et al. (2011) | AR(3) models [5] | CGM data | 27 T1DM, 7 T2DM (4000 min) | [30/3.6; TL of 0.3 min]
Lu et al. (2011) | AR(6) model [29] | CGM data | 27 T1DM, 7 T2DM (4000 min) | [10/8.97; TL of 2.5 min], [20/16.06; TL of 9.26 min]
Fanmao and Youqing (2016) | ARX model [18] | CGM data, Insulin, meal | 2 T1DM (2 days) | RMSE: 0.02

Artificial Neural Network (ANN) Models
Eskaf et al. (2008) | Feed-forward ANN model [31] | CGM data, food intake, daily activities | 1 T1DM (7 days) | RMSE: ranges from 3.6 to 22.284
Pérez-Gandía et al. (2010) | Feed-forward ANN model [32] | CGM data | 15 T1DM (72 hrs/week over 4 weeks) | [15/10; TL of 4 min], [30/18; TL of 12 min], [45/27; TL of 20 min]
Pappada et al. (2011) | Feed-forward ANN model [30] | CGM data, Insulin, Nutrition, Lifestyle/Emotional states, Time | 10 T1DM | [75/43.9]
Behara et al. (2014) | MLP and BN [20] | Blood pressure, gender, alcohol consumption, and fasting glucose | 1825 patients | MLP (RMSE): 0.24; BN (RMSE): 0.37
Zarkogianni et al. (2014) | Neurofuzzy (applying wavelets as activation functions) [12] | CGM data, Energy expenditure | 6 T1DM (7 to 15 days) | [15/14.42], [30/20.20], [45/24.79], [60/28.49]

Support Vector Regression (SVR) Models
Bunescu et al. (2013) | SVR [2] | CGM data, Insulin, Meals | 10 |
Reymann et al. (2016) | SVR [6] | CGM data, CHO intakes, Insulin | 5 T1DM (25 days) | Error rate: 30/19%

Hybrid and Fusion Models
Georga et al. (2016) | QKLMS-FB [33] | CGM data | 15 T1DM | [5/4.6], [15/11.1], [30/18.7], [45/24.7], [60/30.0]
Mougiakakou et al. (2008) | Hybrid model based on CM and RNN [34] | Insulin, Food intake, CGM data | 9 T1DM (10 days) | [30/18.34]
Daskalaki et al. (2012) | Real-time adaptive models (fusion of RNN and AR) [11] | CGM data and Insulin | 23 T1DM (8 days) | [15/11.9], [30/18.9], [45/26.1]
Georga et al. (2013) | Hybrid model based on CM and SVR [25] | CGM data, Insulin, CHO, Exercise Data, Time | 27 T1DM (5 to 22 days) | [15/5.21], [30/6.03], [60/7.14], [120/7.62]
Zarkogianni et al. (2013) | Hybrid glucose-insulin metabolism model based on CM and SOM [9] | CHO ingested, Insulin, CGM data | 12 T1DM (10 days) | [30/14.10], [60/23.19]
Botwey et al. (2014) | cARX and RNN [4] | CGM data | 23 T1DM | [30/26.5; TL of 20 min]

PH: Prediction Horizon (min), RMSE: Root mean square error (mg/dl), TL: Time Lags, a: mean absolute percentage error
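
Table I reports, for each prediction horizon, an RMSE and in some cases a time lag (TL) between predicted and measured glucose. One possible way to estimate such a lag is sketched below: the predicted series is shifted against the measured one and the shift with the smallest RMSE is kept. This definition is an assumption made for illustration, and the reviewed studies may compute TL differently; the function names and the 5-minute sampling interval are likewise hypothetical.

```python
import math


def rmse(pred, actual):
    """Root mean square error between two equally long sequences."""
    n = len(actual)
    return (sum((p - a) ** 2 for p, a in zip(pred, actual)) / n) ** 0.5


def time_lag(pred, actual, max_shift, sample_min=5):
    """Return the delay (in minutes) that best aligns `pred` with `actual`.

    For each candidate shift k, pred[t + k] is compared with actual[t];
    the k with the smallest RMSE is taken as the lag (assumed definition).
    """
    best_k, best_err = 0, float("inf")
    for k in range(max_shift + 1):
        err = rmse(pred[k:], actual[: len(actual) - k])
        if err < best_err:
            best_k, best_err = k, err
    return best_k * sample_min


# Synthetic example: the "prediction" is the measured signal delayed by 2 samples.
actual = [100 + 30 * math.sin(0.3 * t) for t in range(40)]
pred = [actual[0], actual[0]] + actual[:-2]

lag_minutes = time_lag(pred, actual, max_shift=6)
```

In this synthetic case the estimated lag is two samples, i.e. 10 minutes at the assumed sampling interval.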
D. Hybrid and Fusion Models

Mougiakakou et al., [34] presented a hybrid model based on the combined use of CMs and RNN. In [11], Daskalaki et al. presented the fusion of real-time adaptive models of recurrent neural networks (RNN) and AR models. Botwey et al., [4] developed a multi-model data fusion to improve an early warning system for hypo-/hyperglycemic events. The study shows that, compared to the cARX and RNN models and a linear fusion of the two, the proposed fusion schemes represent a significant improvement.

Georga et al., [25] presented a hybrid model based on the combined use of CMs and SVR for the multivariate prediction of the subcutaneous glucose concentration in patients with T1DM. The study points out that the availability of multivariable data and their effective combination can significantly increase the accuracy of both short-term and long-term predictions.

Zarkogianni et al. [9] presented a personalized glucose-insulin metabolism model based on the combined use of CMs and a SOM. The model demonstrated the ability to capture the patients’ metabolic behavior and to handle intra- and inter-patient variability.

III. EVALUATION OF DATA ANALYTICS ALGORITHMS FOR THE WBAN SYSTEM

Table I summarizes the classification performance of the glucose prediction models discussed in Section II. For each model, the method, input parameters, types of diabetic patients and monitoring period are presented. Common input parameters include CGM data, insulin infusion rates, insulin dosages, and lifestyle/emotional states such as carbohydrate (CHO) ingested, nutritional intake, exercise type and duration. Moreover, the evaluation column presents the performance of each model in terms of root mean square error (RMSE) and any reported time lags (TL).

Although a direct comparison of the glucose predictive algorithms’ performance is not feasible due to the different models applied, input parameters, datasets and evaluation methodologies used, several prominent conclusions can still be drawn in order to select the most appropriate one for the proposed WBAN.

As shown in Table I, most of the glucose prediction models developed are aimed at T1DM patients and very few have been developed and evaluated for T2DM patients. Therefore, there is a high need to apply more sophisticated techniques to capture the metabolic behavior of a patient with both T1DM and T2DM, and ultimately predict their glucose levels.

Moreover, it can be observed that the predictive performance or accuracy of almost all the algorithms declines (higher RMSE) as the prediction horizon (PH) increases, indicating that most models have yet to be improved in order to make long-term predictions.

Many models have been developed based on the AR, ARMA and ARX techniques. However, AR models cannot reflect the relationship between glucose and inputs such as carbohydrate and insulin; therefore ARX models [18], [35], [11], [17] have further been developed and evaluated.

It can be observed that the raw CGM data are filtered to remove high-frequency noise spikes. Moreover, despite the adaptive nature of the model, the lags between predicted and measured values were quite significant and the study does not report whether further lags are introduced by the Butterworth filter. Furthermore, the model uses an AR model of order one, which is inadequate to capture the temporal variations of the time-series glucose signals. Even though the model can produce clinically acceptable predictions, it is individual-specific and needs to be adapted to every individual, thus decreasing its practicality [13]. Even though raw CGM time-series data were used in the AR(10) model with fixed coefficients, the time lags were relatively large, further reducing its practical applicability and clinical benefits [14]. The application of more informative input parameters, such as quantitative measures of lifestyle data for exercise and stress levels, could result in better predictive performance [17]. Gani et al., [16] [5] attempted to predict glucose levels with acceptable time lags as compared to the model in [14]. However, their entire time series had been previously smoothed, which is never the case in real time.

Glucose prediction models based on ANNs presented so far did not outperform the simpler strategies based on time-series modeling. For the same dataset, the results in [32] are roughly the same as those of [13]. Moreover, the results in [30] indicate that the NN described therein does not outperform the NN in [32]. Additionally, the performance of the MLP model with 23 hidden layers was found to be better than that of BN [20].

For 30-min predictions, [16] and [5] outperform all the other methods proposed in terms of lowest RMSE and almost no time lags. However, both methods are based only on CGM data segments and are described by a stationary process. Thus, more comprehensive input parameters could result in better predictive performance. The accuracy of the prediction models could be further optimized using larger datasets. Moreover, besides evaluating the RMSE, it is also important to compute the mean absolute percentage error to compare the accuracy with other related models. The other model that achieved lower RMSE values than the remaining glucose prediction models is the hybrid model based on CM and SVR [25], which demonstrates the need to apply advanced data analytics and modeling approaches based on hybrid models.

Almost all the models based on SVR [2], [25] and [36] require a physiological model for glucose prediction, do not include an unobtrusive mobile application and do not trigger any alarms to the user whenever a threshold is exceeded, unlike the algorithm developed in [6].

For this research, an ARX model seems to be more suitable for the proposed WBAN system [37]. The ARX model reflects the relationship between glucose and exogenous inputs by including input signals such as BP, TC, HDL and LDL in the model structure. The data analytics algorithm is based on the ARX model time-series prediction process to predict the glucose concentration of the patient and also to perform continuous monitoring.
IV. DOMAIN ANALYSIS

According to the International Diabetes Federation (IDF), in 2015, 418 million people had Diabetes Mellitus (DM) in the world and 78 million people in the SEA Region were suffering from DM, while it was estimated that by 2040 this number will rise to 642 million. There were 220,000 cases of DM in Mauritius in 2015 and the undiagnosed cases reached up to 113 million. In 2015, 2,932 deaths were attributed to DM, while the associated cost per person with diabetes was estimated at USD 500 [38]. The IDF estimated that Mauritius had a DM prevalence rate of 24.3% in 2015. The recent NCD survey report highlighted that the rapid increase in DM in Mauritius indicated an austere future in terms of health complications, associated disorders, and the rising costs of health care. Hence, urgent measures and research are required both for prevention and treatment of DM to shift the emphasis from the disease to wellness [39].

The exponential rise in medical data results in complex and time-consuming data processing and interpretation for both patients and physicians. DM prevention, diagnosis, risk reduction, decision making and timely intervention are not trivial tasks since it is very challenging to process the acquired data and information into useful knowledge. Recent advancements in body area networks coupled with data analytics have led to many promising developments in personalized, predictive, preventive, and participatory healthcare approaches [10] [40]. Recent glucose monitoring technologies combined with smart data analytics can allow patients to continuously monitor their BGL and ultimately optimize self-management of DM, ensure better health monitoring, reduce the risk of further complications and, consequently, enhance the quality of healthcare in Mauritius. The development of glucose prediction models can eventually process the patients’ data and information into clinically useful advanced knowledge to facilitate the appropriate patient reaction in crucial situations.

V. DESIGN ARCHITECTURE

The design architecture of the system is based on the Wireless Body Area Network (WBAN) proposed by Huzooree et al., [37].

Fig.1. High-level WBAN architecture.

As shown in Fig 1, both invasive and non-invasive sensors are used to sense the vital signs, represent the data (pre-processing, compression and filtering), and transfer the sensed data over short-range wireless communication to the patient’s Android smartphone. The smartphone then aggregates and stores the sensed data, processes the energy consumption, provides the healthcare monitoring interface to the patients for logging and also sends the physiological data to the medical server at a specified time interval, where the physicians can directly access it for further analysis, diagnosis and intervention.

VI. PROPOSED GLUCOSE PREDICTION MODEL FOR DATA ANALYTICS FOR THE WBAN SYSTEM

Based on the evaluation of the models in Section III, the ARX model seems more appropriate for the proposed WBAN system since, unlike the AR model, it can accommodate exogenous inputs, which are necessary inputs to a practical application for glucose control. A high-level glucose prediction model for data analytics is depicted in Fig 2.

Fig.2. High-level glucose prediction model for data analytics.

At the sensor tier, the sensor devices collect the patients’ physiological parameters (such as blood glucose, blood pressure and blood oxygen saturation) and transmit them via Bluetooth to the mobile computing tier. In this tier, the sensed data from the sensors and other input data from the patients are processed as a dataset (as shown in Table II) and are then transmitted to the remote server tier via the Internet. At the remote server, data selection, cleaning, filtering and transformation are the main steps of data pre-processing prior to performing data analytics. During the data analytics process, the model is first trained on the data and then tested. After the model is validated, it can be hosted on the remote server to predict real-time glucose analytics and trigger alarms to the physicians or patients whenever the threshold is exceeded. This paper focuses only on the glucose prediction model in the data analytics process.

Present glucose values can be estimated as linear combinations of past glucose measurements using the ARX model expression. The general form of the ARX model can be described as eq (1)

$$ y_t = \sum_{i=1}^{p} a_i\, y_{t-i} + \sum_{j=1}^{q} b_j\, x_{t-j} + \varepsilon_t \qquad (1) $$

where $y_t$ is the glucose value at sampling instant $t$, $x_t$ is the exogenous input, $a_i$ and $b_j$ are the model coefficients, $p$ and $q$ are the model orders and $\varepsilon_t$ is white noise.
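
The general ARX expression referred to as eq (1) can be illustrated with a one-step-ahead predictor. The sketch below assumes the standard textbook ARX form, in which the next output is a weighted sum of the last p outputs and the last q exogenous input samples; the function name `arx_predict`, the argument layout and the numeric values are hypothetical and serve only to show how the terms combine.

```python
def arx_predict(y_hist, x_hist, a, b):
    """One-step-ahead ARX prediction (noise term omitted).

    y_hist: past outputs, most recent last (at least len(a) values)
    x_hist: past exogenous inputs, most recent last (at least len(b) values)
    a[i] multiplies y[t-1-i]; b[j] multiplies x[t-1-j]  (assumed convention)
    """
    ar_part = sum(a_i * y_hist[-1 - i] for i, a_i in enumerate(a))
    exo_part = sum(b_j * x_hist[-1 - j] for j, b_j in enumerate(b))
    return ar_part + exo_part


# Illustrative numbers only: two autoregressive terms, one exogenous term.
y_next = arx_predict(y_hist=[100.0, 110.0], x_hist=[1.0, 2.0],
                     a=[0.5, 0.2], b=[3.0])
# 0.5 * 110 + 0.2 * 100 + 3.0 * 2.0
```

Setting `b` to an empty list reduces the predictor to a plain AR model, which is the sense in which ARX extends AR with exogenous inputs.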
Equation (1) can be adapted to accommodate the exogenous inputs such as BP, TC, LDL and HDL. The glucose prediction model to perform data analytics for this research is described below as

$$ y_t = \sum_{i=1}^{p} a_i\, y_{t-i} + b_1 x_{1,t} + b_2 x_{2,t} + \cdots + b_m x_{m,t} + \gamma + \varepsilon_t \qquad (2) $$

In (2), the glucose concentration is denoted $y_t$ at sampling interval $t$. $x_{1,t}$ to $x_{m,t}$ denote the independent variables from which the glucose measurement is to be predicted. The value of the exogenous inputs in period $t$ is denoted by $x_t$. For example, $x_t$ could be BP, TC, LDL and HDL. $\gamma$ is a constant disturbance, the white Gaussian noise is denoted by $\varepsilon_t$ and $t$ is the discrete-time sampling instant, i.e., $t = 1, 2, \ldots$

The coefficients of the glucose concentration ($a_i$) and of the exogenous inputs ($b_1$ to $b_m$) are identified using the least-squares ARX method.

VII. DATA ANALYSIS AND EVALUATION

The data used in this research are from the diabetes dataset used by Efron et al., [41]. This dataset includes 442 sets of data recorded on diabetes patients. Table II is a representation of the dataset used in this research.

TABLE II. DATASET CHARACTERISTICS

Input Parameters | Values | Input Parameters | Values
Patient Baseline Characteristics | | Descriptive Statistics of the Glucose Dataset |
Gender | 235F / 207M | Total Cholesterol (TC) | 189±34.6
Age (a) | 48.5±13 (a) | Low-density lipoprotein Cholesterol (LDL) | 115±30.4
BMI | 26.4±4.4 | High Density Lipoproteins (HDL) | 49.8±12.9
Blood Pressure | 94.6±13.8 | Ratio (TC/HDL) | 4.0±1.29
Glucose | 91.2±11.5 | |

(a) Data are mean±standard deviation values

Moreover, an ideal TC is < 200 mg/dl. A borderline high TC ranges 200-239 mg/dl. A high TC is >= 240 mg/dl. An ideal LDL is < 100 mg/dl, a close to ideal LDL ranges 100-129 mg/dl, a borderline high LDL ranges 130-159 mg/dl, a high LDL ranges 160-189 mg/dl, and a very high LDL is >= 190 mg/dl. A low HDL is < 40 mg/dl, a normal HDL ranges 40-59 mg/dl and a best HDL is >= 60 mg/dl. A glucose concentration value ≤ 70 mg/dl is defined as hypoglycemic and a glucose concentration value ≥ 180 mg/dl is defined as hyperglycemic. Whenever the set threshold is reached, an alarm is triggered: the patient receives a warning message on their mobile phone and, similarly, the physician receives a warning message on the remote server.

To evaluate the prediction performance of the model, the metrics mean absolute error (MAE), root-mean-square error (RMSE), and coefficient of determination (R2) are applied.

The MAE is the average absolute error between the predicted BG level and the actual BG level. In (3), the predicted BG concentration is denoted by $g$, the actual BG concentration is represented by $a$ and $n$ is the number of BG measurements taken.

$$ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| g_i - a_i \right| \qquad (3) $$

RMSE measures the standard deviation of the differences between the predicted BG level ($g$) and the actual BG level ($a$).

$$ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( g_i - a_i \right)^2} \qquad (4) $$

Furthermore, $R^2$ measures the proportion of the variance of the BG level, which is the dependent variable, that can be predicted from the independent variables. Here $\bar{a}$ is the mean of the actual BG levels.

$$ R^2 = 1 - \frac{\sum_{i=1}^{n} \left( a_i - g_i \right)^2}{\sum_{i=1}^{n} \left( a_i - \bar{a} \right)^2} \qquad (5) $$

The dataset was divided into 2 sets: the first set consisted of 330 records used to train the model and the remaining 112 records were used to evaluate whether the model was able to predict the glucose measurements. Fig 3 shows the actual and fitted graph and demonstrates a clear correlation between the actual and predicted glucose levels. Moreover, the Residual Q-Q Plot in Fig 4 demonstrates that the model is fit for glucose prediction. Finally, the performance metrics resulted in an MAE of 7.44, an RMSE of 9.28 and an R2 of 0.342, which indicate that the model can be used for glucose prediction.

Fig.3. Actual vs Fitted Graph

Fig.4. Q-Q Plot
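
The three evaluation metrics referred to as (3), (4) and (5) can be computed in a few lines. The sketch below assumes the standard definitions of MAE, RMSE and the coefficient of determination; the function names and the toy values are illustrative and are not the paper's data or results.

```python
def mae(pred, actual):
    """Mean absolute error between predicted and actual values."""
    return sum(abs(g - a) for g, a in zip(pred, actual)) / len(actual)


def rmse(pred, actual):
    """Root-mean-square error between predicted and actual values."""
    return (sum((g - a) ** 2 for g, a in zip(pred, actual)) / len(actual)) ** 0.5


def r_squared(pred, actual):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - g) ** 2 for g, a in zip(pred, actual))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only.
predicted = [2.0, 4.0, 6.0]
measured = [1.0, 4.0, 8.0]

scores = {
    "MAE": mae(predicted, measured),
    "RMSE": rmse(predicted, measured),
    "R2": r_squared(predicted, measured),
}
```

MAE and RMSE are expressed in the units of the glucose measurements, whereas R2 is unitless, so the three values are not directly comparable with one another.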
The second set of experiment was used to predict the glucose The proposed prediction model have been evaluated by
measurements. Fig 5 shows that all the forecasted values falls simulation experiments and the results demonstrate that the
within the 95% interval of confidence therefore demonstrating proposed model improves the prediction accuracy. This model
that the model is reliable. will allow preventive intervention through self-disease
management and improve glucose control. As future work, this
work can be extended to analyze and further optimize the
performance of the glucose prediction algorithm by using bigger
prediction horizon.
REFERENCES
[1] E. Aboufadel, R. Castellano, and D. Olson, “Quantification of the
variability of continuous glucose monitoring data,” Algorithms, vol.
4, no. 1, pp. 16–27, 2011.
[2] R. Bunescu, N. Struble, C. Marling, J. Shubrook, and F. Schwartz,

“Blood Glucose Level Prediction Using Physiological Models and
Support Vector Regression,” 2013 12th Int. Conf. Mach. Learn.
Appl., vol. 1, pp. 135–140, 2013.
Fig. 5. Glucose Prediction Graph

The proposed model can benefit diabetic patient management: patients
can easily predict their glucose level from different exogenous inputs,
which allows them to self-manage their condition, thereby decreasing
hospital expenses and improving their quality of life.

VIII. FUTURE RESEARCH DIRECTIONS AND CHALLENGES

Despite recent advances in the development of glucose prediction
models for diabetes management, these models are not yet widely
accepted, for several reasons. Many challenges remain before a good
compromise between performance and accuracy can be achieved when
developing a glucose prediction model. Error Grid analysis can be used
to analyze the performance of several prediction algorithms and
identify the most effective modeling approach. Most models'
performance declines as the prediction horizon increases, which
increases the need for optimized algorithms. Many models consider only
CGM data for glucose prediction and exclude other exogenous inputs,
which consequently results in inaccurate data analysis and unreliable
decision making. Advanced data analytics and modeling approaches are
needed to detect correlations in the diverse range of raw data, to
extract clinically meaningful knowledge and to be widely accepted in
clinical practice. Moreover, limited research has been conducted on
glucose prediction models applicable to both Type 1 and Type 2
diabetes patients.

IX. CONCLUSION

Self-monitoring and remote monitoring of patients generate big data
covering a wide range of patient information, including medical,
lifestyle and social data. There is a strong need to exploit these
data effectively to better detect, prevent, diagnose, predict and
treat diabetes. Based on the evaluation of recent glucose prediction
models using autoregressive, ANN, SVR, Gaussian, hybrid and fusion
approaches, this paper proposes a glucose prediction model using ARX
to perform data analytics for a wireless body area network system.

[3] E. Georga and V. C. Protopappas, “Short-term vs. long-term
analysis of diabetes data: Application of machine learning and data
mining techniques,” Short-term vs. long-term Anal. diabetes data
Appl. Mach. Learn. data Min. Tech., 2013.
[4] R. H. E. Botwey, E. Daskalaki, P. Diem, and S. G. Mougiakakou,
“Multi-model data fusion to improve an early warning system for
hypo-/hyperglycemic events,” Conf. Proc. ... Annu. Int. Conf. IEEE
Eng. Med. Biol. Soc. IEEE Eng. Med. Biol. Soc. Annu. Conf., vol.
2014, pp. 4843–4846, 2014.
[5] A. Gani, A. V. Gribok, Y. Lu, W. K. Ward, R. A. Vigersky, and J.
Reifman, “Universal Models For Predicting Glucose Concentration
In Humans,” WO Pat. WO/2010/, vol. 14, no. 1, pp. 157–165, 2011.
[6] M. P. Reymann, E. Dorschky, B. H. Groh, C. Martindale, P. Blank,
and B. M. Eskofier, “Blood Glucose Level Prediction based on
Support Vector Regression using Mobile Platforms,” pp. 2990–2993,
2016.
[7] G. Shi, S. Zou, and A. Huang, “Glucose-tracking: A postprandial
glucose prediction system for diabetic self-management,” 2015 2nd
Int. Symp. Futur. Inf. Commun. Technol. Ubiquitous Heal.
Ubi-HealthTech 2015, pp. 9–17, 2015.
[8] N. Ramkumar, “Data Analysis for Chronic disease – Diabetes using
Map Reduce Technique,” 2016.
[9] K. Zarkogianni, E. Litsa, A. Vazeou, and K. S. Nikita, “Personalized
glucose-insulin metabolism model based on self-organizing maps
for patients with Type 1 Diabetes Mellitus,” 13th IEEE Int. Conf.
Bioinforma. Bioeng., pp. 1–4, 2013.
[10] K. Zarkogianni, E. Litsa, K. Mitsis, P. Wu, C. Kaddi, C. Cheng, M.
Wang, and K. Nikita, “A Review of Emerging Technologies for the
Management of Diabetes Mellitus,” IEEE Trans. Biomed. Eng.,
vol. PP, no. 99, p. 1, 2015.
[11] E. Daskalaki, A. Prountzou, P. Diem, and S. G. Mougiakakou,
“Real-Time Adaptive Models for the Personalized Prediction of
Glycemic Profile in Type 1 Diabetes Patients,” Diabetes Technol.
Ther., vol. 14, no. 2, pp. 168–174, 2012.
[12] K. Zarkogianni, K. Mitsis, A. Fioravanti, and K. S. Nikita,
“Neuro-Fuzzy based Glucose Prediction Model for Patients with Type 1
Diabetes Mellitus,” IEEE, pp. 252–255, 2014.
[13] G. Sparacino, F. Zanderigo, S. Corazza, A. Maran, A. Facchinetti,
and C. Cobelli, “Glucose concentration can be predicted ahead in
time from continuous glucose monitoring sensor time-series,” IEEE
Trans. Biomed. Eng., vol. 54, no. 5, pp. 931–937, 2007.
[14] J. Reifman, S. Rajaraman, A. Gribok, and W. K. Ward, “Predictive
monitoring for improved management of glucose levels,” J.
Diabetes Sci. Technol., vol. 1, no. 4, pp. 478–86, 2007.
[15] M. Eren-Oruklu, A. Cinar, L. Quinn, and D. Smith, “Estimation of
Future Glucose Concentrations with Subject-Specific Recursive
Linear Models,” Diabetes Technol. Ther., vol. 11, no. 4, pp. 243–253,
2009.
[16] A. Gani, A. V. Gribok, S. Rajaraman, W. K. Ward, and J. Reifman,
“Predicting subcutaneous glucose concentration in humans:
Data-driven glucose modeling,” IEEE Trans. Biomed. Eng., vol. 56,
no. 2, pp. 246–254, 2009.
[17] D. A. Finan, F. J. Doyle III, C. C. Palerm, W. C. Bevier, H. C.
Zisser, L. Jovanovic, and D. E. Seborg, “Experimental evaluation of
a recursive model identification technique for type 1 diabetes,” J.
Diabetes Sci. Technol., vol. 3, no. 5, pp. 1192–1202, 2009.
[18] Z. Fanmao and W. Youqing, “Dynamic model with time varying
delay for type 1 diabetes mellitus identified by using expectation
maximization algorithm,” pp. 9376–9381, 2016.
[19] S. A. Saji and K. Balachandran, “Performance analysis of training
algorithms of multilayer perceptrons in diabetes prediction,” Conf.
Proceeding - 2015 Int. Conf. Adv. Comput. Eng. Appl. ICACEA
2015, pp. 201–206, 2015.
[20] R. S. Behara, A. Agarwal, P. Pulumati, R. Jain, and V. Rao,
“Predictive modeling for wellness and chronic conditions,” Proc. -
IEEE 14th Int. Conf. Bioinforma. Bioeng. BIBE 2014, pp. 394–398,
2014.
[21] O. Karan, C. Bayraktar, H. Gümüşkaya, and B. Karlık,
“Diagnosing diabetes using neural networks on small mobile
devices,” Expert Syst. Appl., vol. 39, no. 1, pp. 54–60, 2012.
[22] G. Baghdadi and A. M. Nasrabadi, “Controlling Blood Glucose
Levels in Diabetics By Neural Network Predictor,” 29th Annu. Int.
Conf. IEEE Eng. Med. Biol. Soc., pp. 3216–3219, 2007.
[23] Z. Zainuddin, O. Pauline, and C. Ardil, “A Neural Network
Approach in Predicting the Blood Glucose Level for Diabetic
Patients,” pp. 1–8, 2009.
[24] H. Efendic, H. Kirchsteiger, G. Freckmann, and L. del Re,
“Short-term prediction of blood glucose concentration using interval
probabilistic models,” 22nd Mediterr. Conf. Control Autom., pp.
1494–1499, 2014.
[25] E. I. Georga, V. C. Protopappas, D. Ardigo, M. Marina, I. Zavaroni,
D. Polyzos, and D. I. Fotiadis, “Multivariate Prediction of
Subcutaneous Glucose Concentration in Type 1 Diabetes Patients
Based on Support Vector Regression,” Biomed. Heal. Informatics,
IEEE J., vol. 17, no. 1, pp. 71–81, 2013.
[26] B. W. Bequette, “Continuous glucose monitoring: real-time
algorithms for calibration, filtering, and alarms,” J. Diabetes Sci.
Technol., vol. 4, no. 2, pp. 404–18, 2010.
[27] F. Zanderigo, G. Sparacino, B. Kovatchev, and C. Cobelli, “Glucose
Prediction Algorithms from Continuous Monitoring Data:
Assessment of Accuracy via Continuous Glucose Error-Grid
Analysis,” J. Diabetes Sci. Technol., vol. 1, no. 5, pp. 645–651,
2007.
[28] M. Eren-Oruklu, A. Cinar, and L. Quinn, “Hypoglycemia prediction
with subject-specific recursive time-series models,” J. Diabetes Sci.
Technol., vol. 4, no. 1, pp. 25–33, 2010.
[29] Y. Lu, S. Rajaraman, W. K. Ward, R. A. Vigersky, and J. Reifman,
“Predicting human subcutaneous glucose concentration in real time:
A universal data-driven approach,” Proc. Annu. Int. Conf. IEEE
Eng. Med. Biol. Soc. EMBS, vol. 21702, pp. 7945–7948, 2011.
[30] S. M. Pappada, B. D. Cameron, P. M. Rosman, R. E. Bourey, T. J.
Papadimos, W. Olorunto, and M. J. Borst, “Neural Network-Based
Real-Time Prediction of Glucose in Patients with Insulin-Dependent
Diabetes,” Diabetes Technol. Ther., vol. 13, 2011.
[31] E. K. Eskaf, O. Badawi, and T. Ritchings, “Predicting blood glucose
levels in diabetics using feature extraction and Artificial Neural
Networks,” 3rd Int. Conf. Inf. Commun. Technol. From Theory to
Appl., pp. 1–6, 2008.
[32] C. Pérez-Gandía, A. Facchinetti, G. Sparacino, C. Cobelli, E. J.
Gómez, M. Rigla, A. De Leiva, and M. E. Hernando, “Artificial
neural network algorithm for online glucose prediction from
continuous glucose monitoring,” Diabetes Technol. Ther., vol. 12,
no. 1, pp. 81–88, 2010.
[33] E. I. Georga, J. C. Príncipe, D. Polyzos, and D. I. Fotiadis,
“Non-linear Dynamic Modeling of Glucose in Type 1 Diabetes with
Kernel Adaptive Filters,” pp. 5897–5900, 2016.
[34] S. Mougiakakou, A. Prountzou, K. Zarkogianni, C. Bartsocas, K.
Nikita, and A. Gerasimidi-Vazeou, “Prediction of glucose profile in
children with type 1 diabetes mellitus using continuous glucose
monitors and insulin pumps,” Horm. Res., vol. 70, pp. 22–23, 2008.
[35] C. Zhao and C. Yu, “Rapid Model Identification for Online
Subcutaneous Glucose Concentration Prediction for New Subjects
With Type I Diabetes,” IFAC Proc. Vol., vol. 19, no. 5, pp. 2094–2099,
2014.
[36] K. Plis, R. Bunescu, C. Marling, J. Shubrook, and F. Schwartz, “A
machine learning approach to predicting blood glucose levels for
diabetes management,” Mod. Artif. Intell. Heal. Anal. Pap. from
AAAI-14, 2014.
[37] G. Huzooree, K. K. Khedo, and N. Joonas, “Wireless Body Area
Network System Architecture for Real-Time Diabetes Monitoring,”
in ELECOM 2016, 2016.
[38] International Diabetes Federation, “International Diabetes
Federation,” 2015. [Online]. Available:
http://www.idf.org/membership/sea/mauritius.
[39] “Mauritius NCD Survey 2015 Report,” 2015.
[40] S. Ding and M. Schumacher, “Sensor Monitoring of Physical
Activity to Improve Glucose Management in Diabetic Patients: A
Review,” Sensors, vol. 16, no. 5, p. 589, 2016.
[41] B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, “Least angle
regression,” Ann. Stat., vol. 32, no. 2, pp. 407–499, 2004.

978-1-5386-3831-6/17/$31.00 ©2017 IEEE
