
A ROBUST PHYSICAL EXERCISE RECOGNITION SYSTEM

USING MACHINE LEARNING APPROACH

Abstract: Modern life is becoming more closely linked to our devices, and work is being done
in a more regulated way. As life becomes more complicated, it is increasingly challenging
to keep track of one's health and fitness, which may lead to unexpected illness and
disease. Moreover, the lack of activity monitoring and corresponding reminders hinders
the adoption of a healthier lifestyle. This research provides an effective approach for
identifying human activity using accelerometer data obtained from wearable devices.
The model automatically finds patterns among 33 different physical exercises, such as
running, rowing, cycling and jogging, and identifies them. Principal Component Analysis
(PCA) was applied to the statistical features in order to make the system more robust.
Classification of the physical exercises was performed on the reduced features using
WEKA. With 10-fold cross-validation, the K-Nearest Neighbor algorithm obtained an
overall accuracy of 85.51%, while Random Forest obtained 84%. This accuracy is higher
than that of the previous models compared, and the approach could help recognition
systems monitor user activity more precisely.

Keywords: Physical Activity Recognition, RealDisp Activity Recognition, Wearable Device

1. INTRODUCTION

Nowadays, people live and interact in ways that lack proper physical activity, and this
has brought adverse changes to their health. Not only has it prevented a better lifestyle,
but it has become a leading cause of many common chronic diseases such as obesity and
heart failure, and even of premature death [1]. In addition, less serious conditions
contribute to loss of muscle strength and severe fatigue [2]. To mitigate this issue, 3-5
hours of intense aerobic exercise is recommended by the United States Department of
Health [3]. Furthermore, with the rapid rise in the number of smart sensing devices, such
as smartphones and smart watches, various highly detailed applications linked to personal
healthcare monitoring, obesity management, interactive and experiential gaming, etc.,
have continually been improved to meet human standards [4]. In general, however, both
regularly fall short of this standard, for several significant reasons.

One of the apparent causes is the failure to monitor where and how many times an
activity occurs and the time spent on each activity. Instead of keeping count in one's
mind, technology has accomplished this task using small sensors inside IoT devices such
as smart watches and smartphones. If smart watches can employ a faster and more
efficient recognition system, it will reduce both the data consumed and the time taken.

The recognition systems in our devices detect activity in different ways. When a person
performs an activity, they repeat a combination of several basic movements. The running
of a child, for example, is very different from walking. In practice, each activity takes
only a few seconds to complete and record, and each basic movement can occur over any
given duration. The study of human activity recognition (HAR) is therefore crucial for
academic progress in this area.

The goal of this study is to find a better way, through feature extraction and feature
reduction, to produce accurate classification across a wide range of physical exercises.
The context explored is a gym setting where people performed 33 different physical
exercises such as walking, running and cycling. Features were extracted using adaptive
time series segmentation and then reduced to a lower dimension using PCA.
Computer Science and Engineering Research Journal

Instead of a regular fixed-size window, adaptive segments were used, which makes it more
likely that an activity is detected within the respective segment. The reduction algorithm
makes the system computationally efficient and robust in many scenarios. Finally, two
classifiers, a lazy learning algorithm and a tree ensemble, were used to evaluate
performance.

This paper is organized into five sections. Section 2 reviews previous related work.
Section 3 introduces our proposed model and discusses the experimental steps involved.
Section 4 presents the findings of our study. Lastly, Section 5 concludes the study and
addresses future work.

2. LITERATURE REVIEW

There are two approaches to tracking and counting human activity: the wearable sensor
approach and the smartphone-based approach. Vu Ngoc in [5] collected data for five
activities from a smartphone and recognized them with an accuracy of 74% for kNN and
75.3% for ANN. The activities and sensors were limited, and there was some ambiguity in
the collected data. In [6], Rui proposed a modified fully convolutional network based
algorithm which predicted human activity sequences on the Opportunity dataset and a
self-collected hospital dataset. In [7], a CNN was applied directly to 16 lower-limb
activities using 5 sensors, and the results were compared with those obtained using a
single sensor. The researchers in [8] used tenfold random-partitioning cross-validation
to evaluate system performance, and the number of features was limited to 3. To capture
repeated periods in activities, Banos et al. used a time window to segment the discrete
signal. The model failed to demonstrate reliable accuracy for accelerometer data when
many activities were involved; the accuracy obtained was a little more than 80%. Tuan Le
in [9] extracted relevant features from the raw data of 30 volunteers performing 6
activities and then used the IB3 method to improve accuracy by reducing the dimension.
A 15% improvement in accuracy was obtained with Naïve Bayes (91.85%) and Decision Tree
(96%). In [10], Yu-Liang Hsu applied the NWFE algorithm to data from two sensors located
at the wrist and the ankle; the model achieved an accuracy of 90.5% for 10 daily
activities. In [11], the task of inferring activities when travelling by metro was
explored. It showed that the model can obtain good accuracy when two features are
selected with a feature selection method.

Our model improves on the existing systems by applying an adaptive sliding window to
preprocessed accelerometer signals and then normalizing the extracted features. Next, the
features are reduced using PCA and classified using machine learning models.

3. PROPOSED METHODOLOGY

An adaptive time series technique [12] with a segment length of 200 was applied to each
sensor magnitude for all subjects, and features were extracted from the resulting
segments.

3.1 Datasets

To conduct our experiment, we obtained the REALDISP Activity Recognition dataset [8]
from the UCI Machine Learning Repository. The dataset consists of recordings from 17
subjects, ten males and seven females. The recordings include 13 inertial signals
obtained from sensors located on different parts of the body. A multivariate time series
was obtained from the accelerometer sensors, which are sampled at regular intervals. The
experiment consisted of subjects performing 33 physical activities in 3 scenarios. The
exercise lasted about 15-20 min. It is important to note that there are about 1900
instances in which the user was not moving. Fig. 2 shows the number of instances of each
activity after removing the irrelevant null activity.

Fig. 2. Total Frequency of each Activity

3.2 Feature Extraction

The initial signals generated from the accelerometers were grouped into segments of
fixed length [12]. A feature vector with the following features was then computed for
each segment.

Considering the sample $x_1, x_2, x_3, \ldots, x_n$, the arithmetic mean $\bar{x}$ is the
sum of the sampled values divided by the total number of items.

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{1}$$

The lowercase letter σ (or s for a sample) is used to represent the standard deviation.
The formula for the sample standard deviation is

$$s = \sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}} \tag{2}$$

The distribution of numerical data can be measured using a histogram.

∑ (x …) (3)

The root mean square (RMS) amplitude can be expressed as

$$x_{\mathrm{rms}} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2} \tag{4}$$

To measure how spread out the values in the dataset are, we calculate the average
distance between each data value and the mean.

$$\mathrm{MAD} = \frac{\sum_{i=1}^{n}|x_i-\bar{x}|}{n} \tag{5}$$

3.3 Z-score normalization

After similar features were arranged together by combining all 17 subjects, the features
were normalized to a standard scale (zero mean and unit standard deviation) over the
entire set of features. This is done before the reduction phase in order to maximize the
variance of the components. The equations are

$$z = \frac{x-\mu}{\sigma} \tag{6}$$

where, for the N values of a feature,

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2} \tag{7}$$

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i \tag{8}$$

3.4 Principal Component Analysis

PCA is a useful dimension reduction algorithm in which an orthogonal transformation of
the data converts correlated variables into linearly uncorrelated features called
principal components. Reducing the original data features reduces the amount of noise,
and the smaller training data lets the model become computationally faster. If there are
n observations of p variables, the number of distinct principal components is
min(n − 1, p). The resulting vector contains n observations having different
variances [13]. The first principal component $Y_1$ is a linear combination of the
variables $X_1, X_2, \ldots, X_p$, represented by the equation below.

$$Y_1 = a_{11}X_1 + a_{12}X_2 + \cdots + a_{1p}X_p \tag{9}$$

The first principal component is the combination with the maximum variance. Since the
variance of $Y_1$ could be made as large as desired simply by choosing large values for
the weights $a_{11}, a_{12}, \ldots, a_{1p}$, the weights are constrained so that their
sum of squares is 1.

$$a_{11}^2 + a_{12}^2 + \cdots + a_{1p}^2 = 1 \tag{10}$$

The second principal component is calculated similarly. Collectively, all the original
variables are transformed into the principal components.

$$Y = XA \tag{11}$$

3.5 K nearest Neighbor

KNN is a non-parametric supervised learning algorithm that is used in both classification
and regression problems. It is also known as a lazy learning algorithm because it does
not build a model from the training data in advance; computation is deferred until a
query must be classified. The purpose of the algorithm is to classify a new sample based
on its features and the labeled training samples. Given a query point $x$, the k training
points closest in distance (Euclidean distance) to $x$ are found. The query is assigned
to the class held by the majority of these neighbors. Any ties in voting are broken at
random. Thus, no explicit trained model is used for classification. The KNN algorithm
works best on smaller datasets with fewer features.

The Euclidean distance [14] between two n-dimensional vectors, x and y, is defined as:

$$d(x,y) = \sqrt{(x_1-y_1)^2 + (x_2-y_2)^2 + \cdots + (x_n-y_n)^2} \tag{12}$$

and the Manhattan (or city block) distance [15] is defined as:

$$d(x,y) = |x_1-y_1| + |x_2-y_2| + \cdots + |x_n-y_n| \tag{13}$$

3.6 Random Forest

Random forest is an ensemble algorithm that works well for models with low bias (a strong
relationship between features and target) and high variance (spread of the data). For
classification, it combines several learning algorithms of the same kind and predicts
labels with higher accuracy. Decision trees are created from randomly chosen subsets of
the training set. After the votes produced by the trees are totaled, the final class of
the tested sample is decided [16]. Finally, the prediction for each new object is
computed as the average of the single-tree predictions. If $B$ is the number of trees
used for bagging and $p_b(x)$ is the prediction of the b-th tree, then

$$\hat{p}(x) = \frac{1}{B}\sum_{b=1}^{B} p_b(x) \tag{14}$$

4. Experimental Results and Evaluation

We measured and evaluated our performance as follows, where TP, TN, FP and FN denote the
numbers of true positives, true negatives, false positives and false negatives.

Accuracy is the proportion of results that are correctly classified.

$$\text{Accuracy} = \frac{TP+TN}{TP+TN+FP+FN} \tag{15}$$

Precision is the ratio between the true positives and the sum of the true positives and
false positives.

$$\text{Precision} = \frac{TP}{TP+FP} \tag{16}$$

Recall is the ratio between the true positives and the sum of the true positives and
false negatives.

$$\text{Recall} = \frac{TP}{TP+FN} \tag{17}$$

The F1 score balances precision and recall. It is the harmonic mean of precision and
recall, formulated as:

$$\text{F1 Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision}+\text{Recall}} \tag{18}$$

The majority of the recorded activity is where the user is not performing any movement.
First, our model was tested with this activity included as the Null Activity. The results
are as follows.

Table 4.1 Result with Null Class

Method          Accuracy (CA)   F1       Precision   Recall
KNN             92.99%          94.55%   93.77%      91.45%
Random Forest   86.73%          87.04%   86.73%      86.11%

Following a round-robin strategy over 10 subsets, nine subsets were used for training and
one subset was used for testing. The final cross-validation result was obtained by
averaging the accuracy of these 10 experiments. This is the same cross-validation method
used in [8]. The K-nearest neighbor algorithm gives an accuracy of 85.51%, while the
Random Forest algorithm gives an accuracy of 84.005%. The experiment was performed in
WEKA [17]. It is observed that the K-nearest neighbor algorithm performs better at
recognizing activities on the entire dataset for the ideal placement scenario.

Table 4.2 Cross validation results

Method          Accuracy (CA)   F1      Precision   Recall
KNN             85.51%          85.3%   85.7%       85.5%
Random Forest   84.00%          83.2%   84.3%       84.0%

Fig. 3 shows the accuracy of three different machine learning models; the proposed model
outperforms the previous models.
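The evaluation protocol of this section can be sketched in plain Python. The snippet below is a minimal illustration, not the WEKA setup used in the experiments: `knn_predict`, `macro_scores` and `cross_validate` are hypothetical helper names, k = 3 is an arbitrary choice, the per-class scores are macro-averaged (an assumption, since the averaging scheme is not stated), and voting ties are resolved deterministically rather than at random.

```python
import math
import random
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points nearest to the query."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: euclidean(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    # Ties go to the class seen first among the nearest neighbours.
    return votes.most_common(1)[0][0]


def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1."""
    labels = sorted(set(y_true))
    precision_sum = recall_sum = f1_sum = 0.0
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precision_sum += precision
        recall_sum += recall
        f1_sum += f1
    n = len(labels)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {"accuracy": accuracy, "precision": precision_sum / n,
            "recall": recall_sum / n, "f1": f1_sum / n}


def cross_validate(x, y, folds=10, k=3, seed=0):
    """Mean accuracy over `folds` rounds: train on folds-1 subsets, test on one."""
    indices = list(range(len(x)))
    random.Random(seed).shuffle(indices)
    fold_accuracies = []
    for f in range(folds):
        test_idx = indices[f::folds]  # round-robin assignment to subsets
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        train_x = [x[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(knn_predict(train_x, train_y, x[i], k) == y[i]
                      for i in test_idx)
        fold_accuracies.append(correct / len(test_idx))
    return sum(fold_accuracies) / folds
```

Here `x` would hold the reduced feature vectors and `y` the activity labels; the function assumes there are at least `folds` samples so that no test subset is empty.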

Fig. 3. Bar chart of 3 Algorithms

Table 4.3 Comparisons with other papers

Name               Method                  Accuracy
Proposed Model     KNN (reduced feature)   85.51%
Banos et al. [8]   KNN                     85%
Vu Ngoc [5]        ANN                     75.3%

The results obtained, with a success rate of 85.51%, are better than those of paper [8],
as shown in Table 4.3. In this study, the evaluated data include 33 diverse activity
types, which are mostly dynamic in nature.

5. CONCLUSION

When classifying human activity, there is considerable uncertainty about the best
algorithm. Different combinations of signals and vibrational features give different
activity detection rates. Moreover, the sheer amount of data created by these devices
causes a significant amount of strain. Researchers still debate which features influence
the recognition of human activity in signals. In our research, a model for the successful
detection of human activity was proposed. About five features were used, normalized, and
reduced using PCA. Classification was done with 2 classifiers, of which KNN shows the
best accuracy. Further experiments that included the large volume of irrelevant data in
which no activity was identified showed a significant improvement in the results. The
results could possibly be improved by using RFE to remove the weakest features until the
desired set of features is reached. Moreover, combining smart watch features with those
of a smartphone or any other wearable sensor could provide much better recognition.

REFERENCES

1. F. W. Booth, C. K. Roberts, and M. J. Laye, "Lack of exercise is a major cause of
   chronic diseases," Comprehensive Physiology, vol. 2, 2012.
2. P. C. Hallal, C. G. Victora, M. R. Azevedo, and J. C. Wells, "Adolescent physical
   activity and health," Sports Medicine, vol. 36, 2006.
3. A. Bulling, U. Blanke, and B. Schiele, "A tutorial on human activity recognition
   using body-worn inertial sensors," ACM Computing Surveys, vol. 46, 2014.
4. L. C. Jatoba, U. v. Großmann, C. Kunze, J. Ottenbacher, and W. Stork, "Context-aware
   mobile health monitoring: Evaluation of different pattern recognition methods for
   classification of physical activity," 2008 30th Annual International Conference of
   the IEEE Engineering in Medicine and Biology Society, pp. 5250-5253, 2008.
5. V. N. T. Sang, N. D. Thang, V. Van Toi, N. D. Hoang, and T. Q. D. Khoa, "Human
   Activity Recognition and Monitoring Using Smartphones," 2015.
6. R. Yao, G. Lin, Q. Shi, et al., "Efficient Dense Labelling of Human Activity
   Sequences from Wearables using Fully Convolutional Networks," Pattern Recognition,
   2017.
7. A. Bevilacqua, K. MacDonald, A. Rangarej, V. Widjaya, B. Caulfield, and T. Kechadi,
   "Human Activity Recognition with Convolutional Neural Networks," Lecture Notes in
   Computer Science, 2019.
8. O. Banos, M. A. Toth, M. Damas, H. Pomares, and I. Rojas, "Dealing with the effects
   of sensor displacement in wearable activity recognition," Sensors (Basel), 2014.
9. T. D. Le and C. V. Nguyen, "Human activity recognition by smartphone," 2015 2nd
   National Foundation for Science and Technology Development Conference on Information
   and Computer Science (NICS), 2015.
10. Y. Hsu, S. Lin, P. Chou, H. Lai, H. Chang, and S. Yang, "Application of
    nonparametric weighted feature extraction for an inertial-signal-based human
    activity recognition system," in 2017 International Conference on Applied System
    Innovation (ICASI), 2017.
11. P. Nurmi, M. Przybilski, G. Lindén, and P. Floréen, "A Framework for Distributed
    Activity Recognition in Ubiquitous Systems," in IC-AI, Citeseer, 2005.
12. A. D. Ignatov and V. V. Strijov, "Human activity recognition using quasiperiodic
    time series collected from a single tri-axial accelerometer," Multimedia Tools and
    Applications, 2015.
13. B. Yuan and J. Herbert, "Context-aware Hybrid Reasoning Framework for Pervasive
    Healthcare," Pervasive Ubiquitous Computing, vol. 18, no. 4, pp. 865–881, 2013.
14. G. Guo, H. Wang, D. Bell, Y. Bi, and K. Greer, "KNN Model-Based Approach in
    Classification," Berlin, Heidelberg, 2003.
15. L.-Y. Hu, M.-W. Huang, S.-W. Ke, and C.-F. Tsai, "The distance function effect on
    k-nearest neighbor classification for medical datasets," SpringerPlus, 2016.
16. J. Han and M. Kamber, "Data Mining: Concepts and Techniques," Morgan Kaufmann,
    vol. 54, 2006.
17. Weka 3: Data Mining Software in Java, Machine Learning Group at the University of
    Waikato, Official Web: http://www.cs.waikato.ac.nz/ml/Weka/index.html/, accessed on
    10th October 2019.
