
IEEE EMBEDDED SYSTEMS LETTERS, VOL. 8, NO. 1, MARCH 2016

Wearable Camera- and Accelerometer-Based Fall Detection on Portable Devices
Koray Ozcan, Student Member, IEEE, and Senem Velipasalar, Senior Member, IEEE

Abstract: Robust and reliable detection of falls is crucial, especially for elderly activity monitoring systems. In this letter, we present a fall detection system using wearable devices, e.g., smartphones and tablets, equipped with cameras and accelerometers. Since the portable device is worn by the subject, monitoring is not limited to confined areas, and extends to wherever the subject may travel, as opposed to static sensors installed in certain rooms. Moreover, a camera provides an abundance of information, and the results presented here show that fusing camera and accelerometer data not only increases the detection rate, but also decreases the number of false alarms compared to accelerometer-only or camera-only systems. We employ histograms of edge orientations together with gradient local binary patterns for the camera-based part of fall detection. We compared the performance of the proposed method with that of using original histograms of oriented gradients (HOG) as well as a modified version of HOG. Experimental results show that the proposed method outperforms using original HOG and modified HOG, and provides lower false positive rates for the camera-based detection. Moreover, we have employed an accelerometer-based fall detection method, and fused these two sensor modalities to obtain a robust fall detection system. Experimental results and trials with actual Samsung Galaxy phones show that the proposed method, combining two different sensor modalities, provides much higher sensitivity and a significant decrease in the number of false positives during daily activities, compared to accelerometer-only and camera-only methods.

Index Terms: Fall detection, gradient local binary patterns, histogram of oriented gradients, smartphone, wearable camera.

Manuscript received July 01, 2015; accepted September 26, 2015. Date of publication October 05, 2015; date of current version February 25, 2016. This work was supported in part by the National Science Foundation (NSF) CAREER program under Grant CNS-1206291 and by NSF Grant CNS-1302559. This manuscript was recommended for publication by A. Gordon-Ross.
The authors are with the Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY 13244 USA (e-mail: kozcan@syr.edu; svelipas@syr.edu).
Color versions of one or more of the figures in this letter are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/LES.2015.2487241

I. INTRODUCTION
Falls are considered to be the eighth leading cause of death in the United States [1], and fall injuries can result in serious complications [2], [3]. Autonomous fall detection systems can reduce the severity of falls by informing other people to deliver help and by reducing the amount of time people remain on the floor. They will increase the safety and independence of the elderly. Yet, to find widespread use, they should be robust and reliable.

Many fall detection algorithms have been proposed relying only on accelerometer data [4]-[6]. Koshmak et al. [4] tested their method on actual falls of ice-skaters. Yet, since every fall has different acceleration characteristics and the magnitude of acceleration varies greatly among various body types, it is challenging to detect different types of falls for different people. Cao et al. [5] employ adaptive thresholds for motion sensors. Accelerometers have also been used in activity classification for sports activities [6] to capture training statistics. As discussed in [6], since the placement of the phone differs from person to person, using just the accelerometer might not be sufficient for activity classification. In such cases, using camera sensors in tandem with the accelerometer can help resolve such issues. Wu et al. [7] also discuss the limitations of using just the accelerometer. Accelerometer-based systems, although simple and cost-effective, can still create false positive alarms even with multiple sensors, especially in environments such as high-speed trains and elevators, where people are exposed to acceleration. Images captured by a camera sensor provide an abundance of data, including contextual information about the surroundings. In this letter, we propose a system that employs a novel approach of using both accelerometer and camera modalities to detect falls by differentiating them from other daily activities, including walking, sitting, lying down, and going up and down stairs. This is one of the first works that uses data from a wearable camera to overcome the shortcomings of accelerometer-only systems.
Wearable cameras alleviate, if not eliminate, the privacy concerns of users, since the captured images are not of the subjects but of their surroundings. Moreover, with the smartphone implementation, images are processed locally on the device, and they are not saved or transmitted anywhere. A recent study about the privacy behaviors of lifeloggers using wearable cameras discusses the privacy of bystanders and ways to mitigate concerns [8]. It is also expected that wearable cameras will be employed more to understand lifestyle behaviors for health purposes [9].
For the camera-based part, the proposed method employs histograms of edge orientations together with gradient local binary patterns (GLBP), which use image features that have more descriptive power [10]. GLBP features have been used for human detection applications, and were derived from an operator named the local binary pattern [11]. We show that the novel camera-based fall detection method proposed in this letter is more robust, and outperforms our previous work that uses only histograms of edge strengths and edge orientations [12]. Moreover, we present an accelerometer-based fall detection algorithm and a fusion approach to combine the results from these two sensors. We first present the significant improvement provided by the proposed camera-based algorithm on recorded videos. We also show results on actual smartphones with a simpler version of the camera-based algorithm for real-time performance. The fusion of camera-based results with accelerometer data provides a significant decrease in the number of false positives compared to using only accelerometer data.


Fig. 1. GLBP histogram generation steps.

II. PROPOSED METHOD


A. Summary of HOG and Modified HOG
In the original HOG-based algorithm [14], proposed for human detection, an image is divided into blocks, and each block is divided into cells. For each cell, an $n$-bin histogram is built, wherein each bin corresponds to a gradient orientation span. The concatenation of the cell histograms forms the HOG descriptor for a block. For each pixel in a cell, the intensity gradient magnitude and orientation are calculated. Each gradient casts a vote, equal to its magnitude, into its orientation bin. Then, block-based normalization is applied.
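To make the per-cell step concrete, the following is a minimal sketch of a magnitude-weighted orientation histogram for a single cell. The nine-bin, unsigned 0-180 degree configuration follows common HOG practice; the function name and the central-difference gradients are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def cell_hog_histogram(cell, n_bins=9):
    """Magnitude-weighted orientation histogram for one cell (a sketch)."""
    img = cell.astype(float)
    # Horizontal and vertical intensity gradients via central differences.
    gx = np.gradient(img, axis=1)
    gy = np.gradient(img, axis=0)
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation folded into [0, 180) degrees.
    orientation = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    # Each gradient votes into its orientation bin with its magnitude.
    bin_idx = np.minimum((orientation / (180.0 / n_bins)).astype(int), n_bins - 1)
    return np.bincount(bin_idx.ravel(), weights=magnitude.ravel(), minlength=n_bins)
```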
In [12], we proposed a modified HOG algorithm for fall detection wherein, different from the original HOG, separate histograms are constructed for edge strengths (ES) and edge orientations (EO). The edge orientation range is between 0 and 180 degrees, and it is equally divided into nine bins. The edge strength histogram contains 18 bins. Moreover, the cells that do not contain significant edge information are excluded from the descriptor in this modified HOG algorithm.
B. Gradient Local Binary Pattern Features
The computation flow of GLBP is illustrated in Fig. 1. For each center pixel, its eight neighboring pixels are checked. A value of 1 or 0 is assigned to a neighboring pixel if its intensity value is greater or less than that of the center pixel, respectively. This results in an 8-bit binary number. Only the sequences that have a maximum of two transitions (from 0 to 1 or 1 to 0) are kept, and the others, including the all-0 and all-1 sequences, are considered to be noise. In accordance with [11], we employ the 58 uniform patterns out of 256 possible patterns. Then, we analyze the 8-bit sequence to find the length of the longest consecutive sequence of 1s and the angle of the edge, as illustrated in Fig. 1. These two values provide the index of an entry of a $7 \times 8$ matrix, which is incremented by the edge strength value. This matrix is filled by visiting each pixel in a cell, and then normalized. This results in a 56-dimensional GLBP feature that is used in our algorithm. We use one block divided into 16 cells; therefore, our concatenated GLBP vector for one frame is of length $16 \times 56 = 896$. After calculating the GLBP feature for each cell, L2 normalization is applied before concatenation.
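The per-cell GLBP computation can be sketched as follows. This is a sketch under stated assumptions, not the authors' implementation: the run of 1s and its start position are taken as the length/angle pair indexing the $7 \times 8$ matrix, and the center pixel's gradient magnitude serves as the edge strength, consistent with the description above and with [10].

```python
import numpy as np

def glbp_cell_feature(cell):
    """Per-cell GLBP feature: a 7x8 matrix flattened to 56 dimensions (a sketch)."""
    img = cell.astype(float)
    gx = np.gradient(img, axis=1)
    gy = np.gradient(img, axis=0)
    strength = np.hypot(gx, gy)  # edge strength (assumed: gradient magnitude)
    # (dy, dx) offsets of the 8 neighbors, in circular order.
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    feat = np.zeros((7, 8))
    h, w = img.shape
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            bits = [1 if img[y + dy, x + dx] > img[y, x] else 0 for dy, dx in offs]
            # Keep only "uniform" patterns: exactly 2 circular 0/1 transitions.
            # All-0 and all-1 patterns have 0 transitions and are skipped as noise.
            transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
            if transitions != 2:
                continue
            ones = sum(bits)  # run length of 1s: 1..7
            # Start position of the run of 1s encodes the edge angle: 0..7.
            start = next(i for i in range(8) if bits[i] == 1 and bits[i - 1] == 0)
            feat[ones - 1, start] += strength[y, x]
    feat = feat.ravel()
    norm = np.linalg.norm(feat)  # L2 normalization before concatenation
    return feat / norm if norm > 0 else feat
```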
C. Camera-Based Detection
The proposed camera-based fall detection algorithm consists of two stages. The first stage detects a significant event, which could be caused by sitting down, lying down, falling down, etc. Once an event is detected, the second stage of the algorithm is employed to decide whether or not it is a fall.
We employ a combination of edge orientation (EO) histograms and GLBP features. To compute the EO histograms and GLBP features, we use one block divided into 16 cells. This not only decreases the computational load, but is also sufficient to detect abrupt changes. Moreover, the cells that do not contain significant edge information are removed autonomously and adaptively. To determine which cells to remove, the maximum amplitude among the bins within a cell is found first. Then, we calculate the mean and the standard deviation of the vector of maximums from the cells in a frame. The algorithm removes the cells whose maximum values are more than a standard deviation away from the computed mean.
To build the EO histogram, horizontal ($g_x$) and vertical ($g_y$) intensity gradients are computed for every pixel within a cell. Then, these values are used to compute the gradient strength ($m = \sqrt{g_x^2 + g_y^2}$) and orientation ($\theta = \arctan(g_y / g_x)$). We use 9 bins for each of the 16 cells, and then concatenate the 16 nine-bin histograms to obtain a $16 \times 9 = 144$-dimensional vector. We obtain the GLBP descriptor as described in Section II-B. After the feature descriptors (EO and GLBP) are normalized, the dissimilarity distance is used to compute $D_{EO}$ and $D_{GLBP}$ between two frames. The dissimilarity distance between two $N$-dimensional vectors $a$ and $b$ is calculated using

$$D(a, b) = \sum_{i=1}^{N} \frac{(a_i - b_i)^2}{a_i + b_i} \qquad (1)$$

Dissimilarity distance values for EO ($D_{EO}$) and GLBP ($D_{GLBP}$) are multiplied ($D_{EOGLBP} = D_{EO} \times D_{GLBP}$) in order to attenuate the noise in the signal while emphasizing the peaks for the detection.
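A minimal sketch of the frame-to-frame distance computation follows. The chi-square-style form mirrors (1); the epsilon guard against empty bins and the function names are illustrative assumptions.

```python
import numpy as np

def dissimilarity(a, b, eps=1e-12):
    """Dissimilarity distance between two normalized descriptors, per (1)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return np.sum((a - b) ** 2 / (a + b + eps))

def d_eo_glbp(eo_prev, eo_cur, glbp_prev, glbp_cur):
    # Multiplying the two distances attenuates noise while emphasizing
    # peaks that appear in both parts of the descriptor.
    return dissimilarity(eo_prev, eo_cur) * dissimilarity(glbp_prev, glbp_cur)
```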
1) Detecting an Event: We store the distance values (values of $D_{EOGLBP}$ from time $t - k + 1$ to time $t$) in a buffer array $B$ for the last $k$ frames. Therefore, $B$ saves $k$-many $D_{EOGLBP}$ values, each computed between the current frame and the frame $s$ frames before it, such that $s$ is an integer value. If the maximum distance value in the buffer array $B$ is larger than a threshold $\tau_1$, which has been chosen empirically, it implies the occurrence of an event.
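Stage 1 can be sketched as a sliding buffer over the last $k$ distance values. The default $k = 10$ follows Section III-A; the numeric value of $\tau_1$ below is a placeholder, since the letter determines it empirically.

```python
from collections import deque

class EventDetector:
    """Stage 1: flag a significant event from the last k distance values (a sketch)."""

    def __init__(self, k=10, tau1=0.5):  # tau1 is a placeholder value
        self.buffer = deque(maxlen=k)    # holds the last k D_EOGLBP values
        self.tau1 = tau1

    def update(self, d_eo_glbp_value):
        self.buffer.append(d_eo_glbp_value)
        # An event is declared when the buffer's maximum exceeds tau1.
        return max(self.buffer) > self.tau1
```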
2) Detecting a Fall: Once an event is detected, the second stage of the algorithm is employed to decide whether or not it is a fall. The dissimilarity distances $D_{EO}$ and $D_{GLBP}$ are computed between the current and previous frames. If the multiplied value ($D_{EOGLBP}$) is greater than the threshold $\tau_2$, a fall is detected based on the camera sensor information. Dissimilarity distance values for a typical fall event are plotted in Fig. 2 for different features, namely when using the original HOG, $D_{ES} \times D_{EO}$ [12], and the proposed $D_{EOGLBP}$ (continuous line plot). In this video, the fall takes place between frames 40 and 55. As can be seen, the proposed method (employing $D_{EOGLBP}$) gives a higher dissimilarity distance value during the fall, compared to using $D_{ES} \times D_{EO}$ [12], and thus better discriminates the fall from the rest and has better detection capability. Moreover, the proposed method results in fewer false positives compared to the original HOG.
As seen in Fig. 2, the original HOG approach (light blue plot) has high values before and after the fall. The detailed rates are provided in Table I, showing the increase in sensitivity and specificity.

Fig. 2. Plots of different distance measures for a typical fall.

Fig. 3. Example frames captured during a fall from a standing-up position.

TABLE I
SENSITIVITY AND SPECIFICITY COMPARISON
D. Accelerometer-Based Detection
We observe the magnitude of the linear acceleration, with the gravity component removed from the corresponding direction. Whenever the magnitude of the 3-axis acceleration vector is greater than $\tau_a$ (an empirically determined threshold), a fall is declared by the accelerometer-based part.
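The accelerometer-based rule reduces to a single magnitude test. A minimal sketch follows; the numeric threshold is a placeholder for the empirically determined $\tau_a$.

```python
import math

def accel_fall(ax, ay, az, tau_a=2.5):
    """Fall if the 3-axis linear-acceleration magnitude (gravity removed)
    exceeds tau_a; the default value is a placeholder assumption."""
    return math.sqrt(ax * ax + ay * ay + az * az) > tau_a
```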
E. Camera Data Fused With Accelerometer Data
We normalize the accelerometer data by the maximum value of the accelerometer sensor of the smartphone. The fusion method is inspired by the sum rule for two normalized classifiers, since it gives the lowest detection error rate [13]. The only requirement is that the classifiers be conditionally independent, which is assumed to be true since the camera and the accelerometer are independent sensors. Other compared methods include the product, minimum, maximum, and median of the different classifiers. Whenever the sum of the $D_{EOGLBP}$ value and the normalized accelerometer data is greater than a threshold $\tau_f$, a fallDetected alarm is triggered, and the result is displayed on the screen of the phone. The fusion of different sensor modalities also helps to eliminate the false positives caused by using only the camera or only the accelerometer as the fall detection sensor.
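The sum-rule fusion can be sketched as follows. The normalization by the sensor's maximum reading follows the text; the value of $\tau_f$ is a placeholder.

```python
def fused_fall(d_eo_glbp_value, accel_magnitude, accel_max, tau_f=1.0):
    """Sum-rule fusion of camera and accelerometer scores, after [13] (a sketch)."""
    accel_norm = accel_magnitude / accel_max  # normalize by the sensor maximum
    # fallDetected is triggered when the summed normalized scores exceed tau_f.
    return (d_eo_glbp_value + accel_norm) > tau_f
```

The sum rule is preferred here over the product, minimum, maximum, and median combiners because, for conditionally independent classifiers, it yields the lowest detection error rate [13].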
III. EXPERIMENTAL RESULTS
A. Camera-Based Detection on Recorded Videos
We first compared the proposed camera-based detection part of the algorithm (incorporating GLBP features) with the original and modified HOG by using 10 different subjects in the experiments. Each subject performed 10 falls from a standing-up position and 10 falls from a sitting-down position, as well as 10 sitting and 10 lying-down activities. The fall experiments were used to compute the sensitivity of the algorithm, while the rest were used for the specificity. Sensitivity and specificity are defined as

$$\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP} \qquad (2)$$

where TP, FP, TN, and FN denote the numbers of true positives, false positives, true negatives, and false negatives, respectively.
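For reference, (2) translates directly into code; this is a direct transcription of the standard definitions.

```python
def sensitivity(tp, fn):
    # Fraction of actual falls that are detected.
    return tp / (tp + fn)

def specificity(tn, fp):
    # Fraction of nonfall activities that are correctly rejected.
    return tn / (tn + fp)
```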
For the convenience of the subjects, and for repeatability purposes (so that different approaches can be compared on the same videos), the experiments were performed on prerecorded videos from ten different subjects. Videos were captured with a Microsoft LifeCam camera.

Fig. 4. An Android smartphone attached to the waist.

All the experiments were performed with the camera attached to the belt around the waist, facing front.
An example set of captured frames for falling from a standing-up position is presented in Fig. 3. The parameters of the camera-based algorithm are $k$, $\tau_1$, and $\tau_2$, and the same values have been used in all the experiments. The value of $k$ is selected to be 10 to cover information from approximately the last one second of the movement. $\tau_1$ and $\tau_2$ are empirically determined dissimilarity distance thresholds.
Average sensitivity and specificity values for the 10 different subjects are presented in Table I for all the falls from a standing-up position. The proposed method outperforms using modified HOG with separate EO and ES histograms [12]. More specifically, the sensitivity is increased from 87.84% to 96.36%, and the specificity is increased from 89.11% to 92.45%. Moreover, when compared to using the original HOG, the proposed method provides a very high specificity rate. The specificity rate of using the original HOG is 48.04%, which is unacceptably low, since it is highly prone to creating many false positives.
Average sensitivity and specificity rates for falling from a sitting-down position are also presented in Table I. This is a more challenging scenario compared to falls from a standing-up position. The sensitivity rate has been increased from 67.39% to 90.91%, and the specificity from 89.69% to 92.45%, compared to using modified HOG [12].
B. Detection With Fusion on Actual Smartphones
We have also performed experiments with people carrying a Samsung Galaxy S4 phone running the Android OS. The experimental setup can be seen in Fig. 4. The subjects, 1 female and 9 male, are between the ages of 24 and 30. Their heights and weights range from 165 to 183 cm and from 50 to 103 kg, respectively. They tried to act as if they were actually falling down. It should be noted that, due to the fear or cautiousness of the subjects, it is difficult to recreate a free fall or collapse. It was observed that sometimes the falls were almost in slow motion. Thus, the performed set of experiments proved to be even more challenging than an actual free fall.

For real-time computation on the phone, without loss of generality, we have used a simplified version of the camera-based detection, with EO and ES histograms, on the Samsung Galaxy S4 phone. The algorithm runs at 15 fps on the smartphone. In the first set of experiments, we compared the sensitivity values and the number of false positives for three different cases: 1) accelerometer only; 2) camera only; and 3) camera fused with accelerometer. The results are summarized in Table II. The performed nonfall activities include 15 instances of sitting down and then standing up, walking, and 15 instances of lying down and then standing up. Some of the lying-down experiments include lying on the floor, which is a very complicated scenario to differentiate from actual falls. As can be seen, the proposed method provides the highest detection rate. When GLBP features are incorporated into the phone implementation, the sensitivity and specificity values are expected to increase even more, as demonstrated in Section III-A.

TABLE II
SENSITIVITY AND SPECIFICITY VALUES FOR FALL DETECTION

It should be noted that, for the above set of experiments, the nonfall activities (walking, sitting, and lying down) are not fast or complicated enough in nature to cause false positives. Thus, in the third set of experiments, the goal was to demonstrate the effectiveness of fusing camera data with accelerometer data in decreasing the number of false positives created during a variety of daily activities when only a single modality is used. For a duration of about 30 min, ten subjects performed various activities, including going up and down the stairs, running, jumping, changing rooms, opening doors, and changing directions. For the proposed fusion-based algorithm, the fallDetected alarm is triggered when the summation of the features from the different modalities is greater than the threshold $\tau_f$. When this happens, an alert is displayed on the phone's screen. The numbers of false positives when we use 1) accelerometer only; 2) camera only; and 3) camera fused with accelerometer (proposed method) are summarized in Table III. As seen, using camera-based features and fusing the results with the accelerometer data decreases the number of false positives.

TABLE III
FALSE POSITIVES GENERATED FOR 30-MIN. VIDEOS

C. Battery Consumption
The current consumption of the running algorithm was measured with a Monsoon Power Analyzer. With a 2600 mAh battery capacity, the smartphone runs at 80 mA, with an estimated battery life of 32 h. With the proposed algorithm continuously running on the device, it draws 542 mA of current, with an estimated battery life of 4.76 h.

IV. CONCLUSION
First, a robust and reliable algorithm for fall detection with a wearable camera has been proposed. By combining GLBP features with edge orientation histograms, this camera-based method provides higher sensitivity and specificity rates compared to using the original HOG and its modified version. A simplified version of this camera-based algorithm has been implemented on a Samsung Galaxy S4 smartphone. The features computed from the camera modality have been fused with the accelerometer data. In addition, longer-duration experiments have been performed to analyze the false alarms. It has been shown that relying only on the accelerometer or only on the camera by itself is neither reliable nor robust, and that fusing these two modalities provides much higher sensitivity and a significant decrease in the number of false positives.

REFERENCES

[1] M. Heron, "Deaths: Leading causes for 2007," Nat. Vital Statist. Rep., vol. 59, no. 8, pp. 1-7, 21-22, 2011.
[2] J. Shelfer, D. Zapala, and L. Lundy, "Fall risk, vestibular schwannoma, and anticoagulation therapy," J. Amer. Acad. Audiol., vol. 19, no. 3, pp. 237-245, 2008.
[3] R. C. O. Voshaar, S. Banerjee, M. Horan, R. Baldwin, N. Pendleton, R. Proctor, N. Tarrier, Y. Woodward, and A. Burns, "Predictors of incident depression after hip fracture surgery," Amer. J. Geriatric Psych., vol. 15, no. 9, pp. 807-814, 2007.
[4] G. Koshmak, M. Linden, and A. Loutfi, "Evaluation of the android-based fall detection system with physiological data monitoring," in Proc. 35th Annu. Int. Conf. IEEE Eng. Med. Biol. Soc., Jul. 2013, pp. 1164-1168.
[5] Y. Cao, Y. Yang, and W. Liu, "E-FallD: A fall detection system using android-based smartphone," in Proc. 9th Int. Conf. Fuzzy Syst. Knowledge Discovery (FSKD), May 2012, pp. 1509-1513.
[6] K. Taylor, U. A. Abdulla, R. J. Helmer, J. Lee, and I. Blanchonette, "Activity classification with smart phones for sports activities," Procedia Eng., vol. 13, pp. 428-433, 2011.
[7] W. Wu, S. Dasgupta, E. E. Ramirez, C. Peterson, and J. G. Norman, "Classification accuracies of physical activities using smartphone motion sensors," J. Med. Internet Res., vol. 14, no. 5, p. e130, Oct. 2012.
[8] R. Hoyle, R. Templeman, S. Armes, D. Anthony, D. Crandall, and A. Kapadia, "Privacy behaviors of lifeloggers using wearable cameras," in Proc. ACM Int. Joint Conf. Pervasive Ubiquitous Comput., 2014, pp. 571-582.
[9] A. R. Doherty, S. E. Hodges, A. C. King, A. F. Smeaton, E. Berry, C. J. Moulin, S. Lindley, P. Kelly, and C. Foster, "Wearable cameras in health: The state of the art and future possibilities," Amer. J. Preventive Med., vol. 44, no. 3, pp. 320-323, 2013.
[10] N. Jiang, J. Xu, W. Yu, and S. Goto, "Gradient local binary patterns for human detection," in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), 2013, pp. 978-981.
[11] T. Ojala, M. Pietikainen, and T. Maenpaa, "Multiresolution gray-scale and rotation invariant texture classification with local binary patterns," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 7, pp. 971-987, Jul. 2002.
[12] K. Ozcan, A. Mahabalagiri, M. Casares, and S. Velipasalar, "Automatic fall detection and activity classification by a wearable embedded smart camera," IEEE J. Emerg. Sel. Topics Circuits Syst., vol. 3, no. 2, pp. 125-136, Jun. 2013.
[13] J. Kittler, M. Hatef, R. Duin, and J. Matas, "On combining classifiers," IEEE Trans. Pattern Anal. Mach. Intell., vol. 20, no. 3, pp. 226-239, Mar. 1998.
[14] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in Proc. IEEE Comput. Soc. Conf. Comput. Vision Pattern Recognit. (CVPR), 2005, pp. 886-893.
