
REAL TIME HAND GESTURE RECOGNITION FOR

COMPUTER INTERACTION

Seminar report
Submitted in partial fulfillment of
the requirements for the award of B.Tech Degree
in Computer Science and Engineering
of the University Of Kerala

Submitted By
NIMISHA K JITH
Seventh semester
B.Tech Computer Science and Engineering

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


COLLEGE OF ENGINEERING
TRIVANDRUM
2014

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


COLLEGE OF ENGINEERING
TRIVANDRUM

CERTIFICATE

This is to certify that this seminar report entitled REAL TIME HAND GESTURE
RECOGNITION FOR COMPUTER INTERACTION is a bonafide record of the work
done by Nimisha K Jith, under our guidance, towards partial fulfillment of the
requirements for the award of the Degree of Bachelor of Technology in Computer
Science and Engineering of the University of Kerala during the year 2011-2015.

Rani Koshy
Assistant Professor
Dept. of CSE

Sabitha S
Associate Professor
Dept. of CSE


Dr Abdul Nizar A
Professor
Dept. of CSE
(Head of the Department)

ACKNOWLEDGEMENT

I would like to express my sincere gratitude and heartfelt indebtedness to my guide,
Mrs. Rani Koshy, Associate Professor, Department of Computer Science and Engineering,
for her valuable guidance and encouragement in pursuing this seminar.
I am also very much thankful to Dr. Abdul Nizar A, Head of the Department,
Department of Computer Science and Engineering, for his help and support.
I also extend my gratitude to the seminar coordinator, Mrs. Sabitha S, Associate
Professor, Department of CSE, College of Engineering Trivandrum, for providing the necessary
facilities and sincere co-operation. My sincere thanks are extended to all the teachers of the
Department of CSE and to all my friends for their help and support.

NIMISHA K JITH


ABSTRACT
Hand gesture recognition is a natural and intuitive way to interact with a computer, since
hand gestures offer a richer, multidimensional input channel compared with other input
methods. It is one of the most active areas of research in computer vision, and it provides
an easy way to interact with a machine without using any extra devices. Even a user who
does not have much technical knowledge of the system can use it without difficulty through
such human-computer interaction (HCI). The technology can also serve as an aid for people
with disabilities.
The purpose here is to explore three different techniques for hand gesture recognition
(HGR) based on fingertip detection. A new approach called Curvature of Perimeter is
presented. The system presented uses only a webcam, together with algorithms developed
using computer vision, image processing and video processing toolboxes.

The three algorithms which are employed are:


Convex Hull
K-Curvature
Curvature of Perimeter


TABLE OF CONTENTS

1. Introduction
2. Uses
3. Gesture Recognition Systems
4. Methods
   4.1 Template Matching
   4.2 Fingertip Detection
5. System Model
   5.1 Image Acquisition
   5.2 Colour Segmentation
   5.3 Edge Detection
   5.4 Fingertip Detection
   5.5 Convex Hull
   5.6 K-Curvature
   5.7 Curvature of Perimeter
6. Upcoming Technologies
7. Conclusion and Future Scope
8. References

LIST OF FIGURES

1.1   System detecting hand location and movement
3.1   Data gloves
4.1   Template matching
5.1   System architecture
5.2   (a) L*a*b colour space illustration
      (b) Colour segmentation input image
      (c) Segmented hand
5.3   Edge detection using Canny edge detection
5.4.1 (a) Convex hull method
      (b) Flaws in convex hull
5.4.2 (a) K-curvature
      (b) Live video of K-curvature method
5.4.3 Curvature of perimeter: (a) Open hands, (b) Fists, (c) Showing fingertips and count

CHAPTER 1
INTRODUCTION

1.1 GESTURE RECOGNITION IN GENERAL:

Fig 1.1: A child being sensed by a simple gesture recognition algorithm detecting hand
location and movement.

Gestures can be defined as physical actions made by humans that convey meaningful
information to interact with the environment. A gesture recognition system provides a more
natural way to interact: in Human Computer Interaction (HCI), communication through
gestures is more natural than through graphical user interface devices such as mice,
keyboards and haptic screens.
In the present-day framework of interactive, intelligent computing, efficient
human-computer interaction is assuming utmost importance. Gestures constitute an
interesting small subspace of possible human motion. A gesture may also be perceived by
the environment as a compression technique for the information to be transmitted elsewhere
and subsequently reconstructed by the receiver. Gesture recognition can be seen as a way
for computers to begin to understand human body language, thus building a richer bridge
between machines and humans than primitive text user interfaces or even GUIs (graphical
user interfaces), which still limit the majority of input to the keyboard and mouse. Gesture
recognition enables humans to interface with the machine (HMI) and interact naturally
without any mechanical devices. Gesture recognition can be conducted with techniques
from computer vision and image processing.

1.2 NEED FOR GESTURE RECOGNITION:


The goal of virtual environments (VE) is to provide natural, efficient, powerful, and
flexible interaction. Gesture as an input modality can help meet these requirements because
human gestures are natural and flexible, and may be efficient and powerful, especially
compared with alternative interaction modes. The traditional two-dimensional (2D),
keyboard and mouse oriented graphical user interface (GUI) is not well suited for virtual
environments. Synthetic environments provide the opportunity to utilize several different
sensing modalities and technologies and to integrate them into the user experience. Devices
which sense body position and orientation, direction of gaze, speech and sound, facial
expression, galvanic skin response, and other aspects of human behaviour or state can be
used to mediate communication between the human and the environment. Combinations of
communication modalities and sensing devices can produce a wide range of unimodal and
multimodal interface techniques. The potential for these techniques to support natural and
powerful interfaces for communication in VEs appears promising.

1.3 BENEFITS OF GESTURE RECOGNITION:

Replace mouse and keyboard

Pointing gestures

Navigate in a virtual environment

Pick up and manipulate virtual objects

Interact with the 3D world

No physical contact with computer

Communicate at a distance

CHAPTER 2
USES OF GESTURE RECOGNITION

For socially assistive robotics: By using proper sensors worn on the body of a
patient and by reading the values from those sensors, robots can assist in patient
rehabilitation. The best example can be stroke rehabilitation.

Control through facial gestures: Controlling a computer through facial gestures is


a useful application of gesture recognition for users who may not physically be able
to use a mouse or keyboard. Eye tracking in particular may be of use for controlling
cursor motion or focusing on elements of a display.

Alternative computer interface: Foregoing the traditional keyboard and mouse


setup to interact with a computer, strong gesture recognition could allow users to
accomplish frequent or common tasks using hand or face gestures to a camera.

Immersive game technology

Virtual controllers: For systems where the act of finding or acquiring a physical
controller could require too much time, gestures can be used as an alternative
control mechanism. Controlling secondary devices in a car or controlling a
television set are examples of such usage.

CHAPTER 3
HAND GESTURE RECOGNITION SYSTEMS
Hand gesture recognition is a subset of gesture recognition, in which the movements
of the hands are tracked in real time and gestures are identified using image processing
techniques. There are about 5000 gestures in a typical vocabulary; each gesture consists of
a hand shape, a hand motion and a location in 3D space.

3.1 DATA GLOVES

Data gloves were one of the initial methods used for hand gesture recognition. A data
glove provides very accurate measurements of hand shape and captures the position and
movement of the fingers and wrist. It has up to 22 sensors: three bend sensors (including
the distal joints) on each finger, four abduction sensors, plus sensors measuring thumb
crossover, palm arch, wrist flexion and wrist abduction.

But due to the following drawbacks, data gloves are not popular today:

Cumbersome to wear

Expensive

Connected by wires, which restricts freedom of movement

Figure 3.1: Data gloves

CHAPTER 4
METHODS
There are several methods for Hand gesture recognition. Two of them are mentioned here.
4.1. Template Matching
In the template matching approach, an image or part of an image is compared with
templates stored in a database. In other words, this approach tries to find a known pattern
in the image. Figure 4.1 shows an example of template matching.
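As an illustrative sketch of this idea (in Python rather than the report's Matlab, with a toy image and template made up for the demonstration), the following code slides a template across an image and scores every position with normalized cross-correlation, returning the best match:

```python
import numpy as np

def match_template(image, template):
    """Slide the template over the image and return the (row, col) of the
    best match, scored by normalized cross-correlation."""
    ih, iw = image.shape
    th, tw = template.shape
    t = template - template.mean()
    best_score, best_pos = -np.inf, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            patch = image[r:r + th, c:c + tw]
            p = patch - patch.mean()
            denom = np.sqrt((p ** 2).sum() * (t ** 2).sum())
            if denom == 0:          # flat patch: no correlation defined
                continue
            score = (p * t).sum() / denom
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score

# Toy example: embed a small cross-shaped template at (2, 3) and recover it.
img = np.zeros((10, 10))
tpl = np.array([[0., 1., 0.], [1., 1., 1.], [0., 1., 0.]])
img[2:5, 3:6] = tpl
pos, score = match_template(img, tpl)
```

A score of 1.0 means the patch under the template matches it exactly up to brightness and contrast, which is why a filtered maximum over all positions locates the pattern.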

Figure 4.1: template matching

4.2. Finger tips detection

In this approach hand gestures can be recognized by means of finger tips detection. Many
researchers have used fingertips detection for hand gesture recognition. Fingertip detection
approach is used here to explain hand gesture recognition techniques.

CHAPTER 5
SYSTEM MODEL
The system flow diagram is shown in figure 5.1. Initially the system acquires images
captured from a webcam. The output of the webcam is a video, which the system acquires
as a sequence of frames; Matlab has an efficient image acquisition toolbox for this purpose.
After a frame is acquired, it is converted into L*a*b space for colour segmentation. The
edges of the segmented hand are then detected using the Canny edge detection algorithm
and the noise is removed. From the segmented hand, fingertips can be found by any of the
three given approaches.
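The flow just described can be sketched as a minimal Python skeleton. Both stages below are deliberately crude placeholders invented for this sketch (a naive "reddish pixel" test standing in for L*a*b colour segmentation, and a 4-neighbour boundary test standing in for Canny); the point is only how the stages chain together:

```python
import numpy as np

def segment_hand(rgb):
    # Placeholder for L*a*b skin segmentation: mark any pixel whose red
    # channel is clearly above its green and blue channels as "hand".
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    return (r > g + 20) & (r > b + 20)

def find_edges(mask):
    # Placeholder for Canny: a pixel is an edge pixel if it is set but
    # at least one of its 4-neighbours is not.
    padded = np.pad(mask, 1)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    return mask & ~interior

def recognise(rgb):
    mask = segment_hand(rgb)     # colour segmentation stage
    edges = find_edges(mask)     # edge detection stage
    return mask, edges           # fingertip detection would follow here

# Synthetic 8x8 "frame" with a reddish 4x4 square playing the hand.
img = np.zeros((8, 8, 3), dtype=np.uint8)
img[2:6, 2:6] = (200, 60, 60)
mask, edges = recognise(img)
```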

Figure 5.1: system architecture

5.1. IMAGE ACQUISITION

In order to initialize the system, the user has to wave his or her hand in front of the
webcam. The image acquisition toolbox of Matlab acquires the output of the webcam in
terms of frames; the two Matlab functions getsnapshot and getdata are used for grabbing
a frame:

frame = getsnapshot(obj)

data = getdata(obj)

5.2. COLOR SEGMENTATION

In the next step, the hand must be segmented from the background. Skin colour
extraction in real time is a very difficult task, so the RGB image is converted to the L*a*b
colour space, which is derived from the CIE XYZ tristimulus values. The L*a*b colour
model is a three-axis colour system whose colours are absolute, meaning that a colour
specification is exact. L*a*b space consists of a luminosity (brightness) layer L*, a
chromaticity layer a* indicating where the colour falls along the red-green axis, and a
chromaticity layer b* indicating where the colour falls along the blue-yellow axis.

Figure 5.2: (a) L*a*b colour space

STEP 1: Calculate sample colours in L*a*b* colour space. Converting to L*a*b moves
the colour from three dimensions (R, G, B) onto two chrominance channels and one
luminance channel.

STEP 2: Classify each pixel using the nearest neighbour rule. Take the two chrominance
channels and calculate the Euclidean distance from each reference colour to the colour of
the given pixel; the smallest distance tells us which reference colour the pixel most closely
matches.

STEP 3: Finally, any pixel close enough to the skin reference colour is set to 1. Hence, a
binary image of the segmented hand is acquired.

Skin colours can be extracted by choosing particular values for a* and b*. In the figure
below, a* = 17.3 and b* = 13.82 are chosen.

(b) RGB input image (c) Segmented hand in binary
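The three steps can be illustrated with a small Python sketch. Only the skin reference a* = 17.3, b* = 13.82 comes from the text; the background reference colours and the distance threshold below are assumed values chosen for the demonstration:

```python
import numpy as np

# Skin reference in the (a*, b*) chromaticity plane, as given in the text.
SKIN_AB = np.array([17.3, 13.82])
# Hypothetical background reference colours (assumed for this sketch).
BACKGROUND_AB = np.array([[-30.0, 40.0], [5.0, -25.0]])

def classify_skin(ab_pixels, threshold=15.0):
    """Nearest-neighbour rule on the two chrominance channels: label a
    pixel 1 if its closest reference colour is the skin sample AND the
    Euclidean distance is below the threshold, else 0."""
    d_skin = np.linalg.norm(ab_pixels - SKIN_AB, axis=-1)
    d_bg = np.min(np.linalg.norm(
        ab_pixels[..., None, :] - BACKGROUND_AB, axis=-1), axis=-1)
    return ((d_skin < d_bg) & (d_skin < threshold)).astype(np.uint8)

# Two pixels: one near the skin reference, one near a background sample.
pixels = np.array([[16.0, 14.0], [-28.0, 38.0]])
labels = classify_skin(pixels)
```

The result is the binary mask of Step 3: skin-like pixels become 1, everything else 0.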

5.3. EDGE DETECTION AND NOISE REMOVAL

The next step is to detect the edges, i.e. to find the contours, of the segmented image.
This is done by Canny edge detection on the hand region that was segmented from the
input image in the previous step.
Noise can be removed by treating the segmented hand (the blob) as the biggest region
and removing the smaller objects. Results are shown in figure 5.3.

Figure 5.3: Edge detection using Canny edge detection
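The noise-removal step (treat the hand as the biggest blob and discard small objects) can be sketched in plain Python/NumPy. Matlab would normally provide this through its labelling and area-opening routines; the breadth-first flood fill below is an explicit stand-in:

```python
import numpy as np
from collections import deque

def keep_largest_blob(mask):
    """Keep only the largest 4-connected component of a binary mask,
    assuming (as the text does) that the hand is the biggest blob."""
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=int)
    sizes = {}
    current = 0
    for sy in range(h):
        for sx in range(w):
            if mask[sy, sx] and labels[sy, sx] == 0:
                current += 1                      # new component found
                labels[sy, sx] = current
                queue = deque([(sy, sx)])
                size = 0
                while queue:                      # flood-fill the component
                    y, x = queue.popleft()
                    size += 1
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and mask[ny, nx] and labels[ny, nx] == 0:
                            labels[ny, nx] = current
                            queue.append((ny, nx))
                sizes[current] = size
    if not sizes:
        return mask.copy()
    return labels == max(sizes, key=sizes.get)    # largest blob only

# A 5x5 "hand" blob plus two isolated noise pixels; only the blob survives.
m = np.zeros((10, 10), dtype=bool)
m[1:6, 1:6] = True
m[8, 8] = True
m[0, 9] = True
clean = keep_largest_blob(m)
```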

5.4. FINGERTIP DETECTION

In the literature, several different methods are used for fingertip detection. The three
algorithms employed here are:

Convex Hull

K-Curvature

Curvature of Perimeter


5.4.1. CONVEX HULL

The first technique to find fingertips is the convex hull algorithm. In this approach, a
convex hull is constructed around the blob (the segmented hand), i.e. the smallest convex
polygon that encloses all the points of the binary image. The kinks in the convex hull are
used to find the fingertips: the system walks around each vertex of the hull and calculates
the angle at that vertex, for example using the law of cosines, and then filters the angles to
find the fingertips. The figure shows the results.

Figure 5.4.1
(a) Convex Hull method

Algorithm ConvexHull(P)

Input: A set P of points in the plane.
Output: A list containing the vertices of CH(P) in clockwise order.

1. Sort the points by x-coordinate, resulting in a sequence p1 ... pn.
2. Put the points p1 and p2 in a list Lupper, with p1 as the first point.
3. For i = 3 to n do
4.    Append pi to Lupper.
5.    While Lupper contains more than two points and the last three points in Lupper
      do not make a right turn, do
6.       Delete the middle of the last three points from Lupper.
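The pseudocode above builds only the upper hull; a complete monotone-chain implementation adds a symmetric pass for the lower hull. The Python sketch below does both and also includes the law-of-cosines angle computation mentioned earlier for filtering hull vertices into fingertip candidates (the angle threshold itself is not specified in the text):

```python
import math

def cross(o, a, b):
    # z-component of (a - o) x (b - o); positive means a left turn.
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain: the report's pseudocode builds the upper
    hull; the lower hull is built the same way over the points in order.
    Returns the hull vertices in counter-clockwise order (reverse the
    list for the clockwise order the pseudocode specifies)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def vertex_angle(prev, v, nxt):
    """Interior angle (degrees) at hull vertex v via the law of cosines;
    sufficiently sharp angles flag fingertip-like kinks."""
    a, b, c = math.dist(v, prev), math.dist(v, nxt), math.dist(prev, nxt)
    return math.degrees(math.acos((a * a + b * b - c * c) / (2 * a * b)))

# Four corners of a square plus two interior points; only the corners
# belong to the hull, and each corner has a 90-degree interior angle.
pts = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)]
hull = convex_hull(pts)
```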

Flaws in convex hull

The convex hull method is fast but it is not robust. For example, when making a fist the
knuckles are identified as fingertips, since they are then the extreme points of the convex
hull; likewise, when a finger is moved inwards, its fingertip is lost. The figure below shows
the faults of the convex hull method.

Figure 5.4.1:
(b) Flaws in Convex Hull


5.4.2. K-CURVATURE

In this approach, the contour is represented as a list of boundary points
P(i) = (x(i), y(i)), and the k-curvature is computed, i.e. the angle between the two vectors
[P(i-k), P(i)] and [P(i), P(i+k)], where k is a constant (here k = 35). The k-curvature can
be calculated easily using the dot product. The main idea is that points whose k-curvature
is close to 0 are considered candidate points (they represent peaks and valleys). A
threshold angle th = 30 is used, such that only points below this angle are considered
further. To decide whether a candidate is a peak or a valley, the two vectors are treated as
3D vectors lying in the xy-plane and their cross product is computed: if the sign of the z
component is positive, the point is a peak, while a negative sign indicates a valley. By
counting the peaks and valleys, one can identify the gesture. The figure shows the idea
behind this approach.

Algorithm K-curvature

Let Ci represent the contour of the hand.

We attempt to find pixels that represent peaks along the contour perimeter.

At each pixel j in a hand contour i, the k-curvature is found, which is the angle
between the two vectors [Ci(j), Ci(j - k)] and [Ci(j), Ci(j + k)], where k is a constant.
The k-curvature can be computed with the help of vector algebra.

The equation used for the angle calculation, with v1 = Ci(j - k) - Ci(j) and
v2 = Ci(j + k) - Ci(j), is:

theta = arccos( (v1 . v2) / (|v1| |v2|) )

The contour points with a k-curvature close to 0 represent potential peaks or
valleys along the perimeter.

A threshold angle th = 30 for the k-curvature is used, such that only points below
this angle are considered further.

In order to classify the points as either peaks or valleys, convert the vectors into 3D
vectors lying in the xy-plane and then compute the cross product.

If the sign of the z component of the cross product is positive then label the point as
a peak, while a negative z component results in a valley label.

Figure 5.4.2: Live Video Results of K-curvature
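The algorithm translates almost directly into Python. The contour below is a synthetic stand-in for a real hand contour (a circle with one sharp spike playing the role of a raised finger, ordered clockwise so that peaks get a positive z sign), and k is scaled down from the report's k = 35 to suit its 360 points:

```python
import numpy as np

def k_curvature(contour, k=35, angle_thresh=30.0):
    """At each contour point P(i), compute the angle between the vectors
    P(i)->P(i-k) and P(i)->P(i+k); points below the angle threshold are
    candidates, and the z-sign of the cross product of the two vectors
    separates peaks (fingertips) from valleys."""
    n = len(contour)
    peaks, valleys = [], []
    for i in range(n):
        v1 = contour[(i - k) % n] - contour[i]
        v2 = contour[(i + k) % n] - contour[i]
        cos_a = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
        angle = np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
        if angle < angle_thresh:
            z = v1[0] * v2[1] - v1[1] * v2[0]   # z of the 3D cross product
            (peaks if z > 0 else valleys).append(i)
    return peaks, valleys

# Synthetic clockwise contour: circle of radius 10 with one sharp spike
# (radius 20) around the 90-degree position, standing in for a finger.
contour = np.array([
    [(20.0 if abs(t - 90) < 3 else 10.0) * np.cos(-np.radians(t)),
     (20.0 if abs(t - 90) < 3 else 10.0) * np.sin(-np.radians(t))]
    for t in range(360)
])
peaks, valleys = k_curvature(contour, k=10)
```

Counting the peaks then gives the number of raised fingers; here, one spike yields one cluster of peak candidates around index 90.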

Drawbacks of K-curvature

K-curvature cannot be used for dynamic gesture recognition.

It is not robust.

K-curvature will always identify a fist as showing one finger.


5.4.3. CURVATURE OF PERIMETER

In this technique, some morphological operations are used to find the fingertips.
Firstly, the segmented hand is eroded using the distance transform, and the perimeter of
the remaining hand region is found. After this, the corner points of the eroded version of
the segmented hand are explored; all the corner points become candidate points, and those
too close to the boundary are eliminated. At each remaining point, a section of the
perimeter of the segmented hand is cropped and its eccentricity is calculated. Low
eccentricities indicate shapes closer to circles, similar to the tips of fingers, so below a
certain eccentricity a candidate can be declared a fingertip; at other candidate points, such
as knuckles, the cropped section is close to a straight line. In this way the candidate points
at fingertips and the candidate points along the fingers become distinguishable. This
approach is quite robust and can be used for dynamic gesture recognition.

Algorithm Curvature of Perimeter

Step 1: The segmented hand is eroded and then the perimeter of the remaining hand
region is found. Erosion is a morphological operation that removes pixels on object
boundaries: the value of each output pixel is the minimum value of all the pixels in the
input pixel's neighbourhood, so in a binary image, if any pixel in the neighbourhood is 0,
the output pixel is set to 0.

Step 2: The corner points of the eroded version of the segmented hand are explored. All
the corner points become candidate points, and the corner points too close to the boundary
are eliminated.

Step 3: At each remaining point, a section of the perimeter of the segmented hand is
cropped and its eccentricity is calculated. Low eccentricities indicate shapes closer to
circles, similar to the tips of fingers, so below a certain eccentricity the point is taken to be
a fingertip; at other candidate points, such as knuckles, the section is close to a straight
line. The candidate points at fingertips and those along the fingers are thus
distinguishable.
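Step 3 hinges on eccentricity, which can be computed from the eigenvalues of the covariance matrix of the cropped perimeter points (Matlab's regionprops reports an equivalent quantity). The Python sketch below contrasts two idealized crops, a circular one and a straight one, both invented for the demonstration; real perimeter sections are arcs that fall between these extremes, so a threshold on the eccentricity separates fingertip-like from knuckle-like candidates:

```python
import numpy as np

def eccentricity(points):
    """Eccentricity of the best-fit ellipse of a 2D point set, from the
    eigenvalues of its covariance matrix: ~0 for a circle, ~1 for a line."""
    pts = np.asarray(points, dtype=float)
    lam = np.sort(np.linalg.eigvalsh(np.cov(pts.T)))[::-1]  # lam[0] >= lam[1]
    return float(np.sqrt(max(0.0, 1.0 - lam[1] / lam[0])))

theta = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
fingertip_like = np.c_[np.cos(theta), np.sin(theta)]             # circular crop
knuckle_like = np.c_[np.linspace(0.0, 5.0, 100), np.zeros(100)]  # straight crop

ecc_tip = eccentricity(fingertip_like)        # near 0: fingertip candidate
ecc_knuckle = eccentricity(knuckle_like)      # near 1: rejected as a line
```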

Figure 5.4.3: Live video results of Curvature of Perimeter: (a) Open hand, (b) Fist,
(c) Showing the fingertips and count

CHAPTER 6
UPCOMING TECHNOLOGIES

6.1 THE SIXTH SENSE DEVICE


Sixth Sense is a wearable gestural interface device similar to Telepointer, a neck
worn projector/camera system developed by Media Lab student Steve Mann (which Mann
originally referred to as "Synthetic Synaesthesia of the Sixth Sense"). The SixthSense
prototype comprises a pocket projector, a mirror and a camera. The hardware
components are coupled in a pendant-like mobile wearable device. Both the projector and
the camera are connected to the mobile computing device in the user's pocket. The
projector projects visual information enabling surfaces, walls and physical objects around
us to be used as interfaces; while the camera recognizes and tracks user's hand gestures and
physical objects using computer vision based techniques. The software program processes
the video stream data captured by the camera and tracks the locations of the coloured
markers at the tips of the user's fingers using simple computer vision techniques. The
movements and arrangements of these fiducials are interpreted into gestures that act as
interaction instructions for the projected application interfaces. The maximum number of
tracked fingers is only constrained by the number of unique fiducials, thus SixthSense also
supports multi-touch and multi-user interaction.

6.2 INTELS GESTURE TECHNOLOGY


What's Next? Gesture Recognition Technology from Intel Labs allows you to
interact with and control devices using simple hand gestures. Imagine a world where
gestures like turning an "air knob" could turn up the volume on your TV or waving your
hand would answer a phone that's in your pocket.
According to research, the target applications for AVX are interface technology to
control gaming and entertainment. Intel expects that this forthcoming technology will
reduce the need for specialized DSPs and GPUs. Smart computing is here, and visibly
smart at that. My personal opinion, though, is that Intel may make people lazier with the
launch of this next generation of gesture recognition technology. It is amazing just to think
about a world where we can control the TV, PC, washing machine and other devices at
home with just a gesture.

6.3 GESTURE TEK


GestureTek's Illuminate interactive multi-touch surface computing technology with
a motion sensing gesture control interface lets users navigate interactive content on a
floating panel, multimedia kiosk, multi touch surface screen, interactive table or interactive
window. Surfaces can be configured with a multi-touch interface for multi-touch or
multi-point interaction. With no projector or hardware to be seen, the effect is
unforgettable, as GestureTek's dynamic interactive displays react to every point of your
finger or wave of your hand, delivering a rich, interactive experience.
The hand tracking system lets you control multimedia in ways you never imagined,
transforming an ordinary surface into an interactive multi-touch surface computing
platform. Illuminate surfaces are available as interactive multi-touch display panels and
windows, interactive kiosks and multi-touch tables. Multi-touch interactive surface displays
come turnkey or can be customized to virtually any shape or size.


CHAPTER 7
CONCLUSION AND FUTURE WORK
The importance of gesture recognition lies in building efficient human-machine interaction.
Its applications range from sign language recognition through medical rehabilitation to
virtual reality. Three techniques for hand gesture recognition, namely convex hull,
k-curvature and curvature of perimeter, have been discussed. Experimental results show
that curvature of perimeter is the most robust approach among them.

Some possible improvements are:

The methodology for segmenting the hand from the background can be improved.

Lastly, the algorithms can be optimized further by converting them to fixed point and
generating C code from them, so that they can be embedded into systems such as DSPs
or FPGAs.


REFERENCES
[1] "Real Time Hand Gesture Recognition for Computer Interaction", 2014 International
Conference on Robotics and Emerging Allied Technologies in Engineering (iCREATE),
Islamabad, Pakistan, April 22-24, 2014.

[2] Oleksiy Busaryev, "Gesture Recognition with Applications", Department of Computer
Science and Engineering, The Ohio State University, Columbus, OH 43210, USA.
busaryev@cse.ohio-state.edu

[3] International Journal of Scientific & Engineering Research, Volume 4, Issue 3,
March 2013.

[4] http://www.mathworks.in/help/images/examples/color-based-segmentation-using-the-l-a-b-color-space.html

[5] http://en.wikipedia.org/wiki/Erosion_(morphology)

[6] http://www3.cs.stonybrook.edu/~algorith/files/convex-hull.shtml
