
ISSN: 2277-9043 International Journal of Advanced Research in Computer Science and Electronics Engineering, Volume 1, Issue 2, April 2012

Self Organizing Markov Map for Speech and Gesture Recognition


Ms. Nutan D. Sonwane, Prof. S. A. Chhabria, Dr. R. V. Dharaskar

Abstract: Gesture- and speech-based human-computer interaction is attracting attention across areas such as pattern recognition and computer vision. This line of research finds application in multimodal HCI, robotics control, and sign language recognition. This paper presents a head and hand gesture and speech recognition system for human-computer interaction (HCI). Such a vision-based system demonstrates the capability of a computer to understand and respond to hand and head gestures, as well as to speech in the form of sentences. The recognition system consists of two main modules: 1. gesture recognition, whose phases are i. image capturing, ii. feature extraction of the gesture, iii. gesture modeling (direction, position, generalized); and 2. speech recognition, whose phases are i. acquiring voice signals, ii. spectral coding, iii. unit matching (BMU), iv. lexical decoding, v. syntactic and semantic analysis. Compared with many existing algorithms for gesture and speech recognition, SOM provides flexibility and robustness in noisy environments. The detection of gestures is based on discrete predefined symbol sets, which are manually labeled during the training phase. The gesture-speech correlation is modelled by examining co-occurring speech and gesture patterns. This correlation can be used to fuse the gesture and speech modalities for edutainment applications (e.g. video games, 3-D animations) in which the natural gestures of talking avatars are animated from speech. A speech-driven gesture animation example has been implemented for demonstration.

Keywords: Gesture recognition, human computer interaction, speech recognition, self organizing map, Markov model

I INTRODUCTION

This paper presents a head and hand gesture and speech recognition system for human-computer interaction (HCI). Such a vision-based system demonstrates the capability of a computer to understand and respond to hand and head gestures and to speech in the form of sentences. The system consists of four modules, namely 1. manual module, 2. head tracker, 3. hand recognition, and 4. voice recognition, which together handle various symbolic gesture commands and voice commands. Gesture recognition consists of the phases i. image capturing, ii. feature extraction of the gesture, iii. gesture modeling (direction, position, generalized). Speech recognition consists of the phases i. acquiring voice signals, ii. spectral coding, iii. best matching unit (BMU) matching, iv. lexical decoding, v. syntactic and semantic analysis. Compared with many existing algorithms for gesture and speech recognition, SOMM (Self Organizing Markov Map) provides flexibility and robustness in noisy environments. The approach combines Self Organizing Maps (SOM) and Markov Models. Its most effective application is the development of robust and friendly interfaces for human-machine interaction, since gesture and speech are natural and powerful means of communication.

Principal Component Analysis (PCA) is a classical feature extraction technique widely used in pattern recognition and computer vision, and it has been applied to gesture recognition [1]. PCA is a classical statistical technique for analyzing the covariance structure of multivariate data; gesture recognition with PCA involves two phases, a training phase and a recognition phase. The Self-Growing and Self-Organized Neural Gas (SGONG) network [2] is an unsupervised neural classifier. It clusters the input data so that the distance between data items within the same class (intra-cluster variance) is small and the distance between data items from different classes (inter-cluster variance) is large. The final number of classes is determined by the SGONG during the learning process. [3] describes a self organizing map (SOM) method for speech recognition, and [4] describes a layered, modular system based on the hidden Markov model (HMM).

The SOMM architecture for gesture recognition fuses separate component models, all of which are based on the hand trajectory. The approach combines Self Organizing Maps and Markov Models [5] for gesture trajectory classification, using the trajectory of the hand segment and the direction of motion during a gesture. This classification scheme transforms a gesture representation from a series of coordinates and movements into a symbolic form and builds probabilistic models on the transformed representations.

Automatic speech recognition [6] is the process by which a machine identifies speech. The machine takes a human utterance as input and returns a string of words, phrases, or continuous speech in the form of text as output. Together with gesture, speech is a natural and powerful way of communication [2][3][4][6].
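To make the idea of turning a gesture from a series of coordinates into a symbolic form more concrete, here is a minimal sketch. It is not the SOMM method of [5]; it only assumes that the direction of each hand movement is quantized into one of eight sectors. The function name `direction_symbols`, the eight-sector alphabet, and the sample trajectory are illustrative assumptions.

```python
import math

def direction_symbols(points, n_directions=8):
    """Map each movement between consecutive (x, y) points to a direction index.

    Index 0 is the positive x direction; indices increase counter-clockwise.
    The number of sectors is an assumption made for this illustration.
    """
    sector = 2 * math.pi / n_directions
    symbols = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # no movement between frames, so no symbol is emitted
        angle = math.atan2(dy, dx) % (2 * math.pi)
        # Shift by half a sector so that each index is centred on its direction.
        symbols.append(int((angle + sector / 2) // sector) % n_directions)
    return symbols

# Hypothetical hand trajectory: right, up-right, up, left.
trajectory = [(0, 0), (1, 0), (2, 1), (2, 2), (1, 2)]
print(direction_symbols(trajectory))
```

A symbol sequence of this kind is the sort of discrete input on which a probabilistic sequence model, such as a Markov model, can then be built.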

Figure 1: Symbolic Hand Gesture

119 All Rights Reserved 2012 IJARCSEE


II SELF ORGANIZING MAP

A self-organizing map (SOM), or self-organizing feature map [3], is a type of artificial neural network that is trained using unsupervised learning to produce a low-dimensional (typically two-dimensional), discretized representation of the input space of the training samples, called a map. Self-organizing maps differ from other artificial neural networks in that they use a neighborhood function to preserve the topological properties of the input space. Training builds the map using input examples; it is a competitive process, also called vector quantization. Mapping automatically classifies a new input vector. A self-organizing map consists of components called nodes or neurons. SOM training has three stages: 1) initialization, 2) getting the best matching unit, 3) scaling the neighbors.

Step I: Initialization
Initialize the weight vector map: before training, each weight (prototype) vector is given initial values. The SOM is very robust with respect to the initialization, but a properly chosen initialization allows the algorithm to converge faster to a good solution. Typically one of the following three initialization procedures is used:
1. Random initialization, where the weight vectors are initialized with small random values.
2. Sample initialization, where the weight vectors are initialized with random samples drawn from the input data set.
3. Linear initialization, where the weight vectors are initialized in an orderly fashion along the linear subspace spanned by the two principal eigenvectors of the input data set.
The eigenvectors can be calculated using the Gram-Schmidt procedure. In SOM Toolbox, random and linear initializations have been implemented. Random initialization takes random values from the d-dimensional cube defined by the minimum and maximum values of the variables. Linear initialization selects a mesh of points from the d-dimensional min-max cube of the training data; the axes of the mesh are the eigenvectors corresponding to the m greatest eigenvalues of the training data.

Step II: Get best matching unit
Go through all the weight vectors and calculate the distance from each weight to the chosen sample vector. The weight with the shortest distance is the winner. If more than one weight has the same shortest distance, the winning weight is chosen randomly among them. The most common method is to use the Euclidean distance. The distances are calculated and compared over the entire map, and the weight with the shortest distance to the sample vector is the winner, the best matching unit (BMU). The square root is not computed in the program, as a speed optimization: comparing squared distances selects the same winner.

Step III: Scale neighbors
1) Determining neighbors
There are two parts to scaling the neighboring weights: determining which weights are considered neighbors, and determining how much each weight can become more like the sample vector. The neighbors of a winning weight can be determined using a number of different methods. Some use concentric squares, others hexagons.
2) Learning
The goal of learning in the self-organizing map is to cause different parts of the network to respond similarly to certain input patterns. The second part of scaling the neighbors is the learning function. The winning weight is rewarded by becoming more like the sample vector. The neighbors also become more like the sample vector. An attribute of this learning process is that the farther away a neighbor is from the winning vector, the less it learns. The rate at which the amount a weight can learn decreases with distance can also be set. Here a Gaussian function is used. It returns a value t between 0 and 1, and each neighbor is then changed using the parametric equation. In the first iteration, the best matching unit gets a t of 1 from its learning function, so its weight comes out of this process with exactly the same values as the randomly selected sample.
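The three SOM stages described in this section (initialization, best-matching-unit search, neighbor scaling) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses random initialization, a Gaussian of the squared grid distance as the learning function, and the parametric update `w * (1 - t) + sample * t`. The map size, the `radius` value, the fixed seed, and the function names are hypothetical, and the decay of the learning amount over iterations is left out.

```python
import math
import random

def squared_distance(a, b):
    # The square root is skipped: squared distances pick the same winner.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def find_bmu(weights, sample):
    """Return (row, col) of the best matching unit; ties are broken at random."""
    best, winners = None, []
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = squared_distance(w, sample)
            if best is None or d < best:
                best, winners = d, [(r, c)]
            elif d == best:
                winners.append((r, c))
    return random.choice(winners)

def train_step(weights, sample, radius):
    """Pull every node toward the sample, scaled by a Gaussian of grid distance."""
    bmu_r, bmu_c = find_bmu(weights, sample)
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            grid_d2 = (r - bmu_r) ** 2 + (c - bmu_c) ** 2
            t = math.exp(-grid_d2 / (2.0 * radius ** 2))  # t is 1 at the BMU
            row[c] = [wi * (1.0 - t) + si * t for wi, si in zip(w, sample)]
    return bmu_r, bmu_c

# Step I: random initialization of a small 4 x 4 map of 3-dimensional weights.
random.seed(0)
rows, cols, dim = 4, 4, 3
weights = [[[random.random() for _ in range(dim)] for _ in range(cols)]
           for _ in range(rows)]

# Steps II and III for one sample vector.
sample = [0.2, 0.8, 0.5]
bmu = train_step(weights, sample, radius=1.5)
print(bmu, weights[bmu[0]][bmu[1]])
```

Because t equals 1 at the winning node, its weight becomes identical to the sample after the update, which matches the behaviour described for the first iteration. A full training run would repeat `train_step` over many samples while shrinking the radius.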

III HIDDEN MARKOV MODEL

A hidden Markov model (HMM) is a statistical Markov model [4] in which the system being modeled is assumed to be a Markov process with unobserved (hidden) states. An HMM can be considered the simplest dynamic Bayesian network. In a regular Markov model, the state is directly visible to the observer, and therefore the state transition probabilities are the only parameters. In a hidden Markov model, the state is not directly visible, but the output, which depends on the state, is visible. Each state has a probability distribution over the possible output tokens. Therefore the sequence of tokens generated by an HMM gives some information about the sequence of states.

The parameters of a hidden Markov model are of two types: 1. transition probabilities and 2. emission probabilities (also known as output probabilities). The transition probabilities control the way the hidden state at time t is chosen given the hidden state at time t - 1. The hidden state space is assumed to consist of one of N possible values, modeled as a categorical distribution. This means that for each of the N possible states that a hidden variable at time t can be in, there is a transition probability from this state to each of the N possible states of the hidden variable at time t + 1, for a total of N^2 transition probabilities. (Note, however, that the set of transition probabilities for transitions from any given state must sum to 1, meaning that any one transition probability can be determined once the others are known, leaving a total of N(N - 1) transition parameters.)
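As an illustration of how the observed tokens carry information about the hidden states, the sketch below decodes the most likely state sequence with the Viterbi algorithm on a toy model. The two states `A` and `B`, the tokens `x` and `y`, and all probabilities are made-up values for this example only; they are not taken from the system described in this paper.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p) for p in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path

# Toy model with N = 2 hidden states (all numbers are illustrative assumptions).
states = ("A", "B")
start_p = {"A": 0.6, "B": 0.4}
trans_p = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit_p = {"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.2, "y": 0.8}}

# Each row of the transition table sums to 1, so only N(N - 1) = 2 of the
# N^2 = 4 transition probabilities are free parameters.
assert all(abs(sum(row.values()) - 1.0) < 1e-9 for row in trans_p.values())

print(viterbi(["x", "y", "y"], states, start_p, trans_p, emit_p))
```

For long sequences the products become very small; a practical implementation would typically work with log probabilities instead.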


III. ALGORITHM

Figure 2: Self Organizing Map

Kohonen algorithm:
Step 1. Randomize the weight vectors of the map's nodes.
Step 2. Grab an input vector.
Step 3. Traverse each node in the map:
i) Use the Euclidean distance formula to find the similarity between the input vector and the node's weight vector.
ii) Track the node that produces the smallest distance (this node is the best matching unit, BMU).
Step 4. Update the nodes in the neighborhood of the BMU by pulling them closer to the input vector.
Step 5. Increase t and repeat from Step 2.

The Markov model involves several algorithms. The Viterbi algorithm is used to find the most likely sequence of hidden states, called the Viterbi path. The Baum-Welch algorithm is used to find the set of state transition and output probabilities for a sequence.
Step 1. The (potentially) occupied state at time t is called qt.
Step 2. A state can be referred to by its index, e.g. qt = j.
Step 3. One event corresponds to one state: at each time t, the occupied state outputs (emits) its corresponding output.

A Markov model is a generator of events. Each event is discrete and has a single output. In a typical finite-state machine, actions occur at transitions, but in most Markov models, actions occur at each state. In a speech recognition system, training takes as input a large number of speech utterances along with their transcriptions into phonemes and outputs the speech models for the phonemes. Hidden Markov models can model complex Markov processes in which the states emit the observations according to some probability distribution. One such distribution is the Gaussian distribution; in such a hidden Markov model, the output of each state is represented by a Gaussian distribution. HMMs use several techniques to solve problems: 1) the forward and backward algorithms, 2) the Viterbi algorithm and posterior decoding, 3) the Baum-Welch algorithm.

IV HARDWARE COMPONENT AS WHEELCHAIR ROBOT

The wheelchair robot moves according to the commands given to it through various symbolic gestures and voice commands. The system takes symbolic gesture commands as input to the hardware, and the robot moves accordingly. The wheelchair robot is made up of the following hardware components:
a) Microcontroller
i. 2K bytes of Flash
ii. 128 bytes of RAM
iii. 15 I/O lines
iv. Two 16-bit timer/counters
v. A five-vector, two-level interrupt architecture
vi. A full-duplex serial port
vii. A precision analog comparator
viii. On-chip oscillator and clock circuitry
b) Other devices: DC motor, TX-RX antenna, USB-to-serial connector, battery

Figure 3: Wheelchair Robot
Figure 4: Internal circuit of Robot


V MODULES

The system has four modules: i. manual mode, ii. head gesture, iii. hand gesture, iv. voice recognition. All of these modules are included in one system, and the robot moves according to the order of the commands it receives in the form of gestures and speech.
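The paper does not specify the command vocabulary or how commands from the four modules are combined, so the following is only a rough sketch of one possible arrangement: commands are executed strictly in the order in which they arrive, whichever module produced them. The command names, the (left, right) motor settings, and the function `run_commands` are all hypothetical.

```python
from collections import deque

# Hypothetical command vocabulary mapped to (left motor, right motor) settings;
# the source does not list the actual commands or motor interface.
MOTIONS = {
    "forward": (1, 1),
    "backward": (-1, -1),
    "left": (-1, 1),
    "right": (1, -1),
    "stop": (0, 0),
}

# The four modules named in the text, as illustrative identifiers.
MODULES = ("manual", "head_gesture", "hand_gesture", "voice")

def run_commands(commands):
    """Execute (module, command) pairs in arrival order; return the motor settings used."""
    queue = deque(commands)
    executed = []
    while queue:
        module, command = queue.popleft()
        if module not in MODULES or command not in MOTIONS:
            continue  # ignore unknown input rather than move unpredictably
        executed.append((module, command, MOTIONS[command]))
    return executed

for step in run_commands([("voice", "forward"), ("head_gesture", "left"), ("manual", "stop")]):
    print(step)
```

Keeping a single queue means that a spoken command and a gesture command are treated alike once recognized, which is one simple reading of "according to the order of the commands".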

Speech and Gesture Recognition:

Figure 5: Speech and Gesture Recognition

I. Manual mode

Figure 6: Manual mode

II. Head Gesture
III. Hand Gesture

Figure 7: Head Gesture / Hand Gesture

Figure 8: Speech Recognition

These are the outputs of the individual modules, each of which performs its work according to the command given.

CONCLUSION

The proposed system includes both approaches, speech recognition as well as gesture recognition. The system takes input in the form of speech signals and of gestures given as hand and head coordinates. The system also uses a wheelchair as the hardware device for interaction.

REFERENCES

[1] Soloman Raju Kota, J. L. Reheja, Ashutosh Gupta, Archna Rathi, Shashikant Sharma, "Principal component analysis for Gesture recognition using SystemC," 2009 international conferences in advance technology in communication and computing, IEEE, 2009.
[2] Yean Choon Ham, Yu Shi, "Developing a Smart Camera for Gesture Recognition in HCI Applications," The 13th IEEE International Symposium on Consumer Electronics (ISCE2009), 978-1-4244-2976-9/09/$25.00, IEEE, 2009.
[3] E. Stergiopoulou and N. Papamarkos, "A New Technique For Hand Gesture Recognition," 1-4244-0481-9/06, IEEE, 2006.
[4] Anjali Kalra, Sarbjeet Singh, Sukhvinder Singh, "Speech Recognition," International Journal of Computer Science and Network Security, Vol. 10, 2010.
[5] George Caridakis, Kostas Karpouzis, Athanasios Drosopoulos, Stefanos Kollias, "SOMM: Self organizing Markov map for gesture recognition," Pattern Recognition Letters 31, 2010.
[6] Wu Song-Lin, Cui Rong-Yi, "Human Behavior Recognition Based on Sitting Postures," 2010 International Symposium on Computer, Communication, Control and Automation, 978-1-4244-5567-6/10, IEEE, 2010.
[7] Jagdish Lal Raheja, Radhey Shyam, "Real Time Robotic Hand Control Using Hand Gesture," 978-0-7695-3977-5/10, IEEE, 2010.
[8] Chetan A. Burande, Raju M. Tugnayat, Nitin K. Choudhary, "Advanced Recognition Techniques for Human Computer Interaction," 978-1-4244-5586-7/10, IEEE, 2010.
[9] Shuai Jin, Guang-ming Lu, Jian-xun Luo, Wei-dong Chen, Xiao-xiang Zheng, "SOM-based Hand Gesture Recognition for Virtual Interactions," IEEE International Symposium on Virtual Reality Innovation, 2011.
[10] G. R. S. Murthy, R. S. Jadon, "Hand gesture recognition using neural network," 2nd International Advance Computing Conference, 2010.
[11] M. Ajallooeian, A. Borji, B. N. Araabi, M. Nili Ahmadabadi, H. Moradi, "Fast Hand Gesture Recognition based on Saliency Maps: An Application to Interactive Robotic Marionette Playing," The 18th IEEE International Symposium on Robot and Human Interactive Communication, Toyama, Japan, Sept. 27-Oct. 2, 2009, 978-1-4244-5081-7/09, IEEE, 2009.
[12] Wei-Hua Andrew Wang, Chun-Liang Tung, "Dynamic Hand Gesture Recognition Using Hierarchical Dynamic Bayesian Networks Through Low-Level Image Processing," Proceedings of the Seventh International Conference on Machine Learning and Cybernetics, Kunming, 12-15 July 2008, 978-1-4244-2096-4/08, IEEE, 2008.
[13] Sridhar P. Arjunan, Dinesh K. Kumar, School of Electrical and Computer Engineering, "Recognition of facial movements and hand gestures using surface Electromyogram (sEMG) for HCI based applications," 0-7695-3067-2/07, IEEE, 2007.
[14] T. Nakano, T. Morie, M. Nagata, A. Iwata, "A Cellular-Automaton-Type Image Extraction Algorithm and Its Implementation Using an FPGA," 0-7803-7690-0/02/$17.00, IEEE, 2002.

First Author: Ms. Nutan D. Sonwane, IV sem MTech [CSE], G.H. Raisoni College of Engineering, Nagpur, R.T.M.N.U, Nagpur
Second Author: Prof. S. A. Chhabria, HOD [IT] Department, G.H. Raisoni College of Engineering, Nagpur, R.T.M.N.U, Nagpur
Third Author: Dr. R. V. Dharaskar, Director of Matoshri Pratishthan's Group of Institutions (MPGI) Integrated Campus, Nanded, India, S.R.T.M Nanded University
