
2017 Fourth International Conference on Image Information Processing (ICIIP)

Graph Fourier Transform Based Descriptor for Gesture Classification
Jaya Shukla∗, Vijay Kumar Chakka†, A. Satyanarayana Reddy‡, and Madan Gopal†
∗ Department of Computer Science and Engineering, Shiv Nadar University. Email: js234@snu.edu.in
† Department of Electrical Engineering, Shiv Nadar University. Email: vijay.chakka@snu.edu.in, mgopal@snu.edu.in
‡ Department of Mathematics, Shiv Nadar University. Email: satyanarayana.reddy@snu.edu.in

Abstract—This paper proposes a method for gesture classification based on Graph Fourier transform (GFT) coefficients. GFT coefficients are the projection of an image pixel block onto the eigenvectors of a Laplacian matrix. This Laplacian matrix is generated from an undirected graph representing the spatial connectedness between the pixels within an image block. This work proposes a method for generating such an undirected graph by using the edge information of the image. The edge information is obtained from the average sum of absolute differences between the current pixel and its neighboring pixels, using an appropriate threshold. The resulting GFT based feature vector is formed by concatenating the GFT coefficients of each block. The resultant feature vector is applied to a linear Support Vector Machine (SVM) classifier to predict the gesture class. For the NTU and Massey hand gesture datasets, a threshold value of 30 gives the maximum prediction accuracy. We compare the results of the proposed GFT based descriptor with Karhunen-Loeve transform (K-LT) and Discrete Cosine transform (DCT) based descriptors on three different gesture datasets: NTU, Cambridge and Massey. Simulation results show that the proposed GFT based descriptor gives results comparable with the K-LT and DCT based descriptors for gesture classification.

I. INTRODUCTION

In the recent past, researchers have renewed their interest in applying graph theory concepts to discrete signal processing, a field termed Graph Signal Processing. Graph signal processing uses techniques of algebraic graph theory and computational harmonic analysis to process signals/data generated from areas such as transportation, social, and sensor networks [1]-[2]. The growing interest in graph based discrete signal processing has encouraged the study of transform approaches for signals residing on a graph, termed the Graph Fourier transform.

In the literature, GFT has found applications such as image compression [3], image interpolation, image denoising [4] and attitude analysis [5].

A. Motivation and Contributions

Despite active research over the last two decades, image classification still faces challenges such as illumination conditions, variations in shape, size and speed, background details, and issues due to occlusion [6]. The prediction accuracy of image classification algorithms relies on the choice of a suitable feature descriptor.

For images, graphs have the ability to model local relations between image pixels and edges. In Graph Signal Processing, converting an image into a graph captures spatial as well as structural information: the image is mapped onto a structural graph where nodes represent the pixels and edges depict the relationships between nodes.

In this work, we focus on the discriminative strength of the GFT based method. The GFT is the projection of an image block onto the eigenvectors of a Laplacian matrix obtained from the proposed undirected graph representation of the image block. The DCT and the Discrete Fourier transform (DFT) are fixed transforms with no signal adaptation, i.e., if the size of the block is fixed, then the DFT and DCT transformation matrices are fixed. In the GFT, by contrast, different gesture sequences produce different numbers of nodes and different edges under the proposed graph structure. This leads to different sets of graph Laplacian eigenvectors, i.e., distinct graph Fourier transform bases. Thus, the graph frequency representations of different gesture sequences will be different.

In the classical graph construction approach, when neighboring pixels are close or when pixels i and j belong to the same region, the corresponding weight in the adjacency matrix will be large; if pixels i and j belong to different regions, the weight will be small [7]. In this work, however, we construct an undirected graph based on the edge information of the image. In general, the sum of absolute differences (SAD) is used to measure the similarity between an original image and a reference image [8]. We use the Average of SAD (ASAD) to obtain the edge information of the image. To find the edge information of an image I, we construct a binary matrix, called the "image edge information matrix A(I)", of the same size as I. We then construct the adjacency matrix A from the image edge information matrix A(I) based on the k-nearest neighborhood method.

This work presents a new image classification method based on a GFT feature descriptor. The main contributions of this work are:

978-1-5090-6734-3/17/$31.00 ©2017 IEEE



1) Proposed a method for gesture classification based on the Graph Fourier Transform (GFT).
2) Proposed a method for generating an undirected graph using an appropriate threshold to preserve local neighborhood information.

B. Related Work

In the application areas of sign language recognition, human-computer interaction, virtual reality and computer graphics, hand gesture recognition has become of great importance [9]. Despite active research over the last two decades, vision based hand gesture recognition methods [10]-[12] still face challenges such as illumination conditions, variations in shape, size and speed, background details and issues due to occlusion [6]. Extensive literature related to hand gesture recognition systems in particular, and gesture recognition in general, is given in [10]-[12]. In the literature, the feature descriptors used for classification in vision based methods are mainly based on shape, i.e., the contour or region information of the object. S. Belongie et al. [13] presented a method that measures similarity between shapes and used this descriptor for object recognition. H. Ling et al. [14] proposed a method based on part structure; they built shape descriptors which capture the part structure and are robust to articulation using the inner-distance. X. Bai et al. [15] presented a graph matching algorithm and applied it to shape recognition based on object silhouettes.

In other feature extraction methods, features (HOG, DFT, DCT, K-LT, 3D model based) of training images are used to model the visual appearance, and these features are compared with the features of test images [16]. In 2005, Dalal et al. [17] used the Histogram of Oriented Gradients (HOG) for human detection. Liu et al. [18] used a Gabor filter as a feature descriptor, followed by principal component analysis for dimensionality reduction, and a Support Vector Machine as a classifier under different illumination conditions. Z. Ren et al. [19] proposed a robust part based hand gesture recognition system using the Kinect sensor on the NTU hand gesture dataset.

C. Organization of the Paper

The GFT feature descriptor and its method of construction are described in Section II. Section III explains the proposed classification method. Section IV discusses the experimental setup, followed by results and discussion. Section V presents the comparison between the GFT, DCT and K-L transforms for gesture images. Concluding remarks and future directions are given in Section VI.

II. GRAPH FOURIER TRANSFORM FOR IMAGES

A directed graph (in short, digraph), denoted G = (V, E), consists of the vertex set V and the edge set E ⊂ V × V. If e = (u, v) ∈ E, then the edge e is said to be incident with vertices u and v. An edge e = (u, u) is called a loop. A digraph is called a graph if for any two elements u, v ∈ V, (u, v) ∈ E whenever (v, u) ∈ E, and in this case we write e = {u, v}. We also say that the vertices u and v are adjacent. Let v be a vertex of a graph G; then the degree of v, denoted d(v), is the number of edges incident at v. For any finite set S, let |S| denote the number of elements in S. A graph/digraph is said to be finite if |V| and |E| are finite. A graph is called simple if it has no loops. In this paper, "graph" means a finite, simple, undirected graph.

The adjacency matrix of a graph G = (V, E) with |V| = N is the N × N matrix, denoted A(G) = [a_ij] (or A), with a_ij = 1 if the i-th vertex is adjacent to the j-th vertex and 0 otherwise.

Another matrix associated with the graph G is the Laplacian matrix, denoted L(G) or simply L, defined as

L = D − A,    (1)

where A is the adjacency matrix and D is the degree matrix, a diagonal matrix whose diagonal entries are given by

D_ii = Σ_{j=1}^{N} A_ij.    (2)

It is easy to check that L is a symmetric, positive semidefinite matrix; hence its eigenvalues are nonnegative and real, and can be ordered as 0 = λ_0 ≤ λ_1 ≤ λ_2 ≤ · · · ≤ λ_{N−1}. U. von Luxburg explains these concepts in his tutorial on spectral clustering [20]. In 2003, He and Niyogi [21] showed that the Laplacian of a graph incorporates neighborhood information of the dataset. Hence, in this work, we construct GFT coefficients based on the Laplacian matrix of the graph, which preserves the neighborhood information.

A graph signal S is a real-valued function defined on the vertices of the graph, i.e., S : V → R, v ↦ S(v). A signal S can also be represented as a vector S ∈ R^N, where the i-th component of S denotes the signal value at the i-th vertex in V.

The classical Fourier transform is the expansion of a signal f in terms of the eigenfunctions of the Laplacian operator, i.e., f̂(ζ) = ⟨f, e^{2πiζt}⟩. Analogously, the GFT of a signal S ∈ R^N on the vertices of G is the expansion of S in terms of the eigenvectors of the graph Laplacian. The GFT coefficients are the inner products of S with the eigenvectors of the graph Laplacian matrix, i.e.,

X_GFT = T^H · S    (3)

where T is the matrix whose columns are the eigenvectors of the Laplacian matrix of G, T^H denotes the conjugate transpose of T, and S is the vector of pixel values of the image block. These GFT coefficients are used as feature descriptors for classification.

A. GFT Feature Descriptor Computation

Given an N × N image I whose ij-th pixel value is denoted by I(i, j), to find the edge information of the image we calculate, for each pixel I(i, j), the ASAD θ_{i,j} from its 8 neighboring pixels as follows:


• For the corner pixels, i.e., the first and last pixels of the first row and the first and last pixels of the last row, θ_{i,j} is calculated as given in equation 4 (written here for the top-left corner; the other corners are handled symmetrically):

θ_{i,j} = (1/3) Σ_{k=0}^{1} Σ_{l=0}^{1} |I(i + k, j + l) − I(i, j)|    (4)

• For the pixels on the boundary but not on a corner, θ_{i,j} is calculated as given in equation 5 (written for the top boundary):

θ_{i,j} = (1/5) Σ_{k=0}^{1} Σ_{l=−1}^{1} |I(i + k, j + l) − I(i, j)|    (5)

• For the rest of the image, θ_{i,j} is calculated as given in equation 6:

θ_{i,j} = (1/8) Σ_{k=−1}^{1} Σ_{l=−1}^{1} |I(i + k, j + l) − I(i, j)|    (6)

If the ASAD θ_{i,j} is greater than some threshold value Γ, the value 1 is assigned to that pixel, else 0, as given in equation 7:

A(i, j) = 1 if θ_{i,j} > Γ;  0 if θ_{i,j} ≤ Γ    (7)

The value of θ_{i,j} gives a measure of the local smoothness of the signal S around the vertex I(i, j): θ_{i,j} is small when the signal has similar values at I(i, j) and all its neighboring vertices. Different threshold values Γ, namely 10, 20, 30, 40 and 50, have been used to obtain the edge information from the image. We illustrate this in the following example for Γ = 30.

Example II.1. Let I be an image block, defined as

I = [51 125 169; 70 59 56; 77 44 30].

Then the edge information for this image block based on 8-connectivity is calculated as

A(I) = [1 1 1; 1 1 1; 0 0 0].

Here,
θ_{2,2} = (1/8)[|59 − 51| + |59 − 125| + |59 − 169| + |59 − 70| + |59 − 56| + |59 − 77| + |59 − 44| + |59 − 30|] = 260/8 > 30,
θ_{1,2} = (1/5)[|125 − 51| + |125 − 169| + |125 − 70| + |125 − 59| + |125 − 56|] = 308/5 > 30,
θ_{1,1} = (1/3)[|I(1, 1) − I(1, 2)| + |I(1, 1) − I(2, 1)| + |I(1, 1) − I(2, 2)|] = 101/3 > 30.

Now we construct the adjacency matrix A using the edge information in the binary matrix A(I) based on the k-nearest neighborhood method. The adjacency matrix A is a symmetric matrix of size N² × N² with entries 0 or 1. In A, rows and columns are indexed by the N² pixels, and A is defined by A_{i,i} = 0 for all 1 ≤ i ≤ N², and A_{i,j} = 1 if i and j are adjacent pixels (eight-neighborhood) in the binary image matrix A(I) and both of their values in A(I) are 1.

The adjacency matrix A, degree matrix D, Laplacian matrix L, and matrix of eigenvectors T obtained from the binary matrix A(I) of Example II.1 are shown in Figure 1.

Fig. 1: An example of constructing GFT coefficients from the 3 × 3 pixel block of Example II.1. The corresponding adjacency A, degree D and Laplacian L matrices have been computed; the eigenvector matrix T of the Laplacian L and the resulting GFT coefficients are shown.

For calculating the GFT coefficients, we stack the pixels of an N × N block column-wise into a length-N² vector S as described below:

S = [C_{11}, . . . , C_{N1}, C_{12}, . . . , C_{N2}, . . . , C_{1N}, . . . , C_{NN}]^T    (8)

where C_{11}, C_{21}, . . . , C_{N1} is the first column of the image block and C_{1N}, C_{2N}, . . . , C_{NN} is the last column. The size of X_GFT will therefore be N² × 1.

III. BLOCK DIAGRAM REPRESENTATION OF PROPOSED METHODOLOGY

Our proposed method is illustrated in Fig. 2, which outlines the whole procedure of image classification step by step.

Fig. 2: Block diagram for the proposed classification method
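The per-pixel ASAD rule of Eqs. (4)-(7) and the pixel-pair rule for the adjacency matrix above can be sketched in a few lines. The sketch below is in Python (the paper's own experiments use MATLAB), and the names `asad_edge_map` and `adjacency_from_edge_map` are illustrative, not from the paper; the worked values θ_{2,2} = 260/8 and θ_{1,2} = 308/5 of Example II.1 fall out of the same loop.

```python
import numpy as np

def asad_edge_map(I, gamma=30):
    """Binary edge-information matrix A(I): for each pixel, average the
    absolute differences to its in-bounds 8-neighbors (3 at a corner, 5 on
    a boundary, 8 in the interior; Eqs. 4-6) and threshold at gamma (Eq. 7)."""
    n, m = I.shape
    E = np.zeros((n, m), dtype=int)
    for i in range(n):
        for j in range(m):
            diffs = [abs(int(I[i + k, j + l]) - int(I[i, j]))
                     for k in (-1, 0, 1) for l in (-1, 0, 1)
                     if (k, l) != (0, 0)
                     and 0 <= i + k < n and 0 <= j + l < m]
            theta = sum(diffs) / len(diffs)        # ASAD of pixel (i, j)
            E[i, j] = 1 if theta > gamma else 0    # thresholding, Eq. (7)
    return E

def adjacency_from_edge_map(E):
    """N^2 x N^2 symmetric 0/1 adjacency matrix: two pixels are connected
    iff they are 8-neighbors and both carry a 1 in the edge map. Pixels are
    indexed column-wise to match the stacking of Eq. (8)."""
    n, m = E.shape
    idx = lambda i, j: j * n + i                   # column-major pixel index
    A = np.zeros((n * m, n * m), dtype=int)
    for i in range(n):
        for j in range(m):
            for k in (-1, 0, 1):
                for l in (-1, 0, 1):
                    if (k, l) == (0, 0):
                        continue
                    ii, jj = i + k, j + l
                    if 0 <= ii < n and 0 <= jj < m and E[i, j] and E[ii, jj]:
                        A[idx(i, j), idx(ii, jj)] = 1
    return A

# The 3 x 3 block of Example II.1
I = np.array([[51, 125, 169],
              [70,  59,  56],
              [77,  44,  30]])
E = asad_edge_map(I, gamma=30)
A = adjacency_from_edge_map(E)
S = I.flatten(order='F')   # column-wise stacking of the block, Eq. (8)
```

On this block the loop reproduces θ_{2,2} = 32.5 > 30 and θ_{1,2} = 61.6 > 30, so the top row of E is all 1 and the bottom row all 0, matching the first and last rows of A(I) in Example II.1.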

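From the adjacency matrix, Eqs. (1)-(3) reduce to a single symmetric eigendecomposition. A minimal Python sketch (again illustrative, not the authors' MATLAB implementation; `gft_coefficients` is a hypothetical name), shown on a toy 3-vertex path graph:

```python
import numpy as np

def gft_coefficients(A, S):
    """GFT of a graph signal S (Eq. 3): project S onto the eigenvectors
    of the unnormalized Laplacian L = D - A (Eqs. 1 and 2)."""
    D = np.diag(A.sum(axis=1))       # degree matrix, Eq. (2)
    L = D - A                        # unnormalized Laplacian, Eq. (1)
    # L is symmetric positive semidefinite, so eigh returns real eigenvalues
    # in ascending order: 0 = lambda_0 <= lambda_1 <= ... <= lambda_{N-1}.
    eigvals, T = np.linalg.eigh(L)
    return T.conj().T @ S, eigvals   # X_GFT = T^H S

# Toy example: path graph on 3 vertices with signal S = (1, 2, 3).
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
X, lam = gft_coefficients(A, np.array([1., 2., 3.]))
```

Since T is orthonormal, the transform preserves the signal energy, and the coefficient at eigenvalue 0 captures the mean (smoothest) component of the signal.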

IV. EXPERIMENT, RESULTS AND DISCUSSION

A. Datasets Used
1) NTU Hand Gesture Dataset: The NTU hand gesture dataset [19] was captured with a Microsoft Kinect camera. The dataset consists of 1000 depth maps of 10 hand gestures (the decimal digits 0 to 9), performed by 10 different persons with 10 images per hand gesture. Each depth map is of size 640 × 480. Depth images are obtained after applying the depth threshold for hand segmentation as per [22]. Images are resized to a 240 × 240 pixel block size.
2) Cambridge Hand Gesture Dataset: The Cambridge hand gesture dataset [23] consists of 900 image sequences of 9 gesture classes. Each class contains 100 image sequences (5 different illuminations × 10 arbitrary motions × 2 subjects). Each sequence was recorded in front of a fixed camera with the gestures isolated in space and time. We selected the gestures performed by 3 subjects without motion, and resized each image to 240 × 240.
3) Massey Hand Gesture Dataset: The Massey hand gesture dataset [24] is an image dataset containing 1500 images of different hand postures under different lighting conditions. The gesture classes contain the gestures for the numbers 0 to 9 and the alphabets a to z. The maximum resolution of the images is 640 × 480 with 24-bit RGB color. We converted all images to gray scale and resized each image to 240 × 240.

B. Results and Discussion
The proposed feature extraction algorithm is implemented in MATLAB, and the LIBSVM library [25] is used for classification. We calculated GFT coefficients for different threshold values. For these gesture datasets, we first divide each image into 8 × 8 pixel blocks, and thresholds Γ = 10, 20, 30, 40, 50 are used to obtain the edge information of the image. The 8-nearest neighbors and the edge information of each image are used to construct an undirected graph. The edge information after applying the different thresholds is shown in Figure 3.

Fig. 3: Images with different threshold values (panels: Original Image and Γ = 10, 20, 30, 40, 50)

For the NTU, Cambridge and Massey hand gesture datasets, results are presented in Table I for threshold values Γ = 10, 20, 30, 40 and 50, using a linear SVM classifier and 10-fold cross validation. The simulation results in Table I show that for the NTU and Massey hand gesture datasets, the prediction accuracy is maximum at the threshold value Γ = 30.

TABLE I: Prediction accuracy (%) using all GFT coefficients
Threshold  | Γ = 10 | 20    | 30    | 40    | 50
NTU        | 64.63  | 69.87 | 71.58 | 66.29 | 65.64
Cambridge  | 100    | 100   | 100   | 100   | 100
Massey     | 98.62  | 99.31 | 99.71 | 99.54 | 98.31

Note that the application of our proposed algorithm is not limited to the gesture classification problem. It can also be applied to many other areas of pattern recognition, such as face recognition, fingerprint recognition, human detection, web document classification and object classification, wherever high dimensional data needs to be analyzed. We tested the proposed algorithm on the INRIA [17] dataset, which is divided into pos and neg folders: the pos folder contains images of humans and the neg folder contains non-human images, giving a binary classification problem between human and non-human images. A prediction accuracy of 72% was achieved using a linear SVM classifier.

V. PERFORMANCE COMPARISON

In the field of image processing, the K-LT and DCT were initially used for image compression and reconstruction [26], [27], but they can also be employed for image recognition [28], [29]. The K-LT is known to be the optimal transform in terms of compactness of representation, mainly because it is data dependent. In this work, we compare the results of the GFT based descriptor, which is also data dependent, with the K-LT and DCT based descriptors. To calculate the feature vector using the K-LT and DCT, we convert each RGB image into a grayscale image. To compare the performance with the GFT, block based K-LT and DCT have been used: each image is divided into 8 × 8 sub-blocks, and each 8 × 8 sub-block is stacked column-wise into a vector of length 64 × 1 as given in equation 8.

A. Performance Comparison with the Karhunen-Loeve (K-L) Transform

The K-LT coefficients for each block can be calculated as the inner product of the signal S with W, i.e.,

X_KLT = W^T · S    (9)

where W is the orthogonal eigenvector matrix of the covariance matrix computed over blocks of size 8 × 8. The resulting K-LT based feature vector is formed by concatenating the K-LT coefficients of each block.


B. Performance Comparison with the Discrete Cosine Transform (DCT)

To use the DCT coefficients as a feature vector, a DCT matrix D of size 64 × 64 is generated using the MATLAB dctmtx() function. The DCT coefficients can be calculated as the inner product of S with the DCT matrix D, i.e.,

X_DCT = D · S    (10)

The resulting DCT based feature vector is formed by concatenating the DCT coefficients of each block.

Fig. 4: Comparison of prediction accuracy on the NTU, Cambridge and Massey datasets when using all GFT, DCT and K-LT coefficients for each block.

VI. CONCLUSION AND FUTURE DIRECTION

This paper presented a novel method for hand gesture classification based on the Graph Fourier Transform (GFT). For efficient calculation of the transform coefficients, the image is first divided into blocks of subimages. We proposed a method for generating an undirected graph using an appropriate threshold to preserve local neighborhood information; the graph Laplacian is used to define the connectivity of the underlying graph. The inner product of the eigenvector matrix of the unnormalized graph Laplacian with the image block gives the Graph Fourier Transform. To determine an appropriate threshold for the construction of the undirected graph, different threshold values were tried, and the GFT coefficients based on these threshold values were used to determine the prediction accuracy; a threshold of 30 gives the maximum prediction accuracy under 10-fold cross validation. Though the features generated by the K-LT are optimal for image reconstruction, they also perform well for gesture classification; similarly, DCT coefficients can also be used for classification. For the Cambridge and Massey datasets, the performance of the GFT and DCT is the same. For the Massey and NTU datasets, which have a larger number of classes, the GFT performs better than the K-L transform. The main drawback of the proposed method is its computational complexity and the size of the feature vector: practical software requires O(n³) time to compute all the eigenvalues and eigenvectors of an n × n symmetric matrix, and in the proposed method the size of the feature vector is the number of blocks in an image × 64. Hence, a future direction is to reduce the computational complexity and the size of the feature vector.

REFERENCES

[1] D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst, "The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains," IEEE Signal Processing Magazine, vol. 30, no. 3, pp. 83-98, 2013.
[2] A. Sandryhaila and J. M. F. Moura, "Discrete signal processing on graphs: Graph Fourier transform," in Proc. ICASSP, 2013, pp. 6167-6170.
[3] W. Hu, G. Cheung, A. Ortega, and O. C. Au, "Multiresolution graph Fourier transform for compression of piecewise smooth images," IEEE Transactions on Image Processing, vol. 24, no. 1, pp. 419-433, Jan. 2015.
[4] Q. Ge, X. Cheng, W. Shao, Y. Dong, W. Zhuang, and L. Haibo, "Graph based sparse representation for image denoising," Procedia Manufacturing, vol. 3, pp. 2049-2056, 2015.
[5] Z. Yang, A. Ortega, and S. Narayanan, "Gesture dynamics modeling for attitude analysis using graph based transform," in IEEE International Conference on Image Processing, 2014, pp. 1515-1519.
[6] H. Cheng, L. Yang, and Z. Liu, "A survey on 3D hand gesture recognition," IEEE Transactions on Circuits and Systems for Video Technology, vol. PP, no. 99, pp. 1-14, July 2015.
[7] F. Zhang and E. R. Hancock, "Graph spectral image smoothing using the heat kernel," Pattern Recognition, vol. 41, pp. 3328-3342, Nov. 2008.
[8] B. Zitova, J. Flusser, and F. Sroubek, "Image registration: A survey and recent advances," in Proc. IEEE International Conference on Image Processing (ICIP 2005), pp. 1-55, Los Alamitos.
[9] J. P. Wachs, M. Kolsch, H. Stern, and Y. Edan, "Vision-based hand-gesture applications," Commun. ACM, vol. 54, pp. 60-71, 2011.
[10] A. Erol, G. Bebis, M. Nicolescu, R. D. Boyle, and X. Twombly, "Vision-based hand pose estimation: A review," Comput. Vision Image Understand., vol. 108, pp. 52-73, 2007.
[11] S. Mitra and T. Acharya, "Gesture recognition: A survey," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 37, pp. 311-324, 2007.
[12] G. R. S. Murthy and R. S. Jadon, "A review of vision based hand gesture recognition," Int. Journal Inf. Technol. Knowl. Manage., vol. 2, pp. 405-410, 2009.
[13] S. Belongie, J. Malik, and J. Puzicha, "Shape matching and object recognition using shape contexts," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, pp. 509-522, 2002.
[14] H. Ling and D. W. Jacobs, "Shape classification using the inner-distance," IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, pp. 286-299, 2007.
[15] X. Bai and L. J. Latecki, "Path similarity skeleton graph matching," IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, pp. 1-11, 2008.
[16] S. Ding, H. Zhu, W. Jia, and C. Su, "A survey on feature extraction for pattern recognition," Artificial Intelligence Review, vol. 37, no. 3, pp. 169-180.
[17] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 886-893, 2005.
[18] C. Liu, "Gabor-based kernel PCA with fractional power polynomial models for face recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 5, pp. 572-581, 2004.
[19] Z. Ren, J. Yuan, J. Meng, and Z. Zhang, "Robust part-based hand gesture recognition using Kinect sensor," IEEE Transactions on Multimedia, vol. 15, no. 5, pp. 1110-1120, Aug. 2013.
[20] U. von Luxburg, "A tutorial on spectral clustering," Statistics and Computing, vol. 17, 2007.
[21] X. He and P. Niyogi, "Locality preserving projections," Proc. Conf. Advances in Neural Information Processing Systems, 2003.
[22] J. Shukla and A. Dwivedi, "A method for hand gesture recognition," Conference on Communication Systems and Network Technologies, pp. 919-923, April 2014.
[23] T.-K. Kim and R. Cipolla, "Canonical correlation analysis of video volume tensors for action categorization and detection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 8, pp. 1415-1428, 2009.
[24] A. L. C. Barczak, N. H. Reyes, M. Abastillas, A. Piccio, and T. Susnjak, "A new 2D static hand gesture colour image dataset for ASL gestures," Research Lett. Information and Mathematical Sciences, vol. 15, pp. 12-20, 2011.
[25] C.-C. Chang and C.-J. Lin, "LIBSVM: a library for support vector machines," ACM Transactions on Intelligent Systems and Technology, vol. 2, no. 27, pp. 1-27, 2011.
[26] N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete cosine transform," IEEE Transactions on Computers, vol. C-23, no. 1, pp. 90-93, Jan. 1974.


[27] M. Kirby and L. Sirovich, "Application of the Karhunen-Loeve procedure for the characterization of human faces," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 12, no. 1, pp. 103-108, Jan. 1990.
[28] M. Turk and A. Pentland, "Eigenfaces for recognition," J. Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.
[29] M. Uenohara and T. Kanade, "Use of Fourier and Karhunen-Loeve decomposition for fast pattern matching with a large set of templates," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 8, pp. 891-898, Aug. 1997.
