Sie sind auf Seite 1von 6

International Journal on Recent and Innovation Trends in Computing and Communication

Volume: 3 Issue: 5

ISSN: 2321-8169
3337 - 3342


An Effective Technique for Removal of Facial Dupilcation by SBFA

Miss. Deepika B. Patil

Dr. Ayesha Butalia

Computer Department,
Pune, India

Computer Department,
Pune, India

AbstractSearch based face annotation (SBFA) is an effective technique to annotate the weakly labeled facial images that are freely available
on World Wide Web. The main objective of search based face annotation is to assign correct name labels to given query facial image. One
difficult drawback for search based face annotation theme is how to effectively perform annotation by exploiting the list of most similar facial
pictures and their weak labels that square measure typically droning and incomplete. To tackle this drawback, we tend to propose a good
unattended label refinement (URL) approach for purification the labels of web facial pictures exploitation machine learning technique. We tend
to formulate the educational drawback as a gibbose improvement and develop effective improvement algorithms to resolve the large scale
learning task expeditiously. To additional speed up the projected theme, we also proposed clustering based approximation algorithmic program
which may improve quantify ability significantly. We have conducted an in depth set of empirical studies on a large scale net facial image test
bed, within which encouraging results showed that the projected URL algorithms will considerably boost the performance of the promising
SBFA theme. In future work we will use HAAR algorithm. HAAR is feature based method for face detection. HAAR features, integral images,
recognized detection of features improve face detection in terms of speed and accuracy.

KeywordsFace annotation; content-based image retrieval; machine learning; label refinement; web facial images; weak label



Due to the recognition of varied digital cameras and

also the rapid growth of social media tools for internet-based
photo sharing [1], recent years have witnessed associate
explosion of the amount of digital photos captured and hold on
by consumers. An outsized portion of photos shared by users
on the Internet area unit human facial pictures. A number of
these facial images area unit labeled with names,however
several of them don't seem to be tagged properly. This has
actuated the study of automobile face annotation, a vital
technique that aims to annotate facial pictures mechanically.
Auto face annotation may be useful to several real world sites
(e.g., Facebook) will automatically annotate users
Uploaded photos to facilitate online icon search and
management. Besides, face annotation may also be applied in
news video domain to sight important persons appeared within
the videos to facilitate news video retrieval and account tasks
[2], [3]. Classical face annotation approaches area unit usually
treated as an extended face recognition downside, wherever
completely different classification models area unit trained
from a group of well labeled facial pictures by
Using the supervised or semi-supervised machine learning
techniques [4], [5], [6], [7]. However, the model-based face
annotation techniques area unit restricted in many aspects.
First, it's typically time-consuming and high-priced to gather
an outsized quantity of human-labeled coaching facial
pictures. Second, it's typically difficult to generalize the
models once new coaching knowledge or new persons area
unit intercalary, within which associate intensive training

process is typically needed. Last however not least, the

annotation/recognition performance usually scales poorly
when the amount of persons/classes is extremely massive.
Recently, some rising studies have tried to explore a promising
search-based annotation paradigm for facial image annotation
by mining the globe Wide net (WWW), wherever a vast
variety of decrepit labeled facial images area unit freely
accessible. Rather than coaching express classification models
by the regular model-based face annotation approaches, the
search-based face annotation (SBFA) paradigm aims to tackle
the machine-controlled face annotation task by exploiting
content-based image retrieval (CBIR) techniques [8], [9] in
mining huge decrepit labeled facial pictures on the net. The
SBFA framework is data-driven and model-free, that to some
extent is inspired by the search-based image annotation
techniques [10], [11], [12] for generic image annotations. The
main objective of SBFA is to assign correct name labels to a
given question facial image.Particularly, given a unique facial
image for annotation, we tend to initial retrieve a brief list of
prime K most similar facial pictures from a decrepit labeled
facial image information, then annotate the facial image by
performing ballot on the labels related to the topk similar
facial pictures. One challenge Janus-faced by such SBFA
paradigm is a way to effectively exploit the listing of candidate
facial pictures and their weak labels for the face name
annotation task. To tackle the higher than downside, we tend
to investigate and develop a search -based face annotation
theme. Particularly, we propose a unique unattended label
refinement (URL) scheme by exploring machine learning
techniques to enhance the labels strictly from the decrepit

IJRITCC | May 2015, Available @


International Journal on Recent and Innovation Trends in Computing and Communication

Volume: 3 Issue: 5

ISSN: 2321-8169
3337 - 3342

labeled knowledge without human manual efforts. we tend to
additionally propose a clustering-based approximation (CBA)
rule to boost the efficiency and measurability. As a outline, the
most contributions of this paper embrace the following: we
tend to investigate and implement a promising search based
face annotation theme by mining massive amount of feeble
labelled facial pictures freely available on the computer
network. we tend to propose a completely unique ULR theme
for enhancing label quality via a graph-based and low-rank
learning approach. We tend to propose Associate in Nursing
economical clustering-based approximation formula for largescale label refinement problem. we tend to conducted an in
depth set of experiments, in which encouraging results were
This section elaborates literature survey on the
substantial achievement in mining weakly labelled facial
S.C.H. Hoi et al. gives emergence of user-centric
multimedia system applications on social networks has been
instrumental within the creation of a brand new type of
social media created victimisation extremely accessible
and scalable business technologies for sharing via the net. This
timely text/reference presents the newest advances in varied
aspects of social media modeling and social media computing
analysis. Gathering along very good analysis from a variety of
established international conferences and workshops, the
editors coherently organize and gift every of the topics in
respect to the fundamental principles and practices of social
media modeling and computing [1].
S. Satoh et al. discussed that car face annotation is
enjoying vital role in several real-world data management
systems and multimedia system data. Car face annotation is
useful to several globe applications. Face annotation
associated with face detection and recognition. Recently
analysis interests in mining weakly-labeled facial pictures on
the net to resolve analysis challenge in laptop vision and
image understanding. This paper provides varied techniques or
strategies that square measure accustomed expanding upon
facial pictures [2].
P.T. Pham et al. shown the labeling persons in video
frames with names detected in an exceedingly corresponding
video transcript helps up video content annotation and search
tasks. In this paper there is a tendency to implement a face
naming methodology that learns from tagged and unlabelled
examples victimisation repetitious label propagation in an
exceedingly graph of connected faces or name-face pairs. By
incorporating the unlabelled information points throughout the
educational method, this methodology will work with few
tagged information points. On BBC News videos, the label
propagation algorithmic rule yields higher results than a

Support Vector Machine classifier and a nearest neighbor

classifier trained on a similar tagged information [3].
L. Zhang et al. conferred a completely unique
methodology of connectedness feedback. Support Vector
Machine learning within the content-based image retrieval
system. A SVM classifier is learned from coaching
information of connectedness pictures and irrelevancy pictures
marked by users. Victimization the classifier, the system will
retrieve additional pictures relevant to the query within the
info with efficiency. Experiments were administrated on largesize info of 9918 pictures [4].
T.L. Berg et al. presented the car face annotation, that
aims to notice human faces from a facial image and assign
them correct human names, may be a elementary analysis
drawback and useful to several real-world applications. In this
work, authors have a tendency to address this drawback by
investigation a retrieval-based annotation theme of mining
huge internet facial pictures that square measure freely
accessible over the net. Above all, given a facial image, they
have a tendency to initial retrieve the highest n similar
instances from a large-scale internet facial image info
victimisation content-based image retrieval techniques, then
use their labels for car annotation. They propose a good Weak
Label regularised native Coordinate writing (WLRLCC)
technique, that exploits the principle of native coordinate
writing by learning distributed options [5].
J. Yang et al. proposed a learning-based approach to
annotate shots of reports video with locations extracted from
video transcript, supported options from multiple video
modalities as well as grammar structure of transcript
sentences, speaker identity, temporal video structure, and so
on. Machine learning algorithms square measure adopted to
mix multi-modal options to resolve two sub-problems: (1)
whether or not the situation of a video shot is mentioned
within the transcript, and if so, (2) among several locations
within the transcript, that square measure correct one(s) for
this shot. Experiments on TRECVID dataset demonstrate that
their approach achieves more or less eighty fifth accuracy in
properly labeling the situation of any shot in news video [6].
J. Zhu et al. represented the on-line rating systems
square measure currently present owing to the success of
recommender systems. In such systems, users square measure
allowed to rate the things in an exceedingly predefined vary of
values. The ratings collected is accustomed infer users
preferences in addition as items intrinsic options, that square
recommendation. Most previous work focuses on up the
prediction accuracy or ranking capability. Very little attention
has been paid to the matter of spammers or low-reputed users
in such systems [7].

IJRITCC | May 2015, Available @


International Journal on Recent and Innovation Trends in Computing and Communication

Volume: 3 Issue: 5

ISSN: 2321-8169
3337 - 3342

A. Search Based Face Annotation
The system flow of the proposed framework of
search-based face annotation is as given in the following steps
a) Facial image data collection
b) Face detection and facial feature extraction
c) High-dimensional facial feature indexing
d) Learning to refine weakly labeled data
e) Similar face retrieval and
f) Face annotation by majority voting on the similar faces
with the refined labels.
The first four steps are usually conducted before the test
phase of a face annotation task, while the last two steps are
conducted during the test phase of a face annotation task.
Which usually should be done very efficiently and properly in
face annotation. All the above are the processes before
annotating a queryfacial image. . In particular, given a query
facial image for annotation, we first conduct a similar face
retrieval process to search for a subset of most similar faces
typically top K similar face examples from the previously
indexed facial database.
B. Mathematical Model
A] Collect Weakly Labeled Images
Q= {q1,q2,q3}
Where Q is input query images like q1,q2,q3
B] List of human names for Annotation
= {n1, n2,n3}
Where N is Human names list for Annotation
C] Dataset
D] No. of iteration to find optimized image
t= {t1, t2,t3..}
Whered is no. of iteration
E] Total no of Human names
m= {m1,m2,m3}
Where m is total no of human names
F] Extracted Facial Image feature
F] Annoted Image
X*=Annoted Image
G] Feature Dimensions
d= {d1,d2,d3.}
Whered is no. of feature dimensions
C. Algorithms
The optimization tasks belong to exactly quadratic
programming (QP) problems. It seems to be possible to solve
them directly by applying generic QP solvers. However, this

would be computationally highly intensive since matrix F can

be potentially very large, for example, for a large 400-person
database of totally 40,000 facial images, F is a 40,000* 400
matrix that consists of 16 million variables, which is almost
infeasible to be solved by any existing generic QP solver. So
we first adopt multistep gradient algorithm to solve the
1) Multistep Gradient Algorithm for ULR

0 0; k 1; z (0) x (0) x ( 1) 0;

Case SRF: Achieve

x (k) with equation;

x(k 1) argmin pt (x, z (k) ) s.t. x 0

Case CCF: Achieve

min t x v

x 0

x (k) with equation;

m 1

s.t. xk .n i , i 1..., n
k 0

1 4 2 k 1 1

z (k) x (k)

k 1 1 (k) (k 1)
( x x );

2) Coordinate Descent Algorithm for URL
To further improve the scalability, we propose a
coordinate descent approach to solving the optimization
iteratively. This can take advantages of the power of parallel
computation when solving a very large-scale problem. For the
proposed coordinate descent approach, at each iteration, we
optimize only one label vector Fi* by leaving the other vectors
Input: X

Rn d , Y [0,1]n m

Output: F

Rn m

t 0 and F (t) Y ;
For i=1 to n do
Case SRF: achieve Fi*

(t 1)

with equation

Fi*(t 1) = argmin (f/ F(t) ,i) s.t. f 0

Case CCF: Achieve Fi*

(t 1)

with equation

Fi*(t 1) = argmin (f/ F(t) , I) s.t. f , f 0

t t 1;

IJRITCC | May 2015, Available @


International Journal on Recent and Innovation Trends in Computing and Communication

Volume: 3 Issue: 5

ISSN: 2321-8169
3337 - 3342

D. Proposed System
The proposed system design is as shown in figure 1.
In proposed work we will use HAAR algorithm, HAAR is
feature based method for face detection.
It recognize a given face image according to the
images in database. The database of a face recognizer is
formed using a training set of images. Training set is set of the
features extracted from face images of different persons. The
face recognition system finds the most similar feature vector
among the training set matching to the feature vector of a
given test image. Here, we want to recognize the identity of a
person where an image of that person (test image) is given to
the system. Principal Component Analysis (PCA) is working
as follows shown in figure 2.

compute the similarities between feature set of test image and

all of the feature vectors in the training set. The similarity
between feature vectors can be computed using Euclidean
distance or any other comparative method.
The identity of the most similar feature set will be the
output of our face recognizer. Schematic diagram of the face
recognition system that will be implemented is shown in
Figure 2.
Objective of SBFA is to assign correct name labels to
a given query facial image. In particular, given a novel facial
image for annotation, we first retrieve a short list of top K
most similar facial images from a weakly labelled facial image
database, and then annotate the facial image by performing
voting on the labels associated with the top K similar facial
images. The results obtained from the iterations are as shown
in the below figures.
1) Performance Graph

Figure 1. System architecture Using HAAR Technique

Figure 2. Face Recognition Using PCA

The figure 3 shows the relationship between the

annotation performance of varied values of K and T,
respectively, where K represents the retrieved images and T
represents annotated names. Some observations can be drawn.
First of all, when fixing K, we found that increasing T value
generally leads to better hit rate results. Second, when fixing
T, we found that the impact of the K value to the annotation
performance fairly depends on the specific T value. In
particular, when T is small (e.g., T = 1), increasing the K value
leads to the decline of the annotation performance; but when T
is large (e.g., T > 5), increasing the K value often boosts the
performance of top T annotation results.

Figure 3. Annotation Performance Graph

In the training phase, you should extract feature

vectors for each image for each training image, you should
calculate and store these feature vectors. In the recognition
phase (or, testing phase), you will be given a test image of a
known person. In order to identify the person, you should
IJRITCC | May 2015, Available @


International Journal on Recent and Innovation Trends in Computing and Communication

Volume: 3 Issue: 5

ISSN: 2321-8169
3337 - 3342


P.T. Pham, T. Tuytelaars, and M.-F.Moens, Naming People in

News Videos with Label Propagation, IEEE Multimedia, vol.
18, no. 3, pp. 44-55, Mar. 2011.


L. Zhang, L. Chen, M. Li, and H. Zhang, Automated

Annotation of Human Faces in Family Albums, Proc. 11th
ACM Intl Conf. Multimedia (Multimedia), 2003.


T.L. Berg, A.C. Berg, J. Edwards, M. Maire, R. White, Y.W.

Teh, E.G. Learned-Miller, and D.A. Forsyth, Names and Faces
in the News, Proc. IEEE CS Conf. Computer Vision and
Pattern Recognition (CVPR), pp. 848-854, 2004.


J. Yang and A.G. Hauptmann, Naming Every Individual in

News Video Monologues, Proc. 12th Ann. ACM Intl Conf.
Multimedia (Multimedia), pp. 580-587. 2004.


J. Zhu, S.C.H. Hoi, and M.R. Lyu, Face Annotation Using

Transductive Kernel Fisher Discriminant, IEEE Trans.
Multimedia, vol. 10, no. 1, pp. 86-96, Jan. 2008.


A.W.M. Smeulders, M. Worring, S. Santini, A. Gupta, and R.

Jain, Content-Based Image Retrieval at the End of the Early
Years, IEEE Trans. Pattern Analysis and Machine Intelligence,
vol. 22, no. 12, pp. 1349-1380, Dec. 2000.


S.C.H. Hoi, R. Jin, J. Zhu, and M.R. Lyu, Semi-Supervised

SVM Batch Mode Active Learning with Applications to Image
Retrieval, ACM Trans. Information Systems, vol. 27, pp. 1-29,

Figure 4. Accuracy Graph

2) Accuracy graph
The figure 4 shows the comparison of accuracy
between url technique and HAAR technique. The URL
algorithm enhances the lable quality but does not eliminate the
false positive rate as compared to HAAR algorithm. As
HAAR eliminates the false positive rate, thus it increases the
accuracy. Also it uses the large database as it provides
accurate results as compared to URL.
This paper investigated a promising search-based face
annotation framework, throughout that we tend to tend to
targeted on dealing with the crucial disadvantage of enhancing
the label quality and projected a ULR rule. To further improve
the measurability, we tend to tend to collectively planned a
clustering-based approximation resolution that successfully
accelerated the improvement task whereas not introducing
superabundant performance degradation. From associate
degree exhaustive set of experiments, we tend to found that the
planned technique achieved promising results below a variety
of settings. Our experimental results collectively indicated that
the planned URL technique considerably surpassed the
alternative regular approaches in literature. In future work we
will focus on more sophisticated voting scheme and evaluation
on a larger retrieval database, explore different learning
techniques. Also we will use HAAR algorithm which is
feature based method for face detection. HAAR features,
integral images, regionalized detection of features improve
face detection in terms of speed and accuracy.

[10] X.-J. Wang, L. Zhang, F. Jing, and W.-Y. Ma, AnnoSearch:

Image Auto-Annotation by Search, Proc. IEEE CS Conf.

Computer Vision and Pattern Recognition (CVPR), pp. 14831490, 2006.
[11] L. Wu, S.C.H. Hoi, R. Jin, J. Zhu, and N. Yu, Distance Metric

Learning from Uncertain Side Information for Automated Photo

Tagging, ACM Trans. Intelligent Systems and Technology, vol.
2, no. 2, p. 13, 2011.
[12] P. Wu, S.C.H. Hoi, P. Zhao, and Y.He, Mining Social Images

with Distance Metric Learning for Automated Image Tagging,

Proc. Fourth ACM Intl Conf.
[13] D. Wang, S.C.H. Hoi, and Y. He, Mining Weakly Labeled Web

Facial Images for Search-Based Face Annotation, Proc. 34th

Intl ACM SIGIR Conf. Research and Development in
Information Retrieval (SIGIR), 2011.
[14] P. Belhumeur, J. Hespanha, and D. Kriegman, Eigenfaces

versus Fisherfaces: Recognition Using Class Specific Linear

Projection, IEEE Pattern Analysis and Machine Intelligence,
vol. 19, no. 7, pp. 711-720, July 1997.
[15] W. Zhao, R. Chellappa, P.J. Phillips, and A. Rosenfeld, Face

Recognition: A Literature Survey, ACM Computing Survey,

vol. 35, pp. 399-458, 2003.
[16] G.B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller,


S.C.H. Hoi, J. Luo, S. Boll, D. Xu, and R. Jin, Social Media

Modeling and Computing, eds. Springer, 2011.


S. Satoh, Y. Nakamura, and T. Kanade, Name-It: Naming and

Detecting Faces in News Videos, IEEE MultiMedia, vol. 6, no.
1, pp. 22-35, Jan.-Mar. 1999.

Labeled Faces in the Wild: A Database for Studying Face

Recognition in Unconstrained Environments, technical report
07-49, 2007.
[17] H.V. Nguyen and L. Bai, Cosine Similarity Metric Learning for

Face Verification, Proc. 10th Asian Conf. Computer Vision

(ACCV 10), 2008.
[18] M. Guillaumin, J. Verbeek, and C. Schmid, Is that You? Metric

Learning Approaches for Face Identification, Proc. IEEE 12th

Intl Conf. Computer Vision (ICCV), 2009.
IJRITCC | May 2015, Available @


International Journal on Recent and Innovation Trends in Computing and Communication

Volume: 3 Issue: 5

ISSN: 2321-8169
3337 - 3342

[19] Z. Cao, Q. Yin, X. Tang, and J. Sun, Face Recognition with

[25] Y. Yang, Y. Yang, Z. Huang, H.T. Shen, and F. Nie, Tag

Learning-Based Descriptor, IEEEConf. Computer Vision and

Pattern Recognition (CVPR), pp. 2707-2714, 2010.

Localization with Spatial Correlations and Joint Group

Sparsity, Proc. IEEE Conf. Computer Vision and Pattern
Recognition (CVPR), pp. 881-888, 2011.

[20] E. Hjelmas and B.K. Low, Face Detection: A Survey,

Computer Vision and Image Understanding, vol. 83, no. 3, pp.

236-274, 2001.
[21] R. Jafri and H.R. Arabnia, A Survey of Face Recognition

Techniques, J. Information Processing Systems, vol. 5, pp. 4168, 2009.

[22] K. Delac and M. Grgic, Face Recognition, IN-TECH, 2007.
[23] M.G. KresimirDelac and M.S. Bartlett, Recent Advances in Face

Recognition. I-Tech Education and Publishing, 2008.

[26] J. Fan, Y. Gao, and H. Luo, Hierarchical Classification for

Automatic Image Annotation, Proc. 30th Ann. Intl ACM SIGIR

Conf. Research and Development in Information Retrieval
(SIGIR), pp. 111-118, 2007.
[27] Z. Lin, G. Ding, and J. Wang, Image Annotation Based on

Recommendation Model, Proc. 34th Intl ACM SIGIR Conf.

Research and Development in Information Retrieval (SIGIR),pp.
1097-1098, 2011.

[24] A. Hanbury, A Survey of Methods for Image Annotation, J.

Visual Languages and Computing, vol. 19, pp. 617-627, Oct.


IJRITCC | May 2015, Available @