
This article has been accepted for publication in a future issue of this journal, but has not been

fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCSVT.2016.2598704, IEEE
Transactions on Circuits and Systems for Video Technology

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. XX, NO. XX, DECEMBER 2015 1

Automatic Generation of Social Event Storyboard from Image Click-through Data
Jun Xu, Tao Mei, Senior Member, IEEE, Rui Cai, Member, IEEE, Houqiang Li, Senior Member, IEEE, and Yong Rui, Fellow, IEEE

Abstract: Recent studies have shown that a noticeable percentage of web search traffic is about social events. While traditional websites can only show human-edited events, in this paper we present a novel system to automatically detect events from search log data and generate storyboards where the events are arranged chronologically. We chose image search logs as the resource for event mining, as search logs directly reflect people's interests. To discover events from log data, we present a Smooth Nonnegative Matrix Factorization (SNMF) framework which combines the information of query semantics, temporal correlations, search logs and time continuity. Moreover, we consider the time factor an important element, since different events develop with different temporal tendencies. In addition, to provide a media-rich and visually appealing storyboard, each event is associated with a set of representative photos arranged along a timeline. These relevant photos are automatically selected from image search results by analyzing image content features. We use celebrities as our test domain, which accounts for a large percentage of image search traffic. Experiments on web search traffic for 200 celebrities over a period of six months show very encouraging results compared with handcrafted editorial storyboards.

Index Terms: Event storyboard, social media, click-through data, non-negative matrix factorization, image search.

Fig. 1. Screen shot of www.people.com, a website for celebrity news. The marked region shows recent news of Britney Spears, arranged along a timeline.
I. INTRODUCTION

As social creatures, people are by nature curious about others' activities. Information on famous persons has often been of particular interest. This tendency has remained true in the internet era [35]. Since common search engines as well as news websites often experience massive search demands about a myriad of current affairs, a great amount of news and events is collected from the web. However, most social events originate from professional editors. In this case, it is quite meaningful to detect such events for users automatically rather than through manual effort.

Current search engines often show the summaries of famous persons as a simple profile. From such a summarization, people can easily get a celebrity's basic information like portrait, nationality, birthday, representative works, and awards. The search engine summaries can be considered a concentrated version of a person's larger relevant event collection. Although such a short profile is very helpful for quickly introducing a person, it cannot satisfy people's curiosity for more detailed and timely information about celebrities. By contrast, some professional websites provide comprehensive and up-to-date information on famous persons. Fig. 1 shows a screen shot of www.people.com, a website well-known for celebrity news and photos. The marked region of Fig. 1 shows Britney Spears's recent news (events) arranged along a timeline. This is a very nice feature for fans to trace their idols' activities. Almost all these websites are powered by human editors, which inevitably leads to several limitations. First, the coverage of human-centered domains is small. Typically, one website only focuses on celebrities in one or two domains (mostly entertainment and sports), and to the best of our knowledge, there are no general services yet for tracing celebrities over various domains. Second, these existing services are not scalable. Even for specific domains, only a few top stars are covered (see footnote 1), as the editing effort to cover more celebrities is not financially viable. Third, reported event news may be biased by editors' interests. In this paper, we aim to build a scalable and unbiased solution to automatically detect social events, especially those related to celebrities, along a timeline. This could be an attractive supplement to enrich the existing event description in search result pages.

Corresponding author: T. Mei (tmei@microsoft.com). J. Xu (junx1992@gmail.com) is with the University of Science and Technology of China. This work was performed when J. Xu visited Microsoft Research as an intern. H. Li (lihq@ustc.edu.cn) is with the University of Science and Technology of China. T. Mei, R. Cai and Y. Rui are with Microsoft Research, Beijing, China ({tmei, ruicai, yongrui}@microsoft.com).
This work was supported in part to H. Li by NSFC under Contract 61325009, 61272316.
Copyright (c) 2009 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending an email to pubs-permissions@ieee.org.
Footnote 1: e.g., http://www.people.com/people/celebrities/

1051-8215 (c) 2016 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See
http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

In this paper, we will focus on those events happening at a certain time and favored by users as our celebrity-related social events.

Meanwhile, about 30% of search queries aim to search for real-world events, according to statistics from commercial search engine data [23]. A further 70% of these queries are related to celebrities, including artists, sports stars, politicians, scientists, entrepreneurs, etc. Thus, we will focus on events related to celebrities because of the volume of related search queries and the ability to obtain ground truth events from professional websites.

The research topics most related to this paper are event/topic detection from the Web. There have been quite a few works that examine related directions [2], [4], [9], [10], [19], [23], [25], [26], [30], [34], [38]. The most typical data sources for event/topic mining are news articles and weblogs. Various statistical methods have been proposed to group documents sharing the same stories, such as [7], [10], [19], [23]. Temporal analysis has also been used to recover the development trend of an event, as in [13], [16], [33]. However, we argue that news articles are not good enough for mining events that reflect users' interests, as most reports from mainstream media are dominated by breaking news and influential social events. Similarly, weblogs are not an ideal choice, as blog posts are mainly about individual stories covering regular people rather than interesting events for all general users. Besides news and weblog data, there have been some recent research efforts attempting to extract events from web search logs [21], [40]. According to our study, search log data is a good data resource for detecting those events that gain user attention instantly, because 1) search logs may cover a wide variety of real-world events; 2) search logs directly reflect users' interests, as they are in essence a majority voting over billions of internet users; and 3) search logs respond promptly to events happening in real time.

Discovering events from a search log is not a trivial task. Existing work on log event mining [21], [40] mostly focuses on merging similar queries into groups, and investigating whether these groups are related to semantic events like "Japan Earthquake" [40] or "American Idol" [21]. Basically, their goals are to distinguish salient topics from noisy queries. Directly applying their approaches will fail, as the discovered topics are more likely related to vast and common topics, which may be familiar to most users. Here, we would like to detect those more interesting social events to entertain users and fit their browsing taste, which could be supplementary to some current knowledge bases. Taking singer Adele as an example, major groups of queries on Adele are her more popular songs like "Rolling in the Deep" or "Someone Like You". Event news like her pregnancy is considered noisy data in the clustering, as the related queries are not prominent enough in comparison to the popular ones about her songs. Therefore, we need an elaborate way to balance the discovery process: on the one hand, we should distinguish informative queries from noisy ones; on the other hand, we should prevent social events from being overwhelmed by popular queries. In addition, we need to fully consider the time factor when discovering social events, since they will often have a burst in the time dimension. Events can be more easily recognized if we take time information into consideration.

Therefore, to achieve this goal, a novel approach is proposed in this paper using Smooth Nonnegative Matrix Factorization (SNMF) for event detection, by fully leveraging information from query semantics, temporal correlations, and search log records. We use the SNMF method rather than the normal NMF method or other MF methods to guarantee that the weights for each topic are non-negative and, at the same time, to consider the time factor for event development. The basic idea is two-fold: 1) promote event queries by strengthening their connections based on all available features; 2) differentiate events from popular queries according to their temporal characteristics.

To provide a comprehensive and vivid storyboard, in this paper, we also introduce an automatic way to attach a set of relevant photos to each piece of event news. In [37], a method for photo selection from image search logs is presented. Actually, directly triggering an image search engine with event queries will not always return satisfying photos. The reason is that some dominant photos (e.g., a celebrity's portrait) have high static ranks and will disrupt the ranking list of an event image search. The idea behind our approach is to leverage the information of content duplication among images returned by event queries and common popular queries. In this way, photos that have more duplicates returned for queries of the same event, while at the same time not appearing in search results of those popular queries, will be selected to describe that event. Here, we provide an example of our results. Fig. 2 shows the storyboard of the singer Adele from July 2012 to December 2012. Three events with automatically selected images are discovered about her.

Fig. 2. Example of Adele storyboard, from July 2012 to December 2012. The first event is the expectation that she would give birth soon. The second event is the release of her new album "Skyfall". The third event is about her weight.

The preliminary evaluations on 200 celebrities over a period of six months have shown strong promise. In a user study, auto-selected event photos had higher relevance scores, compared with the top search results returned by Google and Bing.

In summary, we make the following contributions:
• We propose a novel framework to detect interesting events by mining users' search log data. The framework consists of two components, i.e., Smooth Non-Negative Matrix Factorization event detection and representative event-related photo selection.
• We have conducted comprehensive evaluations on large-scale real-world click-through data to validate the effectiveness.

The rest of the paper is organized as follows. Section II discusses related works. Section III introduces our detailed approach. In Section IV we present the data statistics. Finally, we discuss a comprehensive set of experimental results in Section V, and Section VI concludes this paper.

II. RELATED WORK

The representative work for event/topic detection is the DARPA-sponsored research program called TDT (topic detection and tracking) [2], [36], [38], which focuses on discovering events from streams of news documents. With the development of Web 2.0, weblogs have become another data source for event detection [23], [25], [30]. Some of these research efforts develop new statistical methods [7], [10], [19], [23], and some others focus on recovering the temporal structure of events [1], [13], [16], [18], [33]. Some research efforts have investigated the problem of merging multiple document streams for event detection [10], [34]. As we argued before, web documents (both news articles and blog posts) are not suitable for social event detection. Filtering celebrity-related information from massive web documents is expensive, and the coverage of social events is also weak.

Web search log is another data source which has attracted the interest of many researchers. Search log data contains useful information like user queries and clicked search result URLs. It has been successfully exploited in various areas like relevance ranking [15], [24], query expansion [5], [11] and query alternation [28]. Besides, search log data is an unbiased statistic showing user intention. It is therefore a good resource for event detection, especially for those events attracting the interest of internet users. Zhao et al. [40] and Liu et al. [21] have done a lot of work in this area. In [40], a bipartite graph is constructed based on query and click-URL pairs, and two similarity measurements are proposed for event clustering. In [21], Random Walk and Markov Random Fields (MRF) are utilized for modeling search log data. These methods have been proven effective in detecting significant events like "Japan Earthquake" [40] or "American Idol" [21]. In contrast with these papers, which target popular events, this work is more interested in identifying social events of a celebrity from his/her salient topics (e.g., a singer's popular songs). This is because salient topics help identify who a celebrity is [29], while the social events tell you what a celebrity has been up to recently. In addition, we also work on providing a rich description of the mined social events with relevant photos.

Generating a vivid storyboard for social events is, to some extent, similar to image selection. Traditionally, photos are selected according to their local and global features, which are used to judge photo quality and relevance, such as in [8], [20], [32]. In our work, photo selection is the final step, which helps us summarize the events from our photo collection. Our storyboard is organized along a complete timeline, which makes the task quite different from a common photo selection job.

III. APPROACH

In Section A we will introduce our framework. Next, we present how to detect social events with search log data in Section B. Finally, how to get associated images to represent relevant events is explained in Section C.

A. Framework

The framework overview of the proposed approach is shown in Fig. 3, which mainly consists of two components: (A) event detection and (B) representative event photo selection.

Fig. 3. The overview of the proposed approach, consisting of two main parts: (A) event detection by SNMF and (B) representative event photo selection.

There are three steps for event detection. First, topic factorization methods are adopted to discover groups of queries that have a high co-occurrence frequency. This addresses the issues of sparsity and random noise in the query set. As we want to detect social events, but not those in the salient topics, we have to keep a relatively large number of topics in the factorization step, and then merge topics with similar behaviors in the second step. To merge correlated topics, we consider topic distributions on both the timeline and the space of click-URLs. Lastly, a rank function is introduced to highlight topics which are very likely to be social events. Again, information on query semantics, temporal correlations and search log mappings is combined in the ranking process. After ranking, the top topics are referred to as social events. Non-top but salient topics are called profile topics.

For representative event photo selection, top queries from social events and profile topics are first sent to commercial search engines (Google or Bing) to collect two sets of image thumbnails. These two sets are considered the most relevant images to the social event and the celebrity's background, respectively. However, image search results are very noisy, and sometimes a photo has high ranking scores in both image sets. To identify the most representative photos for an event, we propose measuring the content similarity among images in these two image sets, using both global and local image features. The assumption is that event-related photos should have similar (duplicate) images in the social event image set, but should not have similar ones in the profile image set. Based on this assumption, a simple ranking function is proposed to sort photos in the social event image set. In this way, we can identify a set of relevant photos to describe each detected event. All the social events, together with their photos, construct a storyboard of that celebrity.

B. Event Detection by SNMF

The most straightforward way to discover events from search log data is to identify abnormal queries. For example, for the well-known singer Adele, the query "Adele pregnant" is somewhat abnormal in comparison to more common queries like "Adele lyrics" and "Adele mp3". To characterize how abnormal a query is, we have to resort to statistical measures like occurrence frequency and temporal density. Unfortunately, such statistics are quite unstable, as the log data is quite noisy and sparse. Therefore, it is not feasible in practice to determine an appropriate boundary to distinguish events from others. In addition, query-level statistics ignore relationships among correlated queries (e.g., "Adele pregnant" and "Adele baby"). As a result, the evidence of an event becomes obscure, as we cannot integrate the statistics of correlated queries. Experimental results reported later in this paper show the limitations of this simple solution.

To deal with noisy and sparse data, topic modeling (or topic factorization) has proven to be an effective approach, especially for text mining [3], [6], [14]. Through topic modeling, high-dimensional sparse data is projected into a low-dimensional topic space, in which the correlations among original feature dimensions are embedded. Topic modeling is also good at suppressing random noise. In this paper, we choose topic factorization as the first step to process the search log data.

For a celebrity with $N$ log records, each one is represented by a triplet $r_i = (q_i, d_i, u_i)$, $1 \le i \le N$, where $q_i \in \mathcal{Q}$, $d_i \in \mathcal{D}$, $u_i \in \mathcal{U}$. Here, $\mathcal{Q}$, $\mathcal{D}$, $\mathcal{U}$ are the collections of unique queries, days, and click-URLs in the celebrity's log data. We choose days as the unit of time, as the resolution is enough to characterize the period of a social event.

1) SNMF Topic Factorization: In classic topic modeling, the inputs are text documents consisting of words and the outputs are decompositions of these documents into topics. Here, each topic is a distribution over the word vocabulary. Analogically, we treat one day's log data as a document and each query as a word. The vocabulary consists of all the unique queries of a celebrity in his/her log records, i.e., the set $\mathcal{Q}$ defined above. The assumption is that various stories (potentially interesting events or other representative aspects) of a celebrity are considered as latent topics leading to different search queries. It should be noted that we choose a whole query as a word and do not break each query into real English words. This is because a query is more like a short phrase having specific semantic meaning compared to a single word. Breaking a query into words may introduce unexpected ambiguities to topic factorization. For example, the word "love" in the queries "love story" and "love Harry Styles" of Taylor Swift has completely different semantics: the former is about one of her famous songs and the latter is about her ex-boyfriend.

Widely used algorithms for topic factorization include probabilistic latent semantic indexing (PLSI) [14], latent Dirichlet allocation (LDA) [7], singular value decomposition (SVD) [3], non-negative matrix factorization (NMF) [17], and their variants. In this paper, we choose NMF as it has a nice advantage: data must be decomposed into a sum of additive components. In other words, both the coefficients of documents' distributions over topics and the coefficients of topics' distributions over queries must be non-negative. This makes sense, especially for event modeling, as it is hard to accept the explanation that "we observe a certain query just because some events didn't happen that day". In addition, the non-negative coefficients also improve event mining in the next subsections.

The log data is first converted into a matrix $\mathbf{D}$ of size $|\mathcal{Q}| \times |\mathcal{D}|$. Each row represents a query and each column indicates one day. Every item $D_{ij}$ is the number of times the $i$-th query was observed on the $j$-th day. NMF aims to find two non-negative matrices $\mathbf{W}$ and $\mathbf{H}$ satisfying

$$\mathbf{D} \approx \mathbf{W}\mathbf{H}. \tag{1}$$

$\mathbf{W} = [w_1, \ldots, w_K]$, in which every column $w_k$ ($1 \le k \le K$) denotes a topic, and $K$ is the pre-defined number of topics. $\mathbf{H} = [h_1, \ldots, h_{|\mathcal{D}|}]$, in which each column $h_d$ ($1 \le d \le |\mathcal{D}|$) holds the decomposition coefficients of topics for the $d$-th day. According to [17], the decomposition problem converts to minimizing the cost function

$$\arg\min_{\mathbf{W},\mathbf{H}} D^{g}_{KL}(\mathbf{D} \,\|\, \mathbf{W}\mathbf{H}) \quad \text{s.t. } \mathbf{W} \ge 0,\ \mathbf{H} \ge 0. \tag{2}$$

Here, $D^{g}_{KL}(\mathbf{A} \,\|\, \mathbf{B})$ is the generalized Kullback-Leibler divergence of two matrices $\mathbf{A}$ and $\mathbf{B}$:

$$D^{g}_{KL}(\mathbf{A} \,\|\, \mathbf{B}) = \sum_{ij} \Big( A_{ij} \ln \frac{A_{ij}}{B_{ij}} - A_{ij} + B_{ij} \Big). \tag{3}$$

Like most other topic modeling algorithms, the standard NMF ignores the order of input documents. In other words, a permutation of the order of columns in $\mathbf{D}$ would not affect the decomposition results. However, for log mining, the temporal order is a critical factor which needs to be taken seriously. That is to say, there shouldn't be a significant difference between queries (and related topics) from two adjacent days. Similar constraints also arise when decomposing time-series signals such as audio streams [12]. To embed such constraints, Smooth Non-Negative Matrix Factorization (SNMF) was proposed by introducing an extra regularization factor $S(\mathbf{H})$ to the cost function, as

$$\arg\min_{\mathbf{W},\mathbf{H}} \big\{ D^{g}_{KL}(\mathbf{D} \,\|\, \mathbf{W}\mathbf{H}) + \lambda S(\mathbf{H}) \big\}, \quad S(\mathbf{H}) = \sum_{d=2}^{|\mathcal{D}|} \| h_d - h_{d-1} \|^2 \quad \text{s.t. } \mathbf{W} \ge 0,\ \mathbf{H} \ge 0. \tag{4}$$

Here, $S(\mathbf{H})$ acts as a penalty which favors smoothness (small $l_2$ distance) between two adjacent columns in $\mathbf{H}$, and $\lambda$ is a nonnegative weight to adjust the degree of smoothing. In implementation, we first solve the standard NMF problem in (2), then use its decomposition results as the initial values for the constrained optimization in (4). For more details please refer to [12], [17]. Fig. 4 gives an example of a topic's distributions over the timeline, created by NMF and SNMF respectively. It is clear that for NMF, the curve of the topic's weight jumps dramatically. By contrast, the curve generated by SNMF varies more smoothly along the timeline.

There are two parameters in the SNMF topic factorization: the number of topics $K$ and the smoothing weight $\lambda$.
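The two-stage optimization described above (standard NMF for (2), then the smoothed problem (4) initialized from its result) can be illustrated with a minimal NumPy sketch. It is not the solver of [12], [17]: it assumes Lee-Seung style multiplicative updates for the generalized KL cost and, as a further assumption, folds the gradient of S(H) into the H update by splitting it into positive and negative parts. The function names (`factorize`, `smoothness`, `generalized_kl`), the synthetic data, and the value of the smoothing weight are hypothetical and chosen only for illustration.

```python
import numpy as np

def generalized_kl(A, B, eps=1e-12):
    """Generalized KL divergence of Eq. (3): sum(A ln(A/B) - A + B)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    return float(np.sum(A * np.log((A + eps) / (B + eps)) - A + B))

def smoothness(H):
    """S(H) of Eq. (4): squared l2 distance between adjacent columns."""
    return float(np.sum((H[:, 1:] - H[:, :-1]) ** 2))

def factorize(D, K, lam=0.0, n_iter=300, W0=None, H0=None, seed=0, eps=1e-12):
    """Heuristic multiplicative updates; lam=0 gives plain KL-NMF."""
    D = np.asarray(D, dtype=float)
    n_q, n_d = D.shape
    rng = np.random.default_rng(seed)
    W = rng.random((n_q, K)) + 0.1 if W0 is None else np.array(W0, dtype=float)
    H = rng.random((K, n_d)) + 0.1 if H0 is None else np.array(H0, dtype=float)
    # Number of neighbouring days of each column (1 at both ends, else 2).
    n_nb = np.full(n_d, 2.0)
    n_nb[0] -= 1.0
    n_nb[-1] -= 1.0
    for _ in range(n_iter):
        # W update for the KL cost.
        ratio = D / (W @ H + eps)
        W *= (ratio @ H.T) / (H.sum(axis=1) + eps)
        # Fix the scale: columns of W sum to one, H absorbs the scale.
        scale = W.sum(axis=0) + eps
        W /= scale
        H *= scale[:, None]
        # H update: KL part plus the split gradient of lam * S(H).
        ratio = D / (W @ H + eps)
        prev_day = np.zeros_like(H)
        prev_day[:, 1:] = H[:, :-1]
        next_day = np.zeros_like(H)
        next_day[:, :-1] = H[:, 1:]
        num = W.T @ ratio + 2.0 * lam * (prev_day + next_day)
        den = W.sum(axis=0)[:, None] + 2.0 * lam * n_nb * H + eps
        H *= num / den
    return W, H

# Synthetic log: 25 queries, 60 days, one flat topic and one bursty topic.
rng = np.random.default_rng(1)
days = np.arange(60)
H_true = np.vstack([np.full(60, 30.0),
                    60.0 * np.exp(-0.5 * ((days - 30) / 4.0) ** 2)])
W_true = rng.random((25, 2))
W_true /= W_true.sum(axis=0)
D = rng.poisson(W_true @ H_true).astype(float)

# Stage 1: plain NMF. Stage 2: smoothed problem started from the NMF result.
W_nmf, H_nmf = factorize(D, K=2, lam=0.0)
W_snmf, H_snmf = factorize(D, K=2, lam=0.05, W0=W_nmf, H0=H_nmf)
print("S(H) after NMF :", smoothness(H_nmf))
print("S(H) after SNMF:", smoothness(H_snmf))
```

Because the columns of W are rescaled to sum to one after every update, the scale of H is fixed and cannot be used to shrink the penalty artificially. A suitable smoothing weight still depends on the magnitude of the counts in D, so the value used here should not be read as a recommendation.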


Fig. 4. Comparison of the topic distribution "Adele's Music" generated by NMF and SNMF (with $\lambda = 10$). The horizontal axis is time (in days) and the vertical axis is the topic's weight in each day's log data.

Fig. 5. Comparison of the distributions along the timeline for two topics generated by SNMF: (a) "Adele's pregnancy" and (b) "Adele's Music".

We will investigate their influence on performance in the experiments section. It should be noted that, in general, we choose a relatively large $K$, as we don't want to miss some social events which are not that prominent in comparison to popular ones. The side effect of a large $K$ is the risk of over-splitting topics. Therefore, an additional fusion step is introduced in the next subsection, in which more informative clues such as the search log data are taken into account to merge similar topics.

2) Topic Fusion: After the factorization step, we have $K$ topics $\{t_1, \ldots, t_K\}$ and two matrices $\mathbf{W}$ and $\mathbf{H}$. To characterize a topic, the most intuitive clues are its distributions, both over the query vocabulary and over the timeline. These two distributions can be directly obtained from $\mathbf{W}$ and $\mathbf{H}$. Another useful clue from the search log data is the set of search log URLs, which have proven to be effective for query clustering [40]. The assumption is that queries triggering the same URL are very likely to have similar semantics. Consequently, two topics should be semantically correlated if they have similar distributions over the click-URL space $\mathcal{U}$ defined in Section III-B. In this paper, we combine these three clues to measure the similarity between two topics, and merge all the topics in an unsupervised way.

A. Topic similarity over queries. Given a topic $t_k$ ($1 \le k \le K$), its distribution over the queries can be approximated by the $k$-th column of $\mathbf{W}$. As $\mathbf{W}$ is a non-negative matrix, it is straightforward to transform the $k$-th column into a distribution $P_Q(q_i|t_k)$ by normalizing it with the sum of its elements, i.e., $P_Q(q_i|t_k) = W_{ik} / \sum_{j=1}^{|\mathcal{Q}|} W_{jk}$. Then, the distance between two topics $t_k$ and $t_l$, over the query vocabulary, is defined by the symmetric Kullback-Leibler divergence, as

$$\mathrm{dist}_Q(t_k, t_l) = KL_Q(t_k, t_l) = \frac{1}{2} \sum_{i=1}^{|\mathcal{Q}|} \Big( P_Q(q_i|t_k) \ln \frac{P_Q(q_i|t_k)}{P_Q(q_i|t_l)} + P_Q(q_i|t_l) \ln \frac{P_Q(q_i|t_l)}{P_Q(q_i|t_k)} \Big). \tag{5}$$

B. Topic similarity over timeline. Similarly, a topic $t_k$'s distribution over the timeline, $P_D(d_i|t_k)$, can be approximated by normalizing the $k$-th row in $\mathbf{H}$. That is, $P_D(d_i|t_k) = H_{ki} / \sum_{j=1}^{|\mathcal{D}|} H_{kj}$. However, the similarity of $t_k$ and $t_l$ over the timeline cannot be directly measured by the KL divergence between $P_D(d_i|t_k)$ and $P_D(d_i|t_l)$. This is because the timeline is not always well aligned: even for two topics describing the same story, their temporal distributions may still have a slight time lag. To deal with the alignment issue, we shift one distribution forward and backward with a small offset (one day in the implementation), and select the smallest symmetric K-L divergence as the distance between $t_k$ and $t_l$, as

$$\mathrm{dist}_D(t_k, t_l) = \min \{ KL_D(t_k, t_l; \delta),\ \delta \in \{-1, 0, 1\} \}, \tag{6}$$

where $KL_D(t_k, t_l; \delta)$ is the shift-enabled K-L divergence mentioned above, and $\delta$ is the offset in days.

C. Topic similarity over search log URLs. From the search log, the relationships between the search log URLs and the queries can be described by a $|\mathcal{U}| \times |\mathcal{Q}|$ matrix $\mathbf{L}$ in which each element $L_{ij}$ denotes the number of times that the URL $u_i$ was clicked given the query $q_j$. By multiplying $\mathbf{L}$ and $\mathbf{W}$, we can propagate a topic's weights over queries to the search log URLs. Next, a topic $t_k$'s distribution over the search log URLs is defined as $P_U(u_i|t_k) = (\mathbf{L}\mathbf{W})_{ik} / \sum_{j=1}^{|\mathcal{U}|} (\mathbf{L}\mathbf{W})_{jk}$; and the corresponding distance between $t_k$ and $t_l$ is

$$\mathrm{dist}_U(t_k, t_l) = KL_U(t_k, t_l), \tag{7}$$

where $KL_U$ has the same form as $KL_Q$ in (5).

The three distance scores in (5) to (7) are simply added up to describe the overall distance between two topics. Then, agglomerative hierarchical clustering is adopted to merge similar topics in a bottom-up way. We selected complete linkage as the merge criterion for the clustering, to ensure strong connections between those merged topics. The stop threshold is automatically estimated by identifying the significant jump in the ascending sorted distance scores of all topic pairs.

3) Event Ranking: The last step is to distinguish event-related topics from others. Although this is essentially a classification problem, collecting enough unbiased training data is quite difficult in practice. Therefore, we treat it as a ranking problem, to leverage several heuristics summarized from a number of observations. Similar to the above part, these heuristics are based on the distributions of a topic over the timeline, over the query vocabulary, and over the search log URLs.
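The three topic distributions and the fusion distances of Eqs. (5) to (7) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the authors' implementation: the paper does not say how the shifted timelines are handled at the boundaries (here only the overlapping days are compared, after renormalizing), nor how the "significant jump" is located (here the largest gap between consecutive sorted distances). All function names and the toy matrices `W`, `H`, `L` are hypothetical.

```python
import math

def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def sym_kl(p, q, eps=1e-12):
    """Symmetric KL divergence in the form of Eq. (5); eps avoids log(0)."""
    return 0.5 * sum(
        (a + eps) * math.log((a + eps) / (b + eps))
        + (b + eps) * math.log((b + eps) / (a + eps))
        for a, b in zip(p, q)
    )

def shifted_kl(p, q, max_shift=1):
    """Eq. (6): smallest symmetric KL over offsets -1, 0, +1 (in days).
    Assumption: only overlapping days are compared, then renormalized."""
    n = len(p)
    best = float("inf")
    for s in range(-max_shift, max_shift + 1):
        idx = [i for i in range(n) if 0 <= i + s < n]
        p_part = [p[i] for i in idx]
        q_part = [q[i + s] for i in idx]
        if sum(p_part) > 0 and sum(q_part) > 0:
            best = min(best, sym_kl(normalize(p_part), normalize(q_part)))
    return best

def column(M, k):
    return [row[k] for row in M]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def topic_distances(W, H, L):
    """Sum of the query, timeline and URL distances for every topic pair."""
    LW = matmul(L, W)  # propagate topic weights from queries to URLs
    K = len(H)
    p_q = [normalize(column(W, k)) for k in range(K)]
    p_d = [normalize(H[k]) for k in range(K)]
    p_u = [normalize(column(LW, k)) for k in range(K)]
    dist = [[0.0] * K for _ in range(K)]
    for k in range(K):
        for l in range(k + 1, K):
            d = (sym_kl(p_q[k], p_q[l]) + shifted_kl(p_d[k], p_d[l])
                 + sym_kl(p_u[k], p_u[l]))
            dist[k][l] = dist[l][k] = d
    return dist

def jump_threshold(dist):
    """Assumed rule: the distance just below the largest gap (needs K >= 3)."""
    K = len(dist)
    values = sorted(dist[k][l] for k in range(K) for l in range(k + 1, K))
    gaps = [(values[i + 1] - values[i], values[i]) for i in range(len(values) - 1)]
    return max(gaps)[1]

def complete_linkage(dist, threshold):
    """Bottom-up merging; cluster distance is the largest pairwise distance."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > threshold:
            break
        _, a, b = best
        clusters[a] += clusters.pop(b)
    return clusters

# Toy input: topics 0 and 1 share queries and URLs and differ by a one-day lag.
W = [[5.0, 4.5, 0.1], [3.0, 3.2, 0.1], [0.1, 0.1, 4.0], [0.1, 0.2, 6.0]]  # |Q| x K
H = [[0.1, 1.0, 5.0, 2.0, 0.5, 0.1],
     [0.1, 0.1, 1.0, 5.0, 2.0, 0.5],
     [2.0] * 6]                                                            # K x |D|
L = [[10, 8, 0, 0], [2, 1, 0, 1], [0, 0, 9, 12]]                           # |U| x |Q|

dist = topic_distances(W, H, L)
clusters = complete_linkage(dist, jump_threshold(dist))
print(clusters)
```

In this toy input the one-day lag between topics 0 and 1 is absorbed by the shifted timeline distance, so the two over-split topics end up in one cluster while the flat topic 2 stays separate.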


pregnancy has a clear evolutionary process (occur, sustain, and decay), which is very close to a Gamma distribution. Inspired by this observation, we first fit the distribution with a Gamma distribution, and then use the estimated parameters to reproduce an artificial curve, denoted as $\mathrm{Gamma}(d_i|t_k)$. As a result, the timeline-based ranking score is defined as

$$ score_t(t_k) = \exp\left(-KL\left(P_D(\cdot|t_k)\,\|\,\mathrm{Gamma}(\cdot|t_k)\right)\right). \qquad (8) $$

In this way, higher scores are assigned to topics whose temporal curves look more like a Gamma distribution.

The observations for the distributions over queries and over search log URLs are similar. That is, social events have more concentrated distributions than generic topics. In reality, the numbers of queries and URLs associated with a social event are much smaller than those of a generic topic. For example, the numbers of search log URLs related to "Adele's lyrics" and "Adele's pregnancy" differ by orders of magnitude. A natural choice for measuring the degree of concentration of a distribution is entropy. To promote topics with more concentrated distributions, another two ranking scores are defined as

$$ score_q(t_k) = 1.0 + \frac{1}{\ln|Q|}\sum_{i=1}^{|Q|} P_Q(q_i|t_k)\ln P_Q(q_i|t_k) \qquad (9) $$

$$ score_u(t_k) = 1.0 + \frac{1}{\ln|U|}\sum_{i=1}^{|U|} P_U(u_i|t_k)\ln P_U(u_i|t_k) \qquad (10) $$

Lastly, the ranking score of a topic $t_k$ associated with some events is defined as

$$ rank^{topic}_{event}(t_k) = score_t(t_k) \times score_q(t_k) \times score_u(t_k). \qquad (11) $$

We choose multiplication rather than addition here, because a social topic should satisfy all three of the above criteria. By contrast, topics with small ranking scores usually describe some popular aspects of a celebrity, like his (or her) profile. We call these "profile topics" in the following sections.

C. Event Photo Selection

People often say that a picture is worth a thousand words. Without a doubt, interesting events associated with related photos are more attractive to the audience. For each detected social event, it is straightforward to identify a set of most relevant queries by inspecting the event's distribution in the query space. The simplest way to get an event's related photos is to directly search commercial image search engines with these event queries. Fig. 6 (a) and (b) show the search results returned by Google and Bing for the query "Amanda Bynes car accident". It is clear that the relevance of the returned images is not satisfactory. There are a lot of irrelevant images such as portraits of Amanda Bynes. In the celebrity domain, the main reasons for this are (1) some portrait photos have high static ranking scores and (2) queries are too short to accurately describe an event. Therefore, we need a better way to collect event photos.

Fig. 6. Image search results returned by commercial search engines, for queries on social events and profile topics: (a) Google's search results for "Amanda Bynes car accident"; (b) Bing's search results for "Amanda Bynes car accident"; (c) search results for "Amanda Bynes". Photos with (partial) duplicates are highlighted with blue rectangles.

By investigating a good number of examples, it is observed that (1) in search results of an event query, images with (partial) duplicates are very likely to be relevant to the event; and (2) portraits (or other popular images) also appear in search results of a celebrity's hot queries (e.g., the name of a celebrity). For example, in Fig. 6 (a) and (b), the images with duplicates (marked by blue rectangles) are more relevant to "car accident"; most portrait photos in (a) and (b) have similar ones in Fig. 6 (c), which shows the results for the query "Amanda Bynes". Based on these observations, two criteria are formulated to re-rank photos:

• Promote those images which have (partial) duplicates in the search results of queries from social events.
• Penalize those images which have similar ones in the search results of queries from popular topics.

To do this, for each social event, the 5 most dominant queries are selected for image searching. For each query, thumbnails of the top 100 images returned by a commercial search engine are downloaded. In this way, we construct a candidate photo set for the event, denoted by $I_{event}$, which has 500 thumbnails. Similarly, the top 10 queries from profile topics are used to collect a set of the most representative images of that celebrity, denoted by $I_{profile}$, which has 1000 thumbnails in total. For each celebrity, $I_{profile}$ is shared across various social events. Before processing, blur features and dark features are used to remove photos that are of low quality. In the following subsections, we will introduce how to measure the content similarities among images in $I_{event}$ and $I_{profile}$, and how to re-rank photos in $I_{event}$ based on these similarities. These steps will help identify those photos that most represent the event in question.
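As an illustration of the topic ranking scores in Eqs. (8)-(11), the following Python sketch computes the timeline score, the two concentration scores, and their product for one topic. It is only a sketch under stated assumptions: all function names are hypothetical, the day index is assumed to run from 1 to |D|, and the Gamma curve is fitted by the method of moments, which is a stand-in because the text does not say which fitting procedure is used.

```python
import math


def normalize(values):
    """Scale non-negative weights so that they sum to 1."""
    total = float(sum(values))
    if total <= 0:
        raise ValueError("distribution needs positive total mass")
    return [v / total for v in values]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def fitted_gamma_curve(p):
    """Discretized Gamma curve matching the mean and variance of p.

    Method-of-moments fitting is an assumption made for this sketch.
    """
    days = range(1, len(p) + 1)  # day index shifted so the support is positive
    mean = sum(d * pi for d, pi in zip(days, p))
    var = sum((d - mean) ** 2 * pi for d, pi in zip(days, p))
    if var <= 0:  # all mass on a single day: nothing to fit
        return list(p)
    shape, scale = mean * mean / var, var / mean
    log_norm = math.lgamma(shape) + shape * math.log(scale)
    density = [math.exp((shape - 1) * math.log(d) - d / scale - log_norm) for d in days]
    return normalize(density)


def timeline_score(p_days):
    """Eq. (8): exp(-KL) between the timeline curve and its fitted Gamma curve."""
    p = normalize(p_days)
    return math.exp(-kl_divergence(p, fitted_gamma_curve(p)))


def concentration_score(p_items):
    """Eqs. (9)/(10): one minus the normalized entropy of the distribution."""
    p = normalize(p_items)
    if len(p) < 2:
        return 1.0
    return 1.0 + sum(pi * math.log(pi) for pi in p if pi > 0) / math.log(len(p))


def topic_rank(p_days, p_queries, p_urls):
    """Eq. (11): product of the timeline, query, and URL scores."""
    return (timeline_score(p_days)
            * concentration_score(p_queries)
            * concentration_score(p_urls))
```

With this sketch, a topic whose query and URL distributions are both point masses keeps its full timeline score, while a topic with a uniform query distribution is driven to a rank of zero, which reflects why the product form requires all three criteria to hold at once.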


1) Image Similarity Measures: To measure image similarity, we consider both global and local image features in this paper. Global features are extracted based on a whole image, and are suitable for identifying fully duplicate images. By contrast, local features describe a local image patch, and have been widely used for recognizing partial duplicates. Supporting partial duplicate detection is quite important in this step, as many images have been edited (e.g., cropping or stitching) before being published online.

The global feature adopted in this paper is the block-based intensity histogram [27]. Each image is divided into 64 (8 × 8) blocks, and for the $i$-th block a 256-dimensional intensity histogram $g_i$ is computed based on the pixels within that block. Consequently, the global feature-based similarity between two images $I_x$ and $I_y$ is defined as²

$$ sim_{hist}(I_x, I_y) = \frac{1}{64}\sum_{i=1}^{64}\max\left\{1.0 - \|g_i^{I_x} - g_i^{I_y}\|_2,\ 0\right\}. \qquad (12) $$

² Although the $\ell_2$ distance is not the best one to measure the similarity of two histograms, it works well in practice. We adopt the $\ell_2$ distance mainly because it can be easily accelerated via off-the-shelf indexing technologies.

For local feature-based similarity measurements, we choose the classic SIFT (Scale-Invariant Feature Transform) feature and follow the matching process proposed in [22]. In [22], there is a geometric verification process which ensures that the remaining SIFT correspondences between two images are compliant with each other. This is a very strong assumption, and two images are very likely to be partial duplicates of each other if the number of surviving SIFT correspondences is larger than a threshold. For two images $I_x$ and $I_y$, the local feature-based similarity is defined as

$$ sim_{sift}(I_x, I_y) = \begin{cases} 1, & inlier(I_x, I_y) > \tau_{sift} \\ 0, & inlier(I_x, I_y) \le \tau_{sift} \end{cases} \qquad (13) $$

where $inlier(I_x, I_y)$ is the number of surviving SIFT correspondences between $I_x$ and $I_y$, and the threshold $\tau_{sift}$ is set as 12 as suggested in [22].

Both the global and local similarity measurements can be accelerated via off-the-shelf indexing technologies like k-d tree or hashing kernels [39], which have proven to be very efficient for million-scale image retrieval. Therefore, the computation cost in this paper is affordable.

Finally, the integrated content similarity between images $I_x$ and $I_y$ is defined as

$$ sim(I_x, I_y) = \max\{sim_{hist}(I_x, I_y),\ sim_{sift}(I_x, I_y)\}. \qquad (14) $$

2) Event Photo Re-ranking: To promote a photo $I_x \in I_{event}$ which has duplicates in $I_{event}$, we define the weighting score $w^+(I_x)$ as

$$ w^+(I_x) = \sum_{I_y \in I_{event},\ I_y \neq I_x} sim(I_x, I_y). \qquad (15) $$

According to the definition, the more duplicates in $I_{event}$, the more important the photo $I_x$ is. Similarly, to punish photos with similar ones in $I_{profile}$, another weighting score $w^-(I_x)$ is defined as

$$ w^-(I_x) = 1.0 - \max_{I_y \in I_{profile}}\{sim(I_x, I_y)\}. \qquad (16) $$

$w^-$ becomes very small if $I_x$ has similar images in $I_{profile}$. Finally, the new ranking score of a photo $I_x \in I_{event}$ is

$$ rank^{img}_{event}(I_x) = \frac{|I_{event}| - idx(I_x)}{|I_{event}|} \times w^+(I_x) \times w^-(I_x), \qquad (17) $$

where $idx(I_x)$ is the zero-based index of the photo $I_x$ in the search results returned by search engines. According to the new ranking scores, the photos with the highest scores are considered to be the most representative images of that event.

IV. DATA ANALYSIS

In this paper, we use the image search log collected by a commercial search engine, consisting of queries, clicks and search results from July to December 2012. After filtering the log data with the 200 celebrities' names, we obtain more than 190 million log records. In other words, for each celebrity, there are on average around 5000 log records every day. The data has been processed by the search engine to remove private information, and each log record has three main fields: time, query, and click-URL, as shown in Fig. 7. Given a celebrity, only records with a query containing the celebrity's name are retained for further event detection. To guarantee data quality, we also ignore those log records whose click-URL is empty. It should be noted that sometimes users click more than one URL in a given set of search results. In such a situation, there will be multiple records, each of which corresponds to one clicked URL. This preserves more information about query and URL pairs, which is helpful in measuring the similarity among different log records. For example, the 4th and 5th rows in Fig. 7 are two different click-URLs for the single query "Jennifer Lopez Movies".

V. EVALUATION AND DISCUSSION

A. Experimental Settings

For evaluation, the first step is to choose a list of celebrities. In this paper, we select target celebrities from three main data resources: (1) Google Zeitgeist 2012³, which contains the hottest celebrities in search queries; (2) the most popular celebrities in Yahoo!⁴, the list from the largest internet portal; and (3) the Forbes Celebrity 100 list⁵. After removing those candidates which have few log records or no related ground truth, we come up with a list of 200 celebrities who are singers, actors/actresses, and politicians.

³ http://www.google.com/zeitgeist/2012/
⁴ http://omg.yahoo.com/top-celebrities/
⁵ http://www.forbes.com/celebrities/

For quantitative performance measurement, the most challenging step is to prepare a benchmark dataset with ground truth labels. In practice this turns out to be a laborious task. Ten websites, as listed in Table I, are adopted for ground truth generation. For each website, we first develop a site-specific crawler to download those pages containing celebrity-related social events. Then, we manually write regular expressions to extract event-related information from every fetched web page. In this way, we convert these web pages into a table of structured data with three fields: celebrity name, event time, and event description. To provide a convincing


ground truth list for each celebrity, we only keep those events which appear on at least two websites. Here, to judge whether two stories are about the same event, we adopt a simple yet effective heuristic rule. That is, two events are considered to be matched with each other if they happen within 3 days (1 day accepted), and there are more than two common keywords (both celebrity name and stop words are ignored) in their descriptions. To prevent the influence of the noisy data from mixed topics, we only extract the top appearing words as descriptions.

Fig. 7. A snapshot of the web search log data used for celebrity social event mining.

TABLE I
10 WEBSITES FOR GROUND TRUTH GENERATION

No.   Website URL
1     http://www.celebitchy.com/archives-by-category/
2     http://www.people.com/people/celebrities/
3     http://www.egotastic.com/celebrities/
4     http://www.hellomagazine.com/celebrities/
5     http://www.idontlikeyouinthatway.com/pictures/
6     http://www.okmagazine.com/celebs-list
7     http://omg.yahoo.com/top-celebrities/
8     http://www.popsugar.com/celebrities
9     http://theblemish.com/
10    http://www.thehollywoodgossip.com/stars/

To evaluate performance, we adopt the classic precision and recall measures. As introduced in Section IV-C, the social events discovered in this paper are sorted in descending order according to their scores defined in (11). Hence, for each celebrity $c_i$, the precision and recall are computed based on the top $gt(c_i)$ topics in the ranking list. Here, $gt(c_i)$ is the number of social events in $c_i$'s ground truth list. Similarly, a discovered topic is said to match a ground truth event if (1) the difference in time is within 3 days and (2) more than two keywords from the event's description appear in the topic's top 5 queries. Suppose there are $matched(c_i)$ topics matched with some events in the ground truth, and there are $covered(c_i)$ events in the ground truth that appear in the discovered topics; then we have $precision(c_i) = matched(c_i)/gt(c_i)$ and $recall(c_i) = covered(c_i)/gt(c_i)$. It should be noted that here we have $0 \le covered(c_i) \le matched(c_i) \le gt(c_i)$; this is because sometimes a social event could be over-split into multiple topics. In such a situation, precision is still good but recall drops. To better measure the overall performance for all 200 celebrities, we average the precision and recall scores in two ways:

$$ prec_{micro} = \frac{\sum_{i=1}^{200} matched(c_i)}{\sum_{i=1}^{200} gt(c_i)}, \qquad rec_{micro} = \frac{\sum_{i=1}^{200} covered(c_i)}{\sum_{i=1}^{200} gt(c_i)} \qquad (18) $$

$$ prec_{macro} = \frac{1}{200}\sum_{i=1}^{200}\frac{matched(c_i)}{gt(c_i)}, \qquad rec_{macro} = \frac{1}{200}\sum_{i=1}^{200}\frac{covered(c_i)}{gt(c_i)} \qquad (19) $$

The micro-averages focus on the performance at the event level, while the macro-averages measure performance at the celebrity level, ignoring the difference in celebrities' popularity.

Fig. 8. Comparisons of the performance with different smoothing weights in the topic factorization (Micro-Precision, Micro-Recall, Macro-Precision and Macro-Recall).

B. Event Topic Discovery

As mentioned in Section IV-A, in the topic factorization step there are two parameters, the smoothing weight and the number of topics K. In this section, we first investigate the influence of the two parameters on event detection, and then compare the overall performance of our approach with three other approaches.

The smoothing weight in SNMF controls the smoothness of topic distributions along the timeline. If it is 0, SNMF degenerates to the standard form of NMF. The larger the smoothing weight is, the stronger the regularization applied. In the experiment, we vary the smoothing weight from 0 to 100 on nine different scales (and K is fixed at 40). The performance measurements under different smoothing weights are shown in Fig. 8. From Fig. 8, it is clear that the performance of standard NMF (smoothing weight of 0) is not good. This is because the


violently varying timeline distribution (refer to the example shown in Fig. 4) cannot accurately characterize the temporal evolution of a topic, and will hurt the next steps of topic fusion and event ranking. As the smoothing weight increases, the performance gets better and better, which demonstrates the necessity of adopting SNMF for topic factorization. However, when the smoothing weight becomes large enough, the performance starts creeping down. There are two reasons for the performance drop: (i) a strong smoothing operation will weaken peaks (just like the one shown in Fig. 5) which reflect the occurrence of some events; and (ii) the strong regularization factor will dominate the cost function in equation (4), so that the obtained $W \times H$ cannot approximate the original matrix $D$ very well. According to Fig. 8, we set the smoothing weight to 10 in the following experiments.

For the number of topics K, we vary it in the range of 10 to 50. The performance curves are shown in Fig. 9. From Fig. 9, it is noted that as K increases, the performance curves have a clear trend of rising, holding steady, and then falling. When K is small, some social events are easily mixed with other popular topics; when K is very large, the fusion step in Section IV-B may fail to merge some relevant topics. Both situations hurt performance. By contrast, these curves are relatively stable for the range $20 \le K \le 40$, which indicates the effectiveness of the topic fusion component. We set K = 40 in the following experiments.

Fig. 9. Comparisons of the performance with different numbers of topics K in the topic factorization (Micro-Precision, Micro-Recall, Macro-Precision and Macro-Recall).

Lastly, we compare the overall performance of the proposed method with three other approaches. One is the straightforward abnormal query-based strategy mentioned in the beginning of Section IV; the second is the approach proposed by Zhao et al. [40], which also utilizes web search logs. In [40], the query and URL pairs in log data are represented with a bipartite graph, based on which a novel clustering method is adopted to group queries/URLs into events. The third one [1] converts the time series data into time-interval sequences of temporal abstractions and presents a minimal predictive recent temporal patterns framework to select the event patterns. For the convenience of comparison, the number of abnormal queries and the number of clusters in [40] are both set to the ground truth event number $gt(c_i)$. Fig. 10 shows the experimental results. It is clear that the abnormal query-based solution achieves the worst performance. As we argue in Section IV, statistics at the query level are very noisy and unreliable. The scores of [40] and [1] are not good enough either. This is because (1) some popular topics (events) will dominate the clustering; (2) sometimes event-related search log URLs are too sparse to bridge related queries; and (3) the time patterns in the search log can be overwhelmed by the common noisy data. Therefore, topic factorization and event ranking are both necessary components in the solution.

Fig. 10. Comparisons of the overall performance with different approaches.

C. Evaluation of Event Photo Relevance

To measure the relevance of social event photos, we have to resort to subjective evaluation. Ten undergraduate students were invited as judges. For each event, the judges first read the related webpages (through search log URLs) to learn the story. Then, photos returned by Google, Bing, and our approach were presented to the judges in a random order. Each photo was assigned one of three scores: perfect, relevant, or irrelevant. "Perfect" means the photo is about both the celebrity and the event, "relevant" means the photo is at least about the celebrity, and "irrelevant" means the photo is totally wrong. Considering the cost of human judges, we randomly selected 50 correctly discovered event topics for evaluation; and for each event, only the top five photos returned by the search engines and our approach were labeled. The comparison results are shown in Fig. 11. The event photo re-ranking method introduced in Section V is helpful for identifying event-relevant photos from image search results. To give a vivid impression of the event photo selection, some example cases are shown in Fig. 12, in which perfectly relevant photos are marked with blue rectangles.

Fig. 11. Subjective evaluation for the relevance of event photos returned by Bing, Google, and the proposed approach.

D. Examples of Event Storyboard

Finally, the storyboard is generated using the selected events with relevant photos. To ensure the high quality of the photos, photos of low visual quality are eliminated first. Besides, time and location contexts are other important factors for the storyboard photos. For each detected event, the


happening time and location are extracted first. We use the text surrounding the photos on the web to identify whether a photo is relevant to our detected event. Photos whose time and location differ greatly from those of the detected event are ranked low.

Fig. 12. Showcases of the event photos returned by Bing (1st row), Google (2nd row), and the proposed approach (3rd row). Perfectly relevant photos are marked with blue rectangles.

As discussed in the Introduction, it would be an attractive feature if we could generate a storyboard based on the discovered event topics and photos. Actually, for each event, we have its top-related queries and search log URLs. After downloading webpages following those hot search log URLs, we can extract sentences which contain top queries from the fetched pages. Candidate sentences are organized into a graph, in which each edge denotes how many common words (excluding stop words) are shared between the two corresponding sentences. Given the link graph, we can select the most representative sentence following a PageRank-like ranking strategy. Such a sentence can be used as a short description of a social event in the storyboard. Fig. 13 gives two examples of storyboards for Tom Cruise and Barack Obama.

Fig. 13. The storyboards for Tom Cruise and Barack Obama, from July 2012 to December 2012.

To compare the difference between the physical event time and the detected event time, we calculate the time delay over all events of the 200 celebrities. To some extent, the detected event time reflects when common users become interested in the event. The average delay is about 2 days, which means that a real-world event is noticed by users quickly, as reflected in our search log data. Moreover, we find that celebrities with higher reputations have a much smaller time delay than other celebrities.

E. User Study for the Storyboard

To make our results more convincing, 10 persons were hired to evaluate the storyboard results in terms of correctness, photo relevance and event representation. Each person scores each aspect from 1 to 10. At the same time, we select the human-edited web page as our ground truth for comparison.

TABLE II
HUMAN EVALUATION FOR STORYBOARD AND HUMAN WEB PAGE. EACH METHOD IS EVALUATED BY 10 PERSONS (SCALE 1-10, HIGHER IS BETTER).

Method           Correctness   Photo Relevance   Event Representation
Our storyboard   7.9           6.5               7.9
Web Page         10            9.4               9.8

From Table II, the generated storyboard is satisfactory in correctness and event representation, which confirms that the search log reflects interesting events well. Currently, the selected photos are not good enough because most photos are about the celebrity himself or herself, and it is difficult to get photos that are relevant to the event. Overall, our method is promising according to the user study.

F. Potential Application and Other Entity Extension

We have applied our approach in a real phone demo system [31]. Fig. 14 shows the basic page of our celebrity


social event system. Users can interactively switch among four views: people-centric, timeline-centric, month-centric and topic-centric.

Fig. 14. A celebrity social event system: (a) home page, (b) list view of topics related to Jennifer Aniston, (c) list view of hot events in October, and (d) relevant images about the event "girlfriend", developed on Windows Phone 8.

Beyond celebrities, events of other entities can be detected using a similar strategy with the search log data. For entities such as landmarks and brands, we can detect their development and evolution along the timeline with related photos, which makes it easier for users to learn more about them.

VI. CONCLUSIONS

In this paper, we use search logs as the data source to generate social event storyboards automatically. Unlike common text mining, search logs have short, sparse text queries and the data size is much bigger than that of news websites or blogs. Based on these features, we do not use the query text information to do the analysis. Structural and statistical information is used for topic and event detection in our work, which fits the data well. Furthermore, we add time information to SNMF in our approach, which makes it easier to discover social events compared with traditional NMF methods. Our work performs better than traditional works in this area, e.g. [40], because we can distinguish the topics in a way that gets the events which are most appealing to common users. The associated images are selected to make up the storyboard along a timeline, presenting a good representation of the mined events using the features and relationships of image search results.

REFERENCES

[1] C. Alexander, B. Fayock, and A. Winebarger. Automatic event detection and characterization of solar events with iris, sdo/aia and hi-c. In AAS/Solar Physics Division Meeting, volume 47, 2016.
[2] J. Allan, J. G. Carbonell, G. Doddington, J. Yamron, and Y. Yang. Topic detection and tracking pilot study final report. 1998.
[3] S. Arora, R. Ge, and A. Moitra. Learning topic models - going beyond svd. In Foundations of Computer Science (FOCS), 2012 IEEE 53rd Annual Symposium on, pages 1-10. IEEE, 2012.
[4] N. Babaguchi, S. Sasamori, T. Kitahashi, and R. Jain. Detecting events from continuous media by intermodal collaboration and knowledge use. In Multimedia Computing and Systems, 1999. IEEE International Conference on, volume 1, pages 782-786. IEEE, 1999.
[5] P. N. Bennett, R. W. White, W. Chu, S. T. Dumais, P. Bailey, F. Borisyuk, and X. Cui. Modeling the impact of short- and long-term behavior on search personalization. In Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval, pages 185-194. ACM, 2012.
[6] D. M. Blei. Introduction to probabilistic topic models. Comm. ACM, 55(4):77-84, 2012.
[7] D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent dirichlet allocation. Journal of machine Learning research, 3:993-1022, 2003.
[8] Y.-J. Chang, H.-Y. Lo, M.-S. Huang, and M.-C. Hu. Representative photo selection for restaurants in food blogs. In Multimedia & Expo Workshops (ICMEW), 2015 IEEE International Conference on, pages 1-6. IEEE, 2015.
[9] H. L. Chieu and Y. K. Lee. Query based event extraction along a timeline. In Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, pages 425-432. ACM, 2004.
[10] T.-C. Chou and M. C. Chen. Using incremental plsi for threshold-resilient online event analysis. Knowledge and Data Engineering, IEEE Transactions on, 20(3):289-299, 2008.
[11] H. Cui, J.-R. Wen, J.-Y. Nie, and W.-Y. Ma. Probabilistic query expansion using query logs. In Proceedings of the 11th international conference on World Wide Web, pages 325-332. ACM, 2002.
[12] S. Essid and C. Fevotte. Smooth nonnegative matrix factorization for unsupervised audiovisual document structuring. Multimedia, IEEE Transactions on, 15(2):415-425, 2013.
[13] G. P. C. Fung, J. X. Yu, H. Liu, and P. S. Yu. Time-dependent event hierarchy construction. In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 300-309. ACM, 2007.
[14] T. Hofmann. Probabilistic latent semantic indexing. In Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, pages 50-57. ACM, 1999.
[15] T. Joachims. Optimizing search engines using clickthrough data. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 133-142. ACM, 2002.
[16] N. Kawamae. Trend analysis model: trend consists of temporal words, topics, and timestamps. In Proceedings of the fourth ACM international conference on Web search and data mining, pages 317-326. ACM, 2011.
[17] D. D. Lee and H. S. Seung. Algorithms for non-negative matrix factorization. In Advances in neural information processing systems, pages 556-562, 2001.
[18] J. Li and C. Cardie. Timeline generation: Tracking individuals on twitter. In Proceedings of the 23rd international conference on World wide web, pages 643-652. ACM, 2014.
[19] Z. Li, B. Wang, M. Li, and W.-Y. Ma. A probabilistic model for retrospective news event detection. In Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 106-113. ACM, 2005.
[20] A. Liu, W. Lin, and M. Narwaria. Image quality assessment based on gradient similarity. Image Processing, IEEE Transactions on, 21(4):1500-1512, 2012.
[21] H. Liu, J. He, Y. Gu, H. Xiong, and X. Du. Detecting and tracking topics and events from web search logs. ACM Transactions on Information Systems (TOIS), 30(4):21, 2012.
[22] D. G. Lowe. Object recognition from local scale-invariant features. In Computer vision, 1999. The proceedings of the seventh IEEE international conference on, volume 2, pages 1150-1157. IEEE, 1999.
[23] Q. Mei, C. Liu, H. Su, and C. Zhai. A probabilistic approach to spatiotemporal theme pattern mining on weblogs. In Proceedings of the 15th international conference on World Wide Web, pages 533-542. ACM, 2006.
[24] T. Mei, Y. Rui, S. Li, and Q. Tian. Multimedia search reranking: A literature survey. ACM Computing Surveys (CSUR), 46(3):38, 2014.
[25] M. Platakis, D. Kotsakos, and D. Gunopulos. Searching for events in the blogosphere. In Proceedings of the 18th international conference on World wide web, pages 1225-1226. ACM, 2009.
[26] S. D. Roy, T. Mei, W. Zeng, and S. Li. Towards cross-domain learning for social video popularity prediction. Multimedia, IEEE Transactions on, 15(6):1255-1267, 2013.
[27] Y. Rui, T. S. Huang, and S.-F. Chang. Image retrieval: Current techniques, promising directions, and open issues. Journal of visual communication and image representation, 10(1):39-62, 1999.
[28] E. Sadikov, J. Madhavan, L. Wang, and A. Halevy. Clustering query refinements by user intent. In Proceedings of the 19th international conference on World wide web, pages 841-850. ACM, 2010.
[29] S. Song, Q. Li, and N. Zheng. Understanding a celebrity with his salient events. In Active Media Technology, pages 86-97. Springer, 2010.
[30] Y. Suhara, H. Toda, and A. Sakurai. Event mining from the blogosphere using topic words. In ICWSM, 2007.
[31] S. Tan, C.-W. Ngo, J. Xu, and Y. Rui. Celebrowser: An example of browsing big data on small device. In Proceedings of International Conference on Multimedia Retrieval, page 514. ACM, 2014.
[32] T. C. Walber, A. Scherp, and S. Staab. Smart photo selection: Interpret the gaze as personal interest. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pages 2065-2074. ACM, 2014.


[33] X. Wang and A. McCallum. Topics over time: a non-Markov continuous-time model of topical trends. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 424–433. ACM, 2006.
[34] X. Wang, K. Zhang, X. Jin, and D. Shen. Mining common topics from multiple asynchronous text streams. In Proceedings of the Second ACM International Conference on Web Search and Data Mining, pages 192–201. ACM, 2009.
[35] W. Weerkamp, R. Berendsen, B. Kovachev, E. Meij, K. Balog, and M. De Rijke. People searching for people: Analysis of a people search engine log. In Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval, pages 45–54. ACM, 2011.
[36] J. Weng and B.-S. Lee. Event detection in Twitter. ICWSM, 11:401–408, 2011.
[37] C.-C. Wu, T. Mei, W. H. Hsu, and Y. Rui. Learning to personalize trending image search suggestion. In Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval, pages 727–736. ACM, 2014.
[38] Y. Yang, T. Pierce, and J. Carbonell. A study of retrospective and on-line event detection. In Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, pages 28–36. ACM, 1998.
[39] X. Zhang, L. Zhang, and H.-Y. Shum. Qsrank: Query-sensitive hash code ranking for efficient epsilon-neighbor search. In Proc. CVPR, pages 2058–2065, 2012.
[40] Q. Zhao, T.-Y. Liu, S. S. Bhowmick, and W.-Y. Ma. Event detection from evolution of click-through data. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 484–493. ACM, 2006.

Jun Xu is a Ph.D. student at the University of Science and Technology of China (USTC). He received the B.E. degree from USTC in 2015. His research interests include data mining, video analysis, pattern recognition, computer vision, and multimedia content analysis. He was an intern at Microsoft Research, Beijing, China, from 2012 to 2013 and from 2014 to 2016.

Tao Mei (M'07-SM'11) is a Lead Researcher with Microsoft Research, Beijing, China. His current research interests include multimedia analysis and retrieval, and computer vision. He has authored or co-authored over 100 papers in journals and conferences and 10 book chapters, and has edited four books. He holds over 15 granted U.S. patents and has more than 20 pending. Tao was the recipient of several paper awards from prestigious multimedia journals and conferences, including the IEEE Communications Society MMTC Best Journal Paper Award in 2015, the IEEE Circuits and Systems Society Circuits and Systems for Video Technology Best Paper Award in 2014, the IEEE Trans. on Multimedia Prize Paper Award in 2013, and Best Paper Awards at ACM Multimedia in 2009 and 2007. He was the principal designer of the automatic video search system that achieved the best performance in the worldwide TRECVID evaluation in 2007. He is an Editorial Board Member of IEEE Trans. on Multimedia, ACM Trans. on Multimedia Computing, Communications, and Applications, Machine Vision and Applications, and Multimedia Systems, and was an Associate Editor of Neurocomputing and a Guest Editor of eight international journals. He is the General Co-chair of ACM ICIMCS 2013, the Program Co-chair of ACM Multimedia 2018, IEEE ICME 2015, IEEE MMSP 2015 and MMM 2013, and the Area Chair for a dozen international conferences.
Tao received B.E. and Ph.D. degrees from the University of Science and Technology of China, Hefei, China, in 2001 and 2006, respectively. He is a Senior Member of the IEEE and the ACM, and a Fellow of IAPR.

Rui Cai is a Lead Researcher at Microsoft Research Asia. He received the B.E. and Ph.D. degrees in computer science from Tsinghua University, Beijing, China, in 2001 and 2006, respectively. His research interests include web search and data mining, machine learning, pattern recognition, computer vision, multimedia content analysis, and signal processing. He is a member of the Association for Computing Machinery (ACM) and the Institute of Electrical and Electronics Engineers (IEEE).

Houqiang Li (SM'12) received the B.S., M.Eng., and Ph.D. degrees from the University of Science and Technology of China (USTC), Hefei, China, in 1992, 1997, and 2000, respectively, all in electronic engineering.
He is currently a Professor at the Department of Electronic Engineering and Information Science, USTC. He has authored or co-authored over 100 papers in journals and conferences. His current research interests include video coding and communication, multimedia search, and image/video analysis.
Dr. Li served as an Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY from 2010 to 2013 and has been on the Editorial Board of the Journal of Multimedia since 2009. He has served on technical/program committees and organizing committees, and as Program Co-Chair and Track/Session Chair, for over ten international conferences. He received the Best Paper Award at Visual Communications and Image Processing in 2012, at the International Conference on Internet Multimedia Computing and Service in 2012, and at the International Conference on Mobile and Ubiquitous Multimedia in 2011, and was a senior author of the Best Student Paper of the 5th International Mobile Multimedia Communications Conference in 2009.

Yong Rui is currently Deputy Managing Director of Microsoft Research Asia (MSRA), leading research groups in multimedia search and mining, and big data analysis, and engineering groups in multimedia processing, data mining, and software/hardware systems.
A Fellow of IEEE, IAPR and SPIE, a Distinguished Scientist of ACM, and a Distinguished Lecturer of both ACM and IEEE, Rui is recognized as a leading expert in his research areas. He holds 60 issued US and international patents. He has published 16 books and book chapters, and 100+ refereed journal and conference papers. Rui's publications are among the most cited, with 15,000+ citations and an h-index of 54.
Dr. Rui is the Editor-in-Chief of IEEE Multimedia Magazine, an Associate Editor of ACM Trans. on Multimedia Computing, Communication and Applications (TOMM), a founding Editor of International Journal of Multimedia Information Retrieval, and a founding Associate Editor of IEEE Access. He was an Associate Editor of IEEE Trans. on Multimedia (2004–2008), IEEE Trans. on Circuits and Systems for Video Technology (2006–2010), ACM/Springer Multimedia Systems Journal (2004–2006), and International Journal of Multimedia Tools and Applications (2004–2006). He also serves on the Advisory Board of IEEE Trans. on Automation Science and Engineering. He is an Executive Member of ACM SIGMM, and the founding Chair of its China Chapter.
Dr. Rui received his BS from Southeast University, his MS from Tsinghua University, and his PhD from the University of Illinois at Urbana-Champaign (UIUC).
