1, MARCH 2014
Abstract—Faced with massive multimedia services and content on the Internet, mobile users usually spend a lot of time finding what interests them. Therefore, various context-aware recommendation systems have been proposed. Most of those systems deploy a large number of context collectors at terminals and access networks. However, context collecting and exchanging result in heavy network overhead, and context processing consumes huge computation. In this paper, a cloud-based mobile multimedia recommendation system which can reduce network overhead and speed up the recommendation process is proposed. The users are classified into several groups according to their context types and values. With accurate classification rules, the context details do not need to be computed, and the huge network overhead is reduced. Moreover, user contexts, user relationships, and user profiles are collected from video-sharing websites to generate multimedia recommendation rules based on the Hadoop platform. When a new user request arrives, the rules are extended and optimized to make real-time recommendations. The results show that the proposed approach can recommend desired services with high precision, high recall, and low response delay.
Index Terms—Cloud computation, minimal spanning tree, multimedia service recommendation, user behavior analysis.
I. INTRODUCTION
According to Cisco's latest forecast, two-thirds of the world's mobile data traffic and 62% of the consumer Internet traffic will be video by the end of 2015. The sum of all forms of video (TV, video on demand, Internet, and P2P) will continue to be approximately 90% of global consumer traffic by 2015. Internet users post a large number of video clips on video-sharing websites and social network applications every day [1]. The video content may be duplicate, similar, related, or quite different. Facing billions of multimedia webpages, online users usually have a hard time finding their favorites. This situation is even worse for mobile users because of limited screen size and low bandwidth. Helping mobile users obtain their desired content lists from billions of webpages in a short time is very challenging [2]. Some video-sharing websites recommend video lists to end users according to video classification, video description tags, or watching history. However, these recommendations are not accurate and are often not consistent with the end users' interests. To improve this, some websites also provide a search engine so that users can find their desired videos quickly. However, searching is based on keywords, and in most cases, mobile users do not have any keyword in mind when they search. Favorite video recommendation techniques are commercially driven and are important for mobile multimedia applications.
Several successful video recommendation algorithms and systems have been developed and exploited. For example, Google has adopted a content-based filtering (CB) recommender system in its AdWords services. The Google search engine returns search results with keyword-related advertisements. However, those advertisements are always neglected by end users. This is mainly because of biased decisions about users' favorite content [3]. Unfortunately, Google AdWords has been removed from the right side of the page. Amazon and Taobao have achieved great success in recent years. They have introduced collaborative filtering (CF) recommender systems into their e-commerce websites to help users find goods that interest them [4]. The users' interests are identified by matching the click and concern patterns among a group of users. The basic concept is to use the behavior of a large group of people to predict individual interests. Therefore, highly popular content is considered to be of common interest, while less popular content is not judged to be of interest. As a result, content that is less popular but of interest to a user will never be recommended to that user. Another famous recommender system, based on social network filtering (SNF), is exploited by Facebook. On Facebook, the social network is formed according to social signals, such as space links, user concerns, content forwards, and user interactions. Users can recommend content to their social network, which has become a trend in content recommendation. However, recommendation satisfaction, cold start, and timeliness in content recommendation are still three challenging issues [5].
Manuscript received July 29, 2012; revised December 27, 2012; accepted May 8, 2013. Date of publication January 16, 2014; date of current version February 5, 2014. This work was supported in part by the National Key Projects of China under Grants 2012ZX03002010 and 2009ZX03004-004-004-04 and in part by the National Science Foundation of China under Grants 61001070 and 61201219. (Corresponding author: X. Xie.)
Y. Mo is with the Department of Electronics and Information Engineering, Huazhong University of Science and Technology, Wuhan 430074, China (e-mail: moyj@hust.edu.cn).
J. Chen is with the Department of Computer Science, University of California, Los Angeles, CA 90095 USA (e-mail: jianwen.chen@ieee.org; xu-feng@live.com).
X. Xie and C. Luo are with the School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China (e-mail: shelicy@mail.hust.edu.cn; chqluo2013@gmail.com).
L. T. Yang is with the School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China, and also with the Department of Computer Science, St. Francis Xavier University, Antigonish, NS B2G 2W5, Canada (e-mail: ltyang@gmail.com).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JSYST.2013.2279732
1932-8184 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
MO et al.: CLOUD-BASED MOBILE MULTIMEDIA RECOMMENDATION SYSTEM WITH USER BEHAVIOR INFORMATION 185
For almost all of the existing recommendation algorithms, the typical system consists of two essential components: 1) a content recommender that takes charge of user interest identification, user interest recommendation, and result reranking and 2) various collectors that collect user context and activities, content attributes, and updates. At recommendation system initialization, a small amount of contextual information, e.g., time and location, is collected [6]. To capture the interests of users in a ubiquitous environment, more and more contextual information, such as user opinions, watching times, and video ages, is logged in the recommendation system [4]. Real-time recommendation cannot be guaranteed because of the inevitable increase in computation. User interests and content clustering are often used to narrow the search range of related content.
In this paper, we propose a mobile multimedia recommender system based on user behavior. The system is implemented on the Hadoop platform to satisfy the huge computation requirements of real-time recommendation systems. Compared with traditional recommender systems, there are three differences: 1) the collector and user profiles are decentralized into several computing nodes; 2) user behavior clusters are collected rather than only user profiles; and 3) a graph-based optimization mechanism is introduced into the recommender to speed up the recommendation process. The proposed system makes the following contributions.
1) User clusters are collected instead of detailed user profiles. More and more user contexts and profiles will be delivered and exchanged as the number of collectors and Hadoop nodes increases. To avoid an explosion of network overhead, user-behavior-based clustering is performed first, and the collectors calculate user clusters according to the clustering rules and then report only the user cluster to the recommender.
2) The Hadoop platform is used in the proposed multimedia recommendation system. On the platform, user clusters and multimedia content are collected, distributed, and stored in the Hadoop distributed file system (HDFS). During user content recommendation, those data are partitioned into several chunks, the chunks are processed simultaneously by several mappers, and then the results are reduced and merged together [7]. The MapReduce procedure can speed up existing recommendation algorithms, such as CB, CF, or SNF.
3) Recommendation rules are reordered to improve scalability and real-time recommendation. Existing recommendation systems always recommend a ranked list to users after training on some given data. However, if the content changes or a new keyword appears, a fixed list is still provided. In our work, the recommender searches a real-time ranked list for users according to recommendation rules. Furthermore, considering the influence of rule execution order, we propose a graph-based rule reordering method to reduce search latency.
The rest of this paper is organized as follows. Section II discusses the related work and proposes the cloud-assisted system architecture. Section III presents the user behavior model, user-behavior-based clustering, clustering-based user profile collecting, and cloud-based recommendation rule reasoning in detail. Further discussion about system performance optimization is presented in Section IV. Section V describes the implementation of the proposed recommender system and presents a comprehensive evaluation of the system. Section VII concludes this paper and discusses future work.
II. RELATED WORK AND SYSTEM ARCHITECTURE
For emerging mobile devices and services, various context-aware service platforms, such as SPICE [8], CASD [9], and uPnP-based architecture [10], have been developed to provide mobile users with their favorite services and applications. Recommendation systems based on users' preferences have been applied to user favorite recommendation for several years. In this section, we review existing recommendation systems and present the architecture of the proposed cloud-assisted recommendation system.
A. Recommendation System
Recommendation systems focus on a specific domain. For example, Google News provides personalized news recommendation services for a substantial number of online readers. Amazon uses a recommender system to help users find their desired products. YouTube uses user watching history to predict and recommend videos for users. In general, four categories of algorithms have been exploited by recommender systems: CB recommendation [11]–[13], CF-based recommendation [14]–[19], context-aware recommendation, and graph-based recommendation [20].
CB recommendation: These systems make recommendations based on the similarities of content titles, tags, or descriptions. Some systems find items of interest based on the user's individual reading history in terms of content. CB recommender systems are easy to implement. However, in some scenarios, simply representing the user's profile information by a bag of words is not sufficient to capture the exact interests of the user.
CF-based recommendation: These systems make recommendations based on abundant user transaction histories and content popularity. In these systems, an individual user's interests are predicted from a group of similar users [15]–[17]. To obtain the content rating and users' similarity, statistics and feedback methods are used [18], [19]. CF systems require sufficient historical consumption records and feedback. Otherwise, prediction, implicit feedback, or opinion classification methods should be adopted to solve cold-start issues [5].
Context-aware recommendation: The aforementioned systems provide stable recommendations without considering user context information. In fact, user interests vary according to location, time, and emotion. Context-aware recommendation systems combine the user context sensed on a smartphone with the long-term user profile to assist the user in selecting better services, photographs, or videos dynamically. Context is a difficult concept to capture and describe; fuzzy ontologies and semantic reasoning are used to augment and enrich the description of context [21], [22].
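As an illustration of the CB approach summarized in this subsection, the following is a minimal Python sketch of bag-of-words matching between a user's viewing history and candidate titles. It is not the implementation used in this paper; the function names (`bag_of_words`, `cosine`, `recommend`), the whitespace tokenizer, and the choice of cosine similarity are all assumptions made for illustration.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase, whitespace-tokenized term counts (a deliberately naive tokenizer)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(history, candidates, k=2):
    """Rank candidate titles by similarity to the user's aggregated history profile."""
    profile = Counter()
    for title in history:
        profile.update(bag_of_words(title))
    ranked = sorted(candidates,
                    key=lambda t: cosine(profile, bag_of_words(t)),
                    reverse=True)
    return ranked[:k]


history = ["funny cat video", "cat compilation"]
candidates = ["cat tricks", "stock market news", "funny dog video"]
top = recommend(history, candidates, k=2)
```

The sketch also shows the limitation noted above: a candidate that shares no literal term with the history scores zero even if it is semantically related, which is one reason a plain bag of words can miss a user's exact interests.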
186 IEEE SYSTEMS JOURNAL, VOL. 8, NO. 1, MARCH 2014
in each group and incomplete user coverage. Fortunately, users often co-comment on popular videos and add other users who provide their favorite videos as idols. These behaviors hide some group information. Therefore, we are concerned with three kinds of user relationship: idol–fan profiles, co-commenting behaviors, and interest groups. Based on these connection relationships, we construct a weighted graph like Fig. 3.
The graph is composed of three kinds of subgraph. The subgraph with red links is constructed from idol–fan profiles; if a user adds another user as his idol, the two users are linked together. If two users take each other as idols, there are two edges between them. The subgraph with blue links is generated from user co-comment behaviors. If a user comments on a video of interest, an edge links the user and the video. The subgraph with green links is plotted based on user interest groups. If a user joins an interest group, he is connected with the group. It is complex to recommend items based on the graph directly.
With the help of intermediate videos, we merge the three subgraphs into one graph of user–user community. The new graph $G = \{V, E\}$ consists of node set $V$ and edge set $E$. $V = \{u_1, u_2, \ldots, u_i, \ldots, u_m\}$, where node $u_i$ denotes the $i$th user. $E = \{e_1, e_2, \ldots, e_j, \ldots, e_n\}$, where edge $e_j$ denotes the correlation between two users. The weight of node $u_i$ [$\mathrm{Weight}_{\mathrm{node}}(u_i)$] and the weight of edge $e_j$ [$\mathrm{Weight}_{\mathrm{edge}}(e_j)$] are calculated as follows:
\[
\begin{aligned}
\mathrm{Weight}_{\mathrm{node}}(u_i) &= \alpha \cdot \mathit{fans} + \beta \cdot \mathit{comments} + \gamma \cdot \mathit{grpusers} \\
\mathrm{Weight}_{\mathrm{edge}}(e_j) &= \alpha \cdot \frac{1 + \mathrm{MuFans}(j_0, j_1)}{2} + \beta \cdot \frac{\mathrm{CoCom}(j_0, j_1)}{\mathit{com}_{j_0} + \mathit{com}_{j_1}} + \gamma \cdot \frac{\mathrm{CoGrp}(j_0, j_1)}{\mathit{grp}_{j_0} + \mathit{grp}_{j_1}} \\
&\alpha > 0,\ \beta > 0,\ \gamma > 0;\quad \alpha + \beta + \gamma = 1
\end{aligned}
\tag{1}
\]
where $\alpha$, $\beta$, and $\gamma$ represent the influence of idol–fan profiles, co-commenting behaviors, and interest groups, respectively. $\mathit{fans}$ is the number of fans of the $i$th user, $\mathit{comments}$ is the number of comments on the videos posted by the $i$th user, and $\mathit{grpusers}$ is the size of the group created by the $i$th user. $\mathrm{MuFans}(j_0, j_1)$ depends on whether the two users are mutual idols: if they are, it is 1; otherwise, it is 0. $\mathit{com}_{j_0}$ and $\mathit{com}_{j_1}$ are the numbers of comments made by the two users linked by the $j$th edge, while $\mathrm{CoCom}(j_0, j_1)$ is the number of their co-comments. $\mathit{grp}_{j_0}$ and $\mathit{grp}_{j_1}$ are the numbers of groups joined by the two users, while $\mathrm{CoGrp}(j_0, j_1)$ is the number of groups joined by both users.
If the weight of an edge is lower than a threshold $\delta$, the users linked by the edge are divided into two subgraphs, and in this way the graph is partitioned into several subgraphs. The tunable value $\delta$ depends on the probability distribution of $\mathrm{Weight}_{\mathrm{edge}}$. For example, the graph in Fig. 3 is partitioned into two subgraphs. The partitioned groups and the weights of nodes are adopted as important parts for interest extraction.
C. User Profiles and Viewing Interests
Based on context clusters and social communities, we can make a rough recommendation for users. More accurate recommendation depends on the user's reading interests in content. A user's reading interests are extracted from his profiles, which keep track of what videos he has viewed. Former research studies build user profiles by exploring three different but related dimensions: topic distribution, similar access patterns, and preferred entities. However, similar access patterns have been discussed previously, and preferred entities are hard to detect in videos. We construct the user's profile from two aspects: video content and video attributes; each aspect includes several dimensions.
1) Video content: It is characterized by a probability vector of keywords in the video title and tags, and the vector is denoted as $\{key_1, pro_1, key_2, pro_2, \ldots\}$. If keywords or tags are synonyms, the related elements are merged.
2) Video attributes: Besides video content, the videos a user is interested in have many specific attributes, such as video length, video resolution, video popularity, and video age, which are denoted as a list $\{va_1, va_2, \ldots\}$. By analyzing historic attribute lists from a user's profiles, we obtain the probability of the user's interest in specific video clusters.
The aforementioned profiles are not only explored by their own users during recommending but also used by other users during CF. On Tudou, 80 million users click 100 million times a day, and the data increase every day. Recommending based on these items would consume great computation and bring large latency. On the other hand, users' interests vary from time to time, and old profiles introduce a lot of noise during recommending. To overcome these problems, we combine the group results in Section III-B, put the profiles of the users in the same group into one set, and apply the k-means clustering algorithm to the set to obtain interest clusters. By doing so, the search space is narrowed down to one cluster while making real-time recommendation. For example, searching the profiles of 10 days requires an n-dimensional query over 1000 million profiles; after the profiles have been divided into $M$ groups with $N$ clusters each on average, the search requires $\log_2 M + \log_2 N$ 1-D searches and an n-dimensional query over $1000\ \text{million}/(M \cdot N)$ profiles on average.
IV. CLOUD-ASSISTED CLUSTERING
As mentioned previously, various clustering algorithms are adopted to analyze user behavior and to obtain the recommending rules. For example, SCA is used to cluster user contexts, graph partition is exploited to get community groups, and k-means is introduced into viewing interest clustering. Although the algorithms are executed offline, the work is still unacceptably time-consuming. Therefore, we deploy the clustering algorithms on Hadoop, a famous MapReduce-based cloud platform provided by Apache. More details are illustrated in Fig. 4.
1) User profiles in HDFS are cut into s chunks. Each chunk includes profiles of different users, and the profiles of the same user may be stored in several chunks. To balance resources and processing latency among Hadoop nodes,
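Returning to the community graph of Section III-B, the edge weight in (1) and the threshold-based partition can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than the paper's implementation: the function names, the example values of α, β, γ, and δ, the zero-denominator guards, and the use of connected components after dropping low-weight edges are all assumptions.

```python
from collections import defaultdict


def edge_weight(mutual_idols, co_comments, comments_a, comments_b,
                co_groups, groups_a, groups_b,
                alpha=0.4, beta=0.3, gamma=0.3):
    """Edge weight following (1); alpha, beta, gamma are example values summing to 1."""
    assert alpha > 0 and beta > 0 and gamma > 0
    assert abs(alpha + beta + gamma - 1.0) < 1e-9
    # Zero-denominator guards are an assumption; the text does not cover this case.
    total_com = comments_a + comments_b
    total_grp = groups_a + groups_b
    com_term = co_comments / total_com if total_com else 0.0
    grp_term = co_groups / total_grp if total_grp else 0.0
    mu_fans = 1 if mutual_idols else 0
    return alpha * (1 + mu_fans) / 2 + beta * com_term + gamma * grp_term


def partition(users, weighted_edges, delta):
    """Drop edges with weight below delta, then return the connected components."""
    adjacency = defaultdict(set)
    for (a, b), weight in weighted_edges.items():
        if weight >= delta:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen, groups = set(), []
    for user in users:
        if user in seen:
            continue
        seen.add(user)
        stack, component = [user], set()
        while stack:  # iterative depth-first traversal
            current = stack.pop()
            component.add(current)
            for neighbor in adjacency[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        groups.append(component)
    return groups


# Hypothetical users "a", "b", "c" with made-up behavior counts.
w_ab = edge_weight(True, 2, 4, 4, 1, 2, 2)   # mutual idols, shared comments and groups
w_bc = edge_weight(False, 0, 1, 1, 0, 1, 1)  # no shared behavior
groups = partition(["a", "b", "c"], {("a", "b"): w_ab, ("b", "c"): w_bc}, delta=0.3)
```

Note that, by (1), an edge between users who are not mutual idols and share no comments or groups still has weight α/2, so δ has to exceed α/2 for such edges to be cut.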
Fig. 12. Recommendation latency comparison of clustering algorithms on cloud or no cloud.
Fig. 13. Real-time recommendation latency comparison.
VII. CONCLUSION AND FUTURE WORK
In this paper, we have proposed a cloud-assisted recommender system for videos. Based on the MapReduce platform, we have analyzed three kinds of user behaviors, including user contexts, interest groups, and user profiles. In line with the different characteristics of the three kinds of information, we adopt SCA, graph partition, and k-means, respectively. Unlike other recommender systems, we store recommendation rules instead of recommendation lists. Additionally, a graph-based rule reordering method is used in real-time recommending. Evaluation shows that the proposed system provides higher quality of recommendation with lower training latency and recommending latency.
In this paper, user profiles have been obtained from co-comment information, but users often make no comment after viewing a video of interest, which leads to errors during clustering. For future work, we plan to handle the data sparsity of user profiles. Another important point that should be studied is the design of a distributed recommendation cache to improve the recommending hit rate. The cache can also reduce the computation pressure caused by the amount of concurrent rule reordering and execution.
ACKNOWLEDGMENT
The authors would like to thank Q. Chen and H. Wu for their work on data collecting and preprocessing.
REFERENCES
[1] C.-F. Lai, Y.-M. Huang, and H.-C. Chao, “DLNA-based multimedia sharing system for OSGI framework with extension to P2P network,” IEEE Syst. J., vol. 4, no. 2, pp. 262–270, Jun. 2010.
[2] K.-D. Chang, C.-Y. Chen, J.-L. Chen, and H.-C. Chao, “Challenges to next generation services in IP multimedia subsystem,” J. Inf. Process. Syst., vol. 6, no. 2, pp. 129–146, Jun. 2010.
[3] D. Li, Q. Lv, X. Xie, L. Shang, H. Xia, T. Lu, and N. Gu, “Interest-based real-time content recommendation in online social communities,” Knowl.-Based Syst., vol. 28, pp. 1–12, Apr. 2012.
[4] X. Wu, Y. Zhang, J. Guo, and J. Li, “Web video recommendation and long tail discovering,” in Proc. IEEE ICME, 2008, pp. 369–372.
[5] D. Poirier, F. Fessant, and I. Tellier, “Reducing the cold-start problem in content recommendation through opinion classification,” in Proc. IEEE/WIC/ACM Int. Conf. WI-IAT, 2010, pp. 204–207.
[6] M.-H. Kuo, L.-C. Chen, and C.-W. Liang, “Building and evaluating a location-based service recommendation system with a preference adjustment mechanism,” Exp. Syst. Appl., vol. 36, no. 2, pp. 3543–3554, Mar. 2009.
[7] Z.-D. Zhao and M.-S. Shang, “User-based collaborative-filtering recommendation algorithms on Hadoop,” in Proc. WKDD, 2010, pp. 478–481.
[8] C. Cordier, F. Carrez, H. Van Kranenburg, C. Licciardi, J. Van der Meer, A. Spedalieri, J. P. Le Rouzic, and J. Zoric, “Addressing the challenges of beyond 3G service delivery: The SPICE service platform,” in Proc. Workshop ASWN, 2006, pp. 1–29.
[9] P. Pawar and A. Tokmakoff, “Ontology-based context-aware service discovery for pervasive environments,” in Proc. IEEE Int. Workshop Service Integr. Pervasive Environ., Jun. 2006, pp. 1–7.
[10] C.-F. Lai, S.-Y. Chang, Y.-M. Huang, J. H. Park, and H.-C. Chao, “A portable uPnP-based high performance content sharing system for supporting multimedia devices,” J. Supercomput., vol. 55, no. 2, pp. 269–283, Feb. 2011.
[11] M. J. Pazzani and D. Billsus, “Content-based recommendation systems,” in The Adaptive Web. Berlin, Germany: Springer-Verlag, 2007, pp. 325–341.
[12] E. Gabrilovich, S. Dumais, and E. Horvitz, “Newsjunkie: Providing personalized newsfeeds via analysis of information novelty,” in Proc. WWW, 2004, pp. 482–490.
[13] L. Li, D. Wang, T. Li, D. Knox, and B. Padmanabhan, “SCENE: A scalable two-stage personalized news recommendation system,” in Proc. SIGIR, 2011, pp. 125–134.
[14] Z. Wang, Y. Tan, and M. Zhang, “Graph-based recommendation on social networks,” in Proc. Int. Asia-Pac. APWEB Conf., 2010, pp. 116–122.
[15] K. Ali and W. van Stam, “TiVo: Making show recommendations using a distributed collaborative filtering architecture,” in Proc. ACM SIGKDD, 2004, pp. 394–410.
[16] T. Hofmann, “Latent semantic models for collaborative filtering,” ACM Trans. Inf. Syst., vol. 22, no. 1, pp. 89–115, Jan. 2004.
[17] Z. Zheng, H. Ma, R. Lyu, and I. King, “WSRec: A collaborative filtering based web service recommender system,” in Proc. IEEE Int. Conf. ICWS, 2009, pp. 437–444.
[18] G. Go, J. Yang, H. Park, and S. Han, “Using online media sharing behavior as implicit feedback for collaborative filtering,” in Proc. IEEE Int. Conf. Social Comput., 2010, pp. 439–445.
[19] Z. N. Chan, W. Gaaloul, and S. Tata, “Collaborative filtering technique for web service recommendation based on user-operation combination,” in Proc. OTM, 2010, pp. 222–239.
[20] S. Baluja, R. Seth, D. Sivakumar, Y. Jing, J. Yagnik, S. Kumar, D. Ravichandran, and M. Aly, “Video suggestion and discovery for YouTube: Taking random walks through the view graph,” in Proc. WWW, 2008, pp. 895–904.
[21] A. C. M. Costa, R. S. S. Guizzardi, G. Guizzardi, and J. G. P. Filho, “COReS: Context-aware, ontology-based recommender system for service recommendation,” in Proc. Ubiquitous Mobile Inf. Collab. Syst., 2007, pp. 1–15.
[22] A. C. G. C. A. Cimino, B. Lazzerini, and F. Marcelloni, “Situation-aware mobile service recommendation with fuzzy logic and semantic web,” in Proc. ISDA, 2009, pp. 1037–1042.
Yijun Mo received the B.Eng. degree in electrical and electronics engineering, the M.Phil. degree, and the Ph.D. degree from Huazhong University of Science and Technology (HUST), Wuhan, China, in 1999, 2001, and 2008, respectively.
Since November 2009, he has been an Associate Professor with HUST. His research interests include wireless networks, semantic networks and service composite, and multimedia communication.
Jianwen Chen (SM’12) received the Ph.D. degree in electrical engineering from Tsinghua University, Beijing, China, in 2007. His Ph.D. research focused on video compression algorithm design, video codec hardware architecture design, and embedded video codec algorithm optimization and implementation.
From 2007 to 2010, he was a Staff Researcher with IBM Research, where he conducted cutting-edge research on wireless communication systems and multicore video-coding architectures. From September 2010 to September 2012, he was with the research group in the Department of Electrical Engineering, University of California, Los Angeles (UCLA), Los Angeles, CA, USA, where he furthered the research on high-efficiency video-coding techniques, wireless networking systems, and high-performance computing systems and applications. Since October 2012, he has been a Senior Visiting Scholar with the Human Visio Research Center of Harvard, where he is focusing on visual quality evaluation, 3D video experience, and media cloud systems. He has authored more than 50 papers. His current research interests include multimedia communication over networks, video coding, and wireless communication network systems.
Dr. Chen has more than 50 standard proposals for MPEG, AVS, and VCEG since 2003. Since February 2012, he has served as the Chairman of the MPEG Internet Video Codec Ad Hoc Group. He was nominated as the Chancellor’s Postdoctoral Researcher of UCLA in 2012. He has served as a reviewer/organizer for many academic journals and conferences, such as the IEEE TRANSACTIONS ON WIRELESS COMMUNICATION, the IEEE TRANSACTIONS ON MULTIMEDIA, the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, the IEEE Visual Communication and Image Processing, the IEEE International Symposium on Circuits and Systems, and the IEEE TRANSACTIONS ON PROFESSIONAL COMMUNICATION.
Changqing Luo received the Ph.D. degree in electrical engineering from Beijing University of Posts and Telecommunications, Beijing, China, in 2011.
He is an Assistant Professor with the School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China. During his Ph.D. study, he was a visiting student with the Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, BC, Canada, for half a year and with the Department of Systems and Computer Engineering, Carleton University, Ottawa, ON, Canada, for half a year. His current research interests include algorithms and optimization for wireless networks, green communication, and mobile cloud computing.
Laurence Tianruo Yang (M’97) received the B.E. degree in computer science from Tsinghua University, Beijing, China, and the Ph.D. degree in computer science from the University of Victoria, Victoria, BC, Canada.
He is a Professor with the School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, and the Department of Computer Science, St. Francis Xavier University, Antigonish, NS, Canada. His current research interests include parallel and distributed computing and embedded and ubiquitous/pervasive computing. His research is supported by the National Sciences and Engineering Research Council and the Canada Foundation for Innovation.