Alexandros Karatzoglou
Research Scientist @ Telefonica Research, Barcelona
alexk@tid.es
@alexk_z
RecSys 2013: xCLiMF: Optimizing Expected Reciprocal Rank for Data with Multiple Levels of Relevance
ECML/PKDD 2013: Socially Enabled Preference Learning from Implicit Feedback Data
AAAI 2013 Workshop: Games of Friends: a Game-Theoretical Approach for Link Prediction in Online Social Networks
CIKM 2012: Climbing the App Wall: Enabling Mobile App Discovery through Context-Aware Recommendations
RecSys 2012: CLiMF: Learning to Maximize Reciprocal Rank with Collaborative Less-is-More Filtering * Best Paper Award
RecSys 2011: Implicit Feedback Recommendation via Implicit-to-Explicit Ordinal Logistic Regression Mapping
RecSys 2010: Multiverse Recommendation: N-dimensional Tensor Factorization for Context-Aware Collaborative Filtering
Machine Learning Journal, 2008: Improving Maximum Margin Matrix Factorization * Best Machine Learning Paper Award at ECML PKDD 2008
NIPS 2007: CoFiRank - Maximum Margin Matrix Factorization for Collaborative Ranking
C := {users}
S := {recommendable items}
u := utility function, measures the usefulness of item s to user c,
u : C × S → R
where R := {recommended items}.
For each user c, we want to choose the items
s that maximize u.
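As a minimal sketch of this selection step, assume the utility values u(c, s) for one user are already available as a dictionary. The item names and numbers below are made up for illustration, and `recommend` is a hypothetical helper, not part of any library.

```python
from typing import Dict, List

# Hypothetical utility values u(c, s) for a single user c (made-up numbers).
utility: Dict[str, float] = {"item_a": 0.2, "item_b": 0.9, "item_c": 0.5}


def recommend(utility_for_user: Dict[str, float], n: int = 1) -> List[str]:
    """Return the n items s with the highest utility u(c, s) for this user."""
    ranked = sorted(utility_for_user, key=utility_for_user.get, reverse=True)
    return ranked[:n]


print(recommend(utility, n=2))  # ['item_b', 'item_c']
```

In practice u is unknown for most (c, s) pairs, so the hard part is estimating it; the approaches in the rest of the tutorial are different ways of doing that.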
A good recommendation is diverse:
it represents all the possible interests of one user
What matters?
Data preprocessing: outlier removal, denoising, removal of global effects
Smart dimensionality reduction
Combining methods
Each user has expressed an opinion for some items:
Explicit opinion: rating score
Implicit: purchase records or listened-to tracks
[Figure: sparse user-item rating matrix; known ratings per user row: (2 4 5), (5 4 1), (5 2), (1 5 4), (4 2), (4 5 1)]
1. Identify the set of items rated by the target user
2. Identify which other users rated 1+ items in this set (neighborhood formation)
[Figure: predicted rating]
Known ratings    sim(u,v)
2 4 5            NA
5 4 1            0.87
5 2              1
1 5 4            -1
4 2
4 5 1            NA
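One common way to obtain sim(u,v) values like these is the Pearson correlation over the items both users have rated. The sketch below assumes that variant (means taken over the co-rated items only) and returns `None` where the similarity is undefined, which plays the role of the NA entries. The ratings and item ids are hypothetical and are not the matrix from the slides.

```python
from math import sqrt
from typing import Dict, Optional

Ratings = Dict[str, float]  # item id -> rating


def pearson_sim(u: Ratings, v: Ratings) -> Optional[float]:
    """Pearson correlation over co-rated items; None stands in for NA."""
    common = sorted(set(u) & set(v))
    if len(common) < 2:
        return None  # fewer than two co-rated items: undefined
    mean_u = sum(u[i] for i in common) / len(common)
    mean_v = sum(v[i] for i in common) / len(common)
    du = [u[i] - mean_u for i in common]
    dv = [v[i] - mean_v for i in common]
    denom = sqrt(sum(x * x for x in du)) * sqrt(sum(y * y for y in dv))
    if denom == 0:
        return None  # one user rated all co-rated items identically
    return sum(x * y for x, y in zip(du, dv)) / denom


target = {"i1": 4, "i2": 2}
print(pearson_sim(target, {"i1": 5, "i2": 2, "i3": 1}))  # close to +1
print(pearson_sim(target, {"i1": 1, "i2": 5}))           # close to -1
print(pearson_sim(target, {"i3": 4}))                    # None
```

With only two co-rated items the correlation can only be +1 or -1, which is one reason implementations often require a minimum overlap or shrink similarities computed from few items.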
Target item: the item for which the CF prediction task is performed.
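Given user similarities, a widely used way to predict the target user's rating for the target item is a similarity-weighted average of the neighbors' mean-centered ratings, added to the target user's own mean. The sketch below assumes that formulation; the slides may use a different variant, and all names and numbers are hypothetical.

```python
from typing import Dict, Optional

Ratings = Dict[str, float]  # item id -> rating


def mean(r: Ratings) -> float:
    return sum(r.values()) / len(r)


def predict(target: Ratings, neighbors: Dict[str, Ratings],
            sims: Dict[str, float], item: str) -> Optional[float]:
    """Mean-centered, similarity-weighted prediction for one target item."""
    num = 0.0
    den = 0.0
    for name, ratings in neighbors.items():
        # Only neighbors who rated the target item and have a defined
        # similarity to the target user contribute.
        if item not in ratings or name not in sims:
            continue
        num += sims[name] * (ratings[item] - mean(ratings))
        den += abs(sims[name])
    if den == 0:
        return None  # no usable neighbor
    return mean(target) + num / den


target = {"i1": 4, "i2": 2}
neighbors = {
    "a": {"i1": 5, "i2": 2, "i3": 5},
    "b": {"i1": 1, "i2": 5, "i3": 3},
}
sims = {"a": 1.0, "b": -1.0}
print(predict(target, neighbors, sims, "i3"))  # 3.5
```

Subtracting each neighbor's mean compensates for users who rate systematically high or low.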
Alexandros Karatzoglou September 06, 2013 Recommender Systems
Example: Item-based CF
Known ratings    Predicted rating (*)
2 4 5            2.94*
5 4 1
5 2              2.48*
1 5 4
4 2
4 5 1            1.12*
sim(i,j): -1, -1, 0.86, 1, NA
sim(6,5) cannot be calculated
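In item-based CF the prediction for a user and a target item is typically a weighted sum of that user's own ratings, weighted by the similarity of each rated item to the target item. The sketch below assumes one common variant that ignores items with undefined or non-positive similarity; it is not claimed to reproduce the starred values above, and the data are hypothetical.

```python
from typing import Dict, Optional


def predict_item_based(user_ratings: Dict[str, float],
                       sims_to_target: Dict[str, float]) -> Optional[float]:
    """Weighted average of the user's ratings over positively similar items."""
    num = 0.0
    den = 0.0
    for item, rating in user_ratings.items():
        s = sims_to_target.get(item)  # missing entry mirrors an NA similarity
        if s is None or s <= 0:
            continue
        num += s * rating
        den += s
    return num / den if den > 0 else None


# Hypothetical user ratings and similarities sim(i, j) to the target item.
user_ratings = {"i1": 2, "i3": 4, "i4": 5}
sims_to_target = {"i1": -1.0, "i3": 0.5, "i4": 1.0}
print(predict_item_based(user_ratings, sims_to_target))  # (0.5*4 + 1.0*5) / 1.5
```

Because item similarities change slowly, they can be precomputed, leaving only this weighted sum to be evaluated at request time.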
Item Similarity Computation
Pearson r correlation-based Similarity
does not account for user rating biases
Cosine-based Similarity
does not account for user rating biases
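To illustrate the rating-bias issue, the sketch below compares plain cosine similarity between two items with adjusted cosine, a standard variant (not listed in the text above) that subtracts each user's mean rating before comparing. The two users, one who rates low and one who rates high, are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Optional

Matrix = Dict[str, Dict[str, float]]  # user -> {item -> rating}


def _cosine(xs: List[float], ys: List[float]) -> Optional[float]:
    denom = sqrt(sum(x * x for x in xs)) * sqrt(sum(y * y for y in ys))
    if denom == 0:
        return None  # undefined (no co-rating users or a zero vector)
    return sum(x * y for x, y in zip(xs, ys)) / denom


def item_cosine(m: Matrix, i: str, j: str) -> Optional[float]:
    """Cosine between the raw rating vectors of items i and j."""
    users = [u for u in m if i in m[u] and j in m[u]]
    return _cosine([m[u][i] for u in users], [m[u][j] for u in users])


def item_adjusted_cosine(m: Matrix, i: str, j: str) -> Optional[float]:
    """Cosine after subtracting each user's mean rating (removes user bias)."""
    users = [u for u in m if i in m[u] and j in m[u]]
    means = {u: sum(m[u].values()) / len(m[u]) for u in users}
    return _cosine([m[u][i] - means[u] for u in users],
                   [m[u][j] - means[u] for u in users])


ratings: Matrix = {
    "harsh": {"a": 1, "b": 2, "c": 3},
    "generous": {"a": 5, "b": 3, "c": 4},
}
print(item_cosine(ratings, "a", "b"))           # high, driven by rating scale
print(item_adjusted_cosine(ratings, "a", "b"))  # negative once bias is removed
```

Here the raw cosine is high mostly because both vectors are positive, while the bias-corrected value shows that the two users disagree about the relative merit of items a and b.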
and the previous is used to update the weights w_ij and thresholds θ_j
Sparsity:
it is hard to find users who rated the same items.
Popularity Bias:
Cannot recommend items to users with unique tastes.
Tends to recommend popular items.
Textual content
e.g. for a book:
title,
description,
table of contents
In Content-Based Recommendations...
The recommended items for a user are based on the profile built up by analysing the content of the items the user has liked in the past
Content-Based Recommendation
Suitable for text-based products (web pages, books)
Items are described by their features (e.g. keywords)
Users are described by the keywords in the items they
bought
Recommendations based on the match between the
content (item keywords) and user keywords
The user model can also be a classifier (Neural Networks, SVM, Naïve Bayes...)
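The keyword-matching idea can be sketched as follows: the user profile is the bag of keywords of the items the user bought, and each candidate item is scored by how strongly its keywords appear in that profile. The catalogue, keyword sets, and scoring rule are assumptions made for illustration only.

```python
from collections import Counter
from typing import Dict, List, Set

# Hypothetical catalogue: item -> keywords describing it.
items: Dict[str, Set[str]] = {
    "book_a": {"data", "mining", "crm"},
    "book_b": {"marketing", "research"},
    "book_c": {"data", "mining", "website"},
    "book_d": {"consumer", "behavior"},
}


def user_profile(bought: List[str]) -> Counter:
    """Keyword counts over all items the user bought."""
    profile: Counter = Counter()
    for item in bought:
        profile.update(items[item])
    return profile


def recommend(bought: List[str], n: int = 1) -> List[str]:
    """Rank unseen items by the overlap between item and user keywords."""
    profile = user_profile(bought)
    candidates = [i for i in items if i not in bought]
    scores = {i: sum(profile[k] for k in items[i]) for i in candidates}
    return sorted(candidates, key=lambda i: scores[i], reverse=True)[:n]


print(recommend(["book_a"]))  # ['book_c']
```

Replacing the overlap score with a trained classifier over the same keyword features gives the classifier-based user model mentioned above.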
[Table: term counts (COUNT) and normalized TF-IDF vectors (TFIDF Normed Vectors) for book titles over the vocabulary: applications, Introduction, technology, Knowledge, relationship, Handbook, Consumer, Marketing, Mastering, customer, Research, behavior, science, Building, website, mining, using, data, CRM, your, and, the, art, for, to, of, a]
Nonzero TF-IDF entries per title:
Marketing Research: a Handbook    0.537 0.537 0.368 0.537
Customer Knowledge Management     0.381 0.736 0.522
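A minimal sketch of how normalized TF-IDF vectors of this kind can be computed, assuming raw term counts for tf and idf = log(N / df). Other weightings exist, so this is not expected to reproduce the numbers in the table. Apart from "Marketing Research: a Handbook", the titles are hypothetical ones assembled from the vocabulary.

```python
from collections import Counter
from math import log, sqrt
from typing import Dict, List

titles = [
    "building data mining applications for crm",
    "data mining your website",
    "introduction to marketing",
    "marketing research a handbook",
]


def tfidf_vectors(docs: List[str]) -> List[Dict[str, float]]:
    """Return one L2-normalized TF-IDF vector (term -> weight) per document."""
    tokenized = [d.split() for d in docs]
    # Document frequency: in how many documents each term occurs.
    df = Counter(t for toks in tokenized for t in set(toks))
    n = len(docs)
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = {t: tf[t] * log(n / df[t]) for t in tf}
        norm = sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else vec)
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Dot product of two unit-length vectors, i.e. their cosine similarity."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


vecs = tfidf_vectors(titles)
print(cosine(vecs[0], vecs[1]))  # positive: both titles share "data" and "mining"
print(cosine(vecs[0], vecs[2]))  # zero: the titles share no terms
```

Because the vectors are normalized to unit length, matching a user profile against item vectors reduces to a dot product.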
Post-filtering
+ Single model
+ Takes into account context interactions
- Computationally expensive
- Increases data sparseness
- Does not model the Context directly
Tensor Factorization
+ Performance
+ Linear scalability
+ Models context directly
Index
1. Introduction: What is a Recommender System?
2. Approaches
1. Collaborative Filtering
2. Content-based Recommendations
3. Context-aware Recommendations
4. Other Approaches
5. Hybrid Recommender Systems
3. Research Directions
4. Conclusions
5. References
[Figure: final ranking as a function of popularity and predicted rating]
[Figure fragments: 7/log(3), 31/log(5), 1/log(5), 7/log(7)]
2) Pairwise
Loss function is defined on pairwise preferences
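As one example of a loss defined on pairwise preferences, the sketch below uses a logistic loss on the score difference of each (preferred, other) pair. This particular loss is an assumption chosen for illustration; hinge or exponential losses on the same difference are equally valid pairwise choices. Item ids and scores are hypothetical.

```python
from math import exp, log
from typing import Dict, List, Tuple


def pairwise_logistic_loss(scores: Dict[str, float],
                           prefs: List[Tuple[str, str]]) -> float:
    """Average logistic loss over pairs (preferred, other).

    The loss for a pair is small when the preferred item is scored well
    above the other item, and large when the order is inverted.
    """
    total = 0.0
    for preferred, other in prefs:
        margin = scores[preferred] - scores[other]
        total += log(1.0 + exp(-margin))
    return total / len(prefs)


prefs = [("a", "b")]  # the user prefers item a over item b
print(pairwise_logistic_loss({"a": 2.0, "b": 0.0}, prefs))  # small loss
print(pairwise_logistic_loss({"a": 0.0, "b": 2.0}, prefs))  # larger loss
```

Only the relative order of the two scores matters here, not how close each score is to an observed rating.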
Scores 0.75, 0.64, 0.58, 0.55: F_i = RR = 0.5
Scores 0.80, 0.63, 0.52, 0.50: F_i = RR = 0.5
Scores 0.82, 0.62, 0.52, 0.49: F_i = RR = 1
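The reciprocal rank (RR) of a ranked list is 1 divided by the rank of the first relevant item. The sketch below computes it from predicted scores; the item ids and the choice of which item is relevant are hypothetical, picked so that the first relevant item sits at rank 2 and then at rank 1.

```python
from typing import Dict, Set


def reciprocal_rank(scores: Dict[str, float], relevant: Set[str]) -> float:
    """1 / rank of the first relevant item when sorting by descending score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0  # no relevant item in the list


# Relevant item ranked second: RR = 0.5
print(reciprocal_rank({"a": 0.75, "b": 0.64, "c": 0.58, "d": 0.55}, {"b"}))  # 0.5
# Relevant item ranked first: RR = 1
print(reciprocal_rank({"a": 0.49, "b": 0.82, "c": 0.62, "d": 0.52}, {"b"}))  # 1.0
```

RR only changes when the order of items changes, so it is piecewise constant in the scores; this is why methods that optimize it work with a smooth approximation instead.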
Born This Way (Lady Gaga), Pink Friday (Nicki Minaj), Dangerously in Love (Beyoncé), Born This Way The Remix (Lady Gaga), Femme Fatale (Britney Spears), Can't be Tamed (Miley Cyrus), Teenage Dream (Katy Perry)
Wrecking Ball (B. Springsteen), Not your Kind of People (Garbage), Like a Prayer (Madonna), Choice of Weapon (The Cult), Sweet Heart Sweet Light (Spiritualized), The Light the Dead See (Soulsavers), Little Broken Hearts (Norah Jones)
Recommender -> top 5 (not diverse) -> Re-ranking -> top 5 (diverse)
[Figure labels: action, comedy]
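A re-ranking step of this kind can be sketched as a greedy selection that discounts a candidate's score for every already selected item of the same genre. The penalty scheme, the `penalty` value, and the movie data are assumptions for illustration; this is not the specific re-ranking method of the tutorial.

```python
from typing import List, Tuple

Candidate = Tuple[str, str, float]  # (item, genre, relevance score)


def rerank(candidates: List[Candidate], k: int, penalty: float = 0.3) -> List[str]:
    """Greedily pick k items, trading relevance against genre redundancy."""
    remaining = list(candidates)
    picked: List[Candidate] = []
    while remaining and len(picked) < k:
        def adjusted(c: Candidate) -> float:
            # Discount the score once per already picked item of the same genre.
            same_genre = sum(1 for p in picked if p[1] == c[1])
            return c[2] - penalty * same_genre
        best = max(remaining, key=adjusted)
        picked.append(best)
        remaining.remove(best)
    return [p[0] for p in picked]


movies = [
    ("m1", "action", 0.9),
    ("m2", "action", 0.85),
    ("m3", "comedy", 0.7),
    ("m4", "action", 0.8),
]
print(rerank(movies, k=3))               # ['m1', 'm3', 'm2']
print(rerank(movies, k=3, penalty=0.0))  # ['m1', 'm2', 'm4'], plain top 3
```

With the penalty set to zero the function falls back to the plain top-k list, which makes the accuracy and diversity trade-off easy to tune.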
Trust for CF
Use trust to give more weight to some users
Use trust in place of (or combined with)
similarity
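Both options can be sketched with a single weighting function: with `alpha = 1` trust replaces similarity, and with `alpha` between 0 and 1 the two are blended linearly. The linear blend, the `alpha` parameter, and the data are assumptions made for illustration.

```python
from typing import Dict, Optional


def trust_weighted_prediction(neighbor_ratings: Dict[str, float],
                              trust: Dict[str, float],
                              sim: Optional[Dict[str, float]] = None,
                              alpha: float = 1.0) -> Optional[float]:
    """Weighted average of neighbor ratings for one item.

    Weight of a neighbor = alpha * trust + (1 - alpha) * similarity.
    """
    num = 0.0
    den = 0.0
    for user, rating in neighbor_ratings.items():
        w = alpha * trust.get(user, 0.0)
        if sim is not None:
            w += (1.0 - alpha) * sim.get(user, 0.0)
        if w <= 0:
            continue  # untrusted or dissimilar neighbors are ignored
        num += w * rating
        den += w
    return num / den if den > 0 else None


ratings = {"ann": 5, "bob": 1}
trust = {"ann": 0.9, "bob": 0.1}
sim = {"ann": 0.1, "bob": 0.9}
# Trust only: the prediction is pulled towards ann's rating.
print(trust_weighted_prediction(ratings, trust))
# Equal blend of trust and similarity: both neighbors weigh the same.
print(trust_weighted_prediction(ratings, trust, sim, alpha=0.5))
```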
Switching
The system uses a criterion to switch between
techniques
The main problem is to identify a good switching
criterion.
e.g.
The DailyLearner system uses a content-based (CB) / collaborative filtering (CF) hybrid: when CB cannot predict with sufficient confidence, it switches to CF.
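A switching hybrid can be sketched as a wrapper that asks the content-based predictor first and falls back to collaborative filtering when the reported confidence is below a threshold. The predictor interfaces, the confidence scale, and the `min_confidence` value are hypothetical and do not describe how DailyLearner is implemented.

```python
from typing import Callable, Tuple

# Hypothetical interfaces: CB returns (rating, confidence in [0, 1]); CF returns a rating.
CBPredictor = Callable[[str, str], Tuple[float, float]]
CFPredictor = Callable[[str, str], float]


def switching_predict(cb: CBPredictor, cf: CFPredictor, user: str, item: str,
                      min_confidence: float = 0.5) -> float:
    """Use the CB prediction when it is confident enough, otherwise switch to CF."""
    rating, confidence = cb(user, item)
    if confidence >= min_confidence:
        return rating
    return cf(user, item)


# Stub predictors with made-up outputs.
def cb_stub(user: str, item: str) -> Tuple[float, float]:
    return (4.0, 0.9) if item == "known_topic" else (3.0, 0.1)


def cf_stub(user: str, item: str) -> float:
    return 2.5


print(switching_predict(cb_stub, cf_stub, "u1", "known_topic"))  # 4.0
print(switching_predict(cb_stub, cf_stub, "u1", "new_topic"))    # 2.5
```

The threshold is the switching criterion, and as noted above choosing it well is the main difficulty.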
Questions?
Alexandros Karatzoglou
alexk@tid.es
@alexk_z