Claudio Biancalana, Fabio Gasparetti, Alessandro Micarelli, Alfonso Miola and Giuseppe Sansonetti
Department of Computer Science and Automation, ROMA TRE University, Via della Vasca Navale 79, Rome, Italy
{claudio.biancalana, gaspare, micarel, miola, gsansone}@dia.uniroma3.it
Abstract: Recommender systems provide suggestions for items (e.g., movies or songs) to be of use to a user. They must
take into account information to deliver more useful (perceived) recommendations. Current music recommenders
take an initial input of a song and play music with similar characteristics, or music that other users
have listened to along with the input song. Listening behaviors in terms of temporal information associated with
ratings or playbacks are usually ignored. We propose a recommender that predicts the most rated songs that
a given user is likely to play in the future by analyzing and comparing user listening habits by means of signal
processing techniques.
Recommender systems provide suggestions based on user preferences in order to recommend items likely to be of interest to a user. It is obvious that user preferences are influenced by the current context, such as the current time of the day, mood, or current activities. Nevertheless, few recommender systems explicitly include this information in the preference models.

A special group of recommender systems are the ones based on the collaborative approach (Resnick et al., 1994; Shardanand and Maes, 1995; Breese et al., 1998). The system generates recommendations using only information about rating profiles for different users. Collaborative systems locate peer users with a rating history similar to the current user and generate recommendations using this neighborhood.

Collaborative filtering (CF) systems have been successful in several recommender systems. The availability of large datasets and additional information that is easily collectable from the web makes this task interesting.

There are several issues that do not allow us to directly apply the traditional CF approach for music recommendation. The space of possible items (i.e., tracks) can be very large and, similarly, the user space can also be enormous. Often user ratings are not available or they cover only a small subset of the user library of songs. Moreover, when new users enter the system or new songs are added to the global library, it is not possible to provide any recommendation for them due to the lack of any preference information (the so-called cold-start problem). There is no chance to use taxonomies or ontologies to represent the new items and facilitate the clustering, as happens in different domains (e.g., (Acampora et al., 2010a; Micarelli et al., 2009)). Content-based approaches collect information describing the items and then, based on the user preferences, predict which tracks the user might enjoy (see, for example, the Pandora service, www.pandora.com). The key component of this approach is the similarity function among the songs. Nevertheless, there is a strong limitation of the high-level descriptors that can be automatically extracted from the tracks (Celma, 2010).

One more relevant issue that traditional CF approaches do not take into consideration is the listening behavior of the user in terms of temporal information. The timestamp of an item (i.e., when the song is played) is an important factor for the recommendation algorithm. Usually, the prediction function treats the older items as less relevant than the new ones, but any further reasoning about the temporal information is simply ignored.

In this paper, we discuss a recommendation approach based on signal processing. In particular, a traditional CF approach is enhanced by considering an improved similarity function between users. The user listening habits are represented by signals. Wavelet theory is used to study the related time-frequency representations of signals and draw similarity between listening behaviors. Signal processing techniques are not employed to extract features from the songs, but for representing and comparing those behaviors in order to group similar users together. This is the novelty of the approach in comparison with the current literature.

The rest of this paper is organized as follows. Section 1 briefly introduces some related studies on music recommendation. Section 2 details our proposed approaches. Last, in Section 3 a brief account of the testbed we are developing for the evaluation is given. Conclusions close the discussion.

1 RELATED WORK

Many algorithms have been developed to address the personalized recommendation problem. Content-based approaches aim at including different sources of information (Semeraro et al., 2009; Groh and Ehmig, 2007; Micarelli et al., 2006) or better modelling the user interests (Gasparetti and Micarelli, 2007). User-based collaborative filtering (CF) is widely used, and the main idea is to find the items liked by other people with similar taste. Different from the user-based CF, the item-based CF recommends the items which are similar to the user's collected items (Schafer et al., 2007). Context-aware high-level frameworks (e.g., (Acampora et al., 2010b; Gaeta et al., 2009)) are not easily adaptable to this specific domain because of the peculiar characteristics of the items. For example, in (Biancalana et al., 2011a) the authors devise a neural network context-aware recommender extracting different features from points of interest. In the music scenario, techniques that automatically extract features from the played songs are not easily conceivable.

As for music recommendation, the most comprehensive survey of the literature is to be found in (Celma, 2010). The author groups the recommendation approaches into four categories: (1) collaborative filtering, based on explicit or implicit feedback; (2) content-based filtering, by means of manual or automatic feature extraction; (3) context-based filtering, which takes advantage of potential user tags associated with each single song; and (4) hybrid approaches that combine more than one of the above-mentioned ones.

To the best of our knowledge, there are currently no attempts to include temporal behavior in user habits in the music recommendation task. A preliminary attempt has been suggested for the movie domain in (Biancalana et al., 2011b). The proposed approach can be categorized as context-based, where the similarity of different songs is evaluated according to the implicit listening behavior that the user exhibits.

2 WAVELET-BASED RECOMMENDATION

Traditional user-based CF approaches rely on similar users who have similar rating patterns; that is, the prediction of a rating $r_{u,s}$ by user $u$ for the track $track_k$ is evaluated as an aggregate of the ratings of some other users for the same item $track_k$. We call these similar users neighbors. If a user $v$ is similar to a user $u$, we say that $v$ is a neighbor of $u$. User-based algorithms generate a prediction for a track $track_k$ by analyzing ratings for $track_k$ from users in $u$'s neighborhood.

In order to draw the distance (or similarity) between two users, the Pearson correlation coefficient is usually employed (Resnick et al., 1994):

$$ sim(u, v) = \frac{\sum_{s \in S_{u,v}} (r_{u,s} - \bar{r}_u)(r_{v,s} - \bar{r}_v)}{\sqrt{\sum_{s \in S_{u,v}} (r_{u,s} - \bar{r}_u)^2 \sum_{s \in S_{u,v}} (r_{v,s} - \bar{r}_v)^2}} \tag{1} $$

where $S_{u,v}$ denotes the set of co-rated items between $u$ and $v$, $r_{u,s}$ is the rating of the user $u$ for the item $s$, and $\bar{r}_u$ is the average of the ratings of the user $u$.

Pearson correlation ranges from 1.0 for users with perfect agreement to -1.0 for users with perfect disagreement. In this way, it is possible to generate a prediction of rating for the user $u$ and the item $s$ as follows:

$$ pred(u, s) = \bar{r}_u + \frac{\sum_{v \in NN_u} sim(u, v)(r_{v,s} - \bar{r}_v)}{\sum_{v \in NN_u} sim(u, v)} \tag{2} $$

where $NN_u$ is the set of users in $u$'s neighborhood.

The proposed recommendation approach is enhanced by considering a user similarity function that analyzes contextual factors that are included in the data collected during the normal usage of the recommender system, in particular the timestamps associated with playbacks.

In our recommender we employ discrete wavelet transforms (DWT). The basic principles of wavelet theory were put forth in a paper by Gabor (Gabor, 1946). In comparison with the Fourier transform, wavelets are localized in both time (or location) and frequency instead of just frequency. A wavelet is a function used to decompose a time signal into different scale components. Usually one can assign a frequency range to each scale component. Each scale component can then be studied with a resolution that matches its scale. The DWT is computed by successive lowpass and highpass filtering of the discrete time-domain signal as shown in Fig. 1. This is called the Mallat pyramid algorithm, a computationally efficient method of implementing the wavelet transform.
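
As an illustration of the successive lowpass and highpass filtering described above, one level of the pyramid can be written out for the Haar filter pair. The symbols $a[n]$ (lowpass, or approximation, output) and $d[n]$ (highpass, or detail, output) and the orthonormal scaling by $1/\sqrt{2}$ are assumptions made for this example only; they are one common convention, not notation taken from the paper.

```latex
% One level of the Haar pyramid on a signal x[0..N-1], N even:
a[n] = \frac{x[2n] + x[2n+1]}{\sqrt{2}}, \qquad
d[n] = \frac{x[2n] - x[2n+1]}{\sqrt{2}}, \qquad n = 0, \dots, N/2 - 1

% Worked example, x = (4, 2, 5, 5):
%   level 1:  a = (3\sqrt{2},\ 5\sqrt{2}),   d = (\sqrt{2},\ 0)
%   level 2 (applied to a):  a' = (8),        d' = (-2)
% Full transform: (8,\ -2,\ \sqrt{2},\ 0)
```

Each level halves the length of the approximation and is applied again to it, so the coarse coefficient summarizes the whole signal while the detail coefficients capture changes at progressively finer time scales.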
The input signal is assumed to be a set of discrete-time samples, i.e., a sequence $x[n]$, where $n$ is an integer. Whereas the basis function of the Fourier transform is a sinusoid, the wavelet basis is a set of functions. In our approach we decide to employ the popular Haar wavelets.

Algorithm 1 Similarity between users u and v
  for all track_k ∈ L do
    vv_{u,k}[t_h] ← number of times the song track_k has been played by the user u in the time interval t_h < t < t_h + T
    vv_{v,k}[t_h] ← number of times the song track_k has been played by the user v in the time interval t_h < t < t_h + T
  end for
  for all track_k ∈ L do
    w_{u,k} ← discrete Haar wavelet transform of signal vv_{u,k}
    w_{v,k} ← discrete Haar wavelet transform of signal vv_{v,k}
  end for
  sim_{u,v} ← Euclidean distance between the two vectors w_u and w_v
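
The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dictionary layout of the playback data, the start time `t0`, the number of intervals `n_bins` (assumed to be a power of two), the half-open binning of intervals, and the orthonormal Haar scaling are all hypothetical choices introduced here. The vectors w_u and w_v are taken to be the concatenation of the per-track coefficients, which the source does not state explicitly.

```python
import math


def play_count_signal(timestamps, t0, T, n_bins):
    """Count playbacks in n_bins consecutive intervals of length T from t0.

    Assumption: intervals are half-open, [t_h, t_h + T).
    """
    signal = [0.0] * n_bins
    for t in timestamps:
        h = int((t - t0) // T)
        if 0 <= h < n_bins:
            signal[h] += 1.0
    return signal


def haar_dwt(signal):
    """Full orthonormal Haar DWT; the length must be a power of two."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = list(signal)
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Highpass (detail) output; coarser levels are placed in front.
        details = [(a - b) / math.sqrt(2) for a, b in pairs] + details
        # Lowpass (approximation) output, fed to the next level.
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx + details


def wavelet_distance(plays_u, plays_v, tracks, t0, T, n_bins):
    """Euclidean distance between the Haar coefficients of two users.

    plays_u and plays_v map a track id to a list of playback timestamps
    (hypothetical layout).
    """
    squared = 0.0
    for k in tracks:
        w_u = haar_dwt(play_count_signal(plays_u.get(k, []), t0, T, n_bins))
        w_v = haar_dwt(play_count_signal(plays_v.get(k, []), t0, T, n_bins))
        squared += sum((a - b) ** 2 for a, b in zip(w_u, w_v))
    return math.sqrt(squared)


# Hypothetical playback data: timestamps in hours for a single track.
plays_u = {"track_a": [0.5, 1.5, 1.7, 3.2]}
plays_v = {"track_a": [0.1, 2.5]}
distance = wavelet_distance(plays_u, plays_v, ["track_a"], t0=0.0, T=1.0, n_bins=4)
```

A smaller distance indicates more similar listening behavior. One property of this particular sketch is worth noting: because the transform is orthonormal and every coefficient is kept, the distance equals the Euclidean distance between the raw play-count signals. The source does not say whether coefficients are truncated or weighted by scale, so any such step would be a further assumption.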