
2015 IEEE International Conference on Consumer Electronics (ICCE)

Music Recommendation System Based on Usage History and Automatic Genre Classification
Jongseol Lee*,**, Saim Shin*, Dalwon Jang*, Sei-Jin Jang* and Kyoungro Yoon**
*Korea Electronics Technology Institute, Gyeong-gi, Korea  **Konkuk University, Seoul, Korea
Abstract--A personalized music recommender helps users find their favorite songs in a huge music database. In order to predict user-favorite songs, managing user-preference information and genre classification are necessary. In our study, a very short feature vector, obtained by low-dimensional projection of already-developed audio features, is used for the music genre classification problem. We apply a distance metric learning algorithm in order to reduce the dimensionality of the feature vector with little performance degradation. We propose a system for the automatic management of user preferences and genre classification in a personalized music recommendation service.

We can access thousands of songs through various delivery mechanisms such as downloading and streaming services. However, it is not practical for a user to choose a desired song from among all these songs because of the immense time the task would consume, so there is an increasing demand for efficient recommendation and retrieval from large digital music databases. Thus, the research area of music information retrieval is receiving increased attention [1, 2, 11].
In the area of music information retrieval, there are challenging tasks including query-by-singing/humming [3], tempo estimation [4], melody transcription [5], cover-song identification [6], music genre classification [7], and so on.
In this paper, we propose the Music Recommendation System Based on Usage History and Automatic Genre Classification, the MusicRecom system, which recommends suitable music to users through a similarity algorithm and automatic genre classification. Users can enjoy preferable music without checking all the music information such as title, singer, and genre. The prediction of preferable music uses the music information, usage history, and genre features. The system gathers each user's listening patterns and generates each user's usage history from them. We compare recommendation using static (manually assigned) genre labels with recommendation using automatic genre classification.
The MusicRecom system consists of four main functional modules. The usage history generator collects users' music usage patterns and classifies them. The genre classification module extracts features and classifies the genre of the music; we utilize the advanced algorithms of [9, 10] for the proposed genre classification. The music information generator collects music information such as Singer, Title, Artist, Composer, and Album Name. The last module is the recommendation engine, which selects the music whose features are most similar to the user's favorite songs.

This work was supported in part by the Ministry of Trade, Industry and Energy (MOTIE) grant funded by the Korean government (No. 10037244).

978-1-4799-7543-3/15/$31.00 2015 IEEE

Fig. 1. The MusicRecom system architecture


Fig. 2. Feature extraction process

The feature extraction process of our system is shown in Fig. 2. First, the input audio is pre-processed with decoding, down-sampling, and mono conversion. The pre-processed audio is framed using a Hamming window of about 23 ms with 50% overlap. From each window, raw features are obtained. In our system, mel-frequency cepstral coefficients (MFCC), decorrelated filter bank (DFB), and octave-based spectral contrast (OSC) features are used. MFCCs represent spectral characteristics based on mel-frequency scaling, and they are widely used in music classification systems. The DFB considers the variation of amplitudes between neighboring bands; it is extracted by subtracting the log spectra of neighboring mel-scale bands. The OSC considers the spectral peak, spectral valley, and spectral contrast in each octave-based subband. After raw feature extraction, the length of the feature vector is 42: 13 for MFCCs, 13 for DFB, and 16 for OSC. After extracting the three features, their statistical values are computed in order to represent temporal variation.
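The framing step above can be sketched in a few lines of NumPy. The sampling rate and the exact frame length are assumptions (the paper states only "about 23 ms with 50% overlap"):

```python
import numpy as np

def frame_signal(audio, sr=22050, frame_ms=23, overlap=0.5):
    """Split a mono signal into Hamming-windowed frames of ~23 ms with 50% overlap."""
    frame_len = int(sr * frame_ms / 1000)   # ~23 ms -> 507 samples at 22.05 kHz
    hop = int(frame_len * (1 - overlap))    # 50% overlap between consecutive frames
    window = np.hamming(frame_len)
    n_frames = 1 + (len(audio) - frame_len) // hop
    frames = np.stack([audio[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return frames                           # shape: (n_frames, frame_len)

# Example: one second of (random) audio
audio = np.random.randn(22050)
frames = frame_signal(audio)
```

Each row of `frames` would then be passed to the MFCC, DFB, and OSC extractors to obtain the 42 raw features per frame.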
We use the mean, variance, feature-based modulation spectral flatness measure, and feature-based modulation spectral crest measure [9]. The length of the feature vector is thus quadrupled, and we get a feature vector of dimension 168. This 168-dim vector can be used as it is, but the feature vector may also be used after dimension reduction, depending on the system design. For applications with low computational power, a short feature vector is necessary. To reduce the feature dimension without performance degradation, distance metric learning is applied in our system [10]. A feature vector of length from 5 to 168 can be used after dimension reduction; in this study, we use 5- and 10-dim feature vectors.
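A minimal sketch of this statistical aggregation, assuming the modulation spectrum is taken as the FFT magnitude of each feature trajectory over time (the exact band handling in [9] may differ):

```python
import numpy as np

def temporal_statistics(feat):
    """Summarize a (n_frames, 42) raw-feature trajectory into a 168-dim vector.

    Four statistics per feature dimension: mean, variance, and the flatness
    and crest measures of the feature-based modulation spectrum, i.e. the
    magnitude spectrum of each feature trajectory over time.
    """
    mean = feat.mean(axis=0)
    var = feat.var(axis=0)
    # Modulation spectrum: FFT magnitude over time (DC removed so the
    # spectrum reflects temporal variation; epsilon avoids log(0)).
    mod = np.abs(np.fft.rfft(feat - mean, axis=0)) + 1e-12
    flatness = np.exp(np.log(mod).mean(axis=0)) / mod.mean(axis=0)  # geometric / arithmetic mean
    crest = mod.max(axis=0) / mod.mean(axis=0)
    return np.concatenate([mean, var, flatness, crest])  # 4 * 42 = 168 dims

feat = np.random.randn(431, 42)   # e.g. ~10 s of frames, 42 raw features each
vec = temporal_statistics(feat)
```

The 168-dim output is what the distance-metric-learning projection of [10] then reduces to 5 or 10 dimensions.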

The usage history generator generates the usage history from users' listening patterns. The pattern collector gathers candidate patterns from the usage patterns and decides the updated positive and negative patterns for classification. We defined the pattern policies as below:
- a) Negative pattern: skipping a song within 15 seconds.
- b) Positive pattern: rewinding or repeating a song.
Users can also comment on the recommended list to indicate whether they want to listen to a song or not. The comment values are 'Very good', 'Good', 'Don't like', and 'Hate'. They can be used as the user's evaluation of the recommendation.
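The pattern policies above can be sketched as follows; the event names ("skip", "rewind", "repeat") and the numeric comment scores are illustrative assumptions, since the paper does not specify the log format:

```python
def classify_pattern(action, position_s):
    """Label one listening event per the pattern policies:
    negative if the song is skipped within 15 seconds,
    positive if it is rewound or repeated, else no history update."""
    if action == "skip" and position_s < 15:
        return "negative"
    if action in ("rewind", "repeat"):
        return "positive"
    return None

# Explicit user comments can likewise be mapped to scores
# (the numeric values here are assumptions):
COMMENT_SCORE = {"Very good": 2, "Good": 1, "Don't like": -1, "Hate": -2}
```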

In order to evaluate users' satisfaction with the MusicRecom system, we propose an MRR (mean reciprocal rank)-based evaluation method, given in Equation (2):

MRR = (1/n) * sum_{i=1}^{n} (1/r_i)    (2)

In Equation (2), r_i denotes the rank at which a preferred song appears in the list produced by the proposed recommendation engine, and n is the number of preferred songs. Table I shows the evaluation results for the contents recommended by MusicRecom. The average MRR in Table I shows that the proposed recommendation system using automatic genre classification increases the users' overall satisfaction with the recommended songs.
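The MRR of Equation (2) is straightforward to compute; a small sketch:

```python
def mean_reciprocal_rank(ranks):
    """MRR = (1/n) * sum(1/r_i) over the ranks r_i at which each
    user-preferred song appears in the recommended list."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# Example: preferred songs found at ranks 1, 2, and 5 of the list
mrr = mean_reciprocal_rank([1, 2, 5])   # (1 + 0.5 + 0.2) / 3 = 0.5666...
```

A higher MRR means the user's preferred songs appear nearer the top of the recommended list.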


The music information generator parses the ID3 tag of a music file or uses an open API to obtain music information; in this work we use the open API. We extract the song title, singer, genre, and composer, and this information is saved with a unique content id.
We use the cosine coefficient to compare the user's preferred songs with the music dataset. The recommendation engine extracts user preference features and dataset features. In Equation (1), we can control the recommendation focus through the preference weights: if a user does not want recommendations based on singers, the singer preference weight can be set to 0. Finally, we sort the candidate songs by preference value and recommend the top 15 songs.

PrefVal_i = sum_k w_k * Sim_k(D_i, T_i)    (1)
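A sketch of the weighted preference score of Equation (1), taking the cosine coefficient as Sim_k; the feature-type names and weight values are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

def cosine(a, b):
    """Cosine coefficient between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def preference_value(user_feats, song_feats, weights):
    """PrefVal = sum_k w_k * Sim_k(D, T) over feature types k."""
    return sum(w * cosine(user_feats[k], song_feats[k])
               for k, w in weights.items())

# Setting a weight to 0 drops that feature type from the recommendation,
# e.g. a user who does not want singer-based recommendations:
weights = {"singer": 0.0, "genre": 0.6, "usage": 0.4}
user = {"singer": np.array([1.0, 0.0]), "genre": np.array([0.0, 1.0]),
        "usage": np.array([1.0, 1.0])}
song = {"singer": np.array([0.0, 1.0]), "genre": np.array([0.0, 1.0]),
        "usage": np.array([1.0, 1.0])}
score = preference_value(user, song, weights)  # 0.0 + 0.6*1.0 + 0.4*1.0 = 1.0
```

Candidate songs would then be sorted by this score and the top 15 recommended.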


TABLE I. Average MRR of the recommended contents for four users: 30s male (A), 30s male (B), 30s female (C), and 30s female (D).
Test1: using manual genre classification
Test2: using automatic genre classification (5-dim feature vector)
Test3: using automatic genre classification (10-dim feature vector)

The MusicRecom system is a personalized music service based on usage history and automatic genre classification. We used 5- and 10-dim feature vectors for genre classification without performance degradation. This system can be applied to various audio devices, apps, and services.



In Equation (1), D_i is the list of music feature vectors of the user's preferred music and T_i is the corresponding feature vector list of a candidate song.

Fig. 3. Implementation results of the MusicRecom system

For the system evaluation, we built a music dataset containing 2,000 files. The dataset covers 8 genres: dance, ballad, trot, children, rock, R&B, pops, and carol. The length of each song varies between 4 and 6 minutes, except for the children and carol genres. We gathered usage history in the MusicRecom system for 5 days and then learned the user preferences of 4 users. The assumption in the evaluation is that the music each user played or commented on is that user's answer set, and we compared these answers with the songs recommended by the MusicRecom system.


[1] J. S. Downie, "Music information retrieval," Annual Review of Information Science and Technology, vol. 37, pp. 295-340, 2003.
[2] R. Typke, F. Wiering, and R. Veltkamp, "A survey of music information retrieval systems," Proc. ISMIR, pp. 153-160, 2005.
[3] D. Jang, C.-J. Song, S. Shin, S.-J. Park, S.-J. Jang, and S.-P. Lee, "Implementation of a matching engine for a practical query-by-singing/humming system," Proc. ISSPIT, pp. 258-263, 2011.
[4] S. W. Hainsworth and M. D. Macleod, "Particle filtering applied to musical tempo tracking," EURASIP J. Applied Signal Processing, vol. 15, pp. 2385-2395, 2004.
[5] S. Jo and C. D. Yoo, "Melody extraction from polyphonic audio based on particle filter," Proc. ISMIR, pp. 357-362, 2010.
[6] D. P. W. Ellis and G. E. Poliner, "Identifying cover songs with chroma features and dynamic programming beat tracking," Proc. Int. Conf. Acoustics, Speech and Signal Processing, Honolulu, HI, 2007.
[7] G. Tzanetakis and P. Cook, "Musical genre classification of audio signals," IEEE Trans. Speech Audio Process., vol. 10, no. 5, pp. 293-302, 2002.
[8] T. Li, M. Ogihara, and Q. Li, "A comparative study on content-based music genre classification," Proc. ACM Conf. on Research and Development in Information Retrieval, pp. 282-289, 2003.
[9] S.-C. Lim, S.-J. Jang, S.-P. Lee, and M. Y. Kim, "Music genre/mood classification using a feature-based modulation spectrum," Proc. IEEE Int. Conf. Mobile IT Convergence, pp. 133-136, 2011.
[10] D. Jang and S.-J. Jang, "Very short feature vector for music genre classification based on distance metric learning," Proc. ICALIP, 2014.
[11] S. Shin, D. Jang, J. J. Lee, S.-J. Jang, and J.-H. Kim, "MyMusicShuffler: Mood-based music recommendation with the practical usage of brainwave signals," Proc. ICCE, 2014.