IJDIWC - Volume 6, Issue 4

ISSN 2225-658X (Online)
ISSN 2412-6551 (Print)
Volume 6, Issue 4
2016
International Journal of
DIGITAL INFORMATION AND WIRELESS COMMUNICATIONS
Editors-in-Chief
Prof. Hocine Cherifi, Universite de Bourgogne, France
Dr. Rohaya Latip, University Putra Malaysia, Malaysia

Editorial Board
Ali Sher, American University of Ras Al Khaimah, UAE
Altaf Mukati, Bahria University, Pakistan
Andre Leon S. Gradvohl, State University of Campinas, Brazil
Azizah Abd Manaf, Universiti Teknologi Malaysia, Malaysia
Bestoun Ahmed, University Sains Malaysia, Malaysia
Carl Latino, Oklahoma State University, USA
Dariusz Jacek Jakóbczak, Technical University of Koszalin, Poland
Duc T. Pham, University of Birmingham, UK
E. George Dharma Prakash Raj, Bharathidasan University, India
Elboukhari Mohamed, University Mohamed First, Morocco
Eric Atwell, University of Leeds, United Kingdom
Eyas El-Qawasmeh, King Saud University, Saudi Arabia
Ezendu Ariwa, London Metropolitan University, United Kingdom
Fouzi Harrag, UFAS University, Algeria
Genge Bela, University of Targu Mures, Romania
Guo Bin, Institute Telecom & Management SudParis, France
Hadj Hamma Tadjine, Technical university of Clausthal, Germany
Hassan Moradi, Qualcomm Inc., USA
Hocine Cherifi, Universite de Bourgogne, France
Isamu Shioya, Hosei University, Japan
Jacek Stando, Technical University of Lodz, Poland
Jan Platos, VSB-Technical University of Ostrava, Czech Republic
Jose Filho, University of Grenoble, France
Juan Martinez, Gran Mariscal de Ayacucho University, Venezuela
Kaikai Xu, University of Electronic Science and Technology of China, China
Khaled A. Mahdi, Kuwait University, Kuwait
Ladislav Burita, University of Defence, Czech Republic
Maitham Safar, Kuwait University, Kuwait
Majid Haghparast, Islamic Azad University, Shahre-Rey Branch, Iran
Martin J. Dudziak, Stratford University, USA
Mirel Cosulschi, University of Craiova, Romania
Mohamed Amine Ferrag, Guelma University, Algeria
Monica Vladoiu, PG University of Ploiesti, Romania
Nan Zhang, George Washington University, USA
Noraziah Ahmad, Universiti Malaysia Pahang, Malaysia
Pasquale De Meo, University of Applied Sciences of Porto, Italy
Paulino Leite da Silva, ISCAP-IPP University, Portugal
Piet Kommers, University of Twente, The Netherlands
Radhamani Govindaraju, Damodaran College of Science, India
Ramadan Elaiess, University of Benghazi, Libya
Rasheed Al-Zharni, King Saud University, Saudi Arabia
Su Wu-Chen, Kaohsiung Chang Gung Memorial Hospital, Taiwan
Talib Mohammad, University of Botswana, Botswana
Tutut Herawan, University Malaysia Pahang, Malaysia
Velayutham Pavanasam, Adhiparasakthi Engineering College, India
Viacheslav Wolfengagen, JurInfoR-MSU Institute, Russia
Wen-Tsai Sung, National Chin-Yi University of Technology, Taiwan
Wojciech Zabierowski, Technical University of Lodz, Poland
Yasin Kabalci, Nigde University, Turkey
Yoshiro Imai, Kagawa University, Japan
Zanifa Omary, Dublin Institute of Technology, Ireland
Zuqing Zhu, University of Science and Technology of China, China

Overview
The SDIWC International Journal of Digital Information and Wireless Communications is a refereed online journal designed to address the networking community from both academia and industry, to discuss recent advances in the broad and quickly-evolving fields of computer and communication networks, technology futures, national policies and standards, and to highlight key issues, identify trends, and develop visions for the digital information domain.
In the field of wireless communications, the topics include: Antenna systems and design, Channel Modeling and Propagation, Coding for Wireless Systems, Multiuser and Multiple Access Schemes, Optical Wireless Communications, Resource Allocation over Wireless Networks, Security, Authentication and Cryptography for Wireless Networks, Signal Processing Techniques and Tools, Software and Cognitive Radio, Wireless Traffic and Routing Ad-hoc networks, and Wireless system architectures and applications. As one of the most important aims of this journal is to increase the usage and impact of knowledge as well as the visibility and ease of use of scientific materials, IJDIWC does NOT CHARGE authors any publication fee for online publishing of their materials in the journal and does NOT CHARGE readers or their institutions for access to the published materials.

Publisher
The Society of Digital Information and Wireless Communications
Kowloon Centre, 33 Ashley Road, Tsimshatsui, Kowloon, Hong Kong

Further Information
Website: http://sdiwc.net/ijdiwc, Email: ijdiwc@sdiwc.net,
Tel.: (202)-657-4603 - Inside USA; 001(202)-657-4603 - Outside USA.

Permissions
International Journal of Digital Information and Wireless Communications (IJDIWC) is an open access journal, which means that all content is freely available without charge to the user or his/her institution. Users are allowed to read, download, copy, distribute, print, search, or link to the full texts of the articles in this journal without asking prior permission from the publisher or the author. This is in accordance with the BOAI definition of open access.

Disclaimer
Statements of fact and opinion in the articles in the International Journal of Digital Information and Wireless Communications (IJDIWC) are those of the respective authors and contributors and not of the International Journal of Digital Information and Wireless Communications (IJDIWC) or The Society of Digital Information and Wireless Communications (SDIWC). Neither The Society of Digital Information and Wireless Communications nor International Journal of Digital Information and Wireless Communications (IJDIWC) make any representation, express or implied, in respect of the accuracy of the material in this journal and cannot accept any legal responsibility or liability as to the errors or omissions that may be made. The reader should make his/her own evaluation as to the appropriateness or otherwise of any experimental technique described.
Copyright 2016 sdiwc.net, All Rights Reserved
The issue date is October 2016.

IJDIWC
ISSN 2225-658X (Online)
ISSN 2412-6551 (Print)
Volume 6, Issue No. 4 2016
TABLE OF CONTENTS
Original Articles
PAPER TITLE | AUTHORS | PAGES
ENGINEERING MINING A LARGE SCALE DATA BASED ON FEATURE ENGINEERING, METADATA, AND ONTOLOGIES | Ahmed Adeeb Jalal, Oğuz Altun | 219
PROPOSAL OF REPRODUCTIVE DESIGN EDUCATION BASED ON KNOWLEDGE AND RESOURCE DISCOVERY THROUGH SNS COMMUNITY | Masatoshi Imai, Yoshiro Imai | 230
ASSESSMENT OF QUALITY-OF-EXPERIENCE IN TELECOMMUNICATION SERVICES | Demóstenes Z. Rodriguez, Renata L. Rosa, Rodrigo D. Nunes, Emmanuel T. Affonso | 241
MUSIC EMOTION RECOGNITION WITH AUDIO AND LYRICS FEATURES | C. V. Nanayakkara, H. A. Caldera | 260
PROPOSITION OF AN INTELLIGENT SYSTEM FOR PREDICTIVE ANALYSIS USING MEDICAL BIG DATA | Basma Boukenze, Abdelkrim Haqiq | 274
A HYBRID CREDIBILITY ANALYSIS METHOD APPLIED ON TURKISH TWEETS WITH TV NEWS AND DISCUSSION PROGRAMS RELATED CONTENT | Ali Fatih Gunduz, Pınar Karagöz | 281
A QUESTION-ANSWERING INFERENCING SYSTEM BASED ON DEFINITION AND ACQUISITION OF KNOWLEDGE IN WRITTEN ENGLISH TEXT | Kenta Hiratsuka, Hiroki Imamura | 292
International Journal of Digital Information and Wireless Communications (IJDIWC) 6(4): 219-229
The Society of Digital Information and Wireless Communications, 2016 ISSN: 2225-658X (Online); ISSN 2412-6551 (Print)
Engineering Mining a Large Scale Data Based on Feature Engineering, Metadata, and Ontologies
Ahmed Adeeb Jalal (1) and Oğuz Altun (2)
(1, 2) Computer Engineering Department, Yildiz Technical University, Istanbul, Turkey
(1) Al-Iraqia University, Baghdad, Iraq
Ahmedadeeb80@gmail.com, oguz@ce.yildiz.edu.tr

ABSTRACT
The web, and social networks in particular, is growing continuously. The multiplicity of offered items, such as products or web pages, makes picking out the items relevant to a user a tedious task. On the other hand, the different tastes and behaviors of users make it hard to find neighbor users. It is therefore difficult for automated software systems to discover what is interesting to users. We propose a new approach, suited to today's widespread e-commerce, that reduces the impact of the multiplicity of items and of the differing views of users, and that can produce recommendations quickly. We exploit the domain knowledge of the training data set to create a testing data set based on the attributes of one feature that represents the distinctive item genres. The testing data set is the input to a hybrid recommender system that aims to achieve the best recommendations through a meta-level hybridization technique combining content-based and collaborative recommender systems. The proposed approach reduces the effects of sparsity, cold start, and scalability, which are very common problems of collaborative recommender systems. In addition, it improves recommendation accuracy compared with the pure collaborative filtering Pearson Correlation approach.

KEYWORDS
Recommender Systems, Feature Engineering, Hybrid Recommender Systems, Meta-level, Collaborative Filtering, Content-Based Filtering, Sparsity, Cold Start, Scalability, Metadata, Ontologies.

1 INTRODUCTION
Recommender systems are the most popular intelligent software systems in the field of information filtering. They are applied in various domains, for example movies, music, books, jokes, restaurants, financial services [8], and Twitter followers [9]. A recommender system recommends interesting items to users who did not know about them or who are likely to prefer them [4], [6], [7], [10], [11], [13]. These personalized suggestions are a useful alternative to search algorithms, because they help people pick the right items that they might not have found by themselves. It thus becomes much easier to find the necessary items among the quantity of information available online.
There are two main categories of recommender systems: content-based recommender systems and collaborative recommender systems. Most recommendation systems use a hybrid recommender system, which is a combination of these two approaches.

1.1 Collaborative Recommender Systems
Collaborative filtering is the most popular method of recommender systems [1], [10]. It generates recommendations based only on the database of past user ratings, which holds the full information about users' past rates. Collaborative filtering predicts the items a user will prefer by calculating a similarity score between that user and the other users. The approach avoids semantic and systematic analysis of the items. It is therefore characterized by quick and accurate recommendations that do not consider the concept of the item itself or what it signifies.
Collaborative filtering is based on the assumption that people who agreed in the past will agree in the future, and that they will like similar kinds of items as they liked in the past. The advantage of collaborative filtering over other recommender systems is that it recommends items different from what the user already knows. An item still unknown to the user is a surprise and an attraction for the user. Nevertheless, collaborative filtering often suffers from three problems that reduce its impact and can be a challenge.
- Scalability: In many environments, finding neighbors in collaborative filtering requires a lot of computation time before similar users or items are found, because the data sets contain a million users and items. Furthermore, as the number of users and items keeps increasing, it becomes computationally difficult to find similar neighbors.
- Sparsity: Most users do not rate items, even the most popular items that they liked or purchased. E-commerce companies strive to increase the number of items, which leads to increased sales and attracts more consumers. Because the number of users and items grows enormously while very few ratings are given, most entries of the data set matrix remain zero. As a result, the sparsity problem is aggravated.
- Cold start: This can be viewed as a special case of the sparsity problem [12]. It happens because a user does not have sufficient ratings, or has no ratings at all. Some companies require consumers, when they log in to their accounts, to evaluate some of the most popular items in order to avoid this problem. Otherwise, it is difficult for recommender systems to provide accurate recommendations to these users.

1.2 Content-Based Recommender Systems
Content-based filtering approaches are based on a description of the attributes of the item features and on a profile of the user's preferences [15]. The items recommended by content-based filtering are predicted matches of the same kind as the items the user already liked, found by comparison with various candidate items. It is therefore close to a search-and-compare process, like the processes used in information retrieval systems, but without requiring user queries.
Content-based filtering retrieves information from two knowledge sources: the item features and the ratings given by the user. Simple approaches use the average values of the rated items. There are also more advanced techniques to infer what the user desires, such as decision trees, Bayesian classifiers, and cluster analysis algorithms. For example, if the user has rated action movies favorably, the system recommends more action movies to that user. In many cases, obtaining common attributes is not easy, and complementary items are preferred rather than similar items that enable a simple substitution [3], [13]. In addition, content-based filtering depends on well-structured attributes and a reasonable distribution of attributes across items [14].

1.3 Hybrid Recommender Systems
Hybrid recommender systems are defined as combinations of various knowledge sources as inputs (such as user profile, community data, and item features) and multiple different recommender systems that work together to produce the outputs.
Hybrid recommender systems can be more successful in some cases, in different application domains, at getting the right recommendations to the user in a timely manner. There is a single output, whatever the number of recommender systems that make up the hybrid. Collaborative filtering uses a certain type of information, the user profile (the user's ratings) together with community data, to derive recommendations, whereas content-based filtering relies on textual descriptions of item features and on the user's ratings. Thus, the type of recommender system chosen determines which kinds of knowledge sources are required. However, none of the basic approaches is able to use all of these knowledge sources. Hybridization designs fall into three major categories that contain seven hybridization techniques. Each of the seven techniques operates in a different context, even when it shares a hybridization design with others, and each can contribute to resolving some of the problems mentioned above.
- Monolithic hybridization design: Different knowledge sources are exploited as inputs for several recommender systems that are implemented and combined in one algorithm to produce the final set of recommendations. The feature combination and feature augmentation techniques belong to this category.
- Parallel hybridization design: Each recommender system participating in this design operates independently of the others and has its own outcome (i.e. a separate recommendation list). The outcomes of the several existing implementations are combined to generate the final set of recommendations. The mixed, weighted, and switching techniques are classified under this design.
- Pipelined hybridization design: The recommender systems run sequentially; the outputs of the previous recommender system become the inputs of the subsequent one, and the final one produces the recommendations for the user. The outputs of the first recommender system therefore affect the whole chain of recommender systems that form the algorithm. Optionally, subsequent recommender components may use parts of the original input data, too [1]. The cascade and meta-level techniques are examples of such a pipelined design.

1.4 Feature Engineering
Feature engineering exploits the domain knowledge of the training data set to create a testing data set based on features that make machine learning algorithms work properly. A feature is a distinguishing characteristic that might help when analyzing a problem in order to solve it [17]. The quality and quantity of the features have great influence on whether the model is good or not [18].
Choosing the right features requires extensive testing to pick a relevant feature that achieves better results; it is a very important part of the work. The right features make a model simpler and more flexible, and they often yield better results [17]. However, the success of an algorithm does not depend entirely on the selected features; the model and the data set also play an important role in achieving satisfactory results. A feature is a piece of information in the data set that may contain many attributes, is useful for prediction, and influences the recommendation to be achieved. Any attribute could be a feature, as long as it is useful to the model [24].

1.5 Metadata
Metadata is data that provides information about other data [16]. Three types of metadata exist: structural, descriptive, and administrative metadata [22]. Structural metadata describes the containers of data that hold compound objects, for example, how web pages are ordered to form a site. Descriptive metadata uses the item description; it can include features such as title, author, date, and location. Administrative metadata provides management information, such as creation, access, and file type information. Metadata can provide information about one or more aspects of the available data; it is used to summarize basic information about the data, which can make tracking and working with specific data easier [29].

1.6 Ontologies
An ontology in computer science is a formal naming and definition of the types, properties, and interrelationships of the entities that really or fundamentally exist for a particular domain of discourse. It defines which variables are needed for some set of computations and establishes the relationships between them [27], [28].
Ontologies can be applied in many fields, such as software engineering, systems engineering, the semantic web, and artificial intelligence, in order to help solve problems by limiting complexity and organizing information.
Our work aims to overcome the very common problems of recommender systems by creating new features from the extracted attributes of movie genres. These features form the testing data set, which is fed to the content-based approach to obtain, for each feature, the average rating of the rated items of that distinctive genre, depending on the item descriptions and the user's rates. The testing data set is then the input to Pearson Correlation filtering.

2 RELATED WORK
We review some examples of hybrid recommender systems applied in various domains. Netflix Inc. [26] provides movie rental recommendations. It released a challenge in 2006 and offered a grand prize of one million US dollars to the person or team who could succeed in modeling a given data set to within a certain specification [1], [2], [5]. It combines collaborative filtering and content-based filtering, through the similar habits of users as well as through movies with highly rated shared characteristics.
Lawrence et al. [20] describe a personalized recommender system that suggests new products to supermarket shoppers based on their previous purchasing behavior. This system, developed at IBM Research, has been implemented as part of SmartPad, a remote shopping system based on a personal digital assistant. It combines content-based filtering with collaborative filtering to improve the recommendations.
MovieLens [31], the online movie recommender whose data set we use in our approach, asks new users at login to rate some watched movies that are generally the most popular. These ratings are then exploited to recommend other movies not yet seen by the user. It also uses collaborative filtering based on users who are similar according to these ratings. These two approaches are combined to create personalized recommendations.

3 METHODOLOGY

3.1 Overview
The overall procedure of our proposed approach is as follows:
- Test all the item features to choose the appropriate feature for obtaining better results.
- Extract all the attributes of the selected item feature.
- Extract the attributes without repetition.
- Create the testing data set with new features based on these attributes.
- Exploit the content-based recommender system to fill the testing data set with the averages of the distinctive genre ratings.
- Get the recommendations through the similarity score between users, computed over the entire testing data set by the collaborative recommender system.
- Evaluate the results of the proposed approach using two kinds of evaluation metrics, predictive accuracy metrics and classification accuracy metrics, to verify the accuracy of the recommendations, in the next section.

3.2 Data Description
This part introduces the data sets. We describe the collection process and the feature representations for each data set, as well as some basic statistics. The two data sets used in this study were downloaded from the GroupLens Research website [30].
- MovieLens 1M data set: GroupLens Research has collected and made available rating data sets from the MovieLens website [31]. The data sets were collected over various periods of time. The rating values range from 0.5 to 5, given by around 6,040 users to 3,883 items.
- HetRec 2011 data set: The 2nd International Workshop on Information Heterogeneity and Fusion in Recommender Systems (HetRec 2011) [32] released data sets from Delicious, Last.fm Web 2.0, MovieLens, IMDb, and Rotten Tomatoes. These data sets contain social networking, tagging, and resource consuming (Web page bookmarking and music artist listening) information from sets of around 2,113 users. The rating values range from 0.5 to 5, given by around 2,113 users to 10,197 items.
Table 1 summarizes the statistics of the training data sets, where the density of the ratings matrix is defined as the number of ratings divided by the product of the number of users and the number of items in the rating matrix. Table 1 also shows the average number of ratings per user and the average number of ratings per item.

Table 1. Statistics of training data sets
Statistics                            HetRec     MovieLens
Number of users                       2113       6040
Number of items                       10197      3883
Number of ratings                     855598     1000209
Average number of ratings by users    404.921    165.598
Average number of ratings for items   83.91      257.587
Density                               3.97%      4.265%

The number of ratings given by one user to all items ranges from 20 to 3410 in the HetRec 2011 data set (0.2% to 33.5% of the items) and from 20 to 2314 in the MovieLens 1M data set (0.52% to 59.6% of the items).
The number of ratings given by all users to one item ranges from 1 to 1670 in the HetRec 2011 data set (0.05% to 79.05% of the users) and from 1 to 3428 in the MovieLens 1M data set (0.02% to 56.75% of the users).

3.3 Feature Learning
In machine learning, feature learning or representation learning is a set of techniques that learn a feature [19], [23]. The training data set (i.e. raw data), defined as a set of aggregated features, is exploited to produce a representation that can make machine learning algorithms simpler and more flexible.
The training data set in our paper consists of two major categories, users and items (movies); each one
contains many features, each of which includes many attributes. For example, the user category contains gender, occupation, age, and Zip-code, and the item category contains title, genres, actors, and year of release. Thus, transforming the raw data into such a representation requires testing more than one feature in order to determine the useful ones. Feature learning is motivated by the fact that machine learning algorithms often require inputs that are mathematically and computationally convenient. However, the success of the algorithm depends on the model and the data set as well as on the selected features, as mentioned earlier. Usually, the initial choice of feature is based on our experience and prior knowledge of the details of the existing data set.
Let $S_i$ denote training sample $i$; then $S_i$ can be represented as:
$$S_i = \{X_i, G_i, D_i\} \tag{1}$$
where $X_i$, $G_i$, and $D_i$ stand for the input vector and the two output vectors for training sample $i$, respectively. $X_i$ represents all features of item $i$, whose structure can be shown as:
$$X_i = \begin{bmatrix} \text{Title}(i) \\ \text{Year}(i) \\ \text{Genre}(i) \\ \text{Location}(i) \\ \text{Director}(i) \\ \text{Actors}(i) \\ \text{Country}(i) \end{bmatrix} \tag{2}$$
All the entries are either textual or integers. Genre(i) is the genre of item $i$ that will be extracted from the other features of the item; it is textual.
Likewise, $G_i$ represents all extracted genres of the items, whose structure can be shown as:
1. Adventure, Children, Fantasy
2. Comedy, Romance
3. Comedy
4. Action, Crime, Thriller
5. Adventure, Children, Action
6. Comedy
7. Adventure, Children, Action
8. Animation
9. Musical, Romance    (3)
Likewise, $D_i$ represents all distinctive genres of the items (i.e. without repetition), whose structure can be shown as:
1. Adventure
2. Comedy
3. Romance
4. Action
5. Crime
6. Musical
7. Animation    (4)
Let $A$ denote the average of the distinctive genre ratings based on Eq. (4); then $A$ can be represented as:
$$A = \frac{\sum_{i} r_i}{TF} \tag{5}$$
where $r_i$ is the rating value of item $i$, summed over the user's rated items that belong to the distinctive genre, and $TF$ is the term frequency of the distinctive genre in the user's profile of rated items.
Algorithm 1 explains the sequence of operations for creating the testing data set and filling it with the average ratings of the distinctive genres rated by each user.

Algorithm 1. Creating testing data set
1: input: read the item features file.
2: choose an appropriate step size (number of items).
3: for t = 1, ..., T do
4:    extract the genres of the items using Eq. (3).
5: end for
6: extract the distinctive genres using Eq. (4).
7: get the average of the distinctive genre ratings using Eq. (5).
8: output: create the testing data set based on Eq. (4) and Eq. (5).

Table 2 and Table 3 show the structure and the statistics, respectively, of the testing data sets (HetRec 2011 and MovieLens 1M) that were produced by Algorithm 1.

Table 2. Structure of testing data set
      D1     D2    D3     D4    D5    D6     D7    Dn
U1    4.13   4.3   4.1    2.5   1.8   4.7    1.5   0
U2    3.34   0     2.7    1     5     3.2    1     3.9
U3    1.5    4.2   3.7    0     2.3   0      3     4.6
Um    2.17   3.5   3.26   4.7   0     3.26   0     0
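
The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `build_testing_data`, the in-memory dictionary and list standing in for the item features file and the ratings, and the toy values are all hypothetical. The sketch collects the genres of each item, derives the distinctive genres (genres without repetition), and fills a user-by-genre matrix with the average rating per genre, using 0 where a user rated no item of that genre, as in Table 2.

```python
from collections import defaultdict


def build_testing_data(item_genres, ratings):
    """Build a user-by-genre matrix of average ratings.

    item_genres: dict item_id -> list of genre names (hypothetical input format)
    ratings: iterable of (user_id, item_id, rating) triples
    Returns (distinct_genres, matrix), where matrix[user][genre] is the
    average rating the user gave to items of that genre (0.0 if none).
    """
    # Distinctive genres: genres without repetition, in order of first appearance.
    distinct_genres = []
    for genres in item_genres.values():
        for g in genres:
            if g not in distinct_genres:
                distinct_genres.append(g)

    # Accumulate rating sums and term frequencies per (user, genre).
    sums = defaultdict(lambda: defaultdict(float))
    tf = defaultdict(lambda: defaultdict(int))
    for user, item, rating in ratings:
        for g in item_genres.get(item, []):
            sums[user][g] += rating
            tf[user][g] += 1

    # Average = sum of ratings / term frequency of the genre for that user.
    matrix = {}
    for user in sums:
        matrix[user] = {
            g: (sums[user][g] / tf[user][g]) if tf[user][g] else 0.0
            for g in distinct_genres
        }
    return distinct_genres, matrix


# Toy data (hypothetical): three items and two users.
item_genres = {
    1: ["Adventure", "Children", "Fantasy"],
    2: ["Comedy", "Romance"],
    3: ["Comedy"],
}
ratings = [("u1", 1, 4.0), ("u1", 2, 3.0), ("u1", 3, 5.0), ("u2", 2, 2.0)]
genres, matrix = build_testing_data(item_genres, ratings)
```

In this toy run, user `u1` rated two Comedy items (3.0 and 5.0), so the Comedy entry for `u1` is their average. The resulting matrix has one column per distinctive genre instead of one column per item, which is the reduction in items that the testing data set relies on.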
Table 3. Statistics of testing data sets
Statistics                            HetRec     MovieLens
Number of users                       2113       6040
Number of items                       19         18
Number of ratings                     25029      62484
Average number of ratings by users    11.85      10.35
Average number of ratings for items   1317.32    3471.34
Density                               62.35%     57.5%

Table 3 shows that the number of items in the testing data sets has been reduced. This increases the density of the ratings matrix, which contributes to solving the problems of recommender systems. Comparing the data sets before and after applying Algorithm 1, the number of items decreases by up to 99.8%.
The number of ratings given by one user to all items in the testing data sets ranges from 3 to 19 in the HetRec 2011 data set (15.8% to 100% of the items) and from 2 to 18 in the MovieLens 1M data set (11.1% to 100% of the items).
The number of ratings given by all users to one item in the testing data sets ranges from 2 to 2110 in the HetRec 2011 data set (0.1% to 99.9% of the users) and from 630 to 6012 in the MovieLens 1M data set (10.5% to 99.6% of the users).
According to the description of the testing data set above, the Pearson Correlation similarity of two users $i$, $j$ is defined as:
$$sim(i,j) = \frac{\sum_{d}(r_{i,d}-\bar{r}_i)(r_{j,d}-\bar{r}_j)}{\sqrt{\sum_{d}(r_{i,d}-\bar{r}_i)^2}\,\sqrt{\sum_{d}(r_{j,d}-\bar{r}_j)^2}} \tag{6}$$
where $r_{i,d}$ is the value of user $i$ for distinctive genre $d$ in the testing data set and $\bar{r}_i$ is the mean of the values of user $i$.
The formula used to predict the rating, which depends on the similarity score, the user's rates in the training data set, and the rating of the distinctive genre to which the item belongs, is given as Eq. (7):
[Eq. (7): formula illegible in the source]

3.4 Meta-Level Technique
The meta-level technique is one of the seven hybridization techniques and belongs to the pipelined hybridization design. It is used to obtain a model that becomes the input of the next technique. As a result, the contributing recommender completely replaces the raw data with a learned model that the actual recommender uses in its computation.
Figure 1 shows the general schematic of the approach proposed in our paper.
[Figure 1 components: Items Database, Feature Engineering, Feature learning, Contributing Recommender, User's Ratings, Learned Model, Users Ratings Database, Actual Recommender, Score, Meta-Level Technique, Overall Score]
Figure 1. General schematic of proposed approach

The results obtained in this section by creating the testing data set can be summarized as follows:
- A decrease in the number of items.
- An increase in the density of the ratings matrix.
- An increase in the ratings of users.
- An increase in the ratings of items.
Two important questions remain, which are addressed in the next section:
- How useful is reducing the items?
- Can the proposed approach improve the accuracy of recommendation?

4 EXPERIMENTS
Recommender systems have been evaluated with different evaluation metrics. Evaluating recommender systems is difficult because the evaluation results vary; they depend on the algorithms, the data sets, and the evaluation metrics together.
Many algorithms have been designed; some of them work effectively on some data sets but not on others. Also, a variety of data sets are
available downloaded online, but some of it is not requires reducing time-consuming and the number of
valid for performed with some algorithms. As a result, similar users in order to make the software systems
there should be a consensus between the algorithm faster for getting the recommendations quickly. So, it
and data sets selected, a potentially overwhelming set was our approach focuses on reducing the items in
of choices. Finally, evaluation metrics can be divided testing data set to the extent that it can get satisfactory
into two major categories will be discussed later, the results. In addition, the testing data set increases the
first category based on the numeric value (i.e. error density of users' rates, it make to get the right similar
ratio) that represents the difference value of the user more flexible.
original rate and the predicted rate called predictive Figure 2 illustrated the percentage of increasing the
accuracy metrics, and the second category based on ratings of users for training and testing data sets.
the related as if that the predicted rate is relevant or We proposed a method of the hybrid recommender
irrelevant compared with the original rate called systems according to testing data set that combine two
classification accuracy metrics, this is the motivation approaches content-based filtering and collaborative
of the both types of the evaluation metrics applied in filtering Pearson Correlation approach.
this paper because every category follows a certain Let HRS denotes to combining two approaches
pattern for evaluation. It would be better to choose content-based filtering and collaborative filtering
one or more evaluation metrics in order to compare Pearson Correlation approach, and CFP denotes to the
the accuracy of different recommender systems [25]. pure collaborative filtering Pearson Correlation
approach.
4.1 Data Sets and Preprocessing
Then, we will compare our results that got it from
We used testing data set as the input data in our HRS method based on testing data set with CFP
proposed approach, which got it after implemented method based on training data set, for the two data sets
the algorithm 1 on the two publicly available data sets selected HetRec 2011 and MovieLens 1M.
HetRec 2011 and MovieLens 1M as we mentioned in Table 4 shows the advantage of reducing the items,
Section 3. The purpose of this process to improve through reducing the Time-consuming in order to
performance and get accurate recommendations. predict the rate and reducing the average number of
As is well known, today the increasing growth in the similar users for each predict operation with keeping
web with thousands of users who interact with an efficient result.
thousands of items if not millions. This growth
Table 4. Time-consuming and similar users

Data Sets                                         HetRec 2011       MovieLens 1M
Methods                                           CFP     HRS       CFP     HRS
Time-consuming for one sample testing (s)         0.568   0.061     0.83    0.133
Average number of similar users for each sample   394     308       680     528
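
The relative reductions implied by Table 4 can be worked out directly from its values. The sketch below is only illustrative arithmetic on the published numbers; the helper name `reduction_percent` is hypothetical and not part of the paper's method.

```python
# Values copied from Table 4 as (CFP, HRS) pairs.
table4 = {
    "HetRec 2011": {"time_s": (0.568, 0.061), "similar_users": (394, 308)},
    "MovieLens 1M": {"time_s": (0.83, 0.133), "similar_users": (680, 528)},
}


def reduction_percent(cfp, hrs):
    """Relative reduction of the HRS value with respect to the CFP value, in percent."""
    return 100.0 * (cfp - hrs) / cfp


for data_set, rows in table4.items():
    for measure, (cfp, hrs) in rows.items():
        pct = reduction_percent(cfp, hrs)
        print(f"{data_set:12s} {measure:14s} {pct:5.1f}% lower with HRS")
```

With these values, HRS needs roughly 84% to 89% less time per test sample and about 22% fewer similar users than CFP.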
[Figure 2: two panels, each with the series TrainingData and TestingData; x-axis: Users; y-axis: Items]
Figure 2. Percentage of the users' rates for training and testing data sets
4.2 Evaluation Metrics

Many evaluation metrics have been published, and they differ in how they work and in their results (e.g. predictive accuracy metrics, classification accuracy metrics, rank accuracy metrics and empirical comparisons of evaluation metrics).
We focus only on the most common metrics for evaluating the accuracy of recommender systems. Herlocker et al. [25] provide a comprehensive discussion of accuracy metrics together with alternative evaluation criteria, which is highly recommended reading.

Predictive Accuracy Metrics: Predictive accuracy metrics are based on the numerical difference between the predicted ratings and the true ratings given by the users to the movies, on the five-star scale of the data sets used, HetRec 2011 and MovieLens 1M. The success of a recommender system in this evaluation depends on how close the predicted ratings are to the true ratings (i.e. if the numerical difference is small, the recommender system is deemed successful, and vice versa).
When evaluating the ability of a recommender system to correctly predict the rating of a specific item, Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) are among the most important evaluation metrics of this class.

MAE = \frac{1}{T} \sum_{i=1}^{T} |p_i - r_i|    (8)

RMSE = \sqrt{\frac{1}{T} \sum_{i=1}^{T} |p_i - r_i|^2}    (9)

where p_i and r_i represent the predicted ratings and the real ratings of users, respectively, and T denotes the total number of predictions generated for all active users in the data set.
The performance of the HRS method compared with the CFP method according to the two predictive accuracy metrics, MAE and RMSE, is summarized in Table 5 and Figure 3.

Classification Accuracy Metrics: Classification accuracy metrics are based on the relevance of the predicted ratings compared with the true ratings, in order to determine which items are relevant (i.e. good) and which are irrelevant (i.e. bad). This implies distinct groups, and the decision is which group a predicted rating belongs to. For instance, the rating scale of the two data sets ranges over (0.5,...,5), and the separation threshold could be set to 4, as in [33]. In our paper, we propose 3 stars as the threshold to give more flexibility when items rated above 4 stars are unavailable; in addition, the global average of the ratings in the HetRec 2011 and MovieLens 1M data sets is less than 4, roughly 3.5. Each recommendation can be classified as in [21]:
1. True positive (TP, an acceptable item is recommended to the user).
2. True negative (TN, an unacceptable item is not recommended to the user).
3. False positive (FP, an unacceptable item is recommended to the user).
4. False negative (FN, an acceptable item is not recommended to the user).
Precision, Eq. (10), and recall, Eq. (11), are the most popular evaluation metrics in the information retrieval field. They depend on the separation of relevant ("positive") and irrelevant ("negative") items and have been used in [34], [35]. The F-measure, Eq. (12), combines precision and recall into a single score.

Precision = \frac{TP}{TP + FP}    (10)

Recall = \frac{TP}{TP + FN}    (11)

F\text{-}Measure = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}    (12)

The performance of the HRS method compared with the CFP method according to the classification accuracy metrics, precision, recall and F-Measure, is summarized in Table 6 and Figure 4.
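
The two metric families described in this section can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names and the toy ratings are hypothetical, and treating a rating greater than or equal to the 3-star threshold as relevant is an assumption, since the text does not state whether the threshold is inclusive.

```python
import math


def mae(predicted, actual):
    """Mean absolute error over all T predictions."""
    return sum(abs(p - r) for p, r in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean squared error over all T predictions."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, actual)) / len(actual))


def classification_metrics(predicted, actual, threshold=3.0):
    """Precision, recall and F-measure for a relevance threshold.

    Assumption: a rating >= threshold counts as relevant/recommended.
    """
    tp = fp = fn = 0
    for p, r in zip(predicted, actual):
        recommended = p >= threshold  # the system would recommend the item
        acceptable = r >= threshold   # the user actually finds it acceptable
        if recommended and acceptable:
            tp += 1
        elif recommended and not acceptable:
            fp += 1
        elif acceptable:
            fn += 1
        # Remaining case is a true negative, which none of the three metrics uses.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical toy ratings, not taken from HetRec 2011 or MovieLens 1M.
predicted = [4.5, 2.0, 3.5, 1.0, 4.0]
actual = [5.0, 3.0, 3.0, 1.5, 2.5]

print("MAE:", mae(predicted, actual))
print("RMSE:", rmse(predicted, actual))
print("Precision, Recall, F-Measure:", classification_metrics(predicted, actual))
```

Lower MAE and RMSE indicate better predictions, while higher precision, recall and F-measure indicate better classification, which is how the Yes/No comparison between HRS and CFP should be read.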
Table 5. MAE and RMSE evaluations

Data Sets    HetRec 2011       MovieLens 1M
Methods      CFP     HRS       CFP     HRS
MAE          0.654   0.598     0.718   0.679
RMSE         0.846   0.788     0.9     0.869
[Figure 3: two panels, HetRec2011 and MovieLens1M; x-axis: MAE, RMSE; series: CFP, HRS]
Figure 3. Comparison of evaluations of predictive accuracy metrics
Table 6. Precision, Recall and F-measure evaluations

Data Sets    HetRec 2011       MovieLens 1M
Methods      CFP     HRS       CFP     HRS
Precision    0.866   0.858     0.891   0.878
Recall       0.874   0.896     0.904   0.92
F-Measure    0.87    0.877     0.897   0.898
[Figure 4: two panels, HetRec2011 and MovieLens1M; x-axis: Precision, Recall, FMeasure; series: CFP, HRS]
Figure 4. Comparison of evaluations of classification accuracy metrics
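
As a consistency check, the F-Measure values in Table 6 can be recomputed from the reported precision and recall. The sketch below is illustrative only; the function name `f_measure` is hypothetical, and a tolerance of 0.001 is assumed because the table values are rounded to three decimals.

```python
# (precision, recall, reported F-Measure) copied from Table 6.
table6 = {
    ("HetRec 2011", "CFP"): (0.866, 0.874, 0.870),
    ("HetRec 2011", "HRS"): (0.858, 0.896, 0.877),
    ("MovieLens 1M", "CFP"): (0.891, 0.904, 0.897),
    ("MovieLens 1M", "HRS"): (0.878, 0.920, 0.898),
}


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


for (data_set, method), (p, r, reported) in table6.items():
    computed = f_measure(p, r)
    # Rounded inputs can shift the last digit, so compare with a small tolerance.
    status = "consistent" if abs(computed - reported) < 1e-3 else "differs"
    print(f"{data_set:12s} {method}: computed {computed:.4f}, reported {reported:.3f} ({status})")
```

All four reported values agree with the recomputed ones within this tolerance.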
Based on the results obtained in this Section, we showed that:
Reducing the items was useful.
The proposed approach improved the accuracy of recommendation.
Table 7 lists all results obtained in this Section for the HRS method based on the testing data set compared with the CFP method based on the training data set, for the selected data sets, HetRec 2011 and MovieLens 1M. The superiority of the HRS method over the CFP method is indicated by Yes or No. As can be seen in Table 7, the HRS method outperformed the CFP method in most of the results.

Table 7. All results obtained

Data Sets                                            HetRec 2011   MovieLens 1M
Performance
  Time-consuming for one sample testing (s)          Yes           Yes
  Average number of similar users for each sample    Yes           Yes
Predictive accuracy metrics
  MAE                                                Yes           Yes
  RMSE                                               Yes           Yes
Classification accuracy metrics
  Precision                                          No            No
  Recall                                             Yes           Yes
  F-Measure                                          Yes           Yes

5 CONCLUSIONS

In this paper, we proposed creating a testing data set that incorporates a limited number of items in order to alleviate the impact of the scalability, sparsity and cold start problems by increasing the density of the ratings matrix. As an additional benefit, we used the testing data set as the input of the hybrid recommender system and evaluated the results according to two types of evaluation metrics to assess the accuracy of the recommendations.
According to the description above, the proposed approach proved useful and effective in almost all aspects compared with the pure collaborative filtering Pearson Correlation approach based on the training data set, for the selected data sets.

ACKNOWLEDGMENTS

I am grateful to Al-Iraqia University and the Computer Engineering Department, Yildiz Technical University, for the opportunity to obtain the master's degree. Special thanks to Mona Mohamed Wafy, employee at Al-Iraqia University. Finally, thanks to everyone who helped and supported me.

REFERENCES

1. Dietmar Jannach, Markus Zanker, Alexander Felfernig, and Gerhard Friedrich, Recommender Systems: An Introduction, US, 2011.
2. Andrey Feuerverger, Yu He, and Shashi Khatri, Statistical Significance of the Netflix Challenge, Institute of Mathematical Statistics, 2012.
3. Mohammad Amir Sharif and Vijay V. Raghavan, A Large-Scale, Hybrid Approach for Recommending Pages Based on Previous User Click Pattern and Content, pp. 8, USA, 2016.
4. Deepak Agarwal, Bee-Chung Chen, Pradheep Elango, and Raghu Ramakrishnan, Content recommendation on web portals, In Communications of the ACM, 56(6): 92-101, 2013.
5. Robert M. Bell and Yehuda Koren, Lessons from the Netflix prize challenge, In ACM SIGKDD Explorations Newsletter, 2007.
6. Abhinandan Das, Mayur Datar, Ashutosh Garg, and Shyam Rajaram, Google news personalization: scalable online collaborative filtering, In WWW, pp. 271-280, May 2007.
7. Greg Linden, Brent Smith, and Jeremy York, Amazon.com recommendations: Item-to-item collaborative filtering, In Internet Computing, IEEE, 7(1): 76-80, 2003.
8. Alexander Felfernig, Klaus Isak, Kalman Szabo, and Peter Zachar, The VITA Financial Services Sales Support Environment, pp. 1692-1699, Vancouver, Canada, 2007.
9. Pankaj Gupta, Ashish Goel, Jimmy Lin, Aneesh Sharma, Dong Wang, and Reza Bosagh Zadeh, WTF: The who-to-follow system at Twitter, Proceedings of the 22nd international conference on World Wide Web, ACM New York, NY, USA, 2013.
10. Francesco Ricci, Lior Rokach, and Bracha Shapira, Introduction to Recommender Systems Handbook, Recommender Systems Handbook, Springer, 2011.
11. Facebook, Pandora Lead Rise of Recommendation Engines - TIME, TIME.com. 27 May 2010. Retrieved 1 June 2015.
12. Zan Huang, Hsinchun Chen, and Daniel Zeng, Applying associative retrieval techniques to alleviate the sparsity problem in collaborative filtering, ACM Transactions on Information Systems 22, no. 1, pp. 116-142, 2004.
13. Gediminas Adomavicius and Alexander Tuzhilin, Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-art and Possible Extensions, IEEE Transactions on Knowledge and Data Engineering, vol. 11, no. 6, pp. 734-749, 2005.
14. Delgado Joaquin, Ishii Naohiro, and Ura Tomoki, Content-based Collaborative Information Filtering: Actively Learning to Classify and Recommend Documents, in Proceedings of the Second International Workshop on Cooperative Information Agents II, Learning, Mobility and Electronic Commerce for Information Discovery on the Internet, pp. 206-215, 1998.
15. Peter Brusilovsky, The Adaptive Web, pp. 325, 2007.
16. http://www.merriam-webster.com/dictionary/metadata.
17. Discover Feature Engineering, How to Engineer Features and How to Get Good at It - Machine Learning Mastery, Machine Learning Mastery. Retrieved 11-11-2015.
18. Feature Engineering: How to transform variables and create new ones?, Analytics Vidhya. 12-03-2015. Retrieved 12-11-2015.
19. Zdenek Zabokrtsky, Feature engineering in Machine Learning, Retrieved 12 November 2015.
20. Richard D. Lawrence, George S. Almasi, Vladimir Kotlyar, Marisa S. Viveros, and Sastry S. Duri, Personalization of supermarket product recommendations, 2001.
21. Paolo Cremonesi, Roberto Turrin, Eugenio Lentini, and Matteo Matteucci, An evaluation methodology for recommender systems, Proc. of the 4th International Conference on Automated Solutions for Cross Media Content and Multichannel Distribution (AXMEDIS), IEEE, 2008.
22. National Information Standards Organization (NISO), Understanding Metadata, 2004.
23. Yoshua Bengio, Aaron Courville, and Pascal Vincent, Representation Learning: A Review and New Perspectives, IEEE Trans. PAMI, special issue Learning Deep Architectures 35: 1798-1828, 2013.
24. Navid Razmjooy, Bibi Somayeh Mousavi, and Fazlollah Soleymani, A hybrid neural network Imperialist Competitive Algorithm for skin color segmentation, Mathematical and Computer Modelling, Volume 57, Issues 3-4, February 2013.
25. Jonathan L. Herlocker, Joseph A. Konstan, Loren G. Terveen, and John T. Riedl, Evaluating collaborative filtering recommender systems, ACM Transactions on Information Systems (TOIS) 22, 2004.
26. http://www.netflix.com.
27. Fredrik Arvidsson and Annika Flycht-Eriksson, A Ontologies I, Retrieved 26 November 2008.
28. Thomas R. Gruber, A Translation Approach to Portable Ontology Specifications, June 1993.
29. www.theguardian.com, A Guardian Guide to your Metadata, Guardian News and Media Limited, 12 June 2013.
30. http://www.grouplens.org.
31. http://www.movielens.org.
32. http://ir.ii.uam.es/hetrec2011/.
33. Jonathan L. Herlocker, Joseph A. Konstan, Al Borchers, and John T. Riedl, An algorithmic framework for performing collaborative filtering, In SIGIR '99: Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, pp. 230-237, New York, NY, USA, 1999.
34. Badrul M. Sarwar, George Karypis, Joseph A. Konstan, and John T. Riedl, Application of Dimensionality Reduction in Recommender System -- A Case Study, in ACM WebKDD 2000 Web Mining for E-Commerce Workshop, 2000.
35. Badrul M. Sarwar, George Karypis, Joseph A. Konstan, and John T. Riedl, Analysis of recommendation algorithms for E-commerce, In Proceedings of the 2nd ACM Conference on Electronic Commerce (EC'00). ACM, New York, pp. 285-295, 2000.
Proposal of Reproductive Design Education based on Knowledge and Resource Discovery through SNS Community
Masatoshi Imai * and Yoshiro Imai **

* Department of Management Information, Kagawa Junior College
1-10 Utazu-cho, Ayautagun 769-0201 Japan
** Graduate School of Engineering, Kagawa University
2217-20 Hayashi-cho, Takamatsu 631-0396 Japan
E-mail: * imai@kjc.ac.jp , ** imai@eng.kagawa-u.ac.jp
ABSTRACT

Design education is one of the most creative topics in higher education and training. Students of a design education course need to learn both knowledge and techniques: the former is necessary to design objects, and the latter is essential to use tools and equipment. It is important to provide not only knowledge but also techniques in efficient and effective ways.
One of the most attractive approaches to design with ecological and/or recycling methods is to discover and utilize reproductive tools and resources. It is a good way to create reproductive objects. In particular, some furniture is well worth reusing and reproducing in this way.
This study focuses on how to utilize recycled resources and useful knowledge for design education. It also presents a practical scheme for retrieving and discovering resources, knowledge and techniques for design education in a network environment. The paper attempts to visualize a practical scheme for the design process by comparing the usual steps of normal design education with special steps that use the Internet and/or an SNS-based network community. It concludes that visualizing a scheme for resource and knowledge discovery through a network environment is important for design education.

KEYWORDS

Visualization of Design Education, Ecological and Recycling, Utilization of Network Community for Retrieving and Discovering.

1 INTRODUCTION

Education, especially design education, needs a combination of knowledge, technique and exercise, because students of such a course must have suitable knowledge to design objects as well as applicable techniques to produce original shapes and structures for their self-designed objects. In order to grow their knowledge and techniques during the course, students (learners) want to face practical design and production situations as exercises, which provide very important experience for them.
If they achieve successful results, they gain skills, experience, self-confidence and, moreover, a challenging spirit applicable to other targets. In these cases, there may be problems of how to support and realize fruitful courses and how to shorten the total length of the courses to a reasonable period. Many students/learners need different knowledge and materials, and they want to work on several kinds of targets and plans to design and implement their objects.
This research focuses on the utilization of a network community to seek suitable knowledge and discover desired resources in order to resolve given or self-posed problems. Networks present many different scenes to their users, from a domestic LAN to a wide-area WAN, namely the Internet. Currently, social networks are becoming more and more suitable for users to share and exchange information.
This paper introduces a brief comparison of conventional design education and our proposed design education with a network community, together with its benefits, in the second section. It explains characteristics and advantages of our practical design education for reproductive scheme with recycling resources and
its ecological manner in the third section. It also describes some evaluations of our design education and mentions its future expansion to Internet-wide scale in the fourth section. Finally, it summarizes our concluding remarks in the fifth section.

2 RELATED WORKS

This section reports four useful articles related to our work. These articles focus on case studies of how to utilize an SNS community for information sharing, decision making and so on.
Sue Yeon Syn of The Catholic University of America and Sanghee Oh of Florida State University report why SNS (social network site) users share information, knowledge and experience on Facebook and Twitter [1]. Their study examined why SNS users shared information, knowledge, and personal experiences with others on SNSs. Through an online survey, 10 motivation factors were tested with Facebook and Twitter users. Their findings indicated that the motivations of SNS users in sharing information could be attributed to various aspects such as demographic characteristics, experiences of SNSs and Internet usage, as well as the characteristics and features of SNSs. SNS users could be highly motivated by the learning and social engagement aspects of SNS services. They also found that the motivations could vary depending on the characteristics of services. They said that the results of their study could be helpful for researchers in understanding the underlying reasons for social activities as well as for SNS developers in improving SNS services.
Slava Kisilevich from University of Konstanz, Germany et al. reported that online social network services (SNS) provided an unprecedentedly rich source of information about millions of users worldwide [2]. However, most existing studies of this emerging phenomenon were limited to relatively small data samples, with an emphasis on mostly "western" online communities (such as Facebook and MySpace users in Western countries). In order to understand the cultural characteristics of users of online social networks, their paper explored the behavioral patterns of more than 16 million users of a popular social network in the Russian segment of the Internet, namely, My.Mail.Ru (also known as "My World" in Russian). Their main goal was to study the self-disclosure patterns of the site users as a function of their age and gender. Their paper compared the findings of their analysis to the previous studies on Western users of SNS and discussed the culturally distinctive aspects. Their study highlighted some important cultural differences in usage patterns among Russian users, which called for further studies of SNS in various cultural contexts.
Ohbyung Kwon and Yixing Wen from Korea explained that social network services were emerging as a promising IT-based business, with some services, such as Facebook, being provided commercially [3]. However, it was not yet clear which potential audience groups would be key social network service participants. Moreover, the process by which an individual actually decided to start using a social network service might be somewhat different from that of current web-based community services. Hence, the aims of their paper were twofold. 1) They empirically examined how individual characteristics affected actual user acceptance of social network services. To examine these individual characteristics, they applied a Technology Acceptance Model (TAM) to construct an amended model that focused on three individual differences (social identity, altruism and telepresence) and one perceived construct, perceived encouragement, imported from psychology-based research. 2) They examined whether the users' perception of a target social network service as a human relationship-oriented service or as a task-oriented service could be a moderator between perceived constructs and actual use. They said their results showed that perceived encouragement and perceived orientation were significant constructs that affected actual use of social network services.
Tristan Henderson, Luke Hutton & Sam McNeilly of University of St Andrews, UK reported on ethics in online social network research [4]. They described that social network sites (SNSs) and other online social networks such as Facebook and Twitter represented a huge source of data for research in many fields, including sociology, medicine, anthropology, politics and computer science. Such sites might contain sensitive
information, and care needs to be taken when designing experiments or collecting SNS data. This case study outlined two such experiments and discussed the ethical concerns within. They described lessons learned, a set of experiments designed to test some of these lessons, and an architecture that addressed some of the ethical challenges.
From these articles, it is confirmed that we had better utilize the knowledge and techniques of an SNS community. At the same time, we must choose and/or determine the more useful and reliable items among the knowledge and information offered by the SNS community.

3 COMPARISON OF CONVENTIONAL AND PROPOSAL DESIGN EDUCATION

This section compares our proposed design education with a conventional one and explains the characteristics of the latter against those of the former.

3.1 Real Production Process for Furniture

A real production process of furniture includes the following steps:

1. Design of the target furniture: normally, some prototyping is necessary in the design process. Making a miniature is a part of prototyping. It is convenient for getting an overview of the target furniture.
2. Discussion of the target furniture: the designer and the sales manager discuss the profile of the target furniture by means of the miniature as a prototype. A sales plan is prepared by means of prototyping, namely using the miniature.
3. Production of the target furniture: after prototyping and discussion, the production process begins in accordance with the previous steps. Display and trial usage become available with the finished product.

Figure 1 shows the prototyping of a miniature of reference furniture on the work desk. In this case, prototyping includes coloring of the miniature. Suitable coloring may be good for the sake of giving reality to the miniature. The scale of the miniature will possibly be from 1/10 to 1/8.

Figure 1. Prototyping and coloring of miniature for target furniture.

Figure 2 presents the corresponding miniature of the furniture together with the same kind of miniature of seat sofas, which have been made of "foam polystyrene" because it is easy to form. Such a prototype, however, may give an impression of quality, so that some people say there is no special need to utilize expensive virtual reality rendering by computer.

Figure 2. Display and evaluation with miniature of furniture.

Figure 3 displays a real model of furniture which is produced based on miniature after prototyping. A real model must be good and useful if previous
prototyping is well-discussed and suitable enough to produce real furniture.

Figure 3. Production of furniture based on miniature.

Comparing Figure 2 and Figure 3, not only the designer but also the sales manager can feel that the real product is identical to the prototyped miniature. As a consequence, potential buyers, who may stand in the same position as the sales manager, can recognize the relevant furniture and decide to buy it only by reference to the prototype. Indeed, quite a few people sometimes buy products only by reference to catalogs or online browsing, instead of touching and checking the real model.

3.2 Proposal of Reproductive Design Education using SNS Community

Network communities have been attractive and useful for exchanging and sharing information among registered people who live at a distance from each other. If one member states that some resource is unnecessary in his/her community, others may reply that the resource is necessary in their community. And if one member asks questions which need knowledge to be resolved, others may reply with relevant answers which include suitable knowledge for the resolution. An SNS community is one of the efficient and effective environments which can transfer information to the relevant position/people.
In order to perform resource recycling and discovery, it is very good to utilize an SNS community and carry out information exchanging and sharing on the networks.
In our proposed design education, the recycling of material resources is the focus, in order to reproduce useful products with recycled resources. We will explain a sample of the utilization of an SNS community, decision making on the networks (i.e. resource finding, obtaining knowledge for redesign, presenting by miniature, discussing, etc.), reproduction of the real model, and evaluation.
Generally speaking, reproduction of furniture may include the following procedures:

1. The designer reforms his/her original model into a new one, which has both a part of the same resources as the original model and other new parts.
2. The designer must decide which part of the original resources to keep and which parts to design newly.
3. In order to decide which part of the original resources to keep, it is necessary to retrieve past results. On the other hand, in order to decide to create a new part, it may be necessary to search for future trends, namely, to predict trends.
4. The former must utilize retrieval of past track records, just like a database application, while the latter had better employ market research, trend watching, questionnaire investigation of users and so on.

Of course, it is very difficult for only one or a few designers to manage the above procedures efficiently. Several staff members and/or a support team are necessary for such designer(s).
We describe a schematic procedure for reproduction during design education using an SNS community in order to improve the effectiveness and efficiency of educational results.
In order to accomplish retrieval of past track records, we have utilized an SNS community. Such a community can play an important role in providing a huge and excellent database for retrieval.
We have also utilized SNS Community to perform market research, trend watching,
questionnaire investigation, and user's demands. Probabilistically speaking, a small SNS community may have demands that are not large but steady, even for production.
We have employed an SNS community as a suitable medium for information sharing and exchanging. Namely, some members of social networks may be able to provide and/or point out both resources and know-how for reproduction in design education.
As described before, people's values may not be similar or identical. If so, from a global viewpoint, something which is unnecessary for someone may be necessary for others. In particular, recycling will become more and more popular in many fields and will probably be dominant. Products such as furniture have a relatively long lifetime of 10 years or more, so those resources may be useful and available for users of multiple generations. The problem is how to adapt to changes and variations in their tastes, favorites and trends.

4 ADVANTAGES OF PROPOSED DESIGN EDUCATION

This section demonstrates the characteristics and advantages of our proposed design education by showing practical reproduction processes of furniture as a sample of recycling resources. It also includes the workflow of real reproduction, an explanation of the detailed stages of reproduction, and the modeling of knowledge and resource discovery using an SNS community.

4.1 Workflow of Practical Reproduction

First of all, the workflow of reproduction of furniture can be summarized as follows. Such a workflow utilizes resources and know-how through SNS-based human relations. All the operations and functions are especially geared towards SNS and intended for users of the relevant SNS community.

1. Furniture Designing stage:
A) Analyzing needs/demands
B) Choosing kinds of furniture
C) Determining kinds of materials
2. Resource Finding stage:
A) Requesting information about furniture to be constructed
B) Requesting information about materials of the furniture
C) Searching resources for materials/furniture
D) Obtaining information about resources
E) Obtaining information about resources
3. Knowledge Collecting stage:
A) Requesting information how to fabricate, manufacture and/or process such resources
B) Searching knowledge for fabrication, manufacturing and/or processing
C) Obtaining knowledge about the above techniques
D) Accumulating knowledge like database
4. Furniture Constructing stage:
A) Selecting staff and/or work places
B) Pouring resources and know-how into the above factory (i.e. workplace with staff)
C) Reproducing the relevant furniture

The above workflow can be separated into 4 major stages, each of which includes some more detailed steps.

4.2 Reproduction Modeling for Proposed Design Education with Knowledge and Resource Discovery in SNS Community

We have utilized an SNS community in order to obtain "Requests", "Resources", "Knowledge" and "Announcement" for modeling the proposed design education. Our sample is to reproduce some furniture using knowledge and resources, which can be retrieved and discovered in the SNS community.
At first, we established human relations for demand analysis, trend retrieval, decision making, and so on. SNS is powerful and reliable for achieving our aim in a relatively short period. It is very useful and suitable for sharing and exchanging information in convenient ways.
Figure 4 shows such human relation realized in SNS community such as in Campus network environment. Of course such a community may
not be limited to a local and/or domestic community in the same campus (College and/or University). It can be spread more widely and enlarged like an SNS, for example, Facebook[5], Mixi[6] and/or Twitter[7].

Figure 4. Establishment of Human Relation through SNS community.

In the case of reproduction of furniture, it is very necessary to find useful resources efficiently. With the utilization of an SNS community, resource discovery can be carried out more easily than by other means, as shown in Figure 5.

Figure 5. Resource Discovery in SNS community.

If a user asks his colleagues in the SNS community whether convenient resources exist close to them or not, some colleague replies with his/her information about the corresponding resource. Of course, it is possible that others do not reply within a short period, or reply only that they know nothing about such resources. Suitable resources will probably be found in a short period through the human relations established with the SNS community. This is an example of "Resource Discovery through SNS community".
In the same manner, if a user wants to obtain some tools and know-how to re-produce furniture efficiently, he asks his colleagues, "Does anyone know where suitable tools are?" or "Does anyone have adequate information how to re-produce such kind of furniture?" This is also an example of "Knowledge Discovery through SNS community".
Figure 6 shows that a user has obtained a necessary tool from the SNS community and can use the relevant tool for design education in order to achieve his/her purpose in a short period.

Figure 6. Tools Discovery in SNS community.

If a user is a beginner in our proposed design education who cannot re-produce such furniture by himself, he may want to know how to (re-)produce good furniture with his resources. So he needs several kinds of knowledge to use resources and to handle tools effectively and efficiently. By means of the SNS community, such a user may obtain suitable know-how to achieve his/her purpose. He/She can re-produce furniture with his/her material discovered in SNS
community by means of utilization of know-how, which can also be discovered in the SNS community. Such a scheme is conceptually illustrated in Figure 7.

Figure 7. Knowledge Discovery in SNS community.

Even a beginner in design education may sometimes be brought face-to-face with related problems, and he/she must then turn to the SNS community and solve them through it. In the case of re-production of furniture as an example of design education, he/she actually re-produces furniture with strong support from the SNS community. With the help of good tools, suitable knowledge of how to manipulate them, and material discovered in a timely manner, the beginner can carry out his/her design education task of re-producing some kind of furniture.

Figure 8 shows that even a beginner can re-produce furniture by means of tools and knowledge discovered in the SNS community. He/she can accumulate not only the necessary techniques for tool manipulation but also knowledge about furniture reproduction, through practical experience of using the SNS community and of utilizing resources, tools and knowledge. If needs are not very few, further demands for furniture re-production may potentially occur. Such demands are steady and continuous, so that it may be necessary to prepare some market research and securement of materials, which are not only unused resources but also newly created ones.

Figure 8. Re-producing Furniture using Knowledge and Tools Obtained from SNS Community.

5 QUALITATIVE AND QUANTITATIVE EVALUATION FOR OUR PROPOSAL

This section reports two types of evaluation, namely qualitative and quantitative ones.

5.1 Qualitative Evaluation for Reproduction of Furniture as Recycling Resources

To evaluate the reproduction of furniture described above, we explain the following three items, namely cost-performance, feasibility study and human-relation based activity.

Cost-performance: Recycling of resources is positive, but the need to transport tools/resources/products is negative. The former has a good effect on ecology, cost saving, and environmental protection. Resources for furniture are mostly wood, so their recycling can reduce some of the impact of deforestation. Recycling also normally brings cost savings. The latter has a bad effect through carbon dioxide emissions from increased traffic and through the all-too-easy borrowing of tools and know-how. Emission of carbon dioxide must increase by means of
transporting resources and tools. If an imprudent person wants to participate in such an SNS community, he/she may frequently cause trouble by borrowing tools and know-how more easily than conventional approaches allow.

Feasibility study: Our viewpoint on reproduction of furniture represents the very best case. If some conditions are not satisfied, such reproduction cannot continue. For example, resources need to be supplied at a low cost (even after paying transport charges), and the SNS community needs to kindly provide know-how in response to relevant requests from users. In order to keep satisfying these conditions, we need to maintain and expand suitable human relations in the SNS community. This may be one of the most difficult problems.

Human-relation based activity: Utilization of SNS itself is a good idea, and it can be expected to make our lifestyles more fruitful. Although one person alone may not carry out a piece of work, many persons can perform such work probabilistically. Namely, activities based on human relations will amount to several times a single person's activity. A synergistic effect based on human relations may be expected from our practical experiences [8][9]. In any case, it is necessary to set a well-suited goal that contributes to the maintenance of human relations in the SNS community.

The above discussion has been limited to reproduction of furniture with recycled resources and tools/knowledge. However, our concept may be applicable to other targets of reproductive design education and, ultimately, suitable for practical design education schemes.

5.2 Quantitative Evaluation (PART I)

We asked 5 learners in reproductive design education to give feedback on the correlation between their SNS access behavior and their corresponding satisfaction level with using the SNS community to accomplish their work during reproductive design education. In other words, we asked whether it is useful for those 5 learners to get good satisfaction from Knowledge and/or Resource Discovery through the SNS community for their practical reproductive design education. Table 1 shows the relevant result.

Table 1. SNS access times and Satisfaction level of 5 learners.

Name         Project 1            Project 2
Learner ID   times*   level**     times*   level**
#01          2        2           3        4
#02          3        3           4        4
#03          5        4           5        5
#04          4        5           4        4
#05          2        2           3        2
(NB) times* = SNS access times, level** = Satisfaction level.

For example, as shown in Table 1, each learner receives two projects; his/her maximal number of SNS accesses is 5, and his/her satisfaction level is expressed from 1 to 5 (1: bad, 5: very satisfied). Table 2 shows the correlation between SNS access times by learner and satisfaction level.

Table 2. Correlation between SNS access times and Satisfaction level.

access   Satisfaction level
times    1   2   3   4   5
1        0   0   0   0   0
2        0   2   0   0   0
3        0   0   2   1   0
4        0   0   0   2   1
5        0   0   0   1   1

This may be the first evaluation of reproductive design education through an SNS community; it suggests that the more often a learner accesses the SNS, the higher the satisfaction level he/she obtains. We have therefore applied statistical analysis to Table 2 as the first quantitative evaluation of the effectiveness of SNS utilization during reproductive design education.

In order to confirm whether our approach is significantly effective, we perform a chi-square test ($\chi^2$-test) on a reduced version of Table 2, where the rows and columns that contain all zeros are
removed, as one form of statistical analysis. The procedure is as follows:

(1) Calculate $\chi^2$, namely
$$\chi^2 = \frac{(2 - 2 \cdot 2/10)^2}{2 \cdot 2/10} + \frac{(0 - 2 \cdot 2/10)^2}{2 \cdot 2/10} + \cdots + \frac{(1 - 2 \cdot 2/10)^2}{2 \cdot 2/10}$$
(2) Obtain $\chi^2 = 15.92$ from the above.
(3) The degrees of freedom of the reduced Table 2 are (4-1)*(4-1) = 9.
(4) From the $\chi^2$-distribution table, the chi-square percentiles with 9 degrees of freedom at the 5% and 10% significance levels are
$\chi^2_{0.05}(9) = 16.9$ and $\chi^2_{0.10}(9) = 14.7$, respectively.

From the above results of chi-square testing, we can state the following.

(a) Let H01 be the null hypothesis that the scheme of our proposal is not useful, tested at the 5% significance level ($\alpha = 0.05$). Since
$\chi^2 = 15.92 < \chi^2_{0.05}(9) = 16.9$,
H01 cannot be rejected at the 5% significance level. Therefore, it cannot be confirmed that Satisfaction level is dependent on SNS-access times. In other words, the former may be independent of the latter; that is, no significant dependence between utilization of SNS services and learner satisfaction is found at the 5% significance level ($\alpha = 0.05$).

(b) However, let H02 be the null hypothesis that the scheme of our proposal is not useful, tested at the 10% significance level ($\alpha = 0.10$). Since
$\chi^2 = 15.92 > \chi^2_{0.10}(9) = 14.7$,
H02 can be rejected at the 10% significance level. Therefore, it can be confirmed that Satisfaction level is dependent on SNS-access times in this case. In other words, the former may be dependent on the latter at the 10% significance level ($\alpha = 0.10$).

5.3 Quantitative Evaluation (PART II)

In order to investigate more precisely whether SNS-access times (namely, utilization of SNS services) and obtaining satisfaction are significantly dependent or not, after obtaining the above evaluation results, we decided to carry out a classroom-level questionnaire for a larger number of learners at the end of design education. We have tried to evaluate our proposed scheme quantitatively again through this second questionnaire, in order to establish the validity of our proposal on a larger number of users. Table 3 shows the results of the above questionnaire.

Table 3. Another comparison of SNS access times and Satisfaction level for 15 learners.

Name         Total     State
Learner ID   times*+   level**
S01          4         5
S02          3         3
S03          4         4
S04          5         3
S05          5         4
S06          5         5
S07          4         4
S08          3         3
S09          3         2
S10          5         4
S11          5         5
S12          4         3
S13          5         5
S14          5         4
S15          5         5
(NB) times*+ = SNS access times (if 5 or more, learners are requested to report only 5 for convenient statistical analysis), level** = Satisfaction level.

In the same way as in the previous analysis, we demonstrate a statistical analysis of the relation between the utilization level of Knowledge/Resource Discovery through the SNS community and the learner satisfaction level for our real Reproductive Design Education. Table 4 shows the correlation between access times for the SNS community and the relevant satisfaction level.

Table 4. Correlation between SNS access times and Satisfaction level.

access   Satisfaction level
times    1   2   3   4   5
1        0   0   0   0   0
2        0   0   0   0   0
3        0   1   2   0   0
4        0   0   1   2   1
5        0   0   1   3   4
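The step from the per-learner answers in Table 3 to the cross-tabulated counts in Table 4, and then to a reduced table without all-zero rows and columns, can be sketched as follows. This is a minimal illustration, not the authors' own tooling: the function names `cross_tabulate` and `drop_all_zero` are hypothetical, and the only data used are the 15 (times, level) pairs copied from Table 3.

```python
from collections import Counter

# (SNS access times, satisfaction level) for learners S01..S15, copied from Table 3.
TABLE3 = [
    (4, 5), (3, 3), (4, 4), (5, 3), (5, 4),
    (5, 5), (4, 4), (3, 3), (3, 2), (5, 4),
    (5, 5), (4, 3), (5, 5), (5, 4), (5, 5),
]


def cross_tabulate(pairs, levels=range(1, 6)):
    """Count learners per (access times, satisfaction level) cell.

    Returns a dict mapping each access-times value to a row of counts,
    one count per satisfaction level (hypothetical helper).
    """
    counts = Counter(pairs)  # missing cells count as 0
    return {t: [counts[(t, s)] for s in levels] for t in levels}


def drop_all_zero(table, levels=range(1, 6)):
    """Remove rows and columns that contain only zeros (hypothetical helper)."""
    levels = list(levels)
    keep_cols = [j for j in range(len(levels))
                 if any(row[j] for row in table.values())]
    rows = {t: [row[j] for j in keep_cols]
            for t, row in table.items() if any(row)}
    return rows, [levels[j] for j in keep_cols]


table4 = cross_tabulate(TABLE3)                 # 5 x 5 counts, as in Table 4
reduced, kept_levels = drop_all_zero(table4)    # rows 3..5, levels 2..5
```

Keeping the raw pairs as the single source of truth means the full and the reduced tables are derived rather than typed in by hand, which makes transcription slips easier to spot.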
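The $\chi^2$ procedure of Section 5.2 (expected count = row total * column total / N, then summing (observed - expected)^2 / expected over all cells) can be sketched in a few lines. This is a textbook Pearson statistic written for illustration, under the assumption that every cell of the reduced table enters the sum; it is not claimed to be the authors' exact calculation, and the function names are hypothetical. The counts are those of the reduced 3 x 4 table for the 15 learners (access times 3 to 5, satisfaction levels 2 to 5).

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi_square(observed):
    """Sum of (observed - expected)^2 / expected over all cells.

    Assumes no row or column total is zero (all-zero rows and columns
    have already been removed), so no expected count is zero.
    """
    expected = expected_counts(observed)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))


def degrees_of_freedom(observed):
    """(rows - 1) * (columns - 1) for a rectangular table."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


# Reduced table: rows are access times 3, 4, 5; columns are levels 2, 3, 4, 5.
REDUCED = [
    [1, 2, 0, 0],
    [0, 1, 2, 1],
    [0, 1, 3, 4],
]

expected = expected_counts(REDUCED)      # e.g. first cell is 1*3/15
dof = degrees_of_freedom(REDUCED)        # (3-1)*(4-1)
statistic = pearson_chi_square(REDUCED)  # compare with the critical value for dof
```

The statistic is then compared with the tabulated percentile for the given degrees of freedom; looking up that percentile is left to a distribution table, as in the text.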
Based on Table 4, the reduced correlation between SNS access times and the learners' Satisfaction level for Reproductive Design Education is given in Table 5, and its auxiliary parameters for statistical analysis are calculated and given in Table 6.

Table 5. Reduced Correlation between SNS access times and Satisfaction level for Table 4.

access     Satisfaction level
times      2   3   4   5   subtotal
3          1   2   0   0   3
4          0   1   2   1   4
5          0   1   3   4   8
subtotal   1   4   5   5   15

Table 6. Auxiliary parameters for statistical analysis of Table 5.

access   Satisfaction level
times    2        3        4        5
3        1*3/15   4*3/15   5*3/15   5*3/15
4        1*4/15   4*4/15   5*4/15   5*4/15
5        1*8/15   4*8/15   5*8/15   5*8/15

The procedure of the $\chi^2$-test for Table 5 with Table 6 can be expressed as below, in the same way as in Section 5.2 Quantitative Evaluation (PART I):

(1) Calculate $\chi^2$ using the parameters in Table 6, namely
$$\chi^2 = \frac{(1 - 1 \cdot 3/15)^2}{1 \cdot 3/15} + \frac{(2 - 4 \cdot 3/15)^2}{4 \cdot 3/15} + \frac{(0 - 5 \cdot 3/15)^2}{5 \cdot 3/15} + \frac{(0 - 5 \cdot 3/15)^2}{5 \cdot 3/15} + \cdots + \frac{(4 - 5 \cdot 8/15)^2}{5 \cdot 8/15}$$
(2) Obtain $\chi^2 = 14.72551$ from the above.
(3) The degrees of freedom of Table 5 are (3-1)*(4-1) = 6, because there are 3 rows and 4 columns.
(4) From the $\chi^2$-distribution table, the chi-square percentile with 6 degrees of freedom at the 5% significance level is
$\chi^2_{0.05}(6) = 12.5916$.

From the above results of chi-square testing, we can state the following.

Assuming that H+ [Satisfaction level of learners is independent of SNS-access times, namely utilization of SNS services] is a null hypothesis, it is demonstrated below:
$\chi^2 = 14.72551 > \chi^2_{0.05}(6) = 12.5916$, so that H+ can be rejected at the 5% significance level ($\alpha = 0.05$). Therefore, it can be confirmed that Satisfaction level of learners is dependent on SNS-access times; namely, utilization of SNS services and obtaining satisfaction of learners are significantly dependent.

Our proposed scheme for Reproductive Design Education is useful and effective for learners to perform Knowledge and Resource Discovery, even at the 5% significance level ($\alpha = 0.05$).

6 CONCLUSION

This paper describes a practical model of Reproductive Design Education utilizing services based on Knowledge and Resource Discovery through SNS community. It also explains in detail the characteristics and advantages of our proposed scheme for Reproductive Design Education. The paper illustrates a practical flow for the proposed Reproductive Design Education utilizing several kinds of services from SNS community, in comparison with the conventional design process in design education. Knowledge, Resources and Tools obtained from SNS community can realize a fruitful reproductive design education. In the case of furniture reproduction, our proposed Reproductive Design Education has brought important and significant values to learners as well as to their SNS community. Such values include resource recycling, tool sharing, energy saving, cost-performance, knowledge retrieving/mining and so on.

From the above discussion, this paper can be summarized as follows:
• Reproductive Design Education has provided the effect and evidence of recycling, ecology and cost saving.
• Reproduction of furniture itself, as a good example of the proposed Reproductive Design Education, can play a certain role in the utilization of services for Knowledge and Resource Discovery from SNS community.
• Reproduction, sharing and recycling with support from networks can be seen as a case study of Resource and Knowledge Discovery through SNS community.
• Qualitative and quantitative evaluation have been performed for a limited number of learners as well as for a larger number at classroom level.
• Results from the qualitative and quantitative evaluation allow us to consider it confirmed that our proposed scheme provides learners' satisfaction for Reproductive Design Education utilizing SNS community through Knowledge and Resource Discovery.

Our future plan is to provide more suitable educational schemas and practical models that schools/institutes can employ more fruitfully and smoothly.

Acknowledgement

The authors are thankful to Dr. Yoshio Moritoh, Professor of Kagawa Junior College, for his continuous support of their research. They are also grateful to Professor Toshiaki MORITA of Tokyo Zokei University, who has taught the authors since their school days.

REFERENCES

1. Syn, S. Y., Oh, S.: Why do social network site users share information on Facebook and Twitter? Journal of Information Science, vol. 41, no. 5, pp. 553-569 (Oct. 2015)
2. Kisilevich, S., Ang, C. S., Last, M.: Large-scale analysis of self-disclosure patterns among online social networks users: a Russian context. Knowledge and Information Systems, vol. 32, no. 3, pp. 609-628 (Sep. 2012)
3. Kwon, O., Wen, Y.: An empirical study of the factors affecting social network service use. Computers in Human Behavior, vol. 26, no. 2, pp. 254-263 (Mar. 2010)
4. Henderson, T., Hutton, L., McNeilly, S.: FRRIICT case study report: Ethics in online social network research. 2012. 15 pages. http://tethys.eaprs.cse.dmu.ac.uk/rri/sites/default/files/obs-case-study/henderson_frriict_report_0_0.pdf
5. https://www.facebook.com/
6. https://mixi.jp/
7. https://twitter.com/
8. Imai, M., Imai, Y., Hattori, T.: Collaborative design and its evaluation through Kansei engineering approach. International Journal of Artificial Life and Robotics (Springer), vol. 18, no. 3, pp. 233-240 (2013)
9. Imai, M., Imai, Y.: A Scheme of Resource Discovery in Reproductive Design Education. Proc. of The Fifth International Conference on E-Learning and E-Technologies in Education (ICEEE2016, Asia Pacific University of Technology & Innovation, Kuala Lumpur, Malaysia, September 6-8, 2016), pp. 68-76 (2016). ISBN: 978-1-941968-37-6, 2016 SDIWC
Assessment of Quality-of-Experience in Telecommunication Services
Demóstenes Z. Rodriguez, Renata L. Rosa, Rodrigo D. Nunes and Emmanuel T. Affonso

Department of Engineering Federal University of Lavras
Minas Gerais, Brazil
{demostenes.zegarra, renata.rosa}@dcc.ufla.br, {rdantasnunes, emmanuelcomp}@gamil.com
ABSTRACT

In recent years, the amount of research on Quality-of-Experience (QoE) has been increasing because this concept is used in several areas. In telecommunication services, subscriber loyalty depends on the users' QoE level. Thus, service providers need to know the users' real QoE to improve the services they offer. In this arena, the study of methods to assess QoE is very important. This paper presents the different approaches that compose the QoE concept, from technological, social and human aspects, differentiating it from other well-known concepts such as Quality of Service (QoS) and User Experience (UX). In order to assess users' QoE, subjective tests need to be conducted, and based on their results mathematical models are formulated to establish quality metrics. In this document, two case studies are presented. Firstly, the video streaming service is analyzed, in which the users' QoE is modeled considering pauses as a degradation factor, obtaining a video quality metric. As a second case study, research regarding voice quality is presented, in which voice calls in a cellular network are assessed.

KEYWORDS

QoE, MOS, quality metrics, QoS, telecommunication service, video streaming, cellular networks, voice and video quality.

1 INTRODUCTION

Nowadays, communication services and their applications are growing. Such offers have raised interest in providing ever higher quality services to subscribers. A subscriber who is using a product or service needs to feel and perceive good quality at reasonable rates. These concerns gave rise to the concept of Quality-of-Experience (QoE), which is very useful for service providers, because quality information can help to improve user satisfaction and, therefore, user loyalty.

In this context, it is very important that communication services, especially in wireless environments, consider the QoE approach. Hence, to determine the users' QoE, the perception of multimedia signal quality needs to be complemented with other criteria related to sensory processing, cognitive processes and psychological approaches.

In recent years, researchers have tried to improve existing Quality of Service (QoS) methodologies for evaluating communication services in order to provide a more comprehensive approach to user satisfaction assessment [1]-[4].

The concept of QoS is more related to technological aspects, such as network performance and transmission channel capacity, among others. In general, QoS deals with performance aspects of physical systems. The concept of Service Level Agreement is related to technological and economic aspects, because users pay more for a better service. As stated before, the QoE concept does not only cover technical, economic or social aspects; it is also related to human aspects, such as users' needs and preferences. As a consequence, the concept of QoE has gained popularity in different research areas, one of the most important of which is telecommunication services.

Thus, QoE was defined by IEEE in Standard 3333.1 [5] as the degree of delight or annoyance of the user of an application or service resulting from the fulfillment of his or her expectations with respect to the utility and/or enjoyment of the application or service.

Furthermore, the amount of research regarding QoE has grown considerably over the last 10 years. For instance, Table 1 presents the
number of citations of the words QoE and QoS in abstracts of papers registered in the IEEE database, together with the ratio between QoE and QoS. The period used for comparison is three years; only the most recent period covers two and a half years.

Table 1. Number of QoE and QoS Citations in Abstracts Obtained in the IEEE Database

Period of Reference (Years)   QOE   QOS    QOE/QOS
2002 - 2004                   4     3451   0.12%
2005 - 2007                   23    5328   0.43%
2008 - 2010                   235   6689   3.51%
2011 - 2013                   853   5291   16.12%
2014 - Jun. 2016              826   4051   20.39%

The application areas of the QoE concept are very broad and include communication and multimedia services, educational solutions, medical applications, business models and entertainment services, among others.

In this context, the main objective of this research is to demonstrate the relevance of the QoE concept and its applicability in different telecommunication services.

Two case studies are presented in this research, both of which are extended contributions of previous works [6]-[9]. Thus, the main contribution of this paper is to show how important the application of QoE is, and how existing metrics can be enhanced. The motivation for presenting these case studies is to show the different steps of the development process used to model objective quality metrics for communication services.

The first case study concerns the assessment of users' QoE in a video streaming service over HTTP/TCP, in which a video quality metric named VsQM is modeled [6], considering psychological approaches. The second case study deals with voice quality assessment during a phone call [7]-[9], in which a new quality indicator is proposed.

The remainder of this work is structured as follows. Section 2 presents an overview of the definition of QoE and its applications. Section 3 describes the methods for audio and video quality assessment. Section 4 introduces the first case study regarding the video streaming service. Section 5 presents the study about the assessment of phone call quality in cellular networks and its applications. Finally, Section 6 draws the conclusions of this research.

2 DEFINITION AND APPLICATIONS OF QUALITY OF EXPERIENCE

In this section, we first review different general concepts related to perception, quality, and experience, on an individual basis, in order to define the overall concept of QoE. Later, QoE is compared with the concepts of QoS and User Experience (UX). Finally, the application areas of QoE and their influence factors are introduced.

The terms perception and experience are defined in order to understand the QoE concept. The process of perception begins with the incidence of the respective stimulus on one or more human sensory organs. Perception is a conscious processing of sensory information to which humans are exposed, and involves two stages:
• Conversion of the stimulus from a sensory organ into a neural signal.
• Processing and transmission of neural signals in the central nervous system to the cortex.

Perception is influenced by events stored in memory. As a result, neural features that belong to the same object are associated. According to Cowan [10], Coltheart [11] and Baddeley [12], [13], different memory levels have been identified, each one with its role in the process of perception and its duration of storage. Such memories are:
• Sensory memory: the peripheral memory that stores short representations, lasting between 150 ms and 2 s.
• Working memory: stores information lasting up to 10 s [9]. It is also known as short-term memory.
• Long-term memory: stores information for long times, even years or a lifetime.

The formation of the quality concept may be viewed as a parallel high-level cognitive process associated with the process of experiencing. The reflection can be triggered by an external task to assess what has been experienced, during or after the process of experiencing. It is important to note that the emotional state of the person, as well as his or her personality, plays an important role in the procedures of quality assessment. Based on the
concepts described, the term quality refers to feelings of individual perception, sensory perception and concepts that occur in a particular situation, such as when a person experiences a multimedia service.

According to Jackson [14], the word Qualia can be seen as a property of experiencing something that cannot be shared by verbal or technical descriptions, thus being an individual and subjective experience. Martens and Martens [15] discuss two existing approaches to understanding quality: (1) objective, rational and oriented to a product and (2) perceptual and subjective. The first approach focuses on the characteristics and properties of an item (product or service) in terms of quality, while the second approach requires human evaluation considering terms of "assessment of excellence". Reeves and Bednar [16] define quality in a more intrusive manner, for instance, "the form that a product or service meets or exceed customer expectations"; this definition comes from the marketing literature. For these reasons, the definition of quality is based on standards, such as the one given by the International Organization for Standardization (ISO) in the ISO 9000:2000 standard [17]: "quality is the ability of a set of features of a product, system or process to meet certain customer requirements, consumers and other stakeholders".

A widely accepted definition of the QoE concept, stated by ITU-T in Recommendation P.10, is: "QoE is the general acceptance of an application or service, as perceived subjectively by the end user." From this definition we can see two points: (1) it includes the full performance of the end-to-end system, and (2) it can be influenced by user expectations and context.

In recent years there have been some criticisms of this definition of QoE. The term "acceptability" included as a basis for QoE is not the most appropriate, as stated by Moller [18] and other scientists, because acceptance is the result of a decision which is partially based on Quality of Experience. A more service-oriented view argues that quality is based on the comparison of perceptions with expectations. Aspects of expectation have been addressed in a more comprehensive manner in the context of market research, considering the role of the person as a customer. Perception can refer both to the perception during the encounter with the service and to the concept of QoS related to a particular company in terms of satisfaction or dissatisfaction of customers.

Another author defines QoE as "the characteristics of sensations, perceptions and views of people about a particular service or product; these characteristics can be good, fair or bad" [5]. Also, one should emphasize that to determine the users' QoE, the perceived quality of the multimedia signal needs to be complemented by other criteria related to sensory processing, the human cognitive process and psychological approaches.

2.1 Quality of Experience and Quality of Service

The difference between the concepts of QoS and QoE is shown in Figure 1, in which the QoS must be ensured by the network providers by managing parameters such as delay, jitter, bandwidth and packet loss rate; these parameters must be measured and controlled to guarantee a better service. While QoS is more related to technical aspects, QoE is concerned with the customer's experience of the service; therefore, QoS can be part of the QoE definition. Other features also need to be considered, such as user preference and service costs.

[Figure 1 labels: User, User equipment, Network, Application Server, QoS]
Figure 1. Differences between QoE and QoS in a mobile communication.

Human aspects, such as user preference and service preference, are related to the users' QoE. Studies [19] consider the user preference for video content, in which the user makes a more critical assessment, giving a low quality score,
depending on his or her preference regarding the video content.

2.2 Quality of Experience and User Experience

QoE and User Experience (UX) are new research topics and they have become popular in recent decades. They are also related to development studies and the validation of products and services. In 2010, the ISO introduced its new standard ISO 9241-210 [20], which included the following definition of the term User Experience: "The perceptions and responses of a person resulting from the use of a product, service or system". This standard highlights four UX features:
• UX presents temporal and dynamic aspects; therefore, UX changes over time;
• UX depends on the context of each situation and experience lived by someone, and this experience can be unique;
• UX is considered subjective and individual;
• The focus of UX is on the enjoyment of using a new technology, instead of highlighting problems and difficulties when dealing with a computer.

Studies show that usability methods are not sufficient to measure UX and, consequently, new methods of measurement have been studied [21]. Also, aspects like qualities and emotions are not in the scope of Human-computer interaction (HCI).

While QoE has its origin in the telecommunication area, UX has its origin in the study of HCI. The origin and evolution of QoE have always focused on industry, in order to avoid users' frustration when they are using certain systems. Also, economic aspects and customer loyalty are related to QoE. This economic dimension is less prominent in the literature regarding UX. In this context, it is concluded that QoE is much closer to a global customer experience than UX [22].

Another difference between UX and QoE is that UX is more focused on the user. Roto et al. [23] emphasize that UX is not related to technology. QoE is considered mostly dependent on QoS [24]. QoE is focused on the application, having a more practical view. Over the past years, several more holistic conceptual frameworks have begun to be presented in the literature, for instance, a taxonomy focused on QoE [25], the Gr@sp-QoE-framework [26] and the user-centered approach to Quality of Experience [27].

QoE is often represented in terms of the index named Mean Opinion Score (MOS). The MOS scale was already used before QoE became a research topic [28].

2.3 Quality of Experience and its application areas

For example, IPTV is a service for which several studies include QoE [29]-[31]. The area of Cellular Networks also needs to be concerned with the users' QoE; studies such as [32] propose a new Voice Quality Indicator, and [33] evaluates a voice quality index for mobile systems. Considering all these services, the perceived QoE becomes an indispensable object of study. There are many application areas that consider QoE, among them:
• Telecommunication services: covering a large variety of multimedia communications and traditional fixed and mobile networks.
• Assistive technology: whose goal is to design and develop assistive and rehabilitative devices for people with disabilities to improve their QoE.
• Cloud Computing [34]: this new technology must be transparent for the user; the capacity to share, transfer and collaborate with the cloud must maximize the users' QoE.
• Multimedia learning [35]: the use of multimedia contents must be easy and natural with cognitive resources, facilitating access to information.
• Games: the quality of game software directly affects the users' QoE.
• Sensory Experience [36]: QoE must be multi-dimensional and multi-sensorial; users need to have the impression of being part of the multimedia asset. For this, sensory effects (light, wind, vibration) and specific devices and effects (air vaporizer, motion chairs) can be used.

2.4 Influence Factors of Quality of Experience
The task of determining the user's QoE in telecommunication services is very complex because of the number of Influence Factors (IFs) involved. These factors can be classified into the following categories: human, system and context [37]. Human IFs relate to the user's subjectivity, such as user preferences [19] and the characteristics of the Human Visual System (HVS). System IFs are related to the characteristics of end-user devices, such as display screen or speaker, processing and energy capacity, and also to the characteristics of the network transmission [38]. Context IFs are concerned with the space and time in which a service is used, considering social and economic aspects; for instance, the cost of a phone call, or the characteristics of the physical environment in which a user watches a video. It is worth noting that all of the IFs are interrelated, and each one adds some degradation to the global user QoE.

3 AUDIO AND VIDEO QUALITY ASSESSMENT METHODS

In this section, the most relevant methodologies to assess the quality of audio and video signals are briefly described.

3.1 Quality assessment of Audio Signals

Audio quality assessment methods can be classified into two main categories: subjective methods and objective methods. Subjective methods are based on the users' evaluation of the content, and objective methods are based on algorithms that use technical parameters related to network performance or content characteristics.
Subjective methods are also classified into presence and remote tests, which in turn are separated into utilitarian and analytical methods.
Moreover, objective methods can be classified as intrusive, non-intrusive and parametric methods.
Intrusive methods require a reference for the evaluation; thus, an original file is used, and the audio that has arrived at its destination is compared to the original one to evaluate the quality of an audio transmission. One of the most accepted intrusive methods is the ITU-T P.862 recommendation [35], better known as Perceptual Evaluation of Speech Quality (PESQ), because of its high correlation with subjective tests.
The P.862 algorithm measures the effects of voice distortion and noise on speech quality. Basically, the algorithm compares an original signal X(t) with a degraded signal Y(t) that is the result of passing X(t) through a communications system [39]. In the first step of PESQ, the delays between the original input and the degraded output are computed and an alignment algorithm is performed. Later, the following stages are implemented: a level alignment to a calibrated listening level, a time-frequency mapping, frequency warping, and compressive loudness scaling.
The ITU-T P.862 recommendation only evaluates the effects of one-way speech distortion and noise on speech; therefore, delays, echo, loudness loss and other impairments related to two-way conversations are not reflected in the PESQ scores. The scenarios for which PESQ has demonstrated acceptable performance are: speech input levels to a codec, transmission channel errors, packet loss and packet loss concealment, bit rates if a codec is a multi-rate codec, transcoding, environmental noise at the transmission side, varying delay in listening-only tests, and different coding techniques, such as waveform codecs, code-excited linear prediction (CELP), and adaptive multi-rate (AMR), among others.
On the other hand, non-intrusive methods are based only on the degraded audio file; they do not require a reference file.
It is important to note that non-intrusive methods have a lower correlation with subjective tests than intrusive methods. However, non-intrusive methods are recommended for real-time quality evaluation, since the only information needed is the audio itself. This characteristic is very important considering that, in most online streaming applications, the original audio is not available.
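The first step of an intrusive method such as PESQ, described earlier in this section, is to estimate the delay between the original and the degraded signal before comparing them. The following is only a minimal sketch of that idea using plain cross-correlation; it is not the alignment procedure of ITU-T P.862, and the function names, the `max_lag` parameter and the sample values are assumptions made for illustration.

```python
def estimate_delay(reference, degraded, max_lag):
    """Return the lag (in samples) at which `degraded` best matches `reference`."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Correlate reference[i] with degraded[i + lag] over the overlapping part.
        n = min(len(reference), len(degraded) - lag)
        if n <= 0:
            break
        score = sum(reference[i] * degraded[i + lag] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def align(reference, degraded, max_lag):
    """Drop the estimated delay from `degraded` and trim both signals to equal length."""
    lag = estimate_delay(reference, degraded, max_lag)
    shifted = degraded[lag:]
    n = min(len(reference), len(shifted))
    return reference[:n], shifted[:n], lag


# Made-up example: the degraded signal is the original delayed by 3 samples
# and attenuated, as a transmission system might do.
x = [0.0, 1.0, -1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
y = [0.0, 0.0, 0.0] + [0.8 * v for v in x[:5]]
x_aligned, y_aligned, delay = align(x, y, max_lag=5)
print(delay, len(x_aligned), len(y_aligned))
```

Once the two signals are aligned, a comparison stage can work sample by sample (or frame by frame) on `x_aligned` and `y_aligned`; the perceptual stages listed above are outside the scope of this sketch.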
The most popular non-intrusive objective method is ITU-T P.563, which predicts the speech quality of a degraded signal without a given reference speech signal [40].
The parametric method uses physical measures of the system, including the network, codecs and acoustic parameters. The E-Model is a parametric method that is standardized as ITU-T Recommendation G.107 [41]. This metric tries to predict the audio quality by analyzing some parameters of the network transmission. Although these metrics are considered state-of-the-art standards, some studies [42]-[44] suggest different algorithms to improve the voice quality assessment metrics.
The subjective test methods for audio quality assessment are used in tests conducted under laboratory conditions, in which the instructions are explained by a supervisor to the assessors. Assessors listen to different audio files and assign an adjectival score using different scales. The most popular scale is the five-point Mean Opinion Score (MOS) scale described in the Absolute Category Rating (ACR) method, which is presented in Table 2.

Table 2. Absolute Category Rating (ACR)
Score   Estimated Quality
1       Bad
2       Poor
3       Fair
4       Good
5       Excellent

3.2 Quality assessment of Video Signals

In general, the video quality assessment methodologies can be classified into two groups: subjective and objective methods.
Nowadays, the most popular subjective test methodologies for video quality assessment are stated in the ITU recommendations ITU-R BT-500 [45] and ITU-T P.910 [46]. These subjective tests are conducted in a laboratory environment with special requirements concerning lighting and acoustic conditions. In order to conduct the subjective tests, a supervisor explains the test instructions to the assessors. Then, they score each video assessed using a MOS scale, such as the scale presented in Table 2 or another quality scale.
Several objective methodologies for video quality assessment are focused on determining the quality of human visual perception considering videos with spatial impairments [47]-[48] or temporal interruptions [49]-[52]. In the last six years, some studies [53]-[59] have investigated the impact of video resolution changes on the user's QoE. Hence, many communication service providers measure user satisfaction to find a way to improve their services, and other solutions were developed in order to improve existing image and video quality metrics [60]-[63].
Also, measuring the quality of audio and video multimedia services is very important, and some other studies [64]-[69] are dedicated to describing impairments, quality models, components, and metrics. In the next section, additional video quality metrics are treated.

4 FIRST CASE STUDY: QOE IN VIDEO STREAMING SERVICE

This case study is based on a previous work [6] and its goal is to show how a new metric that quantifies the user's QoE in a video streaming session is modeled. Hence, our motivation to present this case study is to show the development process of a video quality metric, considering the different steps involved, such as the identification of key degradation factors in this specific service, initial subjective tests, mathematical model definition, technical implementation, and finally, the validation tests.
It is worth noting that the proposed metric can be applied in many realistic applications, such as adaptive video streaming, in which the video coding characteristics depend on the video quality assessed in conjunction with other parameters, such as the network capacity at the end-user device.
Nowadays, most video streaming services run over the HyperText Transfer Protocol (HTTP), which uses the Transmission Control Protocol (TCP). In order to minimize the effects of network congestion, TCP implements different congestion control mechanisms [70]-[72]. When TCP detects packet losses in the network, the number of transmitted IP packets decreases, and if this new rate is smaller than the playback rate, the player consumes all the buffered information and then
enters into a rebuffering process. During this rebuffering period, no information is displayed, and this negatively affects the user's QoE.
A customized player was implemented to extract information regarding the player buffer states during a video streaming session. These states are the application layer parameters and they indicate the number of pauses and their frequency, the mean pause length and the temporal location.
In subjective tests of video quality, the human perception of the quality of the tested material is quantified by a score, and the global quality of the service is evaluated according to these results. Results of subjective tests are very important, because product improvements are based on users' requirements [73].
It is important to note that objective metrics such as Mean Squared Error (MSE), Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM) [47], Video Quality Metric (VQM) [48] and algorithms based on Region of Interest [74] or visual attention maps [75], [76] are not indicated for video streaming running over TCP, because they do not take into account the characteristics of degradations in the temporal domain. Some solutions based on application parameters consider the temporal degradations, specifically the number and duration of the temporal interruptions [49], [50]. In this case study, subjective test results from experimental tests are related to the application layer parameters and, as a result, the metric named Video streaming Quality Metric (VsQM) was established. This approach considers the temporal location of each temporal interruption or pause. Therefore, QoE does not only depend on the number of pauses and their mean duration, as stated in [49]-[59]. For video streaming services over TCP, the temporal location of each pause must be considered.
Moreover, the proposed metric is used in a practical scenario, in which a feedback mechanism sends the quality metric score from the end-user device to the video server. This scenario can be used for different purposes, such as monitoring, reports or as input to a Rate Determination Algorithm (RDA) to improve the user satisfaction or the network performance.

4.1 Subjective Test Methods for assessing Video Quality

Nowadays, subjective test methodologies do not consider the effects of temporal interruptions, such as pauses. In particular, in video streaming services over HTTP/TCP, degradations do not happen in the spatial domain, because TCP guarantees packet delivery.
In general, video impairments can appear in the temporal or spatial domain. In Figure 2, the temporal and spatial impairments are presented using TCP and UDP as transport protocol, respectively. It can be observed that the user's QoE is affected in different ways.

Figure 2. Effects of transport protocols in the video quality: (a) Impairment using TCP. (b) Impairment using UDP in a packet loss scenario.

4.2 Limitation of Current Subjective Test Methods for assessing Video Streaming Quality

As stated before, most of the subjective test methods are described in the ITU recommendations ITU-R BT-500 [45] and ITU-T P.910 [46]. Furthermore, other works compare these subjective methods [77]-[81].
ITU-R BT-500 describes the following methods: Double Stimulus Impairment Scale (DSIS), Double Stimulus Continuous Quality Scale (DSCQS), Single Stimulus Continuous Quality Evaluation (SSCQE) and Simultaneous Double Stimulus for Continuous Evaluation (SDSCE).
In the DSIS method, the test sequences are presented in pairs: the first stimulus presented in each pair is always the source reference, while the second stimulus is the impaired video [82].
The DSCQS method requires the evaluation of two test videos. One video of each pair is unimpaired, while the other might or might not contain impairment; the assessors do not know which video is the reference. Also, the position of the reference picture is changed in pseudo-random order [82].
The SSCQE methodology is recommended for longer video sequences. The original video is not used as reference, in order to reproduce viewing conditions that are similar to real situations [45]. The assessors give a score at regular intervals during the overall video; therefore, there is not a single global score for a video test sequence.
SDSCE was developed from the SSCQE method, with some variations in the presentation of video sequences and in the rating scale [45].
The ITU-T P.910 recommendation introduces the following methodologies: Absolute Category Rating (ACR), Absolute Category Rating with Hidden Reference (ACR-HR), Degradation Category Rating (DCR) and the Pair Comparison method (PC).
The ACR methodology is a category judgment, in which the test sequences are presented one at a time and are rated independently on a category scale. The ACR-HR methodology is a variant of ACR, in which the assessor does not know which is the original video sequence [46].
In the DCR methodology, the first stimulus presented in each pair is always the source reference, while the second stimulus is the same source with some impairment [46].
In the PC methodology, the video sequences for testing are also presented in pairs; however, both of them represent different impairments [46].
Another method, named Subjective Assessment Methodology for Video Quality (SAMVIQ), is introduced in [83], [84] and it considers some variants of the ITU-T methodologies. Table 3 presents the main parameters of the previously described methodologies. The parameters used to compare these methods are: (1) the video length; (2) the explicit reference, in which the assessor knows which are the original and the impaired video; (3) the hidden reference, in which the original video is presented but the assessor does not know which one it is; (4) simultaneous stimuli, in which two videos are presented at the same time; and (5) continuous quality scale, in which the assessor scores several times during a single video sequence.
As presented in Table 3, the focus of all these methods, except SSCQE, is to assess the effects of spatial degradation, because they consider a video length of around 10 seconds. These methods are more useful to evaluate the performance of a particular encoder. Thus, the effects of degradation caused by pauses cannot be properly evaluated by these methods. Also, it is important to note that, in some cases, the length of pauses is almost equal to the total video length.

Table 3. Parameters of Video Quality Assessment Methods
Method   Video Length (s)   Explicit Reference   Hidden Reference   Simultaneous Stimuli   Continuous Quality Scale
DSIS     10                 Yes                  No                 No                     No
DSCQS    10                 No                   Yes                No                     No
SSCQE    300                No                   No                 No                     Yes
SDSCE    10                 No                   No                 Yes                    Yes
SAMVIQ   10                 Yes                  Yes                No                     No
ACR      10                 No                   No                 No                     No
ACR-HR   10                 No                   Yes                No                     No
DCR      10                 Yes                  No                 No                     No
PC       10                 No                   No                 No                     No

As stated before, the SSCQE methodology is indicated for longer videos. However, giving a score during a pause would not be reasonable, because there is no visual information. Also, there is no spatial degradation in the video frames, because the streaming service uses TCP.

4.3 The Proposed Subjective Test Methodology

In the subjective tests performed in this case study, the variability of the cognitive processes of the assessors is considered. They have different cognitive characteristics, such as attention, speed in information processing, short-term and long-term memory, prior knowledge about technology, and even preferences of video content.
The main differences of the proposed subjective test methodology in relation to those described in Table 3 are the length of the test video sequences and the global score given by the assessors at the end of the video test. Therefore, the proposed method is in accordance with current video streaming services.
The subjective video tests were conducted in a laboratory using the recommendations introduced in Table 4 [6]. These recommendations try to establish a realistic environment of the video streaming service.
The results demonstrated that a metric for
assessing video quality in a streaming service has to consider the temporal location of each pause in the video. Based on this criterion, several test scenarios, or impairment videos, were built to create pauses at different instants of the video and with certain durations.

Table 4. Considerations for Conducting Subjective Tests
Item  Considerations
1     In order to preserve the temporal effect of pauses, video sequences are longer than 10 seconds. Video sequences of two and four minutes were chosen, taking as a reference the top videos in the most popular video-sharing service.
2     Assessors watch the videos according to their preference, and as many times as they deem necessary.
3     Considering the variability of attention of assessors, the instructions were specific. Thus, assessors know that video degradation is only due to the presence of pauses.
4     The assessors had different speeds in processing information, hence no time limit to score is considered. Also, the tests were performed individually.
5     Considering the characteristics of the assessors' memory, they could watch the videos as many times as each of them considered necessary.

4.3.2 The Proposed Video Quality Model

Firstly, the concept of video temporal segmentation is introduced, and the following temporal segments are defined: (a) segment A, initial video segment; (b) segment B, first intermediate segment; (c) segment C, second intermediate segment; and (d) segment D, final video segment.

The proposed metric is named VsQM and it is determined by the following parameters: number of pauses, pause length and weight of the temporal segments.
Figure 3, adapted from [6], helps to understand the proposed metric, for a video playback time of TD seconds. In this scenario, four segments were established, and six pauses of different durations were distributed randomly. The number of segments could be increased, but to calculate the degradation weight of each segment, more test video sequences would be necessary.

Figure 3. Parameters used to determine the VsQM metric [6]. (Vertical axis: temporal segment weight, WA to WD; horizontal axis: time (s), with pause instants and segment limits TA, TB, TC and TD.)

Then, the VsQM metric is modeled as follows (1) [6]:

$$VsQM = \sum_{i=1}^{k} \frac{N_i L_i W_i}{T_i} \qquad (1)$$

Where: N_i is the number of pauses; L_i is the average length of pauses, in seconds, that occur in the same temporal segment; W_i is a weight factor which represents the degradation of each segment; T_i is the duration of each segment; k is the number of temporal segments; in this work four segments were considered.
The results of the subjective video tests [6] made it possible to determine the weights of each segment, represented by W_A, W_B, W_C and W_D.
Also, the VsQM values were mapped to a 5-point scale using an exponential function. VsQM on the MOS scale is named VsQM_MOS, which is described in [6] and presented in (2).

$$VsQM_{MOS} = C \exp\left(-\sum_{i=1}^{k} \frac{N_i L_i W_i}{T_i}\right) \qquad (2)$$

Where: VsQM_MOS is the MOS index expressed on a 5-point scale, C is a constant for scaling purposes, and the other variables are the same as described in (1).
Some studies [85], [86] state that the exponential function is the most correlated with subjective test results.
In this case study, 20 different test scenarios were performed. The result of each test scenario is represented by a MOS index; for example, the
following equation corresponds to scenario 1 (VsQM_MOS-1):

$$\ln(VsQM_{MOS-1}) = \ln(C) - \frac{W_A N_A L_A}{T_A} - \frac{W_B N_B L_B}{T_B} - \frac{W_C N_C L_C}{T_C} - \frac{W_D N_D L_D}{T_D} \qquad (3)$$

Considering the 20 scenarios and (3), an overdetermined linear system with five unknowns (ln(C) and the four weights) and 20 equations was obtained. To solve this system, the least squares method, specifically the pseudo-inverse, was used. Here, C is a constant and W_X is the weight of temporal segment X to be determined. Also, for each scenario, the variables MOS_X, N_X, T_X and L_X are known. This linear system is represented by:

$$\begin{bmatrix} 1 & t_{1,2} & \cdots & t_{1,5} \\ 1 & t_{2,2} & \cdots & t_{2,5} \\ \vdots & \vdots & & \vdots \\ 1 & t_{20,2} & \cdots & t_{20,5} \end{bmatrix} \begin{bmatrix} \ln(C) \\ W_A \\ W_B \\ W_C \\ W_D \end{bmatrix} = \begin{bmatrix} \ln(VsQM_{MOS-1}) \\ \ln(VsQM_{MOS-2}) \\ \vdots \\ \ln(VsQM_{MOS-20}) \end{bmatrix} \qquad (4)$$

In which: t_{1,2} to t_{1,5} represent the first scenario; t_{2,2} to t_{2,5} represent the second scenario, and so on.

4.3.3 Test Scenario and Application

A video server, a video client and a network emulator were used in the test scenario. The network emulator is used to insert network impairments. The HTTP/TCP protocols were used.
Different impairments, such as reduction of bandwidth and packet loss, were considered. These impairments are responsible for pauses with different lengths.
In order to monitor the buffer behavior, a customized player that captures all the events related to buffering and playing status was used, and it is described as follows [6].

4.3.3.1 Customized Player

The buffer states allow the following parameters to be measured: (a) number of pauses; (b) length of each pause, which corresponds to the duration of the rebuffering state; (c) frequency of pauses; and (d) temporal location of each pause.

Figure 4. Events traced on the customized player (R: Rebuffering status and P: Playing status). (Vertical axis: data stored at buffer, with playing and buffering states; horizontal axis: time in seconds, with events t1, t2, t3 and segment limits TA, TB, TC and TD.)

Subjective test results show the relevance of the temporal location parameter; for instance, five pauses at the beginning of the video affect the video quality in a different way than five pauses at the end of the video.

4.3.3.2 Video Data Set

As stated before, twenty different impairment models were built for each video content type considered. The content types used were sport, news and documentary; thus, in total 60 impaired videos were created, each following one of the 20 impairment models; five of these impairment models (S1, S2, S3, S4 and S5) are depicted in Figure 5, which is adapted from [6].
The main characteristics of the three original videos are: video and audio format following the H.264/AAC standards, spatial resolution of 640x360, temporal resolution of 30 fps and video length of 240 seconds. Figure 6 presents a snapshot of the content information of each original video.
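The chain from player events to equations (1) to (4) can be sketched in a few lines of code. The block below is an illustrative sketch only, not the implementation used in [6]: the function names, the event format (a list of `(time_s, state)` pairs with state "R" or "P", as in Figure 4), the segment limits and every numeric value are assumptions. The negative sign in the exponent follows from the requirement that more pauses lower the MOS while the weights are positive. The least squares step solves the normal equations, which gives the same result as the pseudo-inverse when the system matrix has full column rank.

```python
import math


def pauses_from_events(events):
    """Turn (time_s, state) events, state "R" or "P", into (start_s, length_s) pauses."""
    pauses, pause_start = [], None
    for time_s, state in events:
        if state == "R" and pause_start is None:
            pause_start = time_s                      # rebuffering begins
        elif state == "P" and pause_start is not None:
            pauses.append((pause_start, time_s - pause_start))
            pause_start = None                        # playback resumes
    return pauses


def segment_stats(pauses, segments):
    """For each segment (start_s, end_s), return (N_i, L_i, T_i) as used in equation (1)."""
    stats = []
    for seg_start, seg_end in segments:
        lengths = [length for start, length in pauses if seg_start <= start < seg_end]
        n = len(lengths)
        mean_length = sum(lengths) / n if n else 0.0
        stats.append((n, mean_length, seg_end - seg_start))
    return stats


def vsqm(stats, weights):
    """Equation (1): sum over segments of N_i * L_i * W_i / T_i."""
    return sum(n * l * w / t for (n, l, t), w in zip(stats, weights))


def vsqm_mos(stats, weights, c):
    """Equation (2): C * exp(-VsQM)."""
    return c * math.exp(-vsqm(stats, weights))


def solve_linear(a, b):
    """Gauss-Jordan elimination with partial pivoting for a square system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_weights(scenario_stats, mos_scores):
    """Least squares fit of ln(C) and the segment weights, one row of (4) per scenario."""
    rows = [[1.0] + [-n * l / t for n, l, t in stats] for stats in scenario_stats]
    targets = [math.log(mos) for mos in mos_scores]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    ln_c, *weights = solve_linear(ata, atb)
    return math.exp(ln_c), weights


# Synthetic, made-up data: a 240 s video split into four 60 s segments.
segments = [(0, 60), (60, 120), (120, 180), (180, 240)]
true_c, true_weights = 4.5, [1.5, 1.0, 0.8, 0.6]      # illustrative values only
scenarios = [
    [],
    [(10, "R"), (16, "P")],
    [(70, "R"), (76, "P")],
    [(130, "R"), (136, "P")],
    [(200, "R"), (206, "P")],
    [(5, "R"), (9, "P"), (30, "R"), (38, "P"), (150, "R"), (153, "P")],
]
all_stats = [segment_stats(pauses_from_events(ev), segments) for ev in scenarios]
mos_scores = [vsqm_mos(stats, true_weights, true_c) for stats in all_stats]
c_hat, w_hat = fit_weights(all_stats, mos_scores)
print(round(c_hat, 3), [round(w, 3) for w in w_hat])
```

Because the synthetic MOS values are generated from the model itself, the fit recovers the chosen constant and weights up to rounding error; with real subjective scores the system would be inconsistent and the least squares solution would only minimise the residual.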
Figure 5. Test Scenarios used as Reference. (Test scenarios S1 to S5 over time in seconds, with segment limits TA, TB, TC and TD.)

Figure 6. Snapshots of the three video content types used as test material (documentary, news and sport).

As can be observed in Figure 6, the snapshots of the documentary and sport video content types present three circles at the center of each figure. These circles were used to represent a temporal interruption or pause. In total, 60 test sequences were created, each one with a different number and temporal distribution of pauses. These videos were stored in the personal computers used in the experimental tests.

4.3.3.3 Service Application Scenario

Figure 7, adapted from [6], introduces a practical scenario, in which the VsQM metric is sent from the end-user device to the service provider using a feedback mechanism.
In this work, the VsQM value is obtained automatically in a pre-defined time period. Depending on this period, it is possible to use VsQM as input to an RDA; thus, the number of users utilizing the service can be increased.
For non-real-time applications, the video quality metric can be used to prepare reports or to perform operation and maintenance tasks. The feedback mechanism was implemented using a socket interface [87].

Figure 7. Application Scenario with Feedback Mechanism for the VsQM metric.

4.3.4 Results and Final Considerations

In the subjective tests, 96 evaluators participated, who reported having no vision problems. Each video had at least fifteen MOS scores. The tests were performed in the same laboratory environment.
To analyze the results, the average MOS value of the three video content types for the same scenario was considered. The values of the constant C and of the temporal segment weights WA, WB, WC and WD were obtained. Figure 8 shows the weight factor values [6].

Figure 8. Weight of Temporal Segments: WA, WB, WC and WD. (Vertical axis: weight of temporal segments, from 0.0 to 1.6; horizontal axis: temporal segments.)

It is important to note that pauses at the beginning of the video have a higher negative effect on the user's QoE.
Figure 9 shows the relation between the proposed metric [6] and the subjective test results. The exponential model fits the data well, because the maximum error obtained was 0.013 on the 5-point MOS scale and a Pearson Correlation Coefficient
of 0.96.
Hence, the VsQM_MOS metric can be used in real video applications, such as Dynamic Adaptive Streaming over HTTP (DASH) [88], in which the video resolution transmitted depends on the network capacity at the end user; these algorithms can be improved with other parameters, such as a video quality index [89]-[92].

Figure 9. Relation between Subjective MOS results and VsQM_MOS estimation.

Subjective test results of video quality showed the relevance of considering the temporal location parameter in a video quality metric. It was demonstrated that pauses at the beginning of the video have a higher negative effect on the user's QoE than pauses in the intermediate and final parts of the video. Furthermore, the concept of the effect of recent memory is not applicable in the video streaming service.
The VsQM was determined considering the temporal location parameter, and it also considers the number of pauses in a specific video segment and the duration of each pause. Results show how the MOS index values, calculated from VsQM_MOS, change if the temporal locations of pauses are considered or not. Furthermore, a function to map the proposed metric values onto a 5-point MOS scale was shown.

5. SECOND CASE STUDY: QOE ASSESSMENT OF VOICE CALL SERVICE IN CELLULAR NETWORKS

This case study is based on the previous works [8], [9] and it intends to demonstrate the relevance of considering the voice quality index to improve the user's QoE. In a voice call service, the user's QoE depends on different factors [98], mainly the perceived voice quality at the end-user device, and also the service rate.
Nowadays, in cellular networks there are several services, such as video conference, location-based services, social networks and other applications. However, the greatest share of the revenue of cellular operators corresponds to voice services. Although mobile operators are supervised by national regulatory agencies of telecommunications services, voice services do not have acceptable quality in all the geographic areas where the cellular operator provides the service. Operation tasks, such as drive tests, generate high costs for the cellular operator, and these tasks are not enough to discover network coverage problems.
This problem occurs mainly because the quality indicators, or KPI (Key Parameter Indicators), used by regulatory agencies do not reflect the real user satisfaction. These KPI parameters are not related to the voice signal quality. Specifically, in Brazil, the National Agency of Telecommunications (ANATEL) supervises the performance of mobile operators with 12 indicators [93], and none of them considers the user's QoE. Thus, there is no KPI that monitors the voice quality after the voice call has been initialized.
In this case study, a network topology to assess voice quality is presented. This network topology is based on the 3GPP project called Minimization of Drive Tests (MDT) [94]. On the other hand, there are studies [95], [96] that present solutions for monitoring the quality of voice calls, in which network parameters are collected to detect coverage problems. However, these solutions are difficult to implement in commercial cellular networks.
The objective of this case study is to demonstrate the importance of including the MOS index [97] in the 3GPP MDT project. This is because the RF
parameters, such as received signal strength, co-channel interference (C/I), and others, are not always correlated with the voice quality. Thus, MDT will not only be focused on discovering coverage problems; additionally, MDT will be able to monitor the voice call quality.

5.1 Parameter Indicators used for assessing the Service Quality of Cellular Networks in Brazil

According to ANATEL reports [99], prepaid users represent about 77.5% of the total users of cellular networks in Brazil, and the main service used by prepaid subscribers is the voice call service.
The quality control of cellular network operators in Brazil is supervised by ANATEL. Indicators are important to evaluate system performance [100]. The ANATEL resolution number 335 of April 2003 [93] established the definitions, methods and frequency of collection of the Personal Mobile Services (PMS) quality indicators. Table 5 introduces the indicators and quality targets for PMS.

Table 5. Key Parameter Indicators of PMS in Brazil.
Index    Description                                    Target Value
PMS 1    Rate of complaints                             1%
PMS 2    Rate of coverage and congestion                4%
PMS 3    Rate of call completion by call centers        98%
PMS 4    Attendance by telephone / electronic service   95%
PMS 5    Rate of completed calls                        67%
PMS 6    Rate of call set-up                            95%
PMS 7    Rate of call drops                             2%
PMS 8    Rate of user response                          95%
PMS 9    Rate of response to requests for information   95%
PMS 10   Rate of personal service to the user           95%
PMS 11   Rate of assistance to the user accounts        5%
PMS 12   Rate of failures recovery                      95%

As can be seen from Table 5, none of the PMS indicators corresponds to the voice signal quality. These PMS are more related to the service assistance of the call centers of cellular operators. Also, some PMS treat the cases when the phone call was not established due to congestion or coverage problems.

5.2 Overview of the Minimization of Drive Tests Solution

To improve the cellular network performance, field engineers perform tasks called drive tests. Thus, radio frequency (RF) parameters are collected to discover coverage holes or weak coverage areas. It is important to note that drive tests cannot be performed in all the coverage areas, because some areas are access restricted. Also, drive test tasks are expensive in both time and money.
In this context, the MDT solution deals with the two problems mentioned above, because the user equipment (UE) of real subscribers is used to collect RF parameters. As a consequence, the costs of drive tests are reduced considerably, and the network measurements can be performed in all the different places of the network coverage area in short periods of time.

5.2.1 Main characteristics of MDT solution

The main characteristics and functionalities of the MDT are [39]:
- There are two MDT modes to capture the network parameters: (i) logged MDT, in which the UE captures the RF parameters and stores them for a certain period of time before the data is sent to the MDT server; this mode is performed when the UE is in idle state; (ii) immediate MDT, in which the UE captures the RF parameters and reports them immediately to the MDT server; this mode is performed when the UE is in active state.
- Collection and report of network parameters by the UE. The measurement logs captured by the UE consist of multiple events logged at different timestamps.
- Network operators can choose specific geographical regions to perform the MDT measurements.
- The UE geographical location is important to determine the regions with weak coverage; this information depends on the UE capability.
- Timestamps in the measurement logs, which need to be correlated with every event of the parameters captured.
- The network operator can select some UEs based on their capabilities.

5.2.2 MDT architecture

Figure 10, adapted from [5], depicts the MDT network architecture defined in 3GPP Rel. 10. A brief description of the network nodes in this figure is presented in the following.
- Home Location Register (HLR) / Home Subscriber Server (HSS) are the databases of the subscriber profiles in a cellular network.
- Signal Transfer Point (STP), which is a signaling point that is responsible only for the routing functions within an SS7 network.
- Mobile Switching Center (MSC) - Serving GPRS Support Node (SGSN) / Mobility Management Entity (MME). The MSC is a telephone exchange that makes the connection between mobile users within the same or different mobile and fixed networks. The SGSN is a main component of the GPRS network, which handles all packet-switched data within the network.
- Radio Access Network (RAN) / Radio Network Controller (RNC). The RAN is the base station controller in LTE networks.
- MDT Server is the node in charge of collecting all network parameters from the UE.
- Operations, Administration, and Maintenance (OAM).
Also, a target geographic area with a possible weak coverage can be analyzed.

Figure 10. Network Architecture of the MDT solution. (Elements shown: OAM, HLR/HSS, MDT Server, STP, MSC - SGSN/MME, RAN/RNC, and MDT UEs in immediate mode and in logged mode with data stored; a geographical area with weak coverage is selected.)

5.2.3 Proposed Network Solution Architecture and Methodology to Assess Voice Quality

In order to analyze and determine the voice quality index, the E6474A (Wireless Measurement Software) and VQT (Voice Quality Test) tools were used. These tools use the ITU-T Recommendation P.862 [39]. A Personal Computer (PC) with the VQT tools installed is connected to an audio card and a UE. The server sends and receives the audio files over the cellular network under test.
The mobile phone originates a phone call to the quality server, at which point the voice quality estimation process is started using the P.862 algorithm [39].
It is worth noting that the P.862 algorithm is considered the standardized state of the art.
Figure 11 shows the test scenario used for capturing the MOS scores in different coverage areas of a commercial cellular network. Additionally, the RF parameters were also measured and collected for analysis purposes.
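To illustrate why a MOS value collected together with the RF parameters adds information, the following sketch separates measurement samples with weak signal from samples where the signal is adequate but the voice quality is still poor. It is a hypothetical data structure, not a format defined by 3GPP or used in [8], [9]: the record fields, the function name, both thresholds and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class MdtSample:
    timestamp_s: float     # when the event was logged
    latitude: float        # UE location, if the UE can report it
    longitude: float
    signal_dbm: float      # received signal strength (an RF parameter)
    mos: float             # voice quality index on the 5-point MOS scale


def classify(samples, signal_floor_dbm=-105.0, mos_floor=3.0):
    """Split samples into weak-coverage cases and poor-voice cases with adequate signal."""
    weak_coverage, poor_voice = [], []
    for s in samples:
        if s.signal_dbm < signal_floor_dbm:
            weak_coverage.append(s)       # would also be found from RF data alone
        elif s.mos < mos_floor:
            poor_voice.append(s)          # only visible when the MOS is reported
    return weak_coverage, poor_voice


# Made-up samples for illustration.
samples = [
    MdtSample(0.0, -23.55, -46.63, -80.0, 4.1),
    MdtSample(5.0, -23.56, -46.64, -110.0, 2.0),
    MdtSample(10.0, -23.57, -46.65, -85.0, 2.4),
]
weak, poor = classify(samples)
print(len(weak), len(poor))
```

The second list is the case of interest here: the signal level looks acceptable, so an analysis based only on RF parameters would not flag the location, while the MOS value shows that the call quality is degraded.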
Figure 11. Representation of the test scenario for Voice Quality Assessment in a Cellular Network.

Figure 12, adapted from [9], shows a simplified network architecture of the proposed system, based on [94], in which the MOS index is included.

[Figure 12 diagram labels: HLR/HSS, OAM, VQE Server, STP, MSC/MME, RAN; application to capture different RF parameters and the quality index (MOS)]

Figure 12. Simplified Network Architecture of MDT including the MOS index.

It is worth mentioning that it is necessary to install an application on the cellular phone. This application runs a voice quality algorithm to determine a MOS index. For real-time applications, this algorithm needs to be performed without any voice reference. ITU-T Recommendation P.563 [40] is a non-intrusive objective quality metric that meets these requirements.

6. CONCLUSIONS

The application of the QoE concept is important to everyone who works in the development of products and services, including researchers, administrators and network operators, service and content providers, and product manufacturers. Nowadays, research on QoE is increasing because it is applicable in several areas, such as electrical, electronic, computer and telecommunication engineering, computer science and information technology, as well as for professionals in the areas of ergonomics and usability, among others. Hence, a considerable number of new studies is expected in the coming years, considering the increasing number of works in the last 10 years.
The assessment of the users' QoE is very complex because the QoE concept is based on different influence factors, which are classified into three categories: human, system and context. The first one is especially hard to evaluate, because it is related to human subjectivity and each person is different from another. In this context, subjective tests are really important to determine the users' satisfaction, because these tests make it possible to identify the main impairment factors that degrade the users' QoE. Based on the results of subjective tests and considering the key parameters of the service evaluated, a mathematical model can be defined that establishes a relation between these parameters and the subjective scores.
Thus, in the first case study, the experimental results indicated that the main impairment factors were the number, duration and frequency of the pauses that occurred during a video streaming session. Using these impairment factors, or parameters, a mathematical model was defined whose output represents the users' QoE index. In the second case study, we presented how a voice quality index can be used to improve the global users' QoE in the voice telephone service.

REFERENCES

1. Chitra, K. and Senkumar, M. R.: Hidden Markov model based lightpath establishment technique for improving QoS in optical WDM networks, In Proc. of 2nd International Conference on Current Trends in Engineering and Technology, pp. 53-62, Cairo, Egypt (2014).
2. Oddershede, A. and Carrasco, R.: Methodology to evaluate and improve the QoS ICT networks in the healthcare service, In Proc. of International Symposium on Communication Systems Networks and Digital Signal Processing, pp. 871-875, Newcastle, United Kingdom (2010).
3. Gorlatch, S., Humernbrum, T. and Glinka, F.: Improving QoS in real-time internet applications: from best-effort to Software-Defined Networks, In Proc. of International Conference on Computing, Networking and Communications, Honolulu, HI, U.S., pp. 189-193 (2014).
4. Seno, L., Valenzano, A. and Zunino, C.: A dynamic bandwidth reassignment technique for improving QoS in EDF-based industrial wireless networks, In Proc. of 13th IEEE International Conference on Industrial Informatics, pp. 892-899, Cambridge, United Kingdom (2015).
5. Zapater, M. N. and Bressan, G.: A Proposed Approach for Quality of Experience Assurance of IPTV, In Proc. of First International Conference on the Digital Society, pp. 25-25, Guadeloupe (2007).
6. Rodriguez, D., Abrahão, J., Begazo, D., Lopes, R. and Bressan, G.: Quality metric to assess video streaming service over TCP considering temporal location of pauses, IEEE Trans. on Consumer Electron., vol. 58, no. 3, pp. 985-992 (2012).
7. Rodriguez, D., Pivaro, G. and Sousa, J.: Apparatus and method for evaluating voice quality in a mobile network, US. Patent number US 9,078,143 B2 by US Patent and Trademark Office (2015).
8. Rodriguez, D., Lopes, R. and Bressan, G.: A billing system model for voice call service in cellular networks based on voice quality, in Proc. of IEEE International Symposium on Consumer Electronics, Hsinchu Taiwan, pp. 89-90 (2013).
9. Rodriguez, D. and Bressan, G.: Improving the minimization drive tests using voice quality index, in Proc. of International Workshop on Telecommunications, Minas Gerais, pp. 10-13 (2013).
10. Cowan, N.: On short and long auditory stores, Psychol Bulletin, vol. 96, no. 2, pp. 341-370 (1984).
11. Coltheart, M.: Iconic memory and visible persistence, Percept Psychophysics, vol. 27, no. 3, pp. 183-228 (1980).
12. Baddeley, A.: Human memory: theory and practice, Taylor & Francis: Psychology Press, East Sussex (1997).
13. Baddeley, A.: Working memory: looking back and looking forward, Nat. Rev. Neurosci., vol. 4, no. 10, pp. 829-839 (2003).
14. Jackson, F.: Epiphenomenal qualia, Philos. Quarterly, vol. 32, no. 127, pp. 127-136 (1982).
15. Martens, H. and Martens, M.: Multivariate analysis of quality, Wiley, Chichester, England, pp. 445 (2001).
16. Reeves, C. and Bednar, D.: Defining quality: alternatives and implications, Acad. Manage Rev., vol. 19, no. 3, pp. 419-445 (1994).
17. International Organization for Standardization: ISO 9000:2000, Quality management systems: fundamentals and vocabulary (1999).
18. Möller, S.: Quality engineering, Springer, London (2010).
19. Rodriguez, D. Z., Rosa, R. L. and Bressan, G.: Video quality assessment in video streaming services considering user preference for video content, IEEE Trans. on Consumer Electron., vol. 60, no. 3, pp. 436-444 (2014).
20. ISO 9241-210: Ergonomics of human system interaction-part 210: human-centered design for interactive systems, International organization for standardization (ISO), Geneva (2010).
21. Bargas-Avila J. and Hornbæk, H.: Foci and blind spots in user experience research, Interactions, vol. 19, no. 6, pp. 24-27 (2012).
22. Lai-Chong, E., Roto, V., Hassenzahl, M., Vermeeren, A. and Kort, J.: Understanding, scoping and defining user experience: a survey approach, in Proc. of the International Conference on Human Factors in Computing Systems, NY, U.S., pp. 719-728 (2009).
23. Roto, V., Lai-Chong, E., Vermeeren, A. and Hoonhout, J.: User experience: bringing clarity to the concept of user experience, White Paper (2011).
24. Callet, P. Le, Möller, S., Perkis, A.: Qualinet white paper on definitions of quality of experience, In Proc. Europ. net. on quality of experience in multimedia systems and services, White Paper, Lausanne (2012).
25. Möller, S., Engelbrecht, K.-P., Kühnel, C., Wechsung, I., Weiss, B.: A taxonomy of quality of service and quality of experience of multimodal human-machine interaction, in Proc. of the International Workshop on Quality of Multimedia Experience, San Diego, CA, pp. 7-12 (2009).
26. Geerts, D., Moor, K. De, Ketyko, I., Jacobs, A., Bergh, J. V, Joseph, W., Martens, L., Marez, L. De.: Linking an integrated framework with appropriate methods for measuring QoE, in Proc. of Quality of Multimedia Experience, Trondheim, pp. 158-163 (2010).
27. Jumisko-Pyykkö, S.: User-centered quality of experience and its evaluation methods for mobile television, PhD. dissertation, Tampere University of Technology, Tampere (2011).
28. Rothauser, E.H., Chapman, W.D., Guttman, N., Nordby, K.S., Silbiger, H.R., Urbanek, G.E., Weinstock, M.: IEEE recommended practice for speech quality measurements, IEEE Trans. Audio Electroacoust., vol. 17, no. 3, pp. 225-246 (1969).
29. Rodriguez, D. and Bressan, G.: Video Quality Assessments on Digital TV and Video Streaming services using Objective Metrics, IEEE Latin America Transactions, vol. 10, no. 1, pp. 1184-1189 (2012).
30. Rosa, R. L., Rodríguez, D. Z., Souza, V. A. and Bressan, G.: Recommendation system based on user profile extracted from an IMS network with emphasis on social network and digital TV, in Proc. of the Latin America Networking Conference, New York, US, pp. 40-47 (2011).
31. Rodriguez, D. and Bressan, G.: Video Quality Assessment on Digital TV and Video Streaming services using Objective Metrics, in Proc. International Information and Telecommunication Technologies, Florianopolis, Brazil (2011).
32. Souza, J., Rodriguez, D. Z. and Cavalcante, A.: Proposal of a new voice quality indicator for the Brazilian mobile telecommunications system based on the end user quality experience, In Proc. of National Conference on Telecommunications, Arequipa, Peru (2011).
33. Stuber, G. L.: Principles of Mobile Communication, Kluwer Academic Publishers, 1st ed., Norwell, MA, USA (1996).
34. Hoßfeld, T., Schatz, R., Egger, S., Fiedler, M., Masuch, K. and Lorentzen, C.: Initial delay vs. interruptions: between the devil and the deep blue sea, International Workshop on Quality of Multimedia Experience, pp. 1-6, Yarra Valley, VIC (2012).
35. Mayer, R. E.: Multimedia Learning, Cambridge University Press, New York, USA, pp. 320 (2009).
36. Timmerer, C., Waltl, M., Rainer, B. and Hellwagner, H.: Assessing the quality of sensory experience for multimedia presentations, Signal Processing: Image Commun., vol. 27, no. 8, pp. 909-916 (2012).
37. Reiter, U., Brunnström, K., Moor, K. De, Larabi, M.-C., Pereira, M., Pinheiro, A., You, J. and Zgank, A.: Factors influencing quality of experience, in Quality of Experience, Eds. S. Möller and A. Raake, London: Springer, pp. 55-72 (2014).
38. Atzori, L., Floris, A., Ginesu, G. and Giusto, D.: Quality perception when streaming video on tablet devices, J. Vis. Commun. Image R., vol. 25, no. 3, pp. 586-595 (2014).
39. ITU-T Rec. P.862: Perceptual evaluation of speech quality (PESQ): An objective method for end-to-end speech quality assessment of narrowband telephone networks and speech codecs, ITU-T, Geneva, Switzerland (2005).
40. ITU-T Rec. P.563: Single-ended method for objective speech quality assessment in narrow-band telephony applications, ITU-T, Geneva, Switzerland (2004).
41. ITU-T Rec. G.107: The E-model: a computational model for use in transmission planning, ITU-T, Geneva, Switzerland (2000).
42. Hines, A., Skoglund, J., Kokaram, A. and Harte, N.: Visqol: the virtual speech quality objective listener, In Proc. of International Workshop on Acoustic Signal Enhancement, pp. 1-4, Aachen, Germany (2012).
43. Malfait, L., Berger J. and Kastner, M.: P.563 - The ITU-T Standard for Single-Ended Speech Quality Assessment, IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, no. 6, pp. 1924-1934 (2006).
44. Thiede, T., Treurniet, W. C., Bitto, R., Schmidmer, C., Sporer, T., Beerends, J. G. and Colomes, C.: PEAQ - The ITU Standard for Objective Measurement of Perceived Audio Quality, Journal of the Audio Engineering Society 48(1-2), 3-29 (2000).
45. ITU-R Rec. BT.500: Methodology for the subjective assessment of the quality of television pictures, ITU-T, Geneva, Switzerland (2012).
46. ITU-T Rec. P.910: Subjective video quality assessment methods for multimedia applications, ITU-T, Geneva, Switzerland (2008).
47. Wang, Z., Bovik, A. C. and Sheikh, H.: Image quality assessment: from error visibility to structural similarity, IEEE Transactions on Image Processing, vol. 13, no. 4 (2004).
48. Pinson, M. and Wolf, S.: A new standardized method for objectively measuring video quality, IEEE Trans. on Broadcasting, vol. 50, no. 3, pp. 312-322 (2004).
49. Porter, T. and Peng, X. H.: An objective approach to measuring video playback quality in loss networks using TCP, IEEE Communications Letters, vol. 15, no. 1 (2011).
50. Mok, R., Chan, E. and Chang, R.: Measuring the quality of experience of HTTP video streaming, in Proc. International Symposium on Integrated Net. Management, Dublin, Ireland, pp. 485-492 (2011).
51. Xue, Y., Erkin, B. and Wang, Y.: A novel no-reference video quality metric for evaluating temporal jerkiness due to frame freezing, IEEE Trans. on Multimedia, vol. 17, no. 1, pp. 134-139 (2015).
52. Khan, A., Sun, L., Fajardo, J., Liberal, F. and Ifeachor, E.: Where's the music? Comparing the QoE impact of temporal impairments between music and video streaming, in Proc. International Workshop on Quality Multimedia Experience, Klagenfurt, Austria, pp. 64-69 (2013).
53. Hsio, Y., Chen, C., Lee, J. and Chu, Y.: Designing and implementing a scalable video-streaming system using an adaptive control scheme, IEEE Trans. Cons. Electron., vol. 58, no. 4, pp. 1314-1322 (2012).
54. Mok, R., Luo, X., Chan, E. and Chang, R.: QDASH: A QoE-aware DASH system, in Proc. Multimedia System Conference, N. Carolina, US, pp. 11-22 (2012).
55. Chen, C., Choi, L., Veciana, G. De, Caramanis, C., Heath, R. and Bovik, A.: A dynamic system model of time-varying subjective quality of video streams over HTTP, in Proc. IEEE Acoustics, Speech and Signal Processing Conference, Vancouver, Canada, pp. 3602-3606 (2013).
56. Oyman, O. and Singh, S.: Quality of experience for HTTP adaptive streaming services, IEEE Communications Magazine, vol. 50, no. 4, pp. 20-27 (2012).
57. Essaili, A., Schroeder, D., Staehle, Shehada, D. M., Kellerer, W. and Steinbach, E.: Quality-of-experience driven adaptive HTTP media delivery, in Proc. of IEEE International Conference on Communications, Budapest, Hungary, pp. 2480-2485 (2013).
58. Ni, P., Eg, R., Eichhorn, A., Griwodz, C. and Halvorsen, P.: Spatial flicker effect in video scaling, in Proc. of International Workshop on Quality Multimedia Experience, Mechelen, Belgium, pp. 55-60 (2011).
59. Rodriguez, D., Wang, Z., Rosa, R. and Bressan, G.: The impact of video-quality-level switching on user quality of experience in dynamic adaptive streaming over HTTP, EURASIP Journal on Wireless Communications and Networking, vol. 216, no. 1, pp. 1-15 (2014).
60. Lin, W. and Kuo, C.-C. J.: Perceptual visual quality metrics: A survey, Journal of Visual Communication and Image Representation, vol. 22, no. 4, pp. 297-312 (2011).
61. Moorthy, A. K. and Bovik, A. C.: Visual quality assessment algorithms: what does the future hold? Multimedia Tools and Applications, vol. 51, no. 2, pp. 675-696 (2011).
62. Chikkerur, S., Sundaram, V., Reisslein, M. and Karam, L. J.: Objective video quality assessment methods: A classification, review, and performance comparison, IEEE Transactions on Broadcasting, vol. 57, no. 2, pp. 165-182 (2011).
63. Chandler, D. M.: Seven challenges in image quality assessment: past, present, and future research, ISRN Signal Processing, vol. 2013, pp. 1-53 (2013).
64. Hands, D. S.: A Basic Multimedia Quality Model, IEEE Transactions on Multimedia, vol. 6, no. 6, pp. 806-816 (2004).
65. Garcia, M. N., Schleicher, R. and Raake, A.: Impairment-factor-based audiovisual quality model for IPTV: Influence of video resolution, degradation type, and content type, EURASIP Journal on Image and Video Processing, pp. 1-14 (2011).
66. Yamagishi, K. and Gao, S.: Light-weight audiovisual quality assessment of mobile video: ITU-T Rec. P.1201.1, In Proc. of IEEE 15th International Workshop on Multimedia Signal Processing, Pula, Italy, pp. 464-469 (2013).
67. Pinson, M. H., Ingram, W. and Webster, A.: Audiovisual Quality Components, IEEE Signal Processing Magazine, vol. 28, no. 6, pp. 60-67 (2011).
68. Martinez, H. B. and Farias, C.: Full-reference audio-visual video quality metric, Journal of Electronic Imaging, v. 23, p. 061108 (2014).
69. Jones, C. and Atkinson, D. J.: Development of opinion-based audiovisual quality models for desktop video-teleconferencing, In Proc. of Sixth International Workshop on Quality of Service, Napa, CA, pp. 196-203 (1998).
70. Hiroki, O., Hisamatsu, H. and Noborio, H.: Design and evaluation of hybrid congestion control mechanism for video streaming, in Proc. of the IEEE International Conference on Computer and Information Technology, pp. 585-590, Pafos, Cyprus (2011).
71. Hisamatsu, H., Hasegawa, G. and Murata, M.: Non bandwidth-intrusive video streaming over TCP, in Proc. of the International Conference on Information Technology, pp. 78-83, Las Vegas, USA (2011).
72. Cai, L., Shen, X., Pan, J. and Mark, J.: Performance analysis of TCP friendly AIMD algorithms for multimedia applications, IEEE Trans. on Networking, vol. 7, no. 2, pp. 339-355 (2005).
73. Park, H-J. and Har, D-H.: Subjective image quality assessment based on objective image quality measurement factors, IEEE Trans. Consumer Electron., vol. 57, no. 3, pp. 1176-1184 (2011).
74. Kwon, H., Han, H., Lee, S., Choi, W. and Kang, B.: New video enhancement preprocessor using the region-of-interest for the videoconferencing, IEEE Trans. Consumer Electron., vol. 56, no. 4, pp. 2644-2651 (2010).
75. You, J., Perkis, A., Gabbouj, M. and Hannuksela, M. M.: Perceptual quality assessment based on visual attention analysis, in Proc. of ACM Int. Conf. Multimedia, pp. 561-564, Beijing, China (2009).
76. Moorthy, A. K. and Bovik, A. C.: Visual importance pooling for image quality assessment, IEEE J. Select. Topics Signal Processing, vol. 3, no. 2, pp. 193-201 (2009).
77. Chikkerur, S., Sundaram, V., Reisslein M. and Karam, L. J.: Objective Video Quality Assessment Methods: A Classification, Review, and Performance Comparison, IEEE Transactions on Broadcasting, vol. 57, no. 2, pp. 165-182 (2011).
78. Pinson, M. H. and Wolf, S.: Comparing subjective video quality testing methodologies, SPIE Video Communications and Image Processing, pp. 573-582 (2003).
79. Janowski L. and Pinson, M. H.: Subject bias: Introducing a theoretical user model, in Proc. of Sixth International Workshop on Quality of Multimedia Experience, pp. 251-256, Singapore (2014).
80. Streijl, R. C., Winkler, S. and Hands, D. S.: Mean opinion score (MOS) revisited: methods and applications, limitations and alternatives, Multimedia Systems, vol. 22, no. 2, pp. 213-227 (2016).
81. Dawson, W. E. and Brinker, R.: Validation of ratio scales of opinion by multimodality matching, Perception & Psychophysics, vol. 9, no. 5, pp. 413-417 (1971).
82. Tominaga, T., Hayashi, T., Okamoto, J. and Takahashi, A.: Performance comparisons of subjective quality assessment methods for mobile video, In Proc. Second International Workshop on Quality of Multimedia Experience (QoMEX), Trondheim, Norway, pp. 82-87 (2010).
83. Blin, J.: New quality evaluation method suited to multimedia context: Samviq, in Proc. of the International Workshop on Video Processing and Quality Metrics, Arizona, U.S., pp. 1-6 (2006).
84. Redi, J., Liu, H., Alers, H., Zunino, R. and Heynderickx, I.: Comparing subjective image quality measurement methods for the creation of public databases, in Proc. of SPIE-IS&T Electronic Imaging, SPIE vol. 1, no. 7529, California, USA, pp. 1-11 (2010).
85. Hosfeld, T., Biedermann, S., Shatz, R., Platzer, A.: The memory effect and its implications on Web QoE modeling, in Proc. of 23rd International Teletraffic Congress (ITC), San Francisco, U.S., pp. 103-110 (2011).
86. Aroussi, S., Bouabana-Tebibel, T., Mellouk, A.: Empirical QoE/QoS correlation model based on multiple parameters for VoD flows, in Proc. of Global Communications Conference, pp. 1963-1968, California, U.S. (2012).
87. Rodríguez, D. and Arjona, M.: VoIP quality improvement with a rate-determination algorithm, in Proc. of International Workshop on Telecommunications, Sao Paulo, Brazil, pp. 30-34 (2009).
88. ISO/IEC IS 23009-1: Information Technology - Dynamic adaptive streaming over HTTP (DASH) (2012).
89. Chen, C., Choi, L., Veciana, G. de., Caramanis, C., Heath, R. and Bovik, A.: Modeling the time-varying subjective quality of HTTP video streams with rate adaptations, IEEE Transactions on Image Processing, vol. 23, no. 5, pp. 2206-2221 (2014).
90. Chen C., Zhu, X., Veciana, G. de., Bovik, A. and Heath, R.: Adaptive video transmission with subjective quality constraints, in Proc. of IEEE International Conference on Image Processing, Paris, France, pp. 2477-2481 (2014).
91. Singhal, C., De, S., Trestian, R. and Muntean, G-M.: Optimization of user-experience and energy-efficiency in wireless multimedia broadcast, IEEE Transactions on Mobile Computing, vol. 13, no. 7, pp. 1522-1535 (2014).
92. Rodriguez, D., Rosa, R. L, Alfaia, E., Abrahão J. and Bressan, G.: Video Quality Metric for Streaming Service Using DASH Standard, IEEE Trans. on Broadcasting, vol. 62, no. 3, pp. 1-12 (2016).
93. ANATEL Resolution No 335: Regulation of personal Mobil Service Indicators - SMP, Brazil (2003).
94. 3GPP TS 37.320 version 10.4.0 Release 10: Radio measurement collection for Minimization of Drive Tests (MDT) (2012).
95. Diethorn J.: Purposeful receive-path audio degradation for providing feedback about transmit-path signal quality, US PATENT number USA 7,542,761, by Avaya Technology (2009).
96. Sarkar: Voice quality on a communication link based on customer feedback, US PATENT number US 7,542,761, by ATT (2009).
97. ITU-T Rec. P.800: Methods for subjective determination of transmission quality, ITU-T, Geneva, Switzerland (1996).
98. Herrero, R.: Encapsulation of Real Time Communications over Restrictive Access Networks, International Journal of Digital Information and Wireless Communications, vol. 6, no. 3, pp. 173-183 (2016).
99. National Agency of Telecommunication (Anatel) Annual Tech. Report of 2014 (2015).
100. Freitas, V., Uren V., Brewster, C., Gonçalves, A.: Ontology for Performance Measurement Indicators Comparison, International Journal of Digital Information and Wireless Communications, vol. 6, no. 2, pp. 87-96 (2016).
Music Emotion Recognition with Audio and Lyrics Features
C. V. Nanayakkara1 and H. A. Caldera2

University of Colombo School of Computing (UCSC)
35, Reid Avenue, Colombo 7, Sri Lanka
1 charini.nanayakkara@gmail.com
2 hac@ucsc.cmb.ac.lk
ABSTRACT

Music Emotion Recognition (MER) is a field of science dedicated to recognizing emotions associated with music pieces. With the new interest in music therapy and music recommendation systems, MER has attracted immense interest from scientists. This study is an attempt at discerning how well music-related emotions can be predicted with music features, namely audio and lyrics. Emotion classes associated with songs were initially identified with clustering. Independent classification experiments were executed utilizing lyrics and audio features, to assess the comparatively best model for predicting music emotions. The classification algorithms attempted in this research are Naïve Bayes, Random Forest, SVM and the C4.5 decision tree. Random Forest with oversampling on the audio feature set produced the comparatively best results.

KEYWORDS
Music Emotion Prediction, Lyrics and Audio Features, Machine Learning, Hierarchical Clustering, Classification, Music Specific Emotion Model

1 INTRODUCTION
The capability people possess to intuitively interpret the notion conveyed by a music piece, even in the absence of words, is possibly attributable to the omnipresence of music in our lives since birth. In fact, research attests that sensitivity to music is expressed even during the prenatal stage, whereas the capability of perceiving emotion in music develops from infancy. Such extensive evidence concerning the impact of music on emotional aspects has motivated researchers to conduct a number of studies on the matter in recent years.

1.1 Music and Emotion: A Background Review

Music, according to the definition provided by WordNet, is an artistic form of auditory communication incorporating instrumental or vocal tones in a structured and continuous manner [1]. Three primary emotion categories have been identified as associated with the context of music [2]. Expressed emotion relates to the emotions a performer or composer wishes to communicate to listeners through a song, whereas felt emotion reflects the emotion actually felt by the listener when listening to music. However, it is perceived emotion that is highlighted as the main focus in the Music Emotion Recognition (MER) context. This relates to the emotions a listener perceives or assumes as being expressed by a music composition.

1.2 Music Features

Four primary aspects (structural features, listener features, performance features and contextual features) are said to be decisive of what emotions an individual experiences when listening to music [3]. Structural features relate to the sounds of which a song is composed. Melody, tempo and rhythm are a few of the structural features associated with music. This category also includes the lyrics of the song, that is, the words a music piece comprises. Listener features reflect traits of each individual listener, such as age, reason for listening to a song, psychological state, etc. Performance features deal with the manner in which the music piece is performed, the skill and appearance of the performer, etc., whereas contextual features relate to the location where music is performed (e.g. wedding, musical show, funeral). Among these, structural features and listener features (listeners' opinion of the emotions associated with the song) have been utilized in this research. The focus has been to conduct experiments to analyze
the possibility of predicting musically induced emotions with lyrics and audio features.

1.3 Research Problem
The primary research problem addressed in this paper is evaluating the capability of automatically predicting the emotions associated with a music piece with a fair level of accuracy. In this study, the emotions perceived by an individual when listening to a certain song are referred to as the emotions associated with a music piece. This research problem is addressed by modelling several data-driven approaches for the prediction of musically induced emotion and evaluating them to ascertain their level of acceptability. Thus, it is required to identify the classes of emotion which are elicited in people by music, since music may be expressive of merely some emotions but not all. Furthermore, this is necessitated by the distinction of musically induced emotion from the general class of emotions [4]. Subsequently, a model which has the capability of automatically identifying the emotions expressed by a song must be identified. This model may utilize either lyrics or audio features, based on what appears to be most promising in the field of MER, according to our research.

2 METHODOLOGY
The methodology devises a music specific emotion model and evaluates the capability of different classification algorithms to predict the emotions expressed by a song. The research was based on the Million Song Dataset (MSD). MSD comprises a million songs, whereas the lyrics, tags and audio features associated with these songs are provided by musiXmatch, Last.fm and the Vienna University of Technology, respectively.
Figure 1 depicts the architectural view, which comprises four primary components. Phase 1 of the research is dedicated to evaluating and benchmarking several music datasets, to assess their aptitude for achieving the research objectives. Each dataset is benchmarked in relation to relevance, quantity and quality, subsequent to which the most suitable dataset to utilize for the research is determined. Preprocessing and feature retrieval tasks are executed in phase 2, whereas phase 3 deals with forming a music specific emotion model using tags. Emotion-related tags are identified first, following which the dimensionality of the emotion space is reduced by combining synonyms and executing hierarchical clustering. Subsequently, the classification algorithms Naïve Bayes [5], Random Forest [6], Support Vector Machine (SVM) [7] and C4.5 [8] are attempted in phase 4. This last phase is executed separately using audio and lyric features.

3 DATA ACQUISITION
Six popular music datasets (RWC database, GTZAN genre collection, Uspop2002, MagnaTagATune, Musicbrainz and MSD) were benchmarked based on relevance, quantity and quality. The magnitude of the first three datasets was inadequate considering the existence of larger music collections, while audio features had to be obtained separately for the MagnaTagATune and Musicbrainz datasets. Considering the magnitude of the dataset and the availability of the features necessary to conduct the research, MSD was opted for.
The lyrics of songs in the MSD have been obtained from musiXmatch, which is presently the largest lyrics catalogue in the world. Lyrics are represented as bag-of-words in this dataset.
MSD provides song level tags, which were utilized in the research for determining the emotions associated with each song. Tags are terms which song listeners have associated with the music pieces in the MSD, via the API provided by Last.fm. Since this is a site used by millions of people around the world to satisfy their musical requirements, the tags appearing in this dataset reflect the opinions of a global community, thus making it a suitable resource on which to base the determination of emotions perceived in music.
Furthermore, the Vienna University of Technology has provided a multitude of audio features for 994,960 tracks in the MSD.
Figure 1: Overall architecture of proposed methodology

4 FEATURE ENGINEERING

Subsequent to selecting MSD via comparative analysis of several music datasets, preprocessing and feature selection were executed to refine the dataset to suit the classification task.

4.1 Data Preprocessing
Initially, the data files were organized as ARFF and CSV files, since the tools utilized (R, Weka, RapidMiner, etc.) support these formats.
The lyric words associated with songs were then weighted by their Term Frequency - Inverse Document Frequency (TF-IDF) value. TF-IDF weighting helps determine the importance of a term in a document, in relation to a collection of documents [9]. This score, extensively applied in the document classification context, could be utilized in our research by considering songs as documents and words from song lyrics as terms. Rather than merely depicting the presence or absence of a word in the song lyrics, using the TF-IDF value allows reflecting how important a word is for a song. The TF-IDF score is calculated as follows.
Term Frequency (TF) calculates the frequency with which a term occurs in a document [9] (1).

[Equation (1): definition of TF; the formula is not recoverable from the source text]

Inverse Document Frequency (IDF) of a term is defined as follows. It assigns a high value to terms which rarely occur, whereas words commonly appearing in a number of documents get a low value (2).

[Equation (2): definition of IDF; the formula is not recoverable from the source text]

The final TF-IDF value of term t in document d is calculated by multiplying these two values, (1) and (2). It reflects the importance of t with respect to d.
Using IDF together with TF helps give more importance to words that occur often in a given song but rarely in the collection of music. Furthermore, it helps minimize the importance given to frequently occurring terms such as "the" and "a".
Using this measure, the importance of each lyric word in relation to a specific track was calculated. If the TF-IDF of a word was less than the average TF-IDF value of all words for the corresponding song, we considered the word to be of no significance for that song. Commonly occurring words were thus removed from the lyrics dataset. This reduced the lyric feature dimensionality from 5000 terms to 1536 English words. A CSV file corresponding to this lyric ARFF file was created,
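
The TF-IDF weighting and the per-song significance filter described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the common textbook definitions (TF as the word count divided by the total number of words in the song, IDF as the natural logarithm of the number of songs divided by the number of songs containing the word), and the function names and toy songs are hypothetical.

```python
import math
from collections import Counter


def tf_idf(songs: dict[str, Counter]) -> dict[str, dict[str, float]]:
    """Compute a TF-IDF score for every (song, word) pair.

    Assumed definitions: TF = count / total words in the song,
    IDF = ln(number of songs / number of songs containing the word).
    """
    n_songs = len(songs)
    # Document frequency: in how many songs each word appears.
    df = Counter(word for counts in songs.values() for word in counts)
    scores = {}
    for song_id, counts in songs.items():
        total = sum(counts.values())
        scores[song_id] = {
            word: (count / total) * math.log(n_songs / df[word])
            for word, count in counts.items()
        }
    return scores


def keep_significant(scores: dict[str, dict[str, float]]) -> dict[str, set[str]]:
    """Per song, keep only words whose TF-IDF is not below the song's mean TF-IDF."""
    kept = {}
    for song_id, word_scores in scores.items():
        if not word_scores:  # a song without words has nothing to keep
            kept[song_id] = set()
            continue
        mean = sum(word_scores.values()) / len(word_scores)
        kept[song_id] = {w for w, s in word_scores.items() if s >= mean}
    return kept


# Toy bag-of-words data, invented for illustration.
songs = {
    "song_a": Counter({"the": 4, "love": 3, "rain": 1}),
    "song_b": Counter({"the": 5, "fire": 2}),
    "song_c": Counter({"the": 3, "love": 1, "night": 2}),
}
scores = tf_idf(songs)
significant = keep_significant(scores)
```

In this toy collection "the" appears in every song, so its IDF and therefore its TF-IDF are zero, and the filter drops it from each song, mirroring the removal of commonly occurring words.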
Of the one million songs provided in the Million Song Dataset, audio, lyric and tag features were not provided for all. This resulted in missing values in the dataset. The issue was resolved by eliminating a tuple if a feature was missing, since the resultant dataset was still of considerable magnitude (167,023 songs).
Tags which conveyed emotion alone were identified and retained using the WordNet tool and GEMS [4]. This preprocessing stage was important for the creation of the music specific emotion model. Outliers of the dataset were identified using the Inter Quartile Range (IQR).

Table 1: Audio features (tools used by Vienna University for retrieving audio features)

Tool | Feature | Description
JMIR | Spectral centroid | The center of mass of the power spectrum.
JMIR | Spectral roll-off point | The fraction of bins in the power spectrum at which 85% of the power is at lower frequencies. This is a measure of the right-skewedness of the power spectrum.
JMIR | Spectral flux | A measure of the amount of spectral change in a signal. Found by calculating the change in the magnitude spectrum from frame to frame.
JMIR | Compactness | A measure of the noisiness of a recording. Found by comparing the components of a window's magnitude spectrum with the magnitude spectrum of its neighboring windows.
JMIR | Spectral variability | The standard deviation of the magnitude spectrum.
JMIR | Root mean square | A measure of the power of a signal over a window.
JMIR | Zero crossings | The number of times the waveform changed sign in a window. An indication of frequency as well as noisiness.
JMIR | Fraction of low energy windows | The fraction of the last 100 windows that has an RMS less than the mean RMS of the last 100 windows.
Marsyas | Timbre features | Measurement on the Fast Fourier Transformation (FFT) of sounds generated by octave notes.

4.2 Feature Selection
Feature selection is the mechanism of selecting the subset of the feature space which is most relevant to the class attribute values. Feature redundancy analysis, otherwise known as correlation based feature selection, is a renowned filter technique used for selecting a subset of features. The linear correlation coefficient $r$, which is the primary measure adopted in this technique, is defined in (3), where $x_i$, $y_i$ are two variables and $\bar{x}$, $\bar{y}$ are their respective means. $r$ is between -1 and 1, where equality to -1 or 1 signifies complete correlation. Features depicting high correlation are considered to be redundant, meaning that retaining only one feature of such a pair is adequate for classification. Subsequent to applying this algorithm using (+/-) 0.75 as threshold, ten audio features were retained.

$$ r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}} \qquad (3) $$
Of the Audio features depicted in Table 1, mean

and standard deviation values of Spectral
centroid, Spectral flux, Compactness, Spectral
variability and Fraction of low energy windows
were retained subsequent to applying
correlation. Subsequent to applying correlation
based feature selection to the lyrics feature
space, it was noted that no significant
correlation was depicted among words in songs.
Thus we were compelled to retain all 1536 lyric
words for classification.
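The redundancy filter of Section 4.2 can be sketched as below, assuming a simple greedy pass that keeps the first feature of any highly correlated pair. The function names, the greedy order and the sample values are hypothetical; only the correlation coefficient of (3) and the 0.75 threshold come from the text.

```python
import math


def pearson(x, y):
    """Linear correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x)) * math.sqrt(
        sum((b - my) ** 2 for b in y)
    )
    # A constant feature has zero variance; treat it as uncorrelated.
    return num / den if den else 0.0


def drop_redundant(features, threshold=0.75):
    """features: dict name -> list of values.
    Keeps a feature only if |r| against every already-kept feature
    stays below the threshold (greedy order is an assumption)."""
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical feature columns, for illustration only.
features = {
    "centroid_mean": [1.0, 2.0, 3.0, 4.0, 5.0],
    "rolloff_mean": [2.1, 3.9, 6.2, 8.0, 9.9],  # almost a multiple of centroid_mean
    "flux_mean": [5.0, 1.0, 4.0, 2.0, 3.0],
}
selected = drop_redundant(features)
```

Here `rolloff_mean` is discarded because it is almost perfectly correlated with `centroid_mean`, while `flux_mean` (r = -0.3 against `centroid_mean`) is retained.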
5 MUSIC EMOTION MODEL

In the agglomerative hierarchical clustering approach, each data point (tags in this instance) would be in separate clusters at the beginning, which would be successively merged to form larger clusters. The merging could be halted either when all data points are in one cluster or when a certain stopping criterion is met [10]. Merging of two clusters in agglomerative hierarchical clustering is based on the linkage type opted for. It specifies the criterion of merging two or more clusters (i.e. whether to consider the furthest data points of two clusters, the closest, etc.). Furthermore, a distance measure is associated with linkage, which specifies the manner in which to measure distance between two (or more) clusters. When applying hierarchical clustering in this research, the distance measure used was the Jaccard distance $d_J$ [11], which is defined in (4), where $A$ and $B$ are the two sets being compared. Five linkage methods (single, complete, group average, Ward1 and Ward2 [10] [12]) were applied. The final clusters were formed by combining tags which appear together at the lowest level of hierarchy in 80% of these algorithms. This is a significant deviation from the common method of opting for one linkage method over the rest. Instead, tags which were assigned to the same cluster in 4 out of 5 (80%) clustering approaches attempted were grouped together (e.g. if tag A and tag B were grouped together in the 1st iteration of clustering using complete, group average, Ward1 and Ward2 methods, but not in the single linkage method, those two tags still qualified for merging due to the 80% or more constraint).

$$d_J(A, B) = 1 - \frac{|A \cap B|}{|A \cup B|} \qquad (4)$$

The emotion clusters formed when this procedure was adhered to are depicted in Table 2. Emotion clusters have been numbered from C1 to C25. Column Count specifies the number of songs which belong to each cluster, whereas column Emotion tags conveys which tags each cluster comprises. These emotion clusters were incorporated in the dataset prior to proceeding with supervised learning (i.e. if all the tags in a specific cluster were associated with a particular song, that cluster was assigned to the relevant song). Since more than 80% of the songs are distributed among seven majority emotion classes (C1, C2, C3, C7, C17, C22 and C25), only those classes were retained for conducting classification experiments.

Table 2: 25 emotion clusters

Class | Emotion tags | Count
C1 | Joyful, Danceable | 1202
C2 | Witty | 1734
C3 | Calming, Melancholic, Romantic | 1639
C4 | Uplifting, Feel good | 199
C5 | Aggressive, Angry | 78
C6 | Moving, Inspiring | 20
C7 | Sensual | 3821
C8 | Haunting, Dark, Depressing | 16
C9 | Angst | 211
C10 | Sick | 177
C11 | Makes me smile | 340
C12 | Heartbreaking, Bittersweet | 23
C13 | Crazy | 251
C14 | Exciting | 175
C15 | Creepy | 371
C16 | Ethereal, Dreamy, Soothing | 7
C17 | Cool | 3723
C18 | Energetic, Powerful | 158
C19 | Soulful | 369
C20 | Hypnotic | 216
C21 | Sentimental | 345
C22 | Psychedelic | 2335
C23 | Lonely | 111
C24 | Spiritual | 193
C25 | Nostalgic | 1800
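The Jaccard distance of (4) and the 4-out-of-5 agreement rule used to form the clusters can be sketched as follows. This is an illustrative sketch only: representing each tag by the set of songs that carry it, the function names, and the sample groupings are assumptions, and the hierarchical clustering itself is not reproduced here.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |A intersect B| / |A union B| for two sets."""
    a, b = set(a), set(b)
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def consensus_pairs(groupings, min_agreement=0.8):
    """groupings: dict linkage name -> list of tag groups (sets) found at the
    lowest level of that hierarchy. Returns the tag pairs that share a group
    in at least `min_agreement` of the linkage methods."""
    votes = {}
    for groups in groupings.values():
        for group in groups:
            for pair in combinations(sorted(group), 2):
                votes[pair] = votes.get(pair, 0) + 1
    return {p for p, n in votes.items() if n / len(groupings) >= min_agreement}


# Hypothetical data: each tag is described by the set of song ids carrying it.
joyful, danceable = {1, 2, 3, 4}, {2, 3, 4, 5}
distance = jaccard_distance(joyful, danceable)  # 3 shared songs out of 5 in total

# Hypothetical lowest-level groupings from the five linkage methods.
groupings = {
    "single": [{"joyful"}, {"danceable"}, {"witty"}],
    "complete": [{"joyful", "danceable"}, {"witty"}],
    "group_average": [{"joyful", "danceable"}, {"witty"}],
    "ward1": [{"joyful", "danceable"}, {"witty"}],
    "ward2": [{"joyful", "danceable"}, {"witty"}],
}
merged = consensus_pairs(groupings)
```

With these inputs the pair (danceable, joyful) is grouped by four of the five methods, so it satisfies the 80% constraint even though single linkage disagrees.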
6 RESULTS AND DISCUSSION

6.1 Classification Algorithms
Four renowned classification algorithms were attempted in this study, each of which has been extensively applied in former machine learning research work. Classification algorithms are supervised learning techniques, where a labelled dataset is required to be fed to them to obtain results.

Support Vector Machines (SVM): SVM [7] attempts to construct optimum hyperplanes which best separate a dataset into classes. A good separation is said to be obtained by finding the hyperplane which has the largest distance to the nearest training record from any class. The Sequential Minimal Optimization (SMO) implementation of SVM was utilized in this research.

Naive Bayes: Naive Bayes [5] is based on Bayes' Theorem, which describes the probability of an event based on conditions that might be related to the event. The Naive Bayes algorithm is depicted by (5), where $v_j$ depicts the value of the class attribute and $a_i$ refers to the value assumed by each of the other attributes. (There are j number of classes as $v_1, v_2, \ldots, v_j$.)

$$v_{NB} = \arg\max_{v_j} P(v_j) \prod_{i=1}^{n} P(a_i \mid v_j) \qquad (5)$$

The Naive Bayes assumption depicted by (6) is applied when using this algorithm.

$$P(a_1, a_2, \ldots, a_n \mid v_j) = \prod_{i=1}^{n} P(a_i \mid v_j) \qquad (6)$$

C4.5: C4.5 [8] is a decision tree algorithm which is a variant of the original ID3 algorithm devised by Ross Quinlan. Information gain is the function used in this method to determine the attribute to be chosen at each level of the tree. The attribute with the highest gain is chosen as root, whereas the next attributes are chosen in descending order of their gains.

Random Forests: The Random forests algorithm [6] forms a combination of prediction trees and produces the eventual result using ensemble methods. Thus the final class label is predicted via studying the class label prediction of each tree. Each tree is constructed using a sample of the training dataset, where each sample is created with replacement, adhering to the bootstrapping mechanism. The overfitting problem of general decision trees is minimized by this classifier.

6.2 Experimental Setup
As mentioned before, only the seven majority emotion classes were retained for further study. Since this dataset depicts class imbalance (the number of data points from each class is not equal) to a certain extent, classification experiments were executed on undersampled and oversampled datasets as well. The undersampling algorithm applied strives to reduce the number of data points from each class to equal the number of data points in the smallest class (C1). In the oversampling mechanism, suitable percentages must be provided for creation of synthetic data points in each class, to render the magnitude of each class closer to the magnitude of the two largest classes, C7 and C17 (e.g. 200% adds another 2404 data points to C1, thus resulting in 3606 data points). Table 3 represents data distribution subsequent to undersampling and oversampling the dataset.

Table 3: Undersampling and oversampling the dataset

Emotion class | No. of data points after undersampling | No. of data points after oversampling
C1 | 1202 | 3606
C2 | 1202 | 3468
C3 | 1202 | 3278
C7 | 1202 | 3821
C17 | 1202 | 3723
C22 | 1202 | 3502
C25 | 1202 | 3600

The evaluation metrics utilized to assess the aptitude of classifiers are as follows. (The abbreviations stand for: TP - True Positive, TN - True Negative, FP - False Positive and FN - False Negative.)

Recall: Also known as true positive rate, sensitivity and hit rate, this helps evaluate the
probability of correctly labeling members of the target class [13].
Recall = TP / (TP + FN).

Precision: Also known as positive predictive value, this helps evaluate the probability that a positive prediction is correct [13].
Precision = TP / (TP + FP).

F-measure: This is the harmonic mean of precision and recall [13].
F-measure = 2 * Precision * Recall / (Precision + Recall).

False Positive Rate: Otherwise known as false alarm rate, this reflects the probability of falsely rejecting the null hypothesis for a particular test (i.e. the classifier inaccurately states that an instance belonging to the negative class is positive) [13].
False Positive Rate = FP / (FP + TN).

Accuracy: This measure evaluates the proportion of correct predictions by a classifier.
Accuracy = (TP + TN) / (TP + TN + FP + FN).

Area under the Curve: AUC [13] is a graphical measure which depicts the area under the Receiver operating characteristic (ROC) curve. The ROC curve represents true positive rate against false positive rate, thus providing an indication regarding the overall performance of a classifier. Table 4 depicts classifier evaluation based on AUC value.

Table 4: Classifier performance based on AUC value

Area under the Curve | Reflection on performance
0.9 - 1 | Excellent
0.8 - 0.9 | Good
0.7 - 0.8 | Fair
0.6 - 0.7 | Poor
0.5 - 0.6 | Fail

Figure 2 depicts the confusion matrix on which each evaluation measure is based. The two classes concerned are the positive class and the negative class. Classification output is labelled as TP, TN, FP and FN by considering the actual class of a data point and the class into which it is classified. Obtaining relatively high values for Recall, Precision, F-measure, Accuracy and AUC is preferred, whereas acquiring low values for False Positive Rate is desired.

Actual class | Classified as Positive | Classified as Negative
Positive | True Positive (TP) | False Negative (FN)
Negative | False Positive (FP) | True Negative (TN)

Figure 2: Confusion matrix

6.3 Audio Based Classification Experiment
Figure 3 depicts accuracy of audio based classification. Thus, Random Forest with oversampling could be inferred as the best solution, based on the Accuracy metric.

Figure 3: Accuracy of audio based classification

Table 5: Legend for figures 4, 5, 6, 7 and 8

OS | Oversample + SVM
OR | Oversample + Random Forest
ON | Oversample + Naive Bayes
OC | Oversample + C4.5
US | Undersample + SVM
UR | Undersample + Random Forest
UN | Undersample + Naive Bayes
UC | Undersample + C4.5
NS | Non-sample + SVM
NR | Non-sample + Random Forest
NN | Non-sample + Naive Bayes
NC | Non-sample + C4.5
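The evaluation metrics defined in Section 6.2 follow directly from the four confusion-matrix counts of Figure 2. The sketch below computes them for one class treated as positive. The counts are hypothetical, and the handling of the band edges in Table 4 (each band includes its lower bound, anything below 0.6 is "Fail") is an assumption, since the table leaves the boundaries ambiguous.

```python
def classification_metrics(tp, fn, fp, tn):
    """Metrics of Section 6.2 from confusion-matrix counts.
    Assumes every denominator is non-zero."""
    recall = tp / (tp + fn)        # true positive rate
    precision = tp / (tp + fp)     # positive predictive value
    f_measure = 2 * precision * recall / (precision + recall)
    fp_rate = fp / (fp + tn)       # false alarm rate
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {
        "recall": recall,
        "precision": precision,
        "f_measure": f_measure,
        "fp_rate": fp_rate,
        "accuracy": accuracy,
    }


def auc_label(auc):
    """Map an AUC value to the performance label of Table 4.
    Assumption: bands include their lower bound; below 0.6 is 'Fail'."""
    for lower, label in ((0.9, "Excellent"), (0.8, "Good"), (0.7, "Fair"), (0.6, "Poor")):
        if auc >= lower:
            return label
    return "Fail"


# Hypothetical counts for a single class, for illustration only.
m = classification_metrics(tp=80, fn=20, fp=10, tn=90)
```

For these counts recall is 0.8 and the false positive rate is 0.1, which illustrates why the two are read in opposite directions: higher is better for the former, lower for the latter.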
Legend related to Figures 4 to 13 is depicted in Table 5.

According to Figure 4, which depicts values obtained for recall metric, classes C1, C2 and C25 have achieved the best results with Oversampling + Random Forest. C3 and C22 have obtained best results with Undersampling + SVM, whereas classes C7 and C17 have obtained best results with Non-sampled SVM.

Figure 4: Recall: Audio based classification

Figure 5 depicts results obtained for precision metric. Best precision for classes C1, C2, C3, C22 and C25 has been obtained with Oversampling + Random Forest. Classes C7 and C17, however, have depicted best performance with Non-sampled Naive Bayes.

Figure 5: Precision: Audio based classification
Figure 6 depicts values obtained for area under the curve measure (AUC). Best results were obtained for each class when the Random Forest algorithm was executed on the Oversampled dataset. According to Table 4, classification performance of Random Forest with Oversampling could be categorized as good for classes C1 and C3, whereas it has performed fairly for classes C2, C22 and C25.

Figure 6: Area under the curve (AUC): Audio based classification

Figure 7 depicts results obtained for f-measure. Classes C1, C2, C3, C22 and C25 have obtained best values for f-measure with Oversampling + Random Forest. Classes C7 and C17 have achieved best results with Non-sampled SVM.

Figure 7: F-measure: Audio based classification
Figure 8: False positive rate: Audio based classification

Unlike for other metrics, obtaining a minimal value for false positive rate is preferred. Hence, as shown in Figure 8, which depicts values obtained for false positive rate, classes C1, C2, C3 and C25 have achieved best results with the Non-sampled SVM method. Classes C7 and C17 have performed best with Oversampling + Naive Bayes. Class C22 has obtained the best classification result with Oversampling + Random Forest.

Table 6: Best classification algorithm according to different metrics: Audio based classification

Class | Recall | Precision | AUC | F-measure | FP rate
C1 | OR | OR | OR | OR | NS
C2 | OR | OR | OR | OR | NS
C3 | US | OR | OR | OR | NS
C7 | NS | NN | OR | NS | ON
C17 | NS | NN | OR | NS | ON
C22 | US | OR | OR | OR | OR
C25 | OR | OR | OR | OR | NS

Table 6 summarizes the results obtained for each metric in audio based classification. Random Forest with Oversampling has qualified as the best classifier in 21 out of 35 instances according to Table 6 (when metric + class pairs are considered individually, 35 instances are obtained). Thus it could be assigned an aptitude score of 0.6 (21/35 = 0.6). Scores assigned for other algorithms are as follows.

Non-sampled SVM = 0.22
Undersampled SVM = 0.06
Non-sampled Naive Bayes = 0.06
Oversampled Naive Bayes = 0.06

Random Forest with Oversampling could be assigned a higher score when considering the algorithm which has produced best results with respect to different metrics and individual classes. Furthermore, in the instances where Oversampling + Random Forest has not qualified as the best classifier, it is often included among the best five. Figure 3 depicts that best accuracy could be obtained with Oversampling + Random Forest. Thus, for audio based classification, Random Forest with Oversampling has qualified as the comparatively best classifier.

Due to the positive impact oversampling has on the experiments, lyric based classification was executed on the oversampled dataset alone.
6.4 Lyric Based Classification Experiment

As depicted by Figure 9, classes C1, C2, C3 and C25 have obtained best recall with the C4.5 algorithm. Random Forest has produced best results for C22 and C17, whereas SVM has surpassed other algorithms for class C7.

Figure 9: Recall: Lyric based classification

Figure 10 depicts the precision acquired with lyric based classification. Random Forest has performed best for classes C1, C3 and C25 according to the precision metric, whereas Naive Bayes has produced best results for classes C2 and C17. C4.5 has given best results for C7 and SVM for C22.

Figure 10: Precision: Lyric based classification

According to the AUC metric, Random Forest has outperformed the rest for all classes, as shown in Figure 11.

Figure 11: AUC: Lyric based classification
Figure 12 is a graphical representation of how well classifiers have performed with respect to the F-measure metric. According to that, C17 and C22 have shown best results for Random Forest. C1, C2, C3 and C25 have depicted best results for C4.5, whereas C7 has shown best performance for SVM.

Figure 12: F-measure: Lyric based classification

As mentioned before, getting a minimal value for False Positive Rate is desired. Thus, according to Figure 13, classes C1, C2, C3 and C25 have shown best performance with SVM. C7, C17 and C22 have performed best with Naive Bayes.

Figure 13: FP-rate: Lyric based classification

Table 7: Best classification algorithm according to different metrics: Lyric based classification

Class | Recall | Precision | AUC | F-measure | FP rate
C1 | OC | OR | OR | OC | OS
C2 | OC | ON | OR | OC | OS
C3 | OC | OR | OR | OC | OS
C7 | OS | OC | OR | OS | ON
C17 | OR | ON | OR | OR | ON
C22 | OR | OS | OR | OR | ON
C25 | OC | OR | OR | OC | OS

Table 7 summarizes the performance analysis of lyric based classification. Only the oversampled dataset was used for lyric based classification. An aptitude score could be assigned to each lyric based classification experiment as follows.

C4.5 = 0.257
SVM = 0.2
Random Forest = 0.4
Naive Bayes = 0.143

Random Forest has qualified as the best classifier for lyric based classification as well, as conveyed by the associated aptitude scores. Furthermore, the AUC score depicts that Random Forest outperforms any other classifier for lyric based classification.

Oversampling is a technique which ensures that valuable information in the dataset is not lost when attempting to balance the dataset. Rather, it creates synthetic data in emotion classes with fewer data points, such that the characteristics of the newly created data are
similar to other data points in the respective class. On the contrary, undersampling results in loss of certain information, which is a probable reason why oversampling performs better than undersampling.

Random forest is an ensemble method, where a multitude of decision trees are created when creating the classification model. This allows selecting the most probable emotion class of a given song. The other three classification methods attempted, however, do not adopt this ensemble method. Rather, they classify a song to a single specific emotion class alone, due to those being devoid of an interim phase where several probable classes are found. Since a given song often has the potential of evoking several emotions, Random forest seemingly has performed better at predicting the most likely emotion class to which a song belongs.

7 CONCLUSION AND FUTURE WORK

The potential for music to evoke emotion in individuals has forever been an inexplicable, yet evident phenomenon. While numerous research works have been conducted to deduce whether the incurrence of emotion when listening to music is scientifically explicable, the literature survey conducted helped realize that there existed numerous limitations with existent mechanisms. One such limitation was the usage of generic emotion models for predicting emotions in music. It has been observed that emotions which are incurred when listening to music take a different form than generic emotions. For instance, the sadness evoked by music is not necessarily as intense as sadness evoked from real life experience. In fact, sadness (melancholic) belonged to the same cluster as calming and romantic in the music specific emotion model created based on the MSD dataset. Thus creation of a music specific emotion model for classification of music was viewed to be a positive aspect of this study. Seven significant emotion classes were thus identified, which are depicted in Table 8.

Table 8: Music emotion classes identified with emotion model

Musically induced emotion
Joyful, Danceable
Witty
Calming, Melancholic, Romantic
Sensual
Cool
Psychedelic
Nostalgic

A series of experiments were conducted to attempt classification of music into recognized emotion classes. Of the classification attempts, the best music emotion prediction model was provided by the combination of Random Forest with oversampling. This fact is supported by the aptitude scores provided in Table 9, which summarizes the performance of each classifier for audio and lyric based classification.

Table 9: Summary of results

Classifier | Audio: Sampling | Audio: Score | Lyric: Sampling | Lyric: Score
C4.5 | - | - | OC | 0.257
SVM | NS | 0.22 | OS | 0.2
SVM | US | 0.06 | |
Random Forest | OR | 0.6 | OR | 0.4
Naive Bayes | NN | 0.06 | ON | 0.143
Naive Bayes | ON | 0.06 | |

A simple comparison of the AUC values helps deduce the fact that audio based classification is superior to lyric based classification. Fair performance has only been obtained for C2 in lyric based classification, according to Table 4. Thus it could be concluded that audio features are more apt for forming music emotion recognition models.
The emotion model formed in this study was reliant on the assumption that emotion related tags associated with music convey perceived emotion. This assumption may not always be valid, since words such as "love" could be conveying that the song is a love song, but not that the emotion was felt by a person. This limitation could be resolved by collecting information from a number of people regarding what emotions they experienced when listening to songs from a dataset.

REFERENCES

[1] Princeton University, "About WordNet," WordNet, 2010. [Online]. Available: https://wordnet.princeton.edu/. [Accessed: 02-Aug-2015].

[2] A. Gabrielsson, "Emotion perceived and emotion felt: Same or different?," Music. Sci., vol. 5, no. 1, pp. 123-147, 2002.

[3] K. R. Scherer and M. R. Zentner, "Emotional effects of music: production rules," in Music and Emotion: Theory and Research, 2001, pp. 361-387.

[4] M. Zentner, D. Grandjean, and K. R. Scherer, "Emotions evoked by the sound of music: characterization, classification, and measurement," Emotion, vol. 8, no. 4, pp. 494-521, Aug. 2008.

[5] I. Rish, "An empirical study of the naive Bayes classifier," Int. Jt. Conf. Artif. Intell., vol. 3, no. 22, pp. 41-46, 2001.

[6] L. Breiman, "Random Forests," Mach. Learn., vol. 45, no. 1, pp. 5-32, 2001.

[7] I. Steinwart and A. Christmann, Support Vector Machines. New York: Springer, 2008, pp. 1-25.

[8] N. V. Chawla, "C4.5 and Imbalanced Data sets: Investigating the effect of sampling method, probabilistic estimate, and decision tree structure," in Proc. Int'l Conf. Machine Learning, Workshop Learning from Imbalanced Data Sets II, 2003, p. 8.

[9] P. Kanters, "Automatic mood classification for music," Tilburg University, Netherlands, 2009.

[10] J. Han and M. Kamber, Data mining: concepts and techniques, 2nd ed. Elsevier, 2006.

[11] S. Guha, R. Rastogi, and K. Shim, "ROCK: a robust clustering algorithm for categorical attributes," in Proceedings., 15th International Conference on Data Engineering, 1999, pp. 512-521.

[12] T. Hill and P. Lewicki, Statistics: Methods and Applications, 2nd ed. StatSoft, Inc., 2007, p. 800.

[13] D. M. Powers, "Evaluation: From precision, recall and f-measure to ROC, informedness, markedness & correlation," J. Mach. Learn. Technol., vol. 2, no. 1, pp. 37-63, 2011.
Proposition of an Intelligent System for Predictive Analysis Using Medical Big Data

Basma Boukenze 1,*, Abdelkrim Haqiq 2

1 Computer, Networks, Mobility and Modeling laboratory, FST, Hassan 1st University, Settat, Morocco
Email: basma.boukenze@gmail.com
2 Computer, Networks, Mobility and Modeling laboratory, FST, Hassan 1st University, Settat, Morocco; e-NGN Research Group, Africa and Middle East
Email: ahaqiq@gmail.com

ABSTRACT

Emerging technologies such as mobile applications, cloud computing, big data analytics and predictive analytics have revolutionized all sectors. This is particularly true for the healthcare system as a sensitive sector. Nowadays, the healthcare industry mainly depends on information technology to provide the best services. A big-data revolution in healthcare starts with the vastly increased supply of health data. In fact, these new technologies are applied to improve the medical sector.
This paper proposes an intelligent system that combines big-data analysis with data-mining and mobile healthcare techniques for self-monitoring. The system attempts to exploit healthcare data through an intelligent analysis process and big data processing. This approach aims to extract useful knowledge to be used in decision making and to ensure real-time medical monitoring.

KEYWORDS

Emerging Technologies, Healthcare Data Analysis, Mobile Application, Cloud Computing, Big Data Analytics, Data Mining, Learning Algorithm

1 INTRODUCTION

Nowadays, a number of technologies including cloud computing, mobile phones, big data and predictive analytics have been identified as Emerging Technologies (ET) [1]. ET are defined as innovations that evolve constantly and at an accelerating pace and that touch every sector, resulting in big changes in processes and procedures. Efforts are now being multiplied to apply ET in sensitive sectors such as the healthcare system.
Digitized information is omnipresent because data is growing and moving faster than healthcare organizations can consume it. This is mainly due to the efforts of researchers in the medical field and their discoveries; human DNA is one example. Widespread use of electronic medical records will totally transform medical care [2]. The latest innovations concerning genetics and smart homes or smart places enable patient self-monitoring and treatment using simpler devices [3]. With the appearance of sensing technology like M-health [4], healthcare data appears like a digital flood creating puddles and lakes, creeks and torrents of data, and this increases in parallel with the rapid growth in the use of mobile devices like smart phones, laptops, tablets and personal sensors.
Large data volumes at high velocities were originally a characteristic of supercomputers, nuclear physics, military simulations and space travel. Late in the 20th
century, bigger and faster data appeared in airline and bank operations, particularly with the growth of credit cards. Starting in 1990, the Human Genome Project was the launch of Big Data in healthcare [5], and this was due to a statistic showing that 80% of medical data is unstructured yet clinically relevant and highly significant. This data resides in multiple places like individual EMRs, lab and imaging systems, physician notes, medical correspondence, claims, CRM systems, and finance.
Big Data analytics has the potential to slow the ever-increasing costs of care, help providers practice more effective medicine, empower patients and healthcare providers, support fitness and preventive self-care, and make it possible to envision more personalized and predictive medicine. Combined with social media, cloud computing, and intelligent procedures for managing, analyzing and extracting information from data, this approach will transform the healthcare system and give the power to explore, predict and, why not, anticipate the cure. Big-data analysis promises that the future is no longer mysterious.
We discuss the great role played by new technology in the field of health, such as healthcare analysis, and then we present our proposed system and its contribution.
The rest of this paper is organized as follows: in Section 2, we present related works concerning technologies applied in the healthcare system and research work in this field. Section 3 is reserved for the description of our proposed platform. The last section gives conclusions and perspectives.

2 RELATED WORKS

The medical industry has been swamped by new technologies, as many examples show. One is the implementation of e-health systems, such as the one designed and implemented at Kagawa University for academic health education. This E-Healthcare system is a form of private cloud service for university students, who can get their health records from physical measurement devices with authentication based on a smart card. Their health data is managed in the appropriate database, doctors and nurses can investigate the relevant data according to their requests, and the necessary information for self-healthcare is provided by the Web service [6].
Cloud computing, as a new technology applied in the healthcare system, brings many benefits by creating a network between doctors, patients and healthcare institutes, and it facilitates access to medical information anywhere and anytime [7]. Cloud computing provides healthcare with much appreciated services concerning data handling by ensuring [8, 9]:

Resiliency: platforms offered by cloud service providers are characterized by a powerful infrastructure that provides redundancy and storage of any data quantity to ensure high availability anytime and anywhere.

Mobility: the cloud infrastructure provides the backbone for medical personnel to access all sorts of information from any location and from a whole set of devices; communication is easier given that the facility of access is the same for one patient or for several at the same time.

Privacy: cloud computing platforms are characterized by a higher level of security than a local IT department in a hospital can ensure.

External management: with a cloud provider, there is no need to perform updates, install certificates or repair blocked systems.

In addition to all these benefits, the cloud adapts to all situations to ensure ease of access at a high level.

Many researchers are currently focusing on the benefits of new technologies [10, 11], for their advantages and promises, including the great role
of cloud computing in the stage of managing healthcare data that are becoming increasingly large.
More than that, some of them give the design of a cloud computing-based Healthcare SaaS Platform (HSP) to deliver healthcare information services with low cost, high clinical value and high usability with a high level of security [12,13].

Big data analysis, especially in the healthcare area, is considered a revolutionary approach to improving the quality of healthcare service [14, 15], because analytics is expected to play a pivotal role in the future of the healthcare system. As a result of research to develop the healthcare sector [16], systems are obliged to receive new forms of data such as human DNA and genetic data; hence the necessity of leveraging all these resources to better human health. Analytics are also now applied in healthcare to compare the cost and effectiveness of interventions, treatments, public health policies, or medical devices to reduce failed investments.
In fact, this kind of analysis can give the best solution to prevent medical disasters. For example, infectious diseases can be predicted by healthcare data analysis, so that the health authority could manage the situation and save lives.
Also, we will soon be awash in genomic data [17]; given the incredible size and dimensionality of these datasets, the field of analytics will need to borrow techniques to face it and to make it useful.

In addition to that, some predictive analytics platforms for disease targets across varying patient cohorts using electronic health records (EHRs) have been created to facilitate specific biomedical research workflows, such as refinement of hypotheses or data semantics [18].
A lot of tools are now used to create platforms for big data analytics; the best known is the open-source distributed data processing platform Hadoop (Apache platform) [19]. It belongs to the class of "NoSQL" technologies that have evolved to manage data at high volume. Hadoop has the potential to process extremely large amounts of data mainly by allocating partitioned data sets to numerous servers (nodes), each of which solves different parts of the larger problem and then integrates them for the final result [20,21]. Hadoop can serve both as a data organizing tool and as a data analyzing tool. Hadoop can handle very large volumes of data with different structures or no structure at all. But Hadoop is a little difficult to install, configure and manage, and people with Hadoop skills are not easily found. For these reasons, it appears organizations are not quite ready to embrace Hadoop completely.
The adoption of EHRs and electronic data, which is becoming the norm in healthcare, prepares a base for applying analysis and enables the building of predictive analytic solutions. These predictive models have the potential to lower cost and improve the overall health of the population. As predictive models become more pervasive, some standards appear to be used by all the parties involved in the modeling process, such as the Predictive Model Markup Language (PMML) [22]. It allows predictive solutions to be easily shared between applications and systems, and it can be used to expedite the adoption and use of predictive solutions in the healthcare industry.

According to our research, there are many efforts to create platforms based on cloud computing for managing medical records and simplifying access to data. The patient does not care about the way in which his doctor manages his medical data. What matters most to him is the positive impact of this on his health situation. What we propose is a platform that combines the benefits of mobile healthcare and big data analysis, with exploration and extraction of useful knowledge and real-time self-monitoring for patients as its primary objective.

3 SYSTEM ARCHITECTURE

3.1 System Characteristics:

The proposed solution is an intelligent system that we named Intelligent Predictive Healthcare System (IPHCS). It analyzes medical data coming from different sources in real time; this process helps to decide about the patient's health condition by using the extracted
276
information from his own data. So the system will:
• be hosted in a cloud so that it can be accessed anytime, anywhere, and from any communication equipment,
• make a quick analysis in real time to give accurate future information using intelligent and very specific tools.
The roles of the two main actors (Patient / Doctor) in the system are as follows.
The doctor accesses IPHCS to:
• consult patient profiles;
• monitor and control the health status of each patient;
• introduce new data (patient or subscription of treatment).
The patient has a dual interaction with the system:
• Indirect interaction: the patient follows the guidelines of his doctor, who in turn relies on IPHCS to decide.
• Direct interaction: the patient has a medical device (such as a smartphone, smart watch, or bracelet) equipped with a sensor designed to detect, for example, heart rhythm using infrared LEDs and a photodiode sensor, or to evaluate the intensity of effort by measuring heart rate.
Information received by the sensor is managed by the IPHCS, which then produces the following:
• A report sent to the doctor to alert him about the change in patient status.
• An emergency warning message sent to the patient in case of emergency to alert and warn him about his status, pending the intervention of the concerned doctor.
The captured information is sent to and subsequently managed by the system, which monitors it in real time. The doctor intervenes on the basis of the received report, and the patient is contacted as necessary (Figure 1).

Figure 1. Typical intelligent healthcare system schema

3.2 System Architecture

IPHCS is able to:
• analyze a large amount of medical data;
• predict, by data mining techniques, what complications and pathologies the patient may have in the future;
• anticipate the cure and treatment;
• monitor the patient in real time;
• provide patients with the opportunity for self-monitoring in real time through mobile health devices.
Our proposed system architecture, shown in Figure 2, comprises several steps.

Figure 2. Architecture of the predictive analysis system - Health Care Application

Data collection: this is the most important and sensitive phase because data is the main element and the pivot of the system. We must mention that the more accurate the data, the more accurate the predicted information. The voluminous medical data can come from various Electronic Health Record (EHR) / Patient Health Record (PHR) systems, clinical systems, and external sources such as government sources, laboratories, pharmacies, insurance companies, etc., in various formats (flat files, .csv, tables, ASCII/text, etc.).
Storage and processing: this is a very important phase, since it demands very powerful techniques and tools to manage and process the voluminous data.
Predictive analysis: this is the master step of the whole process, because it rests on the exploration of the analyzed data to extract useful knowledge, using data mining tools and algorithms to find links between the medical data.
Processing visualized reports: the results obtained after the predictive analysis process are exploited by doctors and patients as follows. The doctor uses them as an aid to decision making and to get a general view of the patient's status. The patient receives the results of this process through his doctor, but he always remains in interaction with the system through a mobile device that he owns.

3.3 Used Technologies

In the first layer, Hadoop is used as an open source framework designed to perform processing on massive medical data. The operating principle is as follows.
The infrastructure applies the well-known principle of grid computing: dividing the execution of a process across multiple nodes or clusters of servers.
In Hadoop architecture logic, the data list is divided into several parts, each part being stored on a different server cluster. Instead of concentrating processing in a single cluster, as is the case for traditional architecture, the distribution of information helps distribute the processing across all compute nodes on which the list is distributed [23].
To implement such a technical process, Hadoop is coupled to a file system called HDFS (Hadoop Distributed File System). It manages the allocation of storage of user data in blocks of information on different nodes. HDFS was inspired by a technology used by Google for its own cloud services, known as Google File System (GFS).
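As a rough illustration of this divide-and-distribute principle, the following sketch splits a small list of records into blocks, places the blocks on nodes, processes each block independently and merges the partial results. It runs in a single Python process and does not use Hadoop or HDFS; the record fields, node names, block size and helper functions are all hypothetical and chosen only for the example.

```python
from collections import Counter
from functools import reduce


def split_into_blocks(records, block_size):
    """Cut the record list into fixed-size blocks (the last one may be shorter)."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def assign_blocks(blocks, node_names):
    """Place blocks on nodes round-robin (a simplification of real block placement)."""
    placement = {name: [] for name in node_names}
    for index, block in enumerate(blocks):
        placement[node_names[index % len(node_names)]].append(block)
    return placement


def count_diagnoses(block):
    """Per-block work: count how often each diagnosis appears in one block."""
    return Counter(record["diagnosis"] for record in block)


def merge_counts(left, right):
    """Combine two partial counts into one."""
    return left + right


# Hypothetical toy records standing in for medical data.
records = [
    {"patient": 1, "diagnosis": "diabetes"},
    {"patient": 2, "diagnosis": "hypertension"},
    {"patient": 3, "diagnosis": "diabetes"},
    {"patient": 4, "diagnosis": "asthma"},
    {"patient": 5, "diagnosis": "hypertension"},
]

blocks = split_into_blocks(records, block_size=2)
placement = assign_blocks(blocks, ["node-a", "node-b"])

# Each node processes only the blocks it holds; here the nodes are simulated in a loop.
partials = [
    count_diagnoses(block)
    for node_blocks in placement.values()
    for block in node_blocks
]

# The partial results are merged into the final answer.
total = reduce(merge_counts, partials, Counter())
print(dict(total))
```

In a real cluster the per-block work would run in parallel on separate machines; the loop here only mimics that structure.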
Map/Reduce: the distribution and management of the calculations are carried out by MapReduce. This technology combines two types of function:
The Map function, which resides on the master node, divides the input data or task into smaller subtasks and distributes them to worker nodes; the workers process the smaller tasks and pass the answers back to the master node. The subtasks are run in parallel on multiple computers.
The Reduce function collects the results of all the subtasks and combines them to produce an aggregated final result, which is returned as the answer to the original big query.
The second layer is characterized by the major role of the MapReduce module in the process of predictive analysis. To further reinforce the system in matters of prediction, it must be equipped with a powerful predictive or learning algorithm to support the important phases of the process and build a suitable prediction model.
Data mining is a delicate process executed by predictive algorithms that have shown strong effectiveness and efficiency in prediction, for example the support vector machine (SVM) [24], the decision tree (C4.5) [25], and Naive Bayes (NB) [26], which are currently classified among the top 10 classification methods identified by IEEE [27].
For that reason, our system should be equipped with one of the cited learning algorithms, or with a combination of several learning algorithms, to benefit from their performance and build a powerful hybrid algorithm that can be applied to all types of medical prediction.

4 CONCLUSION

Certainly, predicting the future is no longer a difficult task. With emergent technologies, the medical field will benefit from all the voluminous medical data to extract knowledge that helps decision making and cost reduction, and to go beyond subscription processing up to the prediction of diseases before their appearance. This matters because medicine has long faced certain diseases called silent diseases, like some chronic diseases, which early diagnosis can eradicate.
With predictive analysis of big data, we will have the power to solve the major problems that are a real challenge to the development of medicine, in order to save human lives.

REFERENCES

1. Macharia, P., Singh, U.G., Zuva, T.: Improving Quality and Access to Healthcare by Adopting Emerging Technologies, International Journal of Digital Information and Wireless Communications (IJDIWC), 2016, vol. 6, pp. 41-45.
2. Richard, H., James, B., Anthony, B., Federico, G., Robin, M., Richard, S. and Roger, T.: Can Electronic Medical Record Systems Transform Health Care? Potential Health Benefits, Savings, and Costs. Health Affairs, 24, no. 5 (2005): 1103-1117.
3. Marianthi, T., Nikos, T.: Smart Home Solutions for Healthcare: Privacy in Ubiquitous Computing Infrastructures.
4. Santosh, K., Wendy, J.: Mobile Health Technology Evaluation; ELSEVIER; Volume 45, Issue 2, August 2013, Pages 228-236.
5. Bonnie, F., Ellen M., Tobi, S.: Big Data in Healthcare Hype and Hope; Business Development for Digital Health; October 2012.
6. Miyazaki, M., Kamano, H., Imai, Y.: An e-Healthcare System for Ubiquitous and Life-long Health Management, International Journal of Digital Information and Wireless Communications (IJDIWC), 2016, vol. 6, pp. 163-172.
7. Madhusudhana, R., Rambhupal, B.: Survey of Adapting Cloud Computing in Healthcare; International Journal of Advanced Research in Engineering and Applied Sciences; 2278-6252.
8. Barbara, C., Mario, C.: Cloud Computing In Healthcare And Biomedicine, SCPE; Volume 16, Number 1, pp. 1-18.
9. Sanjay, P., Sindhu, M., Jesus, Z.: A Survey of the State of Cloud Computing in Healthcare; Canadian Center of Science and Education. Vol. 1, No. 2, 2012, September 19, 2012.
10. Jonathan, N., Brian, M., Sharat, K.: Healthcare in the cloud: the opportunity and the challenge. MLD.
11. Lena, l., Hans-Ulrich, P.: A scoping review of cloud computing in Healthcare; BMC Medical Informatics and Decision Making (2015); DOI 10.1186/s12911-015-0145-7.
12. Sunkara, V., Akula, S.: Role of Cloud Computing in Health Monitoring System; IJSEAT, Vol 2, Issue 10, October 2014.
13. Sungyoung, O., Jieun C.: Architecture Design of Healthcare Software-as-a-Service Platform for Cloud-Based Clinical Decision Support Service; Health Inform Res. 2015 April; 21(2):102-110. pISSN 2093-3681, eISSN 2093-369X.
14. Michael, J., Keith, A., Craig, M.: Applications of business analytics in healthcare. ELSEVIER. Business Horizons (2014) 57, 571-582.
15. Nura, E., Mohammad Reza, B.: Knowledge discovery in medicine: Current issue and future trend; ELSEVIER; Expert Systems with Applications 41 (2014) 4434-4463.
16. Srinath, S., Sameep, M.: Big Data Analytics. Springer; December 20-23, 2014; Volume 8883, 2014. Available at http://link.springer.com/book/10.1007/978-3-319-13820-6
17. Peter, G., Basel, K.: The Big Data Revolution in Healthcare; McKinsey & Company, April 2013. Center for US Health System Reform, Business Technology Office. Accelerating-value-and-innovation.pdf
18. Kenney, N., Amol G.: PARAMO: A Parallel Predictive Modeling Platform for Healthcare Analytic Research Using Electronic Health Records; ELSEVIER. Journal of Biomedical Informatics 48 (2014) 160-170.
19. Wullianallur, R., Viju, R.: Big Data Analytics in Healthcare: Promise and Potential, Health Information Science and Systems 2014.
20. Borkar, V., Carey, M., Chen, L.: Big Data Platforms: What's Next? ACM Crossroads 2012, 19(1):44-49.
21. Zikopoulos, P., Eaton, C., DeRoos, D., Deutsch, T., Lapis, G.: Understanding Big Data Analytics for Enterprise Class Hadoop and Streaming Data. McGraw-Hill: Aspen Institute; 2012.
22. Alex, G.: Predictive Analytics in Healthcare: the Importance of Open Standards. ZEMENTIS, IBM developerWorks. 29 November 2011.
23. Alfredo, C., Yeol S.: Analytics Over Large-Scale Multidimensional Data: The Big Data Revolution; DOLAP'11, October 28, 2011, ACM 978-1-4503-0963-9/11/10.
24. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, Cambridge University Press, 2013, 200 pp.
25. Rahman, R., Rabbi, F.: Using and comparing different decision tree classification techniques for mining ICDDR,B Hospital Surveillance data, Elsevier, V. 38, pp. 11421-11436.
26. Ang, S., Ong, H., Chin Low, H.: Classification Using the General Bayesian Network, science and technology, 24 (1), pp. 205-211 (2016).
27. Top Data Mining Algorithms Identified by IEEE & Related Python Resources: http://www.datasciencecentral.com/profiles/blogs/python-resources-for-top-data-mining-algorithms
A Hybrid Credibility Analysis Method Applied on Turkish Tweets with TV News and Discussion Programs Related Content

Ali Fatih Gündüz¹ and Pınar Karagöz²
¹ Akçadağ Vocational School, İnönü University, Akçadağ, Malatya, Turkey
² Computer Engineering Department, Middle East Technical University, Çankaya, Ankara, Turkey
fatih.gunduz@inonu.edu.tr, karagoz@ceng.metu.edu.tr
ABSTRACT

In this paper, credibility analysis of microblog messages is studied. We collected our data from one of the most important microblogging services, Twitter. Our data set is created from the Turkish tweets written for weekly television programs broadcast in Turkey with political, cultural and/or financial contents. We adopted a new credibility definition based on three dimensions: being free from offensive words, not being spam and being newsworthy. To analyse the credibility of tweets, we proposed a method that hybridizes supervised learning based techniques with graph based techniques. Moreover, we used content based and collaborative filtering based techniques in this hybrid method. The proposed method consists of two phases: a supervised learning phase and a graph based improvement phase. Supervised classification algorithms are applied in the first phase and the results are improved in the graph based phase of the method. We focused on tweet-tweet, tweet-writer and writer-writer relations in the second phase of our study. The performance of the proposed method is measured by comparing it with human volunteers' overall evaluation at the end.

KEYWORDS

Twitter, social media, information, credibility, microblogs, data mining, machine learning, natural language processing, collaborative filtering, cosine similarity, classification, content based filtering

1 INTRODUCTION

Internet based communication tools provide a perfect research area, and information credibility is an important aspect of communication. From past to today, we have observed change in the technology and the tools of communication. However, measuring the believability of messages has always remained an important and crucial task. Today internet based social media tools are on the rise not only in Turkey but also in the world. The number and variety of web-based media platforms available is impressive.
Social media became extremely popular for many reasons. Following personal and professional pursuits, building and sustaining friendship relations, advertising business promotions and many other purposes are met by those online microblogging services. Another reason why social networking attracted public attention is the ability to post about spontaneous events easily. People can express their feelings by sharing messages on many topics ranging from their daily and ordinary activities to cultural and social events.
Twitter¹ is one of the most important microblogging services and it allows its users to share 140 character long messages called tweets. Twitter has been providing an online environment in which people can share their ideas, comments and concerns since 2006. Today it is a huge company with 313 million active users. Every day millions of tweets are written in this online environment about many topics.
¹ https://about.twitter.com/tr/company
Users of Twitter can read, favourite and share tweets of other users. Moreover, users can create friendship networks among themselves by following each other. Those friends can enjoy
creating private or public messages. To send public messages, users mention their friends' user names in the tweet content by adding the @ character in front of them. This practice is called mentioning and this tag is called a mention tag. Similarly, users can use another tag by adding the # character in front of a specific word to create hashtags. Twitter displays the tweets containing the same hashtags together so that users can follow and contribute to specific discussions.
Twitter enables researchers to read, query and collect tweets of users who do not disable public visibility of their statuses. Our data set is created from those publicly written tweets which are tweeted for news and discussion programs broadcast weekly on television. Many TV programs ask and encourage their audience to participate through the Twitter accounts of the programs. Most of them have Twitter accounts in order to enable their audience to contribute to the program flow by asking questions, making comments and expressing their feelings by writing tweets containing program specific hashtags and/or mention tags. Hosts of those programs read those tweets and direct the program flow accordingly if they desire to do so. However, separating junk from useful information in the tweet flood is a big and time-consuming challenge.
To analyse the credibility of a tweet, we proposed a hybrid method in this study. Content based techniques and collaborative filtering based techniques are hybridized in our method. Moreover, we improved our method by deploying word-checking algorithms based on a slang-word dictionary. The proposed method is composed of two phases, namely the supervised learning phase and the graph based improvement phase. Firstly, in the supervised learning phase, we applied machine learning classification algorithms on the tweet set, and then the classification results are improved in the graph based phase. We examined both tweets and users in order to create a connected graph from them with links between user-user, user-tweet and tweet-tweet elements. This graph is used to investigate whether credible tweets are contextually similar and whether writers of credible tweets form a closer clique in the graph. In order to achieve this, we created links between tweet-tweet nodes only if the cosine similarity of their contents is bigger than a predefined threshold. Also, we used followership relations between authors to create user-user links between nodes. Finally, tweet-author relations are used to link a tweet node with a user node in the graph.
On the other hand, we proposed a definition for credibility as well. Credible is defined as "able to be trusted or believed" by the Cambridge dictionary². By its nature, credibility is a subjective matter and it is always open to discussion. In addition, its measurement depends on individual opinions and changes greatly from person to person. Fogg and Tseng [1] state that credible information is believable information and they describe credibility as a perceived quality composed of multiple dimensions. So we based our credibility definition on three dimensions: being free from offensive words, being free from spamming and being newsworthy.
² dictionary.cambridge.org/dictionary/english/credible
Our study is conducted with binary rating when deciding the dimensional values of credibility. The proposed hybrid method is used to assign yes or no statuses to each dimension. Each one of those three dimensions is examined separately and the final credibility decision is made according to the overall results of those three dimensions. A tweet is labelled as credible only if it is free from offensive words, free from spam and newsworthy. The performance of the proposed method is analysed by comparing it with the human volunteers' classification results. Since those dimensions are likely to be interpreted variously by different people, each tweet of our data set is read by three volunteers and majority voting is used for each dimension.

2 RELATED WORKS

In 2012, Kang et al. [2] proposed two definitions for tweet credibility as degree of believability that can be assigned to a tweet about
a target topic and expected believability imparted on a user as a result of their standing in the social network. In addition to these, they stated later [3] that credibility is a function of perception consisting of the object being perceived and the person who perceives it.
Fogg [4] expressed website credibility in terms of prominence and interpretation, which are defined in his study as the likelihood of being noticed and the judgement of the people who noticed the element.
Castillo and Yamaguchi studied both credibility assessment and newsworthiness of tweets [5]. In their study, they focused on credibility of information and used the term credibility in the sense of believability. They classified tweets as credible or not. They randomly selected 383 topics from the Twitter Monitor³ [6] collection and had them evaluated through Mechanical Turk⁴ by asking evaluators whether they consider a certain set of tweets as newsworthy or only informal conversations. Then they asked another group to read the text content and state if they believe that those tweets are likely to be true or false. In this evaluation, they considered four levels of credibility and asked evaluators to provide justification in that fuzzy format. They proposed a supervised learning based method to automatically assess the credibility level of tweets, which has precision and recall rates between 70% and 80%.
³ http://www.twittermonitor.net
⁴ http://www.mturk.com/
Detecting and preventing spam is another aspect of credibility. Not only do individuals write those spam tweets, but purpose-built tweet generator tools are also used to carry out this annoying and potentially malicious activity. Ferrara et al. [7] stated that hundreds of thousands of social, economic and political incentives presented by highly crowded social media ecosystems attract spammers to design human imitating bot algorithms. Forelle et al. [8] stated that bots are used for political lobbying in several countries like Russia, Mexico, China, UK, US and Turkey.
Twitter attaches importance to the fight against spammers in order to sustain a spam-free social environment. They encourage⁵ their users to report both profiles and individual tweets for spamming. Moreover, they present technical solutions such as a link shortener (t.co) to detect whether links lead to malicious contents as well.
⁵ https://support.twitter.com/articles/64986?lang=en
To detect Twitter spam, there are two different approaches in the literature: focusing on user classification and examining tweet content. In the first approach, profile details of the user, number of followers and friends, recent activities in the previous weeks, user behaviours and tweeting frequencies are investigated. Studies like [9], [10] and [11] aimed to classify users as spammers and non-spammers according to these user attributes. The second approach considers topics of the tweets, duplications between the tweets, URLs in the tweets, and the number of words and characters in the texts. Martinez et al. [12] presented an example of this approach in which they detected spam tweets without any previous user information but by using contextual features obtained by natural language processing. Clark et al. [13] proposed a solution to the problem of separating automated spam generators from human tweeters with a classification algorithm that uses linguistic attributes like URL count, average lexical dissimilarity and word introduction rate decay.
There are hybrid solutions of user based and content based approaches like [14] and [15]. Bara et al. [15] proposed a three step solution in which they firstly look for malicious links provided by the Twitter database, secondly look for pattern similarities between spam tweets and original tweets, and finally construct a bipartite network between users and corresponding tweets.
Pal et al. [16] studied tweet credibility from another perspective by classifying tweet writers. In their study, they tried to find the most interesting and authoritative authors among millions of Twitter users for given specific topics. They computed a self-similarity score for authors between their last two tweets so as to measure how similarly an author writes. This score is
used to explore the width of the user's interest area. They also classified tweets into three categories: original tweets, conversational tweets and repeated tweets. They counted the number of tweets in the different categories for each author while deciding about their interestingness for clustering the users.
Other than classification approaches, there are graph based solutions as well. Graph based solutions basically use variations of the well-known PageRank [17] and HITS [18] algorithms. Page and Brin, with PageRank, aimed to measure and rate the relative importance of Web pages mechanically. In this algorithm, the linking design among the web pages is considered in a graph structure. Being query independent and more sophisticated than simply counting links, PageRank ranks pages according to the importance of the back links and forward links which direct to and are directed from the web pages. With the HITS algorithm, Kleinberg [18] aimed to extract information from the link structure of a network environment too. Although HITS is not solely specific to the WWW, aiming to improve web search systems it identifies two kinds of web pages: authorities, which are the pages that users look for to reach information, and hubs, which are pointer pages that lead to authorities. Kleinberg focused on the mutual relationship between those two kinds by giving non-negative invariant weights to each node and then making iterative score transfers between interlinked hubs and authorities until the scores converge to equilibrium values.
Another graph based study is TURank, which constituted a base for our study. Yamaguchi et al. [19] proposed the Twitter user ranking algorithm (TURank) to determine authoritative users. They defined authoritative users as the ones who frequently submit useful information and they aimed to measure the authoritativeness of users in order to rank them. They constructed a user-tweet schema graph where nodes are created from users and tweets. Edges, on the other hand, are created from post, posted by, follow, followed by, retweet and retweeted by relations between user-tweet, user-user and tweet-tweet nodes. Then they applied ObjectRank [20] on the user-tweet schema graph to evaluate the users' authority scores.
Similarly, Gun and Karagoz [21] proposed a hybrid solution combining feature based and graph based methods for the credibility analysis problem in microblogs. They focused on the message, user and topic relationship in the graph based part of their study. They gathered 43 feature attributes from tweet, topic and user data in order to use them in feature based classification. They tried to label tweets as newsworthy, important and correct for determining which information in Twitter is credible.

3 DATA COLLECTION AND CONSTRUCTION OF THE GOLD STANDARD

Our data set consists of tweets, their authors and the ground truth evaluations obtained from volunteers. In order to carry out this study, we crawled tweets with specific query keywords related to weekly Turkish television programs. We selected television programs with political, social, economic and cultural contents. The concepts of the selected programs are built upon discussions between experts who are hosted by the channel or presentations by celebrities about the mentioned topics. Those programs are open to audience contributions through Twitter, and the hosts read comments and direct questions to guests during the program flow if they desire to do so.
The crawled query keys are explicit hashtags and/or mention tags used by the program producers so that they receive comments and questions from their audience through Twitter. We only gathered the tweets which are deliberately written for the selected television programs.
During the data collection period we crawled the tweet id, tweet text, user id, and retweet and favorite counts of each tweet. The tweet text is parsed and 22 different features are obtained about the tweet, such as length of tweet, number of words in it, fraction of upper case letters, fraction of tagged words as hashtags and mention tags, whether tweet contains question mark, exclamation mark and whether emoticons exist in
tweet etc. Moreover, positive and negative sentiment scores of the tweets are obtained from the SentiStrength API⁶ and added to the feature set.
Other than tweets, data of the users are crawled as well. Collected user features are friend count, follower count, tweet count, and friend and follower lists⁷. The follower and friend lists are used to determine the user-user links of the data's internal friendship network.
In order to construct the gold standard for the evaluation, we conducted a user study with the contribution of volunteers. Each tweet is read by three people, who answered the following questions⁸:
1. Does the tweet contain swearing, abusing or offensive words?
2. Is the tweet written for distracting, unrelated, advertising or out of program scope purposes?
3. Is the content interesting, important or news-worthy?
The volunteers answered each of these questions as either Yes or No. The ground truth label is determined by using majority voting. Each question is examined separately and forms a dimension of this study. Finally, we labelled a tweet as credible only if all three dimensions provide proper answers. Tweets classified as no with respect to the first two questions and yes with respect to the last question are labelled as credible, while the others are identified as ineligible.
⁶ http://sentistrength.wlv.ac.uk/
⁷ Friend refers to the users followed by the user and followers refers to the users following the user.
⁸ As our tweet data were constructed from Turkish tweets, the questions above were in Turkish on our website and the volunteers were native Turkish speakers. The original questions in Turkish were:
1. Küfür, Hakaret, Saldırgan veya İncitici İfade İçeriyor mu?
2. Dikkat Dağıtıcı, Alakasız, Reklam İçerikli veya Program Dışı Bir Amaçla mı Yazılmış?
3. İçerik İlginç, Dikkate Değer veya Haber Değeri Taşıyor mu?

4 METHODOLOGY

Tweet data collection and ground truth evaluations form the first chapter of our study. The next two chapters are the phases of the proposed method, which are applied in order. In the first phase we used supervised learning techniques, and then the obtained results were improved in the graph based part. For the offensiveness analysis of the tweets we also experimented with slang word dictionary based methods and compared their performance.

Figure 1. Activity diagram of the proposed credibility analysis system.

4.1 First Phase - Supervised Learning

In this study we used the Weka⁹ API for the Naive Bayes, kstar, ADTree and J48 decision tree classification algorithms. We used 10-fold cross validation in this phase and obtained the best results with the J48 decision tree algorithm for all three dimensions. The features used in this phase are shown in Table 1.
⁹ Weka. http://www.cs.waikato.ac.nz/ml/weka

4.2 Second Phase - Graph Based Improvement

After applying the supervised learning phase of the proposed method, we applied the second phase of our method on the data set, aiming to improve the classification results. Firstly, converting tweets and users to nodes, we created a connected undirected graph. Secondly, we assigned initial scores to those nodes according
to the results of the first phase and ran random walk iterations on this graph.

Table 1. Features of tweets used in the supervised learning phase

No  Feature
1   Length of tweet
2   Fraction of upper case letters
3   Total number of words
4   Number of words with mention tags
5   Number of words with hashtags
6   Number of words without @/# tags
7   Fraction of tagged words
8   Whether contains question mark
9   Whether contains exclamation mark
10  Whether contains smile emoticon
11  Whether contains frown emoticon
12  Whether contains URL
13  Positive sentiment score
14  Negative sentiment score
15  Whether contains first pronoun
16  Whether contains second pronoun
17  Whether contains demonstrative pronoun
18  Whether contains interrogative pronoun
19  Retweet count
20  Is retweet
21  Is reply to a user
22  Favorite count

4.2.1 Graph Construction

During graph construction, we linked nodes according to the following rules:

1. A user node is directly linked to a tweet node if the user is the writer of the tweet.

2. A user node is directly linked to another user node if the other user exists in the
follower/friend list of the user.

3. A tweet node is directly linked to another tweet node if the cosine similarity between
the contents of the two tweets is equal to or greater than a predefined threshold.

By linking users together we aimed to investigate whether credible tweets are written by
similar people that are friends and followers of each other. On the other hand, we examined
whether credible tweets are contextually similar or not by linking those tweets in the graph.
To this aim, we first parsed the text of the tweets, obtained the word sets and eliminated
the effect of stop words. Those word sets were processed with the Zemberek Turkish NLP tool
(footnote 10) and we replaced the words with their corresponding longest lemma terms so that
we could identify relations among the same words in different morphological forms. This
textual data is converted to a term vector for each tweet. The term vector of a tweet contains
the longest lemmas of all unique words in its text, each paired with the corresponding
product of term frequency and inverse document frequency.

10 Zemberek Project, https://github.com/ahmetaa/zemberek-nlp

In order to obtain these products, we first calculated the term frequencies of the longest
lemma terms of tweets according to Equation 1.

\[ TF(T_i, w_j) = \frac{\text{Number of times word } w_j \text{ occurs in } T_i}{\text{Number of words in tweet } T_i} \tag{1} \]

Then the inverse document frequencies of words are calculated according to Equation 2.

\[ IDF(w, D) = \log_{10}\left(\frac{TNTV}{CTV}\right) \tag{2} \]

where TNTV is the total number of term vectors in dataset D and CTV is the number of term
vectors which contain the word w.
Those terms and their corresponding products of term frequency and inverse document
frequency are used to obtain the Tf-Idf based term vectors of tweets according to Equation 3.

\[ \text{TfIdf Based Term Vector of } T_i = (w_1, tfidf_1), (w_2, tfidf_2), \ldots, (w_n, tfidf_n) \tag{3} \]

Finally, for each tweet, we calculated the cosine similarity of its term vector with all
others according to Equation 4. Depending on the
cosine similarity value, we linked the associated tweet nodes in the graph.

\[ \text{Cosine similarity}(Tweet_i, Tweet_j) = \frac{Tweet_i \cdot Tweet_j}{\lVert Tweet_i \rVert \, \lVert Tweet_j \rVert} \tag{4} \]

Experimentally, we found the maximum cosine similarity threshold used to determine whether
two tweets should be linked or not. Tweet pairs with cosine similarities higher than or equal
to 0.063 are linked, and we obtained a connected graph.
Finally, we assigned initial scores to the training nodes of the graph according to the
results of the first phase. Similar to studies [22] and [18], nodes have two kinds of scores,
namely hub and authority. The authority score of a node indicates its direct meaningfulness
for the examined dimension, while the hub score shows the degree to which the node is a
connection point among meaningful nodes. During iterations, hub scores are used to calculate
authority scores. Similarly, authority scores are used to update hub scores.

4.2.2 Random Walk Iterations On The Graph

10-fold cross validation is applied in the graph based improvement phase as well. A tenth of
the tweets are separated as the test set, while the rest of the tweets and all of the user
nodes are assigned initial scores. Tweets classified positively in the first phase are
assigned initial hub and authority scores of 1000, and tweets classified negatively are
assigned -1000. Likewise, users with credible tweets are assigned hub and authority scores of
1000 and the rest of the users are assigned -1000.
After constructing the graph and assigning initial hub and authority scores to the nodes, we
run a predefined number of iterations on the graph for hub/authority transfers among nodes.
At the end of those iterations, a tweet is classified as positive if its final authority
score is greater than zero, and as negative otherwise.

\[ \text{Node } N_j\text{'s hub score} = \text{weight} \sum_{i}^{\text{linked nodes with } N_j} N_i \text{ authority score} \tag{5} \]

The hub score of a node is updated by adding a predefined ratio of the authority scores of
the neighbour nodes to its hub score according to Equation 5.

\[ \text{Node } N_j\text{'s authority score} = \text{weight} \sum_{i}^{\text{linked nodes with } N_j} N_i \text{ hub score} \tag{6} \]

The authority score of a node is updated by adding a predefined ratio of the hub scores of
the neighbour nodes to its authority score according to Equation 6.
The hub and authority scores of nodes increase or decrease according to the link structure of
the graph during the random walk iterations.

4.3 Dictionary Based Analyses

For analysing the dimension of being free from offensive words, we tried some slightly
different approaches as well. We checked for the existence of offensive words using a
slang-word dictionary. Other than the hybrid method explained above, we made 3 more
experiments for the first dimension of credibility:

1. Only considering word existence in slang-word dictionary of tweet text

2. Considering both word existence in slang-word dictionary of tweet text and first phase
classification result

3. Selecting the tweets with negative sentiment score less than -2 which also contain slang
word

Those methods are used to make the initial classification of the tweets. After the initial
hub/authority score assignment according to each one of those methods, random walk iterations
are applied and hub/authority scores are transferred in the graph. Performance results of
those 4 methods are compared in Section 5.
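As an illustration of the second phase described in Section 4.2, the following Python sketch builds tf-idf term vectors (Equations 1 to 3), tests a tweet pair against the reported cosine threshold (Equation 4) and runs hub/authority transfers. It is only a minimal sketch under assumptions that the text does not state: all function names and the toy data are hypothetical, lemmatization is assumed to have been done beforehand, scores are updated synchronously, and the update follows the additive wording of the text (the weighted neighbour sum is added to the node's current score).

```python
import math
from collections import Counter

LINK_THRESHOLD = 0.063  # cosine threshold reported in Section 4.2.1


def tfidf_vectors(docs):
    """Equations 1-3: one {lemma: tf * idf} dict per tweet (list of lemmas)."""
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log10(len(docs) / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Equation 4 on sparse term vectors; 0.0 when either vector is all zeros."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def propagate(neighbours, hub, auth, weight, iterations):
    """Hub/authority transfers in the spirit of Equations 5 and 6.

    Assumption: synchronous updates, each score increased by `weight` times
    the sum of the opposite score over the node's neighbours.
    """
    for _ in range(iterations):
        new_hub = {n: hub[n] + weight * sum(auth[m] for m in neighbours[n])
                   for n in neighbours}
        new_auth = {n: auth[n] + weight * sum(hub[m] for m in neighbours[n])
                    for n in neighbours}
        hub, auth = new_hub, new_auth
    return hub, auth


# Hypothetical toy data: three already lemmatized tweets.
tweets = [["a", "b"], ["a", "c"], ["d", "e"]]
vectors = tfidf_vectors(tweets)
linked = cosine(vectors[0], vectors[1]) >= LINK_THRESHOLD  # tweets 0 and 1 share "a"
```

After the last iteration, a tweet would be labelled positive when its authority score is greater than zero, as stated in Section 4.2.2.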
5 EXPERIMENTS AND RESULTS

Three questions are asked to volunteers and the majority of votes is used to determine the
three dimensional ground truth status of each tweet. Each question defines a credibility
dimension of this study and each dimension is examined in separate experiments. In this
section we compare the classification results of the proposed hybrid method with the human
vote based ground truth data.
We made a large number of experiments. In this section we show the best results obtained.
The best supervised learning phase results are obtained with the J48 decision tree
classification algorithm. We used the Weka API for the machine learning algorithms. Second
phase experiments are conducted with different hub/authority weights, and the best yes class
recall (YCR), no class recall (NCR), yes class precision (YCP), no class precision (NCP) and
F1 score (F1) of the experiments are shown in the figures.

5.1 First Dimension - Being Free From Offensive Words

The first dimension is about filtering offensive tweets. To check this, in the user study,
volunteers were asked the following question: "Does the tweet contain swearing, abusing or
offensive words?"

Table 2. First Dimension Supervised Learning Phase Best Results

Yes Class Recall:     0.351
No Class Recall:      0.978
Yes Class Precision:  0.457
No Class Precision:   0.966
F1 Score:             0.397
Accuracy:             0.947
Specificity:          0.978
Sensitivity:          0.353

[Figure 2: bar chart of First Phase Results and Second Phase Results for YCR, NCR, YCP, NCP
and F1]
Figure 2. Random walk iteration improvement results of the first dimension.

In Figure 2, during the second phase, YCR results increased 36% and YCP results increased
10%. NCR results decreased 4% and NCP changed slightly. The final F1 score increased 24% at
the end of the graph based improvement phase.
We also made slang-word dictionary based experiments for this dimension. Those experiments
and their performance results are given below:

E1: Proposed hybrid method without dictionary based improvements
E2: Only considering word existence in slang-word dictionary of tweet text
E3: Considering both word existence in slang-word dictionary of tweet text and first phase
classification result
E4: Selecting the tweets with negative sentiment score less than -2 which also contain slang
word

Table 3. First Dimension Slang-Word Dictionary Based Methods Performances

Experiment:  E1     E2     E3     E4
YCR:         0.351  0.373  0.787  0.413
NCR:         0.978  0.942  0.934  0.966
YCP:         0.457  0.252  0.384  0.392
NCP:         0.966  0.966  0.988  0.969
F1 Score:    0.397  0.301  0.516  0.403
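The measures reported in the tables above can be related to each other with a short Python sketch. It assumes that the F1 score is computed for the yes class as the harmonic mean of YCP and YCR, which is consistent with the values printed in Table 2; the function names and the confusion-matrix argument names are hypothetical.

```python
def class_metrics(tp, fp, fn, tn):
    """Recall and precision for the yes and no classes from confusion counts."""
    return {
        "YCR": tp / (tp + fn),
        "NCR": tn / (tn + fp),
        "YCP": tp / (tp + fp),
        "NCP": tn / (tn + fn),
    }


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Check against the yes-class values printed in Table 2 (YCP 0.457, YCR 0.351).
table2_f1 = round(f1_score(0.457, 0.351), 3)
```

Under this assumption `table2_f1` reproduces the F1 score of 0.397 listed in Table 2.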
[Figure 3: bar chart of the F1 scores of experiments E1, E2, E3 and E4]
Figure 3. First Dimension Slang-Word Dictionary Based Methods F1 Score Comparisons

As can be seen from Figure 3, the best F1 score is obtained in experiment 3. Selecting
tweets with slang words and considering the supervised classification results together
increased the F1 score 30% with respect to the original method for the first dimension
analysis.

5.2 Second Dimension - Being Free From Spamming

The second dimension is about filtering spam tweets. To check this, in the user study,
volunteers were asked the following question: "Is the tweet written for distracting,
unrelated, advertising or out of program scope purposes?"

Table 4. Second Dimension Supervised Learning Phase Best Results

Yes Class Recall:     0.569
No Class Recall:      0.946
Yes Class Precision:  0.464
No Class Precision:   0.936
F1 Score:             0.511
Accuracy:             0.897
Specificity:          0.946
Sensitivity:          0.569

[Figure 4: bar chart of First Phase Results and Second Phase Results for YCR, NCR, YCP, NCP
and F1]
Figure 4. Random walk iteration improvement results of the second dimension.

In Figure 4 we observed that YCR increased 16% but YCP decreased 5%. On the other hand, NCR
decreased 8% whereas NCP increased 1%. The graph based phase improved the final F1 score 4%.

5.3 Third Dimension - Being Newsworthy

The third dimension is about newsworthiness. To check this, in the user study, volunteers
were asked the following question: "Is the content interesting, important or newsworthy?"

Table 5. Third Dimension Supervised Learning Phase Best Results

Yes Class Recall:     0.843
No Class Recall:      0.623
Yes Class Precision:  0.832
No Class Precision:   0.642
F1 Score:             0.837
Accuracy:             0.775
Specificity:          0.623
Sensitivity:          0.844
[Figure 5: bar chart of First Phase Results and Second Phase Results for YCR, NCR, YCP, NCP
and F1]
Figure 5. Random walk iteration improvement results of the third dimension.

In Figure 5, YCR and NCP changed insignificantly (less than 1%), while NCR and YCP increased
3%. The final F1 score improved only 1%.

6 CONCLUSIONS

In this study we proposed a new method to analyse tweet credibility by hybridizing content
based techniques with collaborative filtering based techniques. The proposed method consists
of two phases. In the first phase we applied classification algorithms on the tweet set. 22
features of tweet text are obtained and used in the supervised learning phase. In the second
phase we constructed a connected graph from users and tweets in which the edges are created
according to the author-text relation between users and tweets, the friendship relation
between users and the normalized contextual similarity between tweets. We investigated
whether credible tweets are linked with each other or not. We aimed to separate positive and
negative classes by applying hub/authority score transfers in the graph.
We brought a new credibility definition based on three dimensions: being free from offensive
words, not being spam and being newsworthy. Those three dimensions are examined separately
in the supervised learning and graph based improvement phases.
This study focused on tweets written in the Turkish language. We created our data set from
tweets written for current Turkish TV programs about social and political discussions. Even
though we developed a method based on the Turkish language, the proposed method can be
generalized to other languages by changing the language parser and word separator components.

REFERENCES

[1] B. Fogg and H. Tseng. The elements of computer credibility. In Proceedings of the SIGCHI
conference on Human Factors in Computing Systems, pages 80-87. ACM, 1999.

[2] B. Kang, J. O'Donovan, and T. Höllerer. Modeling topic specific credibility on twitter.
In Proceedings of the 2012 ACM international conference on Intelligent User Interfaces,
pages 179-188. ACM, 2012.

[3] B. Kang, T. Höllerer, and J. O'Donovan. Believe it or not? analyzing information
credibility in microblogs. In Proceedings of the 2015 IEEE/ACM International Conference on
Advances in Social Networks Analysis and Mining 2015, pages 611-616. ACM, 2015.

[4] B. J. Fogg. Prominence-interpretation theory: Explaining how people assess credibility
online. In CHI'03 extended abstracts on human factors in computing systems, pages 722-723.
ACM, 2003.

[5] C. Castillo, M. Mendoza, and B. Poblete. Information credibility on twitter. In
Proceedings of the 20th international conference on World wide web, pages 675-684. ACM, 2011.

[6] M. Mathioudakis and N. Koudas. Twittermonitor: trend detection over the twitter stream.
In Proceedings of the 2010 ACM SIGMOD International Conference on Management of data, pages
1155-1158. ACM, 2010.

[7] E. Ferrara, O. Varol, C. Davis, F. Menczer, and A. Flammini. The rise of social bots.
arXiv preprint arXiv:1407.5225, 2014.

[8] M. Forelle, P. Howard, A. Monroy-Hernández, and S. Savage. Political bots and the
manipulation of public opinion in venezuela. arXiv preprint arXiv:1507.07109, 2015.

[9] A. H. Wang. Don't follow me: Spam detection in twitter. In Security and Cryptography
(SECRYPT), Proceedings of the 2010 International Conference on, pages 1-10. IEEE, 2010.
[10] F. Benevenuto, G. Magno, T. Rodrigues, and V. Almeida. Detecting spammers on twitter. In
Collaboration, electronic messaging, anti-abuse and spam conference (CEAS), volume 6, page
12, 2010.

[11] C. Yang, R. C. Harkreader, and G. Gu. Die free or live hard? empirical evaluation and
new design for fighting evolving twitter spammers. In Recent Advances in Intrusion
Detection, pages 318-337. Springer, 2011.

[12] J. Martinez-Romo and L. Araujo. Detecting malicious tweets in trending topics using a
statistical analysis of language. Expert Systems with Applications, 40(8):2992-3000, 2013.

[13] E. M. Clark, J. R. Williams, R. A. Galbraith, C. M. Danforth, P. S. Dodds, and C. A.
Jones. Sifting robotic from organic text: A natural language approach for detecting
automation on twitter. arXiv preprint arXiv:1505.04342, 2015.

[14] M. Mccord and M. Chuah. Spam detection on twitter using traditional classifiers. In
Autonomic and trusted computing, pages 175-186. Springer, 2011.

[15] I.-A. Bara, C. J. Fung, and T. Dinh. Enhancing twitter spam accounts discovery using
cross-account pattern mining. In Integrated Network Management (IM), 2015 IFIP/IEEE
International Symposium on, pages 491-496. IEEE, 2015.

[16] A. Pal and S. Counts. Identifying topical authorities in microblogs. In Proceedings of
the fourth ACM international conference on Web search and data mining, pages 45-54. ACM,
2011.

[17] L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: bringing
order to the web. 1999.

[18] J. M. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the ACM
(JACM), 46(5):604-632, 1999.

[19] Y. Yamaguchi, T. Takahashi, T. Amagasa, and H. Kitagawa. Turank: Twitter user ranking
based on user-tweet graph analysis. In Web Information Systems Engineering - WISE 2010,
pages 240-253. Springer, 2010.

[20] A. Balmin, V. Hristidis, and Y. Papakonstantinou. Objectrank: Authority-based keyword
search in databases. In Proceedings of the Thirtieth international conference on Very large
data bases - Volume 30, pages 564-575. VLDB Endowment, 2004.

[21] A. Gün and P. Karagöz. A hybrid approach for credibility detection in twitter. In Hybrid
Artificial Intelligence Systems, pages 515-526. Springer, 2014.

[22] S. Brin and L. Page. The anatomy of a large-scale hypertextual web search engine. In
COMPUTER NETWORKS AND ISDN SYSTEMS, pages 3825-3833. Elsevier Science Publishers B. V., 1998.
A Question-Answering Inferencing System Based on Definition and Acquisition of

Knowledge in Written English Text
Kenta Hiratsuka and Hiroki Imamura

Graduate school of Engineering, Soka University
1-236 Tangi-machi, Hachioji-shi, Tokyo, Japan
e15m5221@soka-u.jp, imamura@soka.ac.jp
ABSTRACT

In recent years, question-answering systems have become a major area of research and
development in information search technologies. We have proposed a new question-answering
system that can produce new knowledge by logical inferencing. The proposed system is based
on two characteristics: logical inferencing using knowledge, and dealing with diversity in
natural language. To realize this system, in this paper, we construct a question-answering
processing section that can perform logical inferencing in the proposed system. This system
performs inferencing based on "knowledge sentences" and "definition sentences". It can
perform a more complicated logical inference by using definition sentences and performing
the inference repeatedly. In this way, new knowledge is not only acquired by humans, but may
be produced by the logical combination of knowledge. In the future, this system has the
potential to become an important aid to human intellectual activity.

KEYWORDS

Natural Language Processing, Question Answering System, Logical Inference, Knowledge
Sentence, Definition Sentence

1 INTRODUCTION

In recent years, question-answering systems have become a major area of research and
development in information search technologies[1]. For example, question-answering systems
have been developed such as "Power-Answer", which can perform question answering based on
logic[2], or "WATSON", which can deal with the diversity of natural language[3]. These
conventional systems were intended to answer the question of the user by using the
information extracted from documents, and can generate highly accurate answers. However,
these systems require that the information that becomes the direct answer is listed in the
documents.
In contrast, we have proposed a new type of question-answering system that can produce new
knowledge by logical inferencing in natural language[4]. Our system extracts from web
documents not only knowledge but also definitions, which are rules of inference, and can
generate answers about knowledge that is not listed in web documents by logical inference
using the definitions. Therefore, this system has the potential to become not only a simple
question-answering system but also an important aid to human intellectual activity.
As shown in figure 1, we have been developing a question-answering system having the two
characteristics of logical inference using knowledge and dealing with the diversity of
natural language. The proposed system normalizes all input sentences having the same meaning
to just one sentence. It can deal with the diversity of natural language by using normalized
sentences as knowledge. In addition, this system can answer questions even if it does not
have direct knowledge, by performing logical inference using knowledge sentences.
In this paper, we construct the question-answering processing section among the sections of
the proposed system (within the dotted frame in figure 1). In this process, the system
performs inferences based on "knowledge sentences" and "definition sentences". These two
kinds of sentences are acquired from text on the internet or inputted by the users.
"Definition sentences" are sentences in the form of "if..., then..." and these sentences
express rules of inference that combine knowledge to produce different knowledge. For
example, users may input definition sentences such as "If X is a father of Y and Y is a
father of Z, then X is a grandfather of Z" and knowledge such as "Michael is a father of
Daniel" and "Daniel is a father of David". Users can also input a question such as "Who is a
grandfather of David?". In this case, even if this system does not have direct knowledge,
the system can give an answer such as "Michael is a grandfather of David" by the inference
based on the definition sentence. In addition, this system can perform a more complicated
logical inference by using definition sentences and performing the inference repeatedly.
Thereby, new knowledge is not only acquired by humans, but may be produced by logical
inference.

Figure1. Concepts of the proposed system

2 PROPOSED SYSTEM

Figure 2 shows a flowchart of the part of the system within the dotted frame in figure 1.
Here users manually input the knowledge and definition sentences for the system. At first,
this system performs morphological analysis of an input sentence to provide part of speech
information for each word. This system classifies the analyzed sentences as knowledge
sentences, definition sentences or question sentences. When this system classifies input
sentences as knowledge sentences or definition sentences, it saves them into the database
as text data. In contrast, when this system classifies the input sentences as question
sentences, it performs the question answering processing. It performs matching with
knowledge data, inference and multiplex inference sequentially, and outputs all the answers
that are produced for a question. If this system does not produce answers, it outputs the
information necessary to allow successful inferences.

Figure2. The system flowchart

2.1 Morphological Analysis

This system performs morphological analysis of an input sentence using Stanford CoreNLP
(http://nlp.stanford.edu/software/corenlp.shtml), which is a morphological analyzer for the
English language. This system classifies input sentences by the part of speech of the head
word of a sentence. When the head word of an input sentence is a "be verb", "general verb",
"auxiliary verb" or "interrogative", it is classified as a question sentence. For example,
sentences such as "Does Saburo like coffee?" are classified as question sentences. The other
input sentences are classified as knowledge sentences or definition sentences. For example,
sentences such as "Taro is a man" are classified as knowledge sentences. In
contrast, sentences are classified as definition sentences if they match the "if... then..."
structure. Sentences such as "If Taro is a man and Taro has a son then Taro is a father" are
classified as definition sentences. Thus, this system classifies input sentences into three
types: "knowledge sentences", "definition sentences" and "question sentences". We regard
"knowledge sentences" and "definition sentences" as declarative sentences. When a
declarative sentence is inputted, this system classifies the sentence as either a knowledge
sentence or a definition sentence and saves it as text data. If a similar sentence has
already been inputted into the database, it is not saved. In addition, in the case of
knowledge sentences, old knowledge is deleted when contradictory knowledge is inputted; that
is, the old knowledge is overwritten by the newly inputted knowledge.
When a question sentence is inputted, this system performs question answering processing
based on the information given by morphological analysis. Based on the part of speech
information of the input sentence, question sentences are classified into 13 patterns
according to English rules of grammar as follows.

Type1 sentence: be verbs
Type2 sentence: general verbs
Type3 sentence: auxiliary verbs
Type4 sentence: interrogative as the subject
Type5 sentence: which + be verbs
Type6 sentence: which + general verbs
Type7 sentence: which + auxiliary verbs
Type8 sentence: why + be verbs
Type9 sentence: why + general verbs
Type10 sentence: why + auxiliary verbs
Type11 sentence: interrogative + be verbs
Type12 sentence: interrogative + general verbs
Type13 sentence: interrogative + auxiliary verbs

For example, when "Does Saburo like coffee?" is inputted, it is classified as type2, because
"Does", the head word of this sentence, is classified as a general verb. The classified
sentence is converted into the form of a declarative sentence to perform the matching with
knowledge sentences. For example, the previously described example sentence is converted
into "Saburo likes coffee".

2.2 Matching with Knowledge Data

This system matches the sentence converted from the question sentence against the saved
knowledge data. The matching starts from the head word of each sentence and shifts to the
next word one by one. When the matching arrives at the last word of each sentence, the
matching succeeds. Conversely, if this system fails in matching in the middle of the
process, it cannot output an answer. If matching is successful, it outputs the answer to the
question.

2.3 Omission of the Modifier

When this system matches a knowledge sentence that includes a modifier against a question
sentence that does not include the modifier, it can perform the matching with the modifier
omitted. Figure 3 shows the rule of the modifier omission using an example sentence.
For example, we assume that this system is given the question "Is Mary a student?" when the
system has saved the knowledge "Mary is a serious student", as in (a) of figure 3. In this
case, this system cannot output the correct answer because it fails in matching "student"
and "serious". To deal with this problem, when a word on the knowledge sentence side is a
modifier, this system matches against the next word once again. This system can output the
answer to the question if it succeeds in the matching.
In contrast, in the case of knowledge expressed as a negative sentence, as in (b) of figure
3, this system does not omit the modifier as described above, because a sentence with an
omitted modifier is not necessarily true in the case of a negative sentence. For example, in
the case of (b) of figure 3, this system cannot determine
whether Mary is not a student or not. Thus, this system does not omit the modifier for
knowledge expressed as a negative sentence.

Figure3. The matching rule in the modifier omission

2.4 Inference

This system performs the inference using the definition sentences. At first, this system
needs to find the definition sentence whose conclusion can arrive at the appropriate answer.
Therefore, it matches the question sentence with the conclusion part of the definition
sentence, because a conclusion produced from the definition sentence must be the answer to
the question. A conclusion part is the sentence after "then" in a definition sentence. If an
appropriate definition sentence is not found, it finishes processing because answering the
question is impossible. If an appropriate definition sentence is found, it matches knowledge
sentences with the precondition part of the definition sentence. A precondition part is the
sentence before "then" in a definition sentence. It outputs the answer to the question if
the matching of the precondition sentences joined by "and" or "or" is successful. If the
matching is not successful, it outputs the knowledge necessary to succeed in the inference.

If Noah has feathers then Noah is a bird. (1)

Noah has feathers (2)

Is Noah a bird? (3)

For example, we assume that this system has one definition sentence such as (1) and one
knowledge sentence such as (2). If this system is then asked a question such as (3), it
performs inference by using the above definition sentence. Next, this system matches the
conditional parts with knowledge sentences. In this case, it matches the conditional part of
(1), "Noah has feathers", with the knowledge sentence (2). As a result, this system outputs
the answer "Yes.".
When wild cards (the parts expressed in capital letters of the alphabet such as "X" and "Y"
shown in (4)) are included in the definition sentence, this system substitutes applicable
nouns for the wild cards.

If X is father of Y and Y is father of Z then X is grandfather of Z. (4)

Bob is father of Daniel. (5)

Daniel is father of John. (6)

Who is grandfather of John? (7)

For example, we assume that this system has one definition sentence such as (4) and two
knowledge sentences such as (5) and (6). If this system is then asked a question such as
(7), it performs inference by using the above definition sentence. The word "John" is
substituted for Z by matching the conclusion part of the definition sentence with the
question sentence. Next, this system matches the conditional parts with knowledge sentences,
and substitutes the words "Bob" and "Daniel" for X and Y respectively. As a result, this
system outputs the answer "Bob is grandfather of John.".

2.5 Multiplex Inference

Figure 4 shows a flow of the multiplex inference using an example sentence. This system
performs a more complicated inference by following up the associated definition sentences.
For example, we assume that this system is given the question sentence after the sentences
shown in figure 4 have been inputted.
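The wild card substitution of Section 2.4 can be illustrated with a small Python sketch. It is not the implementation of the proposed system: it assumes that sentences are plain space-separated words without the final period, that a wild card is any single capital letter, that preconditions are joined only by "and", and that the question has already been converted to declarative form with "?" standing for the unknown word. All function names are hypothetical.

```python
def is_wildcard(token):
    """Assumption: a wild card is a single capital letter such as X, Y or Z."""
    return len(token) == 1 and token.isupper()


def match(pattern, sentence, bindings):
    """Match word by word; return the extended bindings or None on failure."""
    p_words, s_words = pattern.split(), sentence.split()
    if len(p_words) != len(s_words):
        return None
    result = dict(bindings)
    for p, s in zip(p_words, s_words):
        if is_wildcard(p):
            if result.setdefault(p, s) != s:
                return None
        elif p != s:
            return None
    return result


def solve(conditions, knowledge, bindings):
    """Yield every binding that satisfies all conditions joined by "and"."""
    if not conditions:
        yield bindings
        return
    for fact in knowledge:
        extended = match(conditions[0], fact, bindings)
        if extended is not None:
            yield from solve(conditions[1:], knowledge, extended)


def answer(conditions, conclusion, knowledge, goal):
    """goal is the question in declarative form with "?" for the unknown word."""
    start = {}
    c_words, g_words = conclusion.split(), goal.split()
    if len(c_words) != len(g_words):
        return []
    for c, g in zip(c_words, g_words):
        if g == "?":
            continue
        if is_wildcard(c):
            if start.setdefault(c, g) != g:
                return []
        elif c != g:
            return []
    return [" ".join(b.get(w, w) for w in c_words)
            for b in solve(conditions, knowledge, start)]


answers = answer(
    ["X is father of Y", "Y is father of Z"],                 # precondition part of (4)
    "X is grandfather of Z",                                  # conclusion part of (4)
    ["Bob is father of Daniel", "Daniel is father of John"],  # (5) and (6)
    "? is grandfather of John",                               # (7) with "Who" as "?"
)
```

Matching the goal against the conclusion first binds Z to "John", and the preconditions then bind X and Y, mirroring the order of substitution described for sentences (4) to (7).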
At first, this system performs the matching with knowledge sentences. In this case, the
matching is not successful, therefore this system matches the question sentence with the
conclusion part of a definition sentence. Next, this system succeeds in the matching with
the conclusion part of the definition sentence "If B is a father of D and C is a mother of
D, then D is a son of B". After that, this system matches knowledge sentences with the
precondition parts. If this system fails in the matching, it matches the precondition part
of the definition sentence with the conclusion parts of other definition sentences. In this
figure, this system has the knowledge sentences "Taro is a man.", "Jiro is a man.", "Mary is
a woman." and "Jiro is twelve years old.". However, these knowledge sentences cannot match
any conditional part. Accordingly, this system matches the conditional parts of the first
definition sentence with the conclusion parts of the other definition sentences. The second
sentence and the third sentence from the top succeed in matching with the conditional parts
of the first definition sentence, "B is a father of D" and "C is a mother of D". After that,
this system succeeds in matching each conditional part of the second and third sentences to
the four saved knowledge sentences. Thus, this system can perform the inference repeatedly
until it succeeds in producing the answer or cannot perform the inference any further.

Figure4. An example of Multiplex Inference

2.6 Output Answer

When this system succeeds in obtaining answers, it outputs the answer and the process of
logical inference. The reason why this system outputs the process of logical inference is to
let users know the inference process by which it arrived at an answer or new knowledge. In
addition, if new knowledge is produced, users can know the process of logical inference of
this system. When this system does not acquire answers, it outputs the information necessary
to let the inference succeed. If there are multiple answers to a question, this system
outputs all answers and the processes of inference.

3 EVALUATION EXPERIMENTS

To evaluate the effectiveness of this system, we experimented to evaluate whether this
system can answer questions covering the following items.

Item1: The system can answer the question about knowledge sentences
Item2: The system can answer the question that needs to omit a modifier
Item3: The system can answer the question about definition sentences
Item4: The system can answer the question about definition sentences including wild cards
Item5: The system can answer the question about multiplex inference
Item6: The system can answer the question about multiplex inference including wild cards

In this experiment, we had thirteen subjects ask two kinds of questions, Yes/No and 5W1H,
about each item. Accordingly, this system was asked twelve questions by each subject.
The subjects evaluated whether the answer of this system to each question was correct or
not. In advance, knowledge and definition sentences
necessary for question answering processing were input by the subjects.

3.1 Results

Table 1 shows the correct answer rate for each experimental item. On average over the six items, the system answered more than 80% of the questions correctly, although the rate for Item6 was below 80%.

Table 1. Experimental results
Experiment Items    Correct / Questions (rate)
Item1               24/26 (0.923)
Item2               21/26 (0.808)
Item3               25/26 (0.962)
Item4               22/26 (0.846)
Item5               23/26 (0.885)
Item6               19/26 (0.731)

3.2 Discussions

There were two reasons why the system could not answer some questions correctly. The first reason is that the morphological analyzer failed to analyze some sentences, so the question sentence was not given the right part-of-speech information. For example, the system was asked the question "Is Mike a kind teacher?". The morphological analyzer output "noun" for "kind", whereas the correct result is "adjective". Therefore, the system could not perform the appropriate matching.

The other reason is that the system could not convert a question sentence correctly because it does not perform semantic analysis. Figure 5 shows an example of a false conversion caused by the lack of semantic analysis.

When the system is asked the question "Is the policeman Manabu?", it needs to find a main clause to convert the question into an assertive sentence. If the system can find the main clause, it can convert the question into a sentence of the right form, as on the left side of Figure 5. However, the system cannot find the right main clause because it cannot distinguish the semantic difference within the noun sequence "the policeman Manabu". Consequently, the question sentence is converted into a sentence of the wrong form, as on the right side of Figure 5, and the system fails to match the knowledge sentences with the question sentence.

Figure 5. Cause of wrong conversion

Accordingly, we consider that the system can answer a question when it is given the right part-of-speech information. However, the questions it can handle are limited to those that do not require semantic analysis.

4 CONCLUSIONS

In this paper, we have worked on the construction of the question answering processing section, which can perform logical inference, in the proposed system. We carried out an experiment to evaluate the effectiveness of the system.
From the experimental results, we could see that the system was able to answer almost all question sentences of the assumed form. In future work, we will attempt to implement semantic analysis and the discrimination of proper nouns.
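As an illustration of the multiplex inference discussed in this paper, the following minimal Python sketch chains definition sentences backwards until every condition is matched by a knowledge sentence, and collects all answers. It is a sketch under stated assumptions, not the authors' implementation: the triple representation, the "?"-prefixed wild cards, the parent_of knowledge sentences and the husband_of conclusion are hypothetical and are not taken from Figure 4.

```python
from itertools import count

# Knowledge sentences as (subject, relation, object) triples.
FACTS = [
    ("Taro", "is", "man"),
    ("Jiro", "is", "man"),
    ("Mary", "is", "woman"),
    ("Jiro", "age", "twelve"),
    ("Taro", "parent_of", "Jiro"),  # hypothetical, added for illustration
    ("Mary", "parent_of", "Jiro"),  # hypothetical, added for illustration
]

# Definition sentences as (conclusion, [conditions]); "?x" is a wild card.
# The husband_of conclusion is an assumption made for this sketch.
RULES = [
    (("?b", "husband_of", "?c"),
     [("?b", "father_of", "?d"), ("?c", "mother_of", "?d")]),
    (("?b", "father_of", "?d"),
     [("?b", "is", "man"), ("?b", "parent_of", "?d")]),
    (("?c", "mother_of", "?d"),
     [("?c", "is", "woman"), ("?c", "parent_of", "?d")]),
]

_fresh = count()


def is_var(term):
    return term.startswith("?")


def walk(term, env):
    """Follow wild-card bindings until a constant or an unbound wild card."""
    while is_var(term) and term in env:
        term = env[term]
    return term


def unify(a, b, env):
    """Match two triples; return the extended bindings, or None on failure."""
    env = dict(env)
    for x, y in zip(a, b):
        x, y = walk(x, env), walk(y, env)
        if x == y:
            continue
        if is_var(x):
            env[x] = y
        elif is_var(y):
            env[y] = x
        else:
            return None
    return env


def rename(rule):
    """Give a rule fresh wild-card names so separate uses do not clash."""
    n = next(_fresh)
    head, body = rule

    def fresh(triple):
        return tuple(f"{t}_{n}" if is_var(t) else t for t in triple)

    return fresh(head), [fresh(c) for c in body]


def prove(goals, env, depth=0, max_depth=10):
    """Yield every set of bindings that satisfies all goals."""
    if not goals:
        yield env
        return
    if depth > max_depth:
        return
    first, rest = goals[0], goals[1:]
    # Try to match the goal against a knowledge sentence.
    for fact in FACTS:
        e = unify(first, fact, env)
        if e is not None:
            yield from prove(rest, e, depth, max_depth)
    # Otherwise match it against the conclusion of a definition sentence
    # and continue with that sentence's conditional parts.
    for rule in RULES:
        head, body = rename(rule)
        e = unify(first, head, env)
        if e is not None:
            yield from prove(body + rest, e, depth + 1, max_depth)


def ask(query):
    """Return all distinct answers to a query triple."""
    answers = []
    for env in prove([query], {}):
        answer = tuple(walk(t, env) for t in query)
        if answer not in answers:
            answers.append(answer)
    return answers
```

A Yes/No question corresponds to a query without wild cards, for example ask(("Taro", "husband_of", "Mary")), where a non-empty result means "yes". A 5W1H question puts a wild card in the unknown position, for example ask(("?who", "husband_of", "Mary")). The depth limit is a simple guard against endless chaining and is likewise an assumption of this sketch.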
International Journal of
DIGITAL INFORMATION AND WIRELESS COMMUNICATIONS
The International Journal of Digital Information and Wireless Communications aims to provide a forum
for scientists, engineers, and practitioners to present their latest research results, ideas, developments
and applications in the field of computer architectures, information technology, and mobile
technologies. The IJDIWC publishes issues four (4) times a year and accepts three types of papers as
follows:
1. Research papers: papers that present and discuss the latest and most profound
research results in the scope of IJDIWC. Papers should describe new contributions in the
scope of IJDIWC and support claims of novelty with citations to the relevant literature.
2. Technical papers: papers that establish a meaningful forum between practitioners and
researchers with useful solutions in various fields of digital security and forensics. This includes
all kinds of practical applications, covering principles, projects, missions, techniques,
tools, methods, processes, etc.
3. Review papers: papers that critically analyze past and current research trends in the field.
Manuscripts submitted to IJDIWC should not be previously published or be under review by any other
publication. Original unpublished manuscripts are solicited in the following areas including but not
limited to:
Information Technology Strategies
Information Technology Infrastructure
Information Technology Human Resources
System Development and Implementation
Digital Communications
Technology Developments
Technology Futures
National Policies and Standards
Antenna Systems and Design
Channel Modeling and Propagation
Coding for Wireless Systems
Multiuser and Multiple Access Schemes
Optical Wireless Communications
Resource Allocation over Wireless Networks
Security, Authentication and Cryptography for Wireless Networks
Signal Processing Techniques and Tools
Software and Cognitive Radio
Wireless Traffic and Routing Ad-hoc Networks
Wireless System Architectures and Applications


Copyright 2016 sdiwc.net, All Rights Reserved

The issue date is October 2016.

Volume 6, Issue No. 4 2016

PAPER TITLE AUTHORS PAGES

ENGINEERING MINING A LARGE SCALE DATA BASED ON FEATURE

PROPOSAL OF REPRODUCTIVE DESIGN EDUCATION BASED ON

MUSIC EMOTION RECOGNITION WITH AUDIO AND LYRICS

PROPOSITION OF AN INTELLIGENT SYSTEM FOR PREDICTIVE

A HYBRID CREDIBILITY ANALYSIS METHOD APPLIED ON TURKISH

A QUESTION-ANSWERING INFERENCING SYSTEM BASED ON


Proposal of Reproductive Design Education based on Knowledge and Resource

Masatoshi Imai * and Yoshiro Imai **


Assessment of Quality-of-Experience in Telecommunication Services

Demóstenes Z. Rodriguez, Renata L. Rosa, Rodrigo D. Nunes and Emmanuel T. Affonso


Music Emotion Recognition with Audio and Lyrics Features

C. V. Nanayakkara1 and H. A. Caldera2
