Networks

Hossain MD. Shakawat
Department of Computer Science & Engineering
ID 11-18494-1
American International University-Bangladesh

Najeeb, Ahmad Taher
Department of Computer Science & Engineering
ID 11-18198-1
American International University-Bangladesh

Alam Shah
Department of Computer Science & Engineering
ID 10-17685-3
American International University-Bangladesh

September 8, 2014

Table of Contents:
Abstract
1. Introduction
2. Previous Works
   2.1 Location Based Social Network
   2.2 Collaborative Recommendation Based Social Network
   2.3 Sentiment Intensity Analysis of Informal Texts
   2.4 Big 5 Modelling
3. Objective
4. Research Question
5. Proposed Research Methodology
   5.1 Data Collection
   5.2 Data Analysis
   5.3 Results
   5.4 Recommendation Analysis
6. Conclusion

Abstract

At present, social networks play an important role in expressing users' sentiments and their interests in particular fields.
Extracting one's public data (what he or she shares with friends and relatives, and his or her reactions to others' thoughts) means extracting one's behavior. If, by defining suitable hypotheses, we enable a machine to understand human sentiment and interest, the machine can make recommendations to a user about his or her personal interests on the basis of his or her sentiment. Our main approach is to make suggestions to a user regarding specific interests that are anticipated from an analysis of his or her public data. This can be extended to business analysis, suggesting different companies' products or services depending on a consumer's personal choices. This automation would also help in choosing the correct candidates for a questionnaire, and would let anyone learn about himself or herself and about how his or her behavior may influence others. It makes it possible to easily select someone for leadership, to identify people who seem eager, to identify people who are likely to oppose, and to find a dependable person.

Acknowledgements: Special thanks to our honorable teacher and supervisor Md. Saddam Hossain, Faculty, Department of CS, American International University-Bangladesh.

1 Introduction

With millions of users, social networking services like Facebook and Twitter have become some of the most popular internet applications. These applications are a source of knowledge and information. The rich knowledge that has accumulated in these social sites enables a variety of recommendation systems for new friends and media [1]. To use this opportunity, it is possible to create an automated system that categorizes users according to the Big Five personality factors. To categorize users in such a system, users' data must be collected without interfering with their daily activities. The system thus helps users learn about themselves and about others.
For example, if an employee needs a vacation and the boss is listed as a friend on an online social network (OSN), the employee can choose when to make the request according to the boss's disposition as estimated by the system (neuroticism indicates a higher chance of disagreement, while agreeableness indicates a higher chance of agreement). OSNs deal with big data; by analyzing such data, the system will be able to predict the person most suitable for leadership and the people who may oppose. These opportunities and challenges have been tackled by many new approaches to recommendation systems, using different data sources and methodologies to generate different kinds of recommendations. In this article we describe such a system.

Consumer interests have always had a great influence on business policy. Offering the right products or services to the right customers is the main theme of every successful business policy. Many business organizations can benefit from data collected from OSNs, and the popularity of social networks is currently rising very rapidly. From the sociologist's point of view, OSNs can be characterized as collective goods produced through computer-mediated collective action [2]. Users spend a large part of their daily lives on OSNs and share a great deal of information about themselves, their friends and their families. This is a great opportunity to learn about people's sentiments and interests. Understanding user behavior from OSNs has also become a crucial factor for advertising policies and better site design. In particular, given the success of item recommendation systems on commercial websites such as Amazon.com and Netflix, it is considered worthwhile to revisit the recommendation problem through the novel perspective of social networking.
In general, recommendation systems aim to provide personalized recommendations of items to users based on their previous behavior as well as on other information gathered from item descriptions and user profiles.

Our experiment is based on Twitter and Facebook, the most popular OSN websites, which also offer a large space for advertisement. These websites have a huge number of users, who feel comfortable using them because of user-friendly features such as microblogging, status updates, photo and video sharing, commenting on posts, joining and creating groups, liking pages, creating events, playing games and so on. We aim to analyze a user's sentiment through his or her past activity on the OSN and map it onto the Big Five factors. We then identify the field or fields in which the user is interested and recommend informative services to him or her.

2 Previous work

Online social networking is the practice of expanding the number of one's business and social contacts by making connections through individuals [3]. In this era of the internet, OSNs are extremely popular [4]. Two thirds of the world's population spent 10

2.1 Location Based Social Network:

A social network is a social structure made up of individuals connected by one or more specific types of interdependency, such as friendship, common interests, and shared knowledge. Generally, a social networking service builds on and reflects the real-life social networks among people through online platforms such as a website, providing ways for users to share ideas, activities, events, and interests over the Internet.

The increasing availability of location-acquisition technology (for example GPS and Wi-Fi) empowers people to add a location dimension to existing online social networks in a variety of ways.
For example, users can upload location-tagged photos to a social networking service such as Flickr [12], comment on an event at the exact place where the event is happening (for instance, in Twitter [13]), share their present location on a website (such as Foursquare [14]) to organize a group activity in the real world, or record travel routes as GPS trajectories to share travel experiences in an online community. Here, a location can be represented in absolute (latitude-longitude coordinates), relative (100 meters north of the Space Needle), or symbolic (home, office, or shopping mall) form. The location embedded in a social network can be a stand-alone instant location of an individual, such as "in a bar at 9pm", or a location history accumulated over a certain period, such as a GPS trajectory: a cinema → a restaurant → a park → a bar.

The dimension of location brings social networks back to reality, bridging the gap between the physical world and online social networking services. For example, a user with a mobile phone can leave comments about a restaurant on an online social site (after finishing dinner) so that people in his or her social structure can refer to those comments when they later visit the restaurant. In this example, users create their own location-related stories in the physical world and browse other people's information as well. An online social site becomes a platform for sharing people's experiences.

Furthermore, people in an existing social network can expand their social structure with the new interdependency derived from their locations. As location is one of the most important components of user context, extensive knowledge about an individual's interests and behavior can be learned from her locations. For instance, people who enjoy the same restaurant can connect with each other.
Individuals who regularly hike the same mountain can be put in contact with each other to share their travel experiences. Sometimes, two individuals who do not share the same absolute location can still be linked as long as their locations indicate a similar interest, such as beaches or lakes.

These kinds of location-embedded and location-driven social structures are known as location-based social networks, formally defined as follows:

A location-based social network (LBSN) does not only mean adding a location to an existing social network so that people in the social structure can share location-embedded information, but also consists of the new social structure made up of individuals connected by the interdependency derived from their locations in the physical world as well as their location-tagged media content, such as photos, video, and texts. Here, the physical location consists of the instant location of an individual at a given timestamp and the location history that an individual has accumulated in a certain period. Further, the interdependency includes not only that two persons co-occur in the same physical location or share similar location histories but also the knowledge, e.g., common interests, behavior, and activities, inferred from an individual's location (history) and location-tagged data.

In a location-based social network, people can not only track and share the location-related information of an individual via either mobile devices or desktop computers, but also leverage collaborative social knowledge learned from user-generated and location-related content, such as GPS trajectories and geo-tagged photos. One example is determining this summer's most popular restaurant by mining people's geo-tagged comments. Another example is identifying the most popular travel routes in a city based on a large number of users' geo-tagged photos.
Consequently, LBSNs enable many novel applications that change the way we live, such as physical location (or activity) recommendation systems [15, 16] and travel planning, while offering many new research opportunities for social network analysis (like user modeling in the physical world and connection strength analysis) [17, 18], spatio-temporal data mining [19], ubiquitous computing [20], and spatio-temporal databases [19, 21].

Existing applications providing location-based social networking services can be broadly divided into three categories: geo-tagged-media-based, point-location-driven and trajectory-centric.

Geo-tagged-media-based. Quite a few geo-tagging services enable users to add a location label to media content such as text, photos, and videos generated in the physical world. The tagging can occur instantly when the medium is generated, or after a user has returned home. In this way, people can browse their content at the exact location where it was created (on a digital map or in the physical world using a mobile phone). Users can also comment on the media and expand their social structures using the interdependency derived from the geo-tagged content (for example, favoring the same photo taken at a location). Representative websites of such location-based social networking services include Flickr, Panoramio, and Geo-twitter. Though a location dimension has been added to these social networks, the focus of such services is still on the media content. That is, location is used only as a feature to organize and enrich media content, while the major interdependency between users is based on the media itself.

Point-location-driven. Applications like Foursquare and Google Latitude encourage people to share their current locations, such as a restaurant or a museum. In Foursquare, points and badges are awarded for checking in at venues. The individual with the largest number of check-ins at a venue is crowned Mayor.
With the real-time location of users, an individual can discover friends (from her social network) around her physical location so as to enable certain social activities in the physical world, e.g., inviting people to have dinner or go shopping. Meanwhile, users can add tips to venues that other users can read, which serve as suggestions for things to do, see, or eat at the location. With this kind of service, a venue (point location) is the main element determining the interdependency connecting users, while user-generated content such as tips and badges characterizes a point location.

Trajectory-centric. In a trajectory-centric social networking service, such as Bikely, SportsDo, and Microsoft GeoLife, users pay attention to both point locations (passed by a trajectory) and the detailed route connecting these point locations. These services not only tell users basic information about a particular trajectory, such as distance, duration, and velocity, but also show a user's experiences, represented by tags, tips, and photos for the trajectory. In short, these services provide "how" and "what" information in addition to "where" and "when". In this way, other people can refer to a user's travel or sports experience by browsing or replaying the trajectory on a digital map, and follow the trajectory in the real world with a GPS phone.

Table 1 provides a brief comparison among these three services. The major differences between the point-location-driven and the trajectory-centric LBSN lie in two aspects. One is that a trajectory offers richer information than a point location, such as how to reach a location, how long a user stayed in a location, the time taken to travel between two locations, and the physical and traffic conditions of a route. As a result, we are more likely to accurately understand an individual's behavior and interests in a trajectory-centric LBSN.
The other is that in a point-location-driven LBSN users usually share their real-time location, while a trajectory-centric LBSN more likely delivers historical locations, as users typically prefer to upload a trajectory after a trip has finished (though it can be operated in a continuously uploading manner). This property could compromise some scenarios based on the real-time location of a user; however, it reduces to some extent the privacy issues in a location-based social network. In other words, when people see a user's trajectory, the user is no longer there.

Table 1: Comparison of different location-based social networking services

LBSN Services            Focus            Real-time         Information
Geo-tagged-media-based   Media            Normal            Poor
Point-location-driven    Point location   Instant           Normal
Trajectory-centric       Trajectory       Relatively slow   Rich

In fact, the location data generated in the first two LBSN services can be converted into the form of a trajectory, which might be used by the third category of LBSN service. For example, if we sequentially connect the point locations of the geo-tagged photos taken by a user over several days, a sparse trajectory can be formed. Likewise, the check-in records of an individual ordered by time can be regarded as a low-sampling-rate trajectory. However, because of this sparseness (the distance and time interval between two consecutive points in a trajectory can be very large), the uncertainty in a single trajectory from the first two services is increased. To put these trajectories into trajectory-centric LBSN services, we need to use them in a collective and collaborative way.

The following sections pay closer attention to trajectory data, which is the most complex data structure to be found in the three LBSN services and provides the richest information. If it is handled well, other data sources become easier to deal with.
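The conversion described above, from time-ordered check-ins to a sparse trajectory, can be sketched as follows. This is a minimal illustration and not part of any LBSN service: the `CheckIn` record, its field names and the helper functions are hypothetical, and the haversine formula is simply one common way to approximate the distance between two latitude-longitude points.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class CheckIn:
    """One check-in or geo-tagged photo (hypothetical schema)."""
    timestamp: float  # seconds since an arbitrary epoch
    lat: float        # latitude in degrees
    lon: float        # longitude in degrees


def haversine_km(a: CheckIn, b: CheckIn) -> float:
    """Approximate great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = (sin(dlat / 2) ** 2
         + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_km * asin(sqrt(h))


def to_trajectory(checkins):
    """Order check-ins by time so that they form a (sparse) trajectory."""
    return sorted(checkins, key=lambda c: c.timestamp)


def segment_gaps(trajectory):
    """Return (distance_km, seconds) for each pair of consecutive points."""
    return [
        (haversine_km(p, q), q.timestamp - p.timestamp)
        for p, q in zip(trajectory, trajectory[1:])
    ]


# Illustrative, made-up check-ins given in arbitrary order.
checkins = [
    CheckIn(timestamp=7200.0, lat=0.0, lon=1.0),
    CheckIn(timestamp=0.0, lat=0.0, lon=0.0),
    CheckIn(timestamp=3600.0, lat=0.0, lon=0.5),
]
trajectory = to_trajectory(checkins)
gaps = segment_gaps(trajectory)
```

Each entry of `gaps` exposes the sparseness discussed above: here consecutive points are roughly 55.6 km and one hour apart, so what the user did in between is unknown.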
Moreover, as mentioned above, location data can be converted into a trajectory on many occasions. Consequently, some methodologies designed for trajectory data can be employed by the first two LBSN services.

2.2 Collaborative Recommendation Based Social Network:

With the recent advances in technology, there is an emerging presence of social media and social networking systems. In the case of multimedia-enriched social network systems, such as last.fm, the collective goods are musical tracks and the collective action is the process of crafting individual profiles of musical preference and linking them either explicitly, via bonds of friendship, or implicitly, through collaborative annotation [22].

This collective action leads to the creation of an implicit social networking structure, which we aim to explore further. In particular, given the success of item recommendation systems on commercial websites such as Amazon.com and Netflix, it is considered worthwhile to revisit the recommendation problem through the novel perspective of social networking. In general, recommendation systems aim to provide personalized recommendations of items to users based on their previous behavior as well as on other information gathered from item descriptions and user profiles.

However, no emphasis has yet been placed on personalization based explicitly on social networks. The reason is that, despite the increasing interest in exploring social networks, there is no concrete dataset that includes both explicit bonds of friendship among users and free-form collaborative annotation of items. This is because most social media systems do not allow free access to all user profiles or lists of friends.
Given the widespread adoption of social networks and the lack of previous studies that directly address the problem of efficiently integrating the added-value knowledge provided by those networks into the field of collaborative recommendation, we propose a new methodology that tackles the aforementioned issues. Within this context we make the following contributions:

- We introduce a dataset based on data from the last.fm social network that describes a social graph among users, tracks and tags, effectively including bonds of friendship and collaborative annotation.
- We evaluate a Random Walk with Restarts (RWR) model on this dataset and show that the incorporation of friendship and social tagging can improve the performance of an item recommendation system.
- We show that the RWR method outperforms the standard Collaborative Filtering (CF) method, which we also evaluate against the same dataset.
- We show that our method using the RWR method requires no training and successfully manages to capture

We may distinguish two broad categories of collaborative recommendation systems, namely content-based and collaborative filtering. A content-based system selects items based on the correlation between the content of the items (e.g. keywords describing the items, such as album genre, artists, etc., for music tracks) and the user's preferences [23]. However, it is limited to dictionary-bound relations between the keywords used by users and the descriptions of items, and therefore does not explore implicit associations between users.

Collaborative filtering systems are divided into two categories, i.e. memory-based and model-based. In memory-based systems [24] we calculate the similarity between all users, based on their ratings of items, using some heuristic measure such as the cosine similarity or the Pearson correlation score. We then predict a missing rating by aggregating the ratings of the k nearest neighbors of the user we want to recommend to.
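The memory-based procedure just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the CF baseline evaluated in any study: the ratings, user names and function names are hypothetical, cosine similarity is computed over sparse rating dictionaries (treating unrated items as zero), and the aggregation is a similarity-weighted mean, which is only one of several common choices.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    dot = sum(a[item] * b[item] for item in set(a) & set(b))
    norm_a = sqrt(sum(r * r for r in a.values()))
    norm_b = sqrt(sum(r * r for r in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_rating(ratings, user, item, k=2):
    """Predict a missing rating from the k most similar users who rated `item`.

    Returns the similarity-weighted mean of the neighbours' ratings,
    or None when no neighbour with non-zero similarity rated the item.
    """
    neighbours = [
        (cosine_similarity(ratings[user], ratings[other]), ratings[other][item])
        for other in ratings
        if other != user and item in ratings[other]
    ]
    top_k = sorted(neighbours, key=lambda pair: pair[0], reverse=True)[:k]
    total_weight = sum(sim for sim, _ in top_k)
    if total_weight == 0:
        return None
    return sum(sim * rating for sim, rating in top_k) / total_weight


# Illustrative, made-up ratings on a 1-5 scale.
ratings = {
    "alice": {"track_a": 5, "track_b": 3},
    "bob": {"track_a": 5, "track_b": 3, "track_c": 4},
    "carol": {"track_a": 1, "track_b": 5, "track_c": 2},
    "dave": {"track_b": 4},
}
prediction = predict_rating(ratings, "alice", "track_c", k=2)
```

Because "bob" rates items much as "alice" does, the prediction for `track_c` lands closer to bob's rating than to carol's. The result depends directly on the choice of `k` and of the similarity measure.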
The problem with memory-based systems is that we have to decide on a rather arbitrary basis over parameters such as the number of neighbors. What is more, in the case of social networks there is no straightforward way to introduce similarities between users based on friendships and social tagging, other than some ad hoc interpolation of similarity weights from those different sources.

Model-based filtering systems assume that users form clusters based on similar behavior in rating items. A model is learned from patterns recognized in the rating behavior of users, using clustering, Bayesian networks and other machine learning techniques [25, 26]. The problem with model-based methods is that several parameters of the model must be fine-tuned and that the models produced might not generalize well to radically different contexts. What is more, as in the case of memory-based systems, extra effort and training are needed in order to introduce knowledge from social networks.

Many research publications have lately revolved around the area of social media. In particular, several studies focus on dataset collection and analysis from social networks. Das et al. [27] propose sampling-based algorithms that capture information in the neighborhood of a user in dynamic social networks utilizing random walks. Halpin et al. [8] study the distribution of tags in the social bookmarking site del.icio.us and propose a generative model of collaborative tagging in order to evaluate the dynamics that lie beneath the act of collaborative recommendation. Their findings show that the collected dataset follows a power-law distribution. Even though both studies examine social networks that are based on social tagging, they do not explore the dynamics of friendships among users.

Taking into account the power of free-form tagging of items by users other than their authors or owners, researchers also focus on tag recommendation.
Subramanya and Liu [28] propose a system that automatically recommends tags for blogs, using similarity ranking in a manner similar to collaborative filtering techniques. Strohmaier [29] studies a novel idea in tag recommendation, which bridges the gap between the keywords issued by a user in a query and the tags actually used by a social system. He argues that the tags used by a user when performing a query exhibit his or her intent, whereas the annotations of items describe content semantics. As a result, he proposes a new form of purpose tags, which extract the intent of the user and facilitate goal-oriented search in a social network. Both studies underline the importance and discriminative power of social tagging, which is also validated by our work.

Several studies exist in the field of applying Random Walks on bipartite graphs. Craswell and Szummer [30] study a clickthrough data graph in order to perform item recommendation. Nevertheless, no social content is available between users. Yildirim and Krishnamoorthy [31] propose a novel recommendation algorithm which performs Random Walks on a graph that denotes similarity measures between items. They evaluate their system using data from MovieLens. Although the Random Walk model performs well in the context of recommendation, their use of an item-item similarity matrix raises some issues as to the ability of the system to extend when other similarities are introduced based on social tagging.

Recent work has also been done in the field of applying Random Walks over a social graph instead of bipartite graphs, similar to what we propose in this paper. Clements et al. [32] propose a single-term query system performing Random Walks on graphs including users, items and tags. They use data from LibraryThing, an online book catalogue where users rate and tag books they have read.
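As a rough illustration of a random walk over a graph of users, items and tags, the sketch below runs a Random Walk with Restarts by simple power iteration. It is a toy under stated assumptions and not the system of any cited study: the graph, the node labels, the restart probability of 0.15, the iteration count and all function names are hypothetical choices made for this example.

```python
def build_graph(edges):
    """Build an undirected adjacency list from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    return graph


def random_walk_with_restart(graph, start, restart_prob=0.15, iterations=100):
    """Approximate RWR visiting probabilities by power iteration.

    At each step the walker jumps back to `start` with probability
    `restart_prob`; otherwise it moves to a uniformly chosen neighbour.
    Every node is assumed to have at least one neighbour.
    """
    scores = {node: 0.0 for node in graph}
    scores[start] = 1.0
    for _ in range(iterations):
        updated = {node: 0.0 for node in graph}
        for node, score in scores.items():
            share = (1.0 - restart_prob) * score / len(graph[node])
            for neighbour in graph[node]:
                updated[neighbour] += share
        updated[start] += restart_prob
        scores = updated
    return scores


def recommend_tracks(graph, user, restart_prob=0.15, iterations=100):
    """Rank tracks the user is not yet linked to by their RWR score."""
    scores = random_walk_with_restart(graph, user, restart_prob, iterations)
    known = set(graph[user])
    candidates = [n for n in graph if n.startswith("track:") and n not in known]
    return sorted(candidates, key=lambda n: scores[n], reverse=True)


# Made-up graph mixing friendships, listening links and tag annotations.
graph = build_graph([
    ("user:ann", "user:ben"),    # friendship
    ("user:ben", "user:cat"),    # friendship
    ("user:ann", "track:1"),
    ("user:ben", "track:2"),
    ("user:cat", "track:3"),
    ("track:1", "tag:rock"),
    ("track:2", "tag:rock"),
    ("track:3", "tag:jazz"),
])
scores = random_walk_with_restart(graph, "user:ann")
ranking = recommend_tracks(graph, "user:ann")
```

In this toy graph `track:2` is reachable from `user:ann` both through a friend and through a shared tag, so it ranks above `track:3`, which is reachable only through a friend of a friend. Friendship and tagging edges are treated uniformly here; weighting them differently would be a further design choice.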
Due to the lack of ground truth, Clements et al. assume that the tags assigned to an item by each user are the same as those the user would use as query terms to retrieve the annotated item. We argue that this assumption is rather strong and that a user experiment would be more appropriate in order to properly establish the ground truth.

Hotho et al. evaluate a variation of adapted PageRank on a dataset from del.icio.us, exploring folksonomies of bookmarks based also on collaborative annotation [33]. However, since they evaluate their proposed algorithm empirically, any attempt to compare against their results becomes cumbersome.

Although both studies are close to our approach, we use a different model, namely RWR, in which we explicitly include friendships in our dataset and perform collaborative recommendations instead of queries on the graph.

2.3 Sentiment Intensity Analysis of Informal Texts:

The proliferation of social networks such as blogs, forums and other online means of expression and communication has resulted in a landscape where people are able to discuss freely online through a variety of means and applications [34].

Probably one of the most novel and interesting ways of communicating in cyberspace is through 3D virtual environments. In such environments, people, represented by their avatars, socialize and interact with each other and with virtual humans operated by machines, i.e., computer systems. Examples of such virtual environments are flourishing and include Second Life, World of Warcraft [35], There [36], IMVU [37], Moove [38], Activeworlds [39], Bluemars [40], Club Cooee [41], etc. Despite the fact that the graphics of those environments remain relatively poor, futuristic movies such as Avatar [42] provide an example of sophisticated landscapes and renderings that will be attainable by such environments in the foreseeable future.
However, regardless of how attractive and realistic such artificial 3D worlds become, they will always remain heavily dependent on the quality of human communication that takes place within them. As shown in [43, 37], communication in environments that are not limited to a single, textual modality consists not just of semantic data transfer, but also of dense non-verbal communication in which sentiment plays an important role. Moreover, without emotion no consistent and coherent (virtual) body language is possible. Such primordial movements include facial expressions, eye looks, arm-language coordination, etc.

Sentiment detection from textual utterances can play an important role in the development of realistic and interactive dialog systems. Such systems serve various educational, business or entertainment-oriented functions and also include systems that are deployed in 3D virtual environments. With the aid of dialog coherence modules, conversational systems aim at a realistic interaction flow at the emotional level, e.g., Affect Listeners [44], and can greatly benefit from the correct identification of the emotional state of their participants. Taking into consideration that the majority of input to practical conversational systems consists of short, informal, textual exchanges, it is essential that the sentiment analysis component integrated in the dialog system is able to cope with this type of informal, often incomplete or ill-formed communication.

Sentiment analysis, the process of automatically detecting whether a text segment contains emotional or opinionated content and extracting its polarity or valence, is a field of research that has received significant attention in recent years, both in academia and in industry.
The aforementioned increase of user-generated content on the web has resulted in a wealth of information that is potentially of vital importance to institutions and companies, providing them with data to research their consumers, manage their reputations and identify new opportunities. As a result, most of the research in the field has been limited to product reviews, where the aim is to predict whether the reviewer recommends a product or not, based on the textual content of the review.

The focus of this paper is different. Instead of focusing our attention on product reviews, we explore the more ubiquitous field of informal, social interactions in cyberspace. The unprecedented popularity of social platforms such as Facebook, Twitter and MySpace, as well as 3D virtual worlds, has resulted in an unparalleled increase of textual exchanges that remains relatively unexplored, especially in terms of its emotional content.

Specifically, we aim to answer the following question: can lexicon-based approaches perform more effectively than machine-learning approaches in this domain? This question is particularly important, because previous research in sentiment analysis using product reviews has shown that machine-learning approaches typically outperform lexicon-based ones, but no exploration of whether the same holds for informal, social interactions has been carried out in the past. The differences between the two domains are numerous. Firstly, reviews tend to be longer and more verbose than typical social interactions, which may be only a few words long and often contain significant spelling errors [45]. Secondly, no clear gold standard exists in the domain of informal communications with which to train a machine-learning classifier, in contrast to the "thumbs up" or "thumbs down" feature of reviews.
Lastly, social exchanges on the web tend to be much more diverse in terms of their topics, with issues ranging from politics and recent news to religion, whereas product reviews by definition have a specific subject, i.e. the product under discussion.

The study of emotional and social interactions in virtual worlds implies the study of virtual human (VH) behaviors. Two types of VH exist: avatars (i.e. the projection of a real human in the 3D environment) and agents (i.e. the projection of an autonomous machine simulating a human in the virtual world). These VH types result in three possible types of communication: avatar to avatar, agent to agent and avatar to agent. Each of these has the following interesting aspect, respectively:

- A non-verbal body language based on VH emotional states and mind profile.
- A potential visualization of the interaction from a third VH that should be represented by an avatar.
- A non-verbal communication for the human representation and an action of the agent strongly influenced by interpreted emotions from the avatar.

It seems only logical that these aspects would strongly benefit from artificial intelligence and conversation systems in order to make the communication more realistic.

The structure of this paper is as follows. The next section provides a brief overview of relevant work in sentiment analysis. Section 3 presents the lexicon-based classifier and section 4 presents the two machine-learning classifiers that will be used in this study. Section 5 describes the data sets that were used and explains the experimental setup, while section 6 presents and analyzes the results. Finally, we conclude and present some potential future directions of research.

Sentiment analysis, also known as opinion mining, has attracted considerable interest recently. Most research has focused on analyzing the content of either movie or general product reviews (e.g. [46]).
Attempts to expand the application of sentiment analysis to other domains, such as debates [47], news and blogs [48], are also prominent. The seminal book of Pang and Lee [49] presents a thorough analysis of the work in the field. In this section we will focus on the more prominent work which is relevant to our approach.

Pang et al. [46] were among the first to explore the sentiment analysis of reviews, focusing on machine-learning approaches. These approaches generally function as follows: initially, a general inductive process learns the characteristics of a class during a training phase, by observing the properties of a number of pre-classified documents (i.e. a reference corpus), and then applies the acquired knowledge to determine the best category for new, unseen documents during testing. Pang et al. [46] experimented with three different algorithms: Support Vector Machines (SVMs), Naive Bayes and Maximum Entropy classifiers, using a variety of features, such as unigrams and bigrams, part-of-speech tags, binary and term frequency feature weights and others. Their best accuracy on a dataset consisting of movie reviews was attained using an SVM classifier with binary features, although all three classifiers gave very comparable performance. Other approaches (e.g. [50, 51]) have focused on extending the feature set with semantically or linguistically driven features in order to improve classification accuracy.

Dictionary/lexicon-based sentiment analysis is typically based on lists of words with some sort of pre-determined emotional weight. Examples of such dictionaries include the General Inquirer (GI) dictionary [52] and the Linguistic Inquiry and Word Count (LIWC) software [53], which are also used in the present study. Both lexicons are built with the aid of experts who classify certain tokens in terms of their affective content (e.g. positive or negative).
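The lexicon-based idea above, a list of words with pre-determined emotional weights, can be sketched as follows. This is a minimal illustration only: the tiny word list and its weights are invented for the example and are not taken from GI or LIWC, and the function names and the sign-based decision rule are assumptions rather than the classifier used in any cited study.

```python
import re

# Hypothetical toy lexicon: word -> emotional weight (positive or negative).
# Real lexicons such as GI or LIWC are far larger and are not reproduced here.
TOY_LEXICON = {
    "good": 1.0,
    "great": 2.0,
    "love": 2.0,
    "bad": -1.0,
    "awful": -2.0,
    "hate": -2.0,
}


def sentiment_score(text, lexicon=TOY_LEXICON):
    """Sum the lexicon weights of all words found in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0.0) for token in tokens)


def classify(text, lexicon=TOY_LEXICON):
    """Label a text by the sign of its summed lexicon score."""
    score = sentiment_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


labels = [
    classify("I love this, great!"),
    classify("awful, i hate it"),
    classify("see you at 9pm"),
]
```

No training data is needed, which is what makes this family of methods attractive for informal text where no gold standard exists. The sketch ignores negation, misspellings and intensifiers, all of which matter in short social exchanges.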
The Affective Norms for English Words (ANEW) lexicon [39] contains ratings of terms on a nine-point scale in regard to three individual dimensions: valence, arousal and dominance. The ratings were produced manually by psychology class students. Ways to produce such emotional dictionaries in an automatic or semi-automatic fashion have also been introduced in research [40]. Emotional dictionaries have mostly been utilized in psychology or sociology oriented research [54].

The idea of emotional conversationalists is relatively old. First attempts to create such a system can be traced back to Parry [55], a chatterbot intended for studying the nature of paranoia and able to express fears, anxieties or beliefs. More recent work includes research on the development of synthetic characters and chatterbots with personalities [35] and studies on emotional responses and their influence on the creation of believable agents or interactive virtual personalities [36]. In [56] the authors focused on the role of emotions for gaining rapport in spoken dialog systems by rendering responses that contain suitable emotion, both lexically and auditorily. Studies on the role of facial expressions in building rapport in virtual human-user interactions were conducted in [57]. A chatterbot system that generates emotional responses by selecting and displaying expressive images of the character emulated by the chatterbot was presented in [58].

Emotional communication for virtual worlds has been a challenging research field for almost two decades. One of the pioneering papers was proposed by Cassel et al. [42]. In the proposed system, conversations between multiple human-like agents were automatically generated and animated with appropriate and synchronized speech, intonation, facial expressions, and hand gestures. Subsequent work proposed numerous ways to design personality and emotion models for virtual humans.
More recently, researchers predicted specific personality and emotional states from hierarchical fuzzy rules to facilitate personality and emotion control, and in 2009, Pelachaud et al. [32] developed a model of behavior expressivity using a set of six parameters that act as modulation of behavior animation. Finally, this year, [60] introduced a graphical representation of human emotion extracted from text sentences. The main contributions of that approach included an original pipeline that extracts, processes, and renders the emotion of 3D VH. Additionally, the paper presented methods to optimize the computational pipeline so that real-time virtual reality rendering can be achieved on common PCs. Lastly, it was demonstrated how the Poisson distribution can be utilized to transfer database-extracted lexical and language parameters into coherent intensities of valence and arousal (i.e. parameters of Russell's circumplex model of emotion).

2.4 Big 5 modeling:

At present, many researchers believe that there are five core personality traits, and the evidence for this theory has been growing over the past 50 years [6]. From the point of view of a sociologist, social media can be characterized as collective goods produced through computer-mediated collective action [7]. People in each category differ in their attitude toward particular sites, their taste in products, and their skill in accomplishing work. The five factors are Extraversion, Agreeableness, Conscientiousness, Neuroticism and Openness [8]. People of different categories express their thoughts in different ways [6], and OSN users differ in how strongly they express their thoughts or behavior [5]. OSN users can be categorized according to the Big Five factors [9]. The behavior of OSN users varies from location to location [10], but people from the same or nearby locations tend to behave similarly [11]. Behavior also varies with age.
The personality traits used in the 5 factor model are Extraversion, Agreeableness, Conscientiousness, Neuroticism and Openness to experience [61]. It is important to ignore the positive or negative associations that these words have in everyday language. For example, Agreeableness is obviously advantageous for achieving and maintaining popularity. Agreeable people are better liked than disagreeable people. On the other hand, agreeableness is not useful in situations that require tough or totally objective decisions. Disagreeable people can make excellent scientists, critics, or soldiers. Remember, none of the five traits is in itself positive or negative; they are simply characteristics that individuals exhibit to a greater or lesser extent.

Each of these 5 personality traits describes, relative to other people, the frequency or intensity of a person's feelings, thoughts, or behaviors. Everyone possesses all 5 of these traits to a greater or lesser degree. For example, two individuals could be described as agreeable (agreeable people value getting along with others). But there could be significant variation in the degree to which they are both agreeable. In other words, all 5 personality traits exist on a continuum (see diagram) rather than as attributes that a person does or does not have.

Extraversion

Extraversion is marked by pronounced engagement with the external world. Extraverts enjoy being with people, are full of energy, and often experience positive emotions. They tend to be enthusiastic, action-oriented individuals who are likely to say "Yes!" or "Let's go!" to opportunities for excitement. In groups they like to talk, assert themselves, and draw attention to themselves. Introverts lack the exuberance, energy, and activity levels of extraverts. They tend to be quiet, low-key, deliberate, and disengaged from the social world.
Their lack of social involvement should not be interpreted as shyness or depression; the introvert simply needs less stimulation than an extravert and prefers to be alone. The independence and reserve of the introvert is sometimes mistaken for unfriendliness or arrogance. In reality, an introvert who scores high on the agreeableness dimension will not seek others out but will be quite pleasant when approached.

Agreeableness

Agreeableness reflects individual differences in concern with cooperation and social harmony. Agreeable individuals value getting along with others. They are therefore considerate, friendly, generous, helpful, and willing to compromise their interests with others. Agreeable people also have an optimistic view of human nature. They believe people are basically honest, decent, and trustworthy. Disagreeable individuals place self-interest above getting along with others. They are generally unconcerned with others' well-being, and therefore are unlikely to extend themselves for other people. Sometimes their skepticism about others' motives causes them to be suspicious, unfriendly, and uncooperative. Agreeableness is obviously advantageous for attaining and maintaining popularity. Agreeable people are better liked than disagreeable people. On the other hand, agreeableness is not useful in situations that require tough or absolute objective decisions. Disagreeable people can make excellent scientists, critics, or soldiers.

Conscientiousness

Conscientiousness concerns the way in which we control, regulate, and direct our impulses. Impulses are not inherently bad; occasionally time constraints require a snap decision, and acting on our first impulse can be an effective response. Also, in times of play rather than work, acting spontaneously and impulsively can be fun. Impulsive individuals can be seen by others as colorful, fun-to-be-with, and zany. Nonetheless, acting on impulse can lead to trouble in a number of ways.
Some impulses are antisocial. Uncontrolled antisocial acts not only harm other members of society, but also can result in retribution toward the perpetrator of such impulsive acts. Another problem with impulsive acts is that they often produce immediate rewards but undesirable, long-term consequences. Examples include excessive socializing that leads to being fired from one's job, hurling an insult that causes the breakup of an important relationship, or using pleasure-inducing drugs that eventually destroy one's health.

Impulsive behavior, even when not seriously destructive, diminishes a person's effectiveness in significant ways. Acting impulsively disallows contemplating alternative courses of action, some of which would have been wiser than the impulsive choice. Impulsivity also sidetracks people during projects that require organized sequences of steps or stages. Accomplishments of an impulsive person are therefore small, scattered, and inconsistent.

A hallmark of intelligence, what potentially separates human beings from earlier life forms, is the ability to think about future consequences before acting on an impulse. Intelligent activity involves contemplation of long-range goals, organizing and planning routes to these goals, and persisting toward one's goals in the face of short-lived impulses to the contrary. The idea that intelligence involves impulse control is nicely captured by the term prudence, an alternative label for the Conscientiousness domain. Prudent means both wise and cautious. Persons who score high on the Conscientiousness scale are, in fact, perceived by others as intelligent.

The benefits of high conscientiousness are obvious. Conscientious individuals avoid trouble and achieve high levels of success through purposeful planning and persistence. They are also positively regarded by others as intelligent and reliable. On the negative side, they can be compulsive perfectionists and workaholics.
Furthermore, extremely conscientious individuals might be regarded as stuffy and boring. Unconscientious people may be criticized for their unreliability, lack of ambition, and failure to stay within the lines, but they will experience many short-lived pleasures and they will never be called stuffy.

Neuroticism

Freud originally used the term neurosis to describe a condition marked by mental distress, emotional suffering, and an inability to cope effectively with the normal demands of life. He suggested that everyone shows some signs of neurosis, but that we differ in our degree of suffering and our specific symptoms of distress. Today neuroticism refers to the tendency to experience negative feelings. Those who score high on Neuroticism may experience primarily one specific negative feeling such as anxiety, anger, or depression, but are likely to experience several of these emotions. People high in neuroticism are emotionally reactive. They respond emotionally to events that would not affect most people, and their reactions tend to be more intense than normal. They are more likely to interpret ordinary situations as threatening, and minor frustrations as hopelessly difficult. Their negative emotional reactions tend to persist for unusually long periods of time, which means they are often in a bad mood. These problems in emotional regulation can diminish a neurotic's ability to think clearly, make decisions, and cope effectively with stress.

At the other end of the scale, individuals who score low in neuroticism are less easily upset and are less emotionally reactive. They tend to be calm, emotionally stable, and free from persistent negative feelings. Freedom from negative feelings does not mean that low scorers experience a lot of positive feelings; frequency of positive emotions is a component of the Extraversion domain.
Openness to Experience

Openness to Experience describes a dimension of cognitive style that distinguishes imaginative, creative people from down-to-earth, conventional people. Open people are intellectually curious, appreciative of art, and sensitive to beauty. They tend to be, compared to closed people, more aware of their feelings. They tend to think and act in individualistic and nonconforming ways. Intellectuals typically score high on Openness to Experience; consequently, this factor has also been called Culture or Intellect. Nonetheless, Intellect is probably best regarded as one aspect of openness to experience. Scores on Openness to Experience are only modestly related to years of education and scores on standard intelligence tests.

Another characteristic of the open cognitive style is a facility for thinking in symbols and abstractions far removed from concrete experience. Depending on the individual's specific intellectual abilities, this symbolic cognition may take the form of mathematical, logical, or geometric thinking, artistic and metaphorical use of language, music composition or performance, or one of the many visual or performing arts. People with low scores on openness to experience tend to have narrow, common interests. They prefer the plain, straightforward, and obvious over the complex, ambiguous, and subtle. They may regard the arts and sciences with suspicion, regarding these endeavors as abstruse or of no practical use. Closed people prefer familiarity over novelty; they are conservative and resistant to change. Openness is often presented as healthier or more mature by psychologists, who are often themselves open to experience. However, open and closed styles of thinking are useful in different environments. The intellectual style of the open person may serve a professor well, but research has shown that closed thinking is related to superior job performance in police work, sales, and a number of service occupations.
Subordinate Personality Traits or Facets

Each of the big 5 personality traits is made up of 6 facets or sub traits. These can be assessed independently of the trait that they belong to.

Extraversion Facets:

Friendliness. Friendly people genuinely like other people and openly demonstrate positive feelings toward others. They make friends quickly and it is easy for them to form close, intimate relationships. Low scorers on Friendliness are not necessarily cold and hostile, but they do not reach out to others and are perceived as distant and reserved.

Gregariousness. Gregarious people find the company of others pleasantly stimulating and rewarding. They enjoy the excitement of crowds. Low scorers tend to feel overwhelmed by, and therefore actively avoid, large crowds. They do not necessarily dislike being with people sometimes, but their need for privacy and time to themselves is much greater than for individuals who score high on this scale.

Assertiveness. High scorers on Assertiveness like to speak out, take charge, and direct the activities of others. They tend to be leaders in groups. Low scorers tend not to talk much and let others control the activities of groups.

Activity Level. Active individuals lead fast-paced, busy lives. They move about quickly, energetically, and vigorously, and they are involved in many activities. People who score low on this scale follow a slower and more leisurely, relaxed pace.

Excitement-Seeking. High scorers on this scale are easily bored without high levels of stimulation. They love bright lights and hustle and bustle. They are likely to take risks and seek thrills. Low scorers are overwhelmed by noise and commotion and are averse to thrill-seeking.

Cheerfulness. This scale measures positive mood and feelings, not negative emotions (which are a part of the Neuroticism domain). Persons who score high on this scale typically experience a range of positive feelings, including happiness, enthusiasm, optimism, and joy.
Low scorers are not as prone to such energetic, high spirits.

Agreeableness Facets:

Trust. A person with high trust assumes that most people are fair, honest, and have good intentions. Persons low in trust may see others as selfish, devious, and potentially dangerous.

Morality. High scorers on this scale see no need for pretence or manipulation when dealing with others and are therefore candid, frank, and sincere. Low scorers believe that a certain amount of deception in social relationships is necessary. People find it relatively easy to relate to the straightforward high scorers on this scale. They generally find it more difficult to relate to the low scorers on this scale. It should be made clear that low scorers are not unprincipled or immoral; they are simply more guarded and less willing to openly reveal the whole truth.

Altruism. Altruistic people find helping other people genuinely rewarding. Consequently, they are generally willing to assist those who are in need. Altruistic people find that doing things for others is a form of self-fulfillment rather than self-sacrifice. Low scorers on this scale do not particularly like helping those in need. Requests for help feel like an imposition rather than an opportunity for self-fulfillment.

Cooperation. Individuals who score high on this scale dislike confrontations. They are perfectly willing to compromise or to deny their own needs in order to get along with others. Those who score low on this scale are more likely to intimidate others to get their way.

Modesty. High scorers on this scale do not like to claim that they are better than other people. In some cases this attitude may derive from low self-confidence or self-esteem. Nonetheless, some people with high self-esteem find immodesty unseemly. Those who are willing to describe themselves as superior tend to be seen as disagreeably arrogant by other people.

Sympathy. People who score high on this scale are tender-hearted and compassionate.
They feel the pain of others vicariously and are easily moved to pity. Low scorers are not affected strongly by human suffering. They pride themselves on making objective judgments based on reason. They are more concerned with truth and impartial justice than with mercy.

Conscientiousness Facets:

Self-Efficacy. Self-Efficacy describes confidence in one's ability to accomplish things. High scorers believe they have the intelligence (common sense), drive, and self-control necessary for achieving success. Low scorers do not feel effective, and may have a sense that they are not in control of their lives.

Orderliness. Persons with high scores on orderliness are well-organized. They like to live according to routines and schedules. They keep lists and make plans. Low scorers tend to be disorganized and scattered.

Dutifulness. This scale reflects the strength of a person's sense of duty and obligation. Those who score high on this scale have a strong sense of moral obligation. Low scorers find contracts, rules, and regulations overly confining. They are likely to be seen as unreliable or even irresponsible.

Achievement-Striving. Individuals who score high on this scale strive hard to achieve excellence. Their drive to be recognized as successful keeps them on track toward their lofty goals. They often have a strong sense of direction in life, but extremely high scorers may be too single-minded and obsessed with their work. Low scorers are content to get by with a minimal amount of work, and might be seen by others as lazy.

Self-Discipline. Self-discipline, what many people call will-power, refers to the ability to persist at difficult or unpleasant tasks until they are completed. People who possess high self-discipline are able to overcome reluctance to begin tasks and stay on track despite distractions. Those with low self-discipline procrastinate and show poor follow-through, often failing to complete tasks, even tasks they want very much to complete.

Cautiousness.
Cautiousness describes the disposition to think through possibilities before acting. High scorers on the Cautiousness scale take their time when making decisions. Low scorers often say or do the first thing that comes to mind without deliberating alternatives and the probable consequences of those alternatives.

Neuroticism Facets:

Anxiety. The fight-or-flight system of the brain of anxious individuals is too easily and too often engaged. Therefore, people who are high in anxiety often feel like something dangerous is about to happen. They may be afraid of specific situations or be just generally fearful. They feel tense, jittery, and nervous.

Anger. Persons who score high in Anger feel enraged when things do not go their way. They are sensitive about being treated fairly and feel resentful and bitter when they feel they are being cheated. This scale measures the tendency to feel angry; whether or not the person expresses annoyance and hostility depends on the individual's level on Agreeableness. Low scorers do not get angry often or easily.

Depression. This scale measures the tendency to feel sad, dejected, and discouraged. High scorers lack energy and have difficulty initiating activities. Low scorers tend to be free from these depressive feelings.

Self-Consciousness. Self-conscious individuals are sensitive about what others think of them. Their concern about rejection and ridicule causes them to feel shy and uncomfortable around others. They are easily embarrassed and often feel ashamed. Their fears that others will criticize or make fun of them are exaggerated and unrealistic, but their awkwardness and discomfort may make these fears a self-fulfilling prophecy. Low scorers, in contrast, do not suffer from the mistaken impression that everyone is watching and judging them. They do not feel nervous in social situations.

Immoderation. Immoderate individuals feel strong cravings and urges that they have difficulty resisting.
They tend to be oriented toward short-term pleasures and rewards rather than long-term consequences. Low scorers do not experience strong, irresistible cravings and consequently do not find themselves tempted to overindulge.

Vulnerability. High scorers on Vulnerability experience panic, confusion, and helplessness when under pressure or stress. Low scorers feel more poised, confident, and clear-thinking when stressed.

Openness Facets:

Imagination. To imaginative individuals, the real world is often too plain and ordinary. High scorers on this scale use fantasy as a way of creating a richer, more interesting world. Low scorers on this scale are more oriented to facts than fantasy.

Artistic Interests. High scorers on this scale love beauty, both in art and in nature. They become easily involved and absorbed in artistic and natural events. They are not necessarily artistically trained or talented, although many will be. The defining features of this scale are interest in, and appreciation of, natural and artificial beauty. Low scorers lack aesthetic sensitivity and interest in the arts.

Emotionality. Persons high on Emotionality have good access to and awareness of their own feelings. Low scorers are less aware of their feelings and tend not to express their emotions openly.

Adventurousness. High scorers on adventurousness are eager to try new activities, travel to foreign lands, and experience different things. They find familiarity and routine boring, and will take a new route home just because it is different. Low scorers tend to feel uncomfortable with change and prefer familiar routines.

Intellect. Intellect and artistic interests are the two most important, central aspects of openness to experience. High scorers on Intellect love to play with ideas. They are open-minded to new and unusual ideas, and like to debate intellectual issues. They enjoy riddles, puzzles, and brain teasers.
Low scorers on Intellect prefer dealing with people or things rather than ideas. They regard intellectual exercises as a waste of time. Intellect should not be equated with intelligence. Intellect is an intellectual style, not an intellectual ability, although high scorers on Intellect score slightly higher than low-Intellect individuals on standardized intelligence tests.

Liberalism. Psychological liberalism refers to a readiness to challenge authority, convention, and traditional values. In its most extreme form, psychological liberalism can even represent outright hostility toward rules, sympathy for law-breakers, and love of ambiguity, chaos, and disorder. Psychological conservatives prefer the security and stability brought by conformity to tradition. Psychological liberalism and conservatism are not identical to political affiliation, but certainly incline individuals toward certain political parties.

It is possible, although unusual, to score high in one or more facets of a personality trait and low in other facets of the same trait. For example, you could score highly in Imagination, Artistic Interests, Emotionality and Adventurousness, but score low in Intellect and Liberalism.

3 Objective

The main objective of this paper is to build a model of a user's virtual behavior by analyzing his/her OSN presence, and to make recommendations on the basis of that behavior model. To reach this main goal, we need to consider a few sub-objectives:
1. Analyze the user's behavior in the OSN over the last few days.
2. Categorize his/her presence according to the Big 5.
3. Use the percentage obtained for each Big 5 factor to elaborate the user's behavior pattern.
4. Recommend services/products to the user on the basis of his/her behavior model.

4 Research Questions

The main research question of this paper is therefore: how can users of an OSN be categorized according to the Big 5 factors from their behavior in the OSN? The sub research questions are:
1. How does an OSN represent a user?
2. How can we analyze user behavior?
3.
How can user behavior be categorized into the Big 5 factors?

5 Proposed Research Methodology

In this paper our aim is to relate the text corpus from a social network to the psychological theory of personality. We will also try to implement a recommendation system based on behavior analysis. Correlational and exploratory methodologies are therefore used in this paper, where the concept is behavior, the indicator is Big 5 modeling, and the variables are Extraversion, Neuroticism, Agreeableness, Openness and Conscientiousness.

5.1 Data Collection:

To categorize users' behavior, data are collected from an OSN (Twitter), where they are stored as a result of user activity such as the user's own posts, posts by friends, categories of liked pages, etc. Only public data, which carry no barrier to use, are collected. For each user, the previous 20 days of data are collected at a time. The system collects the data directly from the OSN with the user's full authorization and then stores them securely in the system database.

Twitter, a social network site, can be used for sentiment analysis as it has a very large number of short messages created by its users [62], so we used Twitter to collect users' data. Using the Twitter REST API 1.1, we collected public tweets and retweets. Our Twitter app requires users to authorize it before it extracts data from their profiles, and it will not collect data if users do not allow it to run. We made sure that all data we extract from Twitter are public. By calling the GET statuses/user_timeline and GET statuses/retweets_of_me methods we can collect the user's tweets and retweets. The app can also collect public data from the profiles that the user is currently following by using the GET friends/ids method. The collected data are in JSON format and the app can write them to text files. As separate files are easier to use, we keep one data file per user, named by the user's unique identifier (userid or username).
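The per-user file step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' app: it takes JSON text that has already been downloaded, assumes each tweet object carries the `created_at`, `text` and `user.id_str` fields of the REST API 1.1 tweet format, and the function name and the `<userid>.txt` file layout are hypothetical.

```python
import json
import os
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Timestamp layout of the created_at field in REST API 1.1 tweet objects,
# e.g. "Mon Sep 01 10:00:00 +0000 2014".
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def split_tweets_by_user(json_text, out_dir, now, days=20):
    """Write one text file per user containing that user's recent tweet texts.

    json_text -- JSON array of tweet objects (already downloaded)
    out_dir   -- directory for the per-user files (hypothetical layout)
    now       -- timezone-aware datetime marking the end of the window
    days      -- length of the collection window (20 days in the paper)
    """
    cutoff = now - timedelta(days=days)
    texts_by_user = defaultdict(list)
    for tweet in json.loads(json_text):
        created = datetime.strptime(tweet["created_at"], TWITTER_TIME_FORMAT)
        if created < cutoff:
            continue  # outside the collection window
        texts_by_user[tweet["user"]["id_str"]].append(tweet["text"])

    os.makedirs(out_dir, exist_ok=True)
    paths = {}
    for user_id, texts in texts_by_user.items():
        # File name is the user's unique identifier, as in the text above.
        path = os.path.join(out_dir, user_id + ".txt")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("\n".join(texts))
        paths[user_id] = path
    return paths
```

Passing `now` explicitly rather than reading the clock inside the function keeps the 20-day window reproducible when the same download is processed again later.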
5.2 Data Analysis:

Each text file, which contains the past data of a single user, is analyzed with LIWC (Linguistic Inquiry and Word Count), a text analysis software program designed by James W. Pennebaker, Roger J. Booth and Martha E. Francis. Each text file analyzed by LIWC2007 can be treated as a whole or broken into segments. LIWC counts the words according to its dictionary. When the process finishes, it saves the results to a specified file, with each result written below its corresponding category. These categories indicate different aspects of the Big 5 factors, and the modelling is implemented on the basis of these results. The data table below shows which category lies in which factor.

Table 2: Data tables
Extraversion | Openness | Neuroticism | Conscientiousness | Agreeableness
Social process | Leisure | Swear words | Relativity | Positive emotion
Family | Insight | Negation | Motion | Feel
Friends | Body | Negative emotion | Space | Discrepancy (should, would)
Humans | Ingestion | Anxiety | Time | Tentative (maybe)
Further entries, in source order: Affective, Anger, Religion, Hear, Biological process, Sadness, Death, Sexual, Money, Achievement, Certainty, See

The collected data are analyzed by LIWC, which splits every sentence. Then, according to the meaning and use of each word, a percentage mark is assigned to the corresponding Big 5 category. After marking, the percentages are summed and the category with the highest mark is taken as the user's behavior.

5.3 Results:

The word counts provided by LIWC are expressed as percentages. LIWC gives the result as

result = (TC * 100) / WC

where WC = total words in the text file and TC = total words in the category. The inverse is used to recover the exact number of words:

TC = (result * WC) / 100

Then the values of the categories that lie in the same Big 5 factor are summed using a linear formula (a linear regression form with all coefficients equal to 1):

f(X) = X1 + X2 + X3 + ... + Xi

The value obtained for each factor is then converted into a percentage of the total over all factors, using the percentage formula

part / whole = % / 100

These results are used to draw the pie chart using EXCEL.
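The calculations of sections 5.2 and 5.3 can be sketched as below. This is an illustrative sketch, not the authors' implementation: the category keys are hypothetical labels rather than LIWC's actual output column names, the map covers only part of Table 2, and the inverse formula is written in its algebraically consistent form, TC = result × WC / 100.

```python
# Partial category-to-factor map taken from part of Table 2.
# The lower-case keys are hypothetical labels, not LIWC column names.
FACTOR_CATEGORIES = {
    "Extraversion": ["social", "family", "friends", "humans"],
    "Openness": ["leisure", "insight", "body", "ingestion"],
    "Neuroticism": ["swear", "negation", "negative_emotion", "anxiety"],
    "Conscientiousness": ["relativity", "motion", "space", "time"],
    "Agreeableness": ["positive_emotion", "feel", "discrepancy", "tentative"],
}


def liwc_percent(tc, wc):
    """result = (TC * 100) / WC: share of the file's words in one category."""
    return tc * 100 / wc


def word_count_from_percent(result, wc):
    """Inverse of liwc_percent: TC = (result * WC) / 100."""
    return result * wc / 100


def factor_scores(category_percent):
    """f(X) = X1 + X2 + ... + Xi over the categories of each factor."""
    return {
        factor: sum(category_percent.get(name, 0.0) for name in names)
        for factor, names in FACTOR_CATEGORIES.items()
    }


def factor_percentages(scores):
    """part / whole = % / 100, with the sum over all factors as the whole."""
    whole = sum(scores.values())
    if whole == 0:
        return {factor: 0.0 for factor in scores}
    return {factor: 100 * part / whole for factor, part in scores.items()}


def dominant_factor(percentages):
    """Factor with the highest percentage, taken as the user's behavior."""
    return max(percentages, key=percentages.get)
```

For example, category percentages of 10 for social, 2 for family, 3 for leisure, 1 for anxiety and 4 for time give factor sums of 12, 3, 1, 4 and 0, so Extraversion accounts for 60% of the total and is taken as the dominant factor.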
Example: [Figure: pie chart of a user's Big 5 factor percentages]

5.4 Recommendation analysis:

Depending on the behavior analysis, some brands of products are suggested or recommended to users. The dominant share of a user's behavior influences which product brands he/she likes. The tables below give some examples showing that the majority of people with a given behavior are interested in these brands or categories of products/services. Tables 3, 4 and 5 show some examples of recommendations.

Table 3: Data tables
Big 5 Factor | Categories/Brands of Game | Movie
Extraversion | Strategy (Age of Empire, Commandos) | Political, Fantasy, Family
Openness | Racing (NFS) | Comedy, Sports, Drama
Neurotic | Shooting (COD, CS) | Crime scene, Action, Horror
Conscientious | Sudoku, Chess | Political, Historical, Conspiracy
Agreeable | Sports | Romantic, Drama

Table 4: Data tables
Big 5 Factor | Categories/Brands of Music | Food
Extraversion | Rock | Bread, Meat
Openness | Classical, Vocal, Country wood | Multicultural Food, Pizza
Neurotic | Pop, Heavy Metal | Fast Food
Conscientious | New Released, Historic | Salad, Vegetable
Agreeable | Romantics, Country | Bread, Cheese

Table 5: Data tables
Big 5 Factor | Categories/Brands of Beverage | Play
Extraversion | Coffee, Tea | Football, Athletics
Openness | Milkshake, Green Tea | Cricket, Swim
Neurotic | Soft Drinks | Boxing, Rugby, Martial arts
Conscientious | Green Tea, Black Coffee | Athletics, Martial arts
Agreeable | Coffee, Tea, Soft Drinks | Gymnastics

6 Conclusion

We show that personality can be recognised by computers through language cues. To date, there has been little work on automatic recognition of user personality, and our research is the first to examine the recognition of personality in dialogue and recommendation based on sentiment analysis results. What clearly emerges is that extraversion is the easiest trait to model in general, followed by emotional stability and conscientiousness. We can also see that feature selection is very important, as some of the best models only contain a small subset of the full feature set.
Prosodic features are important for modelling observed extraversion, emotional stability and openness to experience. LIWC features are beneficial for all traits. We also analysed the influence of the most relevant individual features in specific models, for all recognition tasks. We also used the Stanford NLP (natural language processing) application to analyze and split the texts, but as LIWC generated more accurate results than Stanford NLP, we later used only LIWC. At this moment our system can only use text information, but in future we will enable it to mine data from shared links or videos. Our system cannot identify quotes (which a user uses to express others' speech). There is considerable scope for analysis in more categories of sentimental words/signs. More accurate brand recommendation depends on the percentages obtained for the Big 5 factors; a finer depth of measurement and scale of marking would make it more efficient.

References

1. Bao, J., Zheng, Y., Mokbel, M. 2012. Recommendations in Location-based Social Networks. ACM TIST. V, N, Article A (January YYYY), 30 pages. DOI = 10.1145/0000000.0000000 http://doi.acm.org/10.1145/0000000.0000000. M. Smith, V. Barash, L. Getoor, and H. W. Lauw. Leveraging social context for searching social media. In SSM '08: Proceedings of the 2008 ACM workshop on Search in social media, pages 91-94, New York, NY, USA, 2008. ACM.
2. A. M. Ferman, J. H. Errico, P. van Beek, and M. I. Sezan. Content-based filtering and personalization using structured metadata. In JCDL '02: Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries, pages 393-393, New York, NY, USA, 2002. ACM.
3. Nielsen Online Report. Social networks & blogs now 4th most popular online activity, 2009.
4. F. Benevenuto, Tiago Rodrigues, Meeyoung Cha and Virgilio Almeida. Characterizing User Behavior in Online Social Networks. 2009.
5. Kendra Cherry. The Big Five Personality Dimensions. http://psychology.about.com/od/personalitydevelopment/a/bigfive.htm . [19-June-14].
6.
Jie Bao, Yu Zheng, David Wilkie, and Mohamed F. Mokbel. A Survey On Recommendation in Location-Based Social Networks. ACM TIST. January 2012.
7. Ward, James C., and Amy L. Ostrom. The internet as information minefield: an analysis of the source and content of brand information yielded by net searches. Journal of Business Research 56.11 (2003): 907-914.
8. Shuotian Bai, Tingshao Zhu and Li Cheng. Big-Five Personality Prediction based on User Behaviors at Social Network Sites. arXiv:1204.4809v1 [cs.CY], 21 Apr 2012.
9. Mia O. Hoogenboom, John D. Armstrong, Ton G.G. Groothuis and Neil B. Metcalfe. The growth benefits of aggressive behaviour vary with individual metabolism and resource predictability. Behavioral Ecology (2012), doi:10.1093/beheco/ars161. http://beheco.oxfordjournals.org/content/early/2012/09/25/beheco.ars161.full . 28-September-2012.
10. M. Smith, V. Barash, L. Getoor, and H. W. Lauw. Leveraging social context for searching social media. In SSM '08: Proceeding of the 2008 ACM workshop on Search in social media, pages 91-94, New York, NY, USA, 2008. ACM.
11. Katherine R. Luking, Joan Luby and Deanna M. Barch. Developmental Cognitive Neuroscience. Volume 9. July 2014. Pages 82-92. Download: http://www.sciencedirect.com/science/article/pii/S1878929314000073/pdfft?md5=8162e9d9e0d9730b51269c0619cc205c&pid=1-s2.0-S1878929314000073-main.pdf
12. Flickr. http://www.flickr.com
13. Twitter. http://twitter.com
14. Foursquare. https://foursquare.com
15. Cao, X., Cong, G., Jensen, C.S.: Mining significant semantic locations from GPS data. Proc. VLDB Endow. 3, 1009-1020 (2010)
16. Zheng, Y., Zhang, L., Xie, X., Ma, W.Y.: Mining interesting locations and travel sequences from GPS trajectories. In: Proceedings of the 18th international conference on World wide web, WWW '09, pp. 791-800. ACM, New York, NY, USA (2009)
17. Li, Q., Zheng, Y., Xie, X., Chen, Y., Liu, W., Ma, W.Y.: Mining user similarity based on location history.
In: Proceedings of the 16th ACM SIGSPATIAL international conference on Advances in geographic information systems, GIS '08, pp. 34:1-34:10. ACM, New York, NY, USA (2008)
18. Xiao, X., Zheng, Y., Luo, Q., Xie, X.: Finding similar users using category-based location history. In: Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems, GIS '10, pp. 442-445. ACM, New York, NY, USA (2010)
19. Liu, W., Zheng, Y., Chawla, S., Yuan, J., Xie, X.: Discovering spatio-temporal causal interactions in traffic data streams. In: The 17th ACM SIGKDD international conference on Knowledge Discovery and Data mining, KDD '11. ACM, New York, NY, USA (2011)
20. Zheng, Y., Li, Q., Chen, Y., Xie, X., Ma, W.Y.: Understanding mobility based on GPS data. In: Proceedings of the 10th international conference on Ubiquitous computing, UbiComp '08, pp. 312-321. ACM, New York, NY, USA (2008)
21. Wang, L., Zheng, Y., Xie, X., Ma, W.Y.: A flexible spatio-temporal indexing scheme for large-scale GPS track retrieval. In: Proceedings of the Ninth International Conference on Mobile Data Management, pp. 1-8. IEEE Computer Society, Washington, DC, USA (2008)
22. Ioannis Konstas, Vassilios Stathopoulos, Joemon M. Jose: On Social Networks and Collaborative Recommendation.
23. A. M. Ferman, J. H. Errico, P. van Beek, and M. I. Sezan. Content-based filtering and personalization using structured metadata. In JCDL '02: Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries, pages 393-393, New York, NY, USA, 2002. ACM.
24. J. L. Herlocker, J. A. Konstan, A. Borchers, and J. Riedl. An algorithmic framework for performing collaborative filtering. In SIGIR '99: Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, pages 230-237, New York, NY, USA, 1999. ACM.
25. G. Adomavicius and A. Tuzhilin.
Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. Knowledge and Data Engineering, IEEE Transactions on, 17(6):734-749, 2005.
26. H. Yildirim and M. S. Krishnamoorthy. A random walk method for alleviating the sparsity problem in collaborative filtering. In RecSys '08: Proceedings of the 2008 ACM conference on Recommender systems, pages 131-138, New York, NY, USA, 2008. ACM.
27. G. Das, N. Koudas, M. Papagelis, and S. Puttaswamy. Efficient sampling of information in social networks. In I. Soboroff, E. Agichtein, and R. Kumar, editors, SSM, pages 67-74. ACM, 2008.
28. S. B. Subramanya and H. Liu. Socialtagger - collaborative tagging for blogs in the long tail. In SSM '08: Proceeding of the 2008 ACM workshop on Search in social media, pages 19-26, New York, NY, USA, 2008. ACM.
29. M. Strohmaier. Purpose tagging: capturing user intent to assist goal-oriented social search. In SSM '08: Proceeding of the 2008 ACM workshop on Search in social media, pages 35-42, New York, NY, USA, 2008. ACM.
30. N. Craswell and M. Szummer. Random walks on the click graph. In SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, pages 239-246, New York, NY, USA, 2007. ACM.
31. H. Yildirim and M. S. Krishnamoorthy. A random walk method for alleviating the sparsity problem in collaborative filtering. In RecSys '08: Proceedings of the 2008 ACM conference on Recommender systems, pages 131-138, New York, NY, USA, 2008. ACM.
32. M. Clements, A. P. de Vries, and M. J. T. Reinders. Optimizing single term queries using a personalized Markov random walk over the social graph. In Workshop on Exploiting Semantic Annotations in Information Retrieval (ESAIR), March 2008.
33. A. Hotho, R. Jaschke, C. Schmitz, and G. Stumme. Information Retrieval in Folksonomies: Search and Ranking. 2006.
34.
Georgios Paltoglou, Stephane Gobron, Marcin Skowron, Mike Thelwall, and Daniel Thalmann. Sentiment analysis of informal textual communication in cyberspace.
35. Barthelemy, F., D.B.G.S., Magnant, X.: Believable synthetic characters in a virtual emarket. In: Proceedings of the IASTED Artificial Intelligence and Applications (2004)
36. Bates, J.: The role of emotion in believable agents. Communications of the ACM 37(7), 122-125 (1994)
37. Becheiraz, P., Thalmann, D.: A model of nonverbal communication and interpersonal relationship between virtual actors. In: CA '96. p. 58. IEEE Computer Society, Washington, DC, USA (1996)
38. Blitzer, J., Dredze, M., Pereira, F.: Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In: 45th ACL. pp. 440-447. Association for Computational Linguistics, Prague, Czech Republic (June 2007)
39. Bradley, M., Lang, P.: Affective norms for English words (ANEW): Stimuli, instruction manual and affective ratings. Tech. rep., Gainesville, FL. The Center for Research in Psychophysiology, University of Florida (1999)
40. Brooke, J., Tofiloski, M., Taboada, M.: Cross-linguistic sentiment analysis: From English to Spanish. In: ICRA-NLP (2009)
41. Cassell, J.: Embodied conversational agents. MIT Press, Cambridge, MA, USA (2000)
42. Cassell, J., Pelachaud, C., Badler, N., Steedman, M., Achorn, B., Becket, T., Douville, B., Prevost, S., Stone, M.: Animated conversation: rule-based generation of facial expression, gesture & spoken intonation for multiple conversational agents. In: SIGGRAPH '94. pp. 413-420. ACM, New York, NY, USA (1994)
43. Kappas, A., Hess, U., Scherer, K.R.: Voice and emotion. In: Fundamentals of nonverbal behavior. pp. 200-238. Cambridge University Press, Cambridge and New York (1991)
44. Skowron, M.: Affect listeners: Acquisition of affective states by means of conversational systems. In: COST 2102 Training School. pp. 169-181 (2009)
45.
Thelwall, M., Wilkinson, D.: Public dialogs in social network sites: What is their purpose? J. Am. Soc. Inf. Sci. Technol. 61(2), 392-404 (2010)
46. Pang, B., Lee, L., Vaithyanathan, S.: Thumbs up? Sentiment classification using machine learning techniques. In: EMNLP 2002 (2002)
47. Thomas, M., Pang, B., Lee, L.: Get out the vote: Determining support or opposition from congressional floor-debate transcripts. CoRR abs/cs/0607062 (2006)
48. Ounis, I., Macdonald, C., Soboroff, I.: Overview of the TREC-2008 blog track. In: The TREC 2008 Proceedings. NIST (2008)
49. Pang, B., Lee, L.: Opinion Mining and Sentiment Analysis. Now Publishers Inc. (2008)
50. Mullen, T., Collier, N.: Sentiment analysis using support vector machines with diverse information sources. In: Lin, D., Wu, D. (eds.) Proceedings of EMNLP 2004. pp. 412-418. Association for Computational Linguistics, Barcelona, Spain (July 2004)
51. Whitelaw, C., Garg, N., Argamon, S.: Using appraisal groups for sentiment analysis. In: CIKM '05. pp. 625-631. ACM, New York, NY, USA (2005)
52. Wilson, T., Wiebe, J., Hoffmann, P.: Recognizing contextual polarity in phrase-level sentiment analysis. In: HLT/EMNLP 2005. Vancouver, CA (2005)
53. Pennebaker J., F.M., R., B.: Linguistic Inquiry and Word Count: LIWC. Erlbaum Publishers, 2 edn. (2001)
54. Slatcher, R., Chung, C., Pennebaker, J., Stone, L.: Winning words: Individual differences in linguistic style among U.S. presidential and vice presidential candidates. Journal of Research in Personality 41(1), 63-75 (2007)
55. Colby, K.: Artificial paranoia. Artificial Intelligence 2(1), 1-25 (1971)
56. Acosta, J.: Using Emotion to Gain Rapport in a Spoken Dialog System. Ph.D. thesis, University of Texas at El Paso (2009)
57. Gratch, J., W.N.G.J.F.E., Duffy, R.:
58. Turney, P.D., Littman, M.L.: Unsupervised learning of semantic orientation from a hundred-billion-word corpus. CoRR cs.LG/0212012 (2002)
59. Pelachaud, C.: Studies on gesture expressivity for a virtual agent.
Speech Commun. 51(7), 630-639 (2009)
60. Gobron, S., Ahn, J., Paltoglou, G., Thelwall, M., Thalmann, D.: From sentence to emotion: a real-time three-dimensional graphics metaphor of emotions extracted from text. 26(6-8), 505-519 (June 2010)
61. Shuotian Bai, Tingshao Zhu and Li Cheng. Big-Five Personality Prediction based on User Behaviors at Social Network Sites. arXiv:1204.4809v1 [cs.CY], 21 Apr 2012.
62. Pak, Alexander, and Patrick Paroubek. Twitter as a Corpus for Sentiment Analysis and Opinion Mining. LREC. 2010. Page 1326.