IN BIG DATA
SOUMYA M HEGDE
CSE Department, Visvesvaraya Technological University, Don Bosco College of Engineering
Bangalore, Karnataka, 560074, India
soumyahegde22@gmail.com
RASHMI K S
CSE Department, Visvesvaraya Technological University, Don Bosco College of Engineering
Bangalore, Karnataka, 560074, India
rashmikapil.sirdeshpande@gmail.com
ABSTRACT: On social media it is difficult to retrieve content that matches an individual user's
interests. In this work, content is filtered and recommended to the user based on browsing and
downloading history together with mining of the user's interests. The project has two goals: first,
studying the problems users face, and second, building a model for processing and controlling the project.
INTRODUCTION
Social media platforms are growing in proportion to the increase in population. Their services
reach users through add-ons such as mobile texting, Facebook, Twitter, etc. These media
produce a large amount of internet traffic, but the content remains fragmented, which limits the user
experience, analysis, and content redistribution. Social networking sites rely more on
individual experiences, users' suggestions, and the problems users have overcome than on
anything else.
Since these social platforms are deeply involved in ordinary people's lives for
entertainment, networking, and business improvement, ease of access to these media from a single
platform needs to be considered. Use of these media is also a continuous experience rather than
something a user tries once and then abandons.
When social networks were first introduced, they became popular within a short time.
Later these networks became part of most people's lives, as they offered
photo sharing, video sharing, status updates, and groups for mass communication.
Several drawbacks of social networks, such as the lack of interoperability, were addressed by moving
towards a single platform. Content redistribution in Facebook was provided by the share function
(e.g. [2] and [3]). Open identity was provided by Google. But these features were not successful, as
they demanded too much time and effort from users. The drawback of the share function was that the
user could not engage in one-to-many applications, and Google open identity was limited to targeted
platforms rather than multiple platforms.
To address these disadvantages, our project provides unified access to different types of social
networks using a Hadoop system. On a single platform the user can access and rate content and
also receive new recommendations for content not yet seen, based on how the user has rated other
content and on the user's interests. The user can set filters for unwanted data. The only things the
user needs to do are register and set filters. Based on what the user browses, views, and downloads,
the system learns the user's interests and recommends content to the user.
THEORETICAL PREMISES
Figure 1. Services
The goal of this project is to overcome a few of the drawbacks of interactivity described above. The
project provides a user-centric experience with interactivity.
The increase in the number of users and in the volume of data has led to cognitive overload [6]. To
overcome this, [5] suggested solutions such as information retrieval through keywords, information
filtering [1], and brute force interaction that enable immediate and effortless interaction.
Considering the challenges and the interactivity of the user, we developed a mobile-oriented
framework. This framework eases the user's work and also helps in understanding the process of
interactivity. As the project utilizes the cloud, interaction with data at large scale is possible, and
analysis at large scale is also made easy.
The front end of the project consists of social media interactivity, and the back end consists of
processing organized into phases that include content matching, interest mining,
information extraction, user management, and subscription management.
The workflow follows the procedure below.
Subscription management:
Here the user can register, add a content filter, remove a content filter, and search and view content.
While registering, user data such as name, address, and preferences is collected; the location is used
for building the user profile, based on which content will be recommended to the user.
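The registration and filter operations described above might be modelled as follows. This is a minimal Python sketch under stated assumptions, not the project's actual implementation: the names `Subscriber`, `SubscriptionManager`, and every field name are hypothetical, and storage is an in-memory dictionary rather than the Hadoop back end.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Subscriber:
    """One registered user; all field names are illustrative assumptions."""
    name: str
    address: str  # location, later used for user profiling
    preferences: set[str] = field(default_factory=set)
    blocked_topics: set[str] = field(default_factory=set)  # content filters

    def add_filter(self, topic: str) -> None:
        # Topics are stored in lower case so that matching is case-insensitive.
        self.blocked_topics.add(topic.lower())

    def remove_filter(self, topic: str) -> None:
        # discard() does nothing if the filter was never set.
        self.blocked_topics.discard(topic.lower())


class SubscriptionManager:
    """Keeps registered users in memory, keyed by name (an assumption)."""

    def __init__(self) -> None:
        self._users: dict[str, Subscriber] = {}

    def register(self, name: str, address: str, preferences: list[str]) -> Subscriber:
        if name in self._users:
            raise ValueError(f"user already registered: {name}")
        user = Subscriber(name, address, {p.lower() for p in preferences})
        self._users[name] = user
        return user
```

A user would register once, for example `SubscriptionManager().register("asha", "Bangalore", ["music"])`, and then call `add_filter` or `remove_filter` on the returned object.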
User management
This is used for internal monitoring of user data and preferences, and for calculating and storing
information about what can be recommended to the user. The most important step here is user
profiling, where a user profile is a group of people with similar interests. Based on user profiling,
content is recommended to a set of people.
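One simple way to picture user profiling is to group users who share a location and an interest. The sketch below assumes that definition of "similar interest"; the paper does not specify the grouping rule, and the function name `build_profiles` and the tuple layout are hypothetical.

```python
from collections import defaultdict


def build_profiles(users):
    """Group user names by (location, interest).

    `users` is an iterable of (name, location, interests) tuples.
    Returns a dict mapping (location, interest) to the set of user names
    that share both; keys are lower-cased for case-insensitive grouping.
    """
    profiles = defaultdict(set)
    for name, location, interests in users:
        for interest in interests:
            profiles[(location.lower(), interest.lower())].add(name)
    return dict(profiles)
```

Content chosen for one profile key can then be offered to every user in the corresponding set.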
Interest mining
This uses an agglomerative clustering algorithm from data mining to extract the user's interests,
based on the ratings the user has given to viewed content and on the content the user views most.
This happens in the back end of the system, where processing takes place, and the result is used for
further recommendation of content to the user.
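To illustrate the clustering step, the following is a minimal pure-Python sketch of agglomerative clustering. The paper names only the algorithm family, so the choices here are assumptions: each user is represented by a vector of ratings per content category, distance is Euclidean, clusters are merged by single linkage, and merging stops at a requested number of clusters `k`.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length rating vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerative(points, k):
    """Bottom-up clustering: start with one cluster per point and repeatedly
    merge the two closest clusters (single linkage) until k clusters remain.

    Returns a list of clusters, each a list of indices into `points`.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None  # (distance, index of first cluster, index of second)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(
                    euclidean(points[a], points[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]  # j > i, so index i is unaffected
    return clusters
```

With rating vectors such as `[(5, 1), (4, 1), (1, 5), (1, 4)]` and `k=2`, users 0 and 1 fall into one cluster and users 2 and 3 into the other. This quadratic-time version is for illustration only and would not scale to the data volumes the paper targets.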
Information extraction
This uses Web API services for information extraction. Each time the user downloads or browses,
the data is sent to the processing machine. This information is then passed to the content
matching phase.
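The record sent to the processing machine on each browse or download could look like the sketch below. The paper does not describe the payload or the Web API itself, so the function name `make_event`, the JSON encoding, and every field name are hypothetical; the transport layer is deliberately left out.

```python
import json
from datetime import datetime, timezone

# Actions named in the text; anything else is rejected.
ALLOWED_ACTIONS = {"browse", "download"}


def make_event(user_id, action, content_id, topic, when=None):
    """Serialize one user action as a JSON string for the processing machine."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    when = when or datetime.now(timezone.utc)
    record = {
        "user_id": user_id,
        "action": action,
        "content_id": content_id,
        "topic": topic.lower(),
        "timestamp": when.isoformat(),
    }
    # sort_keys gives a stable field order, which helps when comparing logs.
    return json.dumps(record, sort_keys=True)
```

Each serialized event would then be handed to the content matching phase together with the mined interests.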
Content matching
It matches the mined interests against the information extracted through the Web API and is used for
content recommendation. For example, if a user has set a filter indicating that particular data is not
wanted, that kind of data will not be recommended to the user even if the user has browsed similar data.
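The rule that a user-set filter overrides browsing history can be sketched in a few lines. The function name `match_content` and the representation of content as `(content_id, topic)` pairs are assumptions made for illustration.

```python
def match_content(candidates, interests, blocked):
    """Return ids of candidates whose topic is a mined interest and is not
    blocked by a user filter. Filters take priority over interests.

    `candidates` is a list of (content_id, topic) pairs.
    """
    interests = {t.lower() for t in interests}
    blocked = {t.lower() for t in blocked}
    return [
        content_id
        for content_id, topic in candidates
        if topic.lower() in interests and topic.lower() not in blocked
    ]
```

For instance, a user who has browsed both music and politics but has filtered out politics would receive only the music items.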
Content recommendation:
This is one of the main aims of the project. Based on all the above criteria, content is
recommended to the user. The user can view it or rate it, and every time the user views the data the
database is updated.
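The view-rate-update loop described above might be sketched as follows. This is an illustrative in-memory stand-in for the database; the class name `RecommendationStore`, its methods, and the scoring rule (rank unseen content by the user's mean rating for its topic) are assumptions rather than the paper's stated method.

```python
from collections import defaultdict


class RecommendationStore:
    """In-memory stand-in for the database updated on every view."""

    def __init__(self):
        self.seen = defaultdict(set)      # user -> ids of viewed content
        self.ratings = defaultdict(dict)  # user -> {topic: [ratings]}

    def record_view(self, user, content_id, topic, rating=None):
        """Update the store whenever the user views (and optionally rates) content."""
        self.seen[user].add(content_id)
        if rating is not None:
            self.ratings[user].setdefault(topic, []).append(rating)

    def recommend(self, user, catalogue, limit=3):
        """Return up to `limit` unseen content ids, highest mean topic rating first.

        `catalogue` is a list of (content_id, topic) pairs.
        """
        def score(topic):
            given = self.ratings[user].get(topic, [])
            return sum(given) / len(given) if given else 0.0

        unseen = [(cid, t) for cid, t in catalogue if cid not in self.seen[user]]
        # sort() is stable, so ties keep their catalogue order.
        unseen.sort(key=lambda item: score(item[1]), reverse=True)
        return [cid for cid, _ in unseen[:limit]]
```

Because `record_view` runs on every view, content the user has already seen drops out of later recommendations.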
Figure 4. User Profiling
CONCLUSION
In this paper we proposed a model that centers on the interconnection between services and social
media platforms. Using big data, cloud, and data mining services, we built the model for
information extraction and content recommendation. The future work section describes the future
scope of the project, which utilizes general information for the purpose of building the model.
FUTURE WORK
The idea can be extended to generalized information. This generalized information then needs
to be categorized for ease of access by users. New techniques should also be found for categorizing
content from its own details, without the category being explicitly specified.
REFERENCES