Popular Web Platforms
Sherali Zeadally and Stephanie Winkler
Figure 1. Sign up screenshots from LinkedIn, Twitter, and Facebook.
june 2016
1932-4529/16 ©2016 IEEE
Research Contributions
We summarize our main research contributions as follows.
We analyze privacy agreements from four of the most popular web platforms (i.e., Google, LinkedIn, Twitter, and Facebook).
We compare and contrast the privacy agreements to identify the platform that gathers the most information and the platform that offers the most information privacy for the user.
We analyze the privacy policies in order to determine what the average user can comprehend.
We recommend some potential solutions for users to increase their privacy in spite of these agreements.
Finally, we identify future research opportunities aimed at protecting consumer privacy.
Related Work
Privacy policies have been the focus of many researchers over the years as the Internet has become more populated by various companies. Many past studies have focused on privacy policies in general, rather than on only a few web platforms as we do in this article. Also, prior research has examined one type of privacy policy at a time. Graber et al. [18] examined the privacy policies of 80 different Internet health web platforms. Lewis et al. [32] focused on the financial industry by examining 75 different online companies. The privacy policies selected comprised an equal number of banks, credit counseling companies, and check cashing companies. The primary focus of both of these studies was to analyze the readability of the privacy policies. The researchers concluded in both cases that privacy policies from these groups were not easily understood by the majority of users and are not a good means of informing users of their privacy rights [18], [32]. The work of Caramujo and Rodrigues Da Silva [4] is one of the few efforts that comes close to our objectives in this paper, as they focused on Facebook and LinkedIn. However, they did not study other popular platforms such as Google and Twitter and did not conduct an in-depth analysis of the privacy policies. Rather, these researchers used the privacy policies to inform their proposal of a privacy-aware unified modeling language profile [4].
Our approach for analyzing the privacy policies is different from the efforts of other researchers [4], [18], [32], [42], [44] since we include a side-by-side comparison, a readability analysis, and a discussion of what these mean for the users of these platforms. Our readability analysis uses an approach similar to that of Graber et al. [18] and Lewis et al. [32], since we use the Flesch-Kincaid Grade Level Readability Formula. The Flesch-Kincaid Grade Level Readability Formula measures the grade level a text is written at by analyzing
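For reference, the Flesch-Kincaid Grade Level formula named here is commonly stated as follows. This is the standard published form of the formula, not something taken from this article, and the result is read as an approximate U.S. school grade level.

```latex
\[
\text{FKGL} = 0.39 \cdot \frac{\text{total words}}{\text{total sentences}}
            + 11.8 \cdot \frac{\text{total syllables}}{\text{total words}}
            - 15.59
\]
```

Longer sentences and longer words both raise the score, so a text with many multi-clause sentences and polysyllabic legal terms scores at a higher grade level.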
[row label missing]        Yes   Yes   Yes   Yes
User-generated content     Yes   No    Yes   Yes
Device information         Yes   Yes   Yes   Yes
Location information       Yes   Yes   Yes   Yes
Payment information        Yes   No    Yes   Yes
[row label missing]        Yes   No    No    No
[row label missing]        Yes   Yes   Yes   No
Web Platforms
Social media platforms have become one of the most popular technologies on the Internet. By 2014, 71% of adult Internet users [7] and 56% of the entire adult population were using some form of social media [43]. While social media remains most popular among young adults (ages 18-29), usage is steadily increasing each year in other age groups [7]. Today some of the most widely used social media platforms are Facebook, Twitter, Google, and LinkedIn [25]. Each of these platforms is discussed in more depth below, along with the privacy agreements that their users are required to accept before accessing the site and using the service it provides. Special attention was paid to personally identifiable information (i.e., information that could be used to identify the user); user-generated content (anything that the user may create while using the platform); device information (e.g., IP address, whether it is considered a mobile device); location information; payment information (e.g., credit card number); off-platform activities (activity on
Facebook
Facebook was initially launched in 2004 as an online directory for college students. Since that initial launch, Facebook has become a publicly traded company with a market value of over 240 billion dollars [31]. This multibillion-dollar corporation is a social networking site that allows users to connect with people they know to share photos, play games, and share information. Facebook currently has over 1.2 billion monthly active users worldwide, making it one of the largest social media platforms today [28].
Facebook's privacy policy (last revised January 30, 2015) offers users very little actual privacy. According to the policy, Facebook collects numerous forms of data, which are summarized in Table 1. This immense amount of data is used to better tailor the site to fit users' needs as well as to inform future site development [11]. This means that users' information is widely shared with many parties, including developers, advertisers, and other third parties. In its cookie policy, Facebook lists no fewer than 10 separate companies that advertisers commonly use, along with their contact information [11]. This list is not exhaustive, which suggests that the number of companies with access to this information could be far larger than this initial list. Any information that Facebook collects is shared with these companies, though Facebook does not mention the protocol third parties must follow in order to access the information. Facebook stores this information "for as long as it is necessary to provide products and services to you and others" [11, para. 29]. Essentially, Facebook can store the information indefinitely, since it is the one that decides how long it is necessary to keep the data. When a user decides to delete their account, Facebook states that it will delete all of the data that the user posted. Any data that another member posted about the user deleting the account will remain on Facebook.
Google
Google was initially launched in 1997 [16], and its founders Larry Page and Sergey Brin filed for incorporation in 1998 [30]. Google's main function is as a search engine that indexes the information on the Internet and makes it searchable by Internet users. Since then, Google has expanded into many other areas, including mobile phones, operating systems, wearable technology, and entertainment, and the company acquired the popular video sharing site YouTube in 2006 [48]. At this point Google has grown into a multinational company with its search engine dominating in nearly every market. This dominance has led to the company being charged with holding a monopoly on search in Europe, with many people stating that we live in a Google age [55].
Google's privacy policy is universal for all of its platforms and describes the information collected from its users, how that information is shared, and the options the user has to control what information is collected [16]. Google collects many types of information, which are summarized in Table 1. The server log information that Google collects includes telephone logs, device crash reports, Internet Protocol (IP) address, and cookies [16]. The information that Google collects is only shared within Google unless certain conditions apply. If the user has a domain administrator, then that administrator has access to the user's personal information. Google also shares information with advertisers, but only after personally identifiable information has been stripped from the data. Google considers any information generated while using a Google account to be public, making it possible for the information to be indexed by search and found by other users. Google uses the data to help improve its products and services [14].
Information is also shared with other parties when the user gives explicit consent. This is in contrast to other web platforms, which focus on giving users the option to opt out of consumer tracking and collection. Google's default is for user information to remain private with respect to third parties. The user must consciously choose to share their information with advertisers.
LinkedIn
LinkedIn is a social networking site geared more towards professional contacts than friends. Users create a profile that can serve as a resume, with information including past work experience, current job, special skills, awards, and group affiliations. Users also have the option to allow contacts to endorse them on particular skills. The platform launched in 2003 and reached 225 million users in 2013 [46].
LinkedIn's privacy policy is not very specific about the types of information it collects from users. According to the policy, LinkedIn collects various forms of information, which are summarized in Table 1. In addition to these types of information, LinkedIn also uses cookies to track user habits on and off its services, but the statement does not specifically mention what types of information these cookies help it collect [33]. Instead, LinkedIn directs the user to a list of eight different third-party web platforms that use cookies through LinkedIn for the user to view their privacy
Twitter
Twitter is a micro-blogging site that allows users to create profiles and instantly broadcast short statements to the world. These statements have a limit of 140 characters and can include hyperlinks, videos, pictures, and hashtags. Hashtags help sort tweets into categories for users to easily find later. Twitter was founded in 2006
Data Collection
Each web platform collects data on the device used by the user, the user's location, and the information used in account creation. This data collection breaks down to the user's IP address, general location (city, state, and country), email address, name, birthdate, type of computer, operating system, and web browser. Beyond those basic elements, the privacy policies vary widely in the specific data they collect. Part of this is due to each platform having its own uses and user expectations. However, every platform uses the same three main methods of collecting user information: registration, user IP address trace, and search track recording cookies [5].
We found that the platform that collects the largest variety of data is Facebook. Facebook collects every piece of information that the user posts on the site, any information posted about that user, payment information, and even information about the user's general Internet usage. Since the main purpose of Facebook is social networking, the type of information that the user posts can vary widely and is likely to include potentially stigmatizing information [52]. Currently Facebook has the largest membership of all of the platforms discussed, with users disclosing in-depth details of their lives. These circumstances make it possible that Facebook could also be collecting the largest amount of data of any platform discussed here.
Even though Google's privacy policy covers all of its web platforms, Google collects the least information. Google has two different social media platforms, Google+ and YouTube. Google+'s adoption has not been nearly as widespread as that of the other web platforms, and consequently there is less information there to be collected [43]. YouTube would allow for a large amount of data collection, but Google does not collect user-generated content, unlike all of the other web platforms discussed. The majority of the data that Google collects originates from its main function: search. Google
Data Sharing
The second major area that exhibits differences across the platforms is how user data is shared. Google, LinkedIn, Facebook, and Twitter share data within their organizations for research purposes and site improvement. Facebook, Twitter, and LinkedIn all share their collected user data with third parties for the purposes of advertising. Google does not always share information with third parties. However, this could be due to the fact that Google now owns at least four advertising companies [3], [56] and does not need to include third parties. In addition to using data for advertising purposes, Twitter and LinkedIn provide researchers with access to their APIs to help further academic and professional research. Twitter offers this free of charge, while LinkedIn charges a fee. Facebook is the one platform that has worded its privacy policy in such a way that it has the right to do anything with a user's data [11, para. 8-12].
Data Security
Data security is one of the more interesting aspects of the platforms' privacy policies because none of the platforms extensively covers how a user's data is kept secure once it is collected. Twitter has the least secure policy of the four platforms because data security is not mentioned even once in its privacy policy. This could be because Twitter deviates from the other platforms by assuming that any content generated on its site is public anyway. Twitter is completely up front about this and states in the first paragraph of its policy that "Any registered user of the Twitter Services can send a Tweet, which is a message of 140 characters or less that is public by default and can include other content like photos, videos, and links to other websites" [51]. This distinction puts Twitter in a different position than LinkedIn, Google, or Facebook, where there is an implicit assumption that the information shared on the platform remains private between the individuals it is shared with [58].
LinkedIn and Facebook offer about the same level of protection. When it comes to data security, however, Facebook is vague about the protection it offers. Facebook states that it protects the user's data by using a
User Comprehension
Privacy policies are used as a contract between the user and the company that is offering its services. These companies offer their services to millions of people with varying levels of education, skill, background, and age. Privacy policies are meant to inform the user about what privacy is given when using a particular service. However, privacy policies cannot do their job if the average member of the population cannot understand them. Currently in the United States up to 50% of adults cannot understand literature written at an eighth grade reading level [35]. This is problematic given that most privacy policies are likely to be written at a level higher than eighth grade. In order to examine the level of comprehension expected of adults in the United States, each of these privacy policies was analyzed using the Flesch-Kincaid Grade Level Readability Formula. This analysis was completed using an open-source online readability test tool that allows for the full text of the document to
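A minimal sketch of how a Flesch-Kincaid Grade Level score can be computed, assuming the standard published formula (0.39 x words per sentence + 11.8 x syllables per word - 15.59). It is not the online tool used in this study. The function names and the vowel-group syllable heuristic are illustrative assumptions, so scores will differ somewhat from those of dictionary-based tools.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count runs of vowels in the word."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent ("cake"), except in "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Approximate U.S. grade level of `text` using the standard FKGL formula."""
    # Sentences are split on runs of '.', '!' or '?'; empty pieces are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)
```

Passing the full text of a policy to `flesch_kincaid_grade` returns an approximate grade level that can be compared against the eighth grade threshold discussed above. Very short or very simple texts can produce scores below zero.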
Recommendations
Concerns for online privacy have been building for years
as more people have transferred their lives into the digi
User Recommendations
These issues leave users in a bind. While Twitter and LinkedIn might be optional for consumers, Facebook and Google are increasingly difficult to get along without. If the user does not like the privacy agreement but is forced to use the platform, then they are left in a no-win situation. There are a few unconventional methods available if a user wants to protect their privacy, such as private browsing, do not track signatures, and browsing using Tor. If users decide that they do not want cookies to be used when they access web platforms, they can choose to block them with their web browser, but this might cause some web platforms to
do not track signature when they are accessing the
Internet. This notifies the website that the user does not
wish for cookies to be placed on their computer.
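In most browsers, the do not track signature takes the form of an HTTP request header, `DNT: 1`, sent with every request. The sketch below shows what attaching that header looks like, using Python's standard library. The helper name `build_dnt_request` and the URL are hypothetical placeholders, and the request is only constructed, never sent.

```python
from urllib.request import Request


def build_dnt_request(url: str) -> Request:
    """Build (but do not send) a request carrying the Do Not Track header."""
    # "DNT: 1" states the user's preference not to be tracked. It is only a
    # signal: the receiving site decides whether to honor it.
    return Request(url, headers={"DNT": "1"})


# example.com is a placeholder domain, not a site discussed in this article.
req = build_dnt_request("https://example.com/")
# urllib stores header names in capitalized form, so "DNT" is kept as "Dnt".
print(req.get_header("Dnt"))
```

Because the header is purely advisory, it adds no technical barrier of its own. Any protection depends entirely on how the receiving platform chooses to respond.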
The drawback to these approaches is that they are not very effective because the majority of web platforms do not honor do not track signatures [40]. The last option available to users to circumvent the privacy agreements is to use the Tor browser [1], [60]. The Tor browser offers completely anonymous browsing so the user does not have to worry about a website tracking them online or knowing their IP address [47]. However, this does not fix the problems with creating user accounts on web platforms or posting on them, since the user's identity is tied to the account.
Sadly, the problems with user accounts and the use of web platforms cannot be solved by the user alone. Further research should examine ways users could take control of their own privacy, or at a minimum, explore better ways to keep users informed about the information they are giving away. Self-regulation of web platforms has had some success in the United States; however, only 14% of companies that collect consumer information provide notice of their information policies [5]. Outside of academic research, governmental agencies should consider changing how the law applies to Internet communications because, regardless of what the law states, users consider what they state on social media and in email to be private information [2], [39]. If there is such a large disconnect between the majority of the public and the law, something should change.
References
[1] R. Abbot, "An onion a day keeps the NSA away," J. Internet Law, vol. 13, no. 11, pp. 22-28, 2010.
[2] T. Allmer, C. Fuchs, V. Kreilinger, and S. Sevignani, "Social networking sites in the surveillance society," in Media, Surveillance, and Identity: Social Perspectives, A. Jansson and M. Christensen, Eds. New York, NY: Peter Lang, 2014, pp. 49-70.
[3] M. Arrington, "Google acquires AdMeld for $400 million," TechCrunch, June 6, 2011; http://techcrunch.com/2011/06/09/googleacquires-admeld-for-400-million/, accessed Oct. 10, 2015.
[3] C. Baraniuk, "Ashley Madison: Suicides over website hack," BBC, Aug. 24, 2015; http://www.bbc.com/news/technology-34044506, accessed Oct. 10, 2015.
[4] J. Caramujo and A.M. Rodrigues Da Silva, "Analyzing privacy policies based on a privacy-aware profile: The Facebook and LinkedIn case studies," in Proc. 2015 IEEE 17th Conf. Business Informatics (CBI), 2015, pp. 77-84.
[5] X. Chen and K. Michael, "Privacy issues and solutions in social network sites," IEEE Technology and Society Mag., vol. 31, no. 4, pp. 43-53, Dec. 2012.
[6] K. Collins-Thompson and J. Callan, "Predicting reading difficulty with statistical language models," J. American Society for Information Science & Technology, vol. 56, pp. 1448-1462, 2005.
[7] M. Duggan, N.B. Ellison, C. Lampe, A. Lenhart, and M. Madden, "Social media update 2014," Pew Research Center, Jan. 9, 2015; http://www.pewinternet.org/2015/01/09/frequency-of-social-mediause-2/, accessed Nov. 7, 2015.
[8] C. Dwyer, "Privacy in the age of Google and Facebook," IEEE Technology and Society Mag., vol. 30, no. 3, pp. 58-63, Sept. 13, 2011.
[9] Facebook, "Cookies, pixels, & similar technologies," 2015; https://www.facebook.com/help/cookies/update, accessed Oct. 10, 2015.
[10] Facebook, "Data policy," Jan. 30, 2015; https://www.facebook.com/privacy/explanation, accessed Oct. 10, 2015.
[11] Facebook, "Facebook ads," 2015; https://www.facebook.com/settings?tab=ads&view, accessed Oct. 10, 2015.
[12] Federal Trade Commission, "FTC releases survey of identity theft in the U.S. study shows 8.3 million victims in 2003," Nov. 27, 2007; https://www.ftc.gov/news-events/press-releases/2007/11/ftc-releases-survey-identity-theft-usstudy-shows-83-million, accessed Oct. 10, 2015.