
Stephenson 1

Name: Crystal Stephenson

Class: Fundamentals of Information Science and Technology

Assignment: IT Applications

Due Date: March 25, 2018

“Search Engine Bias: A Comparison of Tools”

With the constant evolution of technological innovation, we are increasingly inundated

with a mass spread of information on the World Wide Web (WWW), for which search engines

have come to the rescue in disseminating and distinguishing what is most relevant or useful to

our inquiries. But although “search and discovery engine algorithms are often perceived as insightful,

objective, and neutral” (Cleverley, 2017, pg. 13), the phenomena of manipulation and

algorithmic bias have called into question the value of our query results. Since “different search

engines weigh different parts of their algorithms differently, they can all have similar relevancy

while having significantly different search results” (Norman & Millind, 2016, pg. 2). It is

therefore imperative that, when comparing the variety of search engines available to us, we

consider bias embedded within them, which “occurs when a search engine’s algorithms give

preference to certain information” (Cleverley, 2017, pg. 13). As Google is the most dominant

search tool today, I will compare the results it provides with those of the more

academic search engine, RefSeek, in an experiment intended both to show the distinctive differences

and to offer some reasons for such contrasting results.

For my experiment, I chose to search the words “social media analytics” in a general

Google query before doing the same on RefSeek. I did not use any Boolean operators, such as

“and” or “not,” in my entry. As a preliminary overview, I have

included two comparative tables below to highlight the differences in my results.



Although both search tools presented the same number of advertisements (four, to be

exact), the ads on Google were far more prominent and distracting than those listed by RefSeek.

And whereas Google suggested links for marketing one’s business or links geared to those seeking to

analyze their webpages and use analytics to grow traffic

and influence, RefSeek took a more scholarly route, producing informative links pertinent to

social media analytics, such as classes offered for furthering one’s education in the field.

Notably, the link to Wikipedia on the subject matter was listed fourth on Google, while

Wikipedia was the first link proposed by RefSeek, which I believe illustrates the distinction in a

nutshell.

The obvious differences in query results are not surprising and can be explained by a

multitude of factors, the first of which is algorithmic bias. Like the plethora of social media

platforms available today, Google is guilty of invisible, algorithmic editing of the web.

According to Eli Pariser, “there is no standard Google anymore” (Pariser, 2011). When two

individuals search the exact same term(s) on Google, for instance, they will inevitably retrieve

different results tailored to their specific “filter bubble.” Furthermore, even when logged out,

Pariser warns, there “are 57 signals that Google looks at” (Pariser, 2011), including “everything

from what kind of computer you’re on to what kind of browser you’re using to where you’re

located” in order to “personally tailor your query results.” The problem, Pariser points out, is that

“if you take all of these filters together, you take all these algorithms,” you inevitably find

yourself trapped in a “filter bubble,” made up of “your own personal, unique universe of

information that you live in online.” Unfortunately, “what’s in your filter bubble depends on who

you are, and it depends on what you do,” but most importantly, “you don’t decide what gets in”

or “actually see what gets edited out” (Pariser, 2011). There are no embedded ethics as of yet, so

“if algorithms are going to curate the world for us, if they’re going to decide what we get to see

and what we don’t get to see, then we need to make sure that they’re not just keyed to relevance”

(Pariser, 2011), but rather “show us things that are uncomfortable or challenging” and other

points of view. That is to say, “we need it to introduce us to new ideas and new people and

different perspectives” (Pariser, 2011), but “it's not going to do that if it leaves us all isolated in a

Web of one.”

Matt Cutts, an engineer at Google, explains, “when you do a Google search, you aren’t

actually searching the Web, you’re searching Google’s index of the Web, or at least as much of it

as” Google can find (“Web Search”, 2011). Google builds this index through the use of “spiders”

(“Web Search”, 2011) and will then “narrow down hundreds of thousands of results” by asking upwards of two hundred

questions, such as how many times your key words show up in the page and where the webpage

ranks in validity and importance. Page rank is vital, because it “is still the most revealing and

critical metric that governs a domain’s ability to rank” (Norman & Millind, 2016, pg. 1). As

visibility is the primary goal for businesses and services, “the key to receiving traffic through

Google is to gain first page rankings,” since “first page websites get 91.5% of Google traffic”

(Norman & Millind, 2016, pg. 2). Google, in particular, is predominantly “focused on site age

and link based authority.” Other search engines may center their attention on “on-the page

content” or “local communities” (Norman & Millind, 2016, pg. 2), while still others, like RefSeek,

give priority to academic or scholarly results.
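
To illustrate how link-based authority can translate into a ranking, the following is a minimal sketch of the textbook PageRank iteration in Python. It is not Google's production algorithm, which weighs many more signals. The function name `pagerank`, the damping factor of 0.85, the iteration count, and the four-page link graph are all assumptions chosen purely for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate a rank for each page from its incoming links.

    `links` maps a page name to the list of pages it links to.
    All names and default values here are illustrative assumptions.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    # Start with rank spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly among its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical link graph: page "a" is linked to by both "c" and "d".
    example_links = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
    ranks = pagerank(example_links)
    for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
        print(page, round(score, 3))
```

In this hypothetical graph, the page with the most incoming links ends up with the highest score, while the page that nothing links to ends up with the lowest, which is the intuition behind ranking by link authority.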

Meanwhile, Christopher Wagner acknowledges how the “prevalent use of search engines

has generated extensive research into improving the speed and accuracy of searches” (Wagner,

2014, pg. iii). However, these algorithmic performance improvements are designed to “predict

many different aspects of user behavior” (Wagner, 2014, pg. iii) by evaluating search histories

and sequencing of resources, as well as statistical and collaborative behavior models, but these

recommendations are often skewed, as exemplified in my comparative exercise. For instance, the

“target user is compared to other users to identify the resources that follow the target user’s

recent search history” (Wagner, 2014, pg. 4). In my Google search for social media analytics, the

results were almost all marketing in nature, glorified advertisements geared towards those who

seek to optimize their pages and profiles in an effort to improve traffic and monitor activity and

influence, none of which were relevant to my personal use or search history, whilst RefSeek

offered educational links directed at defining and improving my understanding of the key words.

What is most troubling is that by clicking on the links to identify and relay their content, I have

inadvertently redefined my search history. “Users will tend to select resources that are shown to

them” (Wagner, 2014, pg. 5), those most popular and displayed most often, and as we keep

clicking on those selected for us, our “variety of immediately available resources will shrink.”
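
The idea of comparing a target user to other users can be sketched in a few lines of Python. This is a simplified illustration of history matching, not the algorithm Wagner develops in his dissertation. The function name `recommend_next`, the `window` and `top_n` parameters, and the sample histories are hypothetical.

```python
from collections import Counter


def recommend_next(target_history, other_histories, window=2, top_n=3):
    """Suggest resources that other users chose after the same recent sequence.

    Illustrative assumption: a "match" means another user's history contains
    the target user's last `window` resources in the same order.
    """
    recent = tuple(target_history[-window:])
    followers = Counter()
    for history in other_histories:
        # Slide over each history and look for the target's recent sequence.
        for i in range(len(history) - window):
            if tuple(history[i:i + window]) == recent:
                # Count the resource that came right after the match.
                followers[history[i + window]] += 1
    # The most frequently followed resources are recommended first.
    return [resource for resource, _ in followers.most_common(top_n)]


if __name__ == "__main__":
    # Hypothetical click histories for three other users.
    others = [
        ["home", "analytics tools", "pricing"],
        ["blog", "home", "analytics tools", "pricing"],
        ["home", "analytics tools", "tutorial"],
    ]
    print(recommend_next(["search", "home", "analytics tools"], others))
```

Because the suggestions are drawn only from what other users already clicked, a sketch like this also shows how popular resources can crowd out less visited ones over time.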

Another “liability of search engines for algorithmically produced search suggestions” is

exhibited “through Google’s ‘autocomplete’ function” (Karapapa & Borghi, 2015, pg. 261). “This

technical feature, which is now commonly provided by all search modules, means to speed up

the process of entering a query by ‘suggesting’ the word(s) that a user would type before the user

actually finishes entering the query in the search bar” (Karapapa & Borghi, 2015, pg. 262), and

“may be automatically completed or associated with other words based on a complex algorithm.”

The main factors that determine “the algorithm are the popularity of searches made by the

Internet users and of the web pages indexed by the search engine,” whereas other objective

factors include “the user’s geographical location and their prior search history” (Karapapa &

Borghi, 2015, pg. 262), as touched on earlier. Interestingly enough, the autocomplete or word

completion feature is said to have been “designed to assist people with physical disabilities to

increase the typing speed” (Karapapa & Borghi, 2015, pg. 264), but was eventually “applied in search

engines and other software – databases, web browsers, email programs, word processors – to

facilitate typing the ‘right’ word(s) when submitting a search query,” and now “there is

practically no interactive software applied to computers or smart devices that does not

incorporate an autocomplete function as default” (Karapapa & Borghi, 2015, pg. 264). The

autofill feature can also be distracting and inadvertently redirect one’s focus through suggestion,

thus influencing one’s results in the process.
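
As a rough illustration of how the popularity of past searches can drive suggestions, the sketch below ranks completions for a typed prefix by how often each query appears in a log. It is an assumption-laden toy, not the “complex algorithm” any real search engine uses. The function name `autocomplete`, the `limit` parameter, and the sample query log are hypothetical, and factors such as location and prior search history are left out.

```python
from collections import Counter


def autocomplete(prefix, query_log, limit=3):
    """Return the most frequently logged queries that start with `prefix`."""
    # Count how often each (lowercased) query was searched.
    counts = Counter(query.lower() for query in query_log)
    prefix = prefix.lower()
    matches = [(query, count) for query, count in counts.items()
               if query.startswith(prefix)]
    # Most popular first; ties are broken alphabetically for stable output.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [query for query, _ in matches[:limit]]


if __name__ == "__main__":
    # Hypothetical log in which marketing queries outnumber analytics queries.
    log = [
        "social media analytics",
        "social media marketing",
        "social media marketing",
        "social security",
        "search engine bias",
    ]
    print(autocomplete("social m", log))
```

In this toy log the more popular marketing query is suggested ahead of the analytics query, which shows how a suggestion can nudge a user toward what others searched rather than what the user set out to find.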

With billions of web pages and new content produced daily, “the use of search engines is

becoming a primary Internet activity” (Dwivedi et al., 2009, pg. 63), and these tools “have

developed increasingly clever ranking algorithms in order to constantly improve their quality.”

However, “there are still many open research areas of tremendous interest where the quality of

search results can be improved” (Dwivedi et al., 2009, pg. 63). While Google has been the most

successful method of information retrieval, and has therefore “invoked a lot of research focus on

web structure mining algorithms” (Dwivedi et al., 2009, pg. 60-61), we see how issues of

ranking, user history, “filter bubble” bias, autocomplete functions, and predictive measures can

alter one’s results dramatically between search engines. I have demonstrated how Google

produced more marketing-friendly results for my query on social media analytics, whilst academic

tools such as RefSeek managed to provide more useful and scholarly results for my perusal, and

in doing so I have indicated some reasons as to how and why this may occur. With “advances in

digital data collection and storage technologies” (Kou & Lou, 2010, pg. 123), search engines

have significantly aided in disseminating information for us, but adequately assisting users in

locating “the most relevant web pages from the vast text collections efficiently” has continued to

be a “big challenge” (Kou & Lou, 2010, pg. 123). To be sure, the “extent of information

available to users is unprecedented” (Wagner, 2014, pg. 1), and “identifying and locating a

specific resource becomes a gargantuan task for which search engines are a necessary tool”

(Wagner, 2014, pg. 1). With our dependency on these tools in mind, it has been a valuable

exercise to compare the results of different search tools. In my experience, it

would be most prudent to consult a variety of search tools when researching topics in

which the content is vital, rather than restrict our avenues of inquiry, as no two search

engines are created equal.



References

Cleverley, Paul. "Search Algorithms: Neutral or Biased?" Online Searcher, vol. 41, no. 5,

Sep/Oct 2017, pp. 12-17. EBSCOhost,

ezproxy.lib.usf.edu/login?url=http://search.ebscohost.com/login.aspx?direct=true&db=ai&AN=125212292&site=eds-live.

Dwivedi, Nripendra, Lata Joshi and V. P. Gupta. "Improved Ranking Algorithm of Web

Page (Based on Age of Page) for Web Search Engines." IUP Journal of Science &

Technology, vol. 5, no. 3, Sept. 2009, pp. 59-63. EBSCOhost,

ezproxy.lib.usf.edu/login?url=http://search.ebscohost.com/login.aspx?direct=true&db=ai&AN=44623173&site=eds-live.

Karapapa, Stavroula and Maurizio Borghi. "Search Engine Liability of Autocomplete

Suggestions: Personality, Privacy and the Power of Algorithm." International Journal of

Law and Information Technology, no. 3, 2015, p. 261. EBSCOhost,

ezproxy.lib.usf.edu/login?url=http://search.ebscohost.com/login.aspx?direct=true&db=edshol&AN=hein.journals.ijlit23.20&site=eds-live.

Kou, Gang and Chunwei Lou. "Multiple Factor Hierarchical Clustering Algorithm for Large

Scale Web Page and Search Engine Clickstream Data." Annals of Operations Research,

no. 1, 2012, p. 123. EBSCOhost,

ezproxy.lib.usf.edu/login?url=http://search.ebscohost.com/login.aspx?direct=true&db=edsbl&AN=RN316339310&site=eds-live.

Norman, Dora and Prakash Millind. "Page Rank and Trust Rank Algorithms of Search

Engine." 4D International Journal of Management & Science, vol. 7, no. 1, July 2016,

pp. 1-9. EBSCOhost,

ezproxy.lib.usf.edu/login?url=http://search.ebscohost.com/login.aspx?direct=true&db=aci&AN=116595137&site=eds-live.

Pariser, Eli (2011, March). Beware online "filter bubbles". Retrieved March 05, 2018, from

https://www.ted.com/talks/eli_pariser_beware_online_filter_bubbles/transcript

Wagner, Christopher Shaun. Proactive Search: Using Outcome-Based Dynamic

Nearest-Neighbor Recommendation Algorithms to Improve Search Engine Efficacy. Dissertation

Abstracts International: Section B: The Sciences and Engineering, vol. 77, ProQuest

Information & Learning, 2014. EBSCOhost,

ezproxy.lib.usf.edu/login?url=http://search.ebscohost.com/login.aspx?direct=true&db=psyh&AN=2016-53067-076&site=eds-live.

“Web Search Strategies for Research.” (2011, April 04). YouTube. Uploaded by Potterdaniel789.

Retrieved March 25, 2018, from https://youtu.be/tJQo_pw74ZY.
