
Big Data Technology

20 Page Rank

Cyrus Lentin
Page Rank

History
What Is Search Optimization?
PageRank (Two different notations)
How is PageRank Calculated?
Effects Of The Links
Advantages
Limitations

Big Data Technology - Cyrus Lentin 1


Searching

What Is Searching?
Trying To Find Something By Looking.
When Searching On The Web, We Can't Find A Specific Thing Simply By Looking
A Huge Volume Of Data, Files, Directories And Content Is Present On The Web
So We Need A Tool To Search For The Required Content On The Web
That Tool Is A Search Engine.
A Search Engine Is A Software System That Is Used To Search For Information On The World Wide Web.
Examples Are Google, Bing, Yahoo, Etc.



Search Engine Optimization

Search Engine Optimization (SEO) Is The Process Of Affecting The Visibility Of A Website Or A Web
Page In A Search Engine's Results
Optimization Techniques Differ From One Search Engine To Another.
The Better The Technique A Search Engine Uses, The More Visitors It Attracts, And The Better It Is
Considered To Be



Types Of Algorithms

Text-based Ranking Algorithm
The ranking scheme used in conventional search engines is purely text-based, i.e. pages are
ranked by their textual content and the number of terms that match the query string, which
seems logical
HITS (Hyperlink Induced Topic Search)
Hyperlink-Induced Topic Search (HITS; also known as hubs and authorities) is a link analysis
algorithm that rates Web pages, developed by Jon Kleinberg. A good hub is a page that points
to many other pages, and a good authority is a page that is linked to by many different hubs
SALSA
The Stochastic Approach for Link-Structure Analysis. Probabilistic extension of the HITS algorithm.
Page Rank Algorithm
PageRank works by counting the number and quality of links to a page to determine a rough
estimate of how important the website is. The underlying assumption is that more important
websites are likely to receive more links from other websites

Search Engines Use A Combination Of Algorithms, Not Just One
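
The text-based scheme listed above can be illustrated with a minimal sketch. It assumes the simplest possible scoring, counting how many words of a page match the query; the function name `text_score` and the two sample pages are hypothetical, and real engines use far richer weighting.

```python
def text_score(query, document):
    """Count how many words in the document match a query term."""
    query_terms = set(query.lower().split())
    doc_terms = document.lower().split()
    return sum(1 for term in doc_terms if term in query_terms)


# Hypothetical page contents, for illustration only.
pages = {
    "page1": "page rank is a link analysis algorithm",
    "page2": "search engines rank pages for a search query",
}

# Rank pages by the number of matched terms, highest first.
ranked = sorted(
    pages,
    key=lambda name: text_score("search rank", pages[name]),
    reverse=True,
)
```

Here `page2` is ranked first because three of its words match the query, against one for `page1`. Such a score depends only on page text, which is why link-based algorithms like HITS and PageRank are combined with it.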



Page Rank Algorithm

In PageRank, the word "Page" does not refer to web pages, even though the algorithm is used for ranking pages
The PageRank algorithm was originally developed at Stanford University by Larry Page in 1996 as part of
a research project about a new search engine, and it is named after him
PageRank is an algorithm used by the Google web search engine to rank websites in their search
engine results
The PageRank algorithm does not rank a whole website; it is determined for each page
individually
PageRank is a link analysis algorithm and it assigns a numerical weighting to each element of a
hyperlinked set of documents, with the purpose of "measuring" its relative importance.
The algorithm may be applied to any collection of entities with reciprocal quotations and references.
The numerical weight that it assigns to any given element A is referred to as the PageRank of A and
denoted by PR(A).
Other factors like Author Rank can contribute to the importance of an entity.



Page Rank Formula

Formula for calculating the PageRank of a web page:

PR(A) = (1 - d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))

Where:
PR(A) = PageRank of page A
T1 ... Tn = all pages that link to page A
PR(Ti) = PageRank of page Ti
C(Ti) = the number of pages to which Ti links
d = damping factor, which can be set between 0 and 1
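
Because each PR value depends on the PR values of the linking pages, the formula is usually evaluated by repeated substitution. The sketch below is one way to do that, under stated assumptions: every page starts at 1.0, d defaults to 0.85 (a commonly quoted value, not given on the slide), the iteration count is fixed at 50, and every page has at least one outbound link. The three-page graph is hypothetical.

```python
def pagerank(links, d=0.85, iterations=50):
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over pages T linking to A.

    `links` maps each page to the list of pages it links to.
    Assumes every page has at least one outbound link.
    """
    pages = list(links)
    pr = {page: 1.0 for page in pages}  # assumed starting value
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Sum PR(T) / C(T) over every page T that links to `page`.
            inbound = sum(
                pr[src] / len(links[src])
                for src in pages
                if page in links[src]
            )
            new_pr[page] = (1 - d) + d * inbound
        pr = new_pr
    return pr


# Hypothetical web: A links to B and C, B links to C, C links to A.
example_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(example_links)
```

In this example C ends up with the highest value, since it receives all of B's vote and half of A's, while B receives only half of A's vote and ends up lowest.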



Page Rank Image



H I T S Algorithm

HITS stands for Hyperlink-Induced Topic Search and is used for rating and ranking
websites based on link information when identifying topic areas
Clever builds on the HITS algorithm developed at IBM's Almaden
Research Lab in San Jose, CA
Unlike PageRank, which is a static ranking algorithm, HITS is search query dependent. Thus, the ranking of
a web page is decided by analysing its textual contents against a given query
The algorithm produces two types of pages:
Authority: pages that provide important information on the topic
Hub: pages that contain links to authorities
In this algorithm a web page is called an authority if many hyperlinks point to it
A web page is called a hub if it points to many other pages
HITS is a topic-specific search. First, a subset of web pages containing good hub and authority
pages with respect to a query is created
This is done by first running the query and getting an initial set of documents relevant to it.
This is called the root set for the query
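
Once a set of pages for the query is available, hub and authority scores reinforce each other: a page's authority score is the sum of the hub scores of the pages linking to it, and its hub score is the sum of the authority scores of the pages it links to. The following is a minimal sketch of that iteration, assuming a fixed number of rounds and Euclidean normalisation after each one; the four-page link graph and its page names are hypothetical.

```python
import math


def hits(links, iterations=20):
    """Compute hub and authority scores for a small link graph.

    `links` maps each page to the list of pages it links to;
    every link target must itself be a key of `links`.
    """
    pages = list(links)
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority: sum of hub scores of the pages pointing to this page.
        auth = {p: sum(hub[q] for q in pages if p in links[q]) for p in pages}
        # Hub: sum of authority scores of the pages this page points to.
        hub = {p: sum(auth[q] for q in links[p]) for p in pages}
        # Normalise so the scores do not grow without bound.
        a_norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        auth = {p: v / a_norm for p, v in auth.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return hub, auth


# Hypothetical pages gathered for one query.
example_links = {
    "portal": ["docs", "blog"],
    "directory": ["docs"],
    "docs": [],
    "blog": [],
}
hub_scores, auth_scores = hits(example_links)
```

In this graph `docs` is the strongest authority because both linking pages point to it, and `portal` is the strongest hub because it links to both authorities.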





Precision & Recall

Precision and recall are the basic measures used in evaluating search strategies.
As shown in the first two figures on the left, these measures assume:
There is a set of records in the database which is relevant to the search topic
Records are assumed to be either relevant or irrelevant (these measures do not allow for degrees of
relevancy).
The actual retrieval set may not perfectly match the set of relevant records.



Recall & Precision

RECALL
is the ratio of the number of relevant records
retrieved to the total number of relevant
records in the database. It is usually expressed
as a percentage

PRECISION
is the ratio of the number of relevant records
retrieved to the total number of irrelevant and
relevant records retrieved. It is usually
expressed as a percentage
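
Both ratios can be computed directly from the set of retrieved records and the set of relevant records. The sketch below assumes records are identified by simple IDs; the function name `precision_recall` and the sample IDs are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) as fractions between 0 and 1."""
    retrieved, relevant = set(retrieved), set(relevant)
    relevant_retrieved = retrieved & relevant
    precision = len(relevant_retrieved) / len(retrieved) if retrieved else 0.0
    recall = len(relevant_retrieved) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: 4 records retrieved, 3 of them relevant,
# out of 6 relevant records in the whole database.
precision, recall = precision_recall(
    retrieved=["d1", "d2", "d3", "d7"],
    relevant=["d1", "d2", "d3", "d4", "d5", "d6"],
)
```

For this search precision is 3/4 = 75% and recall is 3/6 = 50%: most of what was retrieved is relevant, but half of the relevant records were missed.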



Search Engine Algorithm Parameters

A Search Engine's Efficiency And The Effectiveness Of Its Page Ranking Algorithm Depend On Many
Parameters. The Basic Ones Are The Following:
On Page
Off Page
Site
Domain



On-Page Factors
Keyword In The Title Tag.
The Title Tag Is One Of The Strongest Relevancy Signals For A Search Engine. Including A
Keyword In It Will Indicate To The Search Engine What To Rank The Page For.
Keyword In Meta Description Tag.
The Importance Of The Meta Description Tag Today Is Often Discussed In SEO Circles. It Is
Nonetheless Still A Relevancy Signal.
The Length Of The Content.
These Days Searchers Want To Be Educated And Won't Be Satisfied With Basic Information. Google,
Therefore, Looks For Authoritative And Informative Content To Rank First.
Duplicate Content
Not All Factors Can Influence Your Rankings In A Positive Way. Having Similar Content Across Various
Pages Of Your Site Can Actually Hurt Your Rankings. Avoid Duplicating Content And Write Original
Copy For Each Page.
Canonical Tag
Sometimes, Having Two URLs With Similar Content Is Unavoidable. A Canonical Tag Tells Google That
One URL Is Equivalent To Another, Clearly Stating That The Two Pages Are In Fact One.
Content Updates
Google's Algorithm Prefers Freshly Updated Content. It Does Not Mean That You Have To Edit Your
Pages All The Time. It Is Wise To Update Content Once Every 12 Months Or So.
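
The title tag, meta description and canonical tag discussed above all live in a page's HTML head, so they can be checked programmatically. The sketch below uses Python's standard `html.parser` module; the class name `OnPageSignals`, the sample HTML and the `example.com` URL are hypothetical, and the code only extracts the tags without judging their quality.

```python
from html.parser import HTMLParser


class OnPageSignals(HTMLParser):
    """Collect the title, meta description and canonical URL of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.canonical = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content")
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# Hypothetical page, for illustration only.
sample_html = """<html><head>
<title>PageRank Explained</title>
<meta name="description" content="How PageRank scores web pages.">
<link rel="canonical" href="https://example.com/pagerank">
</head><body><p>Body text</p></body></html>"""

parser = OnPageSignals()
parser.feed(sample_html)
```

After parsing, `parser.title`, `parser.description` and `parser.canonical` hold the three values, or remain empty or `None` when a tag is missing, which makes absent tags easy to spot.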



On-Page Factors

Outbound Links
Linking To Authoritative Pages Sends Trust Signals To The Search Engine. You Would Send A User To
Another Site Because You Wanted Them To Learn More About The Subject
Back Links
Pages Of Other Sites Linking To A Page Of Your Website Send Trust Signals To The Search Engine.
Other Users Come To Your Website To Learn More About The Subject
Internal Links
Interlinking Pages On Your Site Can Pass Their Strength Between Them



Site Factors

Sitemap
A Sitemap Helps A Search Engine To Index All Pages On Your Site. It Is The Simplest And Most Efficient
Way To Tell Google What Pages Your Website Includes
Domain Trust
Trust Matters. It's Hard Not To Think That Sites Google Trusts Should Rank Higher. Check If The IP
Address Of Your Domain Is Clean (Not Blacklisted For Nefarious Activities)
Server Location
Some SEOs Believe That A Server's Location Helps To Boost Rankings For That Particular Country Or
Region
Mobile Optimized Site
Only A Year Ago, 46% Of Searchers Used Mobile Exclusively To Research. I Believe This Number
Increased Exponentially In The Last 12 Months. It Would Be No Surprise Then That Having A Mobile
Optimized Site Would Affect Rankings In Some Way
Google Search Console Integration
Lastly, Having Your Site Verified At Google Search Console (Formerly Google Webmaster Tools) Is Said To Help With Your Site's Indexing.
Even If That's Not The Case, The Tool Provides Valuable Data You Can Use To Optimize Your Site Better



Off Page Factors

The Number Of Linking Domains
The Number Of Domains Linking To You Is One Of The Most Important Ranking Factors.
The Number Of Linking Pages
There Might Be Some Links From A Particular Domain To Your Site; Their Number Is A Ranking Factor Too.
However, It Is Still Better To Have More Links From Multiple Domains Rather Than From A Single Domain
Domain Authority Of Linking Page
Not All Pages Are Equal. Links On Pages With Higher Domain Authority Will Be A Bigger Factor Than Those
On Low Authority Domains. You Should Strive To Build Links From High Domain Authority Websites
Link Relevancy.
Some SEOs Believe That Links From Pages Related To Your Page's Topic Carry More Relevancy For Search
Engines
Authority Of Linking Domain
The Authority Of A Domain May Be A Ranking Factor Too. For That Reason, A Link From A Low Authority Page
On A High Authority Site Will Be Worth More Than One From A Lower Authority Domain
Contextual Links.
It Is Said That Links Within The Content Of The Page Are Worth More Than Links In A Sidebar For Instance
Link Anchor
Anchor Text Of A Link Used To Be A Strong Ranking Factor. Today It Can Be Utilized As A Web Spam
Indicator, Negatively Impacting Your Rankings
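
The difference between the number of linking pages and the number of linking domains can be made concrete with a short sketch. It assumes a list of backlink URLs is already available and treats each host name as a separate domain, which is a simplification; the URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical backlinks pointing at one site.
backlinks = [
    "https://news.example.org/story-1",
    "https://news.example.org/story-2",
    "https://blog.example.net/review",
]

# Group linking pages by host name (simplification: host name == domain).
links_per_domain = Counter(urlparse(url).netloc for url in backlinks)

linking_pages = len(set(backlinks))
linking_domains = len(links_per_domain)
```

Here three pages link to the site but they come from only two domains, so the second link from `news.example.org` adds a linking page without adding a linking domain.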



Domain Factors

Domain Registration Length.
Google Considers Domains Registered For Longer Than A Year As More Trustworthy.
Domain History
You May Not Be The First Person Who Registered The Domain. And If Your Domain Has Been
Penalized In The Past, Its History Might Affect Its Current Rankings
Country TLD Extension
If You Try To Target A Particular Local Market, It Is Said That Having A Domain With A Country-Specific
TLD (.pl, .co.uk Or .ie For Instance) Will Help To Achieve Better Rankings For That Location



Conclusion

To optimize search, we require a better ranking algorithm.
On the basis of this study we conclude that PageRank and HITS are different link
analysis algorithms that employ different models to calculate web page rank
PageRank is the more popular algorithm and is used as the basis for the very popular Google search engine
This popularity is due to features such as efficiency, feasibility, lower query-time cost and lower
susceptibility to localized links, which are absent in the HITS algorithm.
Although the HITS algorithm itself has not been very popular, different extensions of it
have been employed in a number of different web sites.



Page Rank Tools

Open Admin Tools


Alexa
Google Analytics



Thank you!
Contact:
Cyrus Lentin
cyrus@lentins.co.in
+91-98200-94236

