PageRank
PageRank is a link analysis algorithm that assigns a
numerical weighting to each element of a hyperlinked set of
documents, such as the World Wide Web, with the purpose of
"measuring" its relative importance within the set. The
algorithm may be applied to any collection of entities with
reciprocal quotations and references. The numerical weight
that it assigns to any given element E is also called
the PageRank of E and denoted by
PR(E).
The name PageRank is a trademark of Google. The
PageRank process has been patented (U.S. Patent
6,285,999). The patent is assigned not to Google
but to Stanford University.
General description of PR
Google describes PageRank:
“PageRank relies on the uniquely democratic nature
of the web by using its vast link structure as an
indicator of an individual page's value. In
essence, Google interprets a link from page A to
page B as a vote, by page A, for page B. But,
Google looks at more than the sheer volume of
votes, or links a page receives; it also analyzes
the page that casts the vote. Votes cast by pages
that are themselves "important" weigh more heavily
and help to make other pages "important".”
In other words, a page's PageRank results from a "ballot" among
all the other pages on the World Wide Web about how important
that page is. A hyperlink to a page counts as a vote of support.
The PageRank of a page is defined recursively and depends on
the number and PageRank of all pages that link to it ("incoming
links"). A page that is linked to by many pages with high
PageRank receives a high rank itself. If there are no links to
a web page, there is no support for that page.
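The recursive definition can be illustrated with a minimal sketch. It assumes the commonly cited damping-factor formulation, PR(p) = (1 - d)/N + d * sum of PR(q)/L(q) over the pages q that link to p, where L(q) is the number of outgoing links of q. This section states neither that formula nor the value d = 0.85. The function name `pagerank`, the handling of pages without outgoing links, and the four-page graph are illustrative assumptions, not Google's implementation.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Iteratively approximate PageRank for a small link graph.

    links maps each page to the list of pages it links to.
    damping, tol and max_iter are assumed values, not taken from the article.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution

    for _ in range(max_iter):
        # Assumption: rank held by pages with no outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}

        # Each page splits its current rank equally among the pages it links to.
        for page in pages:
            targets = links.get(page, [])
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)

        converged = sum(abs(new_rank[p] - rank[p]) for p in pages) < tol
        rank = new_rank
        if converged:
            break
    return rank


# Hypothetical four-page web: D has no incoming links, C has the most.
example_links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
ranks = pagerank(example_links)
```

In this example page C, which collects links from A, B and D, ends up with the highest score, while D, which nobody links to, keeps only the baseline share (1 - d)/N.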
Google assigns a numeric weighting from 0 to 10 to each web page
on the Internet; this PageRank denotes the page's importance in
the eyes of Google. The scale is logarithmic, like the Richter
scale, and is roughly based on the quantity of inbound links as
well as the importance of the pages providing those links.
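What a logarithmic 0 to 10 scale implies can be sketched as follows. The actual mapping is not given here, so the function name `toolbar_score`, the base of 8 and the reference value `min_rank` are all hypothetical. The only point is that each step up the scale corresponds to multiplying the underlying score by a constant factor.

```python
import math


def toolbar_score(raw_rank, min_rank, base=8.0):
    """Map a raw score to an integer from 0 to 10 on a logarithmic scale.

    Hypothetical mapping: base and min_rank are assumed, not published values.
    """
    if raw_rank <= 0 or min_rank <= 0:
        raise ValueError("ranks must be positive")
    # One step on the scale equals a factor of `base` in the raw score.
    level = math.floor(math.log(raw_rank / min_rank, base))
    return max(0, min(10, level))  # clamp to the 0-10 range


low = toolbar_score(100.0, 1.0)     # raw score 100 times the reference
higher = toolbar_score(1000.0, 1.0)  # ten times more raw score than `low`
capped = toolbar_score(1e12, 1.0)    # very large raw scores saturate at 10
```

Under these assumed parameters, a tenfold increase in the raw score moves a page up by only about one step, which is the sense in which the scale resembles the Richter scale.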
Numerous academic papers concerning PageRank have been
published since Page and Brin's original
paper.[2]
In practice, the PageRank concept has proven to be vulnerable
to manipulation, and extensive research has been devoted to
identifying falsely inflated PageRank and ways to ignore links
from documents with falsely inflated PageRank.
Alternatives to the PageRank algorithm include the HITS
algorithm proposed by Jon Kleinberg, the IBM CLEVER project
and the TrustRank algorithm.
From Wikipedia, the free encyclopedia