PageRank is a numerical score that represents a webpage's importance on the internet. It's a way to measure how well a webpage is connected to other high-quality websites.
In simple terms, PageRank is a measure of a webpage's authority and relevance. A high PageRank score indicates that a webpage is trustworthy and valuable to users.
The PageRank algorithm takes into account the number and quality of links pointing to a webpage. The more high-quality links a webpage has, the higher its PageRank score will be.
What is PageRank?
PageRank is Google's system of counting link votes and determining which pages are most important based on them. It considers links to be like votes, and some votes are more important than others.
Google views a link from page A to page B as a vote, by page A, for page B. This is the core idea behind PageRank.
The number of votes, or links, a page receives isn't the only factor considered. Google also looks at the page that casts the vote and weighs it more heavily if it's an important page.
Important pages receive a higher PageRank, which Google remembers each time it conducts a search. This means that if your webpage has a high PageRank, it's more likely to show up in search results.
PageRank relies on the link structure of the web to determine a page's value. It's not just about the number of links, but also about the quality of those links.
History and Evolution
Google's founders, Brin and Page, were working on information retrieval methods at Stanford in the late 1990s.
Using links to determine a page's importance was a revolutionary way to order pages at that time. It was computationally difficult, but not impossible.
Institutional belief in the approach was strong enough that Google launched its search engine with no ability to earn revenue.
PageRank was the algorithm used by Google to rank pages in search engine results pages (SERPs).
The Evolution of PageRank
The idea of using links to determine a page's importance was revolutionary in the late 1990s.
Brin and Page were students at Stanford at the time, and they were exploring new information retrieval methods.
Google's approach had institutional belief behind it from the start, which let the business launch its search engine without the ability to earn revenue.
BackRub, as Google was known then, was a small player in the search engine world.
PageRank was the algorithm that ranked pages in the search engine results pages, and it was a game-changer.
Google's approach was computationally difficult, but it was not impossible to implement.
The institutional belief in Google's approach was so strong that it launched its search engine without a revenue-generating model.
How PageRank Has Changed
The original version of PageRank was replaced in 2006, according to a former Google employee. This was necessary due to the significant growth of the web from 1-10 million pages to over 150 billion pages.
The original PageRank algorithm was resource-intensive and required iterations to converge. It hasn't been used since 2006.
The replacement algorithm is faster to compute and produces approximately similar results. It's the number reported in the toolbar and what Google claims as PageRank.
Both algorithms are O(N log N), but the replacement has a smaller constant on the log N factor. This is because it eliminates the need for iterations until the algorithm converges.
Problems and Iterations
The PageRank formula has some challenges.
One of the issues is that if a page doesn't link out to any other page, the formula won't reach an equilibrium. In that case, the page's PageRank is distributed among every page on the internet.
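One way to picture the dead-end case is the sketch below. It assumes a common convention in which the score held by pages without outgoing links is shared equally among all pages on each pass. The function name `pagerank_with_dangling` and the three-page `web` are hypothetical, chosen only for illustration.

```python
def pagerank_with_dangling(links, d=0.85, iterations=100):
    """Iterate PageRank, spreading the score of dead-end pages over every page.

    links maps each page to the list of pages it links to (possibly empty).
    """
    pages = list(links)
    n = len(pages)
    pr = {p: 1.0 / n for p in pages}  # starting estimate
    for _ in range(iterations):
        # Total score currently held by pages with no outgoing links.
        dangling = sum(pr[p] for p in pages if not links[p])
        # Every page gets the base share plus an equal slice of the dangling score.
        new = {p: (1 - d) / n + d * dangling / n for p in pages}
        for page, outlinks in links.items():
            for target in outlinks:
                new[target] += d * pr[page] / len(outlinks)
        pr = new
    return pr


# Hypothetical network: C is a dead end with no outgoing links.
web = {"A": ["B"], "B": ["C"], "C": []}
ranks = pagerank_with_dangling(web)
```

With this redistribution the scores keep summing to 1 on every pass, so the dead-end page cannot drain score out of the network.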
Newer pages, despite being potentially more important, will have a lower PageRank, which means old content can accumulate a disproportionately high PageRank over time.
The time a page has been live is not factored into the algorithm, which can lead to older content being prioritized.
Fundamental Assumption
The fundamental assumption behind PageRank is that significant web pages are more likely to receive inbound links from other pages, which is a key component in understanding how PageRank works.
This assumption is based on the idea that a page's importance can be measured by the number and quality of links pointing to it.
How PageRank is Calculated
PageRank is calculated using a complex algorithm that takes into account the number of links pointing to a webpage and the number of links on that webpage.
The algorithm starts by giving every page on the internet an estimated PageRank score, which can be any number.
The PageRank score is then divided by the number of links on the page, resulting in a smaller fraction. This fraction is then distributed to the linked pages.
The new estimate for PageRank for each page is the sum of all the fractions of pages that link into each given page. This process is repeated until the PageRank scores reach a settled equilibrium.
Here's a simplified representation of the formula:
PR = (1 - d) / n + d * (sum of the fractions from all pages that link into the page)
Where:
- PR = PageRank in the next iteration of the algorithm.
- d = damping factor (set at 0.85).
- n = total number of pages on the internet.
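The steps above can be sketched in a few lines of Python. This is a minimal illustration, not Google's implementation: the function name `pagerank`, the tolerance `tol`, and the three-page `toy_web` are assumptions made for the example, and every page is assumed to have at least one outgoing link.

```python
def pagerank(links, d=0.85, tol=1e-10, max_iter=1000):
    """Repeat the simplified formula until the scores settle.

    links maps each page to the list of pages it links to.
    Assumes every page has at least one outgoing link.
    """
    pages = list(links)
    n = len(pages)
    pr = {p: 1.0 / n for p in pages}  # starting estimate; any value works
    for _ in range(max_iter):
        new = {p: (1 - d) / n for p in pages}
        for page, outlinks in links.items():
            share = pr[page] / len(outlinks)  # the "fraction" each link carries
            for target in outlinks:
                new[target] += d * share
        change = sum(abs(new[p] - pr[p]) for p in pages)
        pr = new
        if change < tol:  # scores have reached a settled equilibrium
            break
    return pr


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
scores = pagerank(toy_web)
```

In this toy network, C ends up with the highest score because it receives links from both other pages, and B ends up lowest because it receives only half of A's score.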
How It Works
Here's how PageRank is calculated:
Initially, every page on the internet is given an estimated PageRank score, which can be any number.
The formula for PageRank involves dividing the PageRank for a page by the number of links out of the page, resulting in a smaller fraction.
The PageRank is then distributed out to the linked pages, and the same is done for every other page on the Internet.
The algorithm repeats this process until the PageRank scores reach a settled equilibrium.
The resulting numbers are then generally mapped onto a more recognizable scale of 0 to 10 for convenience.
The formula for PageRank can be represented mathematically as:
$$PR(p) = \frac{1 - d}{n} + d \sum_{j} \frac{PR(j)}{C(j)}$$
Where:
- PR(p) = PageRank of page p in the next iteration of the algorithm.
- d = damping factor (typically set at 0.85).
- n = total number of pages on the internet.
- j = each page that links to page p (if every page had a unique number).
- PR(j) = current PageRank of page j.
- C(j) = total number of links out of page j.
The damping factor represents the probability of a user continuing to follow links, while the remaining probability accounts for the user jumping to a random page.
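The random-surfer reading of the damping factor can be illustrated with a small simulation. The sketch below is an assumption-laden illustration rather than how PageRank is computed in practice: `simulate_surfer`, the step count, and the three-page `small_web` are hypothetical. At every step the surfer follows a link with probability d and otherwise jumps to a random page.

```python
import random


def simulate_surfer(links, d=0.85, steps=200_000, seed=0):
    """Estimate page importance as the share of time a random surfer spends on each page."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    pages = list(links)
    visits = {p: 0 for p in pages}
    current = rng.choice(pages)
    for _ in range(steps):
        visits[current] += 1
        if links[current] and rng.random() < d:
            current = rng.choice(links[current])  # keep following links
        else:
            current = rng.choice(pages)  # jump to a random page
    return {p: count / steps for p, count in visits.items()}


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
small_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
visit_share = simulate_surfer(small_web)
```

The share of visits each page receives approximates the score the iterative formula settles on for the same network.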
Reasonable Surfer
PageRank is a complex algorithm, but one key concept that's worth understanding is the idea of a "reasonable surfer." This model suggests that the PageRank of a page might not be shared evenly with the pages it links out to.
The model weights the relative value of each link based on how likely a user might be to click on it. This is a more realistic approach, as users don't click on all links with equal probability.
In this approach, the value of a link is determined by how likely a user is to click on it. This is a more nuanced way of looking at link value, and it's more in line with how real users behave.
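The difference between an even split and a weighted split can be sketched as follows. The click likelihoods here are invented for illustration; how Google actually weights links is not public, and `distribute_rank` is a hypothetical helper.

```python
def distribute_rank(page_rank, link_weights):
    """Split one page's PageRank across its links in proportion to each link's weight."""
    total = sum(link_weights.values())
    return {target: page_rank * weight / total
            for target, weight in link_weights.items()}


# Hypothetical click likelihoods for three links on the same page.
weights = {"in_content_link": 0.7, "sidebar_link": 0.2, "footer_link": 0.1}

# Even split: every link passes on the same amount.
even = distribute_rank(0.3, {target: 1 for target in weights})

# Weighted split: links that are more likely to be clicked pass on more.
weighted = distribute_rank(0.3, weights)
```

Under the even split each link passes on a third of the page's score, while under the weighted split the prominent in-content link passes on most of it.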
The concept of a "reasonable surfer" was introduced by a pioneer in the field, who has gone on to found and lead several successful companies.
PageRank in SEO
PageRank is just one of many factors used to produce search rankings, and it's not the most important thing when it comes to ranking well on Google.
Google uses another system to show the most important pages for a particular search, which lists them in order of importance for what you searched on. This means that adding PageRank scores to search results would just confuse people.
PageRank makes more sense when looking at a single page, such as when you're surfing the web, where you want to know how important or reputable that page might be. Google's algorithms identify signals about pages that correlate with trustworthiness and authoritativeness, and PageRank is one of the signals behind Experience, Expertise, Authoritativeness, and Trustworthiness (E-E-A-T).
Algorithm Convergence
The PageRank algorithm converges to a stable set of PageRank values after a certain number of iterations. This is achieved by establishing a convergence criterion that measures the difference between the PageRank vectors of two consecutive iterations.
The difference between the PageRank vectors is calculated using a suitable norm, such as the L1 norm or the L2 norm.
Once the convergence criterion is met, the iterative process can be stopped, and the PageRank vector from the last iteration can be used as the final set of PageRank values. This represents the relative importance of the web pages.
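A stopping rule of this kind might look like the sketch below. The helper names, the tolerance of 1e-8, and the two-page example network are assumptions chosen for illustration.

```python
def l1_distance(a, b):
    """Sum of absolute differences between two vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def iterate_until_converged(step, start, tol=1e-8, max_iter=1000, distance=l1_distance):
    """Apply `step` until two consecutive vectors differ by less than `tol`."""
    prev = start
    for iteration in range(1, max_iter + 1):
        current = step(prev)
        if distance(current, prev) < tol:
            return current, iteration  # convergence criterion met
        prev = current
    return prev, max_iter  # gave up after max_iter passes


def two_page_step(r, d=0.85):
    """One PageRank update for two pages that link only to each other."""
    return [(1 - d) / 2 + d * r[1], (1 - d) / 2 + d * r[0]]


ranks, iterations = iterate_until_converged(two_page_step, [0.9, 0.1])
```

Passing `distance=l2_distance` switches the criterion to the L2 norm without changing the rest of the loop.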
The convergence of the PageRank algorithm is a crucial aspect of the algorithm's functionality. It ensures that the iterative process reaches a stable state, providing accurate PageRank values.
In the case of "Our Spider Trap Network", the PageRank algorithm converges after applying the updated PageRank equation with random teleportation. The converged PageRank values are [0.12624893, 0.07327053, 0.10441051, 0.69607004] with beta=0.85.
SEO for SEOs
PageRank is still used by Google, but it's not the only factor in determining search rankings. Google's algorithms identify signals about pages that correlate with trustworthiness and authoritativeness, and PageRank is one of the signals used.
PageRank uses links on the web to understand authoritativeness. It's a patented algorithm that analyzes which sites have been "voted" the best sources of information by other pages across the web.
Links are still a crucial factor in determining search rankings. A study to measure the impact of links showed a significant drop in rankings when links were effectively removed using the disavow tool.
PageRank is also a canonicalization signal. Pages with a higher PageRank are more likely to be chosen as the canonical version that gets indexed and shown to users.
Google doesn't make PageRank scores visible in search results, but you can use tools like the PageRank Search tool at SEO Chat to see them. These tools can also show you the PageRank scores of other websites.
PageRank is not the only factor in determining search rankings. Google uses a variety of techniques, including its patented PageRank algorithm, to determine the importance of every web page.
PageRank was also used in the Google Directory, which Google discontinued in 2011; its listings were sorted by PageRank score. This differed from the search results, where PageRank scores are not visible.
Internal PageRank is the PageRank score that Google uses as part of its ranking algorithm, and it's constantly being updated. Toolbar PageRank, on the other hand, is a snapshot of internal PageRank taken every few months.
PageRank has a much wider range of scores than the 0-10 scale used in the PageRank Toolbar.
PageRank and Google
PageRank and Google are closely tied together. Google still uses PageRank as one of the signals behind Experience, Expertise, Authoritativeness, and Trustworthiness (E-E-A-T).
PageRank is a signal that identifies trustworthiness and authoritativeness of pages, and it's based on links on the web. Google reps like Gary Illyes have confirmed that Google still uses PageRank and that links are used for E-A-T (now E-E-A-T).
Links are a key factor in determining a page's ranking, and PageRank is a confirmed factor in crawl budget. Pages with a higher PageRank are more likely to be chosen as the canonical version that gets indexed and shown to users.
The Google Directory, discontinued in 2011, was a place where pages were listed because human editors had selected them, rather than through Google's crawling of the web. Its listings were sorted by PageRank, with a green ratings bar next to each site showing its importance.
PageRank is not the sole factor in how pages are ranked, but it's an important one. Google assesses the importance of every web page using a variety of techniques, including PageRank, which analyzes which sites have been "voted" the best sources of information by other pages across the web.
PageRank Algorithm Details
The PageRank algorithm is a complex process, but it's based on a simple idea: a random surfer jumps from one webpage to another. This algorithm is used by Google to determine the importance of web pages.
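The algorithm uses a transition matrix M, which represents the probability of moving from one page to another. Given the transition matrix M for a network such as "Our Spider Trap Network", each iteration updates the PageRank vector as:
r_new = beta*M*r_prev + v
Here r_prev and r_new are the PageRank vectors before and after the iteration, beta is the damping factor, and v is the teleportation term.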
The algorithm also incorporates a damping factor, which is typically set to 0.85. This helps to prevent the random surfer from getting stuck in a loop.
To avoid the spider trap problem, the algorithm introduces random teleportation. This allows the random surfer to jump to any page on the web with a certain probability. The teleportation matrix U is used to achieve this, and it's a column-stochastic version of an adjacency matrix with all 1's in all cells.
The teleportation matrix U ensures that the random surfer has an equal probability of jumping to any page on the web. This is done by setting each cell of the matrix to 1/N, where N is the number of web pages.
The modified PageRank equation with teleportation is:
r_new = beta*M*r_prev + (1-beta)*U*r_prev
This new equation accounts for the probability of a user jumping to a random page.
The algorithm converges to a stable set of PageRank values when the difference between the PageRank vectors of two consecutive iterations is small enough. This is determined by a convergence criterion, such as the L1 norm or the L2 norm.
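The teleportation update can be tried on a small example. The four-page network below is hypothetical (it is not the article's "Our Spider Trap Network", so the resulting values differ from those quoted for it), and the helper names are illustrative. M is column-stochastic: entry M[i][j] is the probability of moving from page j to page i, and page 3 links only to itself, which makes it a spider trap.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def pagerank_teleport(M, beta=0.85, tol=1e-10, max_iter=1000):
    """Iterate r_new = beta*M*r_prev + (1-beta)*U*r_prev until the L1 change is below tol."""
    n = len(M)
    U = [[1.0 / n] * n for _ in range(n)]  # teleportation matrix: every cell is 1/N
    r = [1.0 / n] * n
    for _ in range(max_iter):
        followed = mat_vec(M, r)    # surfer follows a link
        teleported = mat_vec(U, r)  # surfer jumps to a random page
        r_new = [beta * f + (1 - beta) * t for f, t in zip(followed, teleported)]
        if sum(abs(a - b) for a, b in zip(r_new, r)) < tol:
            return r_new
        r = r_new
    return r


# Hypothetical links: 0 -> 1, 2; 1 -> 0, 3; 2 -> 3; 3 -> 3 (spider trap).
M = [
    [0.0, 0.5, 0.0, 0.0],
    [0.5, 0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0, 0.0],
    [0.0, 0.5, 1.0, 1.0],
]
with_teleport = pagerank_teleport(M, beta=0.85)
without_teleport = pagerank_teleport(M, beta=1.0)
```

With beta set to 1.0 there is no teleportation and the trap page absorbs essentially all of the score; with beta at 0.85 the other pages keep a non-trivial share.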
PageRank Tools and Resources
Google Toolbar was a free browser extension that displayed the PageRank of any webpage. Google retired Toolbar PageRank in 2016, so the score is no longer publicly available.
Third-party tools like Ahrefs and Moz cannot read Google's PageRank; they calculate their own link-based metrics that approximate it.
Google Search Console does not show PageRank, although it does report the links pointing to your own website.
Toolbar PageRank was a score between 0 and 10, with 10 being the highest.
Google's PageRank algorithm is based on the number and quality of links pointing to a webpage.
The PageRank of a webpage can affect its ranking in search engine results pages (SERPs).
PageRank is just one of many ranking factors that Google uses to determine the relevance of a webpage.
Frequently Asked Questions
What does Google use instead of PageRank?
Google has not announced a replacement for PageRank. URL Rating (UR) is a third-party metric from Ahrefs, not a Google metric, that measures a page's link profile strength on a 100-point scale. A higher UR score indicates a stronger link profile.
Is Google still using PageRank?
Yes, Google still uses PageRank as a ranking signal, although it's not publicly available to website owners. This is confirmed by Google's Senior Webmaster Trends Analyst, John Mueller.