If you want to know what John Mueller recently said about a lag in the Google Search Console Indexing Report, you are in the right place. Google's bots tirelessly crawl nearly 2 billion websites and index over 56 billion pages to show in the search results. Google uses more than 200 criteria to rank crawled sites and their updates on the SERP. Some site owners, though, see their pages crawled but not indexed, and they worry about their crawl budget.
This article covers the crawl budget and what Mueller said about URLs that are crawled but not indexed during the Google Office Hours Hangout on July 16, 2021.
Before going into the Search Console lag for URLs that are crawled but not indexed, and Mueller's opinion on it, let us first look at the crawl budget. The amount of time and resources that Google devotes to crawling a site is commonly known as that site's crawl budget. Google states that not everything crawled on a site is necessarily indexed. The reason is that, after crawling, each page has to be evaluated, consolidated, and assessed to determine whether or not to index it. Even with Google's tremendous capacity, it is challenging to explore the nearly infinite web space and index every available URL.
Crawl capacity limit
The crawl capacity limit is the maximum number of simultaneous parallel connections that Googlebot can use to crawl a site, along with the time delay between fetches. Googlebot wants to crawl sites without overwhelming their servers while still covering the important content. The crawl capacity limit depends on factors such as crawl health (how the site responds, including server errors), the limit set by owners in Search Console, and the availability of Google's resources.
The other element is crawl demand, which depends on factors such as the perceived inventory (which includes duplicate URLs and others), the popularity of the site (popular URLs are crawled more), and staleness (how often Google wants to re-crawl to pick up changes).
With these elements and factors, Google defines the crawl budget as the set of URLs that it can and wants to crawl. If the crawl capacity limit is reached or the crawl demand is low, it may crawl the site less.
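Since duplicate URLs inflate the perceived inventory, it can help to spot them in your own URL list. The following is a minimal Python sketch and not a description of how Google detects duplicates. The names normalize, find_duplicates, and IGNORED_PARAMS are hypothetical. The list of ignored parameters and the choice to treat a trailing slash as insignificant are assumptions that may not hold for every site.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these query parameters never change the page content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def normalize(url):
    """Reduce a URL to a comparable form (an illustrative rule set only)."""
    parts = urlsplit(url)
    # Drop ignored parameters and sort the rest so their order does not matter.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key not in IGNORED_PARAMS
    )
    # Assumption: "/shoes/" and "/shoes" serve the same page.
    path = parts.path.rstrip("/") or "/"
    # The fragment is discarded because it is not sent to the server.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )


def find_duplicates(urls):
    """Group URLs that share a normalized form; keep groups with 2+ members."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: group for key, group in groups.items() if len(group) > 1}


if __name__ == "__main__":
    sample = [
        "https://Example.com/shoes/",
        "https://example.com/shoes?utm_source=news",
        "https://example.com/hats",
    ]
    for canonical, group in find_duplicates(sample).items():
        print(canonical, group)
```

Any group this returns is only a candidate for consolidation, for example with a redirect or a canonical tag. Check each one by hand before acting on it.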
What is the crawl budget impact?
The question about the lag in Google Search Console, where URLs are reported as crawled but not indexed, came from site owners worried about the impact on their crawl budget. They were concerned that URLs not indexed after crawling might not rank on the SERPs. Yet the same people confirmed that when they checked those URLs in another report, the URLs showed as indexed. The lag makes it hard for them to track crawling and indexing statistics for the site, because most of the URLs appear in the Excluded list. Hence, they put the issue to Mueller, a Webmaster Trends Analyst at Google, to get his view.
Mueller's views on the Google Search Console error on crawling but not indexing URLs
As a side note, Mueller said he doubts that the Search Console lag, where URLs are shown as crawled but not indexed, will affect the crawl budget. He said he has seen reports of the anomaly and, though he does not know exactly what the issue is, he has an idea about it. He has recently seen a few threads like it on Twitter as well.
Mueller suspects it is only a matter of timing: Google shows these URLs in the Search Console report and indexes them over time, and at some point they drop out of the report again. He guesses that, whatever the reason, that drop-out is taking longer than normal.
Mueller suggested checking whether the pages reported as crawled but not indexed show up in normal searches. Take a few words from the page and search for them. If the page shows up in the results, there is no real problem and nothing to do about it. He concludes that in such cases the URLs are indexed and it is only the report that is lagging.
Taking the above facts and Mueller's view together, pages that appear as crawled but not indexed in Google Search Console may simply reflect a lag in reporting. Still, it is better to verify the index coverage issue by searching for a few words from the page and checking whether it shows up in normal searches. Also, supporting your crawl budget with adequate servers and avoiding duplicate URLs and errors will help keep your pages from being crawled but not indexed.
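The check Mueller describes can be done by hand, but a small helper makes it quicker to repeat across several pages. The following is a minimal Python sketch. The function name build_phrase_query_url and its parameters are hypothetical. Wrapping the phrase in quotes to ask for an exact match is an assumption of this sketch, not something Mueller specified. The script only builds the search URL for you to open in a browser. It does not query Google.

```python
from urllib.parse import urlencode


def build_phrase_query_url(page_text, start_word=0, num_words=8):
    """Return a Google search URL for a short phrase taken from page_text."""
    words = page_text.split()
    phrase = " ".join(words[start_word:start_word + num_words])
    if not phrase:
        raise ValueError("no words available at the requested position")
    # Assumption: quoting the phrase requests an exact-match search.
    return "https://www.google.com/search?" + urlencode({"q": f'"{phrase}"'})


if __name__ == "__main__":
    text = "Crawl budget is the time and resources Google devotes to a site"
    print(build_phrase_query_url(text, num_words=4))
```

Pick a phrase that is distinctive to the page, for example from the middle of the body text rather than from a shared header or footer. If the page appears in the results for that phrase, it is indexed regardless of what the report says.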