Introduction to Crawl Budget Optimization | Lesson 5/34 | SEMrush Academy

You’ll gain an understanding of search crawlers and how to optimally budget for them.

0:05 Crawl budget optimization and its significance
1:03 Factors affecting crawl budget
1:29 Accessibility
1:49 Efficiency
1:54 Page view
2:08 Crawl rate
2:42 Goals for crawl budget optimization
3:27 Use Google Search Console


One of the most important and complex aspects of SEO is how to properly control search crawlers, and how to deal with what we call crawl budget. Why is this important?

The main reason is that crawling is very expensive. For example, one of the most recent data centers Google built in the Netherlands cost around 770 million US dollars.
Operating costs are also high: more than 10 billion USD every year.
So a lot of money goes into crawling, and that is why Google needs to apply budgeting. Its crawling resources must be used efficiently and economically. Resources are provided in units of “data center computing time” per domain, so crawl budget optimization essentially aims to help Google achieve more with less.

Different factors impact crawl budget, for example:

the age of the domain (that is to say, the older, the better)
the link profile, which reflects the authority and trustworthiness of a domain (the stronger the overall profile, the more budget you can get)
the size of the domain (as in the number of URLs)
content freshness (for example, the amount of new content)

For best-in-class crawling, accessibility is important. You should make sure that all links are actually readable and reachable, and that you are using proper HTTP status codes. Important content should be prioritised over less important material so that it is crawled first. Lastly, attention should be paid to efficiency. This means making sure there are no duplicate or thin pages. Page speed is another important factor to consider. Generally speaking, the more URLs you can serve blazing fast, the more effectively Google’s data center computing time is used on your domain.

Let’s also have a quick look at what crawl rate actually means. A proper definition would be: the number of crawl requests (accesses) by a search engine’s crawler (for example, Googlebot) in a certain period (for example, 24 hours) on a domain or within a directory.
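
To make that definition concrete, here is a minimal Python sketch that counts crawler requests per day from server access logs. It assumes logs in the common "combined" format and identifies the crawler only by a substring of the user agent (which can be spoofed); the function name, the sample log lines, and the IP addresses are made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: "combined" access log format, e.g.
# IP - - [timestamp] "request" status bytes "referer" "user agent"
LOG_PATTERN = re.compile(
    r'\[(?P<ts>[^\]]+)\] "(?P<request>[^"]*)" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def crawl_rate_per_day(log_lines, agent_token="Googlebot"):
    """Count requests per calendar day whose user agent contains agent_token."""
    counts = Counter()
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if match is None or agent_token not in match.group("agent"):
            continue  # skip unparseable lines and other visitors
        # Timestamps look like 10/Oct/2023:13:55:36 +0000
        day = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z").date()
        counts[day.isoformat()] += 1
    return dict(counts)

# Hypothetical sample data: three crawler hits and one regular browser visit.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE_LOG = [
    f'66.249.66.1 - - [10/Oct/2023:13:55:36 +0000] "GET /shop/ HTTP/1.1" 200 5120 "-" "{GOOGLEBOT_UA}"',
    f'66.249.66.1 - - [10/Oct/2023:14:02:10 +0000] "GET /shop/shoes HTTP/1.1" 200 4096 "-" "{GOOGLEBOT_UA}"',
    '203.0.113.7 - - [10/Oct/2023:14:05:00 +0000] "GET / HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (X11; Linux x86_64) Firefox/118.0"',
    f'66.249.66.1 - - [11/Oct/2023:09:00:00 +0000] "GET /blog/ HTTP/1.1" 200 1024 "-" "{GOOGLEBOT_UA}"',
]

print(crawl_rate_per_day(SAMPLE_LOG))
```

Tracking these daily counts over time gives a rough crawl-rate trend for the whole domain; filtering on the request path before counting would give the same figure for a single directory.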

Next, let’s define some potential goals for crawl budget optimization:

We want the domain to be crawled as thoroughly as possible, so that Google finds all our content.
Changes must be detected quickly. We have to be sure that the content we produce will be found as soon as possible. And if you update that content, you want Google to pick that up as soon as possible as well. Say you updated the price of a product in your shop: this should be reflected in the Google SERPs right away.
Another point is that resources must be used economically and efficiently, which means you want to be sure that Google interacts with your server infrastructure in the best possible way.
Lastly, it’s about using your own server infrastructure economically: if Google crawls loads and loads of irrelevant pages on your domain, this essentially costs you money.

A good starting point is Google Search Console. Open the Crawl Stats report, which shows how many pages are being crawled per day. You can use this as a baseline and keep an eye on the trend.

Bear in mind: depending on the domain size, the impact may differ. If the domain is relatively small, say 100 or 1,000 URLs, the impact is clearly less significant than for a domain with 100,000 URLs or more. Google confirms this on its Webmaster Central Blog: “Wasting server resources on pages […] will drain crawl activity from pages that do actually have value, which may cause a significant delay in discovering great content on a site.”

Ultimately, make sure you know what kind of content exists on your domain and decide whether you want it to be crawled and/or indexed. If not, there are various types of directives and rules that you can apply to make sure that search engines do not crawl or index that content. The rest of this chapter covers them.
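
As one illustration of such a rule, the sketch below uses robots.txt, which the lesson text itself does not name, so treat it as an assumed example. It relies on Python’s standard urllib.robotparser module to check whether a given crawler may fetch a URL. The rules, the example.com URLs, and the is_crawlable helper are hypothetical. Note that robots.txt only controls crawling; keeping a page out of the index is handled by separate mechanisms.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep crawlers out of internal search results and the cart.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
Disallow: /cart/
"""

def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt content allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network access
    return parser.can_fetch(user_agent, url)

for url in (
    "https://example.com/products/blue-shoes",
    "https://example.com/search/?q=shoes",
    "https://example.com/cart/",
):
    print(url, is_crawlable(ROBOTS_TXT, url))
```

The standard-library parser does simple prefix matching, so real crawlers may interpret more complex patterns differently; this is only a quick way to sanity-check basic rules before deploying them.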

#TechnicalSEO #TechnicalSEOcourse #CrawlBudgetOptimization #SEMrushAcademy
