A Complete Guide To Technical SEO in 2021
Technical SEO is the process of making sure your website follows best practices for site performance and crawlability.
It’s not just about “pretty” design; it’s also about how well a site performs in terms of speed and technical requirements.
This includes things like load time, page size, structured markup, site architecture, etc.
Technical SEO can be intimidating because there are so many factors to consider and understand.
This lesson will help you get started with what you need to know as a beginner and provide some valuable resources that will guide you through the technical side of SEO.
Table of Contents
What is Technical SEO?
Technical SEO can be defined as an umbrella term that covers all aspects of ensuring the technical correctness of a website.
Related issues such as keyword cannibalisation, where several of your own pages compete for the same query, also fall under this subject matter.
The benefits of applying good practice in the technical optimisation of a site are to avoid being penalised by search engines and ensure that a site’s pages can be indexed correctly.
Is Technical SEO Difficult to Understand?
There are many resources online that discuss technical SEO in great detail.
Technical SEO is very complicated, especially when it comes to crawling and indexing pages correctly.
Many blog posts aim to provide information about the subject matter.
Still, most of them assume some knowledge of HTML, and sometimes of server-side languages such as PHP.
Even just understanding how a website works at a basic level is helpful in understanding technical SEO.
When it comes to crawling, search engines have to be selective: a site may contain thousands of pages that could be indexed, and crawlers cannot visit all of them constantly while still providing the best user experience possible.
Many webmasters assume that their website is constantly being crawled by search engines when this isn’t always true.
For example, Googlebot may crawl at a relatively slow pace – depending on a site's size, authority and crawl budget, some pages may only be crawled once every couple of weeks.
There are several things that search engines have to take into consideration.
One of them is the crawl budget: the number of URLs Googlebot can and wants to crawl on a site within a given period.
If a site has far more URLs than its crawl budget allows, some pages will be crawled infrequently or not at all.
Crawl budget is also wasted when a website has a large number of 404 error pages.
Googlebot crawls the homepage and other URLs on a site several times before identifying which ones are valid and which aren't.
If your website contains many 404 pages, crawl budget is spent on dead ends, which is not ideal for SEO purposes.
URL sources and redirects
Googlebot discovers URLs from sources such as links, sitemaps and redirects, and relies on the information in your HTML to determine where a page's content is located.
If the same content is reachable through several different URLs, crawl priority is affected.
For example, if your website appends dynamic parameters to links for tracking purposes, search engines can waste crawl budget on duplicate versions of the same page.
Crawl queue and priority
When Googlebot discovers pages on a site, they are placed in a crawl queue and assigned a priority.
The priority determines how soon a URL is crawled relative to the other URLs on the same host.
Pages on your website which don't contain any broken links will tend to have a higher crawl priority than those that do – this means that the search engine will prioritise crawling these pages.
Google has an algorithm in place to prevent sites from "gaming" the system by fooling Google into thinking that a page hasn't been crawled when it actually has.
If Googlebot detects inconsistencies, such as being redirected from one URL to another within milliseconds, it will not crawl the page again until a certain amount of time has passed.
This prevents the search engine from crawling your website too much and burning out your crawl budget.
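As an illustration only – Google's actual scheduler is not public – a crawl queue that gives pages with fewer broken links a higher priority can be sketched with Python's standard `heapq` module:

```python
import heapq

# Hypothetical crawl-queue sketch: lower scores are crawled first.
# A page's score is penalised by its count of broken links, echoing the
# idea that "clean" pages receive a higher crawl priority.
def build_crawl_queue(pages):
    """pages: list of (url, broken_link_count) tuples."""
    heap = []
    for url, broken in pages:
        # fewer broken links = smaller score = crawled sooner
        heapq.heappush(heap, (broken, url))
    return heap

def next_to_crawl(heap):
    """Pop the highest-priority URL from the queue."""
    broken, url = heapq.heappop(heap)
    return url

queue = build_crawl_queue([
    ("/blog/old-post", 5),
    ("/", 0),
    ("/products", 1),
])
print(next_to_crawl(queue))  # "/" is crawled first
```

Real crawl scheduling weighs many more signals (popularity, freshness, server load), but the queue-plus-priority structure is the core idea.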
There are currently two types of Googlebot crawlers: desktop and mobile.
Not all crawlers follow the same rules when it comes to crawling your website, so you must be aware of which one is in use.
You can identify which crawler made a request by inspecting the user-agent string in the request header: both crawlers include the "Googlebot" token, and the mobile crawler additionally identifies itself as an Android device.
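As a minimal sketch, the user-agent tokens can be checked with a few lines of Python (note the header can be spoofed, so real verification should also reverse-DNS the requesting IP):

```python
# Rough classifier for Googlebot requests based on the user-agent header.
# The user-agent can be faked, so treat this as a first-pass filter only.
def classify_crawler(user_agent):
    if "Googlebot" not in user_agent:
        return "not googlebot"
    # The smartphone crawler advertises an Android device
    if "Android" in user_agent and "Mobile" in user_agent:
        return "googlebot smartphone"
    return "googlebot desktop"

mobile_ua = ("Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
             "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 "
             "Mobile Safari/537.36 (compatible; Googlebot/2.1; "
             "+http://www.google.com/bot.html)")
print(classify_crawler(mobile_ua))  # googlebot smartphone
```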
Google has developed several tools for crawling and processing content to make its search engine results as relevant as possible.
These systems pay attention to each word on a page and the links pointing to it.
When it comes to rendering a page, two components need to be taken into consideration: the crawler and the browser-based rendering service.
The former fetches the raw HTML of a page, while the latter executes the page as a visitor's browser would, so that the final content can be seen and indexed.
How to Take Control of Crawling
Controlling what gets crawled on your website may be done using several different methods:
The robots.txt file specifies whether a search engine crawler can or cannot access your site and the sections within it.
The problem with this method is that robots.txt controls crawling, not indexing – a disallowed page can still appear in results if other pages link to it – and many webmasters have run into issues caused by writing their disallow rules incorrectly.
Another way of controlling Googlebot’s crawling rate is to adjust your site’s settings in Search Console.
Here you can view crawl statistics, such as the number of requests Googlebot makes to each property per day.
If you don’t want Googlebot to crawl your website at all, then this can be done by blocking it from accessing any pages using:
robots.txt:
User-agent: *
Disallow: /
Also, ensure that the robots.txt file is accessible to search engine crawlers and not protected behind authentication or a firewall.
If your robots.txt file returns a server error or is unreachable, Googlebot doesn't know what it is allowed to crawl and may stop crawling the site altogether; a plain 404 Not Found, by contrast, is treated as if no restrictions exist.
How To View Crawl Activity
The easiest way to see what Googlebot has been up to is by viewing your log files.
This gives you a great insight into the performance of your site’s crawling and indexation.
You can do this using tools like Log File Viewer or through your hosting account, where you will be able to see all incoming requests in chronological order.
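As a hedged sketch – assuming the common Apache/Nginx "combined" log format, with hypothetical log lines – Googlebot requests can be pulled out of a log file with a few lines of Python:

```python
import re

# Matches the request, status and user-agent fields of a combined-format
# access log line, e.g.:
#   1.2.3.4 - - [date] "GET /path HTTP/1.1" 200 5316 "-" "agent"
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_hits(log_lines):
    """Return (path, status) for every request made by Googlebot."""
    hits = []
    for line in log_lines:
        m = LOG_LINE.search(line)
        if m and "Googlebot" in m.group("agent"):
            hits.append((m.group("path"), int(m.group("status"))))
    return hits

sample = [
    '66.249.66.1 - - [10/Oct/2021:13:55:36 +0000] "GET /blog/ HTTP/1.1" 200 5316 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [10/Oct/2021:13:55:40 +0000] "GET /about HTTP/1.1" 200 2101 "-" "Mozilla/5.0 (Windows NT 10.0) Chrome/94.0"',
    '66.249.66.1 - - [10/Oct/2021:13:56:02 +0000] "GET /old-page HTTP/1.1" 404 209 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]
print(googlebot_hits(sample))
```

Counting how many of those hits are 404s gives you a quick sense of how much crawl budget is being wasted on dead pages.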
A quick search or filter on the user-agent will display all crawl activity within the period covered by your logs.
Indexing is the term used to describe the process by which search engines store and organise the content that their crawlers have fetched from your website.
Google processes this information so that it can return relevant results when people perform a search query.
robots.txt rules are built from directives such as Disallow and Allow, and when several rules match the same URL, the most specific one wins.
Here are some examples of how you can use them:
- Disallow: /directory/directory-2/ – disallows crawling of all pages underneath this directory
- Allow: /directory/directory-2/page-1.html – allows this specific page to be crawled even if its directory is disallowed
- Allow: /directory/directory-2/ – allows crawling of pages within this directory, overriding a broader Disallow rule
- Disallow: /directory/ – disallows crawling of every page underneath this directory unless a more specific Allow rule permits it
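Rules like these can be checked programmatically before you deploy them. A minimal sketch using Python's standard urllib.robotparser (note that this parser evaluates rules in the order listed rather than by path specificity as Google does, so the Allow line is deliberately placed first here):

```python
from urllib import robotparser

# Hypothetical robots.txt rules; Allow is listed first because Python's
# parser applies the first matching rule, unlike Google's most-specific
# matching.
rules = """\
User-agent: *
Allow: /directory/directory-2/page-1.html
Disallow: /directory/directory-2/
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

# The specifically-allowed page is crawlable; its siblings are not.
print(rp.can_fetch("Googlebot", "https://example.com/directory/directory-2/page-1.html"))
print(rp.can_fetch("Googlebot", "https://example.com/directory/directory-2/faq.html"))
```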
If you have duplicate pages on your website, then using a canonical tag is the best way to let search engine crawlers know which version should appear in the results.
This can be inserted in the header of a webpage and should reference the original URL:
<link rel="canonical" href="http://www.example.com/" />
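If you want to audit canonical tags across many pages, the href can be extracted with Python's standard html.parser; a small sketch (the page snippet is hypothetical):

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collect the href of any <link rel="canonical"> tag."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "canonical":
            self.canonical = a.get("href")

page = ('<html><head>'
        '<link rel="canonical" href="http://www.example.com/" />'
        '</head><body></body></html>')
finder = CanonicalFinder()
finder.feed(page)
print(finder.canonical)  # http://www.example.com/
```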
You can check if Google has indexed your website by running one of the following search queries:
- site:example.com (shows all pages that are indexed)
- site:example.com/page (shows whether that specific page is indexed)
- site:example.com inurl:https (shows which indexed pages are served over HTTPS)
These will also work for other search engines like Bing and Yahoo.
Add internal links
Linking to internal pages is an excellent way to help Googlebot discover the hierarchy of your site.
You can do this by linking from one page to another throughout your posts and comments, at the end of articles and even in footers.
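As a sketch of how internal linking might be audited (the host name and HTML snippet below are hypothetical), links can be separated into internal and external with Python's standard library:

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkCollector(HTMLParser):
    """Separate internal links from external ones for a given host."""
    def __init__(self, host):
        super().__init__()
        self.host = host
        self.internal, self.external = [], []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        netloc = urlparse(href).netloc
        # Relative URLs ("" netloc) and same-host URLs count as internal.
        if netloc in ("", self.host):
            self.internal.append(href)
        else:
            self.external.append(href)

html = '<a href="/about">About</a><a href="https://other.com/">Other</a>'
c = LinkCollector("example.com")
c.feed(html)
print(c.internal, c.external)
```

A page with very few internal links pointing to it is harder for Googlebot to discover, so a report like this highlights orphaned areas of the site.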
Add schema markup
Schema markup helps search engines to better understand the information on your page and display this in results. For example, you might want to specify that a post contains recipes!
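Structured data is usually added as a JSON-LD block using the schema.org vocabulary. A minimal sketch of a Recipe object (all values hypothetical), generated and wrapped in the script tag that would go in the page's head:

```python
import json

# Hypothetical Recipe structured data using the schema.org vocabulary.
recipe = {
    "@context": "https://schema.org",
    "@type": "Recipe",
    "name": "Classic Pancakes",
    "author": {"@type": "Person", "name": "Jane Doe"},
    "recipeIngredient": ["flour", "milk", "eggs"],
}

# Render as a JSON-LD block ready to embed in the page's <head>.
script_tag = '<script type="application/ld+json">%s</script>' % json.dumps(recipe)
print(script_tag)
```

Google's Rich Results Test can then be used to confirm the markup is eligible for rich results.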
Core Web Vitals
Several signals can be used to measure the quality and relevance of your web pages. The most common ones include:
Page speed measures how fast a page loads for users across different devices, browsers and geographical regions.
The lower your score, the slower the page is likely to be, resulting in a loss of traffic. You can check this on Google's PageSpeed Insights, which also reports the Core Web Vitals metrics: Largest Contentful Paint, First Input Delay and Cumulative Layout Shift.
If your site uses an SSL certificate so that data flows between a visitor and the website via secure encrypted HTTPS, browsers will display a padlock icon in the address bar, and Google uses HTTPS as a ranking signal.
If your site is commonly accessed from mobile devices, you can also provide a mobile-friendly version of it so that Google knows which version to display in results.
An excellent place to check if your website is mobile-friendly is Google Search Console.
Interstitials are overlays or pop-ups displayed before, during or on top of a webpage's content.
They can be used to provide site visitors with an offer that is only available on that specific page – for example, a popup that appears once someone hovers over particular keywords or keyphrases on your website.
Be aware, though, that Google treats intrusive interstitials that block content, particularly on mobile, as a negative ranking signal.
If you have a version of your website in multiple languages and you want to specify which one Google should use, then implementing hreflang is the best way to go.
This is inserted in the head element of your webpage (hreflang annotations can also be supplied via HTTP headers or your XML sitemap):
<link rel="alternate" href="https://example.com/" hreflang="en-gb" />
<link rel="alternate" href="https://example.com/es/" hreflang="es-es" />
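When a site has many language versions, these tags are easier to generate than to hand-write. A small sketch (the locale-to-URL mapping is hypothetical):

```python
# Hypothetical mapping of locale -> URL for the language versions of a page.
alternates = {
    "en-gb": "https://example.com/",
    "es-es": "https://example.com/es/",
}

def hreflang_tags(alternates):
    """Render one <link rel="alternate"> tag per language version."""
    return "\n".join(
        '<link rel="alternate" href="%s" hreflang="%s" />' % (url, lang)
        for lang, url in alternates.items()
    )

print(hreflang_tags(alternates))
```

Each language version should list all the others (including itself), so in practice the same set of tags is emitted on every variant of the page.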
Here are some general tips for improving the overall health of your website:
Fix broken links
These are links that point to dead pages which are no longer available on your website.
When Google crawls these, it wastes crawl budget, and visitors who follow them hit a 404, resulting in poor UX. You can use tools like Ahrefs to find broken links on your site and fix them.
Fix redirect chains
These are series of redirects that pass through one or more intermediate URLs before reaching the final page, slowing visitors down with every extra hop. You can check if your site is affected by checking the number of hops in a URL with tools like Ahrefs.
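The hop-counting idea can be sketched in a few lines of Python; given a hypothetical map of redirect sources to destinations, count the hops before the final URL (and catch redirect loops):

```python
def redirect_hops(redirects, url, limit=10):
    """Count redirect hops for `url` in a {source: destination} map.

    Returns the number of hops, or -1 if a loop (or more than `limit`
    hops) is detected.
    """
    hops = 0
    seen = {url}
    while url in redirects:
        url = redirects[url]
        hops += 1
        if url in seen or hops > limit:
            return -1  # redirect loop or suspiciously long chain
        seen.add(url)
    return hops

# Hypothetical chain: /old -> /interim -> /new (2 hops; ideally /old
# would redirect straight to /new in a single hop).
chain = {
    "http://example.com/old": "http://example.com/interim",
    "http://example.com/interim": "http://example.com/new",
}
print(redirect_hops(chain, "http://example.com/old"))  # 2
```

Anything above one hop is worth collapsing into a direct redirect.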
Tools for Effective Technical SEO
There are a lot of tools available that allow you to check the metrics listed above. Some of the most common ones include:
Google Search Console
This is a free tool available from Google that allows you to manage everything from the crawl rate, internal linking and URLs to technical issues.
Google’s Mobile-Friendly Test
The Google Mobile-Friendly Test is a free tool that measures how mobile-friendly your site is.
Chrome Developer Tools
The Chrome Developer Tools allow you to see how a page loads from a visitor's perspective. You can also debug issues as they happen on your website.
Google PageSpeed Insights
This is a free tool from Google that measures the loading time of pages across different browsers and geographical areas, providing suggestions for improvement.
Ahrefs Toolbar
This is a free browser plugin for Google Chrome and Firefox that allows you to see key metrics for your website, such as:
- On-page SEO report
- Broken link checker
- SERP positions
In this chapter, you have learned what Technical SEO is and the importance of using it to improve your website’s performance.
You now know why implementing a technical SEO strategy can be beneficial for any business and how to do so with the help of tools such as Google Search Console, Chrome DevTools and Ahrefs Toolbar.
Technical SEO should always be part of your strategy because site performance plays a crucial role in your ranking success.
Make sure you practice these skills and use them to your advantage when optimising your site. See you in the next chapter!
About Scott Latham
For over 15 years, I have been building WordPress websites and implementing SEO, working with some of the largest companies across a range of industries to take their businesses to the next level.