For traditional websites built with only HTML and CSS, crawling is fairly simple: Googlebot downloads the HTML file, extracts all the links from the source code, downloads any CSS files, passes the downloaded resources to its indexer, and indexes them.
For JavaScript-driven websites, an extra rendering step is needed before anything can be indexed:
- Googlebot downloads the HTML file, as before.
- The page is queued for rendering, and Google's Web Rendering Service executes the JavaScript to produce the final DOM.
- It is only at this point that the indexer can index the content, identify new links, and add them to Googlebot's crawling queue.
I will be dealing with these issues in more detail in the next sections.
Coding practices that impact how Googlebot interprets your site
Having a nice, user-friendly site does not mean that the site is SEO friendly. In this section we cover client-side coding practices that can cause issues when Googlebot parses your site.
1. Lazy-loaded content (Infinite scrolling)
In one of his talks, Martin Splitt went into the details of why Google is not able to parse pages that implement infinite scrolling.
When a user lands on a page with infinite scrolling, a portion of the page will be shown. When the user scrolls down new content will be loaded from the backend and presented to the user.
Googlebot does not know how to scroll pages. When Googlebot lands on a page, it crawls only what is immediately visible, so a portion of the content will be missing from Google’s search index.
Martin Splitt recommends lazy loading only images and videos, and using the Intersection Observer API to asynchronously detect when elements enter the page’s viewport.
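As a minimal sketch of that pattern, assuming each image stores its real URL in a `data-src` attribute (a common convention, not a requirement):

```js
// Lazy-load images only when they scroll into view.
const imageObserver = new IntersectionObserver((entries, observer) => {
  for (const entry of entries) {
    if (!entry.isIntersecting) continue;
    const img = entry.target;
    img.src = img.dataset.src; // swap in the real image once it is visible
    observer.unobserve(img);   // each image only needs to load once
  }
});

document.querySelectorAll('img[data-src]').forEach((img) => imageObserver.observe(img));
```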
Where lazy loading is not an option, Google’s official documentation recommends combining infinite scrolling with paginated loading. Paginated loading gives each chunk of content its own URL, which allows Google to link to a specific place in the content instead of only the top visible part of the page.
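A hedged sketch of that idea, assuming each page-sized chunk carries its own URL in a hypothetical `data-page-url` attribute: as a chunk scrolls into view, the address bar is updated via the History API, so every piece of content stays reachable at a real URL.

```js
// Keep the URL in sync with the chunk of content currently in view.
const pageObserver = new IntersectionObserver((entries) => {
  for (const entry of entries) {
    if (entry.isIntersecting) {
      history.replaceState(null, '', entry.target.dataset.pageUrl);
    }
  }
});

document.querySelectorAll('[data-page-url]').forEach((el) => pageObserver.observe(el));
```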
2. Dynamic Content
Rendering is the process of taking your content and turning it into a page that can be displayed to the user and to the crawler. There are two types of rendering: server-side rendering and client-side rendering.
Test how Google sees the page
With the URL Inspection tool in Google Search Console, you can render a page as Googlebot sees it. You can also check the HTML tab, which shows the rendered DOM (the code after your page’s JavaScript has run).
If the page is not fully visible or some content is missing, you have an issue with how Googlebot interprets the content of your site.
Googlebot may decide to block some of your content for a number of reasons: Google optimizes its crawlers to download only relevant resources. Content blocking may happen in these scenarios:
- The Web Rendering Service algorithm decides a resource is not necessary for rendering.
- The scripts take too long to execute (timeouts).
- The content requires the crawler to click, scroll, or complete some other action before it appears; such content won’t be indexed.
3. Render Blocking
Render-blocking resources, typically synchronous scripts and stylesheets referenced in the head of the page, stop the browser (and Google’s Web Rendering Service) from painting anything until they are downloaded and executed. Load non-critical scripts with the defer or async attributes so rendering can proceed without waiting for them.
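As a sketch of the difference (the file names are placeholders):

```html
<!-- Parsed and executed immediately: blocks rendering -->
<script src="critical.js"></script>

<!-- Downloaded in parallel, executed after the document is parsed -->
<script src="analytics.js" defer></script>

<!-- Downloaded in parallel, executed as soon as it arrives -->
<script src="ads.js" async></script>
```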
4. Minimize work on the main thread
By default, the browser runs all scripts in a single thread, called the main thread. The main thread is also where the browser processes user events and paints.
This means that long-running tasks block the main thread, leading to unresponsive pages. The less work required on the main thread, the better.
Make use of web workers to keep heavy functionality from running directly on the main thread. This will help the crawler avoid timing out when rendering the page.
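A minimal sketch of moving heavy work off the main thread with a web worker; the file name `worker.js` and the functions `largeDataset`, `displayResults`, and `expensiveAggregation` are hypothetical placeholders:

```js
// main.js
const worker = new Worker('worker.js');
worker.onmessage = (event) => displayResults(event.data); // main thread stays free
worker.postMessage({ items: largeDataset });              // hand the work off

// worker.js
onmessage = (event) => {
  const result = expensiveAggregation(event.data.items); // runs on a worker thread
  postMessage(result);
};
```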
5. Long tasks
A user may notice your UI becoming unresponsive if any function blocks the main thread for more than 50 ms. These types of functions are called long tasks. Most of the time, the cause is loading more data than the user needs at that moment.
To resolve this, consider breaking your work into smaller asynchronous tasks. This should improve your Time to Interactive (TTI) and First Input Delay (FID).
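A hedged sketch of that approach: one long loop is split into small batches that yield control back to the main thread between batches (the batch size is arbitrary).

```js
// Process items in small batches, yielding between batches so the
// browser can handle input and paint instead of freezing.
async function processInChunks(items, handleItem, batchSize = 50) {
  for (let i = 0; i < items.length; i += batchSize) {
    items.slice(i, i + batchSize).forEach(handleItem);
    // Yield to the event loop before starting the next batch.
    await new Promise((resolve) => setTimeout(resolve, 0));
  }
}
```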
6. Client-Side rendering
With client-side rendering, the server returns a minimal HTML page and JavaScript builds the content in the browser. Such an approach improves the usability and responsiveness of the page, as the user does not need to wait for a full server round trip (postback) each time they access a new page.
When building sites that rely heavily on client-side rendering, consideration should be given to limiting the size of the critical JavaScript needed to render the page.
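One common way to do this is code splitting with a dynamic import, sketched below; the `#show-charts` element and `./charts.js` module are hypothetical examples of below-the-fold functionality.

```js
// Keep the initial bundle small: fetch optional code only when needed.
document.querySelector('#show-charts').addEventListener('click', async () => {
  const { renderCharts } = await import('./charts.js'); // loaded on demand
  renderCharts();
});
```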
Final thoughts & considerations
So, it is essential to apply the techniques above to make sure your content does not end up being skipped by the crawler!
At Gainchanger we automate the tedious part of SEO to allow you to scale your results exponentially and focus on what really matters.