Indexing Status

As you know, Google Search Console is an essential part of every SEO’s toolbox.

Among other things, Google Search Console reports on your organic performance and on how Google fared when crawling and indexing your site. The latter is covered in the ‘Index Coverage report’, which this article is all about.

What is the Google Search Console Index Coverage report?

When Google is crawling and indexing your site, they keep track of the results and report them in Google Search Console’s Index Coverage report.

It’s basically feedback on the more technical details of your site’s crawling and indexing process. If Google detects a pressing issue, they send a notification. These notifications are usually delayed though, so don’t rely solely on them to learn about high-impact SEO issues.

Google’s feedback is categorized in four statuses:

  1. Valid
  2. Valid with warnings
  3. Excluded
  4. Error

When should you use the Index Coverage report?

Google says that if your site has fewer than 500 pages, you probably don’t need to use the Index Coverage report. However, if organic traffic from Google is essential to your business, you do need to use it, because it provides detailed information and is much more reliable than using the site: operator to debug indexing issues.

The Index Coverage report explained

Screenshot of Google Search Console’s Index Coverage report including details

The screenshot above is from a fairly large site with lots of interesting technical challenges.

Find your own Index Coverage report by following these steps:

  1. Log on to Google Search Console.
  2. Choose a property.
  3. Click Coverage under Index in the left navigation.

The Index Coverage report distinguishes among four status categories:

  1. Valid: pages that have been indexed.
  2. Valid with warnings: pages that have been indexed, but which contain some issues you may want to look at.
  3. Excluded: pages that weren’t indexed because search engines picked up clear signals they shouldn’t index them.
  4. Error: pages that couldn’t be indexed for some reason.

Each status consists of one or more types. Below, we’ll explain what each type means, whether action is required, and if so, what to do.

Valid URLs

As mentioned above, “valid URLs” are pages that have been indexed. The following two types fall within the “Valid” status:

  1. Submitted and indexed
  2. Indexed, not submitted in sitemap

Submitted and indexed

These URLs were submitted through an XML sitemap and subsequently indexed.

Action required: none.

Indexed, not submitted in sitemap

These URLs were not submitted through an XML sitemap, but Google found and indexed them anyway.

Action required: verify whether these URLs need to be indexed, and if so, add them to your XML sitemap. If not, implement the robots noindex directive, and optionally exclude them in your robots.txt if they can cause crawl budget issues.
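
If you work with exports, a quick way to verify this is to compare the reported URLs against your XML sitemap. Below is a minimal Python sketch along those lines; the sitemap URL and the CSV export file name are placeholders, and it assumes the exported URLs sit in the first column.

  # Minimal sketch: list URLs Google reports as indexed but that are missing
  # from your XML sitemap. The sitemap URL and export file name are placeholders.
  import csv
  import urllib.request
  import xml.etree.ElementTree as ET

  SITEMAP_URL = "https://www.example.com/sitemap.xml"    # placeholder
  GSC_EXPORT = "indexed-not-submitted.csv"               # placeholder CSV export

  # Collect every <loc> entry from the sitemap.
  with urllib.request.urlopen(SITEMAP_URL) as response:
      tree = ET.parse(response)
  ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
  sitemap_urls = {loc.text.strip() for loc in tree.findall(".//sm:loc", ns)}

  # Collect the URLs from the Search Console export (URL in the first column).
  with open(GSC_EXPORT, newline="", encoding="utf-8") as f:
      reported = {row[0].strip() for row in csv.reader(f) if row and row[0].startswith("http")}

  # These URLs are indexed but not in the sitemap: add them, or noindex them.
  for url in sorted(reported - sitemap_urls):
      print(url)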

Pro tip

If you have an XML sitemap, but you simply haven’t submitted it to Google Search Console, all URLs will be reported with the type: “Indexed, not submitted in sitemap” – which is a bit confusing.

It makes sense to split the XML sitemap into smaller ones for large sites (say 10,000+ pages), as this helps you quickly gain insight into any indexability issues per section or content type.
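
As a rough illustration, the Python sketch below groups a flat list of URLs by their first path segment and writes one small sitemap per section; the URLs are made up, and the output file names are just examples.

  # Minimal sketch: split a flat list of URLs into one sitemap per top-level
  # section (for example /blog/ and /products/), so indexability issues show
  # up per section. The URLs and output file names are illustrative only.
  from collections import defaultdict
  from urllib.parse import urlparse
  from xml.sax.saxutils import escape

  urls = [
      "https://www.example.com/blog/what-is-seo",
      "https://www.example.com/products/red-shoes",
      "https://www.example.com/products/blue-shoes",
  ]

  # Group URLs by their first path segment.
  sections = defaultdict(list)
  for url in urls:
      path = urlparse(url).path.strip("/")
      sections[path.split("/")[0] if path else "root"].append(url)

  # Write one sitemap file per section.
  for section, section_urls in sections.items():
      lines = ['<?xml version="1.0" encoding="UTF-8"?>',
               '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
      lines += [f"  <url><loc>{escape(u)}</loc></url>" for u in section_urls]
      lines.append("</urlset>")
      with open(f"sitemap-{section}.xml", "w", encoding="utf-8") as out:
          out.write("\n".join(lines) + "\n")

You could just as well split by content type (articles, products, categories) – whatever grouping makes it easiest to spot where indexing problems cluster.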

Valid URLs with warnings

The “Valid with warnings” status only contains two types:

  1. “Indexed, though blocked by robots.txt”
  2. “Indexed without content”

Indexed, though blocked by robots.txt

Google has indexed these URLs, but they were blocked by your robots.txt file. Normally, Google wouldn’t have indexed these URLs, but apparently they found links to these URLs and thus went ahead and indexed them anyway. It’s likely that the snippets that are shown are suboptimal.

Please note that this overview also contains URLs that were submitted through XML sitemaps since January 2021.

Action required: review these URLs, update your robots.txt, and possibly apply robots noindex directives.
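
To get a first impression of which of these URLs are (still) disallowed, you can test them against your robots.txt. The Python sketch below uses the standard library’s robotparser with placeholder URLs; note that it doesn’t match Google’s robots.txt parsing in every edge case, so treat it as a first pass.

  # Minimal sketch: check which URLs are disallowed for Googlebot by your
  # robots.txt. The robots.txt location and the URLs are placeholders, and
  # Python's parser may differ from Google's in edge cases.
  from urllib.robotparser import RobotFileParser

  robots = RobotFileParser("https://www.example.com/robots.txt")
  robots.read()

  urls_to_check = [
      "https://www.example.com/search?q=shoes",
      "https://www.example.com/blog/what-is-seo",
  ]

  for url in urls_to_check:
      status = "allowed" if robots.can_fetch("Googlebot", url) else "blocked"
      print(f"{status}: {url}")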

Indexed without content

Google has indexed these URLs, but Google couldn’t find any content on them. Possible reasons for this could be:

  1. Cloaking
  2. Google couldn’t render the page, for example because the request was blocked and returned an HTTP 403 status code
  3. The content is in a format Google doesn’t index
  4. An empty page was published

Action required: review these URLs to double-check whether they really don’t contain content. Use both your browser and Google Search Console’s URL Inspection Tool to determine what Google sees when requesting these URLs. If everything looks fine, simply request reindexing.
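
If there are many URLs in this overview, a small script can flag the obvious cases before you inspect them one by one. The Python sketch below uses the third-party requests library and placeholder URLs; it doesn’t render JavaScript, so the URL Inspection Tool remains the authoritative check.

  # Minimal sketch: a first pass over "Indexed without content" URLs, flagging
  # pages that return an error status or a very small HTML body. Uses the
  # third-party requests library; the URLs and the 500-character threshold
  # are placeholders, and no JavaScript is rendered.
  import requests

  urls_to_check = [
      "https://www.example.com/empty-page",      # placeholder
      "https://www.example.com/blocked-page",    # placeholder
  ]

  for url in urls_to_check:
      response = requests.get(url, headers={"User-Agent": "content-check/1.0"}, timeout=10)
      size = len(response.text.strip())
      if response.status_code != 200 or size < 500:
          print(f"check manually: {url} (status {response.status_code}, {size} characters of HTML)")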

Excluded URLs

The “Excluded” status contains the following types:

  1. Alternate page with proper canonical tag
  2. Blocked by page removal tool
  3. Blocked by robots.txt
  4. Blocked due to access forbidden (403)
  5. Blocked due to other 4xx issue
  6. Blocked due to unauthorized request (401)
  7. Crawl anomaly
  8. Crawled – currently not indexed
  9. Discovered – currently not indexed
  10. Duplicate without user-selected canonical
  11. Duplicate, Google chose different canonical than user
  12. Duplicate, submitted URL not selected as canonical
  13. Excluded by ‘noindex’ tag
  14. Not found (404)
  15. Page removed because of legal complaint
  16. Page with redirect
  17. Soft 404

Crawled – currently not indexed

These URLs were crawled by Google, but haven’t been indexed (yet). Possible reasons why a URL may have this type:

  1. The URL was recently crawled, and is still due to be indexed.
  2. Google knows about the URL, but hasn’t found it important enough to index it, for instance because it has few to no internal links, duplicate content, or thin content.

Crawled – currently not indexed, visualized

Action required: make sure there aren’t important URLs among the ones in this overview. If you do find important URLs, check when they were crawled. If it’s very recent, and you know this URL has enough internal links to be indexed, it’s likely that this will happen soon.
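
One practical way to do this is to cross-reference the export with a list of URLs you consider important, for example your money pages. The Python sketch below assumes two placeholder input files: the export from this report as a CSV, and a plain-text file with one important URL per line.

  # Minimal sketch: flag important URLs that show up in the
  # "Crawled - currently not indexed" export. Both file names are placeholders.
  import csv

  GSC_EXPORT = "crawled-not-indexed.csv"     # placeholder export from Search Console
  IMPORTANT_URLS = "important-urls.txt"      # placeholder list, one URL per line

  with open(GSC_EXPORT, newline="", encoding="utf-8") as f:
      not_indexed = {row[0].strip() for row in csv.reader(f) if row and row[0].startswith("http")}

  with open(IMPORTANT_URLS, encoding="utf-8") as f:
      important = {line.strip() for line in f if line.strip()}

  for url in sorted(important & not_indexed):
      print(f"important but not indexed yet: {url}")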