What Are Canonical URLs and How Do You Use Them?

If any page on your site is accessible through multiple URLs, or you have different pages with identical content (for example, a mobile version and a desktop version), Google will treat them as duplicate versions of the same page.

Google will select one of these addresses and treat it as the “canonical version,” which will be the version that Google indexes and crawls. The other versions will be treated as duplicated content and will be crawled less frequently.

If you don't explicitly indicate which URL is the canonical version, Google will make this decision for you. Alternatively, it may treat these versions as having equal weight. This could lead to unexpected results, including ranking power being reduced or split among these URLs.

What is a Canonical Tag?

A canonical URL is the URL of the page that Google considers the most representative from a collection of similar or identical pages on your site. For example, if you have two URLs for the same page, like the following:

https://domain.com?dress=ABC
https://domain.com/dresses/abc

Google will choose one of them as the canonical URL. Note that pages don't need to be 100% identical to count as duplicate content; minor differences, such as filtering or sorting products by color, won't make a page unique.

There are various ways to indicate a canonical URL (more on that later), but here’s the most common method you might be familiar with: the canonical tag in the <head> section of a specific page:

<link rel="canonical" href="https://domain.com/master-version/" />

Properly defining and using canonical URLs is an integral part of technical SEO for your site.
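To check which canonical URL a page actually declares, you can read the tag back out of its markup. The following is only a minimal sketch using Python's standard library; the class name `CanonicalFinder`, the helper `find_canonical`, and the sample page are made up for illustration and are not part of any SEO tool.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lowercased.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the markup, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page that reuses the tag shown above.
page = """
<html><head>
<title>Dresses</title>
<link rel="canonical" href="https://domain.com/master-version/" />
</head><body></body></html>
"""
print(find_canonical(page))
```

A page without the tag simply yields `None`, which is a quick way to spot pages that leave the choice entirely to Google.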

[Image: Canonical Tags and How to Use Them. Credit: moz.com]

How Does Google Choose the Canonical Version?

When Googlebot crawls a site, it tries to identify the main content of each page. If it finds several pages that look similar, it will choose the page that seems most useful and mark it as the canonical version.
The canonical version will be crawled more frequently, while the duplicated version will be crawled less. This reduces the load on Google’s crawling resources.

Google chooses the canonical version based on parameters or signals such as:

  • Whether the page is served over HTTP or HTTPS
  • Preferred domain in Google Search Console
  • Page quality
  • Presence in the sitemap
  • Whether a canonical tag is present (i.e., the rel="canonical" tag)

You can indicate your preference to Google using these techniques, but it may still choose a different page as the canonical version based on various factors.

Why Have Identical or Duplicated Content?

There are legitimate reasons why you might have different URLs pointing to the same page, or identical or similar pages with different URLs. Here are common scenarios:

A. Supporting Different Device Types:

https://domain.com/news/my-post
https://m.domain.com/news/my-post
https://amp.domain.com/news/my-post

B. Enabling Dynamic URLs with Parameters or Session IDs:

https://www.domain.com/products?category=dresses&color=green
https://domain.com/dresses/cocktail?gclid=ABCD
https://www.domain.com/dresses/green/greendress

C. Automatically Generating URLs for Different Categories:

https://domain.com/dresses/red-dresses
https://domain.com/red-things/red-dresses

D. Handling www/non-www or http/https Server Variations:

http://domain.com/red-dresses
https://domain.com/red-dresses
http://www.domain.com/red-dresses

E. Content Syndication to External Sites:

/*** SYNDICATED POST ***/
https://news.domain.com/red-dresses-for-every-day-155672

/*** ORIGINAL POST ***/
https://blog.domain.com/dresses/red-dresses-are-awesome/3245/
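Several of the scenarios above (protocol and www variations, tracking parameters) can be detected mechanically by normalizing each URL and grouping the results. The sketch below is only an illustration under stated assumptions: it assumes the preferred form is HTTPS without www, and that `gclid` and a hypothetical `sessionid` parameter never change the page content. It does not cover mobile or AMP subdomains or syndicated copies, which need an explicit mapping.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these query parameters never change the page content.
TRACKING_PARAMS = {"gclid", "sessionid"}


def normalize(url):
    """Map protocol, www, and tracking-parameter variants to one URL."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    # Keep only the parameters that are not on the tracking list.
    query = urlencode(
        [(k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING_PARAMS]
    )
    # Assumption: HTTPS is the preferred protocol; fragments are dropped.
    return urlunsplit(("https", host, parts.path, query, ""))


variants = [
    "http://domain.com/red-dresses",
    "https://domain.com/red-dresses",
    "http://www.domain.com/red-dresses",
    "https://domain.com/dresses/cocktail?gclid=ABCD",
]

# Group the variants by their normalized form.
groups = {}
for url in variants:
    groups.setdefault(normalize(url), []).append(url)

for canonical, duplicates in groups.items():
    print(canonical, "<-", duplicates)
```

Each key in `groups` is a candidate canonical URL, and the list under it holds the variants that should point to it.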

When Should You Use a Canonical URL?

Setting the theory aside, let's look at when you should actually add a canonical tag:

1. You Know Exactly Why You Have Duplicated Pages:
If you know exactly why you have identical or similar pages, explicitly decide which page is the main one, and add a canonical tag to each secondary page (the duplicate URLs) that points to the canonical URL of the main page.

2. When there's an issue, perform a 301 redirect. In some situations a 301 or 302 redirect is more appropriate than a canonical tag. I won't delve into this topic in this post, but I promise to write about it soon. Feel free to subscribe to the group's mailing list.

More about permanent redirects (301 redirects) in the attached link.

3. Multiple pages of the same product series in the store. If you manage an online store and a series of products differs only slightly, for example in color or features, it's advisable to select one product as the main product and point the canonical tags of the other products in the series to it.

4. When filtering or sorting products in a catalog. Whether it's an online store or any other kind of catalog, if visitors can sort or filter through a URL parameter, the canonical tag should point to the default state of the page. Typically, the default state of the page is the URL without those parameters.

For example, on the following URL:

http://domain.com/dresses?sort=price

Point to the default URL:

<link rel="canonical" href="http://domain.com/dresses" />

5. For the print page. If your print pages are served through a parameter in the URL, you should also point the canonical tag from that URL to the page without the parameter. For instance, the following URL should declare http://domain.com/news.html as its canonical URL:

http://domain.com/news.html?print=yes

6. When using an affiliate system. If your affiliate links add a parameter to the URL, make sure the canonical tag on those URLs points to the main page without the parameter. Note that you can also tell Google through Search Console which URLs with parameters you don't want to appear in the index.
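Cases 4 to 6 share one mechanic: drop the parameters that don't change the content, and use what remains as the canonical URL. Here is a minimal sketch of that idea. The function name `canonical_link_tag` and the parameter list are assumptions for illustration; in particular, `ref` is a made-up affiliate parameter name, and your site's parameters will differ.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: parameters that only change presentation or tracking.
# "ref" is a hypothetical affiliate parameter name.
NON_CANONICAL_PARAMS = {"sort", "print", "ref"}


def canonical_link_tag(url):
    """Build the canonical <link> tag for a URL, without listed parameters."""
    parts = urlsplit(url)
    kept = [
        (k, v) for k, v in parse_qsl(parts.query)
        if k not in NON_CANONICAL_PARAMS
    ]
    canonical = urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )
    # Escape the URL so it is safe inside an HTML attribute.
    return '<link rel="canonical" href="{}" />'.format(
        escape(canonical, quote=True)
    )


print(canonical_link_tag("http://domain.com/dresses?sort=price"))
print(canonical_link_tag("http://domain.com/news.html?print=yes"))
```

Parameters that are not on the list are kept, so a URL whose parameter selects genuinely different content keeps its own canonical URL.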

Specific Selection of the Canonical Page

There are several ways to indicate to Google which is the canonical URL among duplicated pages, each with its advantages and disadvantages. Take a look at the following table to understand the options:

Method: General guidelines
Description: The general guidelines apply to all canonicalization methods.

Method: Preferred domain selection
Description: Use Search Console to choose specific URL addresses within a domain as canonical. For instance, you can choose domain.com instead of www.domain.com. This method is relevant when you have two identical sites with only a subdomain difference. Do not use this method to differentiate between HTTP/HTTPS protocols.

Advantages:

  • Very easy to choose, manage, and modify.
  • Suitable for identical sites with different domains.

Disadvantages:

  • Applicable only when the difference is in the domain. Pages need to have identical paths and names to be considered duplicates.
  • Allows mapping only between individual pages with the same name and path.

Method: rel=canonical <link> tag
Description: Add the <link> tag to the code of all duplicate pages and point it to the canonical address.

Advantages:

  • Possibility to map an unlimited number of duplicate pages.

Disadvantages:

  • The page may grow in size.
  • Changing the mapping on large websites or sites with frequently changing URLs might be complex.
  • Applicable only to HTML pages, not files like PDFs. In such cases, you can use the rel=canonical HTTP header.

Method: rel=canonical HTTP header
Description: Send the rel="canonical" header in the page's response.

Advantages:

  • The page doesn't grow (less code).
  • Possibility to map an unlimited number of duplicate pages.

Disadvantages:

  • Changing the mapping on large websites or sites with frequently changing URLs might be complex.

Method: Sitemap
Description: Explicitly list the canonical pages in the sitemap.

Advantages:

  • Easy to implement and modify, especially on large sites.

Disadvantages:

  • Googlebot still needs to determine which duplicate pages are related to the canonical pages specified in the sitemap.
  • The signal sent to Googlebot is weaker than the one sent by the rel=canonical tag.

Method: 301 redirects
Description: Use 301 redirects to inform Googlebot that the URL you're redirecting to is a better version than a specific URL. Use this method only when you want to retire a duplicate page.

Method: AMP version
Description: If one of your versions is an AMP page, follow the AMP guidelines to indicate the canonical page and the AMP version.

*The table and much of the information in this post were taken from Search Console Help.
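For files that have no <head>, such as PDFs, the HTTP header method from the table puts the canonical URL in a `Link` response header. The sketch below only builds the header name and value; how you attach it to a response depends on your server or framework. The function name `canonical_link_header` and the PDF URL are hypothetical.

```python
def canonical_link_header(canonical_url):
    """Build the (name, value) pair for a rel="canonical" HTTP header."""
    # Angle brackets delimit the URL in the header, so reject them inside it.
    if "<" in canonical_url or ">" in canonical_url:
        raise ValueError("URL must not contain angle brackets")
    return ("Link", '<{}>; rel="canonical"'.format(canonical_url))


# Hypothetical PDF that duplicates an HTML page.
name, value = canonical_link_header("https://domain.com/downloads/catalog.pdf")
print("{}: {}".format(name, value))
```

The resulting line is what a crawler would see in the response headers when it requests the file.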

Guidelines for Using ‘rel=canonical’ in the <head> Tag

This post doesn't cover the guidelines for every method in the table above, but since the rel="canonical" tag in the <head> is the most common method, here are some important guidelines for using it:

1. Use the <link> tag in the <head> of the page to indicate that the page is a duplicate of another page.
2. If the canonical page has a mobile variation, you need to add the rel="alternate" link to point to the mobile version in the following way:

<link rel="alternate" media="only screen and (max-width: 640px)" href="http://m.domain.com/dresses/red-dresses">

3. A canonical URL can (and should) point to itself. In other words, nothing prevents page X from declaring a canonical URL that is the address of page X itself.
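Guidelines 2 and 3 can be combined in a small helper that emits the <head> links for a canonical page: a self-referencing canonical tag, plus the rel="alternate" tag when a mobile version exists. This is a sketch only; the function name `head_links` is made up, and the 640px breakpoint simply reuses the value from the example above.

```python
from html import escape


def head_links(page_url, mobile_url=None, max_width=640):
    """Return the <link> tags for the <head> of a canonical page."""
    # Guideline 3: the canonical page points to itself.
    tags = [
        '<link rel="canonical" href="{}" />'.format(escape(page_url, quote=True))
    ]
    # Guideline 2: announce the mobile variation, if there is one.
    if mobile_url:
        media = "only screen and (max-width: {}px)".format(max_width)
        tags.append(
            '<link rel="alternate" media="{}" href="{}" />'.format(
                media, escape(mobile_url, quote=True)
            )
        )
    return tags


for tag in head_links(
    "http://domain.com/dresses/red-dresses",
    mobile_url="http://m.domain.com/dresses/red-dresses",
):
    print(tag)
```

Calling it without `mobile_url` returns only the self-referencing canonical tag.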

Summary

I hope this post helped you understand what a canonical URL (and the canonical tag that declares it) is, why it exists, and in which situations to use it.

I’m sure there’s missing information in this post since the topic is broader than it seems, but I promise to update the post later. In any case, you’re welcome to share your opinions and ask questions in the comments below. 🙂

Roee Yossef

I develop websites & custom WordPress themes by design. I love typography, colors & everything in between, and aim to provide high-performance, SEO-optimized websites with clean & semantic code.
