
Building and Submitting a Sitemap on WordPress Sites

A sitemap is a file in which you provide information about the pages, videos, and other files on your site, as well as the relationships between them. Search engines like Google read this file to crawl your site more intelligently.

The sitemap tells the crawler which files or pages you consider most important and provides essential information about them. For a page, for instance, it can indicate when the page was last updated, how frequently it changes, and whether it has versions in other languages.

Do You Need a Sitemap for Your Site?

If your site’s pages are properly connected by internal and inbound links, and the site has a clear hierarchy, search engines can usually reach most of your pages easily. Even so, a sitemap can aid the crawling process, especially when your site meets one of the following criteria:

  • Your site is very large – Google and other search engines might then miss new pages and pages that were recently updated.
  • Your site has an isolated archive – Pages that aren’t naturally linked to each other. In this case, it would be wise to add these pages to the sitemap to ensure that Google doesn’t miss them.
  • Your site is new and lacks sufficient inbound links – Google’s bot (Googlebot) crawls the web by following links from one page to another. As a result, Google and other search engines might miss pages that are not linked from any other site.

Even if your WordPress site doesn’t meet these criteria, a sitemap can help direct your crawl budget towards the most recently updated pages. As you will see in the example below, each page address in the sitemap file can carry a <lastmod> tag holding the date the page was last modified.
If you generate the sitemap with a plugin such as Yoast SEO, this tag is updated automatically whenever a page changes. Google takes the value into account as long as it is consistently accurate, so it can spend its crawl budget on pages that were updated recently. This is another reason why a sitemap can streamline the site’s crawling process.
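To make the role of <lastmod> a bit more concrete, here is a minimal Python sketch (not part of WordPress or Yoast) that picks the recently updated addresses out of a list of sitemap entries. The function name recently_modified, the 30-day window, and the sample addresses are my own assumptions for illustration, and the sketch assumes every <lastmod> value starts with a full YYYY-MM-DD date.

```python
from datetime import date, timedelta

def recently_modified(entries, today, days=30):
    """Return the URLs whose lastmod falls within the last `days` days."""
    cutoff = today - timedelta(days=days)
    recent = []
    for loc, lastmod in entries:
        # Assumption: lastmod starts with YYYY-MM-DD (a date or a full datetime).
        if date.fromisoformat(lastmod[:10]) >= cutoff:
            recent.append(loc)
    return recent

# Hypothetical entries, as (address, lastmod) pairs.
entries = [
    ("http://domain.co.il/new-post/", "2019-06-01"),
    ("http://domain.co.il/old-page/", "2017-03-15T10:00:00+02:00"),
]
print(recently_modified(entries, today=date(2019, 6, 10)))
```

This is roughly the kind of prioritization a crawler can do once it trusts the dates in your sitemap.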

To summarize this topic, using a sitemap won’t harm your site’s SEO. At worst, you won’t see better results. And if you don’t believe me, here’s what Google has to say on the subject:

“in most cases, your site will benefit from having a sitemap, and you’ll never be penalized for having one.”

Creating a Sitemap Using Yoast SEO

In general, working with a sitemap consists of three main steps: creating it, deciding which content it includes, and making it available to Google. Let’s go over these steps:

The popular Yoast SEO plugin lets you easily create a sitemap in XML format. All you need to do is install the plugin, which enables the sitemap option by default. You can find your sitemap at the following address:

http://your-domain.co.il/sitemap_index.xml

Yoast creates the sitemap as an index that contains multiple sitemaps, one for each content type on your WordPress site.

To enable or disable the sitemap functionality, go to SEO > General in the WordPress admin interface. Under the Features tab, you’ll find the option to turn the sitemap on or off globally.

Clicking the question mark icon will display the link to your created sitemap.

By default, all content on your site will be included in Yoast’s generated sitemap. Pages, posts, tags, taxonomies, archive pages, and custom post types (CPTs) will all be present in the sitemap created by the plugin. Let’s see how to remove specific content types from the sitemap…

Removing Content Types from Yoast’s Sitemap

To exclude a specific content type from the sitemap, go to SEO > Search Appearance and disable the Show “Your content type” in search results option for the content type you want to remove.

Note that apart from removing the content type from the sitemap, this action will instruct search engines not to index that content type by adding a noindex tag. As a result, this content won’t be indexed, and probably won’t appear in Google’s search results. So, be cautious.

Excluding a specific page or post from the sitemap involves setting it as noindex in Yoast’s local settings on the page itself. Here’s a post that explains more about Yoast SEO’s local settings.
It’s important to note that you cannot prevent images from being indexed using the Yoast SEO plugin. I wrote a post about how WordPress handles images and how to redirect attachment URLs to the image itself.

What If Your Images Are Served from a Different Location, Such as a CDN?

The wpseo_xml_sitemap_img_src filter allows you to manually change the URL of images that appear in the sitemap generated by Yoast SEO. The following code replaces the site’s domain with the CDN’s domain in every image URL when the sitemap is generated:

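// Rewrite image URLs in Yoast's XML sitemap so they point at the CDN.
function wpseo_cdn_filter( $uri ) {
  // Swap the site origin for the CDN origin; any other URL passes through unchanged.
  return str_replace( 'http://domain.co.il', 'http://cdn.domain.co.il', $uri );
}
add_filter( 'wpseo_xml_sitemap_img_src', 'wpseo_cdn_filter' );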

How to Make the Sitemap Available to Google

After creating the sitemap and deciding which content will appear in it, you need to notify Google about its existence. There are several ways to make your sitemap available to Google, but in this guide, I’ll mention the two main methods:

1. Submitting the sitemap through Search Console. Go to the Sitemaps section of your site’s Search Console account, add the URL of your sitemap, and click Submit.

2. Using the robots.txt file. Simply add the following line anywhere in the robots.txt file to tell Google the path to your sitemap:

Sitemap: http://domain.co.il/sitemap_index.xml
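If you want to check that the line is actually picked up, here is a small Python sketch that reads the Sitemap entries out of a robots.txt body using the standard library's urllib.robotparser. The helper name sitemaps_from_robots and the sample robots.txt content are assumptions for illustration; in practice you would feed it the real contents of your file.

```python
from urllib.robotparser import RobotFileParser

def sitemaps_from_robots(robots_txt):
    """Return the sitemap URLs declared in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []

# Hypothetical robots.txt content.
robots_txt = """User-agent: *
Disallow: /wp-admin/
Sitemap: http://domain.co.il/sitemap_index.xml
"""
print(sitemaps_from_robots(robots_txt))
```

Because the Sitemap line is independent of any User-agent group, it is found no matter where it sits in the file.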

Various Sitemap Formats

Google supports several sitemap formats and expects each one to follow the Sitemaps protocol. The main supported formats are XML, RSS, Atom, and plain text, with XML being the most common.

Let’s see an example of an XML-format Sitemap containing a single URL:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.co.il/foo.html</loc>
    <lastmod>2019-06-01</lastmod>
  </url>
</urlset>
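If you ever need to produce such a file without a plugin, here is a minimal Python sketch that builds the same structure with the standard library's xml.etree.ElementTree. The function name build_sitemap is an assumption for illustration, and this is not how Yoast generates its sitemaps.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(entries):
    """Return sitemap XML as UTF-8 bytes for (url, lastmod) pairs; lastmod may be None."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc  # special characters are escaped on output
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)

xml_bytes = build_sitemap([("http://www.example.co.il/foo.html", "2019-06-01")])
print(xml_bytes.decode("utf-8"))
```

Letting the library serialize the XML also takes care of escaping characters such as & in page addresses.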

Regardless of the format you choose, remember that a single sitemap cannot exceed 50MB (uncompressed) and cannot contain more than 50,000 URLs. If necessary, split the sitemap into multiple sitemaps.
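As a rough illustration of these limits, here is a short Python sketch that splits a long list of addresses into groups of at most 50,000 and checks a generated file against the 50MB cap. The names split_urls and within_size_limit and the page-N addresses are assumptions for illustration.

```python
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50MB, measured on the uncompressed file

def split_urls(urls, max_urls=MAX_URLS):
    """Yield consecutive chunks of at most max_urls URLs."""
    for start in range(0, len(urls), max_urls):
        yield urls[start:start + max_urls]

def within_size_limit(xml_bytes, max_bytes=MAX_BYTES):
    """Return True if the uncompressed sitemap fits the size limit."""
    return len(xml_bytes) <= max_bytes

# Hypothetical site with 120,000 pages.
urls = [f"http://domain.co.il/page-{i}.html" for i in range(120_000)]
chunks = list(split_urls(urls))
print([len(chunk) for chunk in chunks])  # [50000, 50000, 20000]
```

Each chunk would then become its own sitemap file.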

Optionally, you can create an index file for the sitemap (as Yoast does). This is an index that points to a list of different sitemaps, and in this case, you only need to send Google this index file. Alternatively, you can send separate sitemaps as mentioned before.
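Here is a minimal Python sketch of what building such an index file could look like, using the sitemapindex, sitemap, and loc elements defined by the Sitemaps protocol. The function name build_sitemap_index and the child sitemap addresses are assumptions for illustration, not output copied from Yoast.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap_index(sitemap_urls):
    """Return a sitemap index (UTF-8 bytes) that points at the given sitemap files."""
    index = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)

# Hypothetical child sitemaps, one per content type.
index_bytes = build_sitemap_index([
    "http://domain.co.il/post-sitemap.xml",
    "http://domain.co.il/page-sitemap.xml",
])
print(index_bytes.decode("utf-8"))
```

With an index like this, the index address is the only one you need to submit.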

Additional Media Types Supported by Google

Google supports additional media types in Sitemap files, such as images and videos. You can create separate sitemaps for these or add them to an existing sitemap.

For example, a video sitemap is a sitemap with additional information about videos on your pages. Using a video sitemap is an efficient way for Google to discover and understand the video content on your site, especially new content or content that’s hard to discover through regular crawling methods.

Here’s an example of a video sitemap. This example includes all the tags that Google uses:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
   <url>
     <loc>http://domain.co.il/videos/some_video_landing_page.html</loc>
     <video:video>
       <video:thumbnail_loc>http://domain.co.il/thumbs/123.jpg</video:thumbnail_loc>
       <video:title>Grilling Steaks in the Summer</video:title>
       <video:description>Alkis shows you how to make perfect steaks</video:description>
       <video:content_loc>http://cdn.domain.co.il/video123.mp4</video:content_loc>
       <video:player_loc>http://domain.co.il/videoplayer.php?video=123</video:player_loc>
       <video:duration>600</video:duration>
       <video:expiration_date>2021-11-05T19:20:30+08:00</video:expiration_date>
       <video:rating>4.2</video:rating>
       <video:view_count>12345</video:view_count>
       <video:publication_date>2007-11-05T19:20:30+08:00</video:publication_date>
       <video:family_friendly>yes</video:family_friendly>
       <video:restriction relationship="allow">IE GB US CA</video:restriction>
       <video:price currency="EUR">1.99</video:price>
       <video:requires_subscription>yes</video:requires_subscription>
       <video:uploader info="http://domain.co.il/users/grillymcgrillerson">GrillyMcGrillerson</video:uploader>
       <video:live>no</video:live>
     </video:video>
   </url>
</urlset>

You can also use separate sitemaps for images to help Google index them in image search results. This example shows a sitemap with two images:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>http://domain.co.il/sample.html</loc>
    <image:image>
      <image:loc>http://domain.co.il/image.jpg</image:loc>
    </image:image>
    <image:image>
      <image:loc>http://domain.co.il/photo.jpg</image:loc>
    </image:image>
  </url> 
</urlset> 
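To verify which images a sitemap like this actually lists, here is a small Python sketch that reads the image addresses per page, taking the two XML namespaces into account. The function name images_by_page is an assumption for illustration; the sample document repeats the example above.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
}

def images_by_page(xml_text):
    """Map each page address to the list of image addresses listed under it."""
    root = ET.fromstring(xml_text)
    result = {}
    for url in root.findall("sm:url", NS):
        page = url.findtext("sm:loc", default="", namespaces=NS).strip()
        images = url.findall("image:image/image:loc", NS)
        result[page] = [(img.text or "").strip() for img in images]
    return result

sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>http://domain.co.il/sample.html</loc>
    <image:image><image:loc>http://domain.co.il/image.jpg</image:loc></image:image>
    <image:image><image:loc>http://domain.co.il/photo.jpg</image:loc></image:image>
  </url>
</urlset>"""
print(images_by_page(sample))
```

A check like this is handy after adding a CDN filter, to confirm the image addresses changed as expected.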

Sitemap – General Guidelines

  • Use consistent and complete site addresses. Google will crawl the site addresses exactly as they are written. For example, if the site address is https://www.domain.co.il/, don’t use the address as https://domain.co.il/ (without www) or as a relative address.
  • Inform Google about alternative language versions of the site using the hreflang tag. The Yoast SEO plugin will do this automatically if you use popular plugins for multilingual sites.
  • Sitemap files should be in UTF-8 encoding.
  • Split large sitemap files into smaller files to prevent server load if Google frequently requests your sitemap. Each file should not contain more than 50,000 URLs and should be smaller than 50 MB uncompressed. In these cases, use an index file for the sitemap and list all the sitemap files within it. Send only this index file to Google instead of sending multiple separate files.
  • Do not include paginated pages in the sitemap. For these, use the standard rel="prev" and rel="next" link tags.
  • Use recommended canonicalization methods to indicate to Google if your site is accessible with both www and non-www versions of the domain. Send the sitemap for your preferred domain only.
  • If a page has different addresses for its mobile and desktop versions, it’s recommended to point to only one version in the sitemap. However, if you need to point to both, annotate the addresses to specify which one is for mobile and which is for desktop.
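The first guideline in the list is easy to check automatically. Here is a minimal Python sketch that flags addresses that are relative or that don't use the preferred scheme and host. The function name inconsistent_urls, the preferred address, and the sample addresses are assumptions for illustration.

```python
from urllib.parse import urlsplit

def inconsistent_urls(urls, preferred="https://www.domain.co.il"):
    """Return the URLs whose scheme or host differs from the preferred site address."""
    want = urlsplit(preferred)
    bad = []
    for url in urls:
        parts = urlsplit(url)
        # Relative addresses have an empty scheme and host, so they are flagged too.
        if (parts.scheme, parts.netloc) != (want.scheme, want.netloc):
            bad.append(url)
    return bad

# Hypothetical sitemap addresses.
urls = [
    "https://www.domain.co.il/about/",
    "https://domain.co.il/contact/",   # missing www
    "http://www.domain.co.il/blog/",   # wrong scheme
    "/relative/page/",                 # relative address
]
print(inconsistent_urls(urls))
```

Anything this check reports is worth fixing before you submit the sitemap.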

In Conclusion

The decision to use a sitemap or not is up to you. Personally, I don’t see a reason not to use it as the potential negative impact is very small. Don’t expect that using a sitemap will suddenly boost your rankings, but it will certainly aid Google’s process of crawling your site.

I hope you enjoyed the post, and if you have any questions, feel free to use the comments section below to improve, correct, ask questions, or simply drop a kind word… 🙂

Roee Yossef

I develop websites & custom WordPress themes by design. I love typography, colors & everything between, and aim to provide high performance, seo optimized websites with a clean & semantic code.
