The key to success when optimizing your eCommerce store for search engines is ensuring you’re running a tight ship. eCommerce sites are usually much larger than lead generation sites, and with that size comes a larger set of SEO considerations.

Search engines look negatively upon sites that serve a lot of the same content across multiple URLs and instead favour those that feature unique content. A common SEO issue eCommerce websites face is duplicate content, usually worsened by the sheer volume of pages present on these types of sites. If not managed properly, you may end up duplicating sizable sections of your site without even knowing it, damaging your search visibility.

It’s therefore really important that eCommerce webmasters get the implementation and functionality of their sites correct to avoid sitewide duplicate content issues.

This means avoiding instances where the same content can be served across different URLs. This can be external or internal to your site and can occur at root domain, category page and product page level.

Here are some of the more common duplicate content issues I come across in eCommerce and how to avoid them:

Not Setting Your Preferred Root Domain

This is an issue I often come across when auditing sites, eCommerce or otherwise, and it only requires a manual check within your browser to diagnose.

The idea is to have a preferred URL destination, to which all possible variations are 301 redirected. For instance, if your site is served over HTTPS – something all eCommerce stores should now seriously consider – the HTTP version should 301 redirect to the secure protocol (HTTPS version).

Similar principles apply to the ‘www.’ and ‘non-www.’ versions of your site and the trailing slash variants; one should 301 redirect to the other depending on your preference. To check whether your site is being affected by this issue, simply add or remove the elements from the root domain in your browser and try loading it. You’ll be able to see whether the 301 redirect is occurring as the URL you’ve typed in should switch to the preferred version.

You can also use Moz’s MozBar to check which status code is being served and further ensure everything has been configured correctly. Your preferred root domain should return a 200 response code while all other variants should return a 301 response code. Be sure to avoid 302 redirects in this instance as these do not allow link equity to pass.

Not consolidating your preferred root domain can lead to sitewide duplicate content issues. Redirect all possible variations of your root domain to your preferred choice in order to consolidate your content and link equity with search engines. An example of this can be seen on the MCA Leicester website, an eCommerce store that has implemented this correctly; all variants of their root domain 301 redirect to the HTTPS, ‘www.’ and non-trailing slash version of their site.
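As an illustrative sketch, on an Apache server with mod_rewrite enabled, this consolidation could be handled in .htaccess along the following lines (yoursite.com is a placeholder, and the preferred version is assumed to be the HTTPS, ‘www.’ and non-trailing slash variant – the exact rules will depend on your own server setup):

```
RewriteEngine On

# Force HTTPS and the 'www.' subdomain with a single 301 redirect
RewriteCond %{HTTPS} off [OR]
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^(.*)$ https://www.yoursite.com/$1 [R=301,L]

# 301 redirect trailing-slash URLs to their non-trailing-slash equivalents
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ /$1 [R=301,L]
```

Combining the protocol and ‘www.’ checks into one rule keeps redirect chains short, so crawlers and users land on the preferred URL in a single hop.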

It’s also good practice to communicate to Google which is your preferred root domain choice. You can do this via your Search Console profile.

Pagination

Pagination is when a site splits up content over a series of pages. Common to eCommerce sites, this type of functionality is typically used within a blog or on category pages where several products are listed.

Although pagination can have some user experience benefits, especially when compared to the disorienting nature of an infinite scroll, it does have crawler limitation implications as well as duplicate content implications.

For instance, presenting a search crawler with a series of paginated content can be an inefficient use of your crawl budget. Search bots will have to crawl a series of predominantly similar content and, depending on the depth of your pagination, they may not be able to reach each page in the series. Search bots have a finite amount of time to spend on your site so it’s important to ensure your site is easy to crawl.

Since the content of paginated pages is usually of a similar nature, duplicate content issues can also occur as there is little to differentiate the pages. Title tags, meta descriptions and body copy are often the same, resulting in thin content and confusion for search engine crawlers.

To overcome issues with pagination, you can consider doing the following:

Increasing the Number of Category Pages or Products Per Category Page

This is perhaps the least technical approach but still a viable option. You can alleviate the need for pagination by introducing further category pages or including more products per single category page. If you were to undertake this approach, ensure there is a genuine need to create further category pages, to avoid keyword cannibalization, and that the number of products per category page isn’t excessive. Having too many products per category page may come with its own user experience shortfalls, such as low click-through and conversion rates for products featured further down the page.

This approach is perhaps better suited to sites about to launch rather than those already established. Great Bean Bags is an example of a newer eCommerce store that uses a greater number of category pages, negating the need for pagination.


If your site has utilised pagination for some time, you may find it easier to implement the following solutions instead of overhauling your site structure.

Adding “Noindex” Meta Robots Tags

This approach involves adding the following tag into the <head> section of all pages in the paginated series except the first page:

<meta name="robots" content="noindex,follow">

The noindex directive will inform search engines not to include the paginated series within their index, while the “follow” directive will allow link equity to flow through the hierarchy.

This is perhaps the easiest solution to implement and ideal for situations where you believe there is no logical reason the paginated pages need to rank. For example, although some of the noindexed category pages may contain snippets of product information that you wish to rank, these can still appear in SERPs via their respective product pages. Remember, category pages are intended to be found via shorter-tail search phrases while product pages are usually found via more focused, longer-tail phrases.


Adding rel=”canonical” tags

The canonical tag approach is Google’s preferred method of combating pagination issues. All that is required here is to create a ‘View All’ page that features all your product offerings and implement canonical tags within the pagination to point to this page.

This signals to search engines that you are aware of the duplicate content issue on your site but wish for all ranking merit to be focused on the ‘View All’ page. This will exclude the paginated content from search engines’ indexes and allow only your ‘View All’ page to appear.
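As a sketch, assuming a hypothetical ‘View All’ page living at /category/view-all/, each page in the paginated series would carry the same tag in its <head> section:

```
<!-- Placed on /category/page-1/, /category/page-2/ and so on -->
<link rel="canonical" href="https://www.yoursite.com/category/view-all/">
```

The ‘View All’ page itself needs no canonical tag pointing elsewhere; it is the destination all the paginated variants defer to.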

Adding rel=”prev” and rel=”next” tags

This approach indicates to search engines the sequence of your pagination and its component URLs, allowing them to serve whichever URL they deem most appropriate. It is arguably the trickiest approach to implement but is made most appealing by its flexibility. It also negates the need for the ‘View All’ page required in the canonical tag solution outlined above. Here’s how you go about implementing rel="prev" and rel="next" tags:

https://www.yoursite.com/page-1/
Include a <link rel="next" href="https://www.yoursite.com/page-2/"> in the <head> section of the page to communicate to search engines the next page in the series.

https://www.yoursite.com/page-2/
Include a <link rel="prev" href="https://www.yoursite.com/page-1/"> and <link rel="next" href="https://www.yoursite.com/page-3/">, again in the <head> section of the page. This will communicate to search engines the previous and next pages in the series.

https://www.yoursite.com/page-3/
For the purposes of this example, let’s assume that “Page 3” is the last page in the sequence. In this instance, you work backwards by only implementing a rel="prev" tag, i.e. <link rel="prev" href="https://www.yoursite.com/page-2/">.
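Pulled together, the <head> section of each page in this hypothetical three-page series would contain:

```
<!-- page-1: the first page, so only a "next" link -->
<link rel="next" href="https://www.yoursite.com/page-2/">

<!-- page-2: a middle page, so both "prev" and "next" links -->
<link rel="prev" href="https://www.yoursite.com/page-1/">
<link rel="next" href="https://www.yoursite.com/page-3/">

<!-- page-3: the last page, so only a "prev" link -->
<link rel="prev" href="https://www.yoursite.com/page-2/">
```

Note the pattern: the first and last pages each carry only one link, while every page in between carries both.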

URL parameters

URL parameters are another cause of duplicate content within eCommerce SEO. Typically found at category page level when filters are used, this functionality serves predominantly the same content (often just reorganised according to the filters selected) under a virtually infinite number of URL variations.

Self-referencing canonicals

One workaround is to implement canonical tags at category page level, where each category page has a canonical tag pointing to itself. This indicates to search engines that regardless of the query string found at the end of the URL, all ranking merit should be focused on the original, clean URL. We Sell Electrical, an electrical supplies eCommerce store, uses this approach to consolidate its category pages and the query strings created by its ‘Shop by’ functionality. View their source code to see how they have implemented the self-referencing canonical on their category pages.
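As a sketch, using a hypothetical category URL and filter parameters, the category page would reference itself regardless of any query string the filters append:

```
<!-- Served unchanged on /category-a/ and on filtered variants
     such as /category-a/?sort=price&colour=blue -->
<link rel="canonical" href="https://www.yoursite.com/category-a/">
```

Because the tag always points at the clean URL, every filtered variant a crawler discovers consolidates back to the same page.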


Search Console

There’s also functionality within Google’s Search Console that allows you to discount URL parameters from Google’s index.


To do this, log in to Search Console, access your site’s profile and select ‘URL Parameters’ under ‘Crawl’ in the left-hand navigation menu. From here, you can add a parameter to instruct Google not to crawl content under specific query strings. Read this resource to find out more about the URL parameters tool within Search Console.

URL parameters can sometimes affect product pages too, especially if you’re using filters to select different sizes or colours. If this is the case, you can use the solutions outlined above to overcome these issues too.

Duplicate sub-category page and product page URLs

When you were designing your site structure and thinking of the user journey, you may have determined it necessary to nest the same products or sub-categories under several category directories.

And why shouldn’t you? This approach has several user experience and navigational benefits that may contribute towards better conversion rates. However, unsurprisingly, there are duplicate content issues applicable here also. Again, our saving grace is to implement canonical tags where you choose one preferred URL path to direct all ranking merit to.

Below is a typical example of where this is applicable to sub-category pages:

URL #1: https://www.yoursite.com/sub-category
URL #2: https://www.yoursite.com/category-a/sub-category
URL #3: https://www.yoursite.com/category-b/sub-category

A similar set-up can also occur with your product pages:

URL #1: https://www.yoursite.com/product
URL #2: https://www.yoursite.com/category-a/product
URL #3: https://www.yoursite.com/category-b/product

The above examples show the same sub-category and product found on three differing URLs; one falls directly off the root directory (URL #1) while the other two are found within two separate category folders (URLs #2 and #3). The best practice here is to implement a rel="canonical" tag on each page that points to URL #1. This allows for flexibility in your site’s future design while allowing all ranking potential to remain at the first URL.
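Using the sub-category example above, the same tag would sit in the <head> section of all three URLs:

```
<!-- Placed on URL #1, URL #2 and URL #3 alike -->
<link rel="canonical" href="https://www.yoursite.com/sub-category">
```

If you later restructure your categories, only the canonical target needs to stay stable; the category paths above it can change freely.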

Product descriptions

Whereas the majority of this post has discussed technical duplicate content issues, issues with product descriptions occur within the actual page copy.

This issue commonly occurs with sites that sell products which have been supplied by a third party. In a bid to save time and resources, eCommerce managers will lift product descriptions from the original manufacturer’s site and paste them onto their own. This approach is particularly detrimental to your product pages’ ranking potential as it puts you up against the original manufacturer’s page, i.e. the originator. Although writing them is an incredibly resource-heavy task, it’s more beneficial in the long term to feature original product descriptions on your eCommerce site. This approach also allows you to exercise your brand voice and inject more personality into your writing.

Summary

Avoiding duplicate content in eCommerce means quickly becoming accustomed to canonical tags, 301 redirects and meta robots tags. Using disallow directives in your robots.txt file is another approach adopted by many to avoid duplicate content issues but it all depends on what the issue is you’re trying to avoid.
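As a sketch, assuming hypothetical filter parameter names, a robots.txt disallow approach might look like this (bear in mind this blocks crawling rather than indexing, so it suits query strings you never want bots to fetch at all):

```
User-agent: *
# Block crawling of hypothetical filter query strings
Disallow: /*?sort=
Disallow: /*?colour=
```

A blocked URL can still appear in the index if it is linked to externally, which is why canonical tags or noindex directives are often the safer choice for the issues covered above.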

It’s easy within eCommerce to implement self-referencing canonical tags everywhere in a bid to quickly negate any duplicate content occurring. However, this approach isn’t always the solution and instead can serve to highlight the problem more. Instead, start by looking at your sitemap and what’s already in place; this holistic approach will let you determine exactly where duplicate content is occurring and what the best method is to counteract it.

What duplicate content issues do you most commonly come across in eCommerce? Do you have any alternative solutions to the issues I’ve outlined above? It would be great to hear your thoughts, so please leave a comment below.