An Extensive Guide on Magento Duplicate Content

Nothing is created equal, they say. They are wrong. And if you own an online shop, you know that perfectly well, because many pages on your site are literally identical, aren’t they? Our extensive guide will help you solve the problem of Magento duplicate content and keep it from coming back.

If you’re still not sure whether this guide is for you, just go to Google and search for site:youronlineshop.com (replace youronlineshop.com with your shop’s domain). How many pages are returned? Thousands? OK, then search for site:youronlineshop.com/&. What do you see now? Far fewer pages are returned, right? So what’s wrong?

The pages omitted from the second search sit in Google’s supplemental index, because Google considers them thin content. In most cases thin content means duplicate content (full or partial), and it is one of the greatest enemies of any Magento store. I even heard Google Panda searching for duplicate content once at night…

What are the types and most common examples of duplicate content in Magento?

Partial duplicates are pages where only a minor part of the content or layout is unique.

partial duplicates in Magento

Here are the most common examples of partial duplicates in Magento stores, each covered in detail below: product sorting, pagination, and variations of the same product.

Full duplicates are pages whose content is identical to that of one or more other pages.

Full duplicates in Magento

The most common example of full duplicates in Magento is the same product appearing in different categories.

Are you ready? Then let’s get down to the details!

Partial Duplicates in Magento

1. Product Sorting

It’s great when users can sort the products in your store by bestsellers, by newest, by price, by number of reviews, etc. It’s even better if people can decide how many products should be displayed on the page: 20? 50? 100? But all these sorting options create pages whose URLs carry extra parameters, marked by characters such as ?, = and |:

http://site.co.uk/category/products.htm?sortby=total_reviews|desc
http://site.co.uk/category/products.htm?sortby=total_reviews|asc
http://site.co.uk/category/products.htm?sortby=relevance|desc

The problem arises when sorting pages get indexed and even cached by Google. Imagine how many such pages can exist! Thousands! And Google’s crawlers spend time on them when they could be indexing the more important pages of your site: categories, products, etc.

1.2. How to find product sorting pages

First go to your product pages and sort them by any option. Now you can see the parameters added to the URL after sorting (e.g., dir, sortby). Go to Google and search for site:yourdomain.com inurl:dir

Most likely you’ll see this:

Google Supplemental Index

Just click to include the omitted results and you’ll see the pages in your store containing “dir” in the URLs.

It’s bad when these pages with parameters are indexed. But it’s even worse if they are cached. You can check this by clicking Cached in the search results or by searching for cache:url in Google.
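If you already have a list of your store’s URLs at hand (from a crawl or a sitemap export, for instance), you can also spot the parameters offline. The snippet below is only a rough sketch: the function name count_parameters and the sample list are made up for illustration, and it simply counts how many URLs carry each query parameter.

```python
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def count_parameters(urls):
    """Count how many URLs carry each query parameter name."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # A set, so a parameter repeated within one URL is counted once.
        names = {name for name, _ in parse_qsl(query, keep_blank_values=True)}
        counts.update(names)
    return counts


# Hypothetical sample; replace it with your own list of URLs.
sample_urls = [
    "http://site.co.uk/category/products.htm",
    "http://site.co.uk/category/products.htm?sortby=total_reviews|desc",
    "http://site.co.uk/category/products.htm?sortby=total_reviews|asc",
    "http://site.co.uk/category/products.htm?sortby=relevance|desc",
]

for name, total in count_parameters(sample_urls).most_common():
    print(name, total)
```

Parameters that show up on many URLs of the same category are the first candidates to check in Google with the inurl: search described above.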

1.3. How to remove product sorting duplicates

1.3.1. Google Webmaster Tools

Go to Google Webmaster Tools => Crawl => URL Parameters. Here you will see the parameters Google has found in the URLs of your store and how it crawls them. “Let Googlebot decide” is the default option there.

URL Parameters in Google Webmaster Tools

But when it comes to crawling your Magento store, it’s you, not Google, who should decide which pages get indexed, right? So if you haven’t decided this before, it’s high time you did! Click “edit”, choose “Yes” in the dropdown menu and then select “No URLs”.

Block indexing of URLs with parameters in GWT

You can also add parameters that are not listed in GWT and set crawling options for Google. But be careful and check twice (or even three times) before blocking the URLs with these parameters.

De-indexation of URL parameters in Google

1.3.2. Rel=canonical

You can also use canonicalization for the sorting pages in your Magento store. This way they stay accessible to users while pointing crawlers to the pages without parameters.

Canonicals? Great! But how do I implement them?

You should add this code to the <head> of the sorting pages:

<link href="CategoryURL" rel="canonical" />

where CategoryURL is the address of the same category page without parameters. For example, the following pages:

http://site.co.uk/category/products.htm?sortby=total_reviews|desc
http://site.co.uk/category/products.htm?sortby=total_reviews|asc
http://site.co.uk/category/products.htm?sortby=relevance|desc

should all point their canonical tag to this page:

http://site.co.uk/category/products.htm

After adding the code, give Google some time to re-index the pages and follow the new instructions; this usually takes a few days. Once canonicalization is set up correctly, you’ll see the cache of the canonicalized page (http://site.co.uk/category/products.htm) even if you check the cache of a sorting page (cache:http://site.co.uk/category/products.htm?sortby=relevance|desc).

using rel=canonical and blocking parameters in GWT
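The canonical rule for sorting pages boils down to “drop everything after the question mark”. Here is a minimal sketch of that idea in Python, used purely for illustration: the helper names canonical_url and canonical_tag are hypothetical and not part of Magento, and the sketch assumes that every query parameter on a category page is safe to ignore.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url):
    """Return the URL without its query string and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_tag(url):
    """Build the <link rel="canonical"> tag for a sorting page."""
    href = escape(canonical_url(url), quote=True)
    return f'<link href="{href}" rel="canonical" />'


sorting_pages = [
    "http://site.co.uk/category/products.htm?sortby=total_reviews|desc",
    "http://site.co.uk/category/products.htm?sortby=total_reviews|asc",
    "http://site.co.uk/category/products.htm?sortby=relevance|desc",
]

for page in sorting_pages:
    print(canonical_tag(page))
```

All three sorting pages end up with the same tag, pointing to http://site.co.uk/category/products.htm.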

2. Pagination

Your Magento store is big because you have lots of great products there, right? But even if you have only a few products (that are also great!), they are still spread across paginated pages.

For example:

http://www.site.com/category1.htm?page=2
http://www.site.com/category1.htm?page=3

2.1. How to find paginated duplicates

To find paginated pages in your Magento store, go to Google and search for site:yoursite.com inurl:page

This search returns all the pages on your site with “page” in the URL.

2.2. How to remove duplicates

2.2.1. Canonicalization

You already know a lot about canonicalizing pages in your store, right? Just make the rel=canonical tags on these paginated pages:

http://www.site.com/category1.htm?page=2
http://www.site.com/category1.htm?page=3
http://www.site.com/category1.htm?page=4

point to the category page:

http://www.site.com/category1.htm

Single rel=canonical for each page

2.2.2. Pagination with rel=”next” and rel=”prev”

This option was specifically created by Google to fight duplicate paginated results. The idea is that all paginated pages are connected like links in a chain:

how pages are connected by rel=prev and rel=next

If we take these pages as an example:

http://www.site.com/category1.htm
http://www.site.com/category1.htm?page=2

Then we should add the prev/next instructions as follows:

<link href="http://www.site.com/category1.htm?page=2" rel="next" />

in the <head> of http://www.site.com/category1.htm.

<link rel="prev" href="http://www.site.com/category1.htm" />
<link rel="next" href="http://www.site.com/category1.htm?page=3" />

in the <head> of http://www.site.com/category1.htm?page=2.

<link rel="prev" href="http://www.site.com/category1.htm?page=2" />

in the <head> of http://www.site.com/category1.htm?page=3 (let’s imagine this is the last page).

None of this is too complicated, so you can do it yourself or use a special module (our Improved Layered Navigation has such functionality) to implement rel=”next” and rel=”prev”.

rel prev next implementation

There are a few things to remember:

  1. The first paginated page should contain only rel=”next”
  2. The last paginated page should contain only rel=”prev”
  3. Google allows using both canonicals and rel=”next” and rel=”prev” on one page
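The first two rules are easy to get wrong by hand, so here is a small illustrative sketch of them. It is written in Python only to show the logic: the function names page_url and pagination_links are hypothetical, and the sketch assumes the first page is the bare category URL while later pages use a ?page=N parameter, as in the examples above.

```python
from html import escape


def page_url(base_url, page):
    """Page 1 is the bare category URL; later pages add ?page=N."""
    return base_url if page == 1 else f"{base_url}?page={page}"


def pagination_links(base_url, page, last_page):
    """Return the rel="prev"/rel="next" tags for one paginated page."""
    if not 1 <= page <= last_page:
        raise ValueError("page must be between 1 and last_page")
    tags = []
    if page > 1:  # every page except the first gets rel="prev"
        href = escape(page_url(base_url, page - 1), quote=True)
        tags.append(f'<link rel="prev" href="{href}" />')
    if page < last_page:  # every page except the last gets rel="next"
        href = escape(page_url(base_url, page + 1), quote=True)
        tags.append(f'<link rel="next" href="{href}" />')
    return tags


# Three pages in total, as in the example above.
for page in (1, 2, 3):
    print(page, pagination_links("http://www.site.com/category1.htm", page, 3))
```

The first page gets only rel="next", the last page only rel="prev", and every page in between gets both.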

3. Variations of the same product

Imagine you sell mugs (or maybe you really do?) and have landing pages for each color:

variations of one product

The characteristics are the same, the description is the same, the layout is the same… So what’s new? Just the picture! Unfortunately, that’s too little for Google to treat such pages as unique. This means that all product variations found on different pages are partial duplicates that act like a magnet for Google Panda.

3.1. How to find product variations

As a Magento shop owner, you probably know all your products and can make a list of their variations. Alternatively, you can search in Google for:

site:yoursite.com “here comes a short excerpt from the product description”

This way you will find all the pages of your site containing this very excerpt.

3.2. How to remove duplicates

3.2.1. One page for all variations

You can create a single page for a particular product and list all its variations there.

product variations on one page

This way you have one unique page instead of several duplicate ones.

3.2.2. Rel=canonical

You can also use rel=canonical. Just choose one variation page and point the canonical tags of the other variations to it. Users will still see every variation, but Google will index only the page you’ve chosen.

3.2.3. Make each variation page unique

The hardest way to solve duplication issues with product variations is to make each variation page unique. You will have to write different product descriptions and meta info for each one. Yes, this is extremely time-consuming. So be sure to grab a cup of coffee before starting this long journey!

Full Duplicates in Magento

Now that you’ve found all the partial duplicates, it’s time to look for full ones. Ready, steady, go!

The same product in different categories

You may have one product that can be found in two or more categories. For example:

http://www.site.com/jewellery/necklace.html
http://www.site.com/for-her/necklace.html
http://www.site.com/gifts/necklace.html

There’s only one necklace but 3 different URLs! In this case no matter how great your product is, Google will consider it thin content. It’s so unfair! So let your great products be great to Google by making them unique.

1.1. Rel=canonical

Just as with the product variations discussed above, you can choose one URL for your product (the necklace in the example) and have the other pages point their canonical tags to it.

1.2. Remove category from URL

Alternatively, you can remove the category path from the URL, so that each product has only one address no matter how many categories it appears in:

http://www.site.com/necklace.html

Don’t know how to do that in Magento? There was a good tip in one of our posts on SEO in Magento. Just go to System => Configuration => Catalog => Search Engine Optimization, switch the “Use Categories Path for Product URLs” field to “No” and set both “Canonical Link Meta Tag” fields to “Yes”.

remove category from URL in Magento
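To check which of your existing URLs would collapse into a single address once the category path is gone, a rough sketch like the one below can help. It is for illustration only: the helper name product_url_without_category is hypothetical, and the sketch assumes that the last path segment (necklace.html) identifies the product uniquely across the store.

```python
import posixpath
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def product_url_without_category(url):
    """Keep only the last path segment, dropping the category path."""
    parts = urlsplit(url)
    filename = posixpath.basename(parts.path)
    return urlunsplit((parts.scheme, parts.netloc, "/" + filename, "", ""))


category_urls = [
    "http://www.site.com/jewellery/necklace.html",
    "http://www.site.com/for-her/necklace.html",
    "http://www.site.com/gifts/necklace.html",
]

# Group the category URLs by the single address they would share.
groups = defaultdict(list)
for url in category_urls:
    groups[product_url_without_category(url)].append(url)

for single_url, duplicates in groups.items():
    print(single_url, "<-", len(duplicates), "category URLs")
```

Every group with more than one category URL is a set of full duplicates that this setting would merge into one address.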

1.3. Leave only one category path in a product URL

If you have a red T-Shirt in two categories at once, T-Shirts and New, you can choose which category to use in the URL: either the longest one (T-Shirts) or the shortest one (New). This is possible with the Unique Product URL extension.

Summary

Now you know how Google sees duplicate content in your store. Don’t let it decide how to treat your pages; suggest the best way yourself instead. Take control of how your site is crawled!