Duplicate content is something that very few sites escape. Most of the time it is not even intentional: it is created by our content management system. There are other types of duplicate content that are intentional and can harm the website.
What you need to know about duplicate content, in short, is that it occurs when the same content appears at multiple URLs, and that in principle it is not a reason for a penalty unless a high percentage of your website is duplicated. Having a few duplicate pages won’t make Google mad at us, but avoiding them is a sign that we’re on the right track.
Although it does not imply a penalty, it can cost you ranking potential, because search engines do not know which pages are the most relevant for a given search. Below I will go through some examples of duplicate content and how to fix them:
Canonicalization of the page
This is the most common reason for duplicate content. It occurs when your home page has more than one URL, for example with and without the “www” subdomain.
Each of these URLs leads to the same page with the same content. Without any redirection between them, the search engine does not know which one you want to send people to.
You have two options:
- Set up a redirect on the server so that only one version of the page is shown to users.
- Define which subdomain you want to be the main one (“www” or “non-www”) in Google Webmaster Tools.
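To make the first option more concrete, here is a minimal sketch of the decision a server-side redirect has to make. It assumes that “www.example.com” is the preferred host and that “example.com” is its only alias; both names, and the function `redirect_target`, are hypothetical and only serve as illustration. On a real server this logic would normally live in the web server's redirect rules.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration only: replace with your own hosts.
CANONICAL_HOST = "www.example.com"
ALIAS_HOSTS = {"example.com"}


def redirect_target(url):
    """Return the URL a 301 redirect should point to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host not in ALIAS_HOSTS:
        # Already on the preferred host, or a host we do not manage.
        return None
    # Keep scheme, path and query; only the host changes (ports are ignored here).
    return urlunsplit((parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, ""))


print(redirect_target("https://example.com/blog?page=2"))
print(redirect_target("https://www.example.com/blog"))
```

The point of the sketch is that every alias maps to exactly one preferred URL, so users and search engines only ever see one version of each page.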
Tags and categories
This occurs in blogs when many tag or category pages have the same content as other pages (something very, very common in the world of blogging). For example, we have a blog with 3 posts that have the following tags and categories:
- Title: How to fix duplicate content
- Tags: duplicate content, SEO, tips
- Categories: SEO, tips, content
- Title: How to detect duplicate content
- Tags: duplicate content, SEO, content
- Categories: SEO, content
- Title: Tips for creating quality content
- Tags: tips, content, quality content
- Categories: SEO, content, tips
If we list the posts that appear on each tag page and each category page, we can see that the following pages show exactly the same posts:
- The SEO tag and the duplicate content tag.
- The SEO category and the content category.
- The tips category and the tips tag.
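The same check can be done automatically. The following sketch takes the three example posts, builds the list of posts shown on every tag and category page, and reports the pages that end up with identical listings. The data structure and the function name `duplicate_archives` are my own choices for illustration, not part of any blogging platform.

```python
from collections import defaultdict

# The three example posts from the list above.
posts = [
    {"title": "How to fix duplicate content",
     "tag": ["duplicate content", "SEO", "tips"],
     "category": ["SEO", "tips", "content"]},
    {"title": "How to detect duplicate content",
     "tag": ["duplicate content", "SEO", "content"],
     "category": ["SEO", "content"]},
    {"title": "Tips for creating quality content",
     "tag": ["tips", "content", "quality content"],
     "category": ["SEO", "content", "tips"]},
]


def duplicate_archives(posts):
    """Group tag/category pages that list exactly the same set of posts."""
    listing = defaultdict(set)  # (kind, name) -> titles shown on that page
    for post in posts:
        for kind in ("tag", "category"):
            for name in post[kind]:
                listing[(kind, name.lower())].add(post["title"])

    groups = defaultdict(list)  # identical set of titles -> pages showing it
    for page, titles in listing.items():
        groups[frozenset(titles)].append(page)

    # Only groups with more than one page are duplicates of each other.
    return sorted(sorted(pages) for pages in groups.values() if len(pages) > 1)


for group in duplicate_archives(posts):
    print(group)
```

Run on the example, it reports the same three pairs: the two categories SEO and content, the tips category and tips tag, and the duplicate content and SEO tags.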
The solution depends on how you use categories and tags and how many of each you assign to a post. If you use few categories and many tags (like most people), add a noindex, follow meta tag to your tag pages; in this case your categories are the ones that will rank in the search results. If you use a lot of categories and few tags, do the reverse and add the noindex, follow meta tag to your category pages.
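As a rough sketch of that rule, the function below returns the robots meta tag a template could print for each type of page. The page type names ("tag", "category", "post") and the function `robots_meta` are hypothetical; most blogging platforms expose this through an SEO plugin or theme setting instead.

```python
def robots_meta(page_type, noindex_types=("tag",)):
    """Return the robots meta tag for a page type.

    noindex_types lists the archive types to keep out of the index:
    ("tag",) for few categories and many tags, ("category",) for the reverse.
    """
    content = "noindex, follow" if page_type in noindex_types else "index, follow"
    return f'<meta name="robots" content="{content}">'


print(robots_meta("tag"))       # tag pages stay out of the index
print(robots_meta("category"))  # category pages remain indexable
```

With `follow` kept in both cases, the links on the archive pages can still be crawled even when the page itself is not indexed.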
Mobile version of the website
This is happening more and more due to the huge increase in mobile traffic in the last year. What happens here is that every page of the website ends up with two URLs: one for the mobile version and one for the normal version.
There are a few possible solutions. The first is to make the mobile website genuinely different from the normal one, with different URLs and different designs for every page, so that the information is presented according to the device used to access the site.
That requires a lot of time and effort. If you do not have the time, the recommended option is a responsive design that dynamically adapts the layout to the resolution of the visitor’s screen.
Finally, the quickest solution is to add rel=canonical tags to all pages of the mobile website, each pointing to the corresponding page of the normal website.
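A minimal sketch of that last option follows. It assumes the mobile site lives on “m.example.com”, the normal site on “www.example.com”, and that both use the same paths; these hosts and the function `canonical_link_for_mobile` are assumptions made for the example only.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Hypothetical hosts; the sketch assumes both sites share the same paths.
MOBILE_HOST = "m.example.com"
DESKTOP_HOST = "www.example.com"


def canonical_link_for_mobile(mobile_url):
    """Build the rel=canonical tag a mobile page should carry in its <head>."""
    parts = urlsplit(mobile_url)
    if parts.hostname != MOBILE_HOST:
        raise ValueError(f"not a mobile URL: {mobile_url}")
    desktop_url = urlunsplit(
        (parts.scheme, DESKTOP_HOST, parts.path or "/", parts.query, "")
    )
    return f'<link rel="canonical" href="{escape(desktop_url, quote=True)}">'


print(canonical_link_for_mobile("https://m.example.com/shoes"))
```

If the mobile and normal sites do not share the same paths, the mapping would need a lookup table instead of a simple host swap.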
Parameters in the URL
There are many types of parameters, especially in e-commerce: product filters (color, size, rating, etc.), sorting and display options (lowest price, relevance, highest price, grid view, etc.) and user sessions. The problem is that many of these parameters do not change the content of the page, which means there are many URLs for the same content.
In this example we can see three parameters: color, low price and high price.
The solution to any problem with parameters is to add a rel=canonical tag that points to the original page. With this alone you avoid any confusion on Google’s part about which page is the original.
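To illustrate, the sketch below computes the URL that such a rel=canonical tag would point to by dropping the parameters that do not change the content. The parameter names (`color`, `min_price`, `max_price`, `sort`, `sessionid`) and the example URLs are hypothetical; which parameters are safe to ignore depends on your own site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters assumed not to change the page content.
IGNORED_PARAMS = {"color", "min_price", "max_price", "sort", "sessionid"}


def canonical_url(url):
    """Strip ignorable parameters so every variant maps to one original URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(canonical_url("https://www.example.com/shoes?color=red&min_price=10&max_price=50"))
print(canonical_url("https://www.example.com/search?q=boots&sort=price"))
```

Parameters that do change the content, such as the search term `q` in the second call, are kept so that genuinely different pages keep different canonical URLs.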
Another possible solution is to tell Google, through Google Webmaster Tools > Configuration > URL parameters, which parameters should be ignored when indexing the pages of your website.
Pagination
Pagination refers to an article, a product list, or a tag or category archive that is split across more than one page. Although the pages have different content, they are all focused on the same topic. This is a huge problem on e-commerce sites that have hundreds of items in the same category.
Currently there are rel=next and rel=prev tags that let search engines know that all the pages belong to the same category or post, so that they do not index every page and instead concentrate all the ranking potential on the first page.
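As an illustration of how these tags are laid out across a paginated series, here is a small sketch. It assumes pages are addressed with a `?page=N` parameter and that page 1 is the bare URL; that URL scheme and the function `pagination_links` are assumptions for the example, not a requirement of the tags themselves.

```python
def pagination_links(base_url, page, last_page):
    """Return the rel=prev / rel=next tags for one page of a paginated series."""
    if not 1 <= page <= last_page:
        raise ValueError("page out of range")

    def page_url(number):
        # Assumption: page 1 is the bare URL, later pages use ?page=N.
        return base_url if number == 1 else f"{base_url}?page={number}"

    links = []
    if page > 1:
        links.append(f'<link rel="prev" href="{page_url(page - 1)}">')
    if page < last_page:
        links.append(f'<link rel="next" href="{page_url(page + 1)}">')
    return links


for tag in pagination_links("https://www.example.com/shoes", 2, 3):
    print(tag)
```

The first page only gets a rel=next tag and the last page only a rel=prev tag, which is what marks the two ends of the series.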
Another solution is to find the pagination parameter in the URL and enter it in Google Webmaster Tools so that the paginated URLs are not indexed.
In addition to these, there are many more causes of unintended duplicate content, but these are perhaps the most common and the ones with a more or less simple solution. Duplicate content is something to be constantly on the lookout for: fixing a few duplicate pages is easy, but when their number becomes gigantic it can be a very tedious task.