Duplicate content falls into two main categories:
- Intentional
- Unintentional
Yup, either you meant to duplicate something or you didn’t. Shocker. There is a time and place for both, and duplication isn’t always a bad thing, but it should be handled correctly. Duplicate content is content that appears in more than one place on the internet. On the surface, this seems pretty straightforward, but it gets a little more nuanced when you remember that “one place” means a URL (www.example.com), and that the duplicate URL could unintentionally be on your own site at, say, example.com (where you have both the www and non-www versions).
Intentional Duplicate Content
Duplicate content in general is looked down on by search engines because it makes their job of finding the most relevant content more difficult. If you have a piece of content on your site that’s a duplicate of something from another site, well, it’s still plagiarism if you copied it and tried to pass it off as yours. And beyond just being lazy, it doesn’t help you in search results. Why? Because it’s not unique, it doesn’t add any value (beyond what the original already offers), and someone else did it first.
As an internet user, if you search for a topic, would you want the original writer or someone who copied the original? Most of the time you want the original, because they are the original thought leader and may have more valuable content for you to read.
So, if you’re thinking about copying from another site… just don’t do it. Whew, done. On to the more nuanced stuff.
Unintentional Duplicate Content
If you have duplicate content within your own site, this can be a problem for a couple of reasons. One, search engines don’t know which version to show searchers. Two, users don’t know which version to link to, so you end up with links pointing to both versions, diluting them both instead of building up one strong version.
The most common “duplicate content” issues we see are:
- HTTP/HTTPS and www/non-www versions of your website
  - This is controlled at the host level and is an easy fix, but can create full duplicate websites if not done correctly
- URL duplicate content
  - An example here might be: www.example.com/yoga-for-beginners and www.example.com/yoga-for-beginners?episode1
  - In this case both pages have the same content but different URLs due to what we call “parameters”
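If you have a list of your site’s URLs handy, both issues above can be spotted with a few lines of Python. This is only a rough sketch: the function names are made up for illustration, and it assumes that the scheme, the “www.” prefix, and any parameters never change what the page shows, which won’t be true on every site.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def normalize(url):
    """Reduce a URL to a host-plus-path key so variants of one page collide."""
    # Bare URLs like "www.example.com/page" need a scheme before urlsplit
    # will recognize the host.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    # Assumption: parameters and trailing slashes do not change the content.
    path = parts.path.rstrip("/") or "/"
    return host + path

def group_duplicates(urls):
    """Return only the keys that more than one URL maps to."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}
```

Feeding it www.example.com/yoga-for-beginners and www.example.com/yoga-for-beginners?episode1 puts both under the single key example.com/yoga-for-beginners, which flags them as candidates for one of the fixes below.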
Don’t worry, easy fix!
Do you need both versions live?
Sometimes you actually need both versions to exist; maybe you have a print version, for example.
If the answer is no, the easiest solution is to use a 301 redirect, aka a permanent redirect, on the duplicate and point it to the original. This means when someone reaches that duplicate URL they will seamlessly be redirected to the intended one (this includes search engines too!).
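In practice you would usually set up a 301 in your host or server settings, but the decision behind it is simple enough to sketch in Python. Everything here is illustrative: the preferred scheme and host are assumptions, `respond` is a made-up helper rather than part of any framework, and the sketch drops parameters entirely, which only makes sense if they never change the page.

```python
from http import HTTPStatus
from urllib.parse import urlsplit

# Assumption: the version you want to keep lives at http://www.example.com.
PREFERRED_SCHEME = "http"
PREFERRED_HOST = "www.example.com"

def respond(requested_url):
    """Return (status, headers) for a request to requested_url."""
    path = urlsplit(requested_url).path or "/"
    # Assumption: parameters never change the content, so they are dropped.
    preferred_url = f"{PREFERRED_SCHEME}://{PREFERRED_HOST}{path}"
    if requested_url == preferred_url:
        # Already the intended URL: serve the page as usual.
        return HTTPStatus.OK, {}
    # Any other variant gets a permanent redirect to the intended URL.
    return HTTPStatus.MOVED_PERMANENTLY, {"Location": preferred_url}
```

A request for http://example.com/yoga-for-beginners?episode1 comes back as a 301 with a Location header pointing at http://www.example.com/yoga-for-beginners, while the preferred URL itself is served normally.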
If the answer is yes and you need both versions to exist, you’ll want to use a canonical tag, aka a canonical link element. This is a line put into the page code of the duplicate page that tells search engines, “hey, we know this is a duplicate, here is the original, the one you should show in search results.” This tag looks something like this:
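<link rel="canonical" href="http://www.example.com/yoga-for-beginners"/>
You’d put this code in the <head> of the duplicate page (the one you do NOT want to show in search results), www.example.com/yoga-for-beginners?episode1. Make sure the quotes in the tag are plain straight quotes; curly “smart” quotes pasted in from a word processor will break it.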
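Once the tag is in place, it’s worth confirming that the duplicate page really carries it. Here is a minimal sketch using only Python’s standard library; `find_canonical` is a hypothetical helper, and it assumes you already have the page’s HTML as a string (fetching the page is left out).

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # rel can hold several space-separated values, so split before checking.
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attributes.get("href")

def find_canonical(html_text):
    """Return the canonical URL declared in html_text, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical
```

Run against the HTML of www.example.com/yoga-for-beginners?episode1, it should hand back the URL of the original page; a result of `None` means the tag is missing or malformed.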
That’s it. Either you don’t need the duplicate and you can use a 301 redirect, or you do and you’ll use a canonical tag. But you won’t ever duplicate content from another site… you’re more original than that.
You may be wondering if you have duplicate content at this point. To find out, go to google.com and type in “site:yoursite.com”. This will return the pages of your site that Google has indexed. Take a look through those to see if you’ve got any duplicates, and get to work fixing them using the questions above.
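If the list of indexed pages is too long to eyeball, a small script can do the comparing for you. The sketch below assumes you have already saved each page’s text into a dictionary keyed by URL (how you collect it is up to you), and it only catches pages whose text matches exactly after tidying up spacing and capitalization; near-duplicates will slip through. The function names are hypothetical.

```python
import hashlib
from collections import defaultdict

def fingerprint(page_text):
    """Hash the page text, ignoring differences in whitespace and case."""
    normalized = " ".join(page_text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def find_duplicate_pages(pages):
    """Given {url: page_text}, return lists of URLs that share identical text."""
    by_fingerprint = defaultdict(list)
    for url, text in pages.items():
        by_fingerprint[fingerprint(text)].append(url)
    return [sorted(urls) for urls in by_fingerprint.values() if len(urls) > 1]
```

Each list it returns is a set of URLs showing the same content, so for each one you can ask the question from earlier: do you need both versions live?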