Author: Sebastian Scheplitz
What is Duplicate Content?
Duplicate content, in general, is identical or very similar content that appears more than once on the Internet. This can happen on more than one page of your website and also across multiple sites.
And here is how Google defines “duplicate content”:
“Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar.”
If you are looking for a quick overview of this topic, we’ve got you covered.
Let’s get a bit more specific now. We all know that copying or scraping an entire article or, worse, an entire website is an unforgivable offense in the content marketing world.
But how about similar sentences on different pages of the same website or different websites? Let’s break it down.
If the content is word-for-word identical to that of another page, then it’s, in theory, duplicate content.
Page A: Affiliate Marketing is the key to success for iGaming operators.
Page B: Affiliate Marketing is the key to success for iGaming operators.
And even if the content is just slightly rephrased, it can still be considered duplicate content as well.
Page A: Affiliate Marketing is the key to success for iGaming operators.
Page B: Affiliate Marketing can increase traffic for iGaming operators.
There’s a huge BUT, though. Because just a few similar words DON’T make it duplicate content – and we’ll go into detail below.
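To get a feel for the difference between “identical” and “slightly rephrased,” here is a minimal sketch that compares the two example sentences by the share of two-word sequences they have in common (Jaccard similarity). This is only an illustration under our own assumptions; it is not how Google measures similarity, and the function names are made up for this example.

```python
import re


def shingles(text: str, size: int = 2) -> set:
    """Return the set of overlapping word sequences (shingles) in the text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 2) -> float:
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


page_a = "Affiliate Marketing is the key to success for iGaming operators."
identical = "Affiliate Marketing is the key to success for iGaming operators."
rephrased = "Affiliate Marketing can increase traffic for iGaming operators."

print(similarity(page_a, identical))  # a full match scores 1.0
print(similarity(page_a, rephrased))  # a partial overlap scores between 0 and 1
```

A score like this only tells you how much wording two snippets share. Whether that overlap actually matters depends on context and page structure, as explained below.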
How Does Google Know About Duplicate Content?
Having duplicate content on your site, under certain circumstances, can hurt your organic search performance badly.
That’s because Google’s algorithm doesn’t actually read your website or pages; it scans through them. Duplicate content can cause it to conclude that two or more pages are similar and therefore show information for only one of them.
And Google (let’s be honest, we’re all mostly trying to rank there and not on Bing) doesn’t like ranking pages that are just the same. One always has to be better than the rest in its eyes. Plus, content clones just don’t add value to the users.
But here’s the kicker: only duplicate or similar content in the exact same place with the exact same page structure is affected!
Here is another example. Let’s say there are two iGaming affiliate sites, each with its own bonus page: a “Bet365 sportsbook bonus” page and a “Tipico sportsbook bonus” page. Assuming that they both have the exact same structure and content, Google will only favor one of the sites.
Which one, you may ask.
We’ll get to that – pinky promise.
Moreover, if you’re scraping other websites entirely, this can result in a bigger penalty from Google. In the worst-case scenario, it can completely unrank your page. This is super rare, though.
With billions of websites on the Internet, it’s a given that someone else has already talked about a topic on their site. And Google knows that not all duplicate content is bad, and it won’t penalize your website if you have no ulterior motive to fool the search engine.
Here are some circumstances where duplicate content is acceptable (Google’s words, not ours):
1. Discussion forums that can generate both regular and stripped-down pages targeted at mobile devices
2. Items in an online store that are shown or linked to by multiple distinct URLs
3. Printer-only versions of web pages
The same goes for talking about sports events. Do you think that Google would penalize your site for mentioning this sentence “2021 Australian Open Men’s Singles Qualifiers in Doha, Qatar”?
Of course not!
Millions of sports websites would be unranked if that were true. BUT copying full articles around this topic would land you in trouble, for sure.
Or, using our above examples, it’s not an issue to use the sentence “Affiliate Marketing is the key to success for iGaming operators.” per se. But if that’s one of the main long-tail keywords that you want to rank for, then you need to be careful about the context and structure you’re using on your site.
Now that we’ve explained what is considered acceptable as duplicate content and what is not, let’s go into detail on how to remove duplicate iGaming content and why it matters in terms of content marketing for iGaming companies.
Why Is It Important to Avoid Creating Duplicate iGaming Content?
Having duplicate content, whether intentional or not, adds more problems to your life than value for website users and search engines.
Here are a few reasons why:
- Website owners will feel the impact of duplicate content by losing rankings and site traffic. That’s because search engines would be forced to choose one version among several similar ones to show on their search results, which would simultaneously lower the visibility of all the other versions.
- Search engines won’t know which version to add to/remove from indexing; therefore, they won’t be able to decide which version is the most suitable to rank for.
What Causes Duplicate Content Issues?
Apart from the obvious case of someone copying someone else’s hard work, this problem can also stem from certain technical issues, or unintentional mistakes, to put it nicely.
Regardless, we will be looking into both aspects that contribute to this duplicate content issue.
Duplicate Content Caused by Technical Issues
1. www vs. non-www and HTTPS vs. HTTP
Let’s say you have two live website versions due to a faulty configuration: www.translationroyale.com and translationroyale.com (without www). This would automatically result in duplicate content.
The same concept applies to websites that have both https:// and http:// versions running at the same time.
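As a rough sketch of the idea, the snippet below maps all four scheme/host combinations of a page to one preferred version. Which version you prefer is your call; here we simply assume HTTPS with www, and the /blog path is a made-up example.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption for illustration: the HTTPS + www version is the one to keep.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.translationroyale.com"
KNOWN_HOSTS = {"translationroyale.com", "www.translationroyale.com"}


def preferred_url(url: str) -> str:
    """Map any scheme/host variant of the site to the single preferred one."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in KNOWN_HOSTS:
        return url  # not our site, leave it untouched
    return urlunsplit(
        (PREFERRED_SCHEME, PREFERRED_HOST, parts.path, parts.query, parts.fragment)
    )


variants = [
    "http://translationroyale.com/blog",
    "http://www.translationroyale.com/blog",
    "https://translationroyale.com/blog",
    "https://www.translationroyale.com/blog",
]
unique_targets = {preferred_url(v) for v in variants}
print(unique_targets)  # four live variants collapse into one URL
```

In practice this mapping usually lives in your web server or CDN configuration as a redirect rule rather than in application code.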
2. URL Structuring
The portion of the URL after the domain name is case-sensitive for Google. Therefore it sees two URLs that differ only in capitalization as two different pages. However, this isn’t an issue for Bing. Google’s behavior mirrors Linux, the operating system that many servers on the internet, including Google’s, use, which is case-sensitive for file names.
For example, there’s a difference between the files translationroyale-logo.jpg and TranslationRoyale-Logo.jpg.
And the same rule applies to pages and post URLs since they are considered as files on your server.
This isn’t a huge problem per se, but if you don’t have any rules in place for file names and URLs, this could cause both of the mentioned examples to be indexed.
The result is that either your website visitor sees a 404 error when they type in the wrong spelling of the URL, or you end up with two different pages with fully duplicate content on your server.
The same concept applies to the trailing slash, which is a forward slash at the end of a URL. This can cause duplicate content as well.
Now imagine you mix both – different cases and trailing slashes – all the time. This could lead to two, three, four, five duplicate pages if you’re not careful.
Our suggestion is to make it a rule for everyone in your company to only use lower case for everything and to never ever use a trailing slash.
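If you want to enforce that rule in code, a minimal sketch could look like this. It assumes your site really does serve every page under a lower-case path without a trailing slash; the example.com URLs are placeholders.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_path(url: str) -> str:
    """Lower-case the path and drop trailing slashes (the root "/" is kept)."""
    parts = urlsplit(url)
    path = parts.path.lower()
    if len(path) > 1:
        path = path.rstrip("/")
    # An empty path is written as "/" so the root always looks the same.
    return urlunsplit(
        (parts.scheme, parts.netloc, path or "/", parts.query, parts.fragment)
    )


spellings = [
    "https://example.com/Bonus-Guide/",
    "https://example.com/bonus-guide",
    "https://example.com/BONUS-GUIDE",
]
normalized = {normalize_path(u) for u in spellings}
print(normalized)  # every spelling ends up at the same address
```

The normalized address is the one to link to internally and the one that other spellings should redirect to.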
3. Query Strings and URL Parameters
Let’s start off with explaining what a URL parameter is:
A URL parameter is a method to track information about a click-through to a URL. Also known as “query string,” it is one of the main ways that content can be filtered, organized, and presented to the user.
You can always find the URL parameters after a question mark in a URL.
Let’s look at an example for reference:
Let’s say there is an online shop that sells poker cards in different colors. A URL with a color filter and a price-sorting parameter would show the website visitors all the red decks sorted from the highest to the lowest price.
Filter options are tricky as they will create various combinations in cases where more than one filter is available.
To make it more complex, the duplicate content issue might also be caused by the order of parameters in a URL since these can be rearranged.
For search engines, this will create confusion when they crawl the page; however, it is something that won’t affect the user experience.
But fret not! The parameter-handling features of tools such as Google Search Console and Bing Webmaster Tools will help you fix this problem, especially if you have a search function or a large website.
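To illustrate the parameter-order problem, here is a small sketch that sorts the parameters and drops a few tracking ones, so that rearranged URLs end up identical. The shop URL, the parameter names color and sort, and the list of ignored tracking parameters are all assumptions for this example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of tracking parameters that never change what the page shows.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign"}


def canonical_query(url: str) -> str:
    """Sort query parameters and drop tracking ones so equal pages compare equal."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = sorted((k, v) for k, v in pairs if k not in IGNORED_PARAMS)
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


a = canonical_query("https://example.com/cards?color=red&sort=price_desc")
b = canonical_query(
    "https://example.com/cards?sort=price_desc&color=red&utm_source=newsletter"
)
print(a == b)  # both orderings describe the same filtered page
```

A helper like this is handy for generating consistent internal links or canonical URLs, so that crawlers only ever meet one spelling of each filtered page.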
4. Website Taxonomy
This is also known as URL taxonomy and simply refers to how the structure of your website pages is divided into content silos. If you’ve never paid attention to this, now is the right time to start.
Mapping pages from a crawling tool’s perspective, assigning adequate headers (H1, H2, H3), and setting focus keywords are where you should start off from. This helps Google to understand that your pages are grouped and related.
For example, if your website is about online sports betting affiliates, your content silo would look like this:
How to Remove Duplicate Content Caused by Technical Problems
Solution 1: 301 Redirects
If there are duplicate pages on your website with the same content, this can be easily fixed by setting up a 301 redirect from the less preferred pages to the original page/URL.
If traffic signals from numerous pages with high ranking potential are merged into one page, it will create a better chance for the “original” content to rank higher.
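How you set up a 301 redirect depends on your server or CMS, so take the following only as a sketch of the principle. It is a tiny WSGI application in which known duplicate paths answer with “301 Moved Permanently” and point to the preferred page. The paths in the mapping are invented for illustration.

```python
# Hypothetical mapping from duplicate paths to the preferred ("original") path.
REDIRECTS = {
    "/sportsbook-bonus-old": "/sportsbook-bonus",
    "/bonus/sportsbook": "/sportsbook-bonus",
}


def app(environ, start_response):
    """Minimal WSGI app: answer known duplicate paths with a 301 redirect."""
    path = environ.get("PATH_INFO", "/")
    target = REDIRECTS.get(path)
    if target is not None:
        # A permanent redirect tells crawlers to use the target URL instead.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"original page"]
```

To try it locally, you could serve app with wsgiref.simple_server.make_server from Python’s standard library. On a production site, the same mapping would normally go into the web server or CMS redirect settings.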
Solution 2: Canonical URL
Another good solution is adding a canonical URL to all the duplicate pages.
A canonical URL is declared with an element (rel=canonical) that identifies the original version of the content even if the content is also found on another page. This tag tells Google which version is the original one.
There are two types of canonical tags:
- Canonical Tags Pointing to a Page: Self-referential canonical tags identify the original version of the content.
- Canonical Tags Pointing Away From a Page: These tags inform Google that this is not the main version of the content.
You can add a canonical link element (rel="canonical"), whose href attribute contains the link to the original content, to the HTML head of each duplicate page version.
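As a minimal sketch, this is how a page template could generate that element. The function name and the example.com URL are placeholders of our own, not part of any particular CMS.

```python
from html import escape


def canonical_tag(original_url: str) -> str:
    """Build the <link rel="canonical"> element for a page's HTML head."""
    return f'<link rel="canonical" href="{escape(original_url, quote=True)}">'


tag = canonical_tag("https://example.com/sportsbook-bonus")
print(tag)
```

Every duplicate version would carry the same tag pointing to the original, while the original itself can carry a self-referential one.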
If you do not identify a canonical URL manually, Google might choose it for you, and that’s not always the best idea since Google might choose the page you don’t want to rank as much as the other one!
As an example, it could rank the page that brings in the lowest revenue compared to the original page. In terms of a casino affiliate website, this could be a page where the deal with the operator or affiliate program is not as great, or maybe only very time-, region- or game-specific.
Solution 3: Meta Robots Noindex Attribute
Meta tags are another technical detail worth looking at to avoid duplicate iGaming content. They send signals to the search engine describing what your web page is about.
If you decide to remove several pages from being indexed by Google, the meta robots tag is your best bet. The meta robots attribute (noindex, follow) will allow the search engine to do the usual link crawling; however, it will exclude those pages from being indexed.
It’s vital to allow Google to keep crawling the page even though it won’t be indexed. Because one thing is for sure: Google hates crawling exclusions.
Adding the meta robots noindex tag is straightforward. Simply add the meta robots tag to the HTML head of the pages you wish to exclude from the search engine result pages.
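The tag itself is a single line in the head. The sketch below shows it and, as a small sanity check you might run over your own pages, parses a page’s HTML to confirm that the noindex directive is really there. The class and function names, as well as the sample page, are our own inventions for this example.

```python
from html.parser import HTMLParser

# The tag to place in the HTML head of pages that should not be indexed.
NOINDEX_TAG = '<meta name="robots" content="noindex, follow">'


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def is_noindex(html: str) -> bool:
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = (
    "<html><head><title>Printer version</title>"
    + NOINDEX_TAG
    + "</head><body>...</body></html>"
)
print(is_noindex(page))
```

A check like this only confirms that the tag is present in the HTML. Whether a page is actually excluded from the index is still up to the search engine once it has crawled the page.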
A non-indexed page will still be used to determine your site’s overall value, according to Google’s John Mueller, but it won’t be counted towards any duplicate content.
Duplicate Content Caused by Intentionally Copying
Other Parties Copying Your Content
If the other website that copied your content has a higher domain authority than yours, it could be taken as the original publisher of the content in Google’s eyes.
Ideally, the other website has to give credit to your site by linking to your page and also add a canonical URL that leads to your page.
However, if your content was copied without your permission, you need to contact the owner of that website to request its removal. If no agreement has been reached and the content is still live on their platform, you can ask Google to remove that page from the search results.
Using Content From Another Site
The same rule applies if you are the one who “borrowed” content from another site.
It’s essential to use a canonical URL or a meta robots noindex tag and add a link to the original source after you get permission from the website owner to use their content.
We hope we’ve answered the question of how to avoid creating duplicate content and removing it. Please make sure to set aside a good amount of time to analyze ways to improve some of your website technicalities using the tips mentioned above.
It takes a significant amount of effort and patience to generate highly relevant traffic, but it is worth every ounce of effort in the long run.
Another important takeaway in this article is to create valuable content for your visitors while adding extra precautions to avoid having your content copied elsewhere. Of course, the same also applies when you are using someone’s content on your website.
In such a competitive industry, a good content marketing strategy for iGaming companies requires churning out unique and authentic content for readers.
If you’re on the lookout for original and high-quality iGaming content writing and translation services, we can help you with it straight away.
Remember, if there is a problem, there is always a solution.
Header Image Source: Photo by meen_na on freepik
How did you like Sebastian Scheplitz’s blog post “Avoid Duplicate Content to Gain Organic Traffic”? Let us know in the comments if you have anything to add, have another content idea for iGaming blog posts, or just want to say “hello.” 🙂