Marketing Case Study: Duplicate Content Analysis

It goes without saying that original content is key to good search engine rankings; authentic texts go hand in hand with content usefulness, proper media optimization, and interlinking. "Content is King" has been an accepted truth for decades, yet there are still websites with major, if unintentional, plagiarism issues. Our marketing team decided to run an internal experiment: check client websites in different niches for duplicate content, broken links, and slow page load speed, and investigate the most common causes of these issues.

Duplicate Content Case Study Infographic

Revealing Text Issues with Website Duplicate Content Checker

Our team used the free content checker Siteliner to scan 555 websites in total. A few words about the tool: Siteliner is a product of Indigo Stream Technologies, the company behind Copyscape, the well-known plagiarism checker for content writers (we also use it to recheck content uniqueness for our clients). Siteliner scans a website within several minutes, the exact time depending on the number of pages the tool has to crawl. Once a website is fully analyzed, the user receives a report on:

  • number of pages scanned vs. total pages found (the free version scans only 250 pages per website)
  • number of broken links on a website
  • percentage of duplicate, common and original content
  • average page load speed
  • average page size
  • number of words per page
  • text-to-HTML ratio
  • internal and external links

and compares this data to other websites on the Web. All these parameters are crucially important for proper internal website optimization, but in this case study we will focus on duplicate content in particular. It should be noted that the free version of the tool limits the number of checks per day, so the whole data gathering process took us one month to complete. Here is a sample Siteliner report:
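As an illustration of one of these metrics, here is a minimal sketch of how a text-to-HTML ratio can be computed. Siteliner's exact formula is not published, so this is only an assumption of the general approach (visible text length divided by total markup length):

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects the visible text of an HTML document."""
    def __init__(self):
        super().__init__()
        self.chunks = []
        self.skip = 0  # depth inside <script>/<style>, whose content is not visible text

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip:
            self.skip -= 1

    def handle_data(self, data):
        if not self.skip:
            self.chunks.append(data)

def text_to_html_ratio(html: str) -> float:
    """Visible text length divided by total document length (0.0 for empty input)."""
    parser = TextExtractor()
    parser.feed(html)
    text = " ".join("".join(parser.chunks).split())
    return len(text) / len(html) if html else 0.0

page = "<html><head><style>p{}</style></head><body><p>Hello world</p></body></html>"
print(round(text_to_html_ratio(page), 2))
```

A very low ratio usually means the page is mostly markup and scripts with little readable content, which is exactly what this report field is meant to flag.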

Siteliner Website Content Checker


Results of Duplicate Content Check

In total, our team scanned 54,823 pages on 555 websites and found that:

  • 0.36% of websites had over 75% duplicate content
  • 5.76% of websites had 50%–74% duplicate content
  • 8.82% of websites had 20%–49% duplicate content
  • 36.94% of websites had 10%–19% duplicate content
  • only 5.04% of websites had no duplicate content issues at all
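To make the percentages above concrete: duplicate-content checkers typically compare overlapping word sequences between pages. Siteliner's actual algorithm is proprietary, so the sketch below is only an illustration of the general idea, using word shingles:

```python
def shingles(text: str, n: int = 3) -> set:
    """Return the set of n-word shingles (consecutive word n-grams) of a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def duplicate_percentage(page: str, other: str, n: int = 3) -> float:
    """Share of `page`'s shingles that also occur on `other`, in percent."""
    sa, sb = shingles(page, n), shingles(other, n)
    if not sa:
        return 0.0
    return 100.0 * len(sa & sb) / len(sa)

post = "our marketing team decided to run an internal experiment"
tag_page = "our marketing team decided to run an internal experiment today"
print(f"{duplicate_percentage(tag_page, post):.1f}% duplicated")
```

Real checkers are more sophisticated (they normalize boilerplate, navigation, and templates before comparing), but the shingle overlap captures why near-identical pages score so high.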

Per-page analysis of the websites with a high number of duplicated pages revealed an issue with tags: the content of tag pages was identical to that of the posts carrying those tags. This is a common issue for blogs, and it doesn't hurt SEO as long as tag pages are not indexed and not included in the sitemap. To fix the problem, we changed the settings in the All in One SEO Pack plugin (General Settings) to deindex tags. It's good practice to add a noindex meta tag to category, tag, date archive, author, and paginated pages/posts, since those usually contain a selection of posts and often cause duplicate content problems.
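After deindexing, it is worth verifying that tag and archive pages actually serve the robots meta tag. A small sketch of such a check (the sample HTML below mimics what SEO plugins typically emit; it is illustrative, not taken from a real site):

```python
from html.parser import HTMLParser

class RobotsMeta(HTMLParser):
    """Detects a <meta name="robots"> tag whose content includes "noindex"."""
    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and (a.get("name") or "").lower() == "robots":
            if "noindex" in (a.get("content") or "").lower():
                self.noindex = True

def has_noindex(html: str) -> bool:
    p = RobotsMeta()
    p.feed(html)
    return p.noindex

# Typical markup a WordPress SEO plugin emits for a deindexed tag page:
tag_page_html = '<head><meta name="robots" content="noindex,follow"></head>'
print(has_noindex(tag_page_html))
```

In practice you would fetch each tag/archive URL and run this check over the response body; any page that comes back False is still eligible for indexing.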

Fixing Tags to Avoid Content Duplicates


Broken Links Issues

During our check we also found 2,576 broken links in total across 250 websites and noticed that:

  • 0.9% of websites had over 100 broken links
  • 1.8% of websites had 50–100 broken links
  • 2.16% of websites had 10–49 broken links
  • 40.18% of websites had up to 9 broken links

The most common causes of these broken links were links to pages restricted to registered users; incorrect links in menus and sidebars, which propagated to every page of a website; and temporary 508 response codes, which pointed to the hosting resource limit being exceeded. We fixed such issues manually by editing menus and sidebars and, of course, by resolving the resource limit problems with the hosting.
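For reference, a broken-link check like Siteliner's boils down to requesting each URL and classifying its HTTP status. A minimal sketch with Python's standard library (the classification rule below is our assumption of what counts as "broken"):

```python
import urllib.request
import urllib.error

def is_broken(status: int) -> bool:
    """Treat anything outside 2xx/3xx, or a network failure (0), as broken."""
    return not (200 <= status < 400)

def link_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a URL, or 0 on network failure."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        return e.code  # 4xx/5xx responses still carry a status code
    except urllib.error.URLError:
        return 0       # DNS failure, refused connection, timeout, etc.

def find_broken(urls):
    """Yield (url, status) pairs for links that look broken."""
    for url in urls:
        status = link_status(url)
        if is_broken(status):
            yield url, status
```

Note that some servers reject HEAD requests, so a real crawler commonly falls back to GET before declaring a link broken, and throttles requests to avoid tripping the very resource limits described above.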


Analyzing Average Site Load Speed

Another Siteliner report we paid attention to was average page load time, since it's a well-known fact that slow websites don't rank well. Furthermore, a one-second delay in page response can lead to a 7% decrease in conversions. Our analysis showed that:

  • 1,982 ms was the average page load time
  • 16,024 ms was the maximum page load time
  • 366 ms was the minimum page load time

Here is the distribution of average page load time per website:

Average Page Load Time Histogram


Since the industry standard is considered to be 671 ms, only 3.6% of websites were loading fast enough. To fix this, we applied image compression, enabled caching, and optimized JavaScript and CSS files where possible, following our website load speed issues guide. Once all the fixes were done, we started monitoring website rankings in Ahrefs to see how this internal content optimization influenced positions in search engines.
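Load-time figures like the ones above can be reproduced with a simple timing harness. A sketch using only the standard library (the URL list would come from the actual site being audited):

```python
import time
import urllib.request

def load_time_ms(url: str, timeout: float = 30.0) -> float:
    """Time a full GET of the page body, in milliseconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()
    return (time.perf_counter() - start) * 1000.0

def summarize(times_ms):
    """Return (average, maximum, minimum) load time, as in the report above."""
    return sum(times_ms) / len(times_ms), max(times_ms), min(times_ms)
```

Keep in mind this measures server response plus transfer time, not full browser rendering, so the numbers will be lower than what a real visitor experiences on a JavaScript-heavy page.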

Want to fix duplicate content issues on your website? Contact our team for a free estimate!


Mobilunity - Dedicated Developers