10-Minute SEO Content Cleanup: How to Find & Remove Low-Value Pages Step-by-Step

by Ryder Meehan | Dec 14, 2021

As SEO pros, we are always thinking about new content ideas to rank for more keywords, get more traffic, and drive more conversions.

But what about eliminating the content clutter on your website that’s accumulated over the years? Removing content from Google and other search engines (specifically low-value content) is one of the best ways to improve your website’s SEO.

If your site should have about 500 pages but Google search results show 7,960 pages indexed, then you, my friend, have what is called “index bloat”…

So why is it a good idea to cut low-value content, how do you quickly identify low-value pages, and how do you remove them from search engine indexing like a bad tribal tattoo?

What is Considered Low-Value Content

Simply put, this is any page that provides little value to a visitor entering your website through organic search. These pages can be totally useless (and can be deleted) or could be useful to some users but not organic search visitors (and can be NoIndexed).

Examples of low-value pages to delete

  • Ancient blog posts with outdated content – that are not worth updating
  • Discontinued product pages
  • Time-sensitive content like job postings or events
  • Former employee or author pages

Examples of low-value pages to NoIndex

  • Company announcements and news posts that were timely but are no longer needed
  • Privacy policy and terms & conditions pages
  • Docs pages for ancient product versions
  • Thank you pages

Examples of low-value pages to Optimize

  • Ancient blog posts with outdated content – that could be valuable if updated
  • Quality content that isn’t performing due to lacking SEO or depth

Oh, and don’t forget pages that should be canonicalized, such as paginated blog list pages!

Benefits of Removing Content from Google’s Search Results

Now that you know WHAT low-value content is, let’s review the benefits of getting rid of it instead of just letting it pile up.

If you’re not familiar with the term “index bloat”, it’s when search engine bots are inundated with low-quality pages. Index bloat wastes crawl budget and slows down how quickly search engines reach the pages that matter. Maintaining a clean website means search engines will only index the URLs you want visitors to find, delighting search engines AND USERS alike!

Let’s summarize the benefits:

  • Eliminate pages that aren’t valuable – users don’t like stumbling into poor content any more than search engines.
  • Improve link equity of the remaining pages – e.g., if your website has 100 backlinks and 1,000 pages, that averages out to 1/10 of a link per page, but with the same 100 backlinks and just 500 valuable pages, each page averages 1/5 of a link.
  • Faster, fuller page crawls for remaining pages – inaccurate and meaningless content impacts Google’s ability to crawl your website, according to Google’s Search Quality Evaluator Guidelines.
  • More manageable website – maintaining the design, content, and SEO of a website is an ongoing task. Things break and information gets outdated so the fewer pages to handle, the easier your life will be.
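The link equity arithmetic above can be sketched in a few lines of Python. This is only the simplified even-split model used in the example (every backlink shared equally across every page), not how search engines actually distribute link value, and the function name links_per_page is made up for illustration.

```python
from fractions import Fraction


def links_per_page(backlinks: int, indexed_pages: int) -> Fraction:
    """Average backlinks per page under a simplified even-split model."""
    if indexed_pages <= 0:
        raise ValueError("indexed_pages must be positive")
    return Fraction(backlinks, indexed_pages)


# Numbers from the example: 100 backlinks, 1,000 pages before vs 500 after cleanup.
before = links_per_page(100, 1000)
after = links_per_page(100, 500)

print(before)          # 1/10
print(after)           # 1/5
print(after / before)  # 2
```

Under this rough model, halving the page count doubles the average share of links each remaining page gets.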

The Content Removal Process

Now that you know WHY removing content from Google search results is a good idea, let’s cover how to find these pages and how to properly remove web pages from indexing.

Here is a summary of the steps:

  • Open Screaming Frog (Download here if you don’t have it)
  • Connect GA and GSC (Configuration > API Access > Google Analytics / Google Search Console)
  • Connect your sitemap (Configuration > Spider > Crawl these sitemaps: **paste your sitemap URL**)
  • Filter to HTML
  • Export to an Excel file and open the file
  • Remove unnecessary Excel columns
  • Create four Excel sheet tabs: Delete, NoIndex, Canonicalize, Optimize
      • Delete: junk pages with no value for SEO or for users (e.g., blogs from 10 years ago with no traffic, template pages, empty pages, duplicate pages)
      • NoIndex: pages not valuable for SEO but potentially useful for users who still need access to the content (e.g., WordPress category or tag pages, privacy policy pages). A NoIndex tag keeps these out of search results while leaving them available on the site.
      • Canonicalize: paginated pages that users tab through (e.g., blog list pages 2/3/4/5…)
      • Optimize: the remaining valuable pages that you want to keep and optimize
  • Review the pages in your export and organize them in the appropriate tabs.
  • Then go back and optimize each remaining page’s Page Title, META Description, H1, and any other elements.
  • Have the client implement your Delete, NoIndex, and Canonicalize changes, or go into the client’s website and implement them yourself, but only AFTER client review/approval.
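Once the NoIndex and Canonicalize changes are live, it helps to spot-check that the tags actually made it onto the pages. Below is a minimal sketch using only Python’s standard library. It assumes the changes were made with a robots meta tag and a rel="canonical" link in the page HTML; it does not look at HTTP headers such as X-Robots-Tag. The names IndexingTagParser and check_page and the example.com URLs are hypothetical.

```python
from html.parser import HTMLParser


class IndexingTagParser(HTMLParser):
    """Collects robots meta directives and the canonical URL from page HTML."""

    def __init__(self):
        super().__init__()
        self.robots = []       # e.g. ["noindex", "follow"]
        self.canonical = None  # href of <link rel="canonical">, if present

    def handle_starttag(self, tag, attrs):
        attr = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attr.get("name", "").lower() == "robots":
            content = attr.get("content", "")
            self.robots += [d.strip().lower() for d in content.split(",")]
        elif tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            self.canonical = attr.get("href") or None


def check_page(html: str) -> dict:
    """Report whether the page is noindexed and where its canonical points."""
    parser = IndexingTagParser()
    parser.feed(html)
    return {"noindex": "noindex" in parser.robots, "canonical": parser.canonical}


# Hypothetical page from the NoIndex tab.
thank_you_page = '<head><meta name="robots" content="noindex, follow"></head>'
# Hypothetical paginated page from the Canonicalize tab.
blog_page_2 = '<head><link rel="canonical" href="https://example.com/blog/"></head>'

print(check_page(thank_you_page))  # {'noindex': True, 'canonical': None}
print(check_page(blog_page_2))     # {'noindex': False, 'canonical': 'https://example.com/blog/'}
```

In practice you would feed check_page the HTML fetched for each URL in your NoIndex and Canonicalize tabs and compare the result to what the spreadsheet says it should be.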

A few tips:

  • Filter to pages with 0 or very low GA sessions over the last year (but look out for brand new pages). These pages can probably be deleted, NoIndexed, or canonicalized.
  • Filter to pages with a very low word count, under 100 words. These pages can likely be removed or need to have content added.
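If you would rather script these two filters than click through Excel, here is a rough sketch that works on a CSV version of the export. The column names ("Address", "GA Sessions", "Word Count") are assumptions, so check them against the header row of your own file, and the sample rows are made up. The script only flags candidates for review. It has no idea how old a page is, so brand new pages still need a manual check.

```python
import csv
import io

# Assumed column names; adjust to match the header row of your own export.
URL_COL = "Address"
SESSIONS_COL = "GA Sessions"
WORDS_COL = "Word Count"


def to_int(value):
    """Parse a cell like '1,234' or '' into an int, treating blanks as 0."""
    value = (value or "").replace(",", "").strip()
    try:
        return int(float(value))
    except ValueError:
        return 0


def flag_low_value(rows, max_sessions=0, min_words=100):
    """Return (url, reasons) for pages with low traffic or thin content."""
    flagged = []
    for row in rows:
        reasons = []
        if to_int(row.get(SESSIONS_COL)) <= max_sessions:
            reasons.append("low sessions")
        if to_int(row.get(WORDS_COL)) < min_words:
            reasons.append("thin content")
        if reasons:
            flagged.append((row[URL_COL], reasons))
    return flagged


# Made-up sample standing in for the exported file.
sample_csv = """Address,GA Sessions,Word Count
https://example.com/,5400,850
https://example.com/blog/2011-news,0,420
https://example.com/tag/misc,3,60
https://example.com/thank-you,0,45
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
for url, reasons in flag_low_value(rows):
    print(url, reasons)
```

To run it on a real export, swap the io.StringIO sample for open("your-export.csv", newline="", encoding="utf-8") and raise max_sessions if "very low" on your site means more than zero.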

Remove Outdated Content from Google Using the New Removal Tool

If you want to remove content from Google’s index fast, you can expedite the process using the new Removals tool in Google Search Console.

Google’s Removals tool makes requesting the removal of online content easier and faster. Waiting on a crawl of your website can take a few days, weeks, or even months to update in Google’s search results, but the Removals tool handles requests much more quickly. Keep in mind that the tool only hides URLs temporarily, so you still need to delete or NoIndex the page for a permanent fix.

This tool lets site owners:

  1. Temporarily hide URLs in Google.
  2. See content that users have reported as “outdated” and that isn’t showing in Google search results.
  3. See which URLs have been reported as adult content and are being filtered by Google’s SafeSearch filter.

 

Watch Google’s removal request tool training video here for details.

Conclusion

So if you’ve followed our guide, you should have a lean, mean, link-equity optimized website for search engines and users to delight in.

To recap, we covered what low-value content is, why you should get it out of search results, how to find it, and how to nuke those useless web pages.

It’s good practice to complete a content cleanup on a regular schedule. Depending on the amount of content and how often you post, that could be anywhere from monthly to annually.

Happy content cleaning!

Ryder Meehan

Ryder has been on a 16-year journey to master digital marketing from every aspect. His resume includes Razorfish, Slingshot, Fossil, Samsung Mobile and Tatcha before launching Upgrow. Ryder is the acting CEO, heading business development and account services. He has been featured as a digital marketing leader on Forbes, PRNews, Business.com, Workamajig, Databox, Fit Small Biz and other outlets.