SOP 027: How to perform a URL audit on your site
Goal:
Knowing which of your site’s pages need to be indexed, deindexed, redirected or canonicalized, as well as knowing how to perform each of these actions.
Ideal Outcome:
You will clean up your website’s URL profile by indexing only the pages that need to be indexed and resolving some duplicate content issues.
Prerequisites or requirements:
- The following procedure applies solely to WordPress.org websites that are entirely set up (including having the Yoast SEO plugin installed). If your WordPress.org site is not fully set up yet, you can see how to do it in SOP 008 (web version).
- Furthermore, keep in mind that this procedure only applies to small-to-medium-sized websites (usually, up to a few hundred URLs). The URL audit process for large sites is outside the scope of this SOP.
Why this is important:
This URL audit is very important for crawl-path optimization. In other words, indexing the right URLs on your website will help Google and the other search engines crawl the pages you want them to display in SERPs, as opposed to pages that only make your website look outdated, spammy, or simply low-quality.
Where this is done:
In your browser, in your WordPress.org website, using Google Sheets, as well as the Export All URLs and Redirection WordPress plugins.
When is this done:
When you want to optimize your website for the search engines. Ideally, this should be done before a site goes live, but it can be done afterward as well.
Who does this:
You, your SEO specialist, a VA you hired, or a digital marketing agency you employed.
Find all URLs on your website
The first step you have to take in the URL audit process is finding all the URLs on your website. Follow this procedure to extract them:
Download, install and activate the Export All URLs plugin.
To do this, you first need to download the plugin’s ZIP file at the aforementioned link.
1. Go to your WordPress Admin Panel → Plugins → Add New → Upload Plugin.
2. Upload the plugin from your computer.
3. Click on “Install Now”.
4. Click on “Activate”.
5. Go to Settings → Export All URLs.
6. Tick “All Types”, “URLs”, and “CSV”.
7. Click on “Export”.
8. Click on “Download Now”.
9. Select where to download the .csv file.
10. Rename it (for the purpose of this example, we used “Exported_Data_All_URLs”, but you can rename it any other way that makes it easy to recognize).
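If you would like to look at the exported URLs before opening the file in Google Sheets, the short Python sketch below is one way to do it. It is only an illustration: the function names are hypothetical, and because the exact column layout of the export is not described in this SOP, the sketch makes no assumption about column names and simply keeps every cell that starts with http:// or https://.

```python
import csv
import io


def extract_urls(csv_text):
    """Return every cell that looks like a URL, in file order, without duplicates."""
    urls = []
    seen = set()
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row:
            value = cell.strip()
            # Keep only cells that look like absolute URLs; header cells are skipped.
            if value.startswith(("http://", "https://")) and value not in seen:
                seen.add(value)
                urls.append(value)
    return urls


def load_urls(path):
    """Read the exported .csv file from disk and return its URLs."""
    with open(path, newline="", encoding="utf-8") as handle:
        return extract_urls(handle.read())
```

For example, load_urls("Exported_Data_All_URLs.csv") would return the list of URLs, assuming you kept the file name used in step 10 and the file sits in the folder you run the script from.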
Classify the URLs
Depending on how many URLs your website has, the URL classification process may get a bit time-consuming. However, it is a step you simply cannot skip if you want your website to be properly optimized for the search engines.
Here’s what you need to do to classify your site’s URLs:
1. Open a new sheet in Google Sheets.
2. Go to File → Open → Open your .csv file with all the URLs.
3. Copy the URLs.
4. Go to the “URL Audit Worksheet” we have created for you.
5. Make a copy of it in your own Drive folders.
6. Add a new sheet. Rename it “Unclassified URLs”.
7. Paste the URLs from the .csv file into the “Unclassified URLs” sheet.
8. Take each URL in the “Unclassified URLs” sheet and move it into the appropriate column in the “Classified URLs” sheet by working through the following questions in order.
a. Is the content of this page still relevant to users? Is it OK if visitors still find this content?
For example, Black Friday content from a previous year is not relevant for users who find your page this year.
If the answer to this question is NO, cut and paste to move the link into the “Redirect” column and move to the next link on your list.
If the answer to this question is YES, then move on to the next question.
b. Should the content of this page be found in search engines (Google, Bing, Yahoo, etc.)?
For example, if your site has a “Thank You” page, a “Checkout” page, or a page that’s only visible to members, then those pages probably shouldn’t be displayed in the search engines.
If the answer to this question is NO, cut and paste to move the link into the “Deindex” column and move to the next link on your list.
If the answer to this question is YES, then move on to the next question.
c. Is the content of the page an exact copy of, or very similar to (roughly 50% or more of the text is the same), another page on your site?
For example, if you have two landing page variations for the same acquisition campaign, you should consider them to be very similar.
If the answer to this question is NO, cut and paste to move the link into the “Index” column and move to the next link on your list.
If the answer to this question is YES, then move on to the next question.
d. Is the content of this page the ORIGINAL or CANONICAL content from which other copies were created?
For example, if you made a copy of your homepage for a paid campaign, the homepage is the original and the copy is not.
If the answer to this question is NO, cut and paste to move the link into the “Canonicalize” column.
If the answer to this question is YES, move the link into the “Index” column.
Once you are done with the classification process, press Ctrl+A, then click on the “Wrap Text” option. This lets you see everything in the spreadsheet more clearly before you proceed to the next step.
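If it helps to see the four classification questions as a single decision path, here is a minimal Python sketch of them. The function and parameter names are hypothetical and chosen just for this illustration; the answers themselves still come from you looking at each page.

```python
def classify_url(still_relevant, should_be_searchable, is_duplicate, is_original):
    """Return the worksheet column for a URL, asking the questions in order.

    Each argument is your own True/False answer to one of the four questions.
    Later answers are ignored once an earlier question has decided the column.
    """
    if not still_relevant:
        return "Redirect"
    if not should_be_searchable:
        return "Deindex"
    if not is_duplicate:
        return "Index"
    # The page is a duplicate: only the original stays indexed.
    return "Index" if is_original else "Canonicalize"
```

For instance, a campaign copy of your homepage (relevant, searchable, a duplicate, not the original) gives classify_url(True, True, True, False), which returns "Canonicalize".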
Take action on each of the categories created
At this point, you should have filled the columns in the “Classified URLs” sheet according to the nature of each URL. Next, you should apply a different set of steps for each of the categories, as follows:
The URLs in the “Redirect” column
1. Install and activate the Redirection WordPress plugin (the plugin installation process is similar to that described for the “Export All URLs” plugin).
2. Go to Tools → Redirection.
3. Input the source URL (the one in your “Redirect” column) and the target URL (the one you want users to be redirected to if they stumble upon the outdated URL).
4. Click on “Add Redirect”.
5. Mark the URL as completed in the worksheet.
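Before you mark a redirect as completed, you may want to confirm that it actually works. The Python sketch below is one possible way to check it, not part of the Redirection plugin: it requests the old URL without following the redirect and compares the Location header with the target you entered. The function names are hypothetical, and it assumes your server answers HEAD requests the same way it answers normal page visits.

```python
import http.client
from urllib.parse import urljoin, urlsplit


def fetch_redirect(url, timeout=10):
    """Request url without following redirects; return (status, Location or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        connection = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        connection = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        connection.request("HEAD", path)
        response = connection.getresponse()
        return response.status, response.getheader("Location")
    finally:
        connection.close()


def redirect_matches(status, location, source_url, expected_target):
    """True if the response is a redirect that points at expected_target."""
    if status not in (301, 302, 307, 308) or not location:
        return False
    # Location may be relative, so resolve it against the source URL first.
    resolved = urljoin(source_url, location)
    # Ignore a trailing-slash difference between the two URLs.
    return resolved.rstrip("/") == expected_target.rstrip("/")
```

You would call fetch_redirect with a URL from your “Redirect” column, then pass the returned status and Location to redirect_matches together with the target URL you typed into the plugin.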
The URLs in the “Deindex” column
1. Go to the page or post editor for that URL.
2. Scroll down to the Yoast SEO section.
3. Click on “Advanced.”
4. Under “Meta robots index”, select “noindex.”
5. Publish the changes.
6. Mark the URL as completed in the worksheet.
The URLs in the “Index” column
1. Go to the page or post editor for that URL.
2. Scroll down to the Yoast SEO section.
3. Click on “Advanced.”
4. Under “Meta robots index”, select “index.”
5. Publish the changes.
6. Mark the URL as completed in the worksheet.
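The “index” and “noindex” choices normally show up as a robots meta tag in the page’s HTML, so you can double-check a URL by looking at its source. The Python sketch below illustrates that check on HTML you have already saved or downloaded. The class and function names are hypothetical, and the sketch only looks at the meta tag, so it would miss a noindex rule sent through an HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, follow".
            self.directives.extend(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def is_noindex(html):
    """True if the page's robots meta tag contains "noindex"."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

Pages from your “Deindex” column should make is_noindex return True, while pages from your “Index” column should make it return False.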
The URLs in the “Canonicalize” column
1. Go to the page or post editor.
2. Scroll down to the Yoast SEO section.
3. Click on “Advanced.”
4. Under “Canonical URL”, input the URL of the page that has the original or canonical content.
5. Publish the changes.
6. Mark the URL as completed in the worksheet.
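A canonical URL is normally output as a link tag with rel="canonical" in the page’s HTML. If you want to confirm that a page from your “Canonicalize” column now points to the original, the following Python sketch shows one way to read that tag from HTML you have already saved or downloaded. The class and function names are hypothetical and are only meant as an illustration.

```python
from html.parser import HTMLParser


class CanonicalLinkParser(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel can hold several space-separated values.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def canonical_url(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    parser = CanonicalLinkParser()
    parser.feed(html)
    return parser.canonical
```

The value returned by canonical_url should match the URL you entered under “Canonical URL” for that page.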
That’s it! It wasn’t so hard, was it? Now your website’s all cleaned up and ready to have Google index just the right pages, the right way! :)