Search engines, URLs, and a trailing slash “/”

I wanted to write a general post for the Webmaster Blog about how search engines handle URLs with/without trailing slashes, but it turns out that the major engines differ quite a bit in their display.

 

Conducting quick research using the URL for Webmaster Central, which 301s the non-trailing-slash URL, www.google.com/webmasters, to the trailing-slash URL, www.google.com/webmasters/, here’s what I found today:

  • Good news for PageRank and linking properties! Search engines appear to cluster 301’d trailing-slash URLs to/from no-trailing-slash URLs (evidenced by the same cached version for various URL formats)
  • Google generally adheres to the target of your 301, displaying the target URL (with or without the trailing slash) in SERPs. Some examples from SERPs that show the 301 target with/without the trailing slash:
    www.google.com/products
    www.google.com/webmasters/
  • Yahoo! and Bing may remove the trailing slash from results, even if it’s the target of your 301. For the query [google webmaster central], both Yahoo! and Bing show this URL without the trailing slash:
    www.google.com/webmasters
  • Bing can remove more than just the trailing slash of a URL to fit query terms/keywords — they can remove characters, too. For the query [google webmaster tools], Bing shows:
    www.google.com/webmaster
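
If you want to check your own results for this, a trailing-slash-insensitive comparison is enough to tell “the slash was dropped” apart from “something else changed.” Here’s a minimal Python sketch. The helper names are made up for illustration, and the URLs are the ones listed above.

```python
def strip_trailing_slash(url: str) -> str:
    """Return the URL without a single trailing slash, if it has one."""
    return url[:-1] if url.endswith("/") else url


def same_ignoring_trailing_slash(displayed: str, target: str) -> bool:
    """True when two URLs differ at most by a trailing slash."""
    return strip_trailing_slash(displayed) == strip_trailing_slash(target)


target = "www.google.com/webmasters/"  # the 301 target from this post

# Yahoo! and Bing displayed the target without its slash.
print(same_ignoring_trailing_slash("www.google.com/webmasters", target))  # True
# Bing's query-matched display differs by more than the slash.
print(same_ignoring_trailing_slash("www.google.com/webmaster", target))   # False
```

This only compares display strings. It says nothing about how an engine stores or clusters the URLs internally.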

 

Details and screenshots

 

Google generally follows 301s and displays URLs with or without a trailing slash accordingly (i.e. we stay pretty true to the URL that was successfully crawled). For example, Google search results reflect www.google.com/webmasters/, the target of the 301, as the canonical version.

 

But Yahoo! seems to remove the trailing slash from the results display (i.e. even though www.google.com/webmasters/ is the 301 target, the slash is removed):

 

Interestingly, clicking this result’s cached version shows the trailing-slash URL, www.google.com/webmasters/. So while the display may differ, it’s likely that both URLs, with and without the trailing slash, are clustered as expected.

Bing removes the trailing slash, too.

 

But I think Bing also removes the ‘s’ in ‘webmasters’ when it doesn’t match the query term. Here they display www.google.com/webmaster. I found this feature most interesting.

In Bing, both the URLs www.google.com/webmasters and www.google.com/webmaster show the source URL in the cached version as www.google.com/webmasters/. This is evidence again that while the display formats may differ, the duplicate-content URLs are likely clustered.
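
To make the “same cached version” reasoning concrete, here’s a small Python sketch that groups displayed URLs by the source URL their cached copy reports. The data is just the two Bing observations from this post. The function name `cluster_by_cached_url` is hypothetical, and this is only how I’d tabulate what I saw, not how any engine does its clustering.

```python
from collections import defaultdict

# (displayed URL, source URL shown in the cached copy), as observed in this post.
observations = [
    ("www.google.com/webmasters", "www.google.com/webmasters/"),
    ("www.google.com/webmaster", "www.google.com/webmasters/"),
]


def cluster_by_cached_url(pairs):
    """Group displayed URLs under the source URL their cached copy reports."""
    clusters = defaultdict(list)
    for displayed, cached in pairs:
        clusters[cached].append(displayed)
    return dict(clusters)


clusters = cluster_by_cached_url(observations)
for cached, displayed in clusters.items():
    print(cached, "<-", displayed)
```

If every display variant lands under one cached URL, as it does here, that is consistent with the variants being treated as a single document.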

Bing’s swapping/adjusting of display URLs to match queries is a pretty neat idea with potentially large implications. And I’m sure Bing prevents keywords in URLs from becoming spammy in these 301 cases. For example, perhaps their results display only allows stemming of the canonical URL from plural to singular nouns (webmasters -> webmaster), not complete variations of keywords.
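
To illustrate the kind of guard I’m guessing at, here’s a Python sketch that accepts a display URL only if it differs from the canonical URL by a trailing slash and/or plural-to-singular stemming of a path segment. This is purely my speculation written as code, not Bing’s actual logic. The function names and the naive “strip one trailing s” stemmer are assumptions, and it expects scheme-less URLs as written in this post.

```python
def singular_forms(word: str) -> set:
    """Naive stemming (an assumption): the word itself, plus the word
    with one trailing 's' removed if it has one."""
    forms = {word}
    if len(word) > 1 and word.endswith("s"):
        forms.add(word[:-1])
    return forms


def is_allowed_display(canonical: str, candidate: str) -> bool:
    """True if `candidate` matches `canonical` up to a trailing slash
    and plural-to-singular stemming of path segments.

    Assumes scheme-less URLs such as "www.google.com/webmasters/".
    """
    canon_parts = canonical.rstrip("/").split("/")
    cand_parts = candidate.rstrip("/").split("/")
    if len(canon_parts) != len(cand_parts):
        return False
    # The host (first part) must match exactly; only path segments may be stemmed.
    if canon_parts[0] != cand_parts[0]:
        return False
    return all(
        cand in singular_forms(canon)
        for canon, cand in zip(canon_parts[1:], cand_parts[1:])
    )


canonical = "www.google.com/webmasters/"
print(is_allowed_display(canonical, "www.google.com/webmaster"))              # True
print(is_allowed_display(canonical, "www.google.com/cheap-webmaster-tools"))  # False
```

Note that this only checks whether a variant would be permitted. It doesn’t try to predict when an engine would choose the singular form.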

 

I don’t research search engine behavior outside of Google as much as I should, sorry about that. If you have more findings on trailing slashes and URLs, please share. Would be cool to learn more.

 

Update on May 18, 2010: A few weeks after this post, I published an official Webmaster Central article about how Google handles URLs and the trailing slash.

11 Replies to “Search engines, URLs, and a trailing slash “/””

  1. Given the number of very old urls that Yahoo insists on leaving in its index I’m not convinced they even follow 301s properly.

    I also noticed recently that they were displaying urls with inserted spaces after the domain part – very odd.

    Maile Ohye Reply:

    Thanks, Bill, that’s interesting about the inserted spaces… Do you have a screenshot or know of a URL that triggers the result? I’d love to see it.

    Bill Marshall Reply:

    Hi Maile
    They’ve stopped doing it now. I first saw it around the 12th March and it was still happening on the 24th but had stopped by the 29th.

    The thought crossed my mind that it might be intended to disrupt automated ranking reports but they do tend to change their coding quite a bit so it may have simply been a mistake.

    Something I noticed on Google today that you might be interested in was that one of my sites was having its robots.txt file listed in a site: command result. Not normal and not happening on my other sites.

  2. Maile you are beautiful. Please come to London and marry me.

    Maile Ohye Reply:

    My very first marriage proposal! And over the interwebs! I have to tell Amy and Gerry. (Germain, please start calling them “Mum” and “Dad.”)

    Lol, thanks for the teasing. Your comment made my night. 🙂

  3. Maile I think I’m in love, can I continue your courtship on twitter? @germainokoro

    (Meant as a question not a comment, but I don’t mind if you publish… in my eyes you can do no wrong)

  4. Yahoo! and Bing both break the web with their trimming of trailing forward slashes when they should be present. They give no consideration to extensionless environments where /products is a document and /products/ is the root of a sub-directory.

    There are many articles written on this behavior. Sebastian has a rant about it too…

    Why storing URLs with truncated trailing slashes is an utter idiocy
    http://sebastians-pamphlets.com/thou-must-not-steal-the-trailing-slash-from-my-urls/

    Sebastian refers to it as URL Circumcision. Ouch! 🙂

  5. Hover over the Bing results to see the status bar text, and it will be the correct URL. E.g., “www.google.com/webmaster” links to “http://www.google.com/webmasters/” without redirecting.

    They only display the wrong URL. It has been going on since at least 2007.

    Josh Reply:

    (Sorry, formatting mistake above.)

    I was going to say that “/” has no meaning for Microsoft… where in Linux (most Web servers) a slash is a directory. Maybe the Microsoft programmers don’t know, or just don’t care.
