Search engines, URLs, and a trailing slash “/”

I wanted to write a general post for the Webmaster Blog about how search engines handle URLs with and without trailing slashes, but it turns out that the major engines differ quite a bit in what they display.

 

I did some quick research using the URL for Webmaster Central, which 301s the non-trailing-slash URL, www.google.com/webmasters, to the trailing-slash URL, www.google.com/webmasters/. Here’s what I found today:

  • Good news for PageRank and linking properties! Search engines appear to cluster a trailing-slash URL with its no-trailing-slash counterpart when one 301s to the other (as evidenced by the same cached version appearing for the various URL formats)
  • Google generally adheres to the target of your 301, displaying the target URL (with or without the trailing slash) in SERPs. Some examples from SERPs that show the 301 target with/without the trailing slash:
    www.google.com/products
    www.google.com/webmasters/
  • Yahoo! and Bing may remove the trailing slash from results, even if it’s the target of your 301. For the query [google webmaster central], both Yahoo! and Bing show this URL without the trailing slash:
    www.google.com/webmasters
  • Bing can remove more than just the trailing slash of a URL to fit query terms/keywords; it can remove characters, too. For the query [google webmaster tools], Bing shows:
    www.google.com/webmaster
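
If you want to group your own URL lists the same way, here is a minimal Python sketch. The function names `slash_insensitive_key` and `cluster` are made up for illustration, and this is not how any search engine actually clusters URLs. It also assumes the slash and no-slash versions really are connected by a 301, as in the Webmaster Central example; without that redirect they could be different pages.

```python
from urllib.parse import urlsplit, urlunsplit


def slash_insensitive_key(url: str) -> str:
    """Return a key that is the same with or without a trailing slash.

    Assumption: the two variants are joined by a 301, so treating them
    as one page is safe. This is an illustration, not a search engine rule.
    """
    # Bare URLs like "www.google.com/webmasters" have no scheme; add one
    # so that urlsplit puts the host in netloc instead of path.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    # Drop trailing slashes, but keep "/" for the site root.
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, parts.query, ""))


def cluster(urls):
    """Group URLs that differ only by a trailing slash."""
    groups = {}
    for url in urls:
        groups.setdefault(slash_insensitive_key(url), []).append(url)
    return groups


groups = cluster([
    "www.google.com/webmasters",
    "www.google.com/webmasters/",
    "www.google.com/products",
])
for key, members in groups.items():
    print(key, members)
```

Here the two webmasters URLs land in one group and www.google.com/products in another.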

 

Details and screenshots

 

Google generally follows 301s and displays URLs with or without a trailing slash accordingly (i.e. we stay pretty true to the URL that was successfully crawled). For example, Google search results reflect www.google.com/webmasters/, the target of the 301, as the canonical version.

 

But Yahoo! seems to remove the trailing slash from the results display (i.e. even though www.google.com/webmasters/ is the 301 target, the slash is removed):

 

Interestingly, clicking this result’s cached version shows the trailing-slash URL, www.google.com/webmasters/. So while the display may differ, it’s likely both URLs, trailing-slash and no-trailing-slash, are clustered as expected:

Bing removes the trailing slash, too.

 

But I think Bing also removes the ‘s’ in ‘webmasters’ when it doesn’t match the query term. Here they display www.google.com/webmaster. I found this feature most interesting.

In Bing, both www.google.com/webmasters and www.google.com/webmaster show the source URL in the cached version as www.google.com/webmasters/. This is further evidence that while the display formats may differ, the duplicate-content URLs are likely clustered.

Bing’s swapping/adjusting of display URLs to match queries is a pretty neat idea with potentially large implications. And I’m sure Bing prevents keywords in URLs from becoming spammy in these 301 cases. For example, perhaps their results display only allows stemming of the canonical URL from plural to singular nouns (webmasters -> webmaster), not complete variations of keywords.

 

I don’t research search engine behavior outside of Google as much as I should, sorry about that. If you have more findings on trailing slashes and URLs, please share. Would be cool to learn more.

 

Update on May 18, 2010: A few weeks after this post, I published an official Webmaster Central article about how Google handles URLs and the trailing slash.

What’s the optimal server response time?

Fairly valid answers:

 

  1. Faster than your competitors
  2. Under 2 seconds

 

I’d go with #2 as I’m a believer in having metrics for myself independent of others’ performance. It just seems conducive to higher overall happiness.

 

My coworker, Sreeram Ramachandran, who developed Site Performance in Webmaster Tools, forwarded me an article by Akamai about response times for eCommerce sites.

 

At Google, we definitely aim for sub-two.

 

You can check your site’s response time from locations throughout the world at WebPagetest.org. For example, a user in Virginia with DSL needs less than a second to run the query [page speed] on Google.com.

 

[Image: response time for a Google query; the original image linked to this result on WebPagetest.org]
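
For a rough check from your own machine, here is a small Python sketch that times a request and compares it to the two-second target. The names `timed`, `within_budget`, and `BUDGET_SECONDS` are my own for illustration. Keep in mind that this measures a single fetch from wherever the script runs, so it is not the same thing as the full page load that WebPagetest.org reports from its test locations.

```python
import time
from typing import Callable

# The "under 2 seconds" target from this post.
BUDGET_SECONDS = 2.0


def timed(fetch: Callable[[], object]) -> float:
    """Run fetch() once and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    fetch()
    return time.perf_counter() - start


def within_budget(elapsed: float, budget: float = BUDGET_SECONDS) -> bool:
    """True if the measured time is under the budget."""
    return elapsed < budget


# Example with a stand-in for a real request (sleeps for 50 ms).
elapsed = timed(lambda: time.sleep(0.05))
print(f"{elapsed:.3f}s, within budget: {within_budget(elapsed)}")
```

To time a real page, pass something like `lambda: urllib.request.urlopen(url).read()` (after `import urllib.request`), where `url` is whatever page you want to test.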

Title and name attributes in HTML anchors

How does Google currently process title and name attributes in HTML anchors?

 

<a title="sweet link!" name="nice name!" href="page.html">foo</a>

 

title = not processed by Google (please keep in mind that it could be useful for other engines or applications)

 

name = not processed for ranking/content relevance, but can be utilized for understanding page structure (such as with JavaScript functions)

 

Thanks to Joachim Kupke (super nice guy) for checking the code to provide clarification.
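
If you are curious which anchors on your own pages carry title or name attributes, here is a minimal Python sketch using the standard library’s `html.parser`. The class name `AnchorAttributes` and the helper `anchor_attributes` are made up for this example, and it says nothing about how Google parses pages; it only lists what is in your markup.

```python
from html.parser import HTMLParser


class AnchorAttributes(HTMLParser):
    """Collect the attributes of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        if tag == "a":
            self.anchors.append(dict(attrs))


def anchor_attributes(html: str):
    """Return one dict of attributes per <a> tag found in html."""
    parser = AnchorAttributes()
    parser.feed(html)
    parser.close()
    return parser.anchors


sample = '<a title="sweet link!" name="nice name!" href="page.html">foo</a>'
for attrs in anchor_attributes(sample):
    print(attrs.get("href"), "| title:", attrs.get("title"), "| name:", attrs.get("name"))
```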