Google Webmaster Central Blog - Official news on crawling and indexing sites for the Google index

The Impact of User Feedback, Part 2 (and more Popular Picks!)

Tuesday, August 26, 2008 at 4:41 PM

As a follow-up to my recent post about how user reports of webspam and paid links help improve Google's search results for millions of users, I wanted to highlight one of the most essential parts of Google Webmaster Central: our Webmaster Help Group. With over 37,000 members in our English group and support in 15 other languages, the group is the place to get your questions answered regarding crawling and indexing or Webmaster Tools. We're thankful for a fabulous group of Bionic Posters who have dedicated their time and energy to making the Webmaster Help Group a great place to be. When appropriate, Googlers, including myself, jump in to clarify issues or participate in the dialogue. One thing to note: we try hard to read most posts in the group, and although we may not respond to each one, your feedback and concerns help drive the features we work on. Here are a few examples:

Sitemap details
Submitting a Sitemap through Webmaster Tools is one way to let Google know about what pages exist on your site. Users were quick to note that even though they submitted a Sitemap of all the pages on their site, they only found a sampling of URLs indexed through a site: search. In response, the Webmaster Tools team created a Sitemaps details page to better tell you how your Sitemap was processed. You can read a refresher about the Sitemaps details page in Jonathan's blog post.

Contextual help
One request we received early on with Webmaster Tools was for better documentation on the data displayed. We saw several questions about meta description and title tag issues using our Content Analysis tool, which led us to beef up our documentation on that page and link to that Help Center article directly from that page. Similarly, we discovered that users needed clarification on the distinction between "top search queries" and "top clicked queries" and how the data can be used. We added an expandable section entitled "How do I use this data?" and placed contextual help information across Webmaster Tools to explain what each feature is and where to get more information about it.

Blog posts
The Webmaster Help Group is also a way for us to keep a pulse on what overarching questions are on the minds of webmasters so we can address some of those concerns through this blog. Whether it's how to submit a reconsideration request using Webmaster Tools, deal with duplicate content, move a site, or design for accessibility, we're always open to hearing more about your concerns in the Group. Which reminds me...

It's time for more Popular Picks!
Last year, we devoted two weeks to soliciting and answering five of your most pressing webmaster-related questions. These Popular Picks covered the following topics:
Seeing as this was a well-received initiative, I'm happy to announce that we're going to do it again. Head on over to this thread to ask your webmaster-related questions. See you there!

silver_medal_count++

Friday, August 22, 2008 at 4:23 PM

Since both tennis and table tennis are in the Olympics, perhaps you're wondering: if there's soccer, why not "table soccer?" Of course, we know table soccer by another name; and while foosball may not be an Olympic sport, we still cheered Nathan Johns and Jan Backes—two members of our Search Quality team—as they brought home the foosball silver medal at the search engine foosball smackdown at SES San Jose.

"Smackdown" doesn't quite equate to "Olympics," but check out the intensity—you could hear a pin drop!

silver medalists at foosball

The gold medal (cup) went to the search engine down the road. :)

gold medalists at foosball
Yahoo's first place winners Daniel Wong and Jake Rosenberg.

Just to be sure they weren't ringers, I quizzed Daniel and Jake, "How can you prevent a file from being crawled?" They correctly answered, "robots.txt."

Gold cup well deserved.

Hey Google, I no longer have badware

Thursday, August 21, 2008 at 5:10 PM

This post is for anyone who has been emailed or notified by Google about badware, received a badware warning when browsing their own site using Firefox, or has come across malware-labeled search results for their own site(s).  As you know, these warnings are produced by our automated scanning systems, which we've put in place to ensure the quality of our results by protecting our users.  Whatever the case, if you are dealing with badware, here are a few recommendations that can help you out. 





1.  If you have badware, it usually means that your web server, your website, or a database used by your website has been compromised. We have a nifty post on how to handle being hacked.  Be very careful when inspecting for malware on your site so as to avoid exposing your computer to infection.

2. Once everything is clear and dandy, you can follow the steps in our post about malware reviews via Webmaster Tools. Please note that the screenshot in the previous post is outdated; the new malware review form is on the Overview page and looks like this:



  • Other programs, such as Firefox, also use our badware data and may not recognize the change immediately due to their caching of the data.  So even if the badware label in search is removed, it may take some time for that to be visible in such programs.

3. Lastly, if you believe that your rankings were somehow affected by the malware, such as compromised content that violated our Webmaster Guidelines (e.g., hacked pages with hidden pharmacy text links), you should fill out a reconsideration request. To clarify, reconsideration requests are usually for issues stemming from violations of our Webmaster Guidelines and are separate from malware review requests.

If you have additional questions, please review our documentation or post to the discussion group with the URL of your site. We hope you find this updated feature in Webmaster Tools useful in discovering and fixing any malware-related problems. 

Written by Evan Tang, Search Quality Team

Make your 404 pages more useful

Tuesday, August 19, 2008 at 10:13 AM

Your visitors may stumble into a 404 "Not found" page on your website for a variety of reasons:
  • A mistyped URL, or a copy-and-paste mistake
  • Broken or truncated links on web pages or in an email message
  • Moved or deleted content
Confronted by a 404 page, they may then attempt to manually correct the URL, click the back button, or even navigate away from your site. As hinted in an earlier post for "404 week at Webmaster Central", there are various ways to help your visitors get out of the dead-end situation. In our quest to make 404 pages more useful, we've just added a section in Webmaster Tools called "Enhance 404 pages". If you've created a custom 404 page, this section lets you embed a widget in it that helps your visitors find what they're looking for by providing suggestions based on the incorrect URL.


Example: Jamie receives the link www.example.com/activities/adventurecruise.html in an email message. A bad email client mangles the formatting and truncates the URL to www.example.com/activities/adventur, which returns a 404 page. With the 404 widget added, however, she could instead see the following:



In addition to attempting to correct the URL, the 404 widget also suggests the following, if available:
  • a link to the parent subdirectory
  • a sitemap webpage
  • site search query suggestions and search box
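
To make the idea concrete, here is a minimal sketch of how suggestions like these could be derived from a broken URL. It is not the widget's actual logic, which we haven't described here; the function name suggest_for_404, the known_paths list (including the hiking and about pages), and the simple prefix-matching rule are all assumptions made for illustration.

```python
import posixpath


def suggest_for_404(broken_path, known_paths):
    """Return (likely_matches, parent_directory) for a path that returned 404.

    Hypothetical helper: prefix matching stands in for whatever
    matching a real suggestion service would use.
    """
    # A truncated URL is usually a prefix of the page the visitor wanted.
    matches = [p for p in known_paths if p.startswith(broken_path)]
    # The parent subdirectory is a reasonable fallback link.
    parent = posixpath.dirname(broken_path.rstrip("/")) or "/"
    if not parent.endswith("/"):
        parent += "/"
    return matches, parent


# Hypothetical list of pages that exist on the site.
known_paths = [
    "/activities/adventurecruise.html",
    "/activities/hiking.html",
    "/about.html",
]
matches, parent = suggest_for_404("/activities/adventur", known_paths)
print(matches)  # ['/activities/adventurecruise.html']
print(parent)   # /activities/
```

Prefix matching only handles truncation, as in Jamie's case; catching misspellings would need a fuzzier comparison.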

How do you add the widget? Visit the "Enhance 404 pages" section in Webmaster Tools, which allows you to generate a JavaScript snippet. You can then copy and paste this into your custom 404 page's code. As always, don't forget to return a proper 404 code.

Can you change the way it looks? Sure. We leave the HTML unstyled initially, but you can edit the CSS block that we've included. For more information, check out our guide on how to customize the look of your 404 widget.

This feature is currently experimental -- we might not provide corrections and suggestions for your site but we'll be working to improve the coverage. In the meantime, let us know what you think in the comments below or in our group discussion. Thanks for helping us make the Internet a more friendly place!

Written by Sahala Swenson, Webmaster Tools team

More on 404

Friday, August 15, 2008 at 2:52 PM

Now that we've bid farewell to soft 404s, in this post for 404 week we'll answer your burning 404 questions.

How do you treat the response code 410 "Gone"?
Just like a 404.

Do you index content or follow links from a page with a 404 response code?
We aim to understand as much as possible about your site and its content. So while we wouldn't want to show a hard 404 to users in search results, we may utilize a 404's content or links if it's detected as a signal to help us better understand your site.

Keep in mind that if you want links crawled or content indexed, it's far more beneficial to include them in a non-404 page.

What about 404s with a 10-second meta refresh?
Yahoo! currently utilizes this method on their 404s. They respond with a 404, but the 404 content also shows:

<meta http-equiv="refresh" content="10;url=http://www.yahoo.com/?xxx">

We feel this technique is fine because it reduces confusion by giving users 10 seconds to make a new selection, only offering the homepage after 10 seconds without the user's input.
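
As a rough sketch of this pattern (not Yahoo!'s implementation), the response below keeps the 404 status and puts the delayed refresh only in the page content. The function name not_found_with_refresh, the wording of the message, and the example.com URL in the usage line are assumptions for illustration.

```python
from html import escape


def not_found_with_refresh(home_url, delay_seconds=10):
    """Build a (status, headers, body) triple for a 404 with a delayed refresh."""
    # Escape the URL so it is safe inside the HTML attribute.
    safe_url = escape(home_url, quote=True)
    # The status stays 404; only the page content offers the delayed redirect.
    body = (
        "<html><head>"
        f'<meta http-equiv="refresh" content="{delay_seconds};url={safe_url}">'
        "<title>Not found</title></head>"
        "<body><p>Sorry, that page was not found. "
        f"You will be taken to the homepage in {delay_seconds} seconds.</p>"
        "</body></html>"
    )
    headers = [("Content-Type", "text/html; charset=utf-8")]
    return "404 Not Found", headers, body


status, headers, body = not_found_with_refresh("http://www.example.com/")
print(status)  # 404 Not Found
```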

Should I 301-redirect misspelled 404s to the correct URL?
Redirecting/301-ing 404s is a good idea when it's helpful to users (i.e. not confusing like soft 404s). For instance, if you notice that the Crawl Errors section of Webmaster Tools shows a 404 for a misspelled version of your URL, feel free to 301 the misspelled version of the URL to the correct version.

For example, if we saw this 404 in Crawl Errors:
http://www.google.com/webmsters  <-- typo for "webmasters"

we may first correct the typo if it exists on our own site, then 301 the URL to the correct version (as the broken link may occur elsewhere on the web):
http://www.google.com/webmasters
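
One way to express that redirect in code is sketched below. The TYPO_REDIRECTS table and the respond_to_missing_path function are hypothetical names; on a real site the same rule would more likely live in your webserver's configuration.

```python
# Hypothetical map from misspelled paths seen in Crawl Errors to the correct path.
TYPO_REDIRECTS = {
    "/webmsters": "/webmasters",
}


def respond_to_missing_path(path, typo_redirects=TYPO_REDIRECTS):
    """Return (status, headers) for a path that has no page of its own."""
    target = typo_redirects.get(path)
    if target is not None:
        # Known misspelling: send a permanent redirect to the real page.
        return "301 Moved Permanently", [("Location", target)]
    # Anything else is still a genuine "Not Found".
    return "404 Not Found", []


print(respond_to_missing_path("/webmsters"))
# ('301 Moved Permanently', [('Location', '/webmasters')])
print(respond_to_missing_path("/no-such-page"))
# ('404 Not Found', [])
```

Only paths listed in the table are redirected; every other missing URL keeps its 404, which avoids recreating the soft-404 problem.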

Have you guys seen any good 404s?
Yes, we have! (Confession: no one asked us this question, but few things are as fun to discuss as response codes. :) We've put together a list of some of our favorite 404 pages. If you have more 404-related questions, let us know, and thanks for joining us for 404 week!
http://www.metrokitchen.com/nice-404-page
"If you're looking for an item that's no longer stocked (as I was), this makes it really easy to find an alternative."
-Riona, domestigeek

http://www.comedycentral.com/another-404
"Blame the robot monkeys"
-Reid, tells really bad jokes

http://www.splicemusic.com/and-another
"Boost your 'Time on site' metrics with a 404 page like this."
-Susan, dabbler in music and Analytics

http://www.treachery.net/wow-more-404s
"It's not reassuring, but it's definitive."
-Jonathan, has trained actual spiders to build websites, ants handle the 404s

http://www.apple.com/iPhone4g
"Good with respect to usability."
http://thcnet.net/lost-in-a-forest
"At least there's a mailbox."
-JohnMu, adventurous

http://lookitsme.co.uk/404
"It's pretty cute. :)"
-Jessica, likes cute things

http://www.orangecoat.com/a-404-page.html
"Flow charts rule."
-Sahala, internet traveller

http://icanhascheezburger.com/iz-404-page
"I can has useful links and even e-mail address for questions! But they could have added 'OH NOES! IZ MISSING PAGE! MAYBE TIPO OR BROKN LINKZ?' so folks'd know what's up."
-Adam, lindy hop geek

Farewell to soft 404s

Tuesday, August 12, 2008 at 2:54 PM

We see two kinds of 404 ("File not found") responses on the web: "hard 404s" and "soft 404s." We discourage the use of so-called "soft 404s" because they can be a confusing experience for users and search engines. Instead of returning a 404 response code for a non-existent URL, websites that serve "soft 404s" return a 200 response code. The content of the 200 response is often the homepage of the site, or an error page.

How does a soft 404 look to the user? Here's a mockup of a soft 404: This site returns a 200 response code and the site's homepage for URLs that don't exist.



As exemplified above, soft 404s are confusing for users, and furthermore search engines may spend much of their time crawling and indexing non-existent, often duplicative URLs on your site. This can negatively impact your site's crawl coverage—because of the time Googlebot spends on non-existent pages, your unique URLs may not be discovered as quickly or visited as frequently.
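
If you're not sure whether your own site serves soft 404s, one simple check is to request a URL that almost certainly doesn't exist and look at the response code. The sketch below is one way to do that, not a description of how Googlebot detects soft 404s; the function names fetch_status and serves_soft_404s are assumptions for illustration.

```python
import urllib.error
import urllib.request
import uuid


def fetch_status(url):
    """Return the HTTP status code for url (redirects are followed)."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx responses; the code is on the exception.
        return error.code


def serves_soft_404s(base_url, get_status=fetch_status):
    """Probe a URL that should not exist and report whether the site says 200."""
    # A random path is very unlikely to match a real page.
    probe = base_url.rstrip("/") + "/" + uuid.uuid4().hex
    return get_status(probe) == 200
```

Calling serves_soft_404s with your site's base URL makes one real request. Because redirects are followed, a site that redirects missing URLs to its homepage is also reported as serving soft 404s.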

What should you do instead of returning a soft 404?
It's much better to return a 404 response code and clearly explain to users that the file wasn't found. This makes search engines and many users happy.

Return 404 response code



Return clear message to users



Can your webserver return 404, but send a helpful "Not found" message to the user?
Of course! More info as "404 week" continues!
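
As a small preview, here is a minimal sketch of a webserver application that does both: it sends a 404 response code and a friendly page in the same response. It is written against Python's WSGI interface; the PAGES table, the wording of the error page, and the name app are assumptions for illustration.

```python
NOT_FOUND_HTML = (
    "<html><head><title>Page not found</title></head><body>"
    "<h1>Sorry, we couldn't find that page</h1>"
    '<p>Try the <a href="/">homepage</a> or check the URL for typos.</p>'
    "</body></html>"
)

# Hypothetical site content, keyed by path.
PAGES = {"/": "<html><body><h1>Welcome</h1></body></html>"}


def app(environ, start_response):
    """WSGI app: 200 for known pages, a real 404 plus friendly HTML otherwise."""
    page = PAGES.get(environ.get("PATH_INFO", "/"))
    if page is not None:
        status, html = "200 OK", page
    else:
        # The friendly page is for people; the 404 status is for crawlers.
        status, html = "404 Not Found", NOT_FOUND_HTML
    body = html.encode("utf-8")
    start_response(status, [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

To try it locally, you could pass app to wsgiref.simple_server.make_server from the standard library and request a path that doesn't exist.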

It's 404 week at Webmaster Central

Monday, August 11, 2008 at 1:40 PM

This week we're publishing several blog posts dedicated to helping you with one response code: 404.

A response code is a numeric status (like 200 for "OK" or 301 for "Moved Permanently") that a webserver returns in response to a request for a URL. The 404 response code should be returned when the requested file is "Not Found".

When a user sends a request for your webpage, your webserver looks for the corresponding file for the URL. If a file exists, your webserver likely responds with a 200 response code along with a message (often the content of the page, such as the HTML).

200 response code flow chart


So what's a 404? Let's say that in the link to "Visit Google Apps" above, the link is broken because of a typing error when coding the page. Now when a user clicks "Visit Google Apps", the particular webpage/file isn't located by the webserver. The webserver should return a 404 response code, meaning "Not Found".

404 response code flow chart
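
The two flows can be summed up in a few lines of code. This is only a sketch of the decision a webserver makes, with a hypothetical FILES table standing in for the files on disk and a made-up /apps.html path for the "Visit Google Apps" page.

```python
from http import HTTPStatus

# Hypothetical files the webserver can find, keyed by URL path.
FILES = {"/apps.html": "<html><body>Google Apps</body></html>"}


def respond(path, files=FILES):
    """Return (code, reason phrase, content) the way a basic webserver would."""
    if path in files:
        # The file exists: answer 200 along with the page content.
        status = HTTPStatus.OK
        content = files[path]
    else:
        # No file for this URL: answer 404.
        status = HTTPStatus.NOT_FOUND
        content = ""
    return status.value, status.phrase, content


print(respond("/apps.html")[:2])  # (200, 'OK')
print(respond("/aps.html")[:2])   # (404, 'Not Found')
```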


Now that we're all on board with the basics of 404s, stay tuned 4 even more information on making 404s good 4 users and 4 search engines.