Is Yahoo Slurp misbehaving?
Complaints
Webmasters complain that Yahoo Slurp does not honor the filter given in robots.txt for the following rules:
User-agent: msnbot
Disallow: /bloop/
Disallow: /blop/

User-agent: googlebot
Disallow: /bloop/
Disallow: /blop/

User-agent: Slurp
Disallow: /bloop/
Disallow: /blop/

User-agent: *
Disallow: /shop/
Disallow: /forum/
Disallow: /cgi-bin/
Yahoo Slurp obeys its agent-specific rules and therefore does not crawl the /bloop/ and /blop/ directories, whereas it does crawl the directories listed under the generic rule (/shop/, /forum/ and /cgi-bin/).
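The behavior described above can be reproduced offline. The following is a minimal sketch that feeds the same rules to Python's standard urllib.robotparser module. It only shows how that one parser reads the file; real crawlers may match user-agent names differently, and SomeOtherBot is a made-up name standing in for any bot without its own group.

```python
from urllib.robotparser import RobotFileParser

# The rules from the complaint, unchanged.
ORIGINAL_RULES = """\
User-agent: msnbot
Disallow: /bloop/
Disallow: /blop/

User-agent: googlebot
Disallow: /bloop/
Disallow: /blop/

User-agent: Slurp
Disallow: /bloop/
Disallow: /blop/

User-agent: *
Disallow: /shop/
Disallow: /forum/
Disallow: /cgi-bin/
"""


def build_parser(text):
    """Parse robots.txt text without fetching anything from the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ORIGINAL_RULES)

# "SomeOtherBot" is a hypothetical agent that has no group of its own.
for agent in ("Slurp", "SomeOtherBot"):
    for path in ("/bloop/", "/shop/"):
        verdict = "allowed" if parser.can_fetch(agent, path) else "blocked"
        print(agent, path, verdict)
```

Under this parser, Slurp is blocked from /bloop/ but allowed into /shop/, while the bot without its own group gets the opposite result. That is the pattern the webmasters observed.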
Discussions and Suggestions
Webmasters found that this is not peculiar to Yahoo: other search engines behave in the same manner.
A search engine bot that finds a group naming its own user agent does not also obey the generic (*) rules.
Suggestion 1: Put the wildcard group with the generic rules first, followed by the agent-specific groups.
Suggestion 2: Keep the underlying principle in mind: each major bot goes to the group for its own user agent, and all other bots carry on with the wildcard group.
Conclusion
All of the bots do honor the filter as written. What trips them up is the configuration on the server, not a fault in the bots.
The fact is that each major bot goes to the group for its own user agent, and the rest continue with the generic rule.
The correct rules for the above case would be:
User-agent: Slurp
Disallow: /bloop/
Disallow: /blop/
Disallow: /shop/
Disallow: /forum/
Disallow: /cgi-bin/
Disallow: /badbottrap/

User-agent: msnbot
Disallow: /bloop/
Disallow: /blop/
Disallow: /shop/
Disallow: /forum/
Disallow: /cgi-bin/
Disallow: /badbottrap/

User-agent: googlebot
Disallow: /bloop/
Disallow: /blop/
Disallow: /shop/
Disallow: /forum/
Disallow: /cgi-bin/
Disallow: /badbottrap/

User-agent: *
Disallow: /shop/
Disallow: /forum/
Disallow: /cgi-bin/
Disallow: /badbottrap/
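Repeating the generic rules by hand in every named group is easy to get wrong. As an illustration only, here is a small sketch that builds the expanded file from two inputs. The function name expand_rules and the variable names are hypothetical and not part of any tool.

```python
def expand_rules(specific, generic):
    """Return robots.txt text in which every named agent also gets the generic rules."""
    groups = []
    for agent, paths in specific.items():
        lines = [f"User-agent: {agent}"]
        # dict.fromkeys drops duplicate paths while keeping their order.
        for path in dict.fromkeys(paths + generic):
            lines.append(f"Disallow: {path}")
        groups.append("\n".join(lines))

    # The wildcard group carries only the generic rules.
    wildcard = ["User-agent: *"] + [f"Disallow: {path}" for path in generic]
    groups.append("\n".join(wildcard))

    # Groups are separated by a blank line.
    return "\n\n".join(groups) + "\n"


generic = ["/shop/", "/forum/", "/cgi-bin/", "/badbottrap/"]
specific = {name: ["/bloop/", "/blop/"] for name in ("Slurp", "msnbot", "googlebot")}

robots_txt = expand_rules(specific, generic)
print(robots_txt)
```

Keeping the generic list in one place means a new generic path only has to be added once, and every named group picks it up.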
This can be shortened to:
User-agent: Slurp
User-agent: msnbot
User-agent: googlebot
Disallow: /bloop/
Disallow: /blop/
Disallow: /shop/
Disallow: /forum/
Disallow: /cgi-bin/
Disallow: /badbottrap/

User-agent: *
Disallow: /shop/
Disallow: /forum/
Disallow: /cgi-bin/
Disallow: /badbottrap/
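Before deploying the shortened file, it may be worth checking that it still blocks what it should. The sketch below runs it through Python's standard urllib.robotparser, which accepts several User-agent lines in one group. This is a check against one parser only, not a guarantee about how any particular crawler reads the file, and SomeOtherBot is again a made-up agent name.

```python
from urllib.robotparser import RobotFileParser

# The shortened rules, with one group shared by the three named bots.
SHORTENED_RULES = """\
User-agent: Slurp
User-agent: msnbot
User-agent: googlebot
Disallow: /bloop/
Disallow: /blop/
Disallow: /shop/
Disallow: /forum/
Disallow: /cgi-bin/
Disallow: /badbottrap/

User-agent: *
Disallow: /shop/
Disallow: /forum/
Disallow: /cgi-bin/
Disallow: /badbottrap/
"""

GENERIC_PATHS = ["/shop/", "/forum/", "/cgi-bin/", "/badbottrap/"]
SPECIFIC_PATHS = ["/bloop/", "/blop/"]

parser = RobotFileParser()
parser.parse(SHORTENED_RULES.splitlines())

# Collect every (agent, path) pair that a named bot may still fetch.
leaks = [
    (agent, path)
    for agent in ("Slurp", "msnbot", "googlebot")
    for path in SPECIFIC_PATHS + GENERIC_PATHS
    if parser.can_fetch(agent, path)
]
print("paths still open to the named bots:", leaks)

# A bot without its own group should see only the generic rules.
other_blocked = [p for p in GENERIC_PATHS if not parser.can_fetch("SomeOtherBot", p)]
print("paths blocked for SomeOtherBot:", other_blocked)
```

An empty first list means the named bots are now kept out of both the specific and the generic directories under this parser.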