While search engines are relatively good at finding their own way around a site, occasionally they need further guidance. If there are any pages or areas of your site that you would prefer weren't added to a search index, you should place a robots.txt file in the root of your site specifying them. This mechanism should be honored by all search engines.
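As a quick illustration, a minimal robots.txt that keeps crawlers out of two areas of a site might look like the following (the /admin/ and /search-results/ paths are placeholders for whatever sections you want excluded, not a recommendation for any particular site):

User-agent: *
Disallow: /admin/
Disallow: /search-results/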
A more recent option is the Sitemap Protocol (see www.sitemaps.org), an XML format that is now supported and encouraged by most major search engines. Using this protocol, you can tell search engines which pages of your site you would like to have indexed, how important those pages are relative to one another, and how often you expect them to change. If your site is not currently capable of producing these XML sitemaps, it may be worth adding that capability in a future redevelopment where practical.
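To give a sense of the format, here is a minimal sitemap with a single entry, using a placeholder URL and date; the changefreq and priority elements are where you express how often a page is expected to change and how important it is relative to the rest of the site:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/products/</loc>
    <lastmod>2007-06-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>

The completed file is typically placed at the root of the site and either submitted directly to the search engines or referenced from your robots.txt file.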
Of course, classic page-based sitemaps and index pages on your site also benefit search engines, but be aware that most search engines only read the first 100 links they encounter in an HTML page, so a large sitemap or index should be broken up into multiple pages to get maximum benefit. It may seem counterintuitive to restrict search engine access to certain pages if you are trying to increase your rankings, but remember that quality matters more than quantity to search engines: by only allowing them to index pages that would be useful to a searcher, you help keep the search engine indexes clean, and people will find what they want faster.
Common pages you should prevent search engines from indexing include: