
But wait! Isn't that what robots.txt is?

PureKrome edited this page Nov 24, 2014 · 1 revision

Nope - not quite.

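A robots.txt file tells a robot/crawler which resources it can and cannot crawl. Its rules can name the paths of individual resources (like about me or contact us), and it can also point to sitemaps (like the ones this library creates). The sitemap is the list of URLs; robots.txt only sets the crawling rules and says where the sitemaps live.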

Example:

User-agent: *
Allow: / #everything from the root and below is allowed
Sitemap: http://www.myWebSite.com/sitemap/products
Sitemap: http://www.myWebSite.com/sitemap/users
...

or

User-agent: *
Disallow: #nothing is blocked
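
To see how a crawler might read the two examples above, here is a minimal sketch using Python's standard-library `urllib.robotparser` (the `site_maps()` call needs Python 3.8 or later). The `inspect_robots` helper, the `ExampleBot` user agent and the `/products/42` URL are made up for illustration; they are not part of this library.

```python
from urllib.robotparser import RobotFileParser

# The first example from above, without the elided "..." line.
WITH_SITEMAPS = """\
User-agent: *
Allow: / #everything from the root and below is allowed
Sitemap: http://www.myWebSite.com/sitemap/products
Sitemap: http://www.myWebSite.com/sitemap/users
"""

# The second example from above: an empty Disallow blocks nothing.
NOTHING_BLOCKED = """\
User-agent: *
Disallow: #nothing is blocked
"""


def inspect_robots(robots_txt, user_agent, url):
    """Hypothetical helper: return (may_crawl, sitemap_urls) for a robots.txt body."""
    parser = RobotFileParser()
    # parse() takes the file's lines; trailing "#..." comments are ignored.
    parser.parse(robots_txt.splitlines())
    # site_maps() returns a list of Sitemap URLs, or None if there are none.
    return parser.can_fetch(user_agent, url), parser.site_maps()


# Assumed URL and user agent, for illustration only.
page = "http://www.myWebSite.com/products/42"

allowed_1, sitemaps_1 = inspect_robots(WITH_SITEMAPS, "ExampleBot", page)
allowed_2, sitemaps_2 = inspect_robots(NOTHING_BLOCKED, "ExampleBot", page)

print(allowed_1, sitemaps_1)
print(allowed_2, sitemaps_2)
```

Both files allow the crawl, but only the first one tells the crawler where to find the sitemaps. That is the split: robots.txt carries the rules and the pointers, the sitemap carries the actual list of URLs.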

For more info, read about the standard on the official wiki page.
