But wait! Isn't that what robots.txt is?
PureKrome edited this page Nov 24, 2014 · 1 revision
Nope - not quite.
A robots.txt file can list the URLs of individual resources (like About Me or Contact Us), the locations of sitemaps (like the ones this library creates), and rules for which resources a robot/crawler can or cannot crawl and index.
Example:
User-agent: *
Allow: / #everything from the root and below is allowed
Sitemap: http://www.myWebSite.com/sitemap/products
Sitemap: http://www.myWebSite.com/sitemap/users
...or
User-agent: *
Disallow: #nothing is blocked
For more info, read about the standard on the official wiki page.
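To make the difference concrete, here is a small sketch of how a crawler might read the two examples above. It uses Python's standard `urllib.robotparser` module purely for illustration; it is not part of this library, and the helper name `read_robots` and the product URL being checked are assumptions made up for the example. Note that `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# The two robots.txt examples from this page, kept as plain strings.
ROBOTS_WITH_SITEMAPS = """\
User-agent: *
Allow: / #everything from the root and below is allowed
Sitemap: http://www.myWebSite.com/sitemap/products
Sitemap: http://www.myWebSite.com/sitemap/users
"""

ROBOTS_NOTHING_BLOCKED = """\
User-agent: *
Disallow: #nothing is blocked
"""


def read_robots(text):
    """Hypothetical helper: parse robots.txt text without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser_one = read_robots(ROBOTS_WITH_SITEMAPS)
parser_two = read_robots(ROBOTS_NOTHING_BLOCKED)

# Hypothetical page URL, used only to ask "may a crawler fetch this?"
page = "http://www.myWebSite.com/products/1"

# robots.txt only points at the sitemaps; the sitemap files themselves
# (what this library generates) are separate documents.
print(parser_one.site_maps())
print(parser_one.can_fetch("*", page))

# The second file declares no sitemaps and blocks nothing.
print(parser_two.site_maps())
print(parser_two.can_fetch("*", page))
```

The point of the sketch is that robots.txt holds crawl rules and pointers to sitemaps, while the list of pages lives in the sitemap files that those `Sitemap:` lines reference.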