# Define access restrictions for robots/spiders
# http://www.robotstxt.org/wc/norobots.html

# This is patterned loosely off of weblion's file:
# http://weblion.psu.edu/robots.txt

User-agent: *
Disallow: /search
Disallow: /files

# Add Googlebot-specific syntax extension to exclude forms
# that are repeated for each piece of content in the site.
# The wildcard is only supported by Googlebot:
# http://www.google.com/support/webmasters/bin/answer.py?answer=40367&ctx=sibling
# Googlebot obeys only the most specific matching group, so the
# general rules above are repeated in this group.
User-agent: Googlebot
Disallow: /search
Disallow: /files
Disallow: /*sendto_form$
Disallow: /*folder_factories$
Disallow: /*?searchterm=*
# Don't spider all the old revisions of stuff. 4 hits a second from Googlebot
# eats two of our logical cores for not much added value:
Disallow: /*?rev=*
Disallow: /*&rev=*