This year
Wikio: the good (301 redirect), the bad (302 redirect), or the ugly (a doctored robots.txt file)?
The proliferation of Digg-like sites, aggregators, and other content scrapers calls for closer vigilance over the redirects that affect links pointing to our blogs and websites. Because of Wikio, we will have to check not only the redirects but also the robots.txt files. Roll the tape!
2007
ACAP Launches, Robots.txt 2.0 For Blocking Search Engines?
After a year of discussions, ACAP -- Automated Content Access Protocol -- was released today as a sort of robots.txt 2.0 system for telling search engines what they can or can't include in their listings. However, none of the major search engines support ACAP, and its future remains firmly one of "watch and see."
Robots.txt Generator - McAnerin International Inc.
A good robots.txt generator.
robotstxt.org
This is the main source for information on the robots.txt Robots Exclusion Standard and other articles about writing well-behaved Web robots.
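辛辣インターフェース評議会 - Is Pocket Hatena copyright infringement?
"Conversion-type services normally put in a robots.txt, don't they? (snip) Is Hatena a company that earns its revenue from search engine spam?"
Nico Nico Bookmark (tentative name)
The user-agent of Nico Nico Bookmark is nicobot0.1 (+http://www.nicob.jp/?m=default&a=info&p=help).
If you want to keep only Nico Nico Bookmark from registering your pages, write the following in robots.txt: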
User-Agent: nicobot
Disallow: /
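One way to check that these two lines do what is intended is to feed them to the `urllib.robotparser` module from Python's standard library. This is only a sketch: the helper name `is_allowed` and the `example.com` URL are made up for illustration, and the result reflects how Python's parser matches user-agent names, not necessarily how nicobot itself reads the file.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted above, exactly as they would appear in robots.txt.
RULES = """\
User-Agent: nicobot
Disallow: /
"""


def is_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/page.html"  # hypothetical URL
    # The crawler announces itself as nicobot0.1; Python's parser matches
    # the "nicobot" token as a substring of that name.
    print("nicobot0.1:", is_allowed(RULES, "nicobot0.1", page))
    print("SomeOtherBot:", is_allowed(RULES, "SomeOtherBot", page))
```

Under Python's parser the first check comes back as blocked and the second as allowed, since no rule names any other robot.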
Blocking registration with a tag
You can also block registration by embedding the following meta tag in your HTML.
My best addresses for creating a site
My references as a webmaster. This page also serves to show what can be done with CSS commands.
2006
New Robots.txt Syntax Checker: a validator for robots.txt files
This robots.txt checker is a "validator" that analyzes the syntax of a robots.txt file to see if its format is valid as established by the Robots Exclusion Standard (please read the documentation and the tutorial to learn the basics) or if it contains errors.
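To give a rough idea of what a syntax check of this kind involves, here is a minimal sketch. It is not the validator linked above. The function name `check_robots_txt` is hypothetical, and the sketch assumes that only the two fields of the original standard (`User-agent` and `Disallow`) are valid; a real validator would also handle record boundaries and extension fields.

```python
# Assumption: only the two fields of the original Robots Exclusion
# Standard are accepted; extensions would be reported as unknown.
KNOWN_FIELDS = {"user-agent", "disallow"}


def check_robots_txt(text: str) -> list[tuple[int, str]]:
    """Return (line_number, message) pairs for lines that look malformed."""
    problems = []
    seen_user_agent = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # blank or comment-only line
        if ":" not in line:
            problems.append((number, "missing ':' separator"))
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS:
            problems.append((number, f"unknown field '{field}'"))
        elif field == "user-agent":
            seen_user_agent = True
        elif not seen_user_agent:
            problems.append((number, "Disallow before any User-agent line"))
    return problems


if __name__ == "__main__":
    sample = "Disallow: /\nFoo bar\nUser-agent: *\nDisallow: /tmp/\n"
    for number, message in check_robots_txt(sample):
        print(f"line {number}: {message}")
```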
The Web Robots Pages
Web Robots are programs that traverse the Web automatically. Some people call them Web Wanderers, Crawlers, or Spiders. These pages have further information about these Web Robots.
somesound.org
robots.txt is a PHP script that acts like a normal robots.txt file, but with a few differences. When a Spider attempts to access robots.txt, the script will "disallow" access to a list of pre-defined directories. When a normal user attempts to access robo
