1 Feb 2008 14:17
Prevent bots (Google) from indexing WebSVN
jiho <jo.irisson <at> gmail.com>
2008-02-01 13:17:28 GMT
Hello everyone,

I've been happily using WebSVN for a while to make my repositories public in a nice and friendly manner. The repositories are hosted on a Linux box (Fedora Core 8) running Apache, and they use MultiViews. RSS is disabled but tarballs are enabled.

They were recently indexed by Google and MSN, as the Apache logs show:

65.55.208.171 - - [27/Jan/2008:04:17:18 +0100] "GET /##whatever##/models_larvae/?rev=52&sc=0 HTTP/1.0" 200 6567 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
65.55.208.171 - - [27/Jan/2008:04:17:19 +0100] "GET /##whatever##/models_larvae/?rev=35&sc=0 HTTP/1.0" 200 6411 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
[...]
66.249.66.50 - - [01/Feb/2008:14:34:17 +0100] "GET /##whatever##/ownfor/bbscript/trunk/doc/figures/deco/bluefish.png?op=diff&rev=&sc=1 HTTP/1.1" 200 4039 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.50 - - [01/Feb/2008:14:34:23 +0100] "GET /##whatever##/ownfor/bbscript/trunk/src/GNU_GPL.txt?op=log&rev=6&sc=1&isdir=0 HTTP/1.1" 200 6326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
[...]

These requests generated a huge number of files in /tmp, which completely filled the hard drive of the machine. In addition, even if these repositories are public, I would rather point people to the address and prevent [...]
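
To get a rough idea of how much of the traffic comes from crawlers, the access log can be tallied by user agent. The following is only a minimal sketch: it assumes the log uses Apache's "combined" format (as the excerpts above appear to), and the names `LOG_LINE`, `BOT_MARKERS`, and `count_bot_hits` are made up for illustration, not part of WebSVN or Apache.

```python
import re
from collections import Counter

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Substrings identifying the two crawlers seen in the log excerpts.
# Other bots would need their own entries (hypothetical list).
BOT_MARKERS = ("googlebot", "msnbot")


def count_bot_hits(lines):
    """Count requests per crawler, matching on the user-agent field."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed format
        agent = match.group("agent").lower()
        for marker in BOT_MARKERS:
            if marker in agent:
                hits[marker] += 1
                break
    return hits


if __name__ == "__main__":
    sample = [
        '65.55.208.171 - - [27/Jan/2008:04:17:18 +0100] '
        '"GET /##whatever##/models_larvae/?rev=52&sc=0 HTTP/1.0" 200 6567 '
        '"-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"',
        '66.249.66.50 - - [01/Feb/2008:14:34:23 +0100] '
        '"GET /##whatever##/ownfor/bbscript/trunk/src/GNU_GPL.txt'
        '?op=log&rev=6&sc=1&isdir=0 HTTP/1.1" 200 6326 '
        '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; '
        '+http://www.google.com/bot.html)"',
    ]
    for bot, count in count_bot_hits(sample).items():
        print(bot, count)
```

In practice the function could be fed an open log file directly, e.g. `count_bot_hits(open(path))`, where `path` is wherever the server writes its access log.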