David Nicol | 5 Jan 19:02 2011

new supercomputing metric: gigaflops per watt


The Air Force has wired together almost two thousand PlayStation 3s at Wright-Patterson AFB. The article doesn't say what OS they're using, or whether they're running code on the graphics chips.

http://www.airforcetimes.com/news/2010/12/air-force-playstation-3-supercomputer-122410w/

--
"The aeroplane is fatally defective. It is merely a toy—a sporting play-thing. It can never become commercially practical." -- Nikola Tesla
Michienne Dixon | 11 Jan 19:49 2011

Looking for a Systems builder

We are looking to replace our mail appliance.  I am currently taking bids on a new mail server and LDAP system that will run as a Xen guest in a clustered environment.
Please contact me off list for more information.
 
-
Michienne Dixon
Network Administrator
liNKCity
312 Armour Rd.
North Kansas City, MO  64116
www.linkcity.org
(816) 412-7990
 
Jonathan Hutchins | 16 Jan 21:53 2011

robots.txt question

I'm wondering about the syntax.  The example file from Drupal uses the format

Disallow: /aggregator

However, the comments say that only the root /robots.txt file is valid.

From my understanding of the syntax, /aggregator does not
block /foo/aggregator, so I need to either prepend "/foo" to everything or
use wildcards per the new Google/webcrawler extensions to the protocol.

If anybody can cite an on-line example that explains this, I'd be grateful.

Jack | 17 Jan 13:26 2011

Re: robots.txt question

Let me explain a bit.

To exclude all robots that respect the robots.txt file:

User-agent: *
Disallow: /

To exclude just one directory and its subdirectories, say, the /aggregator/ directory:

User-agent: *
Disallow: /aggregator/

To disallow a specific robot you need to know what it calls itself; ia_archiver, for example, is the Wayback Machine's crawler.

To allow the Internet Archive bot you'd add a record like this:

User-agent: ia_archiver
Disallow:

To block ia_archiver from visiting:

User-agent: ia_archiver
Disallow: /

You can have as many records like this as you want. So you can disallow all robots from everywhere and then allow only the ones you want (see the example further down), block certain robots from certain parts of the site, or block directories, subdirectories, and individual files. If you have numerous "aggregator" directories in various subdirectories that you want to block, you need to list them all.

Like this:

User-agent: *
Disallow: /aggregator/
Disallow: /foo/aggregator/
...
Disallow: /hidden/aggregator/

Your syntax looks wonky; it's missing the final "/".
User-agent says which robots a record applies to, and Disallow says what to block. All of this assumes well-behaved robots; the file is useless against those that ignore it. It is not a security device, just a polite sticky note.
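
For example, a minimal file that turns away everything except the Internet Archive bot could look like this (a robot obeys the "*" record only when no record names it specifically):

User-agent: ia_archiver
Disallow:

User-agent: *
Disallow: /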

You might go here for more detailed info; I'm no expert, for sure.

http://www.robotstxt.org/orig.html

Jack

Jonathan Hutchins | 17 Jan 17:21 2011

Re: robots.txt question

Thanks for at least trying.  Yes, I'm aware of robotstxt.org.  My question was 
not regarding user-agents, but about whether a string such as /foo blocks 
only http://website.org/foo, or if it would block 
http://website.org/blah/foo.  I think it would not, but I'm looking for 
documentation, references, or examples that specifically address that.

From reading Google's reference on robots.txt as extended by Google, 
Microsoft, and Yahoo, the most obvious interpretation would be that I need to 
block /*foo, but again, no specific confirmation.

As for the trailing slash, again, no specific reference to that format so far.

Jack | 17 Jan 18:14 2011

Re: robots.txt question

Actually, let me refine my previous email.

Per the original spec, the Disallow value is a path prefix, matched from the start of the URL path. "Disallow: /aggregator" blocks anything whose path starts with /aggregator: the /aggregator directory, everything under it, and also files like /aggregator.html. "Disallow: /aggregator/", with the final slash, blocks just the directory and its contents, not /aggregator.html.

Without the slash you block directories and pages by that name; with the final slash you block just the directory. Either way the match is anchored at the root, so neither form blocks /foo/aggregator/ or /long/deep/path/to/obscure/folder/aggregator/ by itself; those need their own lines, or the wildcard extensions.

There's no reason to add more than one robots.txt file, and only the root one is read anyway, so put all your rules there.
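
As a sketch of the wildcard route (an extension from Google, Microsoft, and Yahoo, not part of the original standard, so older robots may ignore it):

User-agent: *
Disallow: /aggregator/
Disallow: /*/aggregator/

The "*" in the second line matches any leading path, so it should catch /foo/aggregator/ and deeper ones as well.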

You can test all of this yourself: use wget to crawl your site the way a robot would. You can also make wget ignore the robots.txt file, have it identify itself as any robot you like, or even make it your own robot that you allow to mirror your page.

Caveat: if you make wget ignore the robots.txt file, you should also add a pause so you don't hammer the site you are downloading or mirroring. Some sites specifically disallow wget in recursive mode to keep from getting hammered by downloads.
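
For instance (example.org is a placeholder):

# Crawl recursively while identifying as the Wayback Machine's bot;
# wget honors robots.txt by default:
wget --recursive --user-agent="ia_archiver" http://example.org/

# Ignore robots.txt when testing your own site, pausing 2 seconds
# between requests so the server doesn't get hammered:
wget --recursive -e robots=off --wait=2 http://example.org/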

Jack

Jim Herrmann | 17 Jan 22:27 2011

4G Cards

What Sprint 4G cards work with Linux?  I am making the bold assumption that there must be someone on here who knows this off the top of their head.  :-)

It appears that 3G is well supported, albeit with some USB tweaks.  But from what I've found on drivers, I'm not hopeful about 4G speeds being supported on Linux.
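
(The tweaks I mean are the usual usb_modeswitch dance to flip the stick out of its fake-CD-drive mode so the modem shows up. Roughly, with hypothetical USB IDs; check lsusb for yours:

lsusb                  # note the vendor:product pair, e.g. 1410:5010
sudo usb_modeswitch -v 0x1410 -p 0x5010 -c /etc/usb_modeswitch.d/1410:5010
dmesg | tail           # a ttyUSB* modem device should appear

After that, NetworkManager or wvdial can dial it.)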

Help me choose.

Thanks,
Jim

Jim Herrmann | 17 Jan 23:32 2011

Re: 4G Cards

OK, I may have answered my own question, but I would still like to hear others' opinions.

The Overdrive hotspot gives me a 4G connection and allows up to 5 WiFi connections to it, which means any OS can use it.  Sound good?  Yes, I know

Let me know what you think.

Jim

Jonathan Hutchins | 17 Jan 23:50 2011

Re: robots.txt question

BTW, for anybody interested in the legal relevance of a /robots.txt file
(basically zilch), please see http://www.robotstxt.org/faq/legal.html

Jack | 18 Jan 00:40 2011

Re: robots.txt question

Before anyone gets too cocky about the legal ramifications of robots.txt files, let me just warn you: the person who wrote that piece is not a lawyer, and it would still be irrelevant even if he were. There is nothing preventing legal authorities from charging you under the Computer Fraud and Abuse Act (CFAA) because you violated a robots.txt file. What a judge will decide is yet another unknown.

Now you may say, or experts may say, that they'd have no case. That may all be true, but if you get arrested you will have an arrest record, lose time fighting an open-and-shut case, spend a bunch of money on an attorney to defend yourself, most likely not have the same kind of lawyers as Google, and possibly lose the first case and go to jail. Not to mention a whole lot of stress and other intangibles.

Do not get your legal advice from friends and strangers on the Internet. Talk to a lawyer if you have questions.

Jack

