Douglas Eadline | 3 May 14:40

Forward: RE:


I know Penguin runs the list, but I'm not sure who
to contact, so I'll forward this to the list. Hopefully
someone will be able to provide an answer.

--
Doug

> Doug
>
> Quick Beowulf question - I've got a problem with the list - the listinfo
> URL doesn't work (for me)
>
> http://www.beowulf.org/mailman/listinfo/beowulf, obvious email addresses
> either bounce or vanish down a /dev/null hole.
>
> Do you know who's in charge and how it's managed?
>
> Thanks in advance for your help
> Regards
> Graham
>
> Graham Mullier
> Head of Information Connection and Design, R&D IS
> Syngenta, Bracknell, RG42 6EY, UK.
> direct line: +44 (0) 1344 414163
> mailto:Graham.Mullier <at> syngenta.com
>
>
>
>

Hearns, John | 1 May 17:10

Intel NUC

http://www.theregister.co.uk/2012/05/01/intel_pi_rival_nuc/

 

Ohhh....

Thinking of how to cool a rack full of these things with 2x 16 GB DIMMs in each.

Looks like you could seal that case and use immersion cooling - partially dip the case in the coolant but leave the top dry?

 

2x mini PCIe slots for that fast interconnect

Though – does anyone know much about networking over Thunderbolt?

 

 

 

John Hearns | CFD Hardware Specialist | McLaren Racing Limited
McLaren Technology Centre, Chertsey Road, Woking, Surrey GU21 4YH, UK


T:  +44 (0) 1483 262000

D:  +44 (0) 1483 262352

F:  +44 (0) 1483 261928 
E:  john.hearns <at> mclaren.com

W: www.mclaren.com

 

The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy.

Mark Hahn | 25 Apr 04:58

yikes: intel buys cray's spine

http://www.eetimes.com/electronics-news/4371639/Cray-sells-interconnect-hardware-unit-to-Intel

that's one market where AMD no longer plays eh?
Prentice Bisbal | 20 Apr 15:37

New industry for Iceland?

Combine this article:

"A Cool Place for Cheap Flops"
http://www.hpcwire.com/hpcwire/2012-04-11/a_cool_place_for_cheap_flops.html

With this paper:

"Relativistic Statistical Arbitrage"
dspace.mit.edu/openaccess-disseminate/1721.1/62859

And it looks like Iceland has a new industry: datacenters for the
high-frequency trading (HFT) gang.

Just remember - you heard it here first, folks! ;)

-- 
Prentice 

Rayson Ho | 19 Apr 20:34

Next release of Open Grid Scheduler & the Gompute User Group Meeting

The next release of Open Grid Scheduler/Grid Engine will come out
at the Gompute User Group Meeting, a free, 2-day HPC event in
Gothenburg, Sweden.

Register for the event at: http://www.simdi.se/

** Please let me know if you are interested in a Grid Engine track.

Gridcore/Gompute contributed booth space at SC11 for the Grid Engine
2011.11 release (the first major release of open-source Grid Engine
after separation from Oracle), and joined the Open Grid Scheduler
project in April 2012.

Rayson

=================================
Open Grid Scheduler / Grid Engine
http://gridscheduler.sourceforge.net/

Scalable Grid Engine Support Program
http://www.scalablelogic.com/
Rayson Ho | 19 Apr 16:26

Re: 2 Security bugs fixed in Grid Engine

Right, the GE2011.11p1.patch diff is against GE2011.11. GE2011.11p1
(i.e. trunk) is compatible with GE2011.11, and GE2011.11 is also
compatible with SGE 6.2u5.

I can quickly create a diff for GE2011.11 during lunch time today -
will let you know when it is done.

Rayson

=================================
Open Grid Scheduler / Grid Engine
http://gridscheduler.sourceforge.net/

Scalable Grid Engine Support Program
http://www.scalablelogic.com/

On Thu, Apr 19, 2012 at 8:31 AM, Taras Shapovalov
<taras.shapovalov <at> brightcomputing.com> wrote:
> Hi,
>
> I am trying to apply GE2011.11p1.patch for GE2011.11 and it fails. It seems,
> the developers of GE have created this patch for the trunk version of GE
> (which is not the same as the stable version). Is it correct?
>
> --
> Best regards,
> Taras
>

-- 
==================================================
Open Grid Scheduler - The Official Open Source Grid Engine
http://gridscheduler.sourceforge.net/
Christopher Samuel | 19 Apr 03:53

Migrating from IB datagram mode to connected mode live ?


Hi folks,

For hysterical raisins we have an IBM iDataPlex system which is
running QDR IB in datagram mode.  To that IB network we'll be adding
another QDR system which can only run in connected mode.

The kicker is that our IB network is used for GPFS over IPoIB and so
our NSDs will need to move to connected mode for the new system.

I've been Googling without success to find out if you can do such a
migration live (i.e. change the servers to connected mode, increase
their MTUs and then migrate clients to connected mode; we have enough
redundancy in servers to do this) or whether we'll need to schedule an
outage and take the whole system down and bring it back up in
connected mode.

Any thoughts?
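
Whichever way the migration is done, it helps to audit which mode and MTU each node is actually in before and after each step. Below is a minimal sketch, assuming the Linux IPoIB driver exposes the per-interface `mode` and `mtu` files under `/sys/class/net`. The function name `ipoib_state` and the `MAX_MTU` table are illustrative, and the datagram limit of 2044 assumes a 2K IB MTU (a fabric configured for a 4K IB MTU would allow 4092).

```python
from pathlib import Path

# Upper bounds on the IPoIB MTU for each transport mode.
# Assumption: datagram mode on a fabric with a 2K IB MTU.
MAX_MTU = {"datagram": 2044, "connected": 65520}


def ipoib_state(iface: str, sysfs_root: str = "/sys/class/net") -> dict:
    """Read the IPoIB mode and MTU of one interface from sysfs."""
    base = Path(sysfs_root) / iface
    mode = (base / "mode").read_text().strip()
    mtu = int((base / "mtu").read_text().strip())
    # An unknown mode string is reported as not OK rather than guessed at.
    return {
        "iface": iface,
        "mode": mode,
        "mtu": mtu,
        "mtu_ok": mtu <= MAX_MTU.get(mode, 0),
    }


if __name__ == "__main__":
    # Report every ib* interface that has a "mode" file; prints nothing
    # on a machine without IPoIB interfaces.
    for path in sorted(Path("/sys/class/net").glob("ib*")):
        if (path / "mode").is_file():
            print(ipoib_state(path.name))
```

Run across servers and clients (for example via a parallel shell), this shows at a glance which nodes are still in datagram mode. It only reports state; it does not answer whether mixing modes during the transition is safe for GPFS.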

cheers,
Chris
-- 
    Christopher Samuel - Senior Systems Administrator
 VLSCI - Victorian Life Sciences Computation Initiative
 Email: samuel <at> unimelb.edu.au Phone: +61 (0)3 903 55545
         http://www.vlsci.unimelb.edu.au/

Prentice Bisbal | 18 Apr 21:02

Re: Questions about upgrading InfiniBand

Aggregation spine? Can you tell me more about that? Can you give me a
part/model number?

Prentice 

On 04/18/2012 11:22 AM, Andrew Howard wrote:
> I would talk to Mellanox about your options for switch topology. We
> opted not to go with the single 648-port FDR director switch, but
> instead use top-of-rack leaf switches (the 36-port guys) and then an
> aggregation spine to connect those. It performs beautifully. It also
> means we don't have to worry about buying longer (more expensive)
> cables to run to the director switch, we can buy the shorter cables to
> run to the rack switch and then only have to buy a few 10M cables to
> run to the spine.
>
> --
> Andrew Howard
> HPC Systems Engineer
> Purdue University
> (765) 889-2523
>
>
>
> On Wed, Apr 18, 2012 at 11:05 AM, Prentice Bisbal <prentice <at> ias.edu
> <mailto:prentice <at> ias.edu>> wrote:
>
>     Beowulfers,
>
>     I'm planning on adding some upgrades to my existing cluster, which has
>     66 compute nodes plus the head node. Networking consists of a Cisco
>     7012 IB switch with 6 out of 12 line cards installed, giving me a
>     capacity of 72 DDR ports, expandable to 144, and two 40-port ethernet
>     switches that have only six extra ports between them.
>
>     I'd like to add a Lustre filesystem (over InfiniBand)  to my cluster,
>     and then begin adding/replacing nodes in the cluster. Obviously, I'll
>     need to increase capacity of both my IB and ethernet networks. The
>     questions I have are about upgrading my InfiniBand.
>
>     1. It looks like QLogic is out of the InfiniBand business. Is Mellanox
>     the only game in town these days?
>
>     2. Due to the size of my cluster, it looks like buying just a
>     core/enterprise IB switch with capacity for ~100 ports is the best
>     option (I don't expect my cluster to go much bigger than this in the
>     next 4-5 years).  Based on those criteria, it looks like the Mellanox
>     IS5100 is my only option. Am I overlooking other options?
>
>     http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=71&menu_section=49
>
>     3. In my searching yesterday, I didn't find any FDR core/enterprise
>     switches with > 36 ports, other than the Mellanox SX6536. At 648
>     ports, the SX6536 is too big for my needs. I've got to be
>     overlooking other products, right?
>
>     http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=122&menu_section=49
>
>     4. Adding an additional line card to my existing switch looks like it
>     will cost me only ~$5,000, and give me the additional capacity I'll
>     need for the next 1-2 years. I'm thinking it makes sense to do that,
>     and wait for affordable FDR switches to come out with the port count
>     I'm looking for instead of upgrading to QDR right now, and start
>     buying hardware with FDR HCAs in preparation for that.  Please feel
>     free to agree/disagree. This brings me to my next question...
>
>     5. FDR and QDR should be backwards compatible with my existing DDR
>     hardware, but how exactly does that work? If I have, say, an FDR
>     switch with a mixture of FDR, QDR, and DDR HCAs, will the whole
>     fabric slow down to the lowest common denominator, or will the
>     slow-down be based on the two nodes involved in the communication
>     only? When I googled for an answer, all I found were marketing
>     documents that guaranteed backwards compatibility, but didn't go to
>     this level of detail. I searched the standard spec (v1.2.1), and
>     didn't find an obvious answer to this question.
>
>     6. I see some Mellanox docs saying their FDR switches are compliant
>     with v1.3 of the standard, but the latest version available for
>     download is 1.2.1. I take it the final version of 1.3 hasn't been
>     ratified yet. Is that correct?
>
>     --
>     Prentice
>
>     _______________________________________________
>     Beowulf mailing list, Beowulf <at> beowulf.org
>     <mailto:Beowulf <at> beowulf.org> sponsored by Penguin Computing
>     To change your subscription (digest mode or unsubscribe) visit
>     http://www.beowulf.org/mailman/listinfo/beowulf
>
>
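
The leaf-and-spine layout Andrew describes in this thread can be sized with a little arithmetic. The sketch below assumes a non-blocking two-tier fat tree built from identical switches, with half of each leaf's ports facing nodes and half facing the spine. Andrew's message does not say what oversubscription ratio Purdue used, so the 1:1 ratio and the function name `size_leaf_spine` are assumptions for illustration only.

```python
def size_leaf_spine(nodes: int, radix: int = 36) -> dict:
    """Count switches and uplink cables for a non-blocking two-tier fat tree.

    Assumes identical switches with `radix` ports, half of each leaf's
    ports going down to nodes and half going up to the spine.
    """
    if nodes < 1:
        raise ValueError("need at least one node")
    down = radix // 2                 # node-facing ports per leaf
    up = radix - down                 # spine-facing ports per leaf
    leaves = -(-nodes // down)        # ceiling division
    # Every spine must reach every leaf, so a spine with `radix` ports
    # cannot serve more than `radix` leaves.
    if leaves > radix:
        raise ValueError("too many nodes for a two-tier tree of this radix")
    uplinks = leaves * up
    spines = -(-uplinks // radix)     # ceiling division
    return {
        "leaf_switches": leaves,
        "spine_switches": spines,
        "leaf_to_spine_cables": uplinks,
    }


if __name__ == "__main__":
    # Roughly the ~100-port target mentioned in the original question.
    print(size_leaf_spine(100))
```

For about 100 nodes and 36-port switches this gives 6 leaf switches, 3 spine switches and 108 leaf-to-spine cables. The same arithmetic tops out at 36 leaves x 18 node ports = 648 nodes, which matches the port count of the 648-port director switch mentioned above.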
Prentice Bisbal | 18 Apr 17:05

Questions about upgrading InfiniBand

Beowulfers,

I'm planning on adding some upgrades to my existing cluster, which has
66 compute nodes plus the head node. Networking consists of a Cisco
7012 IB switch with 6 out of 12 line cards installed, giving me a
capacity of 72 DDR ports, expandable to 144, and two 40-port ethernet
switches that have only six extra ports between them.
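
The IB headroom implied by those numbers can be worked out directly. This is a small sketch, assuming each node (including the head node) uses exactly one IB port and nothing else is plugged into the switch, which the message does not state.

```python
# Figures taken from the paragraph above.
installed_cards = 6
installed_ports = 72
max_ports = 144
nodes = 66 + 1  # compute nodes plus the head node

# Assumption: all line cards have the same number of ports.
ports_per_card = installed_ports // installed_cards

# Assumption: one IB port per node, no other devices on the switch.
free_now = installed_ports - nodes
free_after_one_card = free_now + ports_per_card

print(f"ports per line card:      {ports_per_card}")
print(f"free IB ports today:      {free_now}")
print(f"free after one more card: {free_after_one_card}")
print(f"free with a full chassis: {max_ports - nodes}")
```

Under those assumptions there are 5 free IB ports today, and one extra 12-port line card raises that to 17, which is the headroom item 4 below is weighing against a new switch.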

I'd like to add a Lustre filesystem (over InfiniBand)  to my cluster,
and then begin adding/replacing nodes in the cluster. Obviously, I'll
need to increase capacity of both my IB and ethernet networks. The
questions I have are about upgrading my InfiniBand.

1. It looks like QLogic is out of the InfiniBand business. Is Mellanox
the only game in town these days?

2. Due to the size of my cluster, it looks like buying just a
core/enterprise IB switch with capacity for ~100 ports is the best
option (I don't expect my cluster to go much bigger than this in the
next 4-5 years).  Based on those criteria, it looks like the Mellanox
IS5100 is my only option. Am I overlooking other options?

http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=71&menu_section=49

3. In my searching yesterday, I didn't find any FDR core/enterprise
switches with > 36 ports, other than the Mellanox SX6536. At 648 ports,
the SX6536 is too big for my needs. I've got to be overlooking other
products, right?

http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=122&menu_section=49

4. Adding an additional line card to my existing switch looks like it
will cost me only ~$5,000, and give me the additional capacity I'll need
for the next 1-2 years. I'm thinking it makes sense to do that, and wait
for affordable FDR switches to come out with the port count I'm looking
for instead of upgrading to QDR right now, and start buying hardware
with FDR HCAs in preparation for that.  Please feel free to
agree/disagree. This brings me to my next question...

5. FDR and QDR should be backwards compatible with my existing DDR
hardware, but how exactly does that work? If I have, say, an FDR switch
with a mixture of FDR, QDR, and DDR HCAs, will the whole fabric slow down
to the lowest common denominator, or will the slow-down be based on the
two nodes involved in the communication only? When I googled for an
answer, all I found were marketing documents that guaranteed backwards
compatibility, but didn't go to this level of detail. I searched the
standard spec (v1.2.1), and didn't find an obvious answer to this question.

6. I see some Mellanox docs saying their FDR switches are compliant with
v1.3 of the standard, but the latest version available for download is
1.2.1. I take it the final version of 1.3 hasn't been ratified yet. Is
that correct?

-- 
Prentice 

Rayson Ho | 18 Apr 02:06

2 Security bugs fixed in Grid Engine

There were 2 security related bugs fixed and released in Grid Engine today:

- Code injection via LD_* environment variables
- sgepasswd buffer overflow

Oracle fixed both of them in their CPU (Critical Patch Update) release
for Oracle Grid Engine this afternoon.
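
To illustrate the first class of bug: a privileged helper that launches a program with the caller's environment intact lets the caller steer the dynamic linker through variables such as LD_PRELOAD or LD_LIBRARY_PATH. The sketch below shows the general mitigation of dropping every `LD_*` variable before spawning a child process. It is not the actual Grid Engine patch, and the function names `scrub_ld_vars` and `run_scrubbed` are made up for this example.

```python
import os
import subprocess
from typing import Mapping, Optional, Sequence


def scrub_ld_vars(env: Mapping[str, str]) -> dict:
    """Return a copy of env without any LD_* dynamic-linker variables."""
    return {k: v for k, v in env.items() if not k.startswith("LD_")}


def run_scrubbed(
    cmd: Sequence[str], env: Optional[Mapping[str, str]] = None
) -> subprocess.CompletedProcess:
    """Run cmd with LD_* variables removed from its environment."""
    base = os.environ if env is None else env
    return subprocess.run(list(cmd), env=scrub_ld_vars(base), check=True)
```

Dropping LD_LIBRARY_PATH wholesale can break jobs that legitimately depend on it, so a real scheduler has to decide where in the job launch path the scrubbing applies. Applying the patched binaries from the URL below remains the fix.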

For Sun Grid Engine (6.2u5) and Open Grid Scheduler/Grid Engine, visit:

http://gridscheduler.sourceforge.net/security.html

The first one was found by William Hay back in Nov 2011, and the
second one was reported to Oracle by an outside security researcher.
The details of the bug were passed on to me, and we (all the Grid
Engine forks) decided that we should share any security-related
information instead of putting it in marketing slides.

Download patches and pre-compiled binaries for:

- SGE 6.2u5, 6.2u5p1, 6.2u5p2
- Open Grid Scheduler/Grid Engine 2011.11

from the URL above.

To apply the patches, just replace the older version of the binaries
with the newer version.

Rayson

=================================
Open Grid Scheduler / Grid Engine
http://gridscheduler.sourceforge.net/

Scalable Grid Engine Support Program
http://www.scalablelogic.com/
Hearns, John | 17 Apr 17:26

Ubuntu MAAS

I read a ZDNet article on Ubuntu LTS pitching to be your cloud and data centre distribution of choice.

It mentions Ubuntu Metal-as-a-Service

 

http://www.markshuttleworth.com/archives/1103

 

https://wiki.ubuntu.com/ServerTeam/MAAS/

 

I guess this is what clustering types have been doing for a long time with various cluster deployment and management suites.

 

Also note Mark Shuttleworth's comment about the cost of the OS per node:

“As we enter an era in which ATOM is as important in the data centre as XEON, an operating system like Ubuntu makes even more sense”

I guess this chimes with the initial Beowulfery spirit – when you have low-cost nodes, why use an OS (whether it is Windows, Solaris, etc.) which is a significant fraction of the node's cost?

 

 

John Hearns | CFD Hardware Specialist | McLaren Racing Limited
McLaren Technology Centre, Chertsey Road, Woking, Surrey GU21 4YH, UK


T:  +44 (0) 1483 262000

D:  +44 (0) 1483 262352

F:  +44 (0) 1483 261928 
E:  john.hearns <at> mclaren.com

W: www.mclaren.com

 


