frankgd | 2 Feb 2009 19:02

Newbie, need help!

Hello everybody. I'm a Computer Science student at the University of Havana.
I'm taking part in a new project here at the UH related to simulations
and mathematical modeling, which requires a great deal of computing power,
so we are setting up a small cluster to run those calculations. I'm the one
supposed to install and later administer that cluster, but I've just
started learning about this topic and I've never programmed in parallel
before, so I need somebody to recommend good books, tutorials, URLs, etc.
and give me some advice on the topic. I would appreciate any help.
Thank you
Frank

John Hearns | 3 Feb 2009 07:45

Re: Newbie, need help!

Frank,
  welcome to the list. Here are some places to get you started:

http://www.clustermonkey.net/
http://www.phy.duke.edu/~rgb/brahma/Resources/resources.php

Please continue to ask questions - I'm sure we can help here.


John Hearns | 3 Feb 2009 18:25

IBM Sequoia

 
http://www.theregister.co.uk/2009/02/03/llnl_buys_ibm_supers/

I make this 400 cores per 1U rack unit. How is the counting being done here?

This article says 4096 cores per rack:
http://www.engadget.com/2009/02/03/ibms-sequoia-20x-faster-than-the-worlds-fastest-supercomputer/

That seems more rational - 93 racks adds up to 380928 cores
Kilian CAVALOTTI | 4 Feb 2009 14:19

Re: IBM Sequoia

Hi John,

On Tuesday 03 February 2009 18:25:04 John Hearns wrote:
> http://www.theregister.co.uk/2009/02/03/llnl_buys_ibm_supers/
>
> I make this 400 cores per 1U rack unit. How is the counting being done
> here?

Even for the intermediate "development" system, Dawn, the article quotes 
150,000 cores in 36 racks. Which makes about 100 cores in a 1U space. That's a 
hell of a density...
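
The density figures in this thread can be re-derived with a few lines of Python. This is only a sketch: the 42U rack height is an assumption (neither the posts nor the quoted articles state it), and the function name is made up for illustration.

```python
# Back-of-the-envelope rack density check.
# ASSUMPTION: 42 rack units per rack; the thread does not state the rack height.
RACK_UNITS = 42


def cores_per_u(total_cores: int, racks: int, rack_units: int = RACK_UNITS) -> float:
    """Average number of cores packed into a single rack unit."""
    return total_cores / racks / rack_units


# Dawn, as quoted from the article: 150,000 cores in 36 racks.
dawn_density = cores_per_u(150_000, 36)
print(f"Dawn: about {dawn_density:.0f} cores per U")  # Dawn: about 99 cores per U

# The other article's figure: 4096 cores per rack across 93 racks.
print(f"93 racks x 4096 cores = {93 * 4096} cores")   # 380928
```

With a taller or shorter rack the per-U number shifts accordingly, but it stays in the "about 100 cores per U" range for any common rack height.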

> This article says 4096 cores per rack:
> http://www.engadget.com/2009/02/03/ibms-sequoia-20x-faster-than-the-worlds-
>fastest-supercomputer/ That seems more rational - 93 racks adds up to 380928
> cores

An even weirder number for this system is the amount of memory listed.
1.6TB of main memory for 1.6 million cores, that's a mere 1 MB per core... I 
hope it's a P, not a T.
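
The T-versus-P question is easy to check numerically. A minimal sketch, assuming decimal prefixes (1 TB = 10^12 bytes, 1 PB = 10^15 bytes) and a round 1.6 million cores; the helper name is hypothetical.

```python
# Memory per core for the two possible readings of the quoted figure.
# ASSUMPTION: decimal prefixes and exactly 1,600,000 cores.
TB = 10**12
PB = 10**15
CORES = 1_600_000


def bytes_per_core(total_bytes: int, cores: int = CORES) -> int:
    """Integer number of bytes of main memory available to each core."""
    return total_bytes // cores


# 1.6 TB and 1.6 PB are written as 16/10 of the unit to stay in integer arithmetic.
if_terabytes = bytes_per_core(16 * TB // 10)
if_petabytes = bytes_per_core(16 * PB // 10)

print(f"1.6 TB total: {if_terabytes // 10**6} MB per core")  # 1 MB per core
print(f"1.6 PB total: {if_petabytes // 10**9} GB per core")  # 1 GB per core
```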

Cheers,
-- 
Kilian
John Hearns | 4 Feb 2009 14:23

Re: IBM Sequoia

2009/2/4 Kilian CAVALOTTI <kilian.cavalotti.work <at> gmail.com>:
> An even weirdest number around this system is the amount of memory listed.
> 1.6TB of main memory for 1.6 million cores, that's a mere 1 MB per core... I
> hope it's a P, not a T.
>

Yes, the HPCwire article confirms 1.6 PB.
http://www.hpcwire.com/features/Lawrence-Livermore-Prepares-for-20-Petaflop-Blue-GeneQ-38948594.html
It also examines the core count:
"However, since Sequoia will be composed of 98,304 compute nodes and
contain a total of 1.6 million cores, one can surmise that a Blue
Gene/Q node will contain 16 cores. Whether this is implemented as one
16-core chip or two 8-core chips (or even four quad-core chips)
remains to be seen."
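
The inference in the quoted passage can be reproduced directly. This is a sketch of the arithmetic only: the "1.6 million" total is treated as a rounded figure, which is an assumption on my part rather than something the article states.

```python
# Cores per node implied by the quoted node and core counts.
NODES = 98_304
TOTAL_CORES_QUOTED = 1_600_000  # "1.6 million", presumably rounded

ratio = TOTAL_CORES_QUOTED / NODES
cores_per_node = round(ratio)          # nearest whole number of cores
exact_total = NODES * cores_per_node   # total if every node has that many cores

print(f"{ratio:.2f} -> {cores_per_node} cores per node")  # 16.28 -> 16 cores per node
print(f"{exact_total:,} cores in total")                   # 1,572,864 cores in total
```

At 16 cores per node the total comes to 1,572,864, which rounds to the 1.6 million quoted.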
Joe Landman | 4 Feb 2009 14:45

Re: IBM Sequoia

Kilian CAVALOTTI wrote:

> An even weirdest number around this system is the amount of memory listed. 
> 1.6TB of main memory for 1.6 million cores, that's a mere 1 MB per core... I 
> hope it's a P, not a T.

Could be processors-in-memory ...

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman <at> scalableinformatics.com
web  : http://www.scalableinformatics.com
        http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615
Eugen Leitl | 4 Feb 2009 14:52

Re: IBM Sequoia

On Wed, Feb 04, 2009 at 08:45:08AM -0500, Joe Landman wrote:
> Kilian CAVALOTTI wrote:
> 
> >An even weirdest number around this system is the amount of memory listed. 
> >1.6TB of main memory for 1.6 million cores, that's a mere 1 MB per core... 
> >I hope it's a P, not a T.
> 
> Could be processors-in-memory ...

Does anyone know how they interface memory to the CPU?
This isn't a sandwich package, is it?

-- 
Eugen* Leitl <a href="http://leitl.org">leitl</a> http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
Robert G. Brown | 4 Feb 2009 15:08

Re: IBM Sequoia

On Wed, 4 Feb 2009, Kilian CAVALOTTI wrote:

> Hi John,
>
> On Tuesday 03 February 2009 18:25:04 John Hearns wrote:
>> http://www.theregister.co.uk/2009/02/03/llnl_buys_ibm_supers/
>>
>> I make this 400 cores per 1U rack unit. How is the counting being done
>> here?
>
> Even for the intermediate "development" system, Dawn, the article quotes
> 150,000 cores in 36 racks. Which makes about 100 cores in a 1U space. That's a
> hell of a density...

Wow in spades.  If the cores drew only 2W each, that would be a "normal"
amount of heat for 1U and adds up to being ballpark 10KW per rack.  If
they draw 5W each, that's between 550W and 750W (the latter if there is
a power supply in there with them, although I'm guessing that they'll
provide power through rails and not via onboard power supplies at that
density).  750W makes each rack burn something like 30 KW.  And 5W is
not a lot to drive a high-clock core and memory.
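
For anyone who wants to replay the power estimate, here is a rough sketch. It assumes 42 rack units per rack and the roughly 100 cores per U worked out earlier in the thread; neither number is a published specification, and the function name is made up.

```python
# Rough per-rack power from a per-U power draw.
# ASSUMPTIONS: 42 rack units per rack, about 100 cores per rack unit.
RACK_UNITS = 42
CORES_PER_U = 100


def rack_kw(watts_per_u: float, rack_units: int = RACK_UNITS) -> float:
    """Power drawn by a full rack, in kilowatts."""
    return watts_per_u * rack_units / 1000


low = rack_kw(2 * CORES_PER_U)   # 2 W per core, cores only
bare = rack_kw(5 * CORES_PER_U)  # 5 W per core, cores only
high = rack_kw(750)              # 5 W per core plus the overhead in the 750 W figure

print(f"2 W/core: {low:.1f} kW per rack")     # 8.4 kW, the "ballpark 10 kW" case
print(f"5 W/core: {bare:.1f} kW per rack")    # 21.0 kW before any overhead
print(f"750 W per U: {high:.1f} kW per rack") # 31.5 kW, the "something like 30 kW" case
```

The gap between the bare 500 W per U and the 550-750 W range above is the allowance for memory, interconnect and power conversion.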

One wonders how they are achieving/planning to achieve it.  Clock
throttling?  six to eight 16 core CPUs?  An array of 800 MHz PDA CPUs?
Some combination of the above?  Something else entirely?  I haven't
looked into the power requirements of cores beyond dual and quad, but
just running the bus and attached memory seems like it would make such a
system hot, hot, hot.  Or use slow, slow, slow processors (but a lot of
them).

The latter isn't a crazy idea, depending on the kind of task this
faster-than-fastest system is supposed to be faster on.  Some sort of
massively SIMD decomposable problem with minimal nonlocal IPCs where the
per-processor tasks are modest and nearly independent would get the
near-linear scaling required to use up 1.6 million cores, and it would
explain the 1 MB of memory per core.  Consider each node as representing
(say) 10,000 neurons and you've got a 16 billion neuron neural net with
some sort of semilocal topology. Not bad, actually.
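
The neural-net figure follows from simple multiplication. A sketch of the arithmetic, applying the 10,000-neuron figure per core (that is what yields 16 billion) and taking 1 MB as 10^6 bytes; the bytes-per-neuron number is derived here and is not a figure from the thread.

```python
# Size of the hypothetical neural net.
# ASSUMPTIONS: 1.6 million cores, 10,000 neurons per core, 1 MB = 10**6 bytes.
CORES = 1_600_000
NEURONS_PER_CORE = 10_000
BYTES_PER_CORE = 10**6

total_neurons = CORES * NEURONS_PER_CORE
bytes_per_neuron = BYTES_PER_CORE // NEURONS_PER_CORE

print(f"{total_neurons:,} neurons")                     # 16,000,000,000 neurons
print(f"{bytes_per_neuron} bytes of state per neuron")  # 100 bytes of state per neuron
```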

     rgb

>
>> This article says 4096 cores per rack:
>> http://www.engadget.com/2009/02/03/ibms-sequoia-20x-faster-than-the-worlds-
>> fastest-supercomputer/ That seems more rational - 93 racks adds up to 380928
>> cores
>
> An even weirdest number around this system is the amount of memory listed.
> 1.6TB of main memory for 1.6 million cores, that's a mere 1 MB per core... I
> hope it's a P, not a T.
>
> Cheers,
> -- 
> Kilian
> _______________________________________________
> Beowulf mailing list, Beowulf <at> beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>

Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb <at> phy.duke.edu

Bruno Coutinho | 4 Feb 2009 16:50

Re: IBM Sequoia

Probably it will be similar to Blue Gene/P:
http://en.wikipedia.org/wiki/Blue_Gene#Blue_Gene.2FP

Each compute card could have an octa-core processor and some memory chips.
Like Blue Gene/P, each node card could have 16 compute cards, and each card
will be 2U high (but they will put two side by side, like older Blue Genes).
So they can pack 100 cores in a 1U space without magic! :)

But cooling this thing will not be easy.
Probably they will use something like refrigerated doors (or walls).
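
One way to read the packaging guess above, as a sketch. Every number here is the speculative figure from this post (8 cores per compute card, 16 compute cards per node card, node cards 2U high and mounted two side by side), not a published Blue Gene/Q specification, and the function name is hypothetical.

```python
# Cores per rack unit under the speculative packaging described above.
# ASSUMPTIONS: all four constants are guesses taken from the post, not specs.
CORES_PER_COMPUTE_CARD = 8
COMPUTE_CARDS_PER_NODE_CARD = 16
NODE_CARD_HEIGHT_U = 2
NODE_CARDS_SIDE_BY_SIDE = 2


def packed_cores_per_u(
    cores_per_card: int = CORES_PER_COMPUTE_CARD,
    cards_per_node_card: int = COMPUTE_CARDS_PER_NODE_CARD,
    side_by_side: int = NODE_CARDS_SIDE_BY_SIDE,
    height_u: int = NODE_CARD_HEIGHT_U,
) -> float:
    """Cores per rack unit when node cards are mounted side by side."""
    cores_per_node_card = cores_per_card * cards_per_node_card
    return cores_per_node_card * side_by_side / height_u


print(f"{packed_cores_per_u():.0f} cores per U")  # 128 cores per U
```

Under this reading the layout lands at 128 cores per U, comfortably above the 100 cores per U discussed earlier.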


John Hearns | 4 Feb 2009 17:09

Re: IBM Sequoia

2009/2/4 Bruno Coutinho <coutinho <at> dcc.ufmg.br>:
>>
> Each compute card could have a octa core processor and some memory chips.
> Like Blue Gene/P, each node card could have 16 compute cards and each card
> will be 2U high (but they will put two sideways like older Blue Genes).
> So they can pack 100 cores in a 1U space without magic! :)
>
> But cool this thing will now be easy.
> Probably they will use something like refrigerated doors (or walls).

Special packaging? Engineering effort and expertise given to cooling a
system which would otherwise melt?
Systems cabinets arranged in rings or squares?
Seymour Cray is laughing out loud on his personal cloud of freon
vapour right now.
