Re: IBM Sequoia
Bruno Coutinho <coutinho <at> dcc.ufmg.br>
2009-02-04 15:50:26 GMT
It will probably be similar to Blue Gene/P:
http://en.wikipedia.org/wiki/Blue_Gene#Blue_Gene.2FP
Each compute card could have an octa-core processor and some memory chips.
Like Blue Gene/P, each node card could have 16 compute cards and each card will be 2U high (but they will put two sideways like older Blue Genes).
So they can pack 100 cores in a 1U space without magic! :)
But cooling this thing will not be easy.
Probably they will use something like refrigerated doors (or walls).
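The packing estimate above can be replayed as arithmetic. This is only a sketch under one reading of the description: that a node card is 2U high and that two node cards sit side by side in that space. All numbers are the hypothetical ones from this message, not confirmed Sequoia specifications.

```python
# Hypothetical packaging numbers from the message above, not confirmed specs.
CORES_PER_COMPUTE_CARD = 8        # "octa core processor"
COMPUTE_CARDS_PER_NODE_CARD = 16  # "16 compute cards"
NODE_CARD_HEIGHT_U = 2            # assumed reading: a node card is 2U high
NODE_CARDS_SIDE_BY_SIDE = 2       # assumed reading: two node cards share that 2U

cores_per_node_card = CORES_PER_COMPUTE_CARD * COMPUTE_CARDS_PER_NODE_CARD
cores_per_u = cores_per_node_card * NODE_CARDS_SIDE_BY_SIDE / NODE_CARD_HEIGHT_U

print(f"{cores_per_node_card} cores per node card")
print(f"{cores_per_u:.0f} cores per 1U")
```

Under these assumptions the result lands a little above the 100 cores per 1U discussed in the thread, which leaves some room for non-compute hardware.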
2009/2/4 Robert G. Brown
<rgb <at> phy.duke.edu>
On Wed, 4 Feb 2009, Kilian CAVALOTTI wrote:
Hi John,
On Tuesday 03 February 2009 18:25:04 John Hearns wrote:
http://www.theregister.co.uk/2009/02/03/llnl_buys_ibm_supers/
I make this 400 cores per 1U rack unit. How is the counting being done
here?
Even for the intermediate "development" system, Dawn, the article quotes
150,000 cores in 36 racks. Which makes about 100 cores in a 1U space. That's a
hell of a density...
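The density figure can be checked with a little arithmetic. A minimal sketch, assuming a standard 42U rack (the rack height is an assumption; the article only gives cores and racks):

```python
# Rough check of the Dawn density quoted above.
# RACK_UNITS = 42 is an assumption (typical full-height rack), not from the article.
DAWN_CORES = 150_000
DAWN_RACKS = 36
RACK_UNITS = 42


def cores_per_u(cores: int, racks: int, rack_units: int = RACK_UNITS) -> float:
    """Average number of cores per 1U of rack space."""
    return cores / (racks * rack_units)


dawn_density = cores_per_u(DAWN_CORES, DAWN_RACKS)
print(f"Dawn: about {dawn_density:.0f} cores per 1U")
```

With a shorter or taller rack the figure shifts, but it stays close to 100 cores per 1U for any common rack height.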
Wow in spades. If the cores drew only 2W each, that would be a "normal"
amount of heat for 1U and adds up to being ballpark 10 kW per rack. If
they draw 5W each, that's between 550W and 750W (the latter if there is
a power supply in there with them, although I'm guessing that they'll
provide power through rails and not via onboard power supplies at that
density). 750W makes each rack burn something like 30 kW. And 5W is
not a lot to drive a high-clock core and memory.
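The estimate above is easy to replay with other numbers. The sketch below assumes about 100 cores per 1U and a 42U rack (both round-number assumptions), and it counts core power only, so the 5 W case comes out at 500 W per 1U rather than the 550W to 750W range, which allows for overhead such as a power supply.

```python
CORES_PER_U = 100   # assumption: the rough density discussed in this thread
RACK_UNITS = 42     # assumption: typical full-height rack


def rack_power_kw(watts_per_u: float, rack_units: int = RACK_UNITS) -> float:
    """Total rack power in kW for a given power draw per 1U."""
    return watts_per_u * rack_units / 1000.0


for watts_per_core in (2, 5):
    per_u = watts_per_core * CORES_PER_U
    print(f"{watts_per_core} W/core: {per_u} W per 1U, "
          f"{rack_power_kw(per_u):.1f} kW per rack")

# The upper bound quoted above, including overhead.
print(f"750 W per 1U: {rack_power_kw(750):.1f} kW per rack")
```

The 2 W case gives a bit under 10 kW per rack and the 750 W case a bit over 30 kW, in line with the ballpark figures in the text.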
One wonders how they are achieving/planning to achieve it. Clock
throttling? Six to eight 16-core CPUs? An array of 800 MHz PDA CPUs?
Some combination of the above? Something else entirely? I haven't
looked into the power requirements of cores beyond dual and quad, but
just running the bus and attached memory seems like it would make such a
system hot, hot, hot. Or use slow, slow, slow processors (but a lot of
them).
The latter isn't a crazy idea, depending on the kind of task this
faster-than-fastest system is supposed to be faster on. Some sort of
massively SIMD decomposable problem with minimal nonlocal IPCs where the
per-processor tasks are modest and nearly independent would get the
near-linear scaling required to use up 1.6 million cores, and it would
explain the 1 MB of memory per core. Consider each node as representing
(say) 10,000 neurons and you've got a 16 billion neuron neural net with
some sort of semilocal topology. Not bad, actually.
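The two figures in the paragraph above follow from simple division and multiplication. A small sketch, assuming "1.6TB" means decimal terabytes and treating each core as one node (which is what the 16 billion total implies):

```python
TOTAL_CORES = 1_600_000        # "1.6 million cores"
TOTAL_MEMORY_BYTES = 1.6e12    # "1.6TB", read as decimal terabytes (assumption)
NEURONS_PER_NODE = 10_000      # the "(say) 10,000 neurons" per node

# Memory available to each core, in decimal megabytes.
memory_per_core_mb = TOTAL_MEMORY_BYTES / TOTAL_CORES / 1e6

# One node per core is assumed here; that is what yields 16 billion.
total_neurons = TOTAL_CORES * NEURONS_PER_NODE

print(f"{memory_per_core_mb:.1f} MB per core")
print(f"{total_neurons / 1e9:.0f} billion neurons")
```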
rgb
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb <at> phy.duke.edu
On Wed, 4 Feb 2009, Kilian CAVALOTTI wrote:
On Tuesday 03 February 2009 18:25:04 John Hearns wrote:
This article says 4096 cores per rack:
http://www.engadget.com/2009/02/03/ibms-sequoia-20x-faster-than-the-worlds-fastest-supercomputer/
That seems more rational - 93 racks adds up to 380928 cores
An even weirder number around this system is the amount of memory listed.
1.6TB of main memory for 1.6 million cores, that's a mere 1 MB per core... I
hope it's a P, not a T.
Cheers,
--
Kilian
_______________________________________________
Beowulf mailing list, Beowulf <at> beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf