Fugu | 2 Mar 2006 00:20

Runtime optimization??

Why aren't the cruncher cores compiled using FFTW-like optimizations?
Assembly code may be slightly faster, but it needs to be rewritten for every platform...
 


--
FIN|ACK
_______________________________________________
rc5 mailing list
rc5@...
http://lists.distributed.net/mailman/listinfo/rc5

Re: Runtime optimization??


On Mar 1, 2006, at 8:20 PM, Fugu wrote:

> Why aren't the cruncher cores compiled using FFTW-like optimizations?
> Assembly code may be slightly faster, but it needs to be rewritten for
> every platform...

This technique is worthwhile in FFTW (and even then, specialized  
packages handily beat FFTW) because of issues like memory/cache  
organization, size, speed, etc. RC5 is immune to those variations --  
it is only affected by register allocation and instruction scheduling.

An effective FFTW-like optimizer for RC5 would be pretty similar to a  
general C compiler's code generator/optimizer. Not only is this an  
overly complicated piece of software, but it can't schedule  
instructions all that well anyway -- just look at the performance of  
C cores. Plus I dare any compiler's code generator to produce
something like kakace's software-pipelined AltiVec+integer RC5 core.

By the way, `assembly code may be slightly faster' qualifies for  
understatement of the year. Even generic assembly code, without  
targeting a specific processor, would easily beat the C cores.

Décio
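
To make the register-allocation point above concrete, here is a minimal plain-C sketch of RC5-32/12 following Rivest's published description of the cipher. It is not one of the distributed.net cores, and the names (`rc5_setup`, `rc5_encrypt`, `rc5_decrypt`, `RC5_MAX_KEY_BYTES`) are made up for illustration. The whole working set is a 26-word table plus a handful of scalars, and every step is an add, an XOR, or a data-dependent rotate, so there is no memory-access pattern for an FFTW-style planner to tune.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RC5_ROUNDS 12                     /* r in RC5-32/r/b */
#define RC5_T (2 * (RC5_ROUNDS + 1))      /* words in the expanded key table */
#define RC5_MAX_KEY_BYTES 16              /* illustrative limit for this sketch */

/* Rotate left/right; only the low 5 bits of n are used. */
static uint32_t rotl32(uint32_t x, uint32_t n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

static uint32_t rotr32(uint32_t x, uint32_t n)
{
    n &= 31;
    return (x >> n) | (x << ((32 - n) & 31));
}

/* Key expansion: fills S[0..RC5_T-1] from a key of key_len bytes. */
void rc5_setup(const uint8_t *key, size_t key_len, uint32_t S[RC5_T])
{
    uint32_t L[RC5_MAX_KEY_BYTES / 4] = {0};
    size_t c = key_len ? (key_len + 3) / 4 : 1;   /* key length in words */
    uint32_t A = 0, B = 0;
    size_t i = 0, j = 0, k;

    assert(key_len <= RC5_MAX_KEY_BYTES);

    /* Load the key bytes into words, little-endian. */
    for (k = key_len; k-- > 0; )
        L[k / 4] = (L[k / 4] << 8) + key[k];

    /* Initialise the table from the constants P32 and Q32. */
    S[0] = 0xB7E15163u;
    for (k = 1; k < RC5_T; k++)
        S[k] = S[k - 1] + 0x9E3779B9u;

    /* Mix the key into the table: three passes over S (RC5_T > c here). */
    for (k = 0; k < 3 * RC5_T; k++) {
        A = S[i] = rotl32(S[i] + A + B, 3);
        B = L[j] = rotl32(L[j] + A + B, A + B);
        i = (i + 1) % RC5_T;
        j = (j + 1) % c;
    }
}

/* Encrypt one 64-bit block held as two 32-bit words. */
void rc5_encrypt(const uint32_t S[RC5_T], const uint32_t pt[2], uint32_t ct[2])
{
    uint32_t A = pt[0] + S[0], B = pt[1] + S[1];
    for (int i = 1; i <= RC5_ROUNDS; i++) {
        A = rotl32(A ^ B, B) + S[2 * i];
        B = rotl32(B ^ A, A) + S[2 * i + 1];
    }
    ct[0] = A;
    ct[1] = B;
}

/* Inverse of rc5_encrypt. */
void rc5_decrypt(const uint32_t S[RC5_T], const uint32_t ct[2], uint32_t pt[2])
{
    uint32_t A = ct[0], B = ct[1];
    for (int i = RC5_ROUNDS; i >= 1; i--) {
        B = rotr32(B - S[2 * i + 1], A) ^ A;
        A = rotr32(A - S[2 * i], B) ^ B;
    }
    pt[0] = A - S[0];
    pt[1] = B - S[1];
}
```

A key-search client runs the setup and one encryption per candidate key, so the speed comes down to how well those rotate/add chains are interleaved and kept in registers, which is exactly what the hand-written cores are tuned for.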

Jim C. Nasby | 2 Mar 2006 01:39

Re: Runtime optimization??

BTW, should you wish to prove Decio wrong or try your hand at this, the
source code for the clients is available for download. :)
-- 
Jim C. Nasby, Database Architect            decibel@...
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

Elektron | 2 Mar 2006 15:58

Re: Runtime optimization??


On 2006-03-01, at 23:20:29, Fugu wrote:

> Why aren't the cruncher cores compiled using FFTW-like optimizations?
> Assembly code may be slightly faster, but it needs to be rewritten for
> every platform...

Because there's someone willing to rewrite it for every platform. A  
lot of work is spent in squeezing out an extra 6%.

I don't even understand the AltiVec core.

- Me

Benjamin Kaufman | 3 Mar 2006 02:55

Re: Runtime optimization??

Just 1% in the scope of distributed.net speed is huge.

1% more on a few thousand computers is a lot.
6% more on a few thousand computers is years shaved off the project.

Benjamin 'thumper^' Kaufman
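
For a sense of scale, the arithmetic behind "years shaved off" can be sketched in a few lines of C. This assumes a fixed amount of remaining work and constant participation; the function names and the 100-year figure in the note below are illustrative only, not project estimates.

```c
#include <assert.h>

/* Fraction of the remaining run time saved when throughput rises by
 * `gain` (0.06 means 6% more keys per second). The work is fixed, so
 * the time needed scales as 1 / (1 + gain). */
double fraction_of_time_saved(double gain)
{
    assert(gain > -1.0);   /* a gain of -100% would mean zero throughput */
    return 1.0 - 1.0 / (1.0 + gain);
}

/* Years saved out of a hypothetical estimate of the years remaining. */
double years_saved(double years_remaining, double gain)
{
    return years_remaining * fraction_of_time_saved(gain);
}
```

With these assumptions, `years_saved(100.0, 0.06)` comes to a little under 5.7 years: a 6% faster core trims roughly 5.7% off whatever time is left, not a full 6%.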


bert | 3 Mar 2006 03:13

Re: Runtime optimization??


I wish I were a coder, but if someone is actually working right now to
make the current cores 6% faster than they are now, maybe we could
eventually break the 1% done barrier in my lifetime, or cut the
time to finish to 10 years or less :P

Bert O'Dell

Benjamin Kaufman wrote:
> Just 1% in the scope of distributed.net speed is huge.
>
> 1% more on a few thousand computers is a lot.
> 6% more on a few thousand computers is years shaved off the project.

Thorsten Wolf | 3 Mar 2006 08:19

Re: Runtime optimization??

Actually, I don't think that a 6% improvement would really cut a big piece
out of the whole thing.

I remember the transition from Pentium III to Pentium 4 based systems, where
one instruction (was it the rotate?) was changed from hardware to emulation
within the CPU, which slowed dnetc down by roughly 25% on the P4 at the same
clock speed (taking 1.4 GHz as an example). Check the RC5 calculator here
(http://www.distributed.net/~nerf/rc5calc.html) ... (OK, I know it doesn't
work at the moment, but I'm sure Nerf will fix it).

So with dual core available now and quad core to be introduced by Intel in
2007, I think that a coding improvement of 6% won't do much. Maybe Intel
decides to put that instruction back into the CPU, or extra instructions
come with next-generation CPUs... dnetc speed can vastly improve over
today's performance. Which is great if you ask me, but that's just my
2 cents.

Cheers,

Thorsten
t_wolf <at> d.net

-- 
Contact me on ICQ: 7656468


Re: Runtime optimization??


On Mar 3, 2006, at 4:19 AM, Thorsten Wolf wrote:

> So with dual core available now and quad core to be introduced by Intel in
> 2007, I think that a coding improvement of 6% won't do much. Maybe Intel
> decides to put that instruction back into the CPU, or extra instructions
> come with next-generation CPUs... dnetc speed can vastly improve over
> today's performance. Which is great if you ask me, but that's just my
> 2 cents.

The latest P4 cores, codenamed `Prescott', already have barrel  
shifters and hence execute rotates in 1 cycle. However other  
instructions were slowed down and the gain wasn't as huge as it could  
have been. There is, though, a clear difference between the newest P4s
and the previous models.

But this isn't very relevant to the future performance of Intel  
processors, since Intel is dropping the Pentium 4 microarchitecture  
in favor of a Pentium M derivative.

Also, dual or quad core processors will be considered minor  
improvements once the Playstation 3 hits the scene.

Décio

Earl Stenlund | 17 Mar 2006 14:30

Help with justification of running Dnet at work

I'm the network admin at a small community college. I have had permission
from the higher-ups to run D.net for 6+ years now, but we have one
instructor who is raising hell about it. The administration wants me to put
a package together so they can reevaluate it. We are currently running just
the OGR project. I push the client to all computers on the network from a
server. No files are stored on the local machines; everything comes from the
server. Below is an email response I received from the instructor after I
replied to his question about what Dnet.exe is.

	I only know what my students and I experience and that is a delay in
	opening applications such as browser windows, powerpoint
	presentations, etc.  I read all the mumbo jumbo on the web site and I
	have actually had to power cycle a machine to get it back under my
	control.  I'm not buying the science contribution either.  Why would
	you want to support something that is not really under your control?
	We need all of the resources we can get and should not be squandering
	them on a research project that even the federal government would not
	support.  I don't need to see your benchmarks since they are done
	under a laboratory environment and do not reflect what will happen
	under actual situations.

	The statement below from the web site is unbelievable...do they
	really think we are that stupid....the only reason I go to look at
	the statistics is because I cannot launch a powerpoint presentation
	or a browser, etc!  It is NOT a question of a task running in the
	background; it is a question of STARTING a new task while dnet.exe
	clings to the processor.

	'While you're watching cpu time statistics, you aren't doing
	anything else that might be consuming cpu time, and consequently you
	see the client using 100%. If however you were to have another task
	running in the background that needed/was using cpu time, you'd see
	the client use that much less.'

Can anyone provide me with some help? Again, I am putting together info on
the project and how it runs. What would be great is an explanation of the
source code showing that it only runs when idle. I have benchmarks run on my
laptop, but as you can see, the instructor doesn't believe them.

Also, does anyone know which other schools are running the project? A list
would be great. I know of DePaul, Polytechnic of Namibia, University of
Toronto, University of Illinois, and Education Network of Ontario.

Thanks,
Earl Stenlund
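
On the request above for something showing how a client can confine itself to idle time: the usual mechanism is to ask the operating system for the lowest scheduling priority, so that any other runnable task is picked first. The sketch below is a generic POSIX illustration in C, not an excerpt from the dnetc source, and the helper names `lower_to_idle_priority` and `current_nice` are hypothetical.

```c
#include <errno.h>
#include <sys/resource.h>

/* Drop the calling process to the lowest POSIX nice level (19).
 * Returns 0 on success, -1 on failure (errno is set by setpriority). */
int lower_to_idle_priority(void)
{
    return setpriority(PRIO_PROCESS, 0, 19);
}

/* Read back the nice value of the calling process.
 * getpriority can legitimately return -1, so errno is used to tell an
 * error apart; -100 is this sketch's own "failed" marker. */
int current_nice(void)
{
    int value;

    errno = 0;
    value = getpriority(PRIO_PROCESS, 0);
    if (value == -1 && errno != 0)
        return -100;
    return value;
}
```

On Windows the comparable call is `SetPriorityClass(GetCurrentProcess(), IDLE_PRIORITY_CLASS)`. Whether and how dnet.exe does this is something to confirm in the client source and its configuration. A lowest-priority task also still occupies memory and cache, so the effect on application start-up is best measured on the lab machines themselves rather than inferred from the priority setting.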

Ray Booysen | 17 Mar 2006 14:47

Re: Help with justification of running Dnet at work

Although d.net states that it only uses idle time, I have noticed extra
latency with tasks on machines where dnetc is running, especially
under high load such as image or video editing. This was never an
isolated event, and there will always be overhead from the machine having
to switch processes and handle the extra interrupts that dnetc
will generate.

Regards
Ray

-- 
Ray Booysen
rj_booysen@...
