Re: Why Git is so fast
James Pickens <jepicken <at> gmail.com>
2009-05-01 01:25:21 GMT
On Thu, Apr 30, 2009, Steven Noonan <steven <at> uplinklabs.net> wrote:
> A bit off topic, but the results are rather interesting to me, and I
> think I see a weakness in how GCC is doing this on Intel. Someone
> please correct me if I'm wrong, but the PowerPC code seems much better
> because it can yield very high instruction-level parallelism. It does
> 5 loads and then 5 stores, using 4 registers for temporary storage and
> 2 registers for pointers.
>
> I realize the Intel x86 architecture is quite constrained in that it
> has so few general purpose registers, but there has to be better code
> than what GCC emitted above. It seems like the processor would stall
> because of the quantity of sequential inter-dependent instructions
> that can't be done in parallel (mov to memory that depends on a mov to
> eax, etc).
There aren't any unnecessary dependencies. Take this sequence:
1: movl (%edx), %eax
2: movl %eax, (%ecx)
3: movl 4(%edx), %eax
4: movl %eax, 4(%ecx)
There are two unavoidable dependencies: #2 depends on #1, and #4
depends on #3. #3 does not truly depend on #2, even though they both
use %eax, because #3 only writes %eax and never reads it. So whatever
was in %eax before #3 is irrelevant. The processor knows this and will
use register renaming to give each write of %eax its own physical
register, so it can execute #1 and #3 in parallel, and #2 and #4 in
parallel.
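To make that concrete, here is a rough Python sketch of the renaming idea
applied to the four instructions above. It is only a toy model, not how any
particular CPU works: the names PROGRAM, rename and schedule are made up for
illustration, every instruction is assumed to take one cycle with unlimited
execution units, and memory aliasing between the loads and stores is ignored
entirely.

```python
from itertools import count

# Toy model only. Each entry: (text, register written or None, registers read).
# Memory operands are ignored; %edx and %ecx are never written here, so they
# never create a dependency.
PROGRAM = [
    ("movl (%edx), %eax",  "eax", ["edx"]),
    ("movl %eax, (%ecx)",  None,  ["eax", "ecx"]),
    ("movl 4(%edx), %eax", "eax", ["edx"]),
    ("movl %eax, 4(%ecx)", None,  ["eax", "ecx"]),
]

def rename(program):
    """For each instruction, return the set of earlier instruction indices
    (0-based) whose result it actually reads."""
    fresh = count()
    mapping = {}    # architectural register -> current physical register
    producer = {}   # physical register -> index of the instruction that wrote it
    deps = []
    for i, (_, dest, srcs) in enumerate(program):
        # A read depends on whoever produced the current mapping, if anyone.
        deps.append({producer[mapping[s]] for s in srcs if s in mapping})
        if dest is not None:
            # Every write gets a brand-new physical register, so reusing
            # %eax does not tie this instruction to earlier users of %eax.
            phys = f"p{next(fresh)}"
            mapping[dest] = phys
            producer[phys] = i
    return deps

def schedule(deps):
    """Earliest cycle each instruction could issue, assuming one-cycle
    latency and unlimited execution units (a simplifying assumption)."""
    cycles = []
    for d in deps:
        cycles.append(1 + max((cycles[j] for j in d), default=0))
    return cycles

if __name__ == "__main__":
    deps = rename(PROGRAM)
    for (text, _, _), d, c in zip(PROGRAM, deps, schedule(deps)):
        needs = ", ".join(f"#{j + 1}" for j in sorted(d)) or "nothing"
        print(f"cycle {c}: {text:22s} depends on {needs}")
```

In this model the only edges left are #1 -> #2 and #3 -> #4, so #1 and #3
land in the same cycle and #2 and #4 in the next one.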
James