1 Jun 2006 01:25
Re: [patch] Adding optimized kernel copying support - Part III
Bruce Evans <bde <at> zeta.org.au>
2006-05-31 23:25:17 GMT
On Wed, 31 May 2006, Attilio Rao wrote:
> 2006/5/31, Suleiman Souhlal <ssouhlal <at> freebsd.org>:
>> Nice work. Any chance you could also port it to amd64?
>
> Not in the near future, I think. :P
It is not useful for amd64. An amd64 has enough instruction bandwidth
to saturate the L1 cache using 64-bit accesses although not using
32-bit accesses. An amd64 has 64-bit integer registers which can be
accessed without the huge setup overheads and code complications for
MMX/XMM registers. It already uses 64-bit registers or 64-bit movs
for copying and zeroing of course. Perhaps it should use prefetches
and nontemporal writes more than it already does, but these don't
require using SSE2 instructions like nontemporal writes do for 32-bit
CPUs.
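As a rough userland illustration of the point above (not the kernel's actual copy routines, and not code from the patch under discussion), the sketch below copies a buffer one 64-bit word at a time through ordinary integer registers. The second function does the same with nontemporal stores via the `_mm_stream_si64` intrinsic, which takes a plain integer operand, so no MMX/XMM register state needs saving. The function names and the `HAVE_NT_STORE` macro are made up for this example, and an optimizing compiler may well rewrite the plain loop.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: nontemporal integer stores are only used on x86-64 builds. */
#if defined(__x86_64__) && defined(__SSE2__)
#include <emmintrin.h>	/* _mm_stream_si64(), _mm_sfence() */
#define HAVE_NT_STORE 1
#else
#define HAVE_NT_STORE 0
#endif

/* Plain copy: one 64-bit load and one 64-bit store per word. */
static void
copy_words64(long long *dst, const long long *src, size_t nwords)
{
	assert(dst != NULL && src != NULL);
	for (size_t i = 0; i < nwords; i++)
		dst[i] = src[i];
}

/*
 * Same copy, but the stores bypass the cache where the intrinsic is
 * available. Falls back to the plain loop elsewhere.
 */
static void
copy_words64_nt(long long *dst, const long long *src, size_t nwords)
{
	assert(dst != NULL && src != NULL);
#if HAVE_NT_STORE
	for (size_t i = 0; i < nwords; i++)
		_mm_stream_si64(&dst[i], src[i]);
	/* Order the nontemporal stores before any later stores. */
	_mm_sfence();
#else
	copy_words64(dst, src, nwords);
#endif
}
```

Whether the nontemporal variant is actually faster depends on the buffer size and on whether the destination is read again soon, so it would need benchmarking.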
>> Does that mean it won't work with SMP and PREEMPTION?
>
> Yes it will work (even if I think it needs more testing) but maybe
> would give lower performance on SMP|PREEMPTION due to too much
> traffic on memory/cache. For this I was planning to use non-temporal
> instructions
> (obviously benchmarks would be very much appreciated).
Er, isn't its main point to fix some !SMP assumptions made in the old
copying-through-the-FPU code? (The old code is messy due to its avoidance
of global changes. It wants to preserve the FPU state on the stack, but
this doesn't quite work so it does extra things (still mostly locally)
that only work in the !SMP && (!SMPng even with UP) case. Patching this
.
I realize how hard getting write support for one of those is, for
certain. You'd still have to go through the labor with ZFS though,
unless you are talking about read-only support for it. I don't know
much about licensing stuff...
> NetBSD has a Journalling Google SoC, definitely interesting if they get far.
>
> Pedro.
We did too last year, but it didn't complete. I think Scott Long is
back looking at it again (I've seen some hints of life in the p4 repo).
Eric