1 Jan 2010 01:14
Re: Correct way to make a 16-byte aligned double* for SSE vectorization?
Brian Budge <brian.budge <at> gmail.com>
2010-01-01 00:14:38 GMT
I see, so you want the function to be compiled as though the pointers
are guaranteed to point to 16-byte-aligned addresses. This is an
interesting question. I'll be following this too :)

  Brian

On Thu, Dec 31, 2009 at 12:27 PM, Benjamin Redelings I
<benjamin_redelings <at> ncsu.edu> wrote:
> On 12/31/2009 08:41 AM, Brian Budge wrote:
>>
>> The reason it won't work is that you're saying the pointer itself
>> needs to be 16 (or 8) byte aligned. You need the address that the
>> pointer points to to be aligned.
>>
>> On the stack:
>>
>> __attribute__ ((aligned(16))) real myArray[32];
>>
>> On the heap (*nix):
>> real *myArray;
>> posix_memalign((void**)&myArray, 16, 32 * sizeof(real));
>>
>> or for more portability you could use the SSE intrinsic _mm_malloc.
>>
>> To know why the one version you posted works, we'd need to see the
>> calling code of f. In general, it shouldn't work if malloc or new
>> are used to allocate the memory passed in, but it might be that the
>> memory is allocated on the stack?
>>
>> Brian
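
For anyone following along, here is a minimal sketch of the two halves of the question: getting 16-byte-aligned storage, and telling the compiler that a pointer argument is aligned. It assumes `real` is a typedef for `double`, and that the compiler is Clang or a GCC new enough to provide `__builtin_assume_aligned` (GCC 4.7 or later, so newer than this thread). The helper names (`is_aligned16`, `alloc_aligned16`, `sum_aligned16`, `stack_array_is_aligned`) are made up for illustration and are not from the original posts.

```c
#define _POSIX_C_SOURCE 200112L  /* expose posix_memalign in <stdlib.h> */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef double real;  /* assumption: the thread's `real` is double */

/* Returns nonzero if p is a multiple of 16. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* Heap allocation of n reals on a 16-byte boundary (POSIX only).
 * Returns NULL on failure; release the result with free(). */
static real *alloc_aligned16(size_t n)
{
    void *p = NULL;
    if (posix_memalign(&p, 16, n * sizeof(real)) != 0)
        return NULL;
    return (real *)p;
}

/* Sums n reals. The caller promises that x is 16-byte aligned;
 * __builtin_assume_aligned passes that promise on to the optimizer,
 * which may then choose aligned SSE loads when it vectorizes the loop. */
static real sum_aligned16(const real *x, size_t n)
{
    assert(is_aligned16(x));  /* catch broken promises in debug builds */
    const real *ax = __builtin_assume_aligned(x, 16);
    real s = 0;
    for (size_t i = 0; i < n; i++)
        s += ax[i];
    return s;
}

/* Stack storage: the attribute aligns the array itself,
 * not a pointer variable that happens to refer to it. */
static int stack_array_is_aligned(void)
{
    __attribute__((aligned(16))) real my_array[32];
    my_array[0] = 0;
    return is_aligned16(my_array);
}
```

Calling `sum_aligned16` on a pointer that is not actually 16-byte aligned is undefined behavior once the assert is compiled out, so the promise has to be kept by whoever allocates the memory. Whether the loop really gets vectorized with aligned loads depends on the compiler version and flags, so it is worth checking the generated assembly.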