Re: unaligned accesses in IA64
Håkan Hjort <hakan.hjort <at> gmail.com>
2008-03-01 20:43:44 GMT
2008/2/29 Loren Merritt <lorenm <at> u.washington.edu>:
> On Fri, 29 Feb 2008, Håkan Hjort wrote:
> Tough. Write a compiler that can optimize two consecutive stores as a
> single 64bit store, and I'll stop using workarounds.
Pragmatic solution, just wanted to point out that it might cause more than
At other places in this file you use structure assignment. That generates
a memcpy with some (all?) versions of GCC, but it could potentially be what
is needed to get the compiler to optimize the accesses, i.e. create
a compound type for the MV, a union of uint64_t and int or perhaps just
a struct. However, the changes would touch a lot of code...
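
To make the compound-type idea concrete, here is a rough sketch. The type and function names (mv_t, mv_pair_t, copy_pair) are made up for illustration and are not from x264; it also assumes an MV is two int16_t components, so that two of them fill exactly 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical types for illustration only; not taken from x264. */
typedef struct { int16_t x, y; } mv_t;   /* one motion vector, 4 bytes */

typedef union {
    mv_t     mv[2];   /* two consecutive motion vectors */
    uint64_t raw;     /* same 8 bytes; gives the union uint64_t alignment */
} mv_pair_t;

/* The whole idea depends on the pair being exactly one 64-bit word. */
static_assert(sizeof(mv_pair_t) == 8, "mv_pair_t must be 8 bytes");

/* Plain assignment of the compound type. Whether this becomes a single
 * 64-bit load/store or a memcpy call is up to the compiler. */
static void copy_pair(mv_pair_t *dst, const mv_pair_t *src)
{
    *dst = *src;
}
```

Since the union contains a uint64_t member, the compiler already knows the object is suitably aligned, so no cast is needed at the copy site.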
So what about the DECLARE_ALIGNED() changes?
Attached is another revision where I made use of the cast to uint64_t to
copy MVs in x264_mb_analyse_inter_b16x8 and b8x16 too. Other places
copy between int and int16_t, so they can't be paired directly like this.
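
For reference, the cast-based copy looks roughly like the sketch below. This is not the code from the patch: copy_mv_pair and u64_alias are names invented here, and the may_alias attribute is a GCC/Clang extension that I added so the cast stays within the aliasing rules.

```c
#include <assert.h>
#include <stdint.h>

/* GCC/Clang extension: a 64-bit type that may alias objects of other types. */
typedef uint64_t __attribute__((may_alias)) u64_alias;

/* Hypothetical helper: copy two adjacent MVs (4 x int16_t) with one
 * 64-bit load and one 64-bit store. Both pointers must be 8-byte
 * aligned; the assert documents that requirement. */
static void copy_mv_pair(int16_t *dst, const int16_t *src)
{
    assert(((uintptr_t)dst & 7) == 0);
    assert(((uintptr_t)src & 7) == 0);
    *(u64_alias *)dst = *(const u64_alias *)src;
}

/* Callers have to request the alignment explicitly, for example:
 *   int16_t mv[2][2] __attribute__((aligned(8)));
 */
```

An int16_t array only gets 2-byte alignment by default, which is why the declarations need the explicit alignment once they are accessed this way.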
Looking for other places that have alignment constraints for similar
reasons, I found the following:
x264_mb_load_mv_direct8x8 accesses h->mb.cache.mv and
h->mb.cache.direct_mv as uint64_t
x264_macroblock_cache_ref calls x264_macroblock_cache_rect1
with h->mb.cache.ref which is then accessed as uint32_t
x264_macroblock_cache_skip calls x264_macroblock_cache_rect1
with h->mb.cache.skip which is then accessed as uint32_t
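
To illustrate why the byte-sized caches in the last two items need alignment, here is a simplified sketch of a fill routine that writes four one-byte entries per store. It is only an assumption about the general pattern, not the actual x264_macroblock_cache_rect1; fill_rect4 and u32_alias are invented names, and may_alias is again a GCC/Clang extension.

```c
#include <assert.h>
#include <stdint.h>

/* GCC/Clang extension: a 32-bit type that may alias objects of other types. */
typedef uint32_t __attribute__((may_alias)) u32_alias;

/* Hypothetical stand-in, not the x264 function: fill `height` rows of
 * four one-byte entries each, using one 32-bit store per row. */
static void fill_rect4(uint8_t *dst, int stride, int height, uint8_t val)
{
    uint32_t v4 = val * 0x01010101u;   /* replicate the byte into all 4 lanes */

    /* Every row start must be 4-byte aligned for the 32-bit store. */
    assert(((uintptr_t)dst & 3) == 0);
    assert((stride & 3) == 0);

    for (int y = 0; y < height; y++)
        *(u32_alias *)(dst + y * stride) = v4;
}
```

A uint8_t array has no alignment guarantee beyond one byte, so a cache filled this way would need to be declared with at least 4-byte alignment.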