Re: unaligned accesses in IA64
Håkan Hjort <hakan.hjort <at> gmail.com>
2008-03-01 20:43:44 GMT
2008/2/29 Loren Merritt <lorenm <at> u.washington.edu>:
> On Fri, 29 Feb 2008, Håkan Hjort wrote:
>
> Tough. Write a compiler that can optimize two consecutive stores as a
> single 64bit store, and I'll stop using workarounds.
>
That's a pragmatic solution; I just wanted to point out that it might
cause more than alignment issues.
Elsewhere in this file you use structure assignment. That generates
a memcpy with some (all?) versions of GCC, but it could potentially be
what is needed to get the compiler to optimize the accesses, i.e. create
a compound type for the MV: a union of uint64_t and int[2], or perhaps
just a struct. However, the changes would touch a lot of code...
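
A minimal sketch of the compound-type idea, for illustration only. The
names mv_pair_t, mv_copy_assign and mv_copy_memcpy are hypothetical and
not taken from x264, and int32_t stands in for int on the assumption
that int is 32 bits wide:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical compound MV type (not x264's actual definition). */
typedef union {
    int32_t  v[2];  /* x and y components */
    uint64_t pair;  /* both components viewed as one 64-bit word */
} mv_pair_t;

/* Copy by assignment. The union carries 64-bit alignment, so the
 * compiler is free to emit a single 64-bit load/store pair. */
static void mv_copy_assign(mv_pair_t *dst, const mv_pair_t *src)
{
    *dst = *src;
}

/* Copy plain arrays with memcpy. This is valid for any alignment;
 * whether it becomes one 64-bit move is up to the compiler. */
static void mv_copy_memcpy(int32_t dst[2], const int32_t src[2])
{
    memcpy(dst, src, 2 * sizeof(int32_t));
}
```

The union variant puts the alignment requirement into the type itself,
rather than relying on a cast at each access site.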
So what about the DECLARE_ALIGNED() changes?
Attached is another revision where I made use of the cast to uint64_t to
copy MVs in x264_mb_analyse_inter_b16x8 and b8x16 too. Other places
copy between int and int16_t, so they can't directly be paired like this.
Looking for other places that have alignment constraints for similar
reasons, I found the following:
  x264_mb_load_mv_direct8x8 accesses h->mb.cache.mv and
    h->mb.cache.direct_mv as uint64_t
  x264_macroblock_cache_ref calls x264_macroblock_cache_rect1
    with h->mb.cache.ref which is then accessed as uint32_t
  x264_macroblock_cache_skip calls x264_macroblock_cache_rect1
    with h->mb.cache.skip which is then accessed as uint32_t
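
To illustrate why such arrays carry an alignment constraint, here is a
rough sketch. It assumes a C11 compiler; demo_cache_t, fill4 and
is_aligned are hypothetical names, and the code does not reproduce
x264's actual cache layout or x264_macroblock_cache_rect1:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a cache row of int8_t entries. _Alignas
 * (C11) forces 4-byte alignment, which is the kind of guarantee an
 * alignment macro is meant to provide for wide accesses. */
typedef struct {
    _Alignas(4) int8_t ref[8];
} demo_cache_t;

/* Fill four consecutive int8_t entries with one 32-bit value.
 * memcpy keeps this valid for any destination alignment; compilers
 * commonly lower it to a single store where that is safe. */
static void fill4(int8_t *dst, int8_t val)
{
    uint32_t v = (uint8_t)val * 0x01010101u; /* replicate the byte 4x */
    memcpy(dst, &v, sizeof v);
}

/* Return nonzero if p is a multiple of n bytes. */
static int is_aligned(const void *p, uintptr_t n)
{
    return ((uintptr_t)p % n) == 0;
}
```

A direct *(uint32_t *)dst = v store would instead rely on dst being
4-byte aligned, which is the constraint the list above is tracking.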