Ronald S. Bultje | 1 Sep 2010 01:06
Picon

[PATCH] fix h264_deblock_sse2.asm segfaults on clang/x86-32

Hi,

on fate/clang/freebsd/x86-32 [1], all h264 tests fail with a sigbus.
There's some alignment issues in the sse2 asm. The asm appears to
assume that the caller function guarantees a certain alignment, which
is (IMO) not necessarily true. If the function is called directly from
C, stack can have any alignment, depending on the compiler used, so
the functions shouldn't assume any specific alignment imo.

Attached patch tries to make it work with any alignment. x86-64
h_luma_intra wasn't fixed because it doesn't show any obvious issues
and any attempt of mine to fix it bombed out, probably somewhere in
the macros which do way much. This code is scary witchcraft, almost as
crazy as inline asm. With this patch, most sigbus issues are gone and
"make fate-h264" goes well until about halfway
(h264-conformance-caba2_sony_e), where it crashes somewhere in
decode_cabac_mb_mvd() which I haven't quite traced down yet.

(Related, how do I run make fate-xyz without it quitting as soon as a
single test fails? I'd like output as on fate.ffmpeg.org so I can see
how many failures I fixed with this patch.)

Thanks,
Ronald

[1] http://fate.ffmpeg.org/x86_32-freebsd-clang/20100831203328
Attachment (fix_clang_x86_32_fbsd.patch): application/octet-stream, 6944 bytes
_______________________________________________
(Continue reading)

Ronald S. Bultje | 1 Sep 2010 03:56
Picon

Re: [PATCH] fix h264_deblock_sse2.asm segfaults on clang/x86-32

Hi,

On Tue, Aug 31, 2010 at 7:06 PM, Ronald S. Bultje <rsbultje <at> gmail.com> wrote:
> Attached patch tries to make it work with any alignment. x86-64
> h_luma_intra wasn't fixed because it doesn't show any obvious issues
> and any attempt of mine to fix it bombed out, probably somewhere in
> the macros which do way much. This code is scary witchcraft, almost as
> crazy as inline asm. With this patch, most sigbus issues are gone and
> "make fate-h264" goes well until about halfway

Attached is a new patch with all x86-64 parts removed (requested by
Mans, since they don't crash on fate), so this only touches x86-32
specific code. Effect on runtime should be minimal, but b/c of the
stack saving this will be 1-2 instructions more for each function.

> (h264-conformance-caba2_sony_e), where it crashes somewhere in
> decode_cabac_mb_mvd() which I haven't quite traced down yet.

Turns out to be a compiler bug, see http://llvm.org/bugs/show_bug.cgi?id=8044

Ronald
Attachment (fix-clang-h264-alignment.patch): application/octet-stream, 5285 bytes
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel <at> mplayerhq.hu
https://lists.mplayerhq.hu/mailman/listinfo/ffmpeg-devel
Reimar Döffinger | 1 Sep 2010 07:53
Picon
Picon

Re: [PATCH] fix h264_deblock_sse2.asm segfaults on clang/x86-32

On Tue, Aug 31, 2010 at 07:06:09PM -0400, Ronald S. Bultje wrote:
> on fate/clang/freebsd/x86-32 [1], all h264 tests fail with a sigbus.
> There's some alignment issues in the sse2 asm. The asm appears to
> assume that the caller function guarantees a certain alignment, which
> is (IMO) not necessarily true. If the function is called directly from
> C, stack can have any alignment, depending on the compiler used, so
> the functions shouldn't assume any specific alignment imo.

We already use
attribute_align_arg
on the functions where it should be necessary since they are called
from outside code and use optimizations and the compiler
is supposed to keep sufficient alignment for all functions below.
If that's not the case we might at least have performance issues
with several other functions using aligned stack arguments,
so it depends on that if there's any point in trying to fix that.
Ronald S. Bultje | 1 Sep 2010 08:01
Picon

Re: [PATCH] fix h264_deblock_sse2.asm segfaults on clang/x86-32

Hi,

On Wed, Sep 1, 2010 at 1:53 AM, Reimar Döffinger
<Reimar.Doeffinger <at> gmx.de> wrote:
> On Tue, Aug 31, 2010 at 07:06:09PM -0400, Ronald S. Bultje wrote:
>> on fate/clang/freebsd/x86-32 [1], all h264 tests fail with a sigbus.
>> There's some alignment issues in the sse2 asm. The asm appears to
>> assume that the caller function guarantees a certain alignment, which
>> is (IMO) not necessarily true. If the function is called directly from
>> C, stack can have any alignment, depending on the compiler used, so
>> the functions shouldn't assume any specific alignment imo.
>
> We already use
> attribute_align_arg

Alex brought that up also. That sets alignment of stack in functions,
not functions called by functions (afaics). Since these are compiled
by yasm, they're unaffected by that.

Or clang is buggy, who knows... :-).

Ronald
Alexander Strange | 1 Sep 2010 08:08
Gravatar

Re: [PATCH] fix h264_deblock_sse2.asm segfaults on clang/x86-32


On Sep 1, 2010, at 1:53 AM, Reimar Döffinger wrote:

> On Tue, Aug 31, 2010 at 07:06:09PM -0400, Ronald S. Bultje wrote:
>> on fate/clang/freebsd/x86-32 [1], all h264 tests fail with a sigbus.
>> There's some alignment issues in the sse2 asm. The asm appears to
>> assume that the caller function guarantees a certain alignment, which
>> is (IMO) not necessarily true. If the function is called directly from
>> C, stack can have any alignment, depending on the compiler used, so
>> the functions shouldn't assume any specific alignment imo.
> 
> We already use
> attribute_align_arg
> on the functions where it should be necessary since they are called
> from outside code and use optimizations and the compiler
> is supposed to keep sufficient alignment for all functions below.
> If that's not the case we might at least have performance issues
> with several other functions using aligned stack arguments,
> so it depends on that if there's any point in trying to fix that.

There are two issues; the assumed alignment on entry to a function and the alignment it retains when calling
something else.

gcc on linux assumes 16-byte alignment for both, and realigns on entry with attribute_align_arg.

clang on linux/freebsd assumes 4-byte alignment, realigns to 16-byte on entry with
attribute_align_arg, but only attempts to retain 4-byte alignment. So any callee code needing an
aligned stack and using clang on freebsd/linux needs to realign it themselves.

Although they will probably accept gcc compatibility as reason to retain 16-byte stack alignment, the
(Continue reading)

Reimar Döffinger | 1 Sep 2010 08:21
Picon
Picon

Re: [PATCH] fix h264_deblock_sse2.asm segfaults on clang/x86-32

On Wed, Sep 01, 2010 at 02:08:35AM -0400, Alexander Strange wrote:
> clang on linux/freebsd assumes 4-byte alignment, realigns to 16-byte on entry with
attribute_align_arg, but only attempts to retain 4-byte alignment. So any callee code needing an
aligned stack and using clang on freebsd/linux needs to realign it themselves.

My point is about 16-byte aligned stack variables.
We use those, and with that approach clang can only work either not at all or
inefficiently with those which is why I am raising the question whether we should
add possibly slower and more complex code when we maybe won't be able to make all of
FFmpeg work anyway with that compiler.
Reimar Döffinger | 1 Sep 2010 08:23
Picon
Picon

Re: [PATCH] fix h264_deblock_sse2.asm segfaults on clang/x86-32

On Wed, Sep 01, 2010 at 02:01:36AM -0400, Ronald S. Bultje wrote:
> Hi,
> 
> On Wed, Sep 1, 2010 at 1:53 AM, Reimar Döffinger
> <Reimar.Doeffinger <at> gmx.de> wrote:
> > On Tue, Aug 31, 2010 at 07:06:09PM -0400, Ronald S. Bultje wrote:
> >> on fate/clang/freebsd/x86-32 [1], all h264 tests fail with a sigbus.
> >> There's some alignment issues in the sse2 asm. The asm appears to
> >> assume that the caller function guarantees a certain alignment, which
> >> is (IMO) not necessarily true. If the function is called directly from
> >> C, stack can have any alignment, depending on the compiler used, so
> >> the functions shouldn't assume any specific alignment imo.
> >
> > We already use
> > attribute_align_arg
> 
> Alex brought that up also. That sets alignment of stack in functions,
> not functions called by functions (afaics). Since these are compiled
> by yasm, they're unaffected by that.
> 
> Or clang is buggy, who knows... :-).

No it's not, I just considered it reasonable to assume that any
non-ancient compiler wouldn't needlessly clobber stack alignment.
Eli Friedman | 1 Sep 2010 08:28
Picon

Re: [PATCH] fix h264_deblock_sse2.asm segfaults on clang/x86-32

On Tue, Aug 31, 2010 at 11:21 PM, Reimar Döffinger
<Reimar.Doeffinger <at> gmx.de> wrote:
> On Wed, Sep 01, 2010 at 02:08:35AM -0400, Alexander Strange wrote:
>> clang on linux/freebsd assumes 4-byte alignment, realigns to 16-byte on entry with
attribute_align_arg, but only attempts to retain 4-byte alignment. So any callee code needing an
aligned stack and using clang on freebsd/linux needs to realign it themselves.
>
> My point is about 16-byte aligned stack variables.
> We use those, and with that approach clang can only work either not at all or
> inefficiently with those which is why I am raising the question whether we should
> add possibly slower and more complex code when we maybe won't be able to make all of
> FFmpeg work anyway with that compiler.

By default, clang will automatically realign the stack for functions
which use variables requiring 16 or more bytes of alignment.

-Eli
Alexander Strange | 1 Sep 2010 08:35
Gravatar

Re: [PATCH] fix h264_deblock_sse2.asm segfaults on clang/x86-32


On Sep 1, 2010, at 2:23 AM, Reimar Döffinger wrote:

> On Wed, Sep 01, 2010 at 02:01:36AM -0400, Ronald S. Bultje wrote:
>> Hi,
>> 
>> On Wed, Sep 1, 2010 at 1:53 AM, Reimar Döffinger
>> <Reimar.Doeffinger <at> gmx.de> wrote:
>>> On Tue, Aug 31, 2010 at 07:06:09PM -0400, Ronald S. Bultje wrote:
>>>> on fate/clang/freebsd/x86-32 [1], all h264 tests fail with a sigbus.
>>>> There's some alignment issues in the sse2 asm. The asm appears to
>>>> assume that the caller function guarantees a certain alignment, which
>>>> is (IMO) not necessarily true. If the function is called directly from
>>>> C, stack can have any alignment, depending on the compiler used, so
>>>> the functions shouldn't assume any specific alignment imo.
>>> 
>>> We already use
>>> attribute_align_arg
>> 
>> Alex brought that up also. That sets alignment of stack in functions,
>> not functions called by functions (afaics). Since these are compiled
>> by yasm, they're unaffected by that.
>> 
>> Or clang is buggy, who knows... :-).
> 
> No it's not, I just considered it reasonable to assume that any
> non-ancient compiler wouldn't needlessly clobber stack alignment.

Adding "-mllvm -stack-alignment=16" to cflags might fix it.
(Continue reading)

Reimar Döffinger | 1 Sep 2010 08:41
Picon
Picon

Re: [PATCH] fix h264_deblock_sse2.asm segfaults on clang/x86-32

On Tue, Aug 31, 2010 at 11:28:07PM -0700, Eli Friedman wrote:
> On Tue, Aug 31, 2010 at 11:21 PM, Reimar Döffinger
> <Reimar.Doeffinger <at> gmx.de> wrote:
> > On Wed, Sep 01, 2010 at 02:08:35AM -0400, Alexander Strange wrote:
> >> clang on linux/freebsd assumes 4-byte alignment, realigns to 16-byte on entry with
attribute_align_arg, but only attempts to retain 4-byte alignment. So any callee code needing an
aligned stack and using clang on freebsd/linux needs to realign it themselves.
> >
> > My point is about 16-byte aligned stack variables.
> > We use those, and with that approach clang can only work either not at all or
> > inefficiently with those which is why I am raising the question whether we should
> > add possibly slower and more complex code when we maybe won't be able to make all of
> > FFmpeg work anyway with that compiler.
> 
> By default, clang will automatically realign the stack for functions
> which use variables requiring 16 or more bytes of alignment.

Will it do it at least a bit more sanely if calling lots of static functions
with aligned variables for a single function?
Or will it realign the stack over and over? In that case I am still
unsure if we shouldn't wait until they decide on a reasonable way
of handling it.
(though unless compiling with -Os I really don't think it is sane
to break stack alignment anyway)

Gmane