Behaviour of Annex B encoding: bug or not?
Philip Spencer <pspencer <at> fields.utoronto.ca>
2010-01-05 18:41:51 GMT
First a disclaimer: I know very little about the H.264 spec, and am just
trying to resolve an interoperability issue we are having, so please bear
with me if I misstate or misunderstand something.
In the routine x264_nal_encode (in common/common.c), the flag b_annexb
controls whether or not to add a NAL start code (00 00 00 01) to the
beginning of the packet, but does not control whether or not to do the
escaping of sequences of the form 00 00 0x (x = 0, 1, 2 or 3) by inserting
a 03 byte in front of the third byte -- that escaping is ALWAYS done, even
if b_annexb is not set.
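To make the behaviour concrete, here is a minimal sketch in C of the logic as described above. It is not the x264 source: the function name nal_escape_sketch and its signature are made up for illustration, and only the b_annexb flag name comes from the real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the behaviour described in this message.
 * dst must have room for 4 + src_len + src_len / 2 bytes.
 * Returns the number of bytes written to dst. */
size_t nal_escape_sketch(uint8_t *dst, const uint8_t *src,
                         size_t src_len, int b_annexb)
{
    static const uint8_t start_code[4] = { 0x00, 0x00, 0x00, 0x01 };
    uint8_t *out = dst;
    int zero_run = 0;

    assert(dst != NULL && (src != NULL || src_len == 0));

    /* The start code is the only part that depends on b_annexb. */
    if (b_annexb) {
        memcpy(out, start_code, sizeof start_code);
        out += sizeof start_code;
    }

    for (size_t i = 0; i < src_len; i++) {
        /* After two zero bytes, a byte in 00..03 gets a 03 put in front
         * of it, whether or not b_annexb is set. */
        if (zero_run == 2 && src[i] <= 0x03) {
            *out++ = 0x03;
            zero_run = 0;
        }
        zero_run = (src[i] == 0x00) ? zero_run + 1 : 0;
        *out++ = src[i];
    }
    return (size_t)(out - dst);
}
```

With this sketch, the payload 67 00 00 01 42 comes out as 67 00 00 03 01 42 when b_annexb is 0, and as 00 00 00 01 67 00 00 03 01 42 when it is 1.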
Is this a bug, or is it meant to be that way? (I don't have access to the
text of Annex B).
It certainly breaks interoperability with several devices. In particular,
x264 cannot be used with the Ekiga softphone application to communicate
with the "LifeSize Room" brand of videoconferencing equipment: that device
expects non-Annex-B packets over RTP, and cannot handle the extra 03 byte
that is inserted. The result is that the session parameters (like stream
resolution and other such settings, which often contain multiple zero
bytes in a row) get completely garbled because of the Annex-B bytestream
encoding.
Also, if one attempts to sniff the network traffic with software such as
WireShark, it too chokes on the unexpected 03 bytes in RTP packets.
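For reference, handling the extra byte on the receiving side amounts to dropping any 03 that follows two zero bytes. The following is a minimal sketch of that step, with a hypothetical function name; it is not taken from any particular decoder or from the devices mentioned here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies src to dst, dropping every 03 byte that
 * directly follows two or more zero bytes. dst needs room for src_len
 * bytes. Returns the number of bytes written. */
size_t nal_unescape_sketch(uint8_t *dst, const uint8_t *src, size_t src_len)
{
    size_t n = 0;
    int zero_run = 0;

    assert(dst != NULL && (src != NULL || src_len == 0));

    for (size_t i = 0; i < src_len; i++) {
        if (zero_run >= 2 && src[i] == 0x03) {
            /* This is an inserted escape byte: skip it. */
            zero_run = 0;
            continue;
        }
        zero_run = (src[i] == 0x00) ? zero_run + 1 : 0;
        dst[n++] = src[i];
    }
    return n;
}
```

A receiver that skips this step sees 67 00 00 03 01 42 where the sender meant 67 00 00 01 42, which is the kind of garbling described above.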
It would seem to me that this is a bug: if Annex B bytestream encoding is
not desired, such as for an RTP packet, then the extra escape bytes should
not be inserted.
On our system, I have applied the patch below to common.c, and with it
H.264 connectivity to the LifeSize Room videoconferencing equipment works.
On the other hand, from a quick glance at the source code of the reference
encoder/decoder, it seems that it behaves the same way as x264: always
inserts the escape byte. Is this a bug in the reference encoder/decoder
too, or does the text of Annex B specify that ALL H.264 streams should
have the extra bytes inserted, even when bytestream encoding is not being
used?
In the latter case, then obviously the LifeSize brand videoconference
units are buggy, but since I know they interoperate well over H.264 with a
wide range of units from other manufacturers, there must be a lot of buggy
devices out there -- would it be worth adding an extra flag to x264 that
says "do bytestream encoding only in Annex B mode, for compatibility with
devices that cannot handle it in RTP mode"?
Here is the patch we are using to make x264 work with our LifeSize Room
equipment:
--- common/common.c.orig 2009-10-20 19:06:53.000000000 -0400
+++ common/common.c 2009-10-20 19:06:44.000000000 -0400
@@ -730,7 +730,7 @@
*dst++ = 0x03;
i_count = 0;
- if( *src == 0 )
+ if( *src == 0 && b_annexb )
i_count = 0;
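The effect of the one-line change can be seen in a standalone sketch. The loop below is reconstructed from the description in this message and the patch lines shown, not copied from x264, and the function name nal_encode_patched_sketch is made up for illustration. Because zero bytes are only counted when b_annexb is set, i_count never reaches 2 in non-Annex-B mode and no 03 byte is inserted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reconstruction of the patched behaviour.
 * dst must have room for 4 + src_len + src_len / 2 bytes.
 * Returns the number of bytes written to dst. */
size_t nal_encode_patched_sketch(uint8_t *dst, const uint8_t *src,
                                 size_t src_len, int b_annexb)
{
    static const uint8_t start_code[4] = { 0x00, 0x00, 0x00, 0x01 };
    uint8_t *out = dst;
    int i_count = 0;

    assert(dst != NULL && (src != NULL || src_len == 0));

    if (b_annexb) {
        memcpy(out, start_code, sizeof start_code);
        out += sizeof start_code;
    }

    for (size_t i = 0; i < src_len; i++) {
        if (i_count == 2 && src[i] <= 0x03) {
            *out++ = 0x03;
            i_count = 0;
        }
        /* Patched test: zeros are counted only in Annex B mode. */
        if (src[i] == 0 && b_annexb)
            i_count++;
        else
            i_count = 0;
        *out++ = src[i];
    }
    return (size_t)(out - dst);
}
```

With b_annexb set to 0, the payload 67 00 00 01 42 passes through unchanged; with b_annexb set to 1, it is still written as 00 00 00 01 67 00 00 03 01 42.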
Philip Spencer pspencer <at> fields.utoronto.ca | Director of Computing Services
Room 336 (416)-348-9710 ext3036 | The Fields Institute for
222 College St, Toronto ON M5T 3J1 Canada | Research in Mathematical Sciences
x264-devel mailing list
x264-devel <at> videolan.org