Victor Stinner | 4 Mar 2007 03:04

Release of hachoir-core 0.8, hachoir-parser 0.9, hachoir-metadata 0.9 and hachoir-urwid 0.8

What's new in hachoir-core 0.8?
===============================

New features:

 * Field value and display attributes are fault tolerant
 * New types:
   * Int24 and UInt24: signed/unsigned 24-bit integer ;
   * Float80: 80-bit floating point number ;
   * TimestampMSDOS32: 32-bit MS-DOS, since 1 January 1980 ;
   * TimestampUnix32: 32-bit UNIX, seconds since 1 January 1970 ;
   * TimestampMac32: 32-bit Mac, seconds since 1 January 1904 ;
   * TimestampWin64: 64-bit Windows, 100-nanosecond intervals since 1 January 1601 ;
 * Function createOrphanField(): allows creating a field at any address
 * String: add "MacRoman" charset, and rename "UTF-16LE" to "UTF-16-LE"
   (and UTF-16BE to UTF-16-BE) for IronPython compatibility
 * Write functions timestampUNIX(), timestampMac32(), timestampWin64(),
   and humanDatetime() for IronPython compatibility. These functions use
   UTC, not the local time zone
 * Add methods getSubIStream() and setSubIStream() to Field class
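
The epoch-based timestamp types listed above reduce to "epoch plus offset" arithmetic in UTC. The following is a minimal sketch using only the Python standard library, not Hachoir's own code; the helper names (from_unix32, from_mac32, from_win64) are hypothetical, and it assumes that the Win64 value follows the Windows FILETIME convention (100-nanosecond ticks since 1 January 1601).

```python
from datetime import datetime, timedelta, timezone

# Epochs, all expressed in UTC.
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
MAC_EPOCH = datetime(1904, 1, 1, tzinfo=timezone.utc)
WIN_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)  # assumption: FILETIME epoch


def from_unix32(seconds):
    """Seconds since 1 January 1970 (UTC)."""
    return UNIX_EPOCH + timedelta(seconds=seconds)


def from_mac32(seconds):
    """Seconds since 1 January 1904 (UTC)."""
    return MAC_EPOCH + timedelta(seconds=seconds)


def from_win64(ticks):
    """100-nanosecond ticks since 1 January 1601 (UTC).

    datetime only has microsecond resolution, so the last digit is dropped.
    """
    return WIN_EPOCH + timedelta(microseconds=ticks // 10)


# The same instant (the Unix epoch) expressed in each convention.
unix_zero = from_unix32(0)
mac_zero = from_mac32(2082844800)
win_zero = from_win64(116444736000000000)
```

Working with timezone-aware UTC values throughout avoids any dependency on the local time zone of the machine running the parser.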

Other changes:

 * Split GenericFieldSet into BasicFieldSet and GenericFieldSet, and create
   the SeekableFieldSet class (not working yet)
 * Remove EncodedField (replaced by SubFile).
 * Move hachoir_core.editor to new subproject hachoir_editor
 * Use ASCII and not ISO-8859-1 charset for raw display
 * Field class inherits from Logger to have info(), warning() and error()
   methods

Christophe Fergeau | 13 Mar 2007 19:55

itunesdb parser playlist support

Hi,

The attached patch extends the current itunesdb parser to make it able
to parse playlist content as well as a few more mhods. 
The itunesdb parser is still not complete, there are some mhods that
aren't parsed, and this patch can't parse podcasts or smart playlists.

I don't know much python and only discovered hachoir a few days ago, so
I may be doing really weird things, don't be surprised :p

I'd really appreciate some hints on what I can do to speed up
parsing/display of mhod 52 (the if self["type"].value == 52 branch in
DataObject::createField). There is a block of N integers at the end
of such blocks (where N == number of songs on the iPod), and using a for
loop to display those ints one by one is really inefficient. Is there
any better way?

Thanks for the good work on hachoir, it's really nice to graphically
browse an itunesdb,

Christophe

Index: hachoir_parser/audio/itunesdb.py
===================================================================
--- hachoir_parser/audio/itunesdb.py	(revision 2144)
+++ hachoir_parser/audio/itunesdb.py	(working copy)
@@ -73,8 +73,17 @@
         50:"Smart Playlist Data",

Victor Stinner | 14 Mar 2007 09:33

Re: itunesdb parser playlist support

Hi,

On Tuesday 13 March 2007 19:55, Christophe Fergeau wrote:
> The attached patch extends the current itunesdb parser to make it able
> to parse playlist content as well as a few more mhods.

Patch applied, thanks. Hachoir is still missing an iTunesDB file for our
test suite. Do you have such a file?

> There is a block of N integer at the end
> of such blocks (where N == number of songs on the ipod), and using a for
> loop to display those ints one by one is really inefficient. Is there
> any better way?

You can use GenericVector for this. It's faster because it computes the
total size of all items up front (e.g. 10 items of 32 bits => 320 bits), so
it doesn't need to create each item.

> Thanks for the good work on hachoir, it's really nice to graphically
> browse an itunesdb,

Do you know that it's possible to edit your file using Hachoir? :-)

Victor
-- 
Victor Stinner
http://hachoir.org/

Christophe Fergeau | 14 Mar 2007 10:25

Re: itunesdb parser playlist support

Hi,

> Patch applied, thanks. Hachoir is still missing a iTunesDB file for our 
> testcase. Do you have such file?

You can find a bunch of such files in ipod-sharp sources, see
http://banshee-project.org/Subprojects/Ipod-sharp and
http://svn.myrealbox.com/viewcvs/trunk/ipod-sharp/tests/
(I'm not sure the parser will run flawlessly though ;)

> 
> You may use GenericVector for such operation. It's faster because it computes 
> size of all items (eg. 10 items of 32 bits => 320 bits) and so it doesn't 
> need to create each item.

Is there any sample code showing how it can be used?

> 
> > Thanks for the good work on hachoir, it's really nice to graphically
> > browse an itunesdb,
> 
> Do you know that it's possible to edit your file using Hachoir? :-)

Yep, I saw that, but I'll probably prefer to use rhythmbox for that ;)
Is there any GUI to edit files using Hachoir?

Victor Stinner | 15 Mar 2007 02:11

Re: itunesdb parser playlist support

On Wednesday 14 March 2007 10:25, Christophe Fergeau wrote:
> > You may use GenericVector for such operation. It's faster because it
> > computes size of all items (eg. 10 items of 32 bits => 320 bits) and so
> > it doesn't need to create each item.
>
> Is there any sample code showing how it can be used?

Search for "Vector" in hachoir_parser/*/*.py.

Examples:

yield GenericVector(self, "defined[]", nb_digest, UInt8, "bool")
yield GenericVector(self, "badpages", count, UInt32, "badpage")
yield GenericVector(self, "pages", self["last_page"].value, Page, "page")
etc.

Constructor:
--------------------------- 8< --------------------------------------------
class GenericVector(FieldSet):
    def __init__(self, parent, name, nb_items, item_class, item_name="item",
                 description=None):
        ...
--------------------------- 8< --------------------------------------------
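
The idea behind the speed-up (the total size is known from the item count and item size, so no item has to be created up front) can be sketched outside Hachoir. This is a toy model built on the standard struct module, not the GenericVector implementation; the LazyVector class and its attribute names are hypothetical.

```python
import struct


class LazyVector:
    """Toy vector of fixed-size items, decoded only when accessed."""

    def __init__(self, data, offset, nb_items, fmt):
        self.data = data
        self.offset = offset
        self.nb_items = nb_items
        self.fmt = fmt
        self.item_size = struct.calcsize(fmt)  # size of one item, in bytes
        self._cache = {}                       # items decoded so far

    @property
    def size_bits(self):
        # Known immediately: no item needs to be decoded to compute it.
        return self.nb_items * self.item_size * 8

    def __len__(self):
        return self.nb_items

    def __getitem__(self, index):
        if not 0 <= index < self.nb_items:
            raise IndexError(index)
        if index not in self._cache:
            position = self.offset + index * self.item_size
            (self._cache[index],) = struct.unpack_from(self.fmt, self.data, position)
        return self._cache[index]


# Ten little-endian 32-bit integers: 10 items of 32 bits => 320 bits.
raw = struct.pack("<10I", *range(100, 110))
vector = LazyVector(raw, 0, 10, "<I")
total_bits = vector.size_bits   # computed without touching any item
fourth = vector[3]              # only this item gets decoded
```

A viewer that only displays the items currently on screen then pays for those items alone, instead of looping over all N entries.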

Victor
-- 
Victor Stinner aka haypo
http://hachoir.org/


Conrad Steenberg | 15 Mar 2007 18:57

New parser question

Hi all

I'm trying to write a parser for the Hessian binary format, and have a
few questions :-)

I followed the instructions in filling out the template for a new parser
and have most of the information needed for handling the format in there
- it's a very simple byte-oriented format.

When starting up hachoir-wx, how do I enable tracebacks of my code? I
get e.g.:
[warn] [<HessianFile>] Error when getting size of 'root': delete it
[warn] [<HessianFile>] generator already executing
[warn] [<HessianFile>] [Autofix] Fix parser error: stop parser, add
padding

Thanks,
Conrad

-- 
Conrad Steenberg <conrad@...> 
California Institute of Technology | http://conradsteenberg.info

Victor Stinner | 15 Mar 2007 19:09

Re: New parser question

On Thursday 15 March 2007 18:57, Conrad Steenberg wrote:
> Hi all
>
> I'm trying to write a parser for the Hessian binary format, and have a
> few questions :-)
>
> I followed the instructions in filling out the template for a new parser
> and have most of the information needed for handling the format in there
> - it's a very simple byte-oriented format.
>
> When starting up hachoir-wx, how do I enable tracebacks of my code? I
> get e.g.:
> [warn] [<HessianFile>] Error when getting size of 'root': delete it
> [warn] [<HessianFile>] generator already executing
> [warn] [<HessianFile>] [Autofix] Fix parser error: stop parser, add
> padding

Please use the latest version of Hachoir (from Subversion). Either check out
trunk directly with Subversion (and then set PYTHONPATH to load the correct
modules), or use the "daily snapshot".

The latest version gives more information about this kind of problem.

Victor
-- 
Victor Stinner aka haypo
http://hachoir.org/

Conrad Steenberg | 15 Mar 2007 21:19

Re: New parser question

Hi Victor

Next question:

In a class derived from a FieldSet I do
   addr = self.absolute_address
   value = stream.readBytes(addr, 1)
   if value=='i':
     yield UInt16(self, "Int", "Unsigned Int")

If I feed the parser a string 'i\x00\x0f' it should read the 'i' and
then yield a UInt16 with value 15. Instead it reads 'i\x00' as the value
of the int. 

How can I advance the address that data is read from by one, as it seems
like readBytes() doesn't do that?

Thanks again :-)

Conrad

On Thu, 2007-03-15 at 19:09 +0100, Victor Stinner wrote:
> On Thursday 15 March 2007 18:57, Conrad Steenberg wrote:
> > Hi all
> >
> > I'm trying to write a parser for the Hessian binary format, and have a
> > few questions :-)
> >
> > I followed the instructions in filling out the template for a new parser
> > and have most of the information needed for handling the format in there

Cyril Zorin | 15 Mar 2007 22:19

Re: New parser question

Hi Conrad,

Instead of using stream.readBytes, just yield a Byte and then check  
its value to yield your UInt16, etc. It's better to yield a Byte in  
this case anyway because it will give the viewer a better description  
of the format.
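
The difference between peeking at the stream and yielding a field can be modelled without Hachoir. In the sketch below, MiniFieldSet is a hypothetical stand-in and not Hachoir's FieldSet; it assumes big-endian data and only illustrates that each yielded field consumes its bytes, so the next field starts right after it.

```python
import struct


class MiniFieldSet:
    """Toy field set: every yielded field is decoded at the current
    offset, and the offset then moves past it."""

    def __init__(self, data):
        self.data = data
        self.fields = {}
        offset = 0
        for name, fmt in self.create_fields():
            (value,) = struct.unpack_from(fmt, data, offset)
            self.fields[name] = value
            offset += struct.calcsize(fmt)  # yielding a field consumes its bytes

    def create_fields(self):
        # Yield the tag byte as a real field instead of peeking at the stream.
        yield "tag", ">c"
        if self.fields["tag"] == b"i":
            # The tag is already consumed, so this starts at the next byte.
            yield "value", ">H"


sample = b"i\x00\x0f"
parsed = MiniFieldSet(sample)

# Peeking only: nothing is consumed, so a 16-bit read at the same
# address takes b"i\x00" as the integer.
peeked = struct.unpack_from(">H", sample, 0)[0]
```

Here parsed.fields holds the tag and the value 15, while peeked is 0x6900, which is the symptom described in the question.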

On 15-Mar-07, at 4:19 PM, Conrad Steenberg wrote:

> Hi Victor
>
> Next question:
>
> In a class derived from a FieldSet I do
>    addr = self.absolute_address
>    value = stream.readBytes(addr, 1)
>    if value=='i':
>      yield UInt16(self, "Int", "Unsigned Int")
>
> If I feed the parser a string 'i\x00\x0f' it should read the 'i' and
> then yield a UInt16 with value 15. Instead it reads 'i\x00' as the  
> value
> of the int.
>
> How can I advance the address that data is read from by one, as it  
> seems
> like readBytes() doesn't do that?
>
> Thanks again :-)
>

Conrad Steenberg | 16 Mar 2007 00:55

Allowing Empty strings

Hi

It seems like the GenericString class doesn't allow zero length strings
even for "fixed" strings, from /hachoir_core/field/string_field.py:113
            if not (1 <= nbytes <= 0xffff):
                raise FieldError("Invalid string size for %s: %s" %...

Since the constructor for the GenericString class is quite extensive,
I'm loath to reimplement it just to change this one test.

Would it be possible to either change the above test to
            if not (0 <= nbytes <= 0xffff):
                raise FieldError("Invalid string size for %s: %s" %...

or to add a keyword to the constructor to allow zero-length fixed
strings?

Btw, this comes from the Hessian format that is perfectly happy with
zero-length strings: "S\0\0" is a Pascal16 string with an 'S' prefix.
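
To make the case concrete, here is a minimal standalone sketch of such a string: an 'S' prefix, a 16-bit big-endian length, then the payload, with a length of zero accepted. It is not Hachoir code; the function name parse_prefixed_string is hypothetical, and it assumes for simplicity that the length counts bytes and the payload is ASCII.

```python
import struct


def parse_prefixed_string(data, offset=0):
    """Parse 'S' + 16-bit big-endian length + payload.

    Returns (text, next_offset). A length of zero is valid and
    yields an empty string.
    """
    if data[offset:offset + 1] != b"S":
        raise ValueError("expected 'S' prefix at offset %d" % offset)
    (nbytes,) = struct.unpack_from(">H", data, offset + 1)
    start = offset + 3            # 1 prefix byte + 2 length bytes
    end = start + nbytes
    if end > len(data):
        raise ValueError("string of %d bytes is truncated" % nbytes)
    return data[start:end].decode("ascii"), end


empty = parse_prefixed_string(b"S\x00\x00")
hello = parse_prefixed_string(b"S\x00\x05hello")
```

The only bound that matters for the payload is the upper one (it must fit in the data); the lower bound of zero needs no special case.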

Thanks,
Conrad

-- 
Conrad Steenberg <conrad@...> 
California Institute of Technology | http://conradsteenberg.info