Martijn Faassen | 2 Jun 18:42 2005

Re: 0.6 on Tiger

Florent Guillaume wrote:
> Just a success report:
[snip]

Thanks for the report!

> Note that libxslt is 1.1.11, not the recommended 1.1.12 in INSTALL.txt. 
> Should I expect problems?

I don't expect so, as the number of XSLT APIs used is still small and I 
don't expect them to change a lot over versions. Still, I can't test 
against older versions myself.
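
For readers who want to see which libxslt their own build uses, here is a minimal sketch. It assumes a current lxml release, which exposes the `LIBXSLT_VERSION` and `LIBXSLT_COMPILED_VERSION` tuples on `lxml.etree`; lxml 0.6 may not have offered them. The helper name `libxslt_versions` is made up for illustration.

```python
def libxslt_versions():
    """Return (runtime, compiled) libxslt version tuples, or None without lxml."""
    try:
        from lxml import etree
    except ImportError:
        return None
    return etree.LIBXSLT_VERSION, etree.LIBXSLT_COMPILED_VERSION


versions = libxslt_versions()
if versions is None:
    print("lxml is not installed")
else:
    runtime, compiled = versions
    print("libxslt at runtime:   ", ".".join(map(str, runtime)))
    print("libxslt at build time:", ".".join(map(str, compiled)))
    # 1.1.12 is the version INSTALL.txt recommends, per the report above.
    if runtime < (1, 1, 12):
        print("older than the recommended 1.1.12")
```

Comparing the two tuples also shows whether the library loaded at run time differs from the one lxml was compiled against.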

Regards,

Martijn

Mikhail Sobolev | 11 Jun 10:04 2005

compilation using gcc 4.0.1

Hi

I'm trying to build lxml 0.6 using gcc 4.0:

  $ gcc --version
  gcc (GCC) 4.0.1 20050604 (prerelease) (Debian 4.0.0-8ubuntu3)
  Copyright (C) 2005 Free Software Foundation, Inc.

The compilation fails with the following result:

  python2.4 setup.py  build_ext -i
  running build_ext
  building 'lxml.etree' extension
  creating build
  creating build/temp.linux-i686-2.4
  creating build/temp.linux-i686-2.4/src
  creating build/temp.linux-i686-2.4/src/lxml
  gcc -pthread -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC
-I/usr/include/libxml2 -I/usr/include/python2.4 -c src/lxml/etree.c -o
build/temp.linux-i686-2.4/src/lxml/etree.o -w
  src/lxml/etree.c: In function '__pyx_f_5etree__elementFactory':
  src/lxml/etree.c:3008: error: invalid lvalue in assignment
  src/lxml/etree.c: In function '__pyx_f_5etree__commentFactory':
  src/lxml/etree.c:3385: error: invalid lvalue in assignment
  src/lxml/etree.c: In function '__pyx_f_5etree__attribFactory':
  src/lxml/etree.c:4228: error: invalid lvalue in assignment
  src/lxml/etree.c: In function '__pyx_f_5etree__attribIteratorFactory':
  src/lxml/etree.c:4365: error: invalid lvalue in assignment
  src/lxml/etree.c: In function '__pyx_f_5etree__elementIteratorFactory':
  src/lxml/etree.c:4501: error: invalid lvalue in assignment

Olivier Grisel | 11 Jun 15:44 2005

cElementTree compatibility issue

Hello,

I wanted to try to replace cElementTree by lxml.etree in the promising 
bazaar-ng SCM tool [1], but it seems to me there is an API compatibility 
issue. I read the compatibility doc [2] so as to replace the import lines in 
the BZR source code, but then when I run the tests, I get::

   AttributeError: 'etree._ElementTree' object has no attribute 'parse'

Here are the steps to reproduce the problem::

   # download BZR:
   % rsync -av --delete bazaar-ng.org::bazaar-ng/bzr/bzr.dev .
   % cd bzr.dev/

   # replace the import lines:
   % perl -p -i -e 's/from cElementTree/from lxml.etree/g' bzrlib/*.py

   # run the tests:
   % python bzr selftest

The original import lines all have the form::

   >>> from cElementTree import Element, ElementTree, SubElement

What am I doing wrong?
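
A common way to try such a swap without editing every import by hand is a guarded import. The sketch below is only an illustration: it falls back to the standard library's `xml.etree.ElementTree` (an assumption, standing in for cElementTree), and the `inventory`/`entry` element names and the `BACKEND` variable are hypothetical. It only exercises the three names imported above and does not address the missing `parse` method.

```python
try:
    # Preferred backend: lxml.etree, when it is installed.
    from lxml.etree import Element, ElementTree, SubElement
    BACKEND = "lxml.etree"
except ImportError:
    # Fallback: the ElementTree implementation in the standard library.
    from xml.etree.ElementTree import Element, ElementTree, SubElement
    BACKEND = "xml.etree.ElementTree"

# Build a tiny tree using only the API shared by both backends.
root = Element("inventory")
entry = SubElement(root, "entry")
entry.set("name", "README")
tree = ElementTree(root)

print(BACKEND, tree.getroot().tag, len(root))
```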

[1] http://bazaar-ng.org
[2] http://codespeak.net/lxml/compatibility.html


Martijn Faassen | 13 Jun 12:47 2005

Re: cElementTree compatibility issue

Olivier Grisel wrote:

> I wanted to try to replace cElementTree by lxml.etree in the promising 
> bazaar-ng SCM tool [1],

Cool! I didn't know it was using ElementTree. Thanks for trying this! 
What motivated you to try this?

> but it seems to me there is an API compatibility issue. 

This is quite possible.

> I read the compatibility doc [2] so as to replace the import lines in 
> the BZR source code, but then when I run the tests, I get::
> 
>   AttributeError: 'etree._ElementTree' object has no attribute 'parse'

Yup, this is a known (by me :) weakness in the lxml implementation -- 
not all of the API of ElementTree is supported (yet). What 'parse' on 
the ElementTree class does is *replace* the contents of an XML document 
with a newly parsed tree. There is in fact some commented-out, completely 
bogus code for it in etree.pyx:

##     def parse(self, source, parser=None):
##         # XXX ignore parser for now
##         cdef xmlDoc* c_doc
##         c_doc = theParser.parseDoc(source)
##         result._c_doc = c_doc

##         return self.getroot()
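
For reference, the behaviour being described can be seen with the standard library's `xml.etree.ElementTree` (used here as a stand-in, since lxml 0.6 lacks the method). This is a minimal sketch; the `old`/`new` documents are invented for illustration.

```python
import io
import xml.etree.ElementTree as ET

old_root = ET.fromstring("<old><child/></old>")
tree = ET.ElementTree(old_root)

# parse() on an existing tree loads a new document into it
# and returns the new root element.
new_root = tree.parse(io.StringIO("<new><item/></new>"))

print(tree.getroot().tag)           # tag of the tree's current root
print(tree.getroot() is new_root)   # the tree now points at the new root
print(old_root.tag, len(old_root))  # the old root still exists on its own
```

The old root stays alive as a detached element for as long as something still references it.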

Philipp von Weitershausen | 13 Jun 14:18 2005

Re: cElementTree compatibility issue

Martijn Faassen wrote:
> I hope I can resolve some of these issues with Fredrik Lundh 
> eventually so we can nail down more firmly what the ElementTree API is.

A PEP defining that API (similar to what PEP333 is to WSGI) would be 
awesome. Just something to think about...

Philipp

Olivier Grisel | 13 Jun 14:20 2005

Re: cElementTree compatibility issue

Martijn Faassen a écrit :

> Cool! I didn't know it was using ElementTree. Thanks for trying this! 
> What motivated you to try this?

I found both bzr and lxml great technologies and I just wanted to play 
with them together. I really like the pythonish API of elementtree and I 
wanted to benchmark lxml.etree by running the test suite of bzr and 
comparing the results with those of cElementTree on some 'real world' 
use cases.

> not all of the API of ElementTree is supported (yet). What 'parse' on 
> the ElementTree class does is *replace* the contents of an XML document with 
> a newly parsed tree. [snip] 
> I recall I didn't finish this implementation as I thought about the 
> scary consequences of doing this. Perhaps I will give it another stab..

Thanks. What are the 'scary consequences' of replacing the document with 
a newly parsed tree?

> Nothing, I'm afraid. I should extend the compatibility text with this 
> information. It's unfortunately very hard to support some ElementTree 
> features on top of libxml2, so switching over the imports will not work 
> for all applications and will likely require some understanding of the 
> code... I hope I can resolve some of these issues with Fredrik Lundh 
> eventually so we can nail down more firmly what the ElementTree API is.

Great. Fredrik Lundh regularly posts on the bzr ML so he might be
aware of the XML needs and architecture of BZR.


dharana | 14 Jun 00:44 2005

Dump output correct, tostring output incorrect?

Hello list. I've found something really weird with lxml, I hope you can help me. 
I want dump() output but I can't get it and tostring returns the full xml 
document, not the part I need.

 >>> import lxml.etree
 >>> mydoc = lxml.etree.fromstring('<?xml version="1.0"?><req><oml><section><list><items><item>quick</item></items></list></section></oml></req>')
 >>> oml = mydoc[0]  # assumed: this step was not shown in the original post
 >>> lxml.etree.tostring(oml)
'<req><oml><section><list><items><item>quick</item></items></list></section></oml></req>'
 >>> lxml.etree.dump(oml)
<oml><section><list><items><item>quick</item></items></list></section></oml>
 >>> oml
<Element oml at -481b6608>

lxml version 0.6

--

-- 
dharana

Martijn Faassen | 14 Jun 11:49 2005

Re: Re: cElementTree compatibility issue

Olivier Grisel wrote:
> Martijn Faassen a écrit :
> 
>> Cool! I didn't know it was using ElementTree. Thanks for trying this! 
>> What motivated you to try this?
> 
> I found both bzr and lxml great technologies and I just wanted to play 
> with them together. I really like the pythonish API of elementtree and I 
> wanted to benchmark lxml.etree by running the test suite of bzr and 
> comparing the results with those of cElementTree on some 'real world' 
> use cases.

Yes, I'm curious to see what the results would be like. I wouldn't be 
surprised if cElementTree was faster -- lxml code *can* be faster but 
typically only when special features of lxml are used, such as XPath. 
Still, I wouldn't expect it to be that much slower either.
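
A comparison along these lines could be timed with a small harness like the one below. It is only a sketch under stated assumptions: the standard library's `xml.etree.ElementTree` stands in for cElementTree, lxml is treated as optional, and the synthetic document from the hypothetical `make_document` helper is no substitute for the bzr test suite. No timings are claimed here.

```python
import io
import timeit
import xml.etree.ElementTree as ET

try:
    from lxml import etree as LET  # optional second backend
except ImportError:
    LET = None


def make_document(n_items=1000):
    """Build a synthetic XML document with n_items child elements."""
    items = "".join("<item id='%d'>value %d</item>" % (i, i) for i in range(n_items))
    return "<root>%s</root>" % items


def time_parse(module, xml_text, repeat=20):
    """Return the total seconds taken to parse xml_text `repeat` times."""
    data = xml_text.encode("utf-8")
    return timeit.timeit(lambda: module.parse(io.BytesIO(data)), number=repeat)


doc = make_document()
print("xml.etree.ElementTree: %.4f s" % time_parse(ET, doc))
if LET is not None:
    print("lxml.etree:            %.4f s" % time_parse(LET, doc))
```

Parsing is only one operation; a fair comparison would also time tree traversal, serialization and XPath-style queries.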

>> not all of the API of ElementTree is supported (yet). What 'parse' on 
>> the ElementTree class does is *replace* the contents of an XML document 
>> with a newly parsed tree. [snip] I recall I didn't finish this 
>> implementation as I thought about the scary consequences of doing 
>> this. Perhaps I will give it another stab..

> Thanks. What are the 'scary consequences' of replacing the document with 
> a newly parsed tree?

I think what I was worried about was stray nodes floating about still 
connected to a tree (which is also a stray). There's probably a way 
around this; we could replace the root element with a new one, 
disconnecting the previous root, and it may just all work (including 

Martijn Faassen | 14 Jun 11:50 2005

Re: compilation using gcc 4.0.1

[I accidentally only sent this to Mikhail directly before; sending to 
the list as well]

Mikhail Sobolev wrote:
> I'm trying to build lxml 0.6 using gcc 4.0:
> 
>   $ gcc --version
>   gcc (GCC) 4.0.1 20050604 (prerelease) (Debian 4.0.0-8ubuntu3)
>   Copyright (C) 2005 Free Software Foundation, Inc.
[snip]
> Any suggestions? :)  It might be a question to pyrex developers though.

Thanks for the report!

I don't dare try to install gcc 4.0 on this box right now, so
debugging this is going to be hard. It may indeed be better to ask pyrex
developers, though it'd be nice to get this down to a small test sample
in that case. I'd be happy to help you figure this out; if you need
information about what a particular piece of lxml code does please ask!

I'll also look at getting a gcc-4.0 safely installed on my system, so I
can do some experimenting myself.

Regards,

Martijn

Martijn Faassen | 14 Jun 11:52 2005

Re: Dump output correct, tostring output incorrect?

dharana wrote:
> Hello list. I've found something really weird with lxml, I hope you can 
> help me. I want dump() output but I can't get it and tostring returns 
> the full xml document, not the part I need.
> 
>  >>> import lxml.etree
>  >>> mydoc = lxml.etree.fromstring('<?xml version="1.0"?><req><oml><section><list><items><item>quick</item></items></list></section></oml></req>') 
> 
>  >>> lxml.etree.tostring(oml)
> '<req><oml><section><list><items><item>quick</item></items></list></section></oml></req>' 
> 
>  >>> lxml.etree.dump(oml)
> <oml><section><list><items><item>quick</item></items></list></section></oml> 
> 
>  >>> oml
> <Element oml at -481b6608>
> 
> lxml version 0.6

There are some issues with 'tostring()' in lxml 0.6. I've solved some of 
them in the svn version that will turn into 0.7, but I still need to 
resolve one more issue. Anyway, I am aware of these issues and will resolve 
them. Thanks for the report and sorry you had to run into this!
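
For comparison, this is the behaviour the reporter expected, sketched with the standard library's `xml.etree.ElementTree` as a stand-in (an assumption; it says nothing about how the lxml fix is implemented). Serializing a sub-element yields only that subtree.

```python
import xml.etree.ElementTree as ET

XML = ('<?xml version="1.0"?><req><oml><section><list><items>'
       '<item>quick</item></items></list></section></oml></req>')

root = ET.fromstring(XML)   # the <req> element
oml = root.find("oml")      # the sub-element of interest

# tostring() on a sub-element serializes just that element and its children.
serialized = ET.tostring(oml, encoding="unicode")
print(serialized)
```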

Regards,

Martijn

