Martijn Faassen | 3 Jun 20:30

segfault when using etree.CustomElementClassLookup

Hi there,

I just ran into a segfault with lxml (2.0.6). The problem is as follows:

from lxml import etree

class Lookup(etree.CustomElementClassLookup):
     def __init__(self):
         pass

     def lookup(self, node_type, document, namespace, name):
         return Foo

class Foo(etree.ElementBase):
     def custom(self):
         return "test"

lookup = Lookup()
parser = etree.XMLParser()
parser.setElementClassLookup(lookup)

root = etree.XML('<foo/>', parser) # crash!

If I leave out the custom __init__ in Lookup, things won't crash.

Regards,

Martijn
Kevin JR | 4 Jun 11:23

svn version failed to compile against hg version of cython

libxslt-1.1.24
libxml2-2.6.32
python-2.5.2
cython-hg(482)

the error message:

$ python setup.py build
Building lxml version 2.1.beta3-55506.
Building with Cython 0.9.6.14.
Using build configuration of libxslt 1.1.24
Building against libxml2/libxslt in the following directory: /usr/lib
running build
running build_py
running build_ext
cythoning src/lxml/lxml.etree.pyx to src/lxml/lxml.etree.c

Error converting Pyrex file to C:
------------------------------------------------------------
...
        c_attr = c_attr.next
    return attributes

cdef object __RE_XML_ENCODING
__RE_XML_ENCODING = re.compile(
    ur'^(\s*<\?\s*xml[^>]+)\s+encoding\s*=\s*"[^"]*"\s*', re.U)
     ^
------------------------------------------------------------

/dev/shm/python-lxml/src/lxml-build/src/lxml/apihelpers.pxi:487:6: Expected ')'



_______________________________________________
lxml-dev mailing list
lxml-dev <at> codespeak.net
http://codespeak.net/mailman/listinfo/lxml-dev
Stefan Behnel | 4 Jun 12:36

Re: svn version failed to compile against hg version of cython

Kevin JR wrote:
> libxslt-1.1.24
> libxml2-2.6.32
> python-2.5.2
> cython-hg(482)
>
> the error message:
>
> $ python setup.py build
> Building lxml version 2.1.beta3-55506.
> Building with Cython 0.9.6.14.
> Using build configuration of libxslt 1.1.24
> Building against libxml2/libxslt in the following directory: /usr/lib
> running build
> running build_py
> running build_ext
> cythoning src/lxml/lxml.etree.pyx to src/lxml/lxml.etree.c
>
> Error converting Pyrex file to C:
> ------------------------------------------------------------
> ...
>         c_attr = c_attr.next
>     return attributes
>
> cdef object __RE_XML_ENCODING
> __RE_XML_ENCODING = re.compile(
>     ur'^(\s*<\?\s*xml[^>]+)\s+encoding\s*=\s*"[^"]*"\s*', re.U)
>      ^
> ------------------------------------------------------------
>
> /dev/shm/python-lxml/src/lxml-build/src/lxml/apihelpers.pxi:487:6: Expected ')'

I guess you are actually using an older Cython version, likely installed
with easy_install. The version number in current hg wasn't increased yet.
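
For readers puzzling over the snippet in that error message: the line the
compiler stops on is an ordinary regular expression written with a ur'...'
(unicode raw) string prefix, which an older compiler may simply fail to parse.
Below is a minimal sketch of what the pattern itself matches, in plain
Python 3 (where the ur prefix no longer exists, so a plain raw string is
used). The helper name strip_encoding_declaration and the substitution are
assumptions made for illustration, not something shown in this thread.

```python
import re

# Same pattern as in the error message, written as a plain raw string.
RE_XML_ENCODING = re.compile(
    r'^(\s*<\?\s*xml[^>]+)\s+encoding\s*=\s*"[^"]*"\s*', re.U)


def strip_encoding_declaration(xml_string):
    # Hypothetical helper: keep group 1 (the start of the XML declaration)
    # and drop the encoding="..." part that follows it.
    return RE_XML_ENCODING.sub(r'\g<1>', xml_string)


print(strip_encoding_declaration(
    '<?xml version="1.0" encoding="UTF-8"?><foo/>'))
# prints: <?xml version="1.0"?><foo/>
```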

Stefan
Kevin JR | 4 Jun 12:44

Re: svn version failed to compile against hg version of cython

On Wed, Jun 4, 2008 at 6:36 PM, Stefan Behnel <stefan_ml <at> behnel.de> wrote:

> Kevin JR wrote:
>> libxslt-1.1.24
>> libxml2-2.6.32
>> python-2.5.2
>> cython-hg(482)
>>
>> the error message:
>>
>> $ python setup.py build
>> Building lxml version 2.1.beta3-55506.
>> Building with Cython 0.9.6.14.
>> Using build configuration of libxslt 1.1.24
>> Building against libxml2/libxslt in the following directory: /usr/lib
>> running build
>> running build_py
>> running build_ext
>> cythoning src/lxml/lxml.etree.pyx to src/lxml/lxml.etree.c
>>
>> Error converting Pyrex file to C:
>> ------------------------------------------------------------
>> ...
>>         c_attr = c_attr.next
>>     return attributes
>>
>> cdef object __RE_XML_ENCODING
>> __RE_XML_ENCODING = re.compile(
>>     ur'^(\s*<\?\s*xml[^>]+)\s+encoding\s*=\s*"[^"]*"\s*', re.U)
>>      ^
>> ------------------------------------------------------------
>>
>> /dev/shm/python-lxml/src/lxml-build/src/lxml/apihelpers.pxi:487:6: Expected ')'
>
> I guess you are actually using an older Cython version, likely installed
> with easy_install. The version number in current hg wasn't increased yet.


no, it's compiled from hg pool.
Stefan Behnel | 4 Jun 14:38

Re: segfault when using etree.CustomElementClassLookup

Hi Martijn,

Martijn Faassen wrote:
> I just ran into a segfault with lxml (2.0.6). The problem is as follows:
> 
> from lxml import etree
> 
> class Lookup(etree.CustomElementClassLookup):
>      def __init__(self):
>          pass

Yep, you didn't call the __init__() method of the super class here, so the
internal lookup function call isn't set up. I replaced that with a __cinit__()
now that always sets it to the default lookup scheme, so that it won't
segfault anymore even if people forget the obvious. ;)

A patch is attached and it's generally easy to work around this by writing
correct code, so there won't be a 2.0.7 right away.
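
The idea behind the patch can be illustrated in plain Python. Cython runs
__cinit__() when the object is allocated, whether or not a subclass calls
__init__(); the nearest pure-Python analogue is __new__(). A rough sketch
follows, with made-up names (BaseLookup, ForgetfulLookup, default_lookup)
that are not part of lxml:

```python
def default_lookup(name):
    # Stand-in for a safe default lookup function.
    return "default:" + name


class BaseLookup(object):
    def __new__(cls, *args, **kwargs):
        # Runs on every instantiation, even if a subclass overrides
        # __init__ and never calls the base class __init__.
        self = super(BaseLookup, cls).__new__(cls)
        self._lookup_function = default_lookup
        return self

    def __init__(self, fallback=None):
        self.fallback = fallback


class ForgetfulLookup(BaseLookup):
    def __init__(self):
        pass  # deliberately does not call BaseLookup.__init__


lookup = ForgetfulLookup()
print(lookup._lookup_function("foo"))  # prints: default:foo
```

The subclass still ends up with a usable lookup function, although the
state that only __init__() sets (here, fallback) is missing.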

BTW, this:

> parser.setElementClassLookup(lookup)

is correctly spelled

> parser.set_element_class_lookup(lookup)

since lxml 2.0, following PEP 8 naming conventions. However, I didn't dare to
remove the original method, since I figured that it would break tons of code
for no major reason. At least the examples should reflect the new name
everywhere now, so maybe I can remove it in lxml 3.0. ;)
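
One common way to keep an old camelCase name alive during such a transition
is a thin wrapper that warns and delegates to the new method. The sketch
below uses a hypothetical Parser class, not lxml's parser, and whether
lxml's own alias emits a warning is not stated in this thread:

```python
import warnings


class Parser(object):
    # Hypothetical stand-in, not lxml's parser class.
    def __init__(self):
        self.lookup = None

    def set_element_class_lookup(self, lookup=None):
        self.lookup = lookup

    def setElementClassLookup(self, lookup=None):
        # Old camelCase name kept as a thin wrapper that warns callers.
        warnings.warn(
            "setElementClassLookup() is deprecated, "
            "use set_element_class_lookup() instead",
            DeprecationWarning, stacklevel=2)
        self.set_element_class_lookup(lookup)
```

Existing callers keep working, and anyone running with warnings enabled
gets pointed at the new name before the old one is removed.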

Stefan
Index: src/lxml/classlookup.pxi
===================================================================
--- src/lxml/classlookup.pxi	(revision 55464)
+++ src/lxml/classlookup.pxi	(working copy)
@@ -86,6 +86,10 @@
     """
     cdef readonly ElementClassLookup fallback
     cdef _element_class_lookup_function _fallback_function
+    def __cinit__(self):
+        # fall back to default lookup
+        self._fallback_function = _lookupDefaultElementClass
+
     def __init__(self, ElementClassLookup fallback=None):
         if fallback is not None:
             self._setFallback(fallback)
@@ -133,8 +137,10 @@
     cdef readonly object comment_class
     cdef readonly object pi_class
     cdef readonly object entity_class
+    def __cinit__(self):
+        self._lookup_function = _lookupDefaultElementClass
+
     def __init__(self, element=None, comment=None, pi=None, entity=None):
-        self._lookup_function = _lookupDefaultElementClass
         if element is None:
             self.element_class = _Element
         elif issubclass(element, ElementBase):
@@ -213,6 +219,9 @@
     cdef object _pytag
     cdef char* _c_ns
     cdef char* _c_name
+    def __cinit__(self):
+        self._lookup_function = _attribute_class_lookup
+
     def __init__(self, attribute_name, class_mapping,
                  ElementClassLookup fallback=None):
         self._pytag = _getNsTag(attribute_name)
@@ -225,7 +234,6 @@
         self._class_mapping = dict(class_mapping)

         FallbackElementClassLookup.__init__(self, fallback)
-        self._lookup_function = _attribute_class_lookup

 cdef object _attribute_class_lookup(state, _Document doc, xmlNode* c_node):
     cdef AttributeBasedElementClassLookup lookup
@@ -245,8 +253,7 @@
     """ParserBasedElementClassLookup(self, fallback=None)
     Element class lookup based on the XML parser.
     """
-    def __init__(self, ElementClassLookup fallback=None):
-        FallbackElementClassLookup.__init__(self, fallback)
+    def __cinit__(self):
         self._lookup_function = _parser_class_lookup

 cdef object _parser_class_lookup(state, _Document doc, xmlNode* c_node):
@@ -272,8 +279,7 @@

     If you return None from this method, the fallback will be called.
     """
-    def __init__(self, ElementClassLookup fallback=None):
-        FallbackElementClassLookup.__init__(self, fallback)
+    def __cinit__(self):
         self._lookup_function = _custom_class_lookup

     def lookup(self, type, doc, namespace, name):
Index: src/lxml/lxml.pyclasslookup.pyx
===================================================================
--- src/lxml/lxml.pyclasslookup.pyx	(revision 55464)
+++ src/lxml/lxml.pyclasslookup.pyx	(working copy)
@@ -311,8 +311,7 @@

     If you return None from this method, the fallback will be called.
     """
-    def __init__(self, ElementClassLookup fallback=None):
-        FallbackElementClassLookup.__init__(self, fallback)
+    def __cinit__(self):
         self._lookup_function = _lookup_class

     def lookup(self, doc, element):
Stefan Behnel | 4 Jun 19:50

Re: svn version failed to compile against hg version of cython

Hi,

Kevin JR wrote:
> On Wed, Jun 4, 2008 at 6:36 PM, Stefan Behnel <stefan_ml <at> behnel.de> wrote:
>> I guess you are actually using an older Cython version, likely installed
>> with easy_install. The version number in current hg wasn't increased yet.
>>
>>
> no, it's compiled from hg pool.

Believe me, you are not using a recent developer version of Cython. Maybe you
have hg pulled from cython-release instead of cython-devel or whatever. Try
moving your hg Cython directory out of the way and check if it still compiles.
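
A quick way to see which copy of a module Python would actually pick up is
to ask the import machinery for its location. This is a small sketch for
current Python 3 (the thread itself predates it), and the helper name
locate_module is made up for illustration:

```python
import importlib.util


def locate_module(name):
    """Return the file Python would load for a top-level module, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


# Shows the path of the Cython package that is first on sys.path,
# or None if no Cython can be found at all.
print(locate_module("Cython"))
```

If the printed path points into site-packages rather than the hg checkout,
an installed copy is shadowing the development version.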

Stefan
Martijn Faassen | 4 Jun 22:01

Re: segfault when using etree.CustomElementClassLookup

Hey Stefan,

On Wed, Jun 4, 2008 at 2:38 PM, Stefan Behnel <stefan_ml <at> behnel.de> wrote:
> Martijn Faassen wrote:
>> I just ran into a segfault with lxml (2.0.6). The problem is as follows:
>>
>> from lxml import etree
>>
>> class Lookup(etree.CustomElementClassLookup):
>>      def __init__(self):
>>          pass
>
> Yep, you didn't call the __init__() method of the super class here, so the
> internal lookup function call isn't set up.

Hm, I thought you were wrong, but you are right. I actually did have a
super call before I whittled it away to a minimal (too minimal!) test
case:

class Lookup(etree.CustomElementClassLookup):
  def __init__(self):
      super(etree.CustomElementClassLookup, self).__init__()

But I just realized that call was wrong, and should've been:

class Lookup(etree.CustomElementClassLookup):
    def __init__(self):
        super(Lookup, self).__init__()

that *does* work. :)
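
The difference between the two super() calls can be shown without lxml.
super(SomeClass, self) starts the method search after SomeClass in the
method resolution order, so naming the parent class skips the parent's own
__init__(). The sketch uses made-up stand-in classes (Base, Middle, Wrong,
Right), with Middle playing the role of etree.CustomElementClassLookup:

```python
class Base(object):
    def __init__(self):
        self.base_ready = True


class Middle(Base):
    def __init__(self):
        super(Middle, self).__init__()
        self.middle_ready = True  # stands in for the lookup setup


class Wrong(Middle):
    def __init__(self):
        # Starts the search *after* Middle, so Middle.__init__ is skipped.
        super(Middle, self).__init__()


class Right(Middle):
    def __init__(self):
        # Starts the search after Right, so Middle.__init__ runs.
        super(Right, self).__init__()


print(hasattr(Wrong(), "middle_ready"))  # prints: False
print(hasattr(Right(), "middle_ready"))  # prints: True
```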

> I replaced that with a __cinit__()
> now that always sets it to the default lookup scheme, so that it won't
> segfault anymore even if people forget the obvious. ;)

> A patch is attached and it's generally easy to work around this by writing
> correct code, so there won't be a 2.0.7 right away.

Yes, that's fine, I could work around it anyway, and you're right that it's
also a mistake on my part. You don't expect a segfault even if you do it
wrong, of course, but it's a corner case. My apologies for the mistaken
bug report!

> BTW, this:
>
>> parser.setElementClassLookup(lookup)
>
> is correctly spelled
>
>> parser.set_element_class_lookup(lookup)
>
> since lxml 2.0, following PEP 8 naming conventions. However, I didn't dare to
> remove the original method, since I figured that it would break tons of code
> for no major reason. At least the examples should reflect the new name
> everywhere now, so maybe I can remove it in lxml 3.0. ;)

The documentation on the website still had the camelCase names when I read
it yesterday.

Regards,

Martijn
Stefan Behnel | 4 Jun 22:23

Re: segfault when using etree.CustomElementClassLookup

Hi,

Martijn Faassen wrote:
> On Wed, Jun 4, 2008 at 2:38 PM, Stefan Behnel <stefan_ml <at> behnel.de> wrote:
>>> parser.setElementClassLookup(lookup)
>> is correctly spelled
>>
>>> parser.set_element_class_lookup(lookup)
>> since lxml 2.0, following PEP 8 naming conventions. However, I didn't dare to
>> remove the original method, since I figured that it would break tons of code
>> for no major reason. At least the examples should reflect the new name
>> everywhere now, so maybe I can remove it in lxml 3.0. ;)
> 
> The documentation on the website still had the camelCase names when I read
> it yesterday.

Hrmpf, thanks. I had actually postponed my initial decision to follow PEP 8
everywhere, and then forgotten to fix that function name for 2.0. Then I
figured out later that it was still used everywhere in the docs, so I couldn't
remove it without a longer warning phase. I had fixed it on the trunk back
then, but apparently forgot to merge the doc changes over to the 2.0 branch...

It's fixed now ... finally ...

Stefan
Martijn Faassen | 4 Jun 22:27

Re: segfault when using etree.CustomElementClassLookup

Hi there,

On Wed, Jun 4, 2008 at 10:23 PM, Stefan Behnel <stefan_ml <at> behnel.de> wrote:
[snip]
> It's fixed now ... finally ...

I'm glad this thread came to some good after all then, even though it
was all based on a mistake by me. :)

Regards,

Martijn
Roger Patterson | 5 Jun 01:59

a different segfault

Hi Stefan et al.

I am getting a mysterious segfault using the XSLT lib.
Basically, if I have:
    <xsl:strip-space elements="*"/>

in my transform, I get the segfault; if I remove that line, it works fine.

Now, I haven't distilled it down to a succinct example yet, and my 
transform and code are pretty large, but I was wondering if anyone else 
has experienced this?

cheers
-Roger

The dump looks like this:

*** glibc detected *** python: double free or corruption (!prev): 0x000000000f7dab00 ***
======= Backtrace: =========
/lib64/libc.so.6[0x352d46e890]
/lib64/libc.so.6(cfree+0x8c)[0x352d471fac]
/usr/lib64/libxml2.so.2(xmlFreeNodeList+0x177)[0x3536e4ff27]
/usr/lib64/libxml2.so.2(xmlFreeNodeList+0x89)[0x3536e4fe39]
/usr/lib64/libxml2.so.2(xmlFreeNodeList+0x89)[0x3536e4fe39]
/usr/lib64/libxml2.so.2(xmlFreeDoc+0xb6)[0x3536e4fc96]
/usr/lib/python2.4/site-packages/lxml-2.0.4-py2.4-linux-x86_64.egg/lxml/etree.so[0x2aaaaf1b79c8]
/usr/lib/python2.4/site-packages/lxml-2.0.4-py2.4-linux-x86_64.egg/lxml/etree.so[0x2aaaaf1b900d]
/usr/lib64/libpython2.4.so.1.0[0x353fa74f98]
/usr/lib64/libpython2.4.so.1.0[0x353fa4abd2]
/usr/lib64/libpython2.4.so.1.0(PyEval_EvalCodeEx+0x383)[0x353fa95363]
/usr/lib64/libpython2.4.so.1.0(PyEval_EvalFrame+0x407f)[0x353fa9405f]
/usr/lib64/libpython2.4.so.1.0(PyEval_EvalCodeEx+0x925)[0x353fa95905]
/usr/lib64/libpython2.4.so.1.0(PyEval_EvalFrame+0x407f)[0x353fa9405f]
/usr/lib64/libpython2.4.so.1.0(PyEval_EvalFrame+0x44a6)[0x353fa94486]
/usr/lib64/libpython2.4.so.1.0(PyEval_EvalFrame+0x44a6)[0x353fa94486]
/usr/lib64/libpython2.4.so.1.0(PyEval_EvalFrame+0x44a6)[0x353fa94486]
/usr/lib64/libpython2.4.so.1.0(PyEval_EvalFrame+0x44a6)[0x353fa94486]
/usr/lib64/libpython2.4.so.1.0(PyEval_EvalCodeEx+0x925)[0x353fa95905]
/usr/lib64/libpython2.4.so.1.0(PyEval_EvalFrame+0x407f)[0x353fa9405f]
/usr/lib64/libpython2.4.so.1.0(PyEval_EvalCodeEx+0x925)[0x353fa95905]
/usr/lib64/libpython2.4.so.1.0(PyEval_EvalFrame+0x407f)[0x353fa9405f]
/usr/lib64/libpython2.4.so.1.0(PyEval_EvalCodeEx+0x925)[0x353fa95905]
/usr/lib64/libpython2.4.so.1.0[0x353fa4c263]
/usr/lib64/libpython2.4.so.1.0(PyObject_Call+0x10)[0x353fa35f90]
/usr/lib64/libpython2.4.so.1.0[0x353fa3c01f]
/usr/lib64/libpython2.4.so.1.0(PyObject_Call+0x10)[0x353fa35f90]
/usr/lib64/libpython2.4.so.1.0(PyEval_EvalFrame+0x220d)[0x353fa921ed]
/usr/lib64/libpython2.4.so.1.0(PyEval_EvalFrame+0x44a6)[0x353fa94486]
/usr/lib64/libpython2.4.so.1.0(PyEval_EvalCodeEx+0x925)[0x353fa95905]
/usr/lib64/libpython2.4.so.1.0(PyEval_EvalFrame+0x407f)[0x353fa9405f]
/usr/lib64/libpython2.4.so.1.0(PyEval_EvalCodeEx+0x925)[0x353fa95905]
/usr/lib64/libpython2.4.so.1.0(PyEval_EvalCode+0x32)[0x353fa95952]
/usr/lib64/libpython2.4.so.1.0[0x353fab1ea9]
/usr/lib64/libpython2.4.so.1.0(PyRun_SimpleFileExFlags+0x1a8)[0x353fab3358]
/usr/lib64/libpython2.4.so.1.0(Py_Main+0xa5d)[0x353fab979d]
/lib64/libc.so.6(__libc_start_main+0xf4)[0x352d41d8a4]
python[0x400629]
