Chris Wj | 2 Mar 03:24

Segmentation Fault loading graphml.xsd

Before filing a new bug, I want to confirm this. I am loading graphml.xsd as a schema to validate against; it references other XSD files located in the same folder.

Linux x86_64, Python 2.5, lxml 2.2beta4

lxml.etree:        (2, 2, -96, 0)
libxml used:       (2, 6, 32)
libxml compiled:   (2, 6, 32)
libxslt used:      (1, 1, 24)
libxslt compiled:  (1, 1, 24)

Code to reproduce error:

In [1]: from lxml import etree

In [2]: etree.XMLSchema(file="grap
graphml+svg.xsd graphml-attributes.xsd graphml-parseinfo.xsd graphml-structure.xsd graphml.dtd graphml.xsd

In [2]: s = etree.XMLSchema(file="graphml.xsd")
Segmentation fault

Schemas can be obtained here: http://graphml.graphdrawing.org/specification.html
Loading the other schemas segfaults too.

-Chris

_______________________________________________
lxml-dev mailing list
lxml-dev <at> codespeak.net
http://codespeak.net/mailman/listinfo/lxml-dev
Stefan Behnel | 2 Mar 14:07

Re: Segmentation Fault loading graphml.xsd

Hi,

Chris Wj wrote:
> Before filing a new bug, I want to confirm this. I am loading graphml.xsd
> as a schema to validate against; it references other XSD files located
> in the same folder.
> 
> Linux x86_64, Python 2.5, lxml 2.2beta4
> 
> lxml.etree:        (2, 2, -96, 0)
> libxml used:       (2, 6, 32)
> libxml compiled:   (2, 6, 32)
> libxslt used:      (1, 1, 24)
> libxslt compiled:  (1, 1, 24)
> 
> Code to reproduce error:
> 
> In [1]: from lxml import etree
> 
> In [2]: etree.XMLSchema(file="grap
> graphml+svg.xsd graphml-attributes.xsd graphml-parseinfo.xsd
> graphml-structure.xsd graphml.dtd graphml.xsd
> 
> In [2]: s = etree.XMLSchema(file="graphml.xsd")
> Segmentation fault
> 
> Schemas can be obtained here:
> http://graphml.graphdrawing.org/specification.html
> Loading the other schemas segfaults too.

Thanks for the report. I can confirm that this was a bug in lxml. It only
happens when you parse the schema directly from a filename. This will be
fixed in the final 2.2 release.

Stefan
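
Since the crash is reported only for the file= path, one thing that might be worth trying until the fix is released is to parse the schema document first and hand the parsed tree to XMLSchema. This is a minimal sketch under that assumption; the thread does not confirm that it avoids the crash. The tiny schema and the load_schema helper are made up for illustration and stand in for graphml.xsd.

```python
import os
import tempfile

from lxml import etree

# Hypothetical stand-in schema; the real case would use graphml.xsd.
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="graph" type="xs:string"/>
</xs:schema>"""


def load_schema(path):
    """Parse the schema file into a tree, then build the validator from it."""
    schema_doc = etree.parse(path)      # keeps the file location as base URL
    return etree.XMLSchema(schema_doc)  # instead of XMLSchema(file=path)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.xsd")
    with open(path, "wb") as f:
        f.write(XSD)
    schema = load_schema(path)

is_valid = schema.validate(etree.fromstring("<graph>ok</graph>"))
is_invalid = schema.validate(etree.fromstring("<other/>"))
```

Parsing from the path rather than from a string keeps the document's location, which should matter for schemas that reference sibling files, as the GraphML ones do.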
jholg | 3 Mar 11:31

(re-raising) exceptions problem in lxml 2.2beta4

Hi,

I just ran into a problem with some code that re-raises exceptions,
where accessing sys.exc_info() returns (None, None, None) instead
of the expected most recent exception information.
This seems to happen if "some operation" is performed on the tree after
the exception has been caught and before sys.exc_info() gets invoked.
Looks like lxml clears the exception information somewhere on the way.

Here's a minimal example where the invocation of iterchildren() triggers
the behaviour:

$ cat lxml_reraise.py
import sys

from lxml import etree
print "using lxml version", etree.__version__

root = etree.Element('root')
try:
    access = bool(sys.argv[1])
except IndexError:
    access = False

try:
    raise RuntimeError('Too much foo for bar')
except Exception, e:
    if access:
        print "children:", list(root.iterchildren())
    print sys.exc_info()

Run with lxml 2.1.5:

$ python2.4 lxml_reraise.py
using lxml version 2.1.5
(<class exceptions.RuntimeError at 0x132780>, <exceptions.RuntimeError instance at 0x25f0a8>,
<traceback object at 0x25f080>)

$ python2.4 lxml_reraise.py 1
using lxml version 2.1.5
children: []
(<class exceptions.RuntimeError at 0x132780>, <exceptions.RuntimeError instance at 0x1d5828>,
<traceback object at 0x1d57d8>)

Run with lxml 2.2beta4:

$ ln -s ~/pydev/tmp/lxml-2.2beta4/build/lib.solaris-2.8-sun4u-2.4/lxml
$ python2.4 lxml_reraise.py
using lxml version 2.2.beta4
(<class exceptions.RuntimeError at 0x132780>, <exceptions.RuntimeError instance at 0x252260>,
<traceback object at 0x252170>)

$ python2.4 lxml_reraise.py 1
using lxml version 2.2.beta4
children: []
(None, None, None)

Now, I seem to remember some discussion of changes to exception handling
for lxml 2.2. Might this be an (unwanted) side effect of those changes?

Holger
Stefan Behnel | 3 Mar 14:29

Re: (re-raising) exceptions problem in lxml 2.2beta4

Hi,

jholg <at> gmx.de wrote:
> I just ran into a problem with some code that re-raises exceptions,
> where accessing sys.exc_info() returns (None, None, None) instead
> of the expected most recent exception information.
> This seems to happen if "some operation" is performed on the tree

"some operation" being something that raises an exception internally, such
as StopIteration. So it's not really the "most recent exception" that you
get, but only the last exception that you caught in your frame.

> Now, I seem to remember some discussion of changes to exception handling
> for lxml 2.2. Might this be an (unwanted) side effect of those changes?

Yes, it's related, and it's definitely a side-effect in Python 2. I wonder
what your example does in Py3...

Looks like Cython needs some version specific code here. Pretty hard to
get these things right in a portable way...

Stefan
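
A defensive pattern for code that needs the exception details later is to capture sys.exc_info() as the first statement of the except block, before any other call gets a chance to raise and catch something internally. The following is a minimal sketch of that idea, not a fix for the lxml behaviour itself. It uses Python 3 syntax, where the handler's exception state normally survives nested handlers anyway, and the helper names are made up for illustration.

```python
import sys


def noisy_work():
    """Stand-in for a call that raises and catches an exception internally."""
    try:
        next(iter([]))          # raises StopIteration
    except StopIteration:
        pass


def handle_failure(work_after_catch):
    try:
        raise RuntimeError("Too much foo for bar")
    except Exception:
        # Capture first: nothing has run yet that could disturb the state.
        exc_info = sys.exc_info()
        work_after_catch()
        return exc_info


captured = handle_failure(noisy_work)
```

The captured tuple can then be logged or re-raised without depending on what the intermediate calls did to the interpreter's exception state.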
Alex Klizhentas | 4 Mar 12:24

Lxml Crash

Hi all,
Sometimes I get an exception that kills the Apache process. It happens only occasionally (actually, it has happened once on my production site), so I have no more logs at the moment. I can only suspect that the crash happens when I try to replace a node:

    def replace(self,child,new_child):
        root = self.getroottree().getroot()
        index = self.index(child)
        if root._should_notify():
            old_child = deepcopy(child)
            self.insert(index,new_child)
            etree.ElementBase.remove(self,child)
            root._notify(NodeReplaced(old_child,new_child))
            return self[index]
        else:
            self.insert(index,new_child)
            etree.ElementBase.remove(self,child)
            return self[index]

crash log is below:

*** glibc detected *** /usr/sbin/apache2: free(): invalid pointer: 0x08cd6eca ***
======= Backtrace: =========
/lib/tls/i686/cmov/libc.so.6[0xb7e26a85]
/lib/tls/i686/cmov/libc.so.6(cfree+0x90)[0xb7e2a4f0]
/usr/lib/libxml2.so.2(xmlFreeNodeList+0x126)[0xa984d1e6]
/usr/lib/libxml2.so.2(xmlFreeNode+0x76)[0xa984d656]
/usr/lib/python2.5/site-packages/lxml-2.2alpha1-py2.5-linux-i686.egg/lxml/etree.so[0xa9992bf2]
/usr/lib/python2.5/site-packages/lxml-2.2alpha1-py2.5-linux-i686.egg/lxml/etree.so[0xa99b529f]

I will post more logs if the crash repeats, but I would appreciate any ideas, thoughts, or comments so I can quickly eliminate, work around, or prevent the issue.

--
Regards,
Alex

Stefan Behnel | 4 Mar 13:19

Re: Lxml Crash

Alex Klizhentas wrote:
> Sometimes I get an exception that kills the Apache process. It happens
> only occasionally (actually, it has happened once on my production site),
> so I have no more logs at the moment.
> [...]
> I will post more logs if the crash repeats, but I would appreciate any
> ideas, thoughts, or comments so I can quickly eliminate, work around, or
> prevent the issue.

One thing to note is that you are using lxml 2.2alpha1. There were plenty
of bugs that were fixed in 2.2 since then, including a couple of crash
bugs. I'd try to switch to 2.2beta4 ASAP.

http://codespeak.net/lxml/dev/changes-2.2beta4.html

> I can only suspect that the crash happens when I try to
> replace a node:
>
>     def replace(self,child,new_child):
>         root = self.getroottree().getroot()
>         index = self.index(child)
>         if root._should_notify():
>             old_child = deepcopy(child)
>             self.insert(index,new_child)
>             etree.ElementBase.remove(self,child)
>             root._notify(NodeReplaced(old_child,new_child))
>             return self[index]
>         else:
>             self.insert(index,new_child)
>             etree.ElementBase.remove(self,child)
>             return self[index]

Regarding this code, I assume that "self" is an ElementBase subtype. I
wonder why you didn't write it like this:

    def replace(self,child,new_child):
        etree.ElementBase.replace(self, child, new_child)
        root = self.getroottree().getroot()
        if root._should_notify():
            root._notify(NodeReplaced(child, new_child))
        return new_child

BTW, is your tree protected against concurrent modification in any way? If
your environment (mod_python?) is configured to run requests in parallel,
concurrently replacing a child of the same parent may lead to crashes.

> crash log is below:
>
> *** glibc detected *** /usr/sbin/apache2: free(): invalid pointer:
> 0x08cd6eca ***
> ======= Backtrace: =========
> /lib/tls/i686/cmov/libc.so.6[0xb7e26a85]
> /lib/tls/i686/cmov/libc.so.6(cfree+0x90)[0xb7e2a4f0]
> /usr/lib/libxml2.so.2(xmlFreeNodeList+0x126)[0xa984d1e6]
> /usr/lib/libxml2.so.2(xmlFreeNode+0x76)[0xa984d656]
> /usr/lib/python2.5/site-packages/lxml-2.2alpha1-py2.5-linux-i686.egg/lxml/etree.so[0xa9992bf2]
> /usr/lib/python2.5/site-packages/lxml-2.2alpha1-py2.5-linux-i686.egg/lxml/etree.so[0xa99b529f]

All I can see here is that this happens when freeing a node or subtree.
Not much I can extract from that.

Stefan
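
If a tree could ever be reached from more than one request thread, one way to rule out the concurrent-modification scenario mentioned above is to serialize structural changes behind a lock. This is a minimal sketch under that assumption; the module-level lock and the replace_child helper are hypothetical and not part of lxml.

```python
import threading

from lxml import etree

# Hypothetical lock guarding all structural changes to shared trees.
_tree_lock = threading.Lock()


def replace_child(parent, child, new_child):
    """Replace `child` of `parent` with `new_child` while holding the lock."""
    with _tree_lock:
        parent.replace(child, new_child)
    return new_child


root = etree.fromstring("<root><old/></root>")
replacement = etree.Element("new")
result = replace_child(root, root[0], replacement)
child_tags = [child.tag for child in root]
```

A single global lock is the bluntest option; a lock per document would allow unrelated trees to be modified in parallel.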
Alex Klizhentas | 4 Mar 13:30

Re: Lxml Crash

OK, thanks for your suggestions. I'll apply the changes immediately.
As for concurrency: XML trees are not shared between threads, so that is unlikely to be the root cause.

--
Regards,
Alex
TP | 10 Mar 18:25

the subelements of my tree are moving alone

Hi everybody,

I have derived custom classes from ET._ElementTree and ET.ElementBase to
obtain a custom tree suited to my needs.

It works perfectly, but the nodes under the root node (the subelements)
sometimes seem to move on their own. The tree structure is kept, but the
memory address of the elements changes. Since the structure is kept, this
is not a problem when using lxml alone: I can walk the tree and do what I
need.

But I use this custom tree as the underlying data structure for a custom
PyQt QTreeWidget. In this widget, I use the "internalPointer()" method of
QModelIndex instances (as proposed in chapter 16 of the book "Rapid GUI
Programming with Python and Qt" by Mark Summerfield, around p. 500).

The problem is that when the nodes move, Qt's "internalPointer()" values
are no longer up to date, and I get segmentation faults.

Is it normal for nodes of the tree to move in memory *on their own*? Is
this due to the garbage collector? If so, how can I keep my pointers up to
date?

Thanks in advance

--

-- 
python -c "print ''.join([chr(154 - ord(c)) for c in '*9(9&(18%.\
9&1+,\'Z4(55l4('])"

"When a distinguished but elderly scientist states that something is
possible, he is almost certainly right. When he states that something is
impossible, he is very probably wrong." (first law of AC Clarke)
jholg | 11 Mar 09:22

Re: the subelements of my tree are moving alone

Hi,

> It works perfectly, but the nodes under the root node (the subelements)
> sometimes seem to move on their own. The tree structure is kept, but the
> memory address of the elements changes. Since the structure is kept, this
> is not a problem when using lxml alone: I can walk the tree and do what I
> need.

That's expected. lxml creates its Element objects on the fly on access; you
can think of them as proxies for the underlying libxml2 tree. A proxy goes
away when no Python reference to it is kept, and a new one is created on
the next access.
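
The proxy behaviour can be observed with a small experiment. This is a minimal sketch with a made-up two-child document: while a reference to a proxy is held, accessing the same node returns that very object; without a stored reference there is no such guarantee, so addresses of temporary proxies are not stable identifiers.

```python
from lxml import etree

root = etree.fromstring("<root><a/><b/></root>")

first = root[0]                 # Python proxy for the libxml2 node <a>
same_proxy = root[0] is first   # the live proxy is reused while `first` exists

# Proxies created here are temporaries; each may be discarded right after use.
tags = [child.tag for child in root]
```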

> But I use this custom tree as the underlying data structure for a custom
> PyQt QTreeWidget. In this widget, I use the "internalPointer()" method of
> QModelIndex instances (as proposed in chapter 16 of the book "Rapid GUI
> Programming with Python and Qt" by Mark Summerfield, around p. 500).
>
> The problem is that when the nodes move, Qt's "internalPointer()" values
> are no longer up to date, and I get segmentation faults.
>
> Is it normal for nodes of the tree to move in memory *on their own*? Is
> this due to the garbage collector? If so, how can I keep my pointers up to
> date?

You could keep the elements alive by caching them, which is usually done for
performance tuning (trading memory for speed), for example:

cache[root] = list(root.iter())

This caches the whole tree, see "Caching elements" in the objectify performance section:
http://codespeak.net/lxml/performance.html#lxml-objectify

So essentially you'd need to keep a Python reference to each instantiated element that you want to hand to PyQt.
I wondered why PyQt doesn't keep the Python reference itself, but alas it's just a weak reference:
http://www.mail-archive.com/pyqt <at> riverbankcomputing.com/msg16046.html

Holger
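
The caching idea above can be wrapped in a small holder object that lives as long as the GUI model does. This is a minimal sketch; the ElementKeeper class is hypothetical and the PyQt side is left out entirely. It assumes the holder is refreshed whenever elements are added to or removed from the tree.

```python
from lxml import etree


class ElementKeeper:
    """Hypothetical helper holding strong references to all element proxies."""

    def __init__(self, root):
        self.refresh(root)

    def refresh(self, root):
        # Call again after structural changes so new elements are kept too.
        self.elements = list(root.iter())


root = etree.fromstring("<root><a><b/></a><c/></root>")
keeper = ElementKeeper(root)

# While the keeper is alive, walking the tree yields the same proxy objects.
stable = all(kept is seen for kept, seen in zip(keeper.elements, root.iter()))
kept_tags = [el.tag for el in keeper.elements]
```

Keeping every proxy alive costs memory proportional to the size of the tree, which is the trade-off mentioned above.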
TP | 11 Mar 11:41

Re: the subelements of my tree are moving alone

jholg <at> gmx.de wrote:

> So essentially you'd need to keep a Python reference to each instantiated
> element that you want to hand to PyQt. I wondered why PyQt doesn't keep
> the Python reference itself, but alas it's just a weak reference:
> http://www.mail-archive.com/pyqt <at> riverbankcomputing.com/msg16046.html

Thanks Holger and Stefan for your help.
By keeping a reference to all elements in the tree, it works perfectly.

Julien

