Andreas Pakulat | 1 Feb 2006 20:55

Request for help on testing new libxml2 feature

Hi,

regarding the remove-redundant-namespaces issue, there is news:

kbuchcik implemented xmlDOMWrapReconcileNamespaces in tree.c of
libxml2, so it should now remove redundant NS declarations. However, I have
neither any experience with libxml2 nor the time to dig into it far enough
to build a test program for this.

So I am asking you guys here, who surely are libxml2 experts, whether you
could help me out. Either some "hack" for lxml that allows me to test
this, or a small program that takes an XML file and applies this
function to its DOM tree (and outputs the result), would be really
great.

Thanks,

Andreas
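
For anyone wanting to check results from the Python side: the helper below
does not call libxml2 or xmlDOMWrapReconcileNamespaces at all. It is only a
minimal sketch, assuming the standard library's xml.etree.ElementTree, that
counts namespace declarations which repeat a prefix/URI binding already in
scope. Running it on a file before and after reconciliation could show
whether redundant declarations were dropped. The function name
count_redundant_ns_decls is made up for this illustration.

```python
import io
import xml.etree.ElementTree as ET


def count_redundant_ns_decls(source):
    """Count xmlns declarations that repeat a binding already in scope.

    `source` is a file name or a binary file object. The name of this
    function is hypothetical; it is not part of lxml or libxml2.
    """
    in_scope = []   # stack of (prefix, uri) pairs, innermost declaration last
    redundant = 0
    for event, payload in ET.iterparse(source, events=("start-ns", "end-ns")):
        if event == "start-ns":
            prefix, uri = payload
            # Find the URI this prefix is currently bound to, if any.
            current = next(
                (u for p, u in reversed(in_scope) if p == prefix), None
            )
            if current == uri:
                redundant += 1
            in_scope.append((prefix, uri))
        else:
            # "end-ns" carries no payload; it closes the innermost declaration.
            in_scope.pop()
    return redundant


if __name__ == "__main__":
    sample = (
        b'<a xmlns:x="urn:x">'
        b'<b xmlns:x="urn:x"><x:c/></b>'   # repeats the binding from <a>
        b'<d xmlns:x="urn:y"/>'            # rebinds x, so not redundant
        b"</a>"
    )
    print(count_redundant_ns_decls(io.BytesIO(sample)))
```

Rebinding a prefix to a different URI is deliberately not counted, since
removing such a declaration would change the meaning of the document.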

-- 
You now have Asian Flu.
_______________________________________________
lxml-dev mailing list
lxml-dev <at> codespeak.net
http://codespeak.net/mailman/listinfo/lxml-dev
Kasimier Buchcik | 2 Feb 2006 13:23

Re: Request for help on testing new libxml2 feature

Hi,

On Wed, 2006-02-01 at 20:55 +0100, Andreas Pakulat wrote:
> Hi,
> 
> regarding the remove-redundant-namespaces issue, there is news:
> 
> kbuchcik implemented xmlDOMWrapReconcileNamespaces in tree.c of
> libxml2, so it should now remove redundant NS declarations. However, I have
> neither any experience with libxml2 nor the time to dig into it far enough
> to build a test program for this.
> 
> So I am asking you guys here, who surely are libxml2 experts, whether you
> could help me out. Either some "hack" for lxml that allows me to test
> this, or a small program that takes an XML file and applies this
> function to its DOM tree (and outputs the result), would be really
> great.

Note that removal of redundant ns-decls in
xmlDOMWrapReconcileNamespaces() was committed to CVS just
yesterday, and I fixed some bugs today; I performed only rudimentary
tests, so more testing would be appreciated.

Andreas' bug-entry:
http://bugzilla.gnome.org/show_bug.cgi?id=329347

Regards,

Kasimier

Petri Savolainen | 2 Feb 2006 14:15

Version for windows?

Hello, I saw some talk about a windows version on this list/group, but 
did not see one on the website yet...? Any pointers? The one below was 
not available when I tried...

Thanks!

  Petri

Steve Howe wrote:
> Hello Werner,
> 
> Wednesday, December 14, 2005, 1:10:09 PM, you wrote:
> 
>> I am running into problems trying to:
> 
>> \python24\python setup.py install
> 
> [...]
> 
>> Any chance that a binary build for Windows XP is available?
> You can get it from me:
> 
> http://redguy.dhs.org:81/lxml-0.8.win32-py2.4.exe
> 
> This is the 0.8 version.
> 
> Best Regards,
> Howe
Stefan Behnel | 2 Feb 2006 16:09

XMLParser + XMLFormatter ?

Hi,

Andreas requested a new feature that mainly relates to output formatting. The
obvious API for it might be a new keyword argument.

However, since we seem to be getting more and more keyword arguments in the
I/O functions, maybe we should rethink the way we set options on output
methods. Fredrik mentioned the possibility of having an XMLParser that only
wraps options like this:

    class XMLParser:
        def __init__(self, **options):
            self.options = options

    doc = ET.parse(source, parser=XMLParser(configuration))

What about doing the same with output options? Imagine this:

    class XMLFormatter:
        def __init__(self, xhtml=False, pretty_print=False, indent=4, ...):
            self.options = {} ...

    xml_text = ET.write_str(XMLFormatter(pretty_print=True))

That would give us a nice, symmetric API for input and output options.

If you prefer, you could easily use subclasses to provide different default
arguments:

    class XMLPrettyPrinter(XMLFormatter):
[...]
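
The rest of this message is cut off in the archive, so what follows is not
Stefan's code. It is a minimal sketch of the options-object idea, assuming
the standard library's xml.etree.ElementTree in place of lxml. XMLFormatter
and write_str take their names from the proposal above, but the bodies, the
_indent helper and the XMLPrettyPrinter defaults are guesses made for
illustration only.

```python
import copy
import xml.etree.ElementTree as ET


class XMLFormatter:
    """Hypothetical holder for output options (not an existing lxml class)."""

    def __init__(self, pretty_print=False, indent=4):
        self.options = {"pretty_print": pretty_print, "indent": indent}


class XMLPrettyPrinter(XMLFormatter):
    """A subclass that only changes the defaults (body is a guess)."""

    def __init__(self, indent=4):
        super().__init__(pretty_print=True, indent=indent)


def _indent(elem, width, level=0):
    """Insert whitespace so that each child starts on its own indented line."""
    pad = "\n" + " " * (width * level)
    children = list(elem)
    if not children:
        return
    if not (elem.text or "").strip():
        elem.text = pad + " " * width
    for child in children:
        _indent(child, width, level + 1)
        if not (child.tail or "").strip():
            child.tail = pad + " " * width
    # The last child is followed by the parent's end tag, one level back.
    if not (children[-1].tail or "").strip():
        children[-1].tail = pad


def write_str(root, formatter):
    """Serialize `root` to a string according to the formatter's options."""
    opts = formatter.options
    if opts["pretty_print"]:
        root = copy.deepcopy(root)      # leave the caller's tree untouched
        _indent(root, opts["indent"])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    tree = ET.fromstring("<a><b>1</b><c>2</c></a>")
    print(write_str(tree, XMLFormatter()))
    print(write_str(tree, XMLPrettyPrinter(indent=2)))
```

Keeping the options in one object means a new output option becomes a new
constructor argument rather than another keyword on every I/O function.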

Stefan Behnel | 2 Feb 2006 16:30

Re: Request for help on testing new libxml2 feature


Kasimier Buchcik wrote:
> On Wed, 2006-02-01 at 20:55 +0100, Andreas Pakulat wrote:
>> regarding the remove-redundant-namespaces issue, there is news:
>>
>> kbuchcik implemented xmlDOMWrapReconcileNamespaces in tree.c of
>> libxml2, so it should now remove redundant NS declarations. However, I have
>> neither any experience with libxml2 nor the time to dig into it far enough
>> to build a test program for this.
>>
>> So I am asking you guys here, who surely are libxml2 experts, whether you
>> could help me out. Either some "hack" for lxml that allows me to test
>> this, or a small program that takes an XML file and applies this
>> function to its DOM tree (and outputs the result), would be really
>> great.
> 
> Note that removal of redundant ns-decls in
> xmlDOMWrapReconcileNamespaces() was committed to CVS just
> yesterday, and I fixed some bugs today; I performed only rudimentary
> tests, so more testing would be appreciated.
> 
> Andreas' bug-entry:
> http://bugzilla.gnome.org/show_bug.cgi?id=329347

Ok, I think we cannot easily depend on CVS versions of libxml2 in lxml, so
this rather experimental feature will not be supported in lxml for a while.

As a workaround for this kind of problem with libxml2 versions, we *could*
support something like conditional *compilation* in lxml, depending on the
libxml2 version. It's not beautiful, but I could imagine something like this,
[...]
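
The example that followed is cut off in the archive. Purely as a loose
illustration of gating a feature on the libxml2 version, and not as Stefan's
proposal, here is a runtime check in Python rather than the conditional
compilation he mentions. The threshold (2, 6, 24) is a placeholder: the
thread does not say which release will ship the feature. The names
parse_version and has_ns_cleanup are likewise invented for this sketch.

```python
# Placeholder threshold, NOT taken from the thread: replace it with the
# first libxml2 release that actually contains the namespace cleanup.
ASSUMED_MIN_VERSION = (2, 6, 24)


def parse_version(text):
    """Turn a dotted version string such as '2.6.23' into (2, 6, 23)."""
    return tuple(int(part) for part in text.split("."))


def has_ns_cleanup(libxml2_version, minimum=ASSUMED_MIN_VERSION):
    """Return True if the given libxml2 version meets the assumed minimum."""
    return parse_version(libxml2_version) >= minimum


if __name__ == "__main__":
    for version in ("2.6.23", "2.6.24", "2.10.0"):
        state = "enabled" if has_ns_cleanup(version) else "disabled"
        print(version, state)
```

Comparing tuples of integers avoids the classic mistake of comparing version
strings, where "2.10.0" would sort before "2.6.24".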

Scott Haeger | 4 Feb 2006 17:09

memory leak in el.get() ?

I am getting a memory leak using the get attribute method.  The following will illustrate the problem:

from lxml import etree

el = etree.Element("element")
el.set('a', 'first')

while True:
    x = el.get('a')

Memory usage, as displayed in top, continues to grow.

I am using the following:
lxml 0.8
libxml2 2.6.23
Fedora Core 3
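
A bounded variant of the loop above can make the growth easier to quantify
than watching top. This is a sketch only, assuming a Unix system (the
resource module is not available on Windows); it falls back to the standard
library's ElementTree when lxml is not installed, in which case it says
nothing about lxml itself. The helper names peak_rss and rss_growth are
invented for this illustration.

```python
import resource

try:
    from lxml import etree                      # the library under discussion
except ImportError:                             # same Element API, stdlib only
    import xml.etree.ElementTree as etree


def peak_rss():
    """Peak resident set size of this process (KiB on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def rss_growth(action, iterations=200_000):
    """Call action() repeatedly and return how much the peak RSS grew."""
    before = peak_rss()
    for _ in range(iterations):
        action()
    return peak_rss() - before


el = etree.Element("element")
el.set("a", "first")

growth = rss_growth(lambda: el.get("a"))
print("peak RSS grew by", growth)
```

A leak-free build should report growth that stays near zero as the iteration
count is raised, whereas a leak makes it scale with the number of calls.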

Scott
  


Stefan Behnel | 5 Feb 2006 10:26

Re: memory leak in el.get() ?


Scott Haeger wrote:
> I am getting a memory leak using the get attribute method.  The
> following will illustrate the problem:
> 
> el = etree.Element("element")
> el.set('a', 'first')
> 
> while True:
>     x = el.get('a')
> 
> Memory usage, as displayed in top, continues to grow.
> 
> I am using the following:
> lxml 0.8
> libxml2 2.6.23
> Fedora Core 3
> 
> Scott

Hi Scott,

thanks for reporting this. I think you're right, there is a problem with
freeing the string buffer of attribute values after converting them to Python
strings. I'll see if I can fix that and commit a patch to SVN.

Thanks,
Stefan
Stefan Behnel | 5 Feb 2006 11:27

Re: memory leak in el.get() ?


Scott Haeger wrote:
> I am getting a memory leak using the get attribute method.  The
> following will illustrate the problem:
> 
> el = etree.Element("element")
> el.set('a', 'first')
> 
> while True:
>     x = el.get('a')
> 
> Memory usage, as displayed in top, continues to grow.
> 
> I am using the following:
> lxml 0.8
> libxml2 2.6.23
> Fedora Core 3
> 
> Scott

Fixed. You can either use the current SVN version (trunk or scoder2 branch) or
apply the attached patch by hand.

Stefan
Attachment (leak.patch): text/x-patch, 1216 bytes
Werner F. Bruhin | 7 Feb 2006 15:45

OT - Schema conversion

I am trying to convert XSD schema to Relax NG to get the best of both 
worlds - maybe.

Looked at Trang but it does not convert from XSD to Relax, just the 
other way round but I found the Sun Relax NG converter which should be 
able to do it.

But I can't get it to work and my Java knowledge is probably less than none.

Has anyone used this conversion tool successfully?

I am getting the following exception:

Sun Relax NG Converter version 20030225

C:\Dev\TheWineCellarBook\xml\sunconverter>java -jar rngconv.jar wineXML.xsd > wineXML.rng
Exception in thread "main" java.lang.ClassCastException
       at com.sun.msv.datatype.xsd.TypeIncubator.derive(TypeIncubator.java:216)
       at com.sun.msv.reader.datatype.xsd.XSDatatypeExp$1.derive(XSDatatypeExp.java:92)
       at com.sun.msv.reader.datatype.xsd.RestrictionState.annealType(RestrictionState.java:41)
       at com.sun.msv.reader.datatype.xsd.TypeWithOneChildState.makeType(TypeWithOneChildState.java:42)
       at com.sun.msv.reader.datatype.xsd.TypeState._makeType(TypeState.java:76)
       at com.sun.msv.reader.datatype.xsd.TypeState.endSelf(TypeState.java:52)
       at com.sun.msv.reader.SimpleState.endElement(SimpleState.java:100)
       at org.xml.sax.helpers.XMLFilterImpl.endElement(Unknown Source)
       at org.apache.xerces.parsers.AbstractSAXParser.endElement(Unknown Source)
       at org.apache.xerces.impl.XMLNSDocumentScannerImpl.scanEndElement(Unknown Source)
       at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
       at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
       at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
       at org.apache.xerces.parsers.DTDConfiguration.parse(Unknown Source)
       at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
       at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
       at com.sun.msv.reader.util.GrammarLoader._loadSchema(GrammarLoader.java:511)
       at com.sun.msv.reader.util.GrammarLoader.parse(GrammarLoader.java:331)
       at com.sun.msv.reader.util.GrammarLoader.loadSchema(GrammarLoader.java:178)
       at com.sun.msv.writer.relaxng.Driver.main(Unknown Source)

I am on Windows XP Pro and I am not familiar with Java (the version installed
is j2re1.4.2_01).

I tried to find info on the Sun site and posted on their list, but have had
no answer so far.

Maybe someone can point me in the right direction to get this working.

See you
Werner
Narayan Desai | 10 Feb 2006 16:52

fix for another memory leak

In addition to the memory leak in the Element.get method, I have
recently been seeing memory leaks in Element.Attrib.has_key. This
results in the following trace from valgrind:
==12133== 8 bytes in 4 blocks are definitely lost in loss record 2 of 48
==12133==    at 0x401B422: malloc (vg_replace_malloc.c:149)
==12133==    by 0x453D8B5: xmlStrndup (in /usr/lib/libxml2.so.2.6.23)
==12133==    by 0x453D943: xmlStrdup (in /usr/lib/libxml2.so.2.6.23)
==12133==    by 0x453E0D1: xmlStrcat (in /usr/lib/libxml2.so.2.6.23)
==12133==    by 0x44E98A1: xmlNodeListGetString (in /usr/lib/libxml2.so.2.6.23)
==12133==    by 0x44EF6D2: xmlGetNoNsProp (in /usr/lib/libxml2.so.2.6.23)
==12133==    by 0x4462ED0: __pyx_f_5etree_7_Attrib_has_key (etree.c:5389)
==12133==    by 0x80B6BE3: (within /usr/bin/python2.3)
==12133==    by 0x80B8356: PyEval_EvalCodeEx (in /usr/bin/python2.3)
==12133==    by 0x80B85D4: PyEval_EvalCode (in /usr/bin/python2.3)
==12133==    by 0x80D8EFF: PyRun_InteractiveOneFlags (in /usr/bin/python2.3)
==12133==    by 0x80D9018: PyRun_InteractiveLoopFlags (in /usr/bin/python2.3)

The following patch appears to fix things for me. It is basically
copied from the Element.get fix.
 -nld

Index: src/lxml/etree.pyx
===================================================================
--- src/lxml/etree.pyx  (revision 23162)
+++ src/lxml/etree.pyx  (working copy)
@@ -855,6 +855,7 @@
             result = tree.xmlGetNoNsProp(self._c_node, tag)
         else:
             result = tree.xmlGetNsProp(self._c_node, tag, ns)
+        tree.xmlFree(result)
         return result is not NULL

     def __contains__(self, key):
