varun bhatnagar | 22 Jul 11:07 2014

Removing an element and strip space - XSLT

Hi,

I am experimenting with Python and XSLT. I have an XML document that I want to transform into another XML document by deleting one of its elements. The XML is pasted below:

<?xml version="1.0" encoding="UTF-8"?>
<testNode>
<nodeInfo>
      <nodePeriod nodeTime="600000000"/>
      <nodeBase base="0" />
    </nodeInfo>
</testNode>


I want to remove the <nodeBase> element, and this is what my XSL file looks like:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>

  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  
  
<xsl:template match="/testNode/nodeInfo/nodeBase">
</xsl:template>

</xsl:stylesheet>

When I execute it my output looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<testNode>
<nodeInfo>
      <nodePeriod nodeTime="600000000"/>
      
    </nodeInfo>
</testNode>

I want to strip the whitespace left between <nodePeriod> and </nodeInfo>.
Can anyone suggest a way to do that?

Thanks,
BR,
Varun
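
One possible approach, sketched here under the assumption that the stylesheet is applied through lxml.etree.XSLT: add <xsl:strip-space elements="*"/> so that whitespace-only text nodes are dropped from the input before the identity template copies it, and let indent="yes" lay out the result again. The variable names are illustrative only.

```python
from lxml import etree

SOURCE = b"""<?xml version="1.0" encoding="UTF-8"?>
<testNode>
<nodeInfo>
      <nodePeriod nodeTime="600000000"/>
      <nodeBase base="0" />
    </nodeInfo>
</testNode>
"""

STYLESHEET = b"""<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <!-- Drop whitespace-only text nodes from every input element. -->
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy everything by default. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- An empty template removes the element it matches. -->
  <xsl:template match="/testNode/nodeInfo/nodeBase"/>
</xsl:stylesheet>
"""

transform = etree.XSLT(etree.XML(STYLESHEET))
result = transform(etree.XML(SOURCE))
output_text = str(result)
print(output_text)
```

Note that strip-space removes all whitespace-only text nodes, so it is only appropriate when such whitespace carries no meaning in the document.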
_________________________________________________________________
Mailing list for the lxml Python XML toolkit - http://lxml.de/
lxml <at> lxml.de
https://mailman-mail5.webfaction.com/listinfo/lxml
Martin Mueller | 19 Jul 19:20 2014

finding siblings that are not next or previous

Is there a simple way in lxml to say things like "the third house on the
block," "the next house but one," or "the last house on the block."?  I
understand getnext() and getprevious(), and it's possible to concatenate
those, but it's not very elegant, and I'm not sure how it scales.

I work with TEI documents where <w> elements alternate with <c> elements,
and very often what you do with a given <w> element depends on the
attributes of the next <w> element, which is the "next but one" element.

Martin Mueller
Professor emeritus of English and Classics
Northwestern University
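
A minimal sketch of two ways to express this kind of sibling lookup in lxml: itersiblings() with a tag filter, and XPath positional predicates. The sample markup is invented and leaves out the TEI namespace; with real TEI files the tag would presumably need to be namespace-qualified (for example "{http://www.tei-c.org/ns/1.0}w").

```python
from lxml import etree

root = etree.XML(
    '<s><w n="1">the</w><c> </c><w n="2">third</w><c> </c>'
    '<w n="3">house</w><c> </c><w n="4">here</w></s>'
)
first_w = root[0]

# "The next house but one": the next <w> sibling, skipping the <c> between.
next_w = next(first_w.itersiblings("w"), None)

# The same lookup as an XPath expression relative to the element.
next_w_xpath = first_w.xpath("following-sibling::w[1]")[0]

# "The third house on the block" and "the last house on the block".
third_w = root.xpath("w[3]")[0]
last_w = root.xpath("w[last()]")[0]

# Walking backwards: the closest preceding <w> sibling.
previous_w = next(last_w.itersiblings("w", preceding=True), None)

print(next_w.get("n"), third_w.text, last_w.get("n"), previous_w.get("n"))
```

itersiblings() is lazy, so taking only the first match with next() does not visit the remaining siblings.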

Arne Neumann | 16 Jul 14:17 2014

How to set xml:base attribute with ElementMaker?

Dear all,

I have to generate XML files that contain snippets like this one,
but I can't find a way to produce an "xml:base" attribute:

<markList xmlns:xlink="http://www.w3.org/1999/xlink" type="tok" 
xml:base="maz-1423.text.xml">
	<mark id="sTok1" xlink:href="#xpointer(string-range(//body,'',1,3))" />
	<mark id="sTok2" xlink:href="#xpointer(string-range(//body,'',5,10))" />
</markList>

I tried to set up the namespaces, but instead of "xml:base", I only 
get "ns0:base".

NSMAP={None: 'xml',
        'xlink': 'http://www.w3.org/1999/xlink',
        'xml': 'xml'}

E = ElementMaker(nsmap=NSMAP)
etree.tostring(E("markList", {'type': 'tok', '{%s}base' % NSMAP['xml']: 
'maz-1423.text.xml'}))

'<markList xmlns:xlink="http://www.w3.org/1999/xlink" xmlns="xml" 
xmlns:ns0="xml" ns0:base="maz-1423.text.xml" type="tok"/>'

I'd appreciate any help with this.

Best regards,
Arne Neumann
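
A sketch of one way this might be done: the "xml" prefix is permanently bound to the namespace URI http://www.w3.org/XML/1998/namespace, so using that URI in the attribute name (rather than the string 'xml') should make the serializer emit "xml:base" without any extra declaration. The constant names below are illustrative.

```python
from lxml import etree
from lxml.builder import ElementMaker

# The reserved namespace that the "xml" prefix always refers to.
XML_NS = "http://www.w3.org/XML/1998/namespace"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Only xlink needs to be declared; "xml" is predefined.
E = ElementMaker(nsmap={"xlink": XLINK_NS})

mark_list = E(
    "markList",
    {"type": "tok", "{%s}base" % XML_NS: "maz-1423.text.xml"},
    E("mark", {"id": "sTok1",
               "{%s}href" % XLINK_NS: "#xpointer(string-range(//body,'',1,3))"}),
    E("mark", {"id": "sTok2",
               "{%s}href" % XLINK_NS: "#xpointer(string-range(//body,'',5,10))"}),
)

xml_text = etree.tostring(mark_list, pretty_print=True).decode()
print(xml_text)
```

The original attempt mapped both the default namespace and the "xml" prefix to the literal URI 'xml', which is a different namespace from the reserved one, hence the generated "ns0" prefix.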
Elhadi Falah | 16 Jul 11:23 2014

Apache2 crashes with segmentation fault

Hello,

We are using lxml in several of our applications with Python 2.6. From time to time, the application stops responding after a segmentation fault ([notice] child pid 10544 exit signal Segmentation fault (11)), with this kind of backtrace:

Jul 1 15:24:48 server1 httpd: *** glibc detected *** /usr/sbin/apache2: munmap_chunk(): invalid pointer: 0x00007f6468bf2c00 ***

Jul 1 15:24:48 server1 httpd: ======= Backtrace: =========

Jul 1 15:24:48 server1 httpd: /lib/libc.so.6(+0x78bf6)[0x7f64767ecbf6]

Jul 1 15:24:48 server1 httpd: /usr/lib/libxml2.so.2(xmlCopyError+0xd1)[0x7f6473311801]

Jul 1 15:24:48 server1 httpd: /usr/lib/libxml2.so.2(__xmlRaiseError+0x30b)[0x7f6473312ecb]

Jul 1 15:24:48 server1 httpd: /usr/lib/libxml2.so.2(+0x393e5)[0x7f64733173e5]

Jul 1 15:24:48 server1 httpd: /usr/lib/libxml2.so.2(xmlParseDocument+0x2dc)[0x7f647332e5cc]

Jul 1 15:24:48 server1 httpd: /usr/lib/libxml2.so.2(+0x50895)[0x7f647332e895]

Jul 1 15:24:48 server1 httpd: /usr/lib/python2.6/dist-packages/lxml/etree.so(+0x8cbc2)[0x7f645691cbc2]

Jul 1 15:24:48 server1 httpd: /usr/lib/python2.6/dist-packages/lxml/etree.so(+0x2c7cf)[0x7f64568bc7cf]


After trying several versions of lxml, we are still facing the issue. I've checked the system memory consumption and everything looks fine to me: plenty of memory is available, and I don't see any process consuming an abnormal amount.

The issue is reproducible every time we run an Apache reload command (apache2 reload or apache2 graceful). As a workaround, we run apache2 restart.

We've followed recommendations defined on these 2 links but we're still facing the issue.

http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Python_Simplified_GIL_State_API

http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Multiple_Python_Sub_Interpreters


Library version:

   print("%-20s: %s" % ('Python',           sys.version_info))

Python              : (2, 6, 5, 'final', 0)

   print("%-20s: %s" % ('lxml.etree',       etree.LXML_VERSION))

lxml.etree          : (2, 3, 5, 0)

   print("%-20s: %s" % ('libxml used',      etree.LIBXML_VERSION))

libxml used         : (2, 7, 6)

   print("%-20s: %s" % ('libxml compiled',  etree.LIBXML_COMPILED_VERSION))

libxml compiled     : (2, 7, 6)

   print("%-20s: %s" % ('libxslt used',     etree.LIBXSLT_VERSION))

libxslt used        : (1, 1, 26)

   print("%-20s: %s" % ('libxslt compiled', etree.LIBXSLT_COMPILED_VERSION))

libxslt compiled    : (1, 1, 26)

Apache 2.2.14


Here is the source code that triggers the issue:

ID_TRANSFORM = os.environ['APPLICATION_WORKING_PATH']+'/statics/xsl/list.xsl'

styledoc = lxml.etree.parse(ID_TRANSFORM)

transform = lxml.etree.XSLT(styledoc)

doc_root = lxml.etree.XML(str(atom))


Could you help us on this case?


Regards


Sam Bull | 12 Jul 17:15 2014

Can I change maxvars?

I'm trying to process some XML files, and a few of them are several
thousand lines long, and with the moderately complicated XSL I'm using,
I seem to be hitting recursion limits.

I'm currently getting this message:
        lxml.etree.XSLTApplyError: xsltApplyXSLTTemplate: A potential
        infinite template recursion was detected.
        You can adjust maxTemplateVars (--maxvars) in order to raise the
        maximum number of variables/params (currently set to 15000).

It says I can adjust the value, but doesn't explain how, nor is this
value mentioned anywhere in the documentation.

I've just had to change the maxdepth, which can be done with
XSLT.set_global_max_depth(), but there doesn't appear to be an
equivalent for maxvars. How can I change this value?
Scott Young | 11 Jul 21:30 2014

ImportError when importing etree from lxml

Hello,

I am trying to run lxml (in order to run python-docx).  I can import lxml okay, but whenever I try to import etree from lxml, I get an ImportError:

ImportError: /usr/lib/x86_64-linux-gnu/libxslt.so.1: symbol xmlBufUse, version   LIBXML2_2.9.0 not defined in file libxml2.so.2 with link time reference

Here is my configuration information:
- Python 2.7.6 running with Enthought Canopy. 
- Ubuntu 14.04 on VMware Workstation 10.
- lxml 3.3.5
- libxml2 2.9.1+dfsg1-3ubuntu4
- libxslt1 1.1.28-2build1

It seems that it is calling for version 2.9.0 of libxml2, but I'm not sure why, because it appears that 2.9.1 has been out for a while.  I also cannot tell whether this is a problem with lxml, libxslt, or libxml2, so I will post this to each of the lists. 

Any guidance or suggestions would be appreciated!  Thank you!
 
Thanks,
Scott

Kun JIN | 11 Jul 11:00 2014

Problem loading TEI.rng

Hello Everyone,

I'm trying to use the method from the lxml website to load a RelaxNG file into lxml: tei_cmr.rng (https://www.dropbox.com/s/9y4aicw5ytg2v4x/tei_cmr.rng.zip).
But I am stuck at
          rng = etree.RelaxNG(rng_doc)
I don't know why; I get nothing back, not even an error.

My code:

  >>> from lxml import etree
  >>> f = open("tei_cmr.rng","r")
  >>> n = etree.parse(f)
  >>> rng = etree.RelaxNG(n)

I can't use an XML editor to validate my XML file because it is too big, so I
have to use this method. Could you suggest a solution?

Thank you in advance,

Best regards
Kun
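
For reference, a small self-contained sketch of the load-and-validate sequence described above, using a tiny invented schema instead of tei_cmr.rng. It only illustrates the calls involved; it does not explain why compiling the large TEI schema appears to hang.

```python
from io import BytesIO
from lxml import etree

# A deliberately tiny RelaxNG schema (hypothetical, not related to TEI):
# a <text> element containing one or more <w> elements.
SCHEMA = b"""<element name="text" xmlns="http://relaxng.org/ns/structure/1.0">
  <oneOrMore>
    <element name="w"><text/></element>
  </oneOrMore>
</element>"""

# Step 1: parse the schema as an ordinary XML document.
schema_doc = etree.parse(BytesIO(SCHEMA))
# Step 2: compile it into a validator (the step reported to block).
relaxng = etree.RelaxNG(schema_doc)

good = etree.XML("<text><w>a</w><w>b</w></text>")
bad = etree.XML("<text><x/></text>")

good_ok = relaxng.validate(good)   # validate() returns a boolean
bad_ok = relaxng.validate(bad)
print(good_ok, bad_ok)

# After a failed validation, the error log describes what went wrong.
for entry in relaxng.error_log:
    print(entry.line, entry.message)
```

Timing the two steps separately on the real schema would show whether the time is spent in parsing or in schema compilation.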

junkmail | 4 Jul 23:00 2014

lxml.etree.XMLSyntaxError when processing a 1 GB file

I am using iterparse.  I am working with a very large file, over one Gig.

I am getting 'lxml.etree.XMLSyntaxError'

The error occurs when I am about 300M into the file.

I keep a log of what is successfully processed and I can look at the output.  As such, I can ascertain where in the input file the error is occurring.

I have created a smaller test file from the master input file, copying elements before and after the point where the error occurs in the master file.

This test file is processed without error.

Thanks for your guidance.

Robert
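
One way to narrow this down might be to catch the exception and report the position the parser itself gives, rather than inferring it from the processing log. The sketch below uses a small invented document with a deliberate tag mismatch; the function name and sample data are hypothetical.

```python
from io import BytesIO
from lxml import etree


def count_records(source, tag):
    """Count <tag> elements; return (count, error), error being None on success."""
    count = 0
    try:
        for _event, elem in etree.iterparse(source, events=("end",), tag=tag):
            count += 1
            elem.clear()  # release the content of the processed element
    except etree.XMLSyntaxError as exc:
        return count, exc
    return count, None


# Line 4 opens <oops> and never closes it, so </rec> does not match.
broken = BytesIO(
    b"<root>\n<rec>1</rec>\n<rec>2</rec>\n<rec>3<oops></rec>\n</root>"
)

count, error = count_records(broken, "rec")
if error is not None:
    # The message and line number refer to the input as the parser saw it.
    print("stopped after %d records, line %s: %s" % (count, error.lineno, error))
```

If the reported line in the full file looks fine when copied into a smaller file, the cause may lie in something the copy does not preserve, such as the byte encoding of the surrounding data.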


Stefan Behnel | 4 Jul 05:40 2014

Re: Extra content at the end of the document

junkmail <at> visumpoint.com, 04.07.2014 04:42:
> I am not familiar with using iterparse.  Are there examples/documentation you 
> can direct me to?

https://duckduckgo.com/?q=lxml+iterparse

Stefan
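
As a starting point, here is a minimal sketch of a commonly used iterparse pattern for large inputs: handle each element on its "end" event, then clear it and drop already-processed siblings so the tree does not grow with the file. The element and attribute names are invented for the example.

```python
from io import BytesIO
from lxml import etree


def iter_records(source, tag):
    """Yield (id, text) for each <tag> element while keeping memory bounded."""
    for _event, elem in etree.iterparse(source, events=("end",), tag=tag):
        yield elem.get("id"), elem.text
        # Free the element's own content ...
        elem.clear()
        # ... and remove earlier siblings that were already processed.
        while elem.getprevious() is not None:
            del elem.getparent()[0]


data = BytesIO(b"<root><item id='1'>a</item><item id='2'>b</item></root>")
records = list(iter_records(data, "item"))
print(records)
```

In real use the source would typically be a file name or an open binary file rather than a BytesIO object.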


Windows exes/wheels for 3.3 and 3.4?

Hello,
I tried to install lxml on Windows from PyPI, but installer exes and
wheels are available for 3.2 only, and there are no 3.3/3.4 builds
there.  Why is that?  Will they ever be available?

-- 
Chris “Kwpolska” Warrick <http://kwpolska.tk>
PGP: 5EAAEA16
stop html mail | always bottom-post | only UTF-8 makes sense
Anton Patrushev | 1 Jul 21:46 2014

C-extension and nogil

Hi Everyone,

We are trying to create a C extension to offload XML parsing from our main thread.
We need an lxml.etree.Element as the result of parsing.
Parsing will be done in separate threads (from a pool), with an in-memory string as the input source.
How can we easily avoid acquiring the GIL in our parsing threads (from the pool)?

Are there any code samples, or can you suggest where in the lxml code base to look?
AFAICS, lxml releases the GIL only internally, after some preparation steps, so if we want to use fromstring we must acquire the GIL first.

Thanks in advance,
Anton
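
This does not answer the C-level question, but as a point of comparison, here is a sketch of the pure-Python variant: calling fromstring() from a thread pool and relying on lxml releasing the GIL internally while libxml2 parses. It assumes the default parser is used in each thread; the payloads and function name are invented.

```python
from concurrent.futures import ThreadPoolExecutor
from lxml import etree


def parse(payload):
    # fromstring() needs the GIL on entry and exit; lxml releases it
    # internally around the actual libxml2 parsing work.
    return etree.fromstring(payload)


# Invented in-memory inputs standing in for the real messages.
payloads = [("<doc n='%d'><v>%d</v></doc>" % (i, i * i)).encode("utf-8")
            for i in range(8)]

with ThreadPoolExecutor(max_workers=4) as pool:
    # map() returns results in input order.
    roots = list(pool.map(parse, payloads))

print([root.get("n") for root in roots])
```

Whether this is enough depends on how much of the per-document time is spent inside the parser compared with the Python-level setup around it.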