krey | 21 Aug 14:45 2015

where do I report a bug?

Hi there,

I think I've found a bug in lxml, but I wasn't able to find a bug tracker (the stuff on Launchpad is ancient, and the GitHub repository doesn't have an 'issues' page).

parser.pxi in lxml.etree.ParseError.__init__ (src/lxml/lxml.etree.c:89005)()

TypeError: __init__() takes exactly 5 positional arguments (2 given)

I don't really have the code/input to reproduce it, but it should be easy enough to fix.

Thanks,

Kris

Sent from an iToaster.
_________________________________________________________________
Mailing list for the lxml Python XML toolkit - http://lxml.de/
lxml <at> lxml.de
https://mailman-mail5.webfaction.com/listinfo/lxml
Eero Helenius | 17 Aug 08:24 2015
Picon

Querying XML catalogs for URIs

Hi all,

I’m developing an lxml-based Python 3 app that uses the xml-model processing instruction to validate
XML files against the schema associated with that file.

Here’s an example of an xml-model PI:

 <?xml-model href="http://www.docbook.org/xml/5.0/rng/docbook.rng"
             type="application/xml"
             schematypens="http://relaxng.org/ns/structure/1.0"?>

My app would then validate the file that has this PI against the DocBook 5.0 RelaxNG schema defined in the
@href attribute. For more information on the xml-model PI, see http://www.w3.org/TR/xml-model/.

I can get the @href attribute of the PI just fine and use the schema it points to to validate the file, but I
would really like to be able to query my XML catalog for the URI in the @href attribute and that way use a local
copy of the schema instead of having to download it from an external source.

The “original” libxml2 Python bindings have a function called catalogResolveURI which seems to do
precisely what I need. Here’s an example lifted from Stack Overflow (http://stackoverflow.com/a/7229470/825783):

  import libxml2
  libxml2.loadCatalog('catalog.xml')
  print libxml2.catalogResolveURI('file:///common/logo.xml')
  file:///home/kst/svn/TOOLS/Docbook/common/logo.xml

I would use that function, but there doesn’t seem to be a version of libxml2 bindings for Python 3, unfortunately.

So my question is, does lxml offer a way to do this? I tried browsing through the API docs and some parts of the
code but I couldn’t find any way to do it.

If not, would it be possible to add this feature? Or is it maybe possible to use the public C API to add this
feature? If so, could you give me any pointers on how to achieve that?

I’m guessing the libxml2 xmlACatalogResolveURI function
(http://xmlsoft.org/html/libxml-catalog.html#xmlACatalogResolveURI) would do the job, so I guess
I’d just need a way to call it via lxml.
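
In the meantime, a possible stopgap is to look the URI up in the catalog file directly. This is only a sketch in pure Python 3, not the libxml2 resolver: it handles just the `uri` and `rewriteURI` entries of an OASIS XML catalog and ignores `nextCatalog`, delegation and xml:base. The function name `resolve_uri`, the example catalog and its local file paths are all made up for illustration.

```python
import xml.etree.ElementTree as ET

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"


def resolve_uri(catalog_xml, uri):
    """Return the catalog mapping for `uri`, or None if there is none."""
    catalog = ET.fromstring(catalog_xml)
    # Exact matches from <uri name="..." uri="..."/> are checked first.
    for entry in catalog.iter("{%s}uri" % CATALOG_NS):
        if entry.get("name") == uri:
            return entry.get("uri")
    # Otherwise use the longest matching <rewriteURI> prefix.
    best = None
    for entry in catalog.iter("{%s}rewriteURI" % CATALOG_NS):
        prefix = entry.get("uriStartString", "")
        if prefix and uri.startswith(prefix):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, entry.get("rewritePrefix", ""))
    if best is None:
        return None
    return best[1] + uri[len(best[0]):]


# Hypothetical catalog; the file:// targets are placeholders.
EXAMPLE_CATALOG = """<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <uri name="http://www.docbook.org/xml/5.0/rng/docbook.rng"
       uri="file:///usr/share/xml/docbook/rng/docbook.rng"/>
  <rewriteURI uriStartString="http://example.org/schemas/"
              rewritePrefix="file:///opt/schemas/"/>
</catalog>"""

print(resolve_uri(EXAMPLE_CATALOG,
                  "http://www.docbook.org/xml/5.0/rng/docbook.rng"))
```

If the lookup returns None, the app could fall back to fetching the @href as it does now.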

Many thanks in advance!

Best,

Eero Helenius
Corey Boyle | 11 Aug 20:16 2015

http://lxml.de/html5parser.html

When I run the example code at http://lxml.de/html5parser.html I get...

<html:table xmlns:html="http://www.w3.org/1999/xhtml"><html:tbody><html:tr><html:td>foo</html:td></html:tr></html:tbody></html:table>

but the page says I should get...

'<table><tbody><tr><td>foo</td></tr></tbody></table>'

How can I get rid of the namespace stuff?

This is the exact code I am running...

from lxml.html import tostring, html5parser
print tostring(html5parser.fromstring("<table><td>foo"))
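
One generic workaround, independent of the parser, would be to strip the namespace from each tag after parsing. The following is a sketch only: `strip_namespaces` is a made-up helper, the input is the serialised output quoted above, and it falls back to the standard library's ElementTree if lxml is not importable.

```python
try:
    from lxml import etree
except ImportError:  # fall back to the standard library for this sketch
    import xml.etree.ElementTree as etree


def strip_namespaces(root):
    """Remove the '{namespace}' part from every element tag, in place."""
    for el in root.iter():
        # Comments and processing instructions have a non-string tag.
        if isinstance(el.tag, str) and el.tag.startswith("{"):
            el.tag = el.tag.split("}", 1)[1]
    # lxml keeps unused xmlns declarations unless asked to drop them.
    cleanup = getattr(etree, "cleanup_namespaces", None)
    if cleanup is not None:
        cleanup(root)
    return root


namespaced = (
    '<html:table xmlns:html="http://www.w3.org/1999/xhtml"><html:tbody>'
    '<html:tr><html:td>foo</html:td></html:tr></html:tbody></html:table>'
)
root = strip_namespaces(etree.fromstring(namespaced))
tags = [el.tag for el in root.iter()]
print(tags)
```

It may also be possible to pass namespaceHTMLElements=False to html5parser.HTMLParser and hand that parser to fromstring(), but I have not verified that against this lxml version.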
Tim Browski | 29 Jul 23:44 2015

issue with findtext and find

Hi,

In the following scenario, find and findtext do not return the element. Is this intended behaviour or a possible bug?


<a> <b> </b> <b> <c></c> </b> </a>

When I call elm.find("b/c") or elm.findtext("b/c") on the element, it returns None.

However, when I call find with the exact XPath location, it returns the expected result.
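
A minimal reproduction sketch, assuming `elm` is the <a> element itself (the path "b/c" is relative to the element that find is called on, so calling it on a different element gives a different result). It falls back to the standard library's ElementTree if lxml is not importable.

```python
try:
    from lxml import etree
except ImportError:  # fall back to the standard library for this sketch
    import xml.etree.ElementTree as etree

elm = etree.fromstring("<a> <b> </b> <b> <c></c> </b> </a>")

# Relative path: children named 'b' of elm, then their children named 'c'.
found = elm.find("b/c")
text = elm.findtext("b/c")

print(found)
print(repr(text))
```

If `found` is None here, the exact lxml version would be useful to know.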

Best,
Tim
    


Gilles Lenfant | 20 Jul 14:37 2015

Weird (?) tree.getpath behaviour

Hi,

I use tree.getpath(element) to solve a problem, and I need it to return a meaningful path.

In the example at http://pastebin.com/1Xjprfui the getpath method behaves exactly as expected when I use it in an XML doc with no namespace or if all elements have a namespace prefix (examples 1 and 3).

But if I use tree.getpath(element) on a document with a default namespace (second example), I get weird getpath results like "/*", "/*/*/*[2]" and so on, where I expected meaningful paths like "/root", "/root/parent/child[2]".

Is this a bug or a feature? If it is a feature, is there some workaround to get meaningful path expressions?
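
To make clear what I mean by "meaningful", here is a sketch of a helper that builds such a path from local names. `local_path` is a made-up name, the namespace URI is a placeholder, and the result is for display only: it is not a valid XPath for a namespaced document, since the names carry no prefix. It falls back to the standard library's ElementTree if lxml is not importable.

```python
try:
    from lxml import etree
except ImportError:  # fall back to the standard library for this sketch
    import xml.etree.ElementTree as etree


def local_path(root, target):
    """Build a readable path to `target` using local (prefix-free) names."""
    # Map every child to its parent so the sketch works without getparent().
    parents = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while node is not None:
        name = node.tag.rsplit("}", 1)[-1]  # drop '{namespace}'
        parent = parents.get(node)
        if parent is not None:
            same = [c for c in parent if c.tag == node.tag]
            if len(same) > 1:
                # Position among same-named siblings, 1-based as in XPath.
                name = "%s[%d]" % (name, same.index(node) + 1)
        steps.append(name)
        node = parent
    return "/" + "/".join(reversed(steps))


doc = etree.fromstring(
    '<root xmlns="http://example.com/ns">'
    "<parent><child/><child/></parent></root>"
)
path = local_path(doc, doc[0][1])
print(path)
```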

Thanks in advance for any help.
--
Gilles Lenfant

Rice, Nathan Alexander | 13 Jul 18:29 2015

debugging issues with application of XSLT

Hello,

I'm currently attempting to apply an XSL transform to an XML document using lxml. Unfortunately, when I do
this I get the following traceback:

    /usr/lib/python2.7/dist-packages/lxml/etree.so in lxml.etree.XSLT.__call__ (src/lxml/lxml.etree.c:160146)()

    XSLTApplyError: Failed to evaluate the 'select' expression.

Given there are quite a few select expressions in the XSL document (and more than one may be an issue), is
there any way to get lxml to spit out the specific expression that triggered this error?
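
For reference, this is roughly what I am running, reduced to a sketch. The stylesheet is a made-up stand-in whose select refers to an undeclared variable, chosen only to provoke a runtime error. The XSLT object's error_log is the one place I know of that might carry line numbers for the failing instruction.

```python
from lxml import etree

# Hypothetical stylesheet: the 'select' uses a variable that is never declared.
STYLESHEET = b"""<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <out><xsl:value-of select="$missing"/></out>
  </xsl:template>
</xsl:stylesheet>"""

transform = etree.XSLT(etree.XML(STYLESHEET))
try:
    transform(etree.XML(b"<doc/>"))
except etree.XSLTError:
    # XSLTApplyError is a subclass; the details are read from the log below.
    pass

# Each log entry records the stylesheet line it was reported for.
entries = [(entry.line, entry.message) for entry in transform.error_log]
for line, message in entries:
    print("line %s: %s" % (line, message))
```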

Thank you,

Nathan
Bjorn Holmgren | 13 Jul 14:07 2015

Windows 7, x64

Hi,


I need to install lxml, but it fails. I install it with: pip install lxml. I have Python 3.4 installed on a Windows 7, x64 computer. I have Visual Studio 2010 installed as required by Python 3.4.


I get this error message:

    C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\BIN\cl.exe /c /nologo /Ox /MD /W3 /GS- /DNDEBUG -Ic:\users\ashhab\appdata\local\temp\pip-build-0r0v7a\lxml\src\lxml\includes -IC:\Python33\include -IC:\Python33\include /Tcsrc\lxml\lxml.etree.c /Fobuild\temp.win32-3.3\Release\src\lxml\lxml.etree.obj -w
    cl : Command line warning D9025 : overriding '/W3' with '/w'
    lxml.etree.c
    c:\users\ashhab\appdata\local\temp\pip-build-0r0v7a\lxml\src\lxml\includes\etree_defs.h(14) : fatal error C1083: Cannot open include file: 'libxml/xmlversion.h': No such file or directory
    C:\Python33\lib\distutils\dist.py:258: UserWarning: Unknown distribution option: 'bugtrack_url'
      warnings.warn(msg)
    error: command '"C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\BIN\cl.exe"' failed with exit status 2


Can someone help me to install lxml in Windows 7?


Regards,
Björn






John Munroe | 26 Jun 08:14 2015

Error from .itertext() “ValueError: Input object has no element: HtmlComment”

Hi

I'm trying to iterate through the text content of a subtree using elt.itertext() (v3.5.0b1 git master
branch) as follows:

import lxml.html.soupparser as soupparser
import requests

doc = requests.get("http://f10.5post.com/forums/showthread.php?t=1142017").content
tree = soupparser.fromstring(doc)

nodes = tree.getchildren()

for elt in nodes:
    for t in elt.itertext():
        print(t)

But I keep getting an error saying

 File "src/lxml/iterparse.pxi", line 248, in lxml.etree.iterwalk.__init__ (src/lxml/lxml.etree.c:134032)
 File "src/lxml/apihelpers.pxi", line 67, in lxml.etree._rootNodeOrRaise (src/lxml/lxml.etree.c:15220)
ValueError: Input object has no element: HtmlComment

Is there a way to skip all HTML comments? Also, what does this error actually mean?
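
One workaround I am considering is to skip every node whose tag is not a string before calling itertext() on it, on the assumption that comments and processing instructions are the nodes with a non-string tag. A sketch, with a made-up helper name `iter_element_text` and a hand-built tree instead of the forum page; it falls back to the standard library's ElementTree if lxml is not importable.

```python
try:
    from lxml import etree
except ImportError:  # fall back to the standard library for this sketch
    import xml.etree.ElementTree as etree


def iter_element_text(nodes):
    """Yield text from real elements only, skipping comments and PIs."""
    for node in nodes:
        if not isinstance(node.tag, str):
            continue  # comment or processing instruction
        for text in node.itertext():
            yield text


# Hand-built stand-in for the parsed page: a comment next to an element.
root = etree.Element("html")
root.append(etree.Comment(" a comment "))
body = etree.SubElement(root, "body")
body.text = "visible text"

texts = list(iter_element_text(list(root)))
print(texts)
```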

Any help will be appreciated.

Thanks

John

John Munroe | 24 Jun 10:10 2015

Build error: No such file or directory: 'src/lxml/lxml.etree.c'

Hi,

I've grabbed 3.5.0beta1 from GitHub and tried building it. I'm on OS X and have libxml2 2.9.2 rather than
libxml2 2.9.1. So, I’m using the following command to build:

python setup.py build --static-deps --libxml2-version=2.9.2 --without-cython

but I keep getting an error saying

clang: error: no such file or directory: 'src/lxml/lxml.etree.c'
clang: error: no input files

Indeed, the C file doesn't exist and isn't part of the distribution.

Am I missing something? I'd like to have it installed in a virtualenv (eventually).

Any help will be appreciated.

Thanks

John

Stefan Behnel | 21 Jun 13:41 2015

Re: Can I change maxvars?

Sam Bull schrieb am 21.06.2015 um 12:12:
> On Mon, 2014-07-14 at 20:52 +0200, Stefan Behnel wrote:
>> Sam Bull, 12.07.2014 17:15:
>>> I'm trying to process some XML files, and a few of them are several
>>> thousand lines long, and with the moderately complicated XSL I'm using,
>>> I seem to be hitting recursion limits.
>>>
>>> I'm currently getting this message:
>>>         lxml.etree.XSLTApplyError: xsltApplyXSLTTemplate: A potential
>>>         infinite template recursion was detected.
>>>         You can adjust maxTemplateVars (--maxvars) in order to raise the
>>>         maximum number of variables/params (currently set to 15000).
>>>
>>> It says I can adjust the value, but doesn't explain how, nor is this
>>> value mentioned anywhere in the documentation.
>>>
>>> I've just had to change the maxdepth, which can be done with
>>> XSLT.set_global_max_depth(), but there doesn't appear to be an
>>> equivalent for maxvars. How can I change this value?
>>
>> You can't currently. The problem is, it was new in libxslt 1.1.27, and even
>> the next lxml release will still support everything back to 1.1.23, so this
>> needs a little C level hacking to support depending on the libxslt version
>> it compiles against.
>>
>> The upside is that libxslt 1.1.27 also introduced a per-context setting
>> (maxTemplateVars), i.e. you can define the value for each stylesheet run
>> rather than setting a global value. A new keyword argument for XSLT()
>> should work nicely here, e.g. "max_recursion_vars". The same applies to
>> "maxTemplateDepth" in 1.1.27, which could be set as "max_recursion_depth"
>> in XSLT().
> 
> Don't suppose there's been any progress on this?

No. Pull requests still welcome.

Stefan

