Adarsh LR | 16 Dec 17:56 2014

Help me on installation please ...

Namaskar ...

I am from India. Can anyone help me with a quick installer for pip? I am not even able to install this Python script.

Anyway, I am also working on this myself.

Let's go.

Adarsh L Raju
_________________________________________________________________
Mailing list for the lxml Python XML toolkit - http://lxml.de/
lxml@lxml.de
https://mailman-mail5.webfaction.com/listinfo/lxml
Ken Norris | 16 Dec 08:39 2014

Trouble running code on Windows Server 2012

Hi, 

I'm having trouble running code I wrote that uses lxml in Python on a Windows Server 2012 machine. I originally wrote the script on Mac OS X, where it works well. When I run the same script on the server, it runs without error and produces PDFs, but when I try to open them they are corrupted or show up as blank. I followed some steps (installing the dependencies of lxml as described in the documentation, and resetting the PATH environment variable to make sure the dependencies resolve to the proper place), to no avail. I'll attach the script.

I'm doing research in the economics department at UBC in Vancouver, Canada. My coding skills are coming along but they're far from perfect. Any help you could provide would be great. 

Thanks, 

Ken
Attachment (Test.py): text/x-python-script, 1612 bytes
_________________________________________________________________
Stefan Behnel | 13 Dec 19:36 2014

Re: Support default_namespace parameter in write method?

Charlie Clark wrote on 07.12.2014 at 12:00:
> On .12.2014 at 10:05, Stefan Behnel wrote:
>> Not sure. It's due to an inherent difference in the tree models of libxml2
>> and ElementTree: the latter does not know about prefixes at all and handles
>> them only in the serialiser, whereas libxml2 takes them from the in-memory
>> tree. There is no override in libxml2's serialiser, so it would require
>> modifying the tree's namespaces during serialisation (i.e. modification
>> during a supposed read-only operation). I guess it's worth trying out,
>> though.
> 
> I thought there might be something like that. But would it be possible to
> allow the registration of an empty namespace (as ElementTree allows)?
> Currently this is only possible with nsmap. BTW. what are the constraints
> on nsmap? The attribute is not writable but __setitem__ does not raise an
> exception.

Yes, it's actually documented (see docstring) that changing an Element's
nsmap has no effect. The property returns a real dict that contains all
prefix-namespace mappings known in the context of this Element in the tree.
But it has no actual connection to that Element that could pass back
modifications.
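
A minimal sketch of the behaviour described above, assuming a reasonably
recent lxml; the namespace URIs and element names are made up for
illustration:

```python
from lxml import etree

root = etree.fromstring(
    '<a:root xmlns:a="http://example.com/a"><a:child/></a:root>')
child = root[0]

# The child sees the prefix declared on its ancestor.
inherited = child.nsmap

# Mutating the returned dict raises no error ...
mapping = child.nsmap
mapping['b'] = 'http://example.com/b'

# ... but the element is unaffected: each read builds a fresh dict.
after = child.nsmap
serialised = etree.tostring(root)
```

The extra prefix appears neither in a later read of nsmap nor in the
serialised output.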

I think the problem is that it was never obvious to me what the semantics
should be here. It feels wrong if modifying a mapping that includes all
ancestor namespaces changes the namespaces defined on a specific node. And
would it then update the subtree of that Element? Then reading the property
would give you access to the ancestor state and modifying it would change
the children? Strange. If you change the namespace of a prefix, would it
then have to go back to the ancestor that defined the modified prefix and
change its entire subtree? That sounds even worse. And what would happen
if you deleted a prefix that's in use?

I agree that the feature of modifying the namespace mapping is missing,
though. If someone has a good idea how this should work, I'm all eyes.

> register_namespace("", "http://example.com/whatever") # works in
> ElementTree but not in lxml

Ah, yes. Not sure the error you get was intended. It would be easy to make
this work, but then, register_namespace() changes global library state.
Globally setting up a prefix-namespace mapping is not a great idea already
(it's ok as long as the prefix is a generally accepted de-facto standard),
but setting a global default namespace seems like asking for trouble and
interference with other code. The mapping is bidirectional and if multiple
default namespace mappings are registered, one will overwrite the other and
the result will depend on who happens to be last. Rearrange some imports
and lose your pretty output? If you're lucky, you'll at least see some of
your tests fail, but I guess prefixes are rarely tested for.
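
For comparison, a default namespace can be set per tree rather than
globally by passing nsmap when the root element is created. This is only a
sketch; the element names are invented and the URI is the one from the
example above:

```python
from lxml import etree

NS = "http://example.com/whatever"

# Map the default (empty) prefix to the namespace for this tree only.
root = etree.Element("{%s}root" % NS, nsmap={None: NS})
etree.SubElement(root, "{%s}child" % NS)

# The namespace is declared once on the root and no prefixes are used.
output = etree.tostring(root)
```

Nothing is registered at the library level here, so other code that
serialises the same namespace is not affected.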

Stefan

_________________________________________________________________
Martin Mueller | 11 Dec 05:25 2014

using Xpath to look for one or another element


I know how to use iterfind to select one type of element, as in

speech.iterfind('.//tei:w', namespaces =
{'tei':'http://www.tei-c.org/ns/1.0'})

which selects all the <w> descendants of the parent element. But what if I
want to find <w> OR <c> elements? As I understand from a posting on
Stack Overflow,

iterfind('.//tei:w |.//tei:c',namespaces =
{'tei':'http://www.tei-c.org/ns/1.0'})

should do the trick in regular expression mode. But it doesn't seem to
work. Am I doing something wrong or have I hit a limit of the program?
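
A possible explanation, offered as a sketch rather than a definitive
answer: iterfind() goes through lxml's ElementPath implementation, which
covers only a subset of XPath and, as far as I know, has no union
operator. The xpath() method supports full XPath 1.0, and iter() accepts
several tag names. The sample document below is invented:

```python
from lxml import etree

TEI = {'tei': 'http://www.tei-c.org/ns/1.0'}

speech = etree.fromstring(
    '<sp xmlns="http://www.tei-c.org/ns/1.0">'
    '<w>Hello</w><c> </c><pc>,</pc><w>world</w>'
    '</sp>'
)

# Full XPath 1.0: the union operator works with xpath().
hits = speech.xpath('.//tei:w | .//tei:c', namespaces=TEI)

# Alternative without XPath: iter() takes more than one tag name.
W = '{%s}w' % TEI['tei']
C = '{%s}c' % TEI['tei']
same = list(speech.iter(W, C))

texts = [el.text for el in hits]
```

Both calls return the <w> and <c> elements in document order and skip
the <pc> element.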

_________________________________________________________________
D.H.J. Takken | 5 Dec 15:31 2014

etree.xmlfile shadows exceptions thrown by output

Hello,

I use etree.xmlfile and a coroutine to generate XML into a file-like
object that does post-processing on the XML data. Now, if I use the
send() method to produce invalid XML data, the post-processor throws an
exception.

Here, I see a little problem: No matter what exception is thrown by the
post-processor, the send() method always generates a generic IO
exception. Apparently, lxml catches any exception thrown by the
file-like object and produces a generic IO exception instead. This
means that there is no way to know what exactly went wrong in the
post-processor. That information is lost.

Inside the coroutine, I *do* get the original exception from the
post-processor, but doing the error handling there is kind of ugly.

Is there a better way to solve this problem?
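
One workaround that might be less ugly, sketched here under the
assumption that lxml only needs write() (and possibly flush()) on the
output object: wrap the post-processor in a small object that remembers
the original exception before re-raising it. The class names
RememberingWriter and PickyTarget are hypothetical, and PickyTarget merely
stands in for the real post-processor:

```python
class RememberingWriter(object):
    """Hypothetical wrapper: forwards write() and keeps the first exception."""

    def __init__(self, target):
        self._target = target
        self.error = None

    def write(self, data):
        try:
            return self._target.write(data)
        except Exception as exc:
            # Keep the original exception so the caller can inspect it later.
            if self.error is None:
                self.error = exc
            raise

    def flush(self):
        flush = getattr(self._target, 'flush', None)
        if flush is not None:
            flush()


class PickyTarget(object):
    """Stand-in for the post-processor: rejects anything containing b'bad'."""

    def write(self, data):
        if b'bad' in data:
            raise ValueError('post-processor rejected %r' % (data,))
        return len(data)


writer = RememberingWriter(PickyTarget())
writer.write(b'<ok/>')
try:
    writer.write(b'<bad/>')
except ValueError:
    pass  # writer.error now holds the original ValueError
```

The wrapper would be passed to etree.xmlfile() in place of the
post-processor; after catching the generic IO exception from send(), the
caller could look at writer.error to see what really went wrong.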

Thanks a lot!

_________________________________________________________________
Charlie Clark | 4 Dec 08:25 2014

Support default_namespace parameter in write method?

Hiya,

since Python 2.7 the xml.etree's ElementTree has supported a default  
namespace when serialising XML:

write(file, encoding="us-ascii", xml_declaration=None,  
default_namespace=None, method="xml")

Would it be possible to support the same signature in lxml? This would make
coding for both libraries easier where a default namespace is desired, as
in certain parts of openpyxl, due to overly fussy or downright buggy
clients.
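
For reference, this is roughly how the parameter is used with the standard
library today. It is a small sketch; the namespace URI and element names
are placeholders:

```python
import io
import xml.etree.ElementTree as ET

NS = "http://example.com/whatever"

root = ET.Element("{%s}root" % NS)
ET.SubElement(root, "{%s}child" % NS)

# default_namespace makes the serialiser emit xmlns="..." instead of
# generating an ns0: prefix for every element.
buf = io.BytesIO()
ET.ElementTree(root).write(buf, default_namespace=NS)
output = buf.getvalue()
```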

Charlie
-- 
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Kronenstr. 27a
Düsseldorf
D- 40217
Tel: +49-211-600-3657
Mobile: +49-178-782-6226
_________________________________________________________________
Holger Joukl | 3 Dec 11:54 2014

tests failing due to encoding errors on non-utf-8 system


Hi,

I just encountered failing lxml tests on a system with non-UTF-8 locale:

(don't be confused by the overall test case count, I've disabled some
tests)

...
1626/1626 (100.0%): txt (xpathxslt)
Doctest: xpathxslt.txt
======================================================================
ERROR: test_etree_parse_io_error (lxml.tests.test_io.ETreeIOTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/apps/prod/releases/2.0/lib/python2.7/unittest/case.py", line 331, in run
    testMethod()
  File "/var/tmp/hjoukl/BUILD/NEW/64bit/gcc/2014-QX/lxml-3.4.1/src/lxml/tests/test_io.py", line 276, in test_etree_parse_io_error
    dn = tempfile.mkdtemp(prefix=dirnameRU)
  File "/apps/prod/releases/2.0/lib/python2.7/tempfile.py", line 329, in mkdtemp
    _os.mkdir(file, 0700)
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u0161' in position 6: ordinal not in range(256)

======================================================================
ERROR: test_etree_parse_io_error (lxml.tests.test_io.ElementTreeIOTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/apps/prod/releases/2.0/lib/python2.7/unittest/case.py", line 331, in run
    testMethod()
  File "/var/tmp/hjoukl/BUILD/NEW/64bit/gcc/2014-QX/lxml-3.4.1/src/lxml/tests/test_io.py", line 276, in test_etree_parse_io_error
    dn = tempfile.mkdtemp(prefix=dirnameRU)
  File "/apps/prod/releases/2.0/lib/python2.7/tempfile.py", line 329, in mkdtemp
    _os.mkdir(file, 0700)
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u0161' in position 6: ordinal not in range(256)

----------------------------------------------------------------------
Ran 1626 tests in 56.208s

FAILED (errors=2)
Skipping tests in lxml.cssselect - external cssselect package is not installed
Comparing with ElementTree 1.3.0

TESTED VERSION: 3.4.1
    Python:           sys.version_info(major=2, minor=7, micro=5, releaselevel='final', serial=0)
    lxml.etree:       (3, 4, 1, 0)
    libxml used:      (2, 9, 1)
    libxml compiled:  (2, 9, 1)
    libxslt used:     (1, 1, 28)
    libxslt compiled: (1, 1, 28)

make: *** [test_inplace] Error 1

This is caused by characters in the temp directory's path that are not
representable in the system's encoding:

    def test_etree_parse_io_error(self):
        # this is a directory name that contains characters beyond latin-1
        dirnameEN = _str('Directory')
        dirnameRU = _str('Каталог')
        filename = _str('nosuchfile.xml')
        dn = tempfile.mkdtemp(prefix=dirnameEN)
        try:
            self.assertRaises(IOError, self.etree.parse, os.path.join(dn, filename))
        finally:
            os.rmdir(dn)
        dn = tempfile.mkdtemp(prefix=dirnameRU)
        try:
            self.assertRaises(IOError, self.etree.parse, os.path.join(dn, filename))
        finally:
            os.rmdir(dn)

I'm unsure what a proper fix for this should look like. Refactor into two
separate tests and disable the non-ASCII directory path test based on
sys.getfilesystemencoding()?

Or rather do
dn = tempfile.mkdtemp(prefix=dirnameRU.encode('utf-8'))
?
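
The first option could look roughly like the sketch below. It assumes
unittest.skipUnless is acceptable (it needs Python 2.7 or later), and the
helper fs_can_encode, the constant NON_LATIN1_PREFIX and the test class
name are hypothetical:

```python
import sys
import unittest

# Hypothetical stand-in for the non-ASCII prefix used by the real test.
NON_LATIN1_PREFIX = u'dir\u0161'


def fs_can_encode(name, encoding=None):
    """Return True if `name` can be encoded in the filesystem encoding."""
    encoding = encoding or sys.getfilesystemencoding() or 'ascii'
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


class ParseIOErrorTest(unittest.TestCase):

    @unittest.skipUnless(fs_can_encode(NON_LATIN1_PREFIX),
                         'filesystem encoding cannot represent the prefix')
    def test_non_ascii_directory(self):
        # The mkdtemp()/parse() body of the original test would go here.
        pass
```

With this split the ASCII variant always runs, and the non-ASCII variant
is reported as skipped rather than as an error on latin-1 systems.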

Holger

Landesbank Baden-Wuerttemberg
Anstalt des oeffentlichen Rechts
Hauptsitze: Stuttgart, Karlsruhe, Mannheim, Mainz
HRA 12704
Amtsgericht Stuttgart

_________________________________________________________________
Ovnicraft | 22 Nov 20:42 2014

XML Digital Signature (Xades)

Hi to all !

Maybe this is off-topic, but I want to raise it here and get your feedback.

I am working on digital signatures for my country and I want to work purely in Python, so I started by reading the spec for XAdES [1], which is nothing complicated. I see that lxml can help me, but I would like to know whether anyone here has experience with this.

I have an example XML file [2] that shows basic use of an X.509 certificate and an RSA key.

Any comments are really appreciated!

Regards !

[1] http://www.w3.org/TR/XAdES/
[2] https://gist.github.com/ovnicraft/c8645904e01e0a071199

--
 
 
Cristian Salamea
about.me/ovnicraft
 
_________________________________________________________________
D.H.J. Takken | 21 Nov 17:36 2014

Re: Efficient incremental parsing using etree.iterparse

Ah wow, one of the hidden treasures of lxml.. :)

Thanks!

On 11/21/2014 12:58 PM, Steven Vereecken wrote:
> Hello,
> 
> It doesn't seem to be mentioned in the docs, but you *can* specify
> multiple tag names (just use a list of names instead of one string).
> I'm not really sure where I picked that up myself, maybe from this
> mention in the changelog (features added in 3.0alpha1) :
> "Tree iteration and iterparse() with a selective tag argument supports
> passing a set of tags. Tree nodes will be returned by the iterators if
> they match any of the tags."
> 
> greetings,
> Steven
> 
> 2014-11-21 11:47 GMT+01:00 D.H.J. Takken <d.h.j.takken <at> xs4all.nl
> <mailto:d.h.j.takken <at> xs4all.nl>>:
> 
>     Hello,
> 
>     I need to process very large XML files as quickly as possible. The XML
>     processing does not require processing of every single tag, so I was
>     looking at the iterparse method.
> 
>     Unfortunately, the iterparse method only allows one tag name to be
>     specified for triggering events, while I need to do processing on two or
>     three different tags. This would still be much more efficient than using
>     the target parser method, because the XML data contains many more tags
>     that do not require immediate processing.
> 
>     So, it looks like I need something in between processing *all* tags and
>     processing a single tag. Is there any way to do that?
> 
>     Thanks for any hints!
> 
> 
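
A small sketch of the multi-tag form mentioned in the quoted reply,
assuming lxml 3.0 or later; the tag names and the sample document are
invented:

```python
import io
from lxml import etree

xml = b'<root><a>1</a><skip>x</skip><b>2</b><skip>y</skip><a>3</a></root>'

seen = []
# Only 'end' events for <a> and <b> are reported; <skip> never shows up.
for event, elem in etree.iterparse(io.BytesIO(xml), events=('end',),
                                   tag=('a', 'b')):
    seen.append((elem.tag, elem.text))
    elem.clear()  # drop the content of elements that were already handled
```

The loop sees the <a> and <b> elements in document order.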

_________________________________________________________________
D.H.J. Takken | 21 Nov 11:47 2014

Efficient incremental parsing using etree.iterparse

Hello,

I need to process very large XML files as quickly as possible. The XML
processing does not require processing of every single tag, so I was
looking at the iterparse method.

Unfortunately, the iterparse method only allows one tag name to be
specified for triggering events, while I need to do processing on two or
three different tags. This would still be much more efficient than using
the target parser method, because the XML data contains many more tags
that do not require immediate processing.

So, it looks like I need something in between processing *all* tags and
processing a single tag. Is there any way to do that?

Thanks for any hints!
_________________________________________________________________
Stefan Behnel | 20 Nov 19:57 2014

lxml 3.4.1 released

Hi all,

I just released an "incremental improvements" version of lxml, 3.4.1, with
only one real addition: an "htmlfile" class that accompanies the existing
"xmlfile" incremental serialisation class.

Note that the 3.4 release series no longer supports Python versions before
Py2.6 and Py3.2. Support for very old versions of libxml2 and libxslt
(<=2008) was also removed.

The documentation is here: http://lxml.de/

Download:  http://lxml.de/files/lxml-3.4.1.tgz

Signature: http://lxml.de/files/lxml-3.4.1.tgz.asc

Changelog: http://lxml.de/3.4/changes-3.4.1.html

Github:
https://github.com/lxml/lxml/commit/55b7af55eeb50abab9ca6d251d6137607849d6a1

This release was built using Cython 0.21.1, but should also build fine with
0.20.x.

If you are interested in commercial support or customisations for the lxml
package, please contact me directly.

Have fun,

Stefan

3.4.1 (2014-11-20)
==================

Features added
--------------

* New ``htmlfile`` HTML generator to accompany the incremental ``xmlfile``
  serialisation API.  Patch by Burak Arslan.

Bugs fixed
----------

* ``lxml.sax.ElementTreeContentHandler`` did not initialise its superclass.
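
Usage sketch
------------

A minimal illustration of the two incremental serialisers, assuming
lxml 3.4.1 or later; the element names are made up:

```python
import io
from lxml import etree

buf = io.BytesIO()
with etree.xmlfile(buf, encoding='utf-8') as xf:
    with xf.element('log'):
        for i in range(3):
            # Each entry is serialised without building the whole tree.
            xf.write(etree.Element('entry', id=str(i)))
xml_output = buf.getvalue()

html_buf = io.BytesIO()
with etree.htmlfile(html_buf, encoding='utf-8') as hf:
    with hf.element('html'):
        with hf.element('body'):
            hf.write(etree.Element('br'))
html_output = html_buf.getvalue()
```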
_________________________________________________________________