Ovnicraft | 22 Nov 20:42 2014

XML Digital Signature (Xades)

Hi to all !

This may be off-topic, but I would like to raise it here and get your feedback.

I am working on digital signatures for my country and I want to do the work entirely in Python. I started by reading the XAdES spec [1], which is nothing too complicated. I can see that lxml could help me, but I would like to know whether anyone here has experience with this.

I have an example XML file [2]; it shows basic use of an X.509 certificate and an RSA key.
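
As a possible starting point: one building block that lxml itself provides is canonicalisation (C14N), which XML signatures apply before hashing. The sketch below computes a base64-encoded SHA-256 digest of a canonicalised element, roughly the kind of value that ends up in a DigestValue. The helper name c14n_digest and the choice of SHA-256 are assumptions for illustration only; the RSA signing step with the X.509 key needs a separate crypto library and is not shown.

```python
import base64
import hashlib

from lxml import etree


def c14n_digest(element):
    """Return (canonical_bytes, base64 SHA-256 digest) for an element."""
    # method="c14n" asks lxml for Canonical XML: attributes are sorted and
    # empty elements are written as start/end tag pairs.
    canonical = etree.tostring(element, method="c14n")
    digest = hashlib.sha256(canonical).digest()
    return canonical, base64.b64encode(digest).decode("ascii")


# Hypothetical input, only to show the call.
root = etree.fromstring(b'<doc><item b="1" a="2"/></doc>')
canonical, digest_value = c14n_digest(root[0])
print(canonical)
print(digest_value)
```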

Any comments are really appreciated!

Regards !

[1] http://www.w3.org/TR/XAdES/
[2] https://gist.github.com/ovnicraft/c8645904e01e0a071199

--
 
 
Cristian Salamea
about.me/ovnicraft
 
_________________________________________________________________
Mailing list for the lxml Python XML toolkit - http://lxml.de/
lxml <at> lxml.de
https://mailman-mail5.webfaction.com/listinfo/lxml
D.H.J. Takken | 21 Nov 17:36 2014

Re: Efficient incremental parsing using etree.iterparse

Ah wow, one of the hidden treasures of lxml.. :)

Thanks!

On 11/21/2014 12:58 PM, Steven Vereecken wrote:
> Hello,
> 
> It doesn't seem to be mentioned in the docs, but you *can* specify
> multiple tag names (just use a list of names instead of one string).
> I'm not really sure where I picked that up myself, maybe from this
> mention in the changelog (features added in 3.0alpha1) :
> "Tree iteration and iterparse() with a selective tag argument supports
> passing a set of tags. Tree nodes will be returned by the iterators if
> they match any of the tags."
> 
> greetings,
> Steven
> 
> 2014-11-21 11:47 GMT+01:00 D.H.J. Takken <d.h.j.takken <at> xs4all.nl
> <mailto:d.h.j.takken <at> xs4all.nl>>:
> 
>     Hello,
> 
>     I need to process very large XML files as quickly as possible. The XML
>     processing does not require processing of every single tag, so I was
>     looking at the iterparse method.
> 
>     Unfortunately, the iterparse method only allows one tag name to be
>     specified for triggering events, while I need to do processing on two or
>     three different tags. This would still be much more efficient than using
>     the target parser method, because the XML data contains many more tags
>     that do not require immediate processing.
> 
>     So, it looks like I need something in between processing *all* tags and
>     processing a single tag. Is there any way to do that?
> 
>     Thanks for any hints!
> 
> 
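
The multi-tag form described in the quoted reply can be sketched as follows. This is a minimal example, assuming lxml 3.0 or later; the helper name iter_selected and the sample tag names are made up for illustration.

```python
from io import BytesIO

from lxml import etree


def iter_selected(source, tags):
    """Yield (tag, text) for each element whose tag is listed in `tags`."""
    # Passing a sequence of names as `tag` restricts events to those elements.
    for _event, elem in etree.iterparse(source, events=("end",), tag=tags):
        yield elem.tag, elem.text
        # Drop the element's content once it has been processed.
        elem.clear()


# Hypothetical document: only <item> and <note> trigger events.
data = b"<root><item>1</item><skip>x</skip><note>2</note><item>3</item></root>"
found = list(iter_selected(BytesIO(data), ["item", "note"]))
print(found)
```

Elements that do not match are still parsed into the tree; they just do not generate events, so very large inputs may still need additional cleanup of already-processed siblings.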

_________________________________________________________________
D.H.J. Takken | 21 Nov 11:47 2014

Efficient incremental parsing using etree.iterparse

Hello,

I need to process very large XML files as quickly as possible. The XML
processing does not require processing of every single tag, so I was
looking at the iterparse method.

Unfortunately, the iterparse method only allows one tag name to be
specified for triggering events, while I need to do processing on two or
three different tags. This would still be much more efficient than using
the target parser method, because the XML data contains many more tags
that do not require immediate processing.

So, it looks like I need something in between processing *all* tags and
processing a single tag. Is there any way to do that?

Thanks for any hints!
_________________________________________________________________
Stefan Behnel | 20 Nov 19:57 2014

lxml 3.4.1 released

Hi all,

I just released an "incremental improvements" version of lxml, 3.4.1, with
only one real addition: an "htmlfile" class that accompanies the existing
"xmlfile" incremental serialisation class.
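
For readers who have not used the incremental serialisers, here is a minimal sketch of how the new class might be used, assuming it mirrors the xmlfile API (a context manager with element() and write()). The helper name render_list and the sample data are made up for illustration.

```python
from io import BytesIO

from lxml import etree


def render_list(items):
    """Serialise `items` as an HTML <ul>, writing one <li> at a time."""
    out = BytesIO()
    with etree.htmlfile(out) as xf:
        with xf.element("ul"):
            for item in items:
                with xf.element("li"):
                    xf.write(item)
    # The writer is flushed when the outer "with" block exits.
    return out.getvalue()


html = render_list(["a", "b"])
print(html)
```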

Note that the 3.4 release series no longer supports Python versions before
Py2.6 and Py3.2. Support for very old versions of libxml2 and libxslt
(<=2008) was also removed.

The documentation is here: http://lxml.de/

Download:  http://lxml.de/files/lxml-3.4.1.tgz

Signature: http://lxml.de/files/lxml-3.4.1.tgz.asc

Changelog: http://lxml.de/3.4/changes-3.4.1.html

Github:
https://github.com/lxml/lxml/commit/55b7af55eeb50abab9ca6d251d6137607849d6a1

This release was built using Cython 0.21.1, but should also build fine with
0.20.x.

If you are interested in commercial support or customisations for the lxml
package, please contact me directly.

Have fun,

Stefan

3.4.1 (2014-11-20)
==================

Features added
--------------

* New ``htmlfile`` HTML generator to accompany the incremental ``xmlfile``
  serialisation API.  Patch by Burak Arslan.

Bugs fixed
----------

* ``lxml.sax.ElementTreeContentHandler`` did not initialise its superclass.
_________________________________________________________________
Stefan Behnel | 20 Nov 07:58 2014

christmas funding

Hi all,

my bicycle was recently stolen and since I now have to get a new one,
here's a proposal.

From today on until December 24th, I will divert all donations that I
receive for my work on lxml to help in restoring my local mobility.

If you do not like this 'misuse', do not donate in this time frame. I do
hope, however, that some of you like the idea that the money they give for
something they value is used for something that is of value to the receiver.

All the best,

Stefan
_________________________________________________________________
Isaac Councill | 17 Nov 19:23 2014

static compilation of lxml 3.4.0

Hi,

FYI, I ran into a problem over the weekend while creating a static lxml wheel on CentOS 6.5. Several CFLAGS were missing (a generally known issue, according to a Google search), and there was one further problem I had not seen mentioned: a missing -liconv flag when linking etree.so.

In case it's useful, here are the steps I followed to create a working static lxml wheel:

CFLAGS="-lgcrypt -fPIC -lrt -ldl -lgpg-error -Wl,--no-as-needed" \
  /usr/python2.7/bin/python2.7 setup.py bdist_wheel --static-deps

That got me most of the way, but I then got a runtime error saying that the symbol libiconv was undefined. So I manually re-linked etree.so as follows:

gcc -pthread -shared -lgcrypt -fPIC -lrt -ldl -lgpg-error -Wl,--no-as-needed \
  build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o \
  /home/isaac/lxml-3.4.0/build/tmp/libxml2/lib/libxslt.a \
  /home/isaac/lxml-3.4.0/build/tmp/libxml2/lib/libiconv.a \
  /home/isaac/lxml-3.4.0/build/tmp/libxml2/lib/libexslt.a \
  /home/isaac/lxml-3.4.0/build/tmp/libxml2/lib/libxml2.a \
  -L/home/isaac/lxml-3.4.0/build/tmp/libxml2/lib -L/usr/python2.7/lib64 \
  -lz -lm -liconv -lpython2.7 -o build/lib.linux-x86_64-2.7/lxml/etree.so

The only difference from the automated build step is the inclusion of "-liconv".

Then I re-ran the first command, which picked up the re-linked etree.so and created a working static lxml wheel.

I didn't have time to do it right and send a diff, so just passing this on in case it helps someone.

-Isaac
_________________________________________________________________
Walrus theCat | 1 Nov 18:18 2014

why html fragments in htmldiff?

Hi,

I was wondering why the choice was made to only diff html fragments.  Why not entire documents?

Thanks
_________________________________________________________________
Stefan Behnel | 31 Oct 11:21 2014

Crash on MacOS-X when importing tkinter

Hi,

I got a crash report when importing tkinter and lxml together under MacOS-X.

https://bugs.launchpad.net/lxml/+bug/1384102

Could other Apple users please try the following couple of commands and
report if they trigger a crash for them or not?

"""
import sys
import tkinter
from lxml import etree

print("%-20s: %s" % ('Python', sys.version_info))
print("%-20s: %s" % ('lxml.etree', etree.LXML_VERSION))
print("%-20s: %s" % ('libxml used', etree.LIBXML_VERSION))
print("%-20s: %s" % ('libxml compiled', etree.LIBXML_COMPILED_VERSION))
print("%-20s: %s" % ('libxslt used', etree.LIBXSLT_VERSION))
print("%-20s: %s" % ('libxslt compiled', etree.LIBXSLT_COMPILED_VERSION))

root = etree.Element('test')
xml = etree.tostring(root)   # <- crashing here
"""

Please report the library versions that you are using (printed above) and
whether it's a static lxml build or links dynamically against libxml2. If
someone has gdb available and can provide further information about the
spot in libxml2 where this crashes, I'd be happy to hear them.

Thanks!

Stefan
_________________________________________________________________
Felix Schwarz | 27 Oct 13:28 2014

building lxml with pypy: "pyconfig.h" not found

Hi,

I just tried to build lxml 3.4.0 using PyPy 2.2.1 (on Fedora 20), but the build
failed due to a missing pyconfig.h.

Is that expected behavior? I see that there are passing travis builds for the
github version so it should be possible somehow.

I'm not sure where to start debugging the issue, as it could be an lxml problem,
a PyPy error or a Fedora packaging bug, so I decided to start at the source :-)

I started out with the lxml 3.4.0 tar.gz from pypi because I tried to avoid
rebuilding all the cython stuff. Maybe that's the problem?

$ pypy setup.py build
Building lxml version 3.4.0.
Building without Cython.
Using build configuration of libxslt 1.1.28
Building against libxml2/libxslt in the following directory: /usr/lib64
/usr/lib64/pypy-2.2.1/lib-python/2.7/distutils/dist.py:267: UserWarning:
Unknown distribution option: 'bugtrack_url'
  warnings.warn(msg)
running build
running build_py
copying src/lxml/includes/lxml-version.h ->
build/lib.linux-x86_64-2.7/lxml/includes
running build_ext
building 'lxml.etree' extension
cc -O2 -fPIC -Wimplicit -I/usr/include/libxml2
-I/home/fs/code/szoska/fiverx/lxml-3.4.0/src/lxml/includes
-I/home/fs/code/szoska/fiverx/venv.pypy/include -c src/lxml/lxml.etree.c -o
build/temp.linux-x86_64-2.7/src/lxml/lxml.etree.o -w
src/lxml/lxml.etree.c:8:22: fatal error: pyconfig.h: No such file or directory
 #include "pyconfig.h"
                      ^
compilation terminated.
error: command 'cc' failed with exit status 1

BUT:
$ rpm -q --list pypy-devel | grep pyconfig.h
/usr/lib64/pypy-2.2.1/include/pyconfig.h
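
Note that the compile line above only passes the virtualenv's include directory, while the header lives under /usr/lib64/pypy-2.2.1/include. A quick way to see which include directory the interpreter itself reports is sketched below. It assumes that sysconfig.get_paths() reflects what the build uses, which may not hold for every PyPy or distutils setup, and it should be run with the same interpreter as the build; the helper name find_pyconfig is made up.

```python
import os
import sysconfig


def find_pyconfig():
    """Return (include_dir, exists) for this interpreter's pyconfig.h."""
    # "include" is the directory the interpreter advertises for C headers.
    include_dir = sysconfig.get_paths()["include"]
    header = os.path.join(include_dir, "pyconfig.h")
    return include_dir, os.path.exists(header)


include_dir, exists = find_pyconfig()
print("include dir:", include_dir)
print("pyconfig.h found:", exists)
```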

Thanks,
fs
_________________________________________________________________
Charlie Clark | 23 Oct 11:48 2014

Generating code from schema

Hi,

just a quick question about what you can and cannot do with lxml's schema  
support. In openpyxl we're moving towards a 1:1 implementation of the  
underlying schema. lxml.objectify isn't directly an option for two  
reasons: lxml is an optional dependency and there are cases where we'd  
definitely run out of memory. Instead we're using descriptors to enforce  
type definitions. This means a little more code but now that it seems to  
be working well I was thinking whether we could automate some of the  
process. I've looked at some of the existing XSD to Python generators but  
the generated code is far from what I'd like to have.

Can we use the lxml schema support for anything other than validation? I.e.,
can I query a schema object for a particular definition? Or is the best
approach to parse the XSD files directly and work through the definitions
with a mapping?
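
To make the second option concrete, here is a rough sketch of parsing an XSD as ordinary XML and collecting attribute declarations per complex type. The schema fragment, the type name CT_Example and the helper attribute_map are all hypothetical; a real schema would also need handling for references, groups and inherited types.

```python
from lxml import etree

XS = "{http://www.w3.org/2001/XMLSchema}"


def attribute_map(xsd_bytes):
    """Map each top-level complexType name to {attribute name: declared type}."""
    root = etree.fromstring(xsd_bytes)
    mapping = {}
    for ctype in root.iterfind(XS + "complexType"):
        attrs = {}
        # iter() also finds attributes nested deeper inside the type.
        for attr in ctype.iter(XS + "attribute"):
            attrs[attr.get("name")] = attr.get("type")
        mapping[ctype.get("name")] = attrs
    return mapping


# Hypothetical schema fragment, for illustration only.
SAMPLE_XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="CT_Example">
    <xs:attribute name="width" type="xs:double"/>
    <xs:attribute name="hidden" type="xs:boolean"/>
  </xs:complexType>
</xs:schema>"""

definitions = attribute_map(SAMPLE_XSD)
print(definitions)
```

A mapping like this could then drive a template that emits one descriptor per attribute.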

Charlie
-- 
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Kronenstr. 27a
Düsseldorf
D- 40217
Tel: +49-211-600-3657
Mobile: +49-178-782-6226
_________________________________________________________________
jens quade | 16 Oct 13:40 2014

XMLParser mode resolve_entities=False and entities in attributes

Hi,

this has been discussed before in 11/2009, but the bug seems to persist, so I will try to document it again:

If an XML parser is created with XMLParser(resolve_entities=False) and the document declares an
external DTD, then entity references in attribute values are dropped from the attribute and inserted into the
parent element (if a parent element exists), directly before the element carrying that attribute.

Expected behaviour:

- an error, because entities are undeclared; or, more useful in some cases:
- Entities stay in their attributes

Workarounds:

- Declare an internal DTD that defines all entities

- Use an actual external DTD *and* use dtd_validation=True with XMLParser
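
The first workaround can be sketched roughly as follows. The internal DTD subset below declares only the two entities used in this example, and the variable name DOC is made up; the exact attribute value returned may depend on the libxml2 version.

```python
from lxml import etree

# Internal DTD subset declaring the entities used in the document.
DOC = (
    '<!DOCTYPE test [\n'
    '  <!ENTITY uuml "&#252;">\n'
    '  <!ENTITY ouml "&#246;">\n'
    ']>\n'
    '<test>1<a href="&uuml;bel">&ouml;</a></test>'
)

parser = etree.XMLParser(resolve_entities=False)
tree = etree.XML(DOC, parser=parser)

# With the entities declared, <test> should contain only the <a> element,
# and the attribute should keep its full value.
print(len(tree))
print(repr(tree.find("a").get("href")))
```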

Sample code: (see also: http://pastebin.com/24bM98La -- some more examples there)

from lxml import etree
parser = etree.XMLParser(resolve_entities=False)

try:
    tree = etree.XML("""<test>1<a href="&uuml;bel">&ouml;</a></test>""", parser=parser)
except etree.XMLSyntaxError as e:
    print(e)

Output:
> Entity 'uuml' not defined, line 1, column 23

from lxml import etree
parser = etree.XMLParser(resolve_entities=False)

try:
    tree = etree.XML("""<!DOCTYPE test SYSTEM ""><test>1<a href="&uuml;bel">&ouml;</a></test>""", parser=parser)
    print(tree[:])
    print(tree.find('.//a').attrib['href'])
    print(etree.tostring(tree))
except etree.XMLSyntaxError as e:
    print(e)

Output:
> [&uuml;, <Element a at 0x1073be3f8>]
> bel
> <test>1&uuml;<a href="bel">&ouml;</a></test>

Tested with various lxml versions, e.g.:

Python              : sys.version_info(major=2, minor=7, micro=5, releaselevel='final', serial=0)
lxml.etree          : (3, 4, 0, 0)
libxml used         : (2, 9, 2)
libxml compiled     : (2, 9, 2)
libxslt used        : (1, 1, 28)
libxslt compiled    : (1, 1, 28)

jens

_________________________________________________________________
