Stefan Behnel | 1 Jun 17:38 2012

Re: xml install problems

Am 25.05.2012 04:40, schrieb William Herry:
> Hi,
> I am trying to install lxml on CentOS 6.2 with the command:
> pip-python install lxml==2.3
> 
> it gives me this error:
> 
> Downloading/unpacking lxml==2.3
>   Running setup.py egg_info for package lxml
>     Building lxml version 2.3.
>     Building with Cython 0.14.1.

It's best to build without having Cython installed. See the docs.

>     Using build configuration of libxslt 1.1.26
>     Building against libxml2/libxslt in the following directory: /usr/lib64
> Installing collected packages: lxml
>   Running setup.py install for lxml
>     Building lxml version 2.3.
>     Building with Cython 0.14.1.
>     Using build configuration of libxslt 1.1.26
>     Building against libxml2/libxslt in the following directory: /usr/lib64
>     building 'lxml.etree' extension
>     gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall
> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector
> --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv
> -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
> -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic
> -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/libxml2
> -I/usr/include/python2.6 -c src/lxml/lxml.etree.c -o
[...]

Stefan Behnel | 1 Jun 17:48 2012

Re: objectify.parse() returns lxml.etree._ElementTree instead of lxml.objectify.ObjectifiedElement

Am 30.05.2012 03:40, schrieb Roger Hoover:
> Any idea why objectify.parse() is returning an lxml.etree._ElementTree
> object while objectify.parsestring()

I assume you meant "fromstring()"?

> returns lxml.objectify.ObjectifiedElement?
> 
> I've installed lxml 2.3.4 using easy_install on Ubuntu 10.04.
> 
> from lxml import objectify
> import StringIO
> obj = objectify.parse(StringIO.StringIO("<foo><bar>1</bar></foo>"))
> print type(obj) #prints <type 'lxml.etree._ElementTree'>

Can't check right now, but I'm pretty sure there's a FAQ entry about it
(and if not, then it's certainly spelled out in the docs somewhere).

Short answer: use cases.

Stefan
_________________________________________________________________
Mailing list for the lxml Python XML toolkit - http://lxml.de/
lxml <at> lxml.de
https://mailman-mail5.webfaction.com/listinfo/lxml
Stefan Behnel | 1 Jun 17:46 2012

Re: lxml.get_include() does not return system libxml paths

Am 17.05.2012 15:59, schrieb Giovanni Bajo:
> Hi Stefan,
> 
> we submitted a patch (currently sitting in git trunk) for distributing
> pxd interface files with lxml (including the static Windows package), to
> simplify writing Cython code that calls the lxml Cython API.
>
> Unfortunately, we're not quite there yet, because lxml.get_include()
> still does not return the system include path for libxml
> (/usr/include/libxml2 or the like) in the cases where the system headers
> were used. (I think it does work when the libxml header files are shipped
> within the lxml static package, as on Windows; I can't verify this since
> there is no static Windows package for 2.4 yet.)
>
> This means that every user of the lxml Cython API must include a build
> system that duplicates lxml's own build system and finds the system
> libxml include path.
>
> I was trying to get to a point where an lxml Cython API user only needs to
> call lxml.get_include() to get a list of pxd/h directories to be used for
> both cythoning (pxd) and compiling (h) the extension. Do you think it
> makes sense to get there?

Absolutely. get_include() should return whatever is required as include
paths on the current system.

Stefan
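As a rough illustration of the goal discussed in this thread, a build script could start from lxml.get_include() and fall back to a system libxml2 header directory when it is missing from the list. This is only a sketch under stated assumptions: the helper name lxml_include_dirs and the candidate path tuple are hypothetical and not part of lxml.

```python
import os

import lxml


def lxml_include_dirs(system_candidates=("/usr/include/libxml2",)):
    """Return include directories for building against the lxml Cython API.

    Starts from lxml.get_include() and appends every existing directory in
    `system_candidates` (a hypothetical fallback list) that is not yet listed.
    """
    dirs = list(lxml.get_include())
    for candidate in system_candidates:
        if os.path.isdir(candidate) and candidate not in dirs:
            dirs.append(candidate)
    return dirs


if __name__ == "__main__":
    # Print one directory per line, e.g. for inspection in a build log.
    for path in lxml_include_dirs():
        print(path)
```

The resulting list could then be handed to the include_dirs argument of a setuptools Extension. Once get_include() returns the system paths itself, the fallback loop would become unnecessary.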
Stefan Behnel | 1 Jun 17:41 2012

Re: gpx tostring for xml newbe

Am 20.05.2012 15:35, schrieb janwillem:
> I wanted to do some analyzing of Garmin gpx track and waypoint files. So
> I thought of diving into lxml and start with pretty printing a garmin
> file. Garmin writes all info on a single line which is not pretty. When
> split into lines a waypoint looks like:
> <wpt lat="52.151006" lon="15.373755">
> <ele>14.742697</ele>
> <time>2012-05-19T20:23:15Z</time>
> <name>Gazon</name>
> <sym>Park</sym>
> <extensions>
> <wptx1:WaypointExtension>
> <wptx1:Samples>3</wptx1:Samples>
> </wptx1:WaypointExtension>
> </extensions>
> </wpt>
> 
> when I do:
>
>     wpt = '{%s}wpt' % xmlns
>     waypoints = root.findall(wpt)
>     for point in waypoints:
>         print point.get('lon'), point.get('lat')
>         print etree.tostring(point, pretty_print=True, with_tail=False)
> 
> I get a lot of info from the beginning of the file between "<wpt" and
[...]
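For readers who want to try the approach from the post, here is a minimal self-contained sketch of looking up namespaced waypoints and pretty-printing them. The two namespace URIs and the surrounding gpx root element are assumptions, since the original file header is not shown in the post. The waypoint values are the ones quoted above.

```python
from lxml import etree

# Assumed namespace URIs: GPX 1.1 and Garmin's waypoint extension.
GPX_NS = "http://www.topografix.com/GPX/1/1"
WPTX1_NS = "http://www.garmin.com/xmlschemas/WaypointExtension/v1"

# Everything on a single line, as the post describes for Garmin files.
gpx_data = (
    '<gpx xmlns="%s" xmlns:wptx1="%s" version="1.1">'
    '<wpt lat="52.151006" lon="15.373755">'
    '<ele>14.742697</ele>'
    '<time>2012-05-19T20:23:15Z</time>'
    '<name>Gazon</name>'
    '<sym>Park</sym>'
    '<extensions>'
    '<wptx1:WaypointExtension>'
    '<wptx1:Samples>3</wptx1:Samples>'
    '</wptx1:WaypointExtension>'
    '</extensions>'
    '</wpt>'
    '</gpx>' % (GPX_NS, WPTX1_NS)
).encode("utf-8")

root = etree.fromstring(gpx_data)

# Elements in a default namespace must be looked up as '{uri}localname'.
waypoints = root.findall("{%s}wpt" % GPX_NS)

for point in waypoints:
    print(point.get("lon"), point.get("lat"))
    # tostring() returns bytes; with_tail=False drops text after </wpt>.
    print(etree.tostring(point, pretty_print=True, with_tail=False).decode("utf-8"))
```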

William Herry | 3 Jun 04:22 2012

Re: xml install problems

I messed up my system; maybe it is some library problem (I don't know, but it is messed up). This problem doesn't occur on other CentOS machines, so I'll just ignore it.

Thanks

On Fri, Jun 1, 2012 at 11:38 PM, Stefan Behnel <stefan_ml <at> behnel.de> wrote:
Am 25.05.2012 04:40, schrieb William Herry:
> Hi,
> I am trying to install lxml on CentOS 6.2 with the command:
> pip-python install lxml==2.3
>
> it gives me this error:
>
> Downloading/unpacking lxml==2.3
>   Running setup.py egg_info for package lxml
>     Building lxml version 2.3.
>     Building with Cython 0.14.1.

It's best to build without having Cython installed. See the docs.


>     Using build configuration of libxslt 1.1.26
>     Building against libxml2/libxslt in the following directory: /usr/lib64
> Installing collected packages: lxml
>   Running setup.py install for lxml
>     Building lxml version 2.3.
>     Building with Cython 0.14.1.
>     Using build configuration of libxslt 1.1.26
>     Building against libxml2/libxslt in the following directory: /usr/lib64
>     building 'lxml.etree' extension
>     gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall
> -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector
> --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv
> -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
> -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic
> -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/include/libxml2
> -I/usr/include/python2.6 -c src/lxml/lxml.etree.c -o
> build/temp.linux-x86_64-2.6/src/lxml/lxml.etree.o -w
>     {standard input}: Assembler messages:
>     {standard input}:150508: Warning: end of file not at end of a line;
> newline inserted
>     gcc: Internal error: Killed (program cc1)
>     Please submit a full bug report.
>     See <http://bugzilla.redhat.com/bugzilla> for instructions.

No idea what's wrong here, but this hints at a serious problem on your
system. Try to build other binary Python packages to see if this works
at all.

Stefan



--



William Herry
====================
WilliamHerryChina <at> Gmail.com

Roger Hoover | 4 Jun 03:10 2012

Re: objectify.parse() returns lxml.etree._ElementTree instead of lxml.objectify.ObjectifiedElement

Oh, I see.  I need to call obj.getroot() to get the ObjectifiedElement.  Thanks.


Roger

On Fri, Jun 1, 2012 at 8:48 AM, Stefan Behnel <stefan_ml <at> behnel.de> wrote:
Am 30.05.2012 03:40, schrieb Roger Hoover:
> Any idea why objectify.parse() is returning an lxml.etree._ElementTree
> object while objectify.parsestring()

I assume you meant "fromstring()"?


> returns lxml.objectify.ObjectifiedElement?
>
> I've installed lxml 2.3.4 using easy_install on Ubuntu 10.04.
>
> from lxml import objectify
> import StringIO
> obj = objectify.parse(StringIO.StringIO("<foo><bar>1</bar></foo>"))
> print type(obj) #prints <type 'lxml.etree._ElementTree'>

Can't check right now, but I'm pretty sure there's a FAQ entry about it
(and if not, then it's certainly spelled out in the docs somewhere).

Short answer: use cases.

Stefan

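The difference discussed in this thread can be reproduced with a short sketch. It uses Python 3 syntax and an in-memory BytesIO in place of the StringIO from the original post. The variable names are illustrative only.

```python
from io import BytesIO

from lxml import etree, objectify

xml = b"<foo><bar>1</bar></foo>"

# parse() mirrors etree.parse(): it returns a tree object for the document.
tree = objectify.parse(BytesIO(xml))

# The objectified root element is one call away.
root = tree.getroot()

# fromstring() returns the root element directly.
same_root = objectify.fromstring(xml)

print(type(tree).__name__)
print(type(root).__name__)
print(root.bar + 1)  # objectified children behave like Python values
```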
Stefan Behnel | 6 Jun 16:50 2012

Suggestions for lxml topics at PyCon-DE ? Anyone for a code sprint?

Hi,

I gave a somewhat generic lxml talk at last year's PyCon-DE in Leipzig. In
case some people on this list are going there, is there anything more
special you'd want me to present this year? The deadline has been extended
to May 15th, so there's some time left to come up with good ideas.

I'm also considering running an lxml sprint at the end of the conference.
Anyone interested in learning how to work on lxml's code base? There's
quite a number of open bugs that would lend themselves to getting your
hands dirty, be it in Cython code or Python code (e.g. lxml.html). Working
on those would be a very good way to give something back to the tool you
are using.

Stefan
Andreas Maier | 7 Jun 12:08 2012

How to produce explicitly closed form of HTML tags

Hi,
I am using lxml's XSLT support to transform XML to HTML. The XSLT specifies the following xsl:output directive:

  <xsl:output method="html" version="4.0" encoding="UTF-8" media-type="text/html"
   doctype-public="-//W3C//DTD HTML 4.01//EN" standalone="yes"
   omit-xml-declaration="yes" indent="no"/>

I serialize the output element tree to a string using the etree.tostring() function:

  xslt_tree = etree.parse(xslt_url)
  xslt_transform = etree.XSLT(xslt_tree.getroot())
  out_tree = xslt_transform(in_root_elem)
  fp.write(etree.tostring(out_tree))

This creates self-closing HTML tags, for example:

  <script type="text/javascript" src="tocgen.js"/>

This hurts because some browsers (including IE8 and FF10) do not support the self-closing form for some tags when the doctype is HTML. The script tag is particularly nasty: if the browser does not recognize that it is closed, the whole rest of the document is interpreted as (invalid) JavaScript. Self-closing other tags may create problems as well.

What works on all browsers I tested, is the explicitly closed form:

  <script type="text/javascript" src="tocgen.js"></script>

See this thread for discussion:
http://stackoverflow.com/questions/69913/why-dont-self-closing-script-tags-work

BTW, Xalan produces the explicitly closed form for the same scenario. (I mention it only as another data point, not because I would consider going back to Xalan: lxml is much, much faster, and while comparing lxml and Xalan output beyond differences such as self-closing tags, I already found one difference that seems to be a Xalan bug.)

-> Is there a way to get lxml's HTML serialization support to produce the explicitly closed form ?

The lxml version I am using is lxml 2.3 with libxml2 statically linked, for Python 2.6, on Windows (lxml-2.3.win32-py2.6.exe).

-> Is there a later version that has a Windows installer and libxml2 statically linked ? (Other people need to be able to set that up for my stuff to work, and this way it is real simple).

Andy

Andreas Maier | 7 Jun 12:47 2012

Re: How to produce explicitly closed form of HTML tags

Hi,

-> Is there a way to get lxml's HTML serialization support to produce the explicitly closed form ?

I actually found a solution to this problem, by specifying more parameters to the etree.tostring() function:

  fp.write(etree.tostring(out_tree,
      encoding="utf-8", method="html", xml_declaration=None,
      pretty_print=False, with_tail=True, standalone=None,
      doctype='<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">'))

However, this means that the output parameters specified in the XSLT become meaningless, because they are overridden in the Python program.

-> Is there a way to let lxml honor the output parameters specified in the XSLT ?
It seems to me that would be a good default behavior.

-> Also, is there a way to get any HTML comments in the element tree to be serialized ?
The tostring() parameter with_comments is documented to apply to the C14N output method only, and does not cause the HTML comments to be created when set to True.

Andy

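The two serialization routes raised in this thread can be compared with a small sketch. The stylesheet below is a cut-down stand-in for the one described in the post, not the poster's actual XSLT. The first route passes method="html" to etree.tostring(), as in the solution above. The second converts the XSLT result object to a string, which, as far as I know, serializes according to the stylesheet's xsl:output element.

```python
from lxml import etree

# Hypothetical minimal stylesheet with an HTML output declaration.
xslt_doc = etree.XML(b"""\
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" encoding="UTF-8" indent="no"
      doctype-public="-//W3C//DTD HTML 4.01//EN"/>
  <xsl:template match="/">
    <html>
      <head>
        <script type="text/javascript" src="tocgen.js"/>
      </head>
      <body><p><xsl:value-of select="/doc"/></p></body>
    </html>
  </xsl:template>
</xsl:stylesheet>
""")

transform = etree.XSLT(xslt_doc)
result = transform(etree.XML(b"<doc>hello</doc>"))

# Route 1: choose the HTML serializer explicitly in Python.
html_from_tostring = etree.tostring(result, method="html", encoding="utf-8")

# Route 2: let the stylesheet's xsl:output settings drive serialization.
html_from_xslt = str(result)

print(html_from_tostring.decode("utf-8"))
print(html_from_xslt)
```

With the second route the output settings stay in one place, the stylesheet, so the Python code does not need to repeat them.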
Simon Sapin | 7 Jun 13:29 2012

Re: How to produce explicitly closed form of HTML tags

Le 07/06/2012 12:47, Andreas Maier a écrit :
> I actually found a solution to this problem, by specifying more
> parameters to the etree.tostring() function:
>
>    fp.write(etree.tostring(... method="html"))

Alternatively, html5lib can serialize lxml trees:

http://code.google.com/p/html5lib/wiki/UserDocumentation#Serialization_of_Streams

https://github.com/Kozea/docutils-html5-writer/blob/master/docutils_html5/__init__.py#L73

Regards,
-- 
Simon Sapin

