Stefan Behnel | 1 Mar 2011 07:20

Re: [lxml-dev] Adding top-level comments to an ElementTree object

Tyler Erickson, 28.02.2011 21:14:
> How do you add top-level comments to an ElementTree object?

http://lxml.de/api/lxml.etree._Element-class.html#addprevious

Stefan
Adam GROSZER | 1 Mar 2011 10:30

Re: availability of lxml 2.3 windows binaries

On 02/28/2011 01:29 PM, Sidnei da Silva wrote:
> On Mon, Feb 28, 2011 at 6:03 AM, Piotr Dobrogost
> <pd <at> gmane.2011.dobrogost.pl>  wrote:
>> Hi!
>>
>> When could we expect to be able to download lxml 2.3 windows binaries from PyPi
>> (http://pypi.python.org/pypi/lxml/2.3#downloads)?
>
> No estimates yet. However, it should be possible for someone with
> enough motivation to use the same scripts I use for building it:
> http://people.canonical.com/~sidnei/lxml/

Are those files somewhere in the lxml repo?
If not, would be great to have them close to the source...

Seems like 32bit eggs/exes are there now, any chance to have 64bits too?

-- 
Best regards,
  Adam GROSZER
--
Quote of the day:
I had a thousand questions to ask God; but when I met Him they all fled 
and didn't seem to matter.
- Christopher Morley
Sidnei da Silva | 1 Mar 2011 12:28

Re: availability of lxml 2.3 windows binaries

On Tue, Mar 1, 2011 at 6:30 AM, Adam GROSZER <agroszer <at> gmail.com> wrote:
> Are those files somewhere in the lxml repo?
> If not, would be great to have them close to the source...

Indeed. It would be great if someone could pick that up.

> Seems like 32bit eggs/exes are there now, any chance to have 64bits too?

Eventually. There's only so many hours in a day.

-- Sidnei
Stefan Behnel | 1 Mar 2011 12:48

Re: [lxml-dev] availability of lxml 2.3 windows binaries

Sidnei da Silva, 01.03.2011 12:28:
> On Tue, Mar 1, 2011 at 6:30 AM, Adam GROSZER wrote:
>> Are those files somewhere in the lxml repo?
>> If not, would be great to have them close to the source...
>
> Indeed. It would be great if someone could pick that up.

Well, lxml is an open source project. I take patches.

Could someone else try these scripts to see how hard it is to get them 
working on a different machine?

Stefan
Tyler Erickson | 1 Mar 2011 20:18

Re: [lxml-dev] Adding top-level comments to an ElementTree object

Stefan,
Thanks, that was what I was looking for...
- Tyler

$ python
Python 2.6.6 (r266:84292, Sep 15 2010, 16:22:56) 
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree
>>> from lxml.builder import E
>>> # construct an Element using the E-factory
... element = (
...   E.root(
...     E.a('eggs'),
...   )
... )
>>> element.addprevious(etree.Comment('top level comment'))
>>> print etree.tostring(etree.ElementTree(element))
<!--top level comment--><root><a>eggs</a></root>
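
For readers on Python 3, the same idea can be sketched as a small script. This is only an illustration, not part of the original session: it assumes a current lxml release, builds the tree with `etree.Element` instead of the E-factory, and also uses `addnext` to show a comment after the root element. The comment texts are made up.

```python
from lxml import etree

# Build <root><a>eggs</a></root>
root = etree.Element("root")
etree.SubElement(root, "a").text = "eggs"

# Siblings of the root element end up at document level.
root.addprevious(etree.Comment("top level comment"))
root.addnext(etree.Comment("trailing comment"))

# Serialise the whole tree, as in the session above,
# so that the document-level comments are written out too.
tree = root.getroottree()
serialized = etree.tostring(tree, encoding="unicode")
print(serialized)
```

As in the session above, the ElementTree wrapper is what gets serialised, not the bare root element.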



On Mon, Feb 28, 2011 at 11:20 PM, Stefan Behnel <stefan_ml <at> behnel.de> wrote:
> Tyler Erickson, 28.02.2011 21:14:
>> How do you add top-level comments to an ElementTree object?
>
> http://lxml.de/api/lxml.etree._Element-class.html#addprevious
>
> Stefan

_______________________________________________
lxml-dev mailing list
lxml-dev <at> codespeak.net
http://codespeak.net/mailman/listinfo/lxml-dev
Giovanni Torres | 4 Mar 2011 10:35

lxml removes line breaks from XML attributes

Hello All!

Some of you might not be happy with my question. But, I'm dealing with
an XML file that has line breaks in XML attributes. I use lxml to
parse the file, run some XPath queries, make changes to it and write
it back. Unfortunately, lxml removes the line breaks from the
attributes.

Here is what I mean more clearly:

$ cat example.xml
<example attr="This is an attribute with several
break
lines"/>

$ cat test.py

import sys
import lxml.etree

xml = lxml.etree.parse(sys.stdin)
xml.write(sys.stdout)
print()

$ python test.py < example.xml
<example attr="This is an attribute with several  break lines"/>()

$ python -c 'import lxml.etree ; print(lxml.etree.__version__)'
2.3.0

$ python -V
Python 2.6.5

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 10.04.2 LTS
Release:        10.04
Codename:       lucid

$

Is there any way I can get lxml to write those line breaks back?

I'm actually not sure if they are even legal. But, they seem to be
according to this:

http://stackoverflow.com/questions/449627/are-line-breaks-in-xml-attribute-values-valid

Thanks in advance for your help!

--
Giovanni
Lorenzo Sutton | 4 Mar 2011 15:49

Re: lxml removes line breaks from XML attributes

Hi Giovanni,

Not really an answer to your question, but an idea that may help, depending on your scenario.

Giovanni Torres wrote:
> Some of you might not be happy with my question. But, I'm dealing with
> an XML file that has line breaks in XML attributes. I use lxml to
> parse the file, run some XPath queries, make changes to it and write
> it back. Unfortunately, lxml removes the line breaks from the
> attributes.

Maybe you could use character references, which are preserved in attribute values.

If the input XML is not too big, you could replace each \n with e.g. &#10; before parsing and convert back afterwards.

xml = '<example attr="This is an attribute with several&#10;break&#10;lines"/>'
# ... do lxml processing ...
output = etree.tostring(processed)
# replace() returns a new string, so the result has to be assigned
output = output.replace('&#10;', '\n')


Lorenzo.

_______________________________________________
Mailing list for the lxml Python XML toolkit - http://lxml.de/
lxml <at> lxml.de
http://mailman-mail5.webfaction.com/listinfo/lxml
Stefan Behnel | 5 Mar 2011 07:31

Re: lxml removes line breaks from XML attributes

[accidentally sent to the wrong list]

Giovanni Torres, 04.03.2011 10:35:
> Some of you might not be happy with my question. But, I'm dealing with
> an XML file that has line breaks in XML attributes. I use lxml to
> parse the file, run some XPath queries, make changes to it and write
> it back. Unfortunately, lxml removes the line breaks from the
> attributes.
>
> Here is what I mean more clearly:
>
> $ cat example.xml
> <example attr="This is an attribute with several
> break
> lines"/>
>
> $ cat test.py
>
> import sys
> import lxml.etree
>
> xml = lxml.etree.parse(sys.stdin)
> xml.write(sys.stdout)
> print()
>
> $ python test.py < example.xml
> <example attr="This is an attribute with several  break lines"/>()

This is called "attribute-value normalisation" in the XML spec:

http://www.w3.org/TR/REC-xml/#AVNormalize

> Is there any way I can get lxml to write those line breaks back?

You should escape the newlines in attribute values as presented in the 
spec, i.e. use "&#xA;" etc.

> I'm actually not sure if they are even legal. But, they seem to be
> according to this:
>
> http://stackoverflow.com/questions/449627/are-line-breaks-in-xml-attribute-values-valid

Well, technically, the example is "legal", as stated, but it doesn't give 
the requested result.

Stefan
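
The attribute-value normalisation described above can be checked with a short script. This is a sketch rather than anything from the thread: the shortened attribute text and the variable names are made up, and the fallback to the standard library's `xml.etree.ElementTree` is only there so the script also runs where lxml is not installed (its parser applies the same normalisation).

```python
try:
    from lxml import etree
except ImportError:  # assumption: stdlib fallback, same API for the calls used here
    import xml.etree.ElementTree as etree

# Literal line breaks inside the attribute value ...
literal = etree.fromstring('<example attr="several\nbreak\nlines"/>')
# ... versus line breaks written as character references.
escaped = etree.fromstring('<example attr="several&#xA;break&#xA;lines"/>')

literal_value = literal.get("attr")
escaped_value = escaped.get("attr")
print(repr(literal_value))   # line breaks were normalised to spaces
print(repr(escaped_value))   # line breaks survived parsing

# On output the newline is written as a character reference again,
# so it also survives a later re-parse.
serialized = etree.tostring(escaped)
print(serialized)
```

The point is that the parser never sees a difference between a literal line break and a space in an attribute value, so the information has to be encoded in the input as a character reference if it needs to survive.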
Giovanni Torres | 7 Mar 2011 11:16

Re: lxml removes line breaks from XML attributes

On Sat, Mar 5, 2011 at 08:31, Stefan Behnel <stefan_ml <at> behnel.de> wrote:

> You should escape the newlines in attribute values as presented in the
> spec, i.e. use "&#xA;" etc.

Thanks to you and the others for your answers. I'll use "&#xA;" to
keep the line breaks.

-- 
Giovanni
