Re: XML Documents & I18N (the way Cocoon does it)
Alexis Georges <velvetcrafter.subscriber <at> gmail.com>
2009-06-01 20:36:56 GMT
Hi,
This is a bit late, but thanks for the response.
I am playing around with iterparse() and am following the advice you
gave.
I have a question though: I could not find a way to consume an element
and replace it with just text. For example, <i18n:text>hello</i18n:text>
found in the middle of a paragraph should be replaced by plain text.
The replace() method requires the replacement to be an element.
Is this possible?
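
To make the question concrete, here is a minimal sketch of one way this might be done. It relies on lxml's text/tail model: text inside a paragraph lives either in the parent's .text or in the .tail of the preceding sibling. The helper name replace_with_text() and the TRANSLATIONS table are made up for illustration, not part of lxml.

```python
from lxml import etree

I18N_NS = "http://apache.org/cocoon/i18n/2.1"

# Hypothetical lookup table standing in for a real translation catalog.
TRANSLATIONS = {"hello": "bonjour"}


def replace_with_text(element, text):
    """Remove `element` from its parent and put plain `text` in its place."""
    parent = element.getparent()
    if parent is None:
        raise ValueError("cannot replace the root element with text")
    # remove() takes the element's tail along with it, so keep the tail here.
    text = text + (element.tail or "")
    previous = element.getprevious()
    if previous is not None:
        # Text after a sibling is stored in that sibling's tail.
        previous.tail = (previous.tail or "") + text
    else:
        # No preceding sibling: the text belongs to the parent's leading text.
        parent.text = (parent.text or "") + text
    parent.remove(element)


doc = etree.fromstring(
    '<p xmlns:i18n="%s">Say <i18n:text>hello</i18n:text> to <b>all</b>.</p>'
    % I18N_NS
)
# Collect the matches first so the tree is not modified while iterating.
for el in list(doc.iter("{%s}text" % I18N_NS)):
    replace_with_text(el, TRANSLATIONS.get(el.text, el.text))

result = etree.tostring(doc, encoding="unicode")
```

The same helper could presumably be called from inside an iterparse() loop in place of replace(), since it only touches the element, its previous sibling and its parent.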
Thanks!
Alexis Georges
On 28-Apr-09, at 1:59 PM, Stefan Behnel wrote:
> Hi,
>
> Alexis Georges wrote:
>> I am maintaining a multilingual website which uses XML and XSLT to
>> generate XHTML.
>>
>> I am working with Apache Cocoon (http://cocoon.apache.org/2.1/) using
>> (among other things) their I18NTransformer. Basically I can use elements
>> in the I18N (http://apache.org/cocoon/i18n/2.1) namespace, and then tell
>> Cocoon to apply the I18NTransformer to the document; this replaces the
>> I18N elements with a localized value (e.g. a formatted date/number, a
>> translated label/attribute, etc.).
>>
>> I have been looking at lxml a little bit to see if I could move to a
>> Python-based framework for the website. I am not quite sure how to go
>> about the I18N part though.
>>
>> Using the Babel library (http://babel.edgewall.org/) along with request
>> headers to generate localized data, I have everything I need. What is
>> missing is the "parser" for the I18N elements. All I can think of right
>> now is to implement a SAX parser, the way Cocoon does (in Java).
>
> There is a SAX-like interface in lxml.etree, called "target parser".
>
> However, if your documents fit into memory, using iterparse() is a lot
> simpler (and likely not even much slower).
>
> Something like this might work:
>
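> from lxml import etree
>
> # Only report elements in the Cocoon I18N namespace.
> context = etree.iterparse(
>     "somefile.xml",
>     tag="{http://apache.org/cocoon/i18n/2.1}*")
>
> for event, i18n_element in context:
>     # get_i18n_replacement_for() is your own function; it returns an Element.
>     new_element = get_i18n_replacement_for(i18n_element)
>     # replace() drops the old element's tail text, so carry it over.
>     new_element.tail = i18n_element.tail
>     i18n_element.getparent().replace(i18n_element, new_element)
>
> # iterparse() exposes the parsed tree through its "root" property.
> context.root.getroottree().write("newfile.xml")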
>
> See here for some documentation:
>
> http://codespeak.net/lxml/parsing.html
>
> You can also achieve the same thing in XSLT, or using XPath, or ...
>
> Stefan
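
The XPath route mentioned at the end of the quoted reply could look roughly like the sketch below. It parses the whole document first, selects every element in the I18N namespace with one XPath expression, and swaps each for a new element. The body of get_i18n_replacement_for() is a stand-in invented for illustration (it just upper-cases the text into a <span>); a real version would look up a localized value.

```python
from lxml import etree

I18N_NS = "http://apache.org/cocoon/i18n/2.1"


def get_i18n_replacement_for(i18n_element):
    """Hypothetical stand-in: wrap the upper-cased text in a <span>."""
    span = etree.Element("span")
    span.text = (i18n_element.text or "").upper()
    # replace() does not keep the old element's tail, so copy it over.
    span.tail = i18n_element.tail
    return span


root = etree.fromstring(
    '<p xmlns:i18n="%s">Say <i18n:text>hello</i18n:text> now.</p>' % I18N_NS
)

# xpath() returns a plain list, so the tree can be modified while looping.
for el in root.xpath("//i18n:*", namespaces={"i18n": I18N_NS}):
    el.getparent().replace(el, get_i18n_replacement_for(el))

output = etree.tostring(root, encoding="unicode")
```

This sketch assumes the I18N elements are not nested and that the root element itself is not in the I18N namespace; both cases would need extra handling.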