Re: bidirectional python <-> xml transformations?
Eric Levy <contact <at> ericlevy.name>
2013-05-08 04:58:48 GMT
On 05/07/2013 12:59 AM, Stefan Behnel wrote:
> Hi,
>
> one more remark.
>
> Stefan Behnel, 07.05.2013 06:45:
>> Eric Levy, 07.05.2013 01:42:
>>> On 05/06/2013 07:04 PM, Simon Sapin wrote:
>>>> Le 07/05/2013 00:52, Eric Levy a écrit :
>>>>> results[0] is type str which, like type list, causes append() to cause
>>>>> an exception further up the stack. The contents of results[0] is the
>>>>> string analogous to the XPath text() function of the query results. In
>>>>> other words, it's a flattened representation of the result tree.
>>>>
>>>> I’m really not familiar with XSLT, but the API docs indicate that
>>>> apply_templates() returns a list of elements and strings. I don’t know
>>>> what determines which.
>>>>
>>>> http://lxml.de/api/
>>>> http://lxml.de/api/lxml.etree.XSLTExtension-class.html#apply_templates
>>>>
>>>> There is also an output_parent parameter that could be useful to you.
>>>>
>>>>
>>>> But more generally, to resolve this kind of issue: look at the precise
>>>> message of exceptions, be mindful of the types of your various values,
>>>> and double-check docs (both tutorials and API) for hints.
>>>>
>>>> I feel like I’ve helped you more than enough with this now. I hope that
>>>> you got the general idea of the debugging process to do it on your own.
>>>
>>> Well, I'm sorry that you feel that you've given too much help
>>
>> I think that what Simon meant is that he's given you all the help he could,
>> without knowing your code and without having enough information to
>> reproduce your problem.
>>
>>
>>> but the
>>> issue is not that I don't know how to debug my own work. The issue is that
>>> either I am misunderstanding the lxml API, or the implementation is working
>>> incorrectly. I had tried variations of them, such as using the first
>>> element of the XPath results rather than the list, but nothing worked. I
>>> wrote my original message so that someone from the development team could
>>> help me determine which is the case: my misuse of the library, or a bug
>>> inside it. The core of the issue is that XPath generates one or more XML
>>> trees, and XSLT templates operate on an XML tree to create an output tree.
>>> Yet, when I try to connect an XPath result to an XSLT input, then send the
>>> XSLT results to the console, all in a very simple way, I don't have
>>> success. Using the output_parent also causes the nodes to be flattened,
>>> and additionally seems to cause the output to be not well-formed.
>>> Hopefully I can understand why all my attempts are failing, and what needs
>>> to happen for me to be successful. Thanks greatly to anyone who can help.
>>
>> From reading through this thread (and from looking at your initial code
>> example), I get the impression that Simon is right. You do not seem to be
>> aware of what types you pass into the API functions. The append() method
>> only works for Elements, and extend() is not more than a little helper that
>> appends a list of Elements, one after the other.
>>
>> My guess is that your input is not an Element (for append()) or a list of
>> Elements (for extend()), but a list of different objects, perhaps
>> containing other lists and/or strings. Please make sure that's not the
>> case. If it is, you'll have to fix up your input by manually building the
>> desired tree(s) that you want to append.
>
> You may want to take a look at the lxml.builder package for this.
>
>> lxml simply cannot know what it
>> should do with a list of arbitrary objects. That's totally ambiguous input,
>> so it refuses to guess and gives you an appropriate error.
>>
>> If you need more help with debugging your problem, please provide more
>> information about what you are doing, specifically about the data you are
>> trying to process and the kind of XPath query you are using to collect the
>> input data. Otherwise, you make it impossible for others to guess what
>> might be happening.
>
> Stefan
It seems my initial message was not explicit enough. I was aware that
the output of the XPath query is basic Python objects, while the input
required for the XSLT transformation is a tree of objects defined by the
lxml API. The point that I wanted to make, though I admit I did not
state it outright, was that one expects, within the context of the
same software library, that the same type of data is represented by the
same datatypes. So if I have XML data here, and XML data there, they
are hopefully the same data types (unless the library explicitly
supports multiple representations of the data). The reason for this
expectation of consistency, of course, is so that output from one
operation can be input of another.
I am not criticizing or second-guessing the design choices underlying
lxml.  The purpose of the original post was to say: "I am trying to
make the output of one operation the input of another, all within the
same library, and it is surprising that this does not work. What, then,
is the intended way to deal with this issue within lxml?"
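
For reference, a minimal sketch of one way to bridge the two, assuming
Python 3 and a made-up sample document (the element names below are
hypothetical, not taken from the attached case). It shows that an XPath
query selecting elements returns a list of Element objects, while a
text() query returns string-like objects, and that only the former can
be passed to append(). The elements are copied first, because append()
would otherwise move them out of the source tree.

```python
from copy import deepcopy
from lxml import etree

# Hypothetical sample input; any parsed document would do.
source = etree.fromstring("<doc><item>a</item><item>b<sub>c</sub></item></doc>")

elements = source.xpath("//item")        # list of Element objects
strings = source.xpath("//item/text()")  # list of str-like objects, not Elements

target = etree.Element("out")
for node in elements:
    if etree.iselement(node):
        # append() moves an element out of its old tree, so append a copy
        target.append(deepcopy(node))

result = etree.tostring(target).decode()
print(result)
```

Checking each result with etree.iselement() before appending is one way
to handle queries that may return a mix of elements and strings.
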
I should also add that, as it is beginning to appear that there is no
straightforward solution, I have attempted another approach, where I
programmatically select the desired node rather than using an XPath
expression. I am able to do this, but when I try to recursively apply
templates to the selected node, what I am seeing now is that the
apply_templates() routine does not work as it should, according to my
understanding of the documentation.  Instead of processing descendant
elements as though they were in the original document, I am seeing that
the entire subtree is flattened to be text only (elements removed).
I attached a sample case to demonstrate the issue.
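
One possible explanation, offered only as a guess since the attached
sample is not reproduced here: when no template in a stylesheet matches
an element, the built-in XSLT 1.0 template rules descend into it and
emit only its text, which looks exactly like a flattened subtree. The
sketch below uses plain etree.XSLT rather than an XSLTExtension, and a
made-up document, to show that behaviour and how an identity template
changes it. Whether this is what happens in the attached case is an
assumption.

```python
from lxml import etree

XSL_NS = 'xmlns:xsl="http://www.w3.org/1999/XSL/Transform"'

# Hypothetical input document.
doc = etree.fromstring("<root><a>one<b>two</b></a></root>")

# Only the root is matched; descendants fall through to the built-in
# rules, which copy text nodes and drop the element markup.
flatten = etree.XSLT(etree.fromstring(f"""
<xsl:stylesheet version="1.0" {XSL_NS}>
  <xsl:template match="/root">
    <out><xsl:apply-templates/></out>
  </xsl:template>
</xsl:stylesheet>"""))

# Same stylesheet plus an identity template, so descendants are copied
# as elements instead of being reduced to their text.
preserve = etree.XSLT(etree.fromstring(f"""
<xsl:stylesheet version="1.0" {XSL_NS}>
  <xsl:template match="/root">
    <out><xsl:apply-templates/></out>
  </xsl:template>
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
</xsl:stylesheet>"""))

flat = etree.tostring(flatten(doc).getroot()).decode()
kept = etree.tostring(preserve(doc).getroot()).decode()
print(flat)
print(kept)
```

If the stylesheet in question has no template matching the descendant
elements, adding one (an identity template or something more specific)
may be worth trying before concluding that apply_templates() is at
fault.
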
So again I am asking for help from the community, hoping you will have
useful suggestions.
Eric Levy
_________________________________________________________________
Mailing list for the lxml Python XML toolkit - http://lxml.de/
lxml <at> lxml.de
https://mailman-mail5.webfaction.com/listinfo/lxml