Re: carriage return in attribute
2009-03-02 16:39:53 GMT
I understand the XML rules for processing carriage returns. I am dealing with an XML document that is being imported into my application. I am not sure whether it was serialized correctly, but if I read through the document byte by byte I see carriage return (13) and newline (10) as line terminators inside an attribute whose value is a String. I know it's probably wrong to put these characters in an attribute, and that the text should have been element content inside a CDATA section, but this is the document that I need to work with.
So once I parse this document, all CRLFs are converted to LFs and the original line breaks are lost, which changes how this attribute is displayed: the string is shown on a single line instead of with visible newlines.
Now, I guess I could read through the document before it is imported (without a parser) and replace all CRLFs with character references such as &#13;&#10; to make it correct. However, this would be ugly, and I was wondering if there is an easier way to deal with this.
Hope I am being clear in what I am trying to achieve.
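For illustration only, the pre-parse pass described above (rewriting CR and LF inside attribute values as character references before the parser sees them) could be sketched as follows. This uses the Python standard library rather than Xerces, the function name escape_attr_line_breaks is hypothetical, and the regular expression is a deliberate simplification: it assumes the document has no comments, CDATA sections or processing instructions containing text that looks like a start tag.

```python
import re
import xml.etree.ElementTree as ET

# A quoted attribute value (double or single quotes; '<' is not allowed inside).
_VALUE = r'"[^"<]*"' + "|" + r"'[^'<]*'"
# A start tag or empty-element tag together with its attributes.
_TAG = re.compile(
    r"<[A-Za-z_:][^\s<>/]*(?:\s+[^\s=<>]+\s*=\s*(?:" + _VALUE + r"))*\s*/?>"
)
_QUOTED = re.compile(_VALUE)


def _escape_value(match):
    # Rewrite literal CR and LF as character references, which survive
    # attribute value normalization.
    value = match.group(0)
    return value.replace("\r", "&#13;").replace("\n", "&#10;")


def escape_attr_line_breaks(xml_text):
    """Escape CR/LF inside attribute values only (hypothetical helper).

    Line breaks between attributes or in element content are left alone.
    """
    return _TAG.sub(lambda tag: _QUOTED.sub(_escape_value, tag.group(0)), xml_text)


raw = '<doc>\r\n  <item note="line one\r\nline two"/>\r\n</doc>'
fixed = escape_attr_line_breaks(raw)
note = ET.fromstring(fixed).find("item").get("note")
print(repr(note))
```

When reading the real file, it would need to be opened so that line endings are not translated (for example open(path, newline="") in Python), otherwise the CRs are gone before this pass runs.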
I'm not sure what you're asking for. Attribute value normalization is part of the parsing process. It occurs before the data is presented to an application through any of the standard APIs.
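A small sketch of what that normalization does to the data an application receives. It uses Python's standard-library parser (expat) rather than Xerces, purely as an illustration; the element and attribute names are made up. A conforming parser should treat the two inputs the same way: the literal CRLF is normalized away, while the character references are preserved.

```python
import xml.etree.ElementTree as ET

# Literal CR LF inside the attribute value: normalized by the parser.
literal = ET.fromstring('<item note="line one\r\nline two"/>')

# Same line break written as character references: passed through intact.
escaped = ET.fromstring('<item note="line one&#13;&#10;line two"/>')

print(repr(literal.get("note")))  # 'line one line two'
print(repr(escaped.get("note")))  # 'line one\r\nline two'
```

By the time the application sees the first value there is no way to tell where the line break was, so any fix has to happen before parsing or in the program that writes the document.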
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas <at> ca.ibm.com
E-mail: mrglavas <at> apache.org
Aleksandr Kravets <akravets.work <at> gmail.com> wrote on 02/27/2009 10:07:08 AM:
> Are there utilities in Xerces that make carriage return
> normalization easier than, let's say, parsing the whole document and
> doing it manually?
> On Thu, Feb 26, 2009 at 6:39 PM, <keshlam <at> us.ibm.com> wrote:
> Carriage return is ASCII 13, so &#13; or &#xD; will represent that character.
> However, be sure you understand XML's rules for whitespace
> normalization in attribute values. Depending on what you're trying
> to do, you may want to replace that attribute with a child
> element... or replace the offending character with some notation
> that your application, rather than XML, will process appropriately.
> "... Three things see no end: A loop with exit code done wrong,
> A semaphore untested, And the change that comes along. ..."
> -- "Threes" Rev 1.1 - Duane Elms / Leslie Fish (http://www.ovff.