6 May 2007 23:56
CRLF in <string> -- XML vs XML-RPC
<bryanh <at> giraffe-data.com>
2007-05-06 21:56:49 GMT
I'd like to bring up again a favorite topic on this list: the line of the XML-RPC spec that attempts to say you can put arbitrary bytes in a <string> element.

Those who have followed this discussion know that this is at odds with another important statement in the spec: that an XML-RPC message is XML. The problem is that if it's XML, then the contents of a <string> element are characters, the XML spec says a lot about how they're interpreted, and there's no obvious way to put arbitrary byte data in there.

One area where it's particularly hard to stretch XML to cover this binary data idea is "control character" data. While it's easy to assume that you would send the byte 0x41 as the XML character sixty-five, there's no obvious way to send the byte 0x00, because there is no XML character zero. On the other hand, you can imagine an XML character zero, and some implementations do.

But there's one other area that causes trouble: CRLF. XML says that when an XML processor sees "<string>\r\n</string>" (\r and \n here are two characters, in the C notation), it should present "\n" to the XML application. But the binary data concept would suggest that the XML-RPC application should see "\r\n".

Previous discussion seems to have reached the consensus that it's better to follow XML than some interpretation of that misguided paragraph in the XML-RPC spec. One good reason to do this is so you can use standard XML processing software.
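
To make the CRLF and zero-byte points concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree (backed by Expat) as a stand-in for "standard XML processing software". The helper name string_content and the three sample documents are illustrative assumptions, not anything from the XML-RPC spec or from a particular XML-RPC library.

```python
import xml.etree.ElementTree as ET


def string_content(document: bytes) -> str:
    """Return the character data an XML application sees inside <string>."""
    # Hypothetical helper: parse a one-element document and return its text.
    return ET.fromstring(document).text or ""


# A literal CR LF pair in the document is subject to XML end-of-line handling.
normalized = string_content(b"<string>\r\n</string>")

# A carriage return written as a character reference is not a literal line
# break in the document, so the parser passes it through.
escaped = string_content(b"<string>&#13;\n</string>")

# Character zero is not an XML character, so a conforming parser rejects
# a reference to it.
try:
    string_content(b"<string>&#0;</string>")
    nul_accepted = True
except ET.ParseError:
    nul_accepted = False

print(repr(normalized), repr(escaped), nul_accepted)
```

With a conforming parser, the first result is the single character "\n", the second keeps "\r\n", and the third document is rejected. So a sender that follows XML and needs the receiver to see a carriage return can write it as &#13;, but nothing comparable exists for the byte 0x00.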