bryanh | 6 May 23:56 2007

CRLF in <string> -- XML vs XML-RPC

I'd like to bring up again a favorite topic on this list: the line of the
XML-RPC spec that attempts to say you can put arbitrary bytes in a 
<string> element.

Those who have followed this discussion know that this is at odds with
another important statement in the spec that an XML-RPC message is XML.
The problem is that if it's XML, then the contents of a <string>
element are characters and the XML spec says a lot about how they're
interpreted and there's no obvious way to put arbitrary byte data in
there.

One area where it's particularly hard to stretch XML to cover this
binary data idea is "control character" data, because while it's easy
to assume that you would send the byte 0x41 as the XML character
sixty-five, there's no obvious way to send the byte 0x00, because
there is no XML character zero.  But on the other hand, you can imagine
an XML character zero and some implementations do.

But there's one other area that causes trouble: CRLF.  XML says when
an XML processor sees "<string>\r\n</string>" (\r and \n here are two
characters, in the C notation), it should present to the XML application
"\n".

But the binary data concept would suggest that the XML-RPC application
should see "\r\n".

Previous discussion seems to reach the consensus that it's better to
follow XML than some interpretation of that misguided paragraph in the
XML-RPC spec.  One good reason to do this is so you can use standard
XML processing software.
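
To make the normalization concrete, here is a minimal Python sketch. It assumes the standard library's Expat binding (xml.parsers.expat) as the "standard XML processing software"; the helper name character_data is made up for illustration. It feeds a literal CR LF pair to the parser and collects what the application-level handler receives.

```python
import xml.parsers.expat


def character_data(document: bytes) -> str:
    """Return the character data an Expat-based parser hands to the application."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # The handler may be called several times for one element, so collect pieces.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(document, True)
    return "".join(chunks)


# A literal CR LF pair on the wire, inside a <string> element.
seen = character_data(b"<string>line one\r\nline two</string>")
print(repr(seen))
```

Comparing the printed value with the bytes that went in shows whether the parser in use applies the XML line-ending rule.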

John Wilson | 7 May 09:36 2007

Re: CRLF in <string> -- XML vs XML-RPC


On 6 May 2007, at 22:56, bryanh <at> giraffe-data.com wrote:

>
> So, enough of the background.  I'm posting because I have recently been
> told that many or most XML-RPC servers see "\r\n" at the XML-RPC
> application level.  So I want to poll: In existing implementations, if
> <string>\r\n</string> comes over the wire, does the XML-RPC application
> code see "\r\n" or "\n"?  And does the implementation use standard
> XML parsing code or custom XML-RPC code?
>
> On the sending side, you can defeat XML's attempts to normalize line
> endings by sending the six characters &#x0d; instead of the character
> \r.  That brings you closer to the spirit of the "binary data" paragraph
> in the XML-RPC spec.  Does anybody do that?
>
> I'll start.  I maintain the XML-RPC For C/C++ libraries.  As far as I
> know, they have always presented \r\n to the application as \n on the
> receiving end, and it's because they use Expat and Libxml2 XML
> parsers.  On the sending side, if the application presents a string
> containing "\r\n" in the value for a <string>, those two characters go
> onto the wire.
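
The sending-side trick quoted above can be sketched in Python as follows. This is only an illustration, not the code of any library mentioned in this thread: encode_string_value is a hypothetical helper, and xml.etree.ElementTree stands in for the receiving parser. The carriage return is written as a character reference, so the parser has no raw CR to normalize.

```python
import xml.etree.ElementTree as ET


def encode_string_value(text: str) -> str:
    """Build a <string> element whose carriage returns survive XML parsing."""
    escaped = (
        text.replace("&", "&amp;")   # must come first, or later escapes get re-escaped
            .replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace("\r", "&#13;")  # character reference instead of a raw CR
    )
    return "<string>" + escaped + "</string>"


wire = encode_string_value("one\r\ntwo")
received = ET.fromstring(wire).text
print(wire)
print(repr(received))
```

The line feed is left raw here because a lone LF is already the normalized form.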

Bryan,

I wrote and maintain MinML-RPC ( a server only implementation written  

Gaetano Giunta | 7 May 11:32 2007

RE: CRLF in <string> -- XML vs XML-RPC

As far as I am concerned (i.e. as maintainer of the php-xmlrpc lib):

+ by default, when sending, I encode EVERY char outside the ASCII range using &#xxxx; notation. For an 8-bit
charset (php basically defaults to iso-8859-1) that means chars <= 32 and >= 160. It is the best solution I
could come up with to keep charset transcoding errors to a minimum. Plus CR LF are always sent and
received as-is.
The receiving end should decode the payload (which is pure ASCII), and if the Unicode code point is
not available in the charset that the client app desires, it is up to that app to decide what to do with it.

Note: the php application can, when using the lib, specify that it wants sent data encoded as iso-8859-1 or
UTF-8, in which case I do not convert \r or \n chars to their code point representation - and I do not
normalize upon sending, either.

+ when receiving, the php xml parser is used. It is expat-based. As far as I can tell, it does not normalize
CR/LF while decoding (but, of course, if the sending application has normalized it, I get what they sent)

+ the consensus on the spec is indeed 'that phrase is crap - for binary data go base64'
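
For the "go base64" route, a minimal sketch using Python's standard xmlrpc.client module (not php-xmlrpc) might look like this. The method name example.store is invented for the example; Binary, dumps and loads are the module's own names.

```python
import xmlrpc.client

# Bytes that cannot travel unchanged as raw XML characters.
payload = b"\x00\r\n\xff"

# Binary wraps the bytes so that dumps() emits a <base64> element.
request = xmlrpc.client.dumps(
    (xmlrpc.client.Binary(payload),),
    methodname="example.store",  # hypothetical method name
)

# use_builtin_types=True makes loads() return bytes instead of a Binary object.
params, method = xmlrpc.client.loads(request, use_builtin_types=True)
print(request)
print(params[0] == payload)
```

Because the payload is carried as base64 text, neither line-ending normalization nor the missing "character zero" comes into play.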

Bye
Gaetano


bryanh | 8 May 17:23 2007

Re: CRLF in <string> -- XML vs XML-RPC

>by default I encode using &#xxxx; notation EVERY char outside ASCII
>range when sending. That means, for 8 bit charset (php defaults to
>iso-8859-1 basically), chars <= 32 and >= 160. It is the best solution
>I could come up to make sure the charset transcoding errors are kept
>to a minimum. Plus CR LF are always sent and received as-is.

>Note: the php application can, when using the lib, specify it desires
>to encode sent data as iso-8859-1 or UTF-8, in which case I do not
>convert \r or \n chars to their code point representation - and I do
>not normalize upon sending, either.

I can't see a difference between the default and the option.  When do
you send \r \n as the raw characters and when don't you?  When you say
"code point representation," which representation is that?  &#xxxx; ?

>when receiving, the php xml parser is used. It is expat-based. As
>far as I can tell, it does not normalize CR/LF while decoding

That's hard to believe.  You mean the conventional PHP XML parser
doesn't normalize CR/LF ever?  Considering 1) that the XML spec makes
it pretty clear that it's required; and 2) it's generally very useful,
I don't think it could get away with that.  Plus, I have an Expat
parser from 2001 that has the normalization integrated pretty tightly
into it.

--

-- 
Bryan Henderson                                   San Jose, California

 
Yahoo! Groups Links

Gaetano Giunta | 9 May 10:10 2007

Re: CRLF in <string> -- XML vs XML-RPC

>> by default I encode using &#xxxx; notation EVERY char outside ASCII
>> range when sending. That means, for 8 bit charset (php defaults to
>> iso-8859-1 basically), chars <= 32 and >= 160. It is the best solution
>> I could come up to make sure the charset transcoding errors are kept
>> to a minimum. Plus CR LF are always sent and received as-is.

>> Note: the php application can, when using the lib, specify it desires
>> to encode sent data as iso-8859-1 or UTF-8, in which case I do not
>> convert \r or \n chars to their code point representation - and I do
>> not normalize upon sending, either.

> I can't see a difference between the default and the option. When do
> you send \r \n as the raw characters and when don't you? When you say
> "code point representation," which representation is that? &#xxxx; ?

Sorry for my poor wording. I am always very bad at using the correct terms.
By "code point representation" I did mean the "&#xxxx;" representation or, as it is named in the spec,
"character reference".

I send \r \n as "raw" characters when the user of the library specifies, via a specific call, that he wants to
use a particular character set for the outgoing payload.
Since the charsets supported by the lib are ISO-8859-1 and UTF-8 only, raw CR and LF chars are deemed to be valid.
If the user of the lib does not specify anything, I do the "always encode using character references"
trick, and send character references for \r and \n.
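
A rough Python sketch of that default mode follows. It is not the php-xmlrpc code; to_ascii_xml is a hypothetical helper, and the cut-off (anything below 32 or above 126 becomes a character reference) is an assumption chosen for illustration. Control characters other than tab, CR and LF are not handled, since a reference such as &#0; would not be well-formed XML.

```python
import xml.etree.ElementTree as ET


def to_ascii_xml(text: str) -> str:
    """Escape markup and replace everything outside printable ASCII with &#N;."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        elif ch == ">":
            out.append("&gt;")
        elif code < 32 or code > 126:
            # Covers \r and \n as well as every non-ASCII character.
            out.append("&#%d;" % code)
        else:
            out.append(ch)
    return "".join(out)


original = "caf\u00e9\r\n\u20ac"
wire = "<string>" + to_ascii_xml(original) + "</string>"
decoded = ET.fromstring(wire).text
print(wire)
print(decoded == original)
```

The payload is pure ASCII, so it reads the same whatever charset the receiver assumes, and the CR arrives as a character reference rather than as a raw line ending.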

One other thing I do, which I did not mention, is that by default I do not send any charset declaration in either the
xml prologue or the http headers, but when the lib user specifies the charset he wants to use I set it in
both places. 

Oh, and I also send '&amp;', '&quot;', '&apos;', '&lt;', '&gt;' as entity references instead of character

John Wilson | 9 May 12:02 2007

Re: CRLF in <string> -- XML vs XML-RPC

If it is of interest, this is how I handle encoding in the Groovy  
implementation:

Characters with values less than 0X20 but not equal to  0X09  0X0A or  
0X0D cause an exception to be thrown and no data is sent

Characters with values greater than 0XD800 and not greater than or  
equals to  0XE000 and less than 0XFFFE cause an exception to be  
thrown and no data is sent

The characters '<', '>' and '&' are encoded as &lt;, &gt; and &amp;

Other characters with value less than or equal to 0XFF get sent "as  
is" (including 0X09  0X0A and 0X0D)

All others get represented as &#x....; entities

I do this no matter what the encoding of the document is set to.

So I always send "well formed" XML.

I should probably be more conservative and encode everything > 0XEF  
so that implementations which don't understand UTF-8 but do  
understand numeric character entities will be able to process the  
message.
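
As a rough illustration, the rules above could be sketched in Python like this. It is not the Groovy code: encode_xml_text is a hypothetical name, and for the rejected upper ranges the sketch assumes the XML 1.0 Char production (surrogates 0xD800 to 0xDFFF, plus 0xFFFE and 0xFFFF), which may not match the implementation exactly.

```python
def encode_xml_text(text: str) -> str:
    """Encode text for a <string> element, refusing characters XML cannot carry."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x20 and code not in (0x09, 0x0A, 0x0D):
            raise ValueError("control character U+%04X is not allowed in XML" % code)
        if 0xD800 <= code <= 0xDFFF or code in (0xFFFE, 0xFFFF):
            raise ValueError("code point U+%04X is not an XML character" % code)
        if ch == "<":
            out.append("&lt;")
        elif ch == ">":
            out.append("&gt;")
        elif ch == "&":
            out.append("&amp;")
        elif code <= 0xFF:
            out.append(ch)  # sent "as is", including tab, LF and CR
        else:
            out.append("&#x%X;" % code)  # numeric character reference
    return "".join(out)


encoded = encode_xml_text("a<b&c\r\n\u00e9\u20ac")
print(encoded)
```

Raising before anything is appended to the wire matches the "no data is sent" behaviour: the caller either gets a complete, well-formed string or an exception.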

John Wilson

 

bryanh | 10 May 03:04 2007

Re: CRLF in <string> -- XML vs XML-RPC

>> That's hard to believe. You mean the conventional PHP XML parser
>> doesn't normalize CR/LF ever? Considering 1) that the XML spec makes
>> it pretty clear that it's required; and 2) it's generally very useful,
>> I don't think it could get away with that. Plus, I have an Expat
>> parser from 2001 that has the normalization integrated pretty tightly
>> into it.

>Well, I quoted from memory, and I am most likely wrong.  I just
>re-read the spec, and as you correctly pointed out in chapter 2.11 the
>normalization is in fact required upon parsing.

Well, there's still some confusion, because I have a user who, using
your implementation as a standard by which to measure mine, claims
that yours does _not_ do line ending normalization on the receiving
side (specifically, where a server receives a <string> parameter).
Could you have been wrong about the fact that you (PHP XML-RPC) use
the standard PHP XML parser but right about the fact that the PHP
XML-RPC server doesn't do line ending normalization?

Same user says a Python XML-RPC server does _not_ do line ending
normalization of <string> parameter values.
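
One way to check such a claim for Python is to hand a raw request to the standard library's xmlrpc.client.loads and look at the decoded parameter. This is a minimal sketch; the method name example.echo is made up, and the result reflects whichever Python version and parser it is run against, not necessarily the server that user tested.

```python
import xmlrpc.client

# A request with a literal CR LF pair inside the <string> value.
request = (
    "<?xml version='1.0'?>"
    "<methodCall><methodName>example.echo</methodName><params>"
    "<param><value><string>one\r\ntwo</string></value></param>"
    "</params></methodCall>"
)

params, method = xmlrpc.client.loads(request)
print(repr(params[0]))
```

In current CPython this module parses with Expat, so I would expect the CR to be gone by the time the application sees the string.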

--

-- 
Bryan Henderson                                   San Jose, California

 
Yahoo! Groups Links

<*> To visit your group on the web, go to:
    http://groups.yahoo.com/group/xml-rpc/

Gaetano Giunta | 11 May 12:10 2007

RE: CRLF in <string> -- XML vs XML-RPC

Only way to make sure is trying out, really... I'll try to allocate time for a test in the next couple of days.

I am fairly sure I use the standard xml parser that comes with php, unless I have coded an xml parser from
scratch without realizing it ;)

Afaik, that means expat for php 4. It can be either a version bundled together with php, the expat version in
use by apache or any other version available on the system.

For php5, the thing is muddier: the online man page talks about an "expat compat layer" without giving many details...
ldd and source code perusal seem to confirm that libxml is used as the default xml parser in such a
configuration (plus I just remembered some quirks that emerged in my lib when first testing with php 5, and
a bug I opened on php.net that was dismissed as bogus by the developers. This confirms that the xml parser in
use is another one altogether)

Of course, expat and libxml might have different povs regarding CRLF normalization...
I have no idea about Python.

Bye
Gaetano


generalcopperfield | 15 May 14:12 2007

xml-rpc structs

Hello,

I am working with xmlrpc-c v.1.1 because I work on Windows and with C 
(this version is checked for Windows). But I can't get structs to work. 
Using for example
sent_structure = xmlrpc_build_value(&env, 
                                   "({s:d,s:d}{s:d,s:d}{s:d,s:d})", 
                                   "min0", 0.2,
                                   "max0", 20,
                                   "min1", 0.5,
                                   "max1", 31.9,
                                   "min2", 5.75,
                                   "max2", 35.9
                                  );

gives an error. Does someone know how I can make structs work?

Thank you!!

General Copperfield.

 

xmlrpccarlosfred | 16 May 22:49 2007

Delphi and C# xmlrpc


Hi,

I'm new to xmlrpc, and I need to build a client in Delphi 7 that 
communicates with a server in C# using xml-rpc. Does anyone know how 
I can call C# methods on the server from the Delphi 7 client?

I've done a client and a server in C#, but I don't know how to use a 
client in Delphi 7.

Thanks,

Carlos Frederico

 

