Julian Reschke | 5 Oct 15:54 2006

Feedback on draft-gregorio-uritemplate-00


Hi Joe/Mark/Marc/David,

I think this goes in the right direction. Congratulations.

The main issue I see is that the spec doesn't seem to have a position on 
what to do with values that contain non-URI friendly character 
sequences. For instance, in

<http://bitworking.org/projects/URI-Templates/draft-gregorio-uritemplate-00.html#rfc.section.4.2.p.2> 
you say:

"If the value of a template variable would conflict with a reserved 
character's purpose as a delimiter, then the conflicting data must be 
percent-encoded before substitution."

However, in

<http://bitworking.org/projects/URI-Templates/draft-gregorio-uritemplate-00.html#rfc.section.4.3> 
we see:

+++++++++++
The following are examples of URI Template expansions that are not legal.

     Name                                Value
     ------------------------------------------------------------
     a                                   fred barney
     b                                   %

The following URI Templates are expanded with the given values and do 
not produce legal URIs.

James M Snell | 5 Oct 16:23 2006

Re: Feedback on draft-gregorio-uritemplate-00


Hey Julian,

Yeah, this actually came up in our own discussions prior to publishing
the draft.  The point that the spec is trying to make is that invalid
characters MUST be percent encoded before performing the replacement so
rather than replacing {a} with the literal "fred barney" you'd replace
it with the literal "fred%20barney".  This should ensure that no
additional processing of the URI is necessary after performing the
template expansion.
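
A minimal sketch of that reading, assuming the template processor itself
percent-encodes every value before substitution. The expand() helper is
hypothetical and not taken from the draft; it relies on Python's standard
urllib.parse.quote.

```python
from urllib.parse import quote


def expand(template, values):
    """Replace each {name} in template with its percent-encoded value."""
    result = template
    for name, value in values.items():
        # safe="" encodes everything except unreserved characters,
        # so a space becomes %20 and a bare "%" becomes %25.
        result = result.replace("{" + name + "}", quote(value, safe=""))
    return result


print(expand("http://example.org/{a}", {"a": "fred barney"}))
# → http://example.org/fred%20barney
```

With this approach the expanded string needs no further processing, at the
cost of the caller never being able to pass a pre-encoded value.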

- James

Julian Reschke wrote:
> [snip]
> However, in
> <http://bitworking.org/projects/URI-Templates/draft-gregorio-uritemplate-00.html#rfc.section.4.3>
> we see:
> 
> +++++++++++
> The following are examples of URI Template expansions that are not legal.
> 
>     Name                                Value
>     ------------------------------------------------------------
>     a                                   fred barney
>     b                                   %
> 
> The following URI Templates are expanded with the given values and do
> not produce legal URIs.
> 

Julian Reschke | 5 Oct 16:38 2006

Re: Feedback on draft-gregorio-uritemplate-00


James M Snell wrote:
> Hey Julian,
> 
> Yeah, this actually came up in our own discussions prior to publishing
> the draft.  The point that the spec is trying to make is that invalid
> characters MUST be percent encoded before performing the replacement so
> rather than replacing {a} with the literal "fred barney" you'd replace
> it with the literal "fred%20barney".  This should ensure that no
> additional processing of the URI is necessary after performing the
> template expansion.

OK,

in that case the examples should be clarified.

Now, what if I put non-ASCII characters into a variable? Should the spec 
require encoding à la IRI?

Best regards, Julian

Joe Gregorio | 5 Oct 16:52 2006

Re: Feedback on draft-gregorio-uritemplate-00


On 10/5/06, Julian Reschke <julian.reschke <at> gmx.de> wrote:
>
> Hi Joe/Mark/Marc/David,
>
> I think this goes in the right direction. Congratulations.

Thanks! More responses inline.

> The main issue I see is that the spec doesn't seem to have a position on
> what to do with values that contain non-URI friendly character
> sequences. For instance, in
> <http://bitworking.org/projects/URI-Templates/draft-gregorio-uritemplate-00.html#rfc.section.4.2.p.2>
> you say:
>
> "If the value of a template variable would conflict with a reserved
> character's purpose as a delimiter, then the conflicting data must be
> percent-encoded before substitution."
>
> However, in
> <http://bitworking.org/projects/URI-Templates/draft-gregorio-uritemplate-00.html#rfc.section.4.3>
> we see:
>
> +++++++++++
> The following are examples of URI Template expansions that are not legal.
>
>      Name                                Value
>      ------------------------------------------------------------
>      a                                   fred barney
>      b                                   %

James M Snell | 5 Oct 16:57 2006

Re: Feedback on draft-gregorio-uritemplate-00


For a URI template, the only legal values for a template variable name
are in the "reserved" set.  Anything else is not allowed.  (Applications
are free to treat sequences like %20 as pct-encoded if they wish, but
that is not required.)  I am working on an IRI templates variation that
will allow non-ASCII characters in template variable names and will
allow for the generation of IRIs.

As for the replacement values, again, they MUST result in a valid URI,
meaning that the application is responsible for proper encoding of the
values so the result is valid.
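
If the application is responsible for encoding, a processor could at least
reject results that cannot be URIs. The sketch below is an assumption, not
something the draft specifies: has_only_uri_characters() is a hypothetical
helper that checks characters only (unreserved, reserved, and well-formed
pct-encoded triplets per RFC 3986), not the full URI grammar.

```python
import re

# Unreserved and reserved characters from RFC 3986, plus %XX triplets.
URI_CHARS = re.compile(
    r"(?:[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=]|%[0-9A-Fa-f]{2})*"
)


def has_only_uri_characters(candidate):
    """Return True if every character could legally appear in a URI."""
    return URI_CHARS.fullmatch(candidate) is not None


print(has_only_uri_characters("http://example.org/fred%20barney"))  # True
print(has_only_uri_characters("http://example.org/fred barney"))    # False
print(has_only_uri_characters("http://example.org/%"))              # False
```

The two rejected strings correspond to the values "fred barney" and "%"
that the draft lists as not producing legal URIs.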

- James

Julian Reschke wrote:
> James M Snell wrote:
>> Hey Julian,
>>
>> Yeah, this actually came up in our own discussions prior to publishing
>> the draft.  The point that the spec is trying to make is that invalid
>> characters MUST be percent encoded before performing the replacement so
>> rather than replacing {a} with the literal "fred barney" you'd replace
>> it with the literal "fred%20barney".  This should ensure that no
>> additional processing of the URI is necessary after performing the
>> template expansion.
> 
> OK,
> 
> in that case the examples should be clarified.
> 

Julian Reschke | 5 Oct 17:01 2006

Re: Feedback on draft-gregorio-uritemplate-00


Joe Gregorio wrote:
> This has been a long running part of the discussion
> and any ideas would be greatly appreciated. Here
> is a quick synopsis of the problems:
> 
> 1. What about the character encoding of non-ascii characters?
>    Do we force UTF-8?

I think if you *can* enforce it, by all means do it.

> 2. What about double escaping?
>    Given:
> 
>       a                                 none%20of%20the%20above
> 
>   Should the substitution be:
> 
>       http://example.org/{a}
>       http://example.org/none%20of%20the%20above
> 
>   or
> 
>       http://example.org/none%2520of%2520the%2520above

I think it would be the latter. Pick one, but then be consistent :-)
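
To make the two candidate expansions concrete, here is a small sketch using
Python's standard urllib.parse. It only illustrates the choice and does not
reflect a decision in the draft.

```python
from urllib.parse import quote, unquote

value = "none%20of%20the%20above"

# Option 1: trust the value as already encoded and insert it unchanged.
as_is = "http://example.org/" + value

# Option 2: always encode, so each literal "%" becomes "%25".
always_encoded = "http://example.org/" + quote(value, safe="")

print(as_is)           # → http://example.org/none%20of%20the%20above
print(always_encoded)  # → http://example.org/none%2520of%2520the%2520above

# Only option 2 round-trips: decoding once yields the original value.
print(unquote(quote(value, safe="")) == value)  # → True
```

Always encoding treats the value as opaque data, which is what makes the
rule consistent: the processor never has to guess whether "%20" was meant
literally.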

> 3. What about 'reserved' characters?
>    Given:
> 

Julian Reschke | 5 Oct 17:04 2006

Re: Feedback on draft-gregorio-uritemplate-00


James M Snell wrote:
> For a URI template, the only legal values for a template variable name
> are in the "reserved" set.  Anything else is not allowed.  (Applications
> are free to treat sequences like %20 as pct-encoded if they wish, but
> that is not required.)  I am working on an IRI templates variation that
> will allow non-ASCII characters in template variable names and will
> allow for the generation of IRIs.
> 
> As for the replacement values, again, they MUST result in a valid URI,
> meaning that the application is responsible for proper encoding of the
> values so the result is valid.

...that's certainly another possible point of view. In the end, the spec 
must be consistent with regard to this.

In general, people frequently get URI/IRI encoding wrong, so being 
strict here makes a lot of sense.

Best regards, Julian

James M Snell | 5 Oct 17:48 2006

Re: Feedback on draft-gregorio-uritemplate-00


The underlying question is whether the template processor is
responsible for performing the escaping or whether the application
providing the values is responsible.

The difference ends up being very important. In the current draft, a
replacement value can span multiple segments, e.g.,

  http://example.org/{foo}

could expand to:

  http://example.org/jasnell/2006/10?foo#bar

If the template processor performs the escaping, then we must either
a) rule out the ability for a replacement value to span multiple
segments (e.g. gen-delims would always end up encoded), b) define
explicitly which characters get encoded and which do not, or c) define
some sort of escaping mechanism so the processor knows not to encode
certain characters.
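
The difference between options (a) and (b) can be sketched with the "safe"
parameter of Python's urllib.parse.quote. The particular safe set "/?#" is
an assumption chosen for illustration, not a set the draft defines.

```python
from urllib.parse import quote

value = "jasnell/2006/10?foo#bar"

# Option (a): encode every delimiter, so the value stays in one segment.
single_segment = quote(value, safe="")

# Option (b): declare some delimiters safe, so the value can span segments.
multi_segment = quote(value, safe="/?#")

print(single_segment)  # → jasnell%2F2006%2F10%3Ffoo%23bar
print(multi_segment)   # → jasnell/2006/10?foo#bar
```

Whichever set is chosen has to be fixed by the spec; otherwise two
processors can expand the same template and values differently.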

If "none%20of%20the%20above" should expand to
"none%2520of%2520the%2520above", then the replacement value
"jasnell/2006/10?foo#bar" should instead expand to
"jasnell%2F2006%2F10%3Ffoo%23bar"

- James

Julian Reschke wrote:
>>[snip]

Bjoern Hoehrmann | 5 Oct 18:15 2006

Re: Feedback on draft-gregorio-uritemplate-00


* Julian Reschke wrote:
>[draft-gregorio-uritemplate-00.txt]
>I think this goes in the right direction. Congratulations.

I somewhat miss the point of the document. What is being defined is a
general-purpose template format except that:

>"If the value of a template variable would conflict with a reserved 
>character's purpose as a delimiter, then the conflicting data must be 
>percent-encoded before substitution."

Without this escaping procedure (and I do not understand the requirement
at all), there is nothing specific to "URIs" in this document,
except for the title and the misplaced constraint that the result of
applying the replacement algorithm must be a "valid" URI in some sense.

Why not leave the validity requirement to protocols using this kind of
template format, and make the escaping configurable, if you have any
escaping at this level at all? You could, for example, provide triples
of

  [ name, value, escaping-method ]

as input to the template processor, or specify it inline like

  http://{punycode:host}/{uri:dir}?q={base64:q}#{frag}
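
A rough sketch of how such an inline syntax might be processed. Everything
here is hypothetical: the {method:name} grammar, the ESCAPERS table, and
the meaning given to "uri", "punycode" and "base64" are assumptions for
illustration, and a variable without a prefix is inserted unescaped.

```python
import base64
import re
from urllib.parse import quote

# Hypothetical mapping from escaping-method name to an encoder function.
ESCAPERS = {
    "uri": lambda v: quote(v, safe=""),
    "punycode": lambda v: v.encode("idna").decode("ascii"),
    "base64": lambda v: base64.urlsafe_b64encode(v.encode("utf-8")).decode("ascii"),
    None: lambda v: v,  # no prefix: insert the value as-is
}

# Matches {name} or {method:name}; group 1 is None when no method is given.
VARIABLE = re.compile(r"\{(?:(\w+):)?(\w+)\}")


def expand(template, values):
    """Expand each variable using the escaping method named in the template."""
    def substitute(match):
        method, name = match.group(1), match.group(2)
        return ESCAPERS[method](values[name])
    return VARIABLE.sub(substitute, template)


print(expand(
    "http://{punycode:host}/{uri:dir}?q={base64:q}#{frag}",
    {"host": "example.org", "dir": "a b/c", "q": "hello", "frag": "top"},
))
# → http://example.org/a%20b%2Fc?q=aGVsbG8=#top
```

The same table could equally be driven by [ name, value, escaping-method ]
triples supplied by the caller instead of prefixes in the template.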

Or use an encoding specifier prefix like


Julian Reschke | 5 Oct 19:00 2006

Re: Feedback on draft-gregorio-uritemplate-00


James M Snell wrote:
> The underlying question is whether the template processor is
> responsible for performing the escaping or whether the application
> providing the values is responsible.

Yes.

> The difference ends up being very important. In the current draft, a
> replacement value can span multiple segments, e.g.,

Well, the current draft IMHO is ambiguous, see 
<http://bitworking.org/projects/URI-Templates/draft-gregorio-uritemplate-00.html#rfc.section.4.2.p.2>:

"If the value of a template variable would conflict with a reserved 
character's purpose as a delimiter, then the conflicting data must be 
percent-encoded before substitution."

So that IMHO needs to be clarified.

Independently of this, I don't think you can get away without stating 
how non-ASCII characters need to be escaped....
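
To illustrate why the character encoding has to be stated, the sketch below
percent-encodes the same non-ASCII value under two encodings with Python's
urllib.parse.quote. Mandating UTF-8 is only one possible choice and is an
assumption here, not something the draft says.

```python
from urllib.parse import quote

value = "à la"

# UTF-8 first, then percent-encode each byte (the IRI-to-URI convention).
utf8_form = quote(value, safe="", encoding="utf-8")

# The same characters under ISO-8859-1 produce a different URI.
latin1_form = quote(value, safe="", encoding="latin-1")

print(utf8_form)    # → %C3%A0%20la
print(latin1_form)  # → %E0%20la
```

Both results are syntactically valid, so a receiver cannot tell which
encoding was used unless the spec fixes one.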

 > ...

Best regards, Julian

