Sebastian Pipping | 1 Sep 2008 19:08

uriparser 0.7.2 released


Hello!

This release fixes bad cleanup logic in two functions
related to URI reference creation and resolution.
Depending on your usage of these functions the result
could have been either a crash or a memleak, so updating
is strongly recommended.
This release is both source- and binary-compatible.

Download:
http://sourceforge.net/project/showfiles.php?group_id=182840

Changelog:
http://sourceforge.net/project/shownotes.php?release_id=623493

Sebastian

Mark Nottingham | 15 Sep 2008 13:57

URI Templates: done or dead?


There hasn't been a lot of discussion or activity on URI Templates  
recently, which either means it's very stable, or very nearly dead.

If it's very stable, we should ship it and be done with it. If it's  
nearly dead (and I do get a whiff of that; while I continuously hear  
people clamouring for it to be finished, not many seem to be willing  
to use it in its current state; YMMV), we should at least try to  
revive it.

My continuing concerns with the -03 draft are that it's too complex,  
not human-friendly, and it makes the common, simple use cases hard.  
The first example in the spec ( http://www.example.com/users/{userid} )
holds up well, but it goes quickly downhill from there;
( http://www.example.com/?{-join|&|query,number} ) looks like line noise, IMHO.

I believe there are a few things we can do to make URI Template more  
broadly useful and useable, without sacrificing too much functionality  
(at least in the 80% case).

1. Reduce or drop operators.

As mentioned above, they don't read well; they're obviously intended  
for machines, not people. The expansion for a template should be  
blindingly obvious, but the operator syntax seems to want to get in  
the way rather than help. Furthermore, the vast majority of use cases  
for templates are for simple template substitution, not operations  
like 'neg' and 'opt'.

2. Drop list values.

Julian Reschke | 15 Sep 2008 16:33

Re: URI Templates: done or dead?


Mark Nottingham wrote:
> 
> There hasn't been a lot of discussion or activity on URI Templates 
> recently, which either means it's very stable, or very nearly dead.
> ...

Or parts of them are stable, while the others are nearly dead :-)

Apparently the attempt to standardize the hard parts has failed so far. 
So let's try to stick to the simple cases, and potentially leave in 
extension points.

It probably would also make sense to look at what current frameworks 
implement (JAVA: JSR-311, .NET: I think there's something related to URI 
templates in 3.5?), and document that (if it's interoperable).

BR, Julian

DeWitt Clinton | 15 Sep 2008 16:46

Re: URI Templates: done or dead?

Well, I hope it's not dead, as we still need a standard for URI templates.

I personally agree that it is a little too complex, for the same reasons you suggest, Mark.  But if no one expresses interest in doing another iteration on the spec, then I'd still be +1 on shipping it as is rather than dropping it altogether.  Even in its current form it is quite useful, and I'd be comfortable implementing it as it stands.

-DeWitt

On Mon, Sep 15, 2008 at 7:33 AM, Julian Reschke <julian.reschke <at> gmx.de> wrote:

> Mark Nottingham wrote:
>
> > There hasn't been a lot of discussion or activity on URI Templates recently, which either means it's very stable, or very nearly dead.
> > ...
>
> Or parts of them are stable, while the others are nearly dead :-)
>
> Apparently the attempt to standardize the hard parts has failed so far. So let's try to stick to the simple cases, and potentially leave in extension points.
>
> It probably would also make sense to look at what current frameworks implement (JAVA: JSR-311, .NET: I think there's something related to URI templates in 3.5?), and document that (if it's interoperable).
>
> BR, Julian


Sebastian Pipping | 15 Sep 2008 19:11

Review of URI Template draft #3 / was Re: URI Templates: done or dead?


DeWitt Clinton wrote:
> I personally agree that it is a little too complex, for the same reasons
> you suggest, Mark.  But if no one expresses interest in doing another
> iteration on the spec, then I'd still be +1 on shipping it as is rather
> than dropping it altogether.

Sorry for jumping in so late.  I just had a quick review of
URI Template draft #3.  Here are my comments and questions.
I'm aware there might be reasons for the draft's status quo.

  1.  In 4.2. (Template Expansions) the given grammar reads
        vars        = var [ *("," var) ]       .
      I would have expected either
        vars        = var [ 1*("," var) ]      or
        vars        = var *("," var)
      instead.  Is this an intended choice?

  2.  The draft does not mention case sensitivity of
      variable names.  Does "FOO" match "foo"?  Is this
      left to the implementation?

  3.  I find the operators <opt> and <neg> quite unnatural.
      Is there a difference between {k=v} and {-opt|v|k}?
      Does <opt> serve any purpose beyond complementing <neg>?
      Does <neg> have practical applications?
      Also, <opt> and <neg> reminded me of Lua's use of
      the <and> and <or> operators, but I'm not sure if that
      would be a good trade for everybody.
      Maybe <first> and <second>?

  4.  The operators <list> and <join> feel quite similar to me.
      I think <pairs> would be a better name for <join>,
      and <list> is the real <join>.  So let me propose this
      translation:

        {-list|,|l}   --> {-join|,|l}
        {-join|&|a,b} --> {-pairs|&|a,b}

  5.  I'm wondering if it would be a good idea to hard-code
      the allowed operator names into the grammar.
      Instead of
        op          = 1*ALPHA
      that would make
        op          = "opt"|"neg"|"prefix"|"suffix"|"join"|"list"
      Is adding custom operators allowed?  If not it might
      be asking for it.  Was this defined with a parser in mind?

      Taking this idea a step further, it would allow us to
      embed the list-operator-takes-exactly-one-var constraint
      into the grammar, e.g. through rules like
        op          = "opt"|"neg"|"prefix"|"suffix"|"join"
        operator    = "-" ((op "|" arg "|" vars)
                          | ("list" "|" arg "|" var))

  6.  I'm a bit confused by the different treatment of '/': While
      a slash in the argument makes it through unescaped in

        bar := ["fee", "fi", "fo", "fum"]
        "{-prefix|/|bar}"
        -> "/fee/fi/fo/fum"

      it is escaped in

        garply := a/b/c
        "http://example.org/{bar}{bar}/{garply}"
        -> "http://example.org/fredfred/a%2Fb%2Fc"   .

      So the argument in {-<op>|<arg>|<vars>} is never escaped?
      Which characters must be escaped when substituting a variable?

  7.  Questions related to RFC 3986:
      - Can a brace block span several parts of a URI?
        - If so is it discouraged or does it come with problems?
      - Are brace blocks ("{".."}") allowed in any part of a URI?
        - If so does it come with problems?
      - Maybe hint that '{', '}' and '|' are not allowed in URIs
        and that URI templates are not valid URIs themselves
        before substitution.

  8.  References to other RFCs are done inconsistently, for instance
      in 6. (IANA Considerations) it reads
        "In common with RFC3986[..]"
      but later
        "[..]defined in [RFC4395]".

  9.  Why is variable plugh evaluated like this
        plugh := ["\u017F\u0307", "\u0073\u0307"]
        "{-suffix|:|plugh}"
        -> "%E1%B9%A1:%E1%B9%A1:"
      in 4.5. (Examples)?

 10.  I guess people will use their own words for "{".."}" blocks
      unless there is a term ready to use.  Is there any?
      I used brace blocks above.  Curly blocks?  Template field?
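
Regarding point 9, one reading that reproduces the quoted output is that
each value is first normalized to Unicode NFKC and then percent-encoded
as UTF-8, with the operator argument appended unescaped.  That the draft
mandates NFKC is an assumption in this sketch, not a quotation from it,
and the helper name suffix_expand is hypothetical:

```python
import unicodedata
from urllib.parse import quote


def suffix_expand(values, arg):
    """Sketch of {-suffix|arg|var} for a list value (assumed semantics)."""
    out = []
    for value in values:
        # Assumption: values are normalized to NFKC before encoding.
        normalized = unicodedata.normalize("NFKC", value)
        # Percent-encode everything except unreserved characters;
        # the argument itself is appended as-is.
        out.append(quote(normalized, safe="") + arg)
    return "".join(out)


plugh = ["\u017F\u0307", "\u0073\u0307"]
result = suffix_expand(plugh, ":")
```

Under NFKC both inputs collapse to the single code point U+1E61, which
would explain why the two list items expand identically.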

> Even in its current form it is quite
> useful, and I'd be comfortable implementing it as it stands.

What language will your implementation be written in?

Sebastian

Mike Schinkel | 15 Sep 2008 20:26

RE: URI Templates: done or dead?


Mark:

Great email, thanks for stirring things up.

On this issue I had a great conversation with Roy Fielding at the most
recent ApacheCon in Atlanta. He was gracious enough to spend almost an hour
with me. He had the following to say, which I will do my best to paraphrase
based on my memory though chances are I will get part of it wrong. I've cc'd
Roy in hopes he will correct anything that I accidentally misrepresent:

	Basically Roy said that his view of URI Templates 
	was that it needed to be written to support 
	machines, not humans. He said that a human readable 
	*representation* was possible to be generated from 
	a machine-readable URI Template, but from his 
	perspective if it wasn't written to support machines 
	he would not use it nor support it.

Roy gave me his comments in response to my fervent advocacy for URI
Templates to be optimized for humans. Mark, it sounds like you have a similar
position to mine, which is that there is a strong need for a very
human-friendly URI Template specification. I envision many uses for URI
Templates in Content Management Systems and Application Frameworks but I
think Roy envisions using them in Apache, routers, proxies, etc.
Unfortunately, I don't think any URI template specification will come to a
resolution as long as there are those competing and evidently incompatible
objectives.

So I think there really need to be two different specifications; one for
humans and another for machines. Ideally those two specifications would be
linked in that all the aspects of the human-friendly version would at least
be translatable to the machine-optimized version but there would be two (2)
different specs optimized for two (2) significantly different needs. 

So I'll throw out this straw-man. I'd advocate starting from scratch on a
human-friendly URI Template specification while keeping an eye on our
ability to translate it to the existing machine-optimized URI Template spec
as it exists, and maintaining some compatibility with the most basic use
cases.  One thing I think this buys us is the ability to disregard some of
the more complex encoding concerns that are simply unimportant in CMS or App
Framework use-cases.

And not to give you more work than you may have asked for, but I think
you, Mark, would be the perfect person to lead this human-friendly version of
URI Templates, and I'd be honored to help you on that task if you choose to
head in that direction. I'd consider leading it but I simply don't have any
experience writing specs. 

If I had my preference I'd actually call the human-friendly version of URI
Templates "URL Templates" because most people are more familiar with "URL"
than "URI" and this is the human-friendly version, but I know that the w3c
really dislikes "URL" so my guess is that part wouldn't fly.

Well anyway, I've tossed out the piñata so bat away.

-Mike Schinkel 
http://mikeschinkel.com

-----Original Message-----
From: uri-request <at> w3.org [mailto:uri-request <at> w3.org] On Behalf Of Mark
Nottingham
Sent: Monday, September 15, 2008 7:58 AM
To: URI
Cc: Joe Gregorio; David Orchard; Marc Hadley
Subject: URI Templates: done or dead?

There hasn't been a lot of discussion or activity on URI Templates recently,
which either means it's very stable, or very nearly dead.

If it's very stable, we should ship it and be done with it. If it's nearly
dead (and I do get a whiff of that; while I continuously hear people
clamouring for it to be finished, not many seem to be willing to use it in
its current state; YMMV), we should at least try to revive it.

My continuing concerns with the -03 draft are that it's too complex, not
human-friendly, and it makes the common, simple use cases hard.  
The first example in the spec ( http://www.example.com/users/{userid} )
holds up well, but it goes quickly downhill from there;
( http://www.example.com/?{-join|&|query,number} ) looks like line noise, IMHO.

I believe there are a few things we can do to make URI Template more broadly
useful and useable, without sacrificing too much functionality (at least in
the 80% case).

1. Reduce or drop operators.

As mentioned above, they don't read well; they're obviously intended for
machines, not people. The expansion for a template should be blindingly
obvious, but the operator syntax seems to want to get in the way rather than
help. Furthermore, the vast majority of use cases for templates are for
simple template substitution, not operations like 'neg' and 'opt'.

2. Drop list values.

Again, the majority of use cases out there have no need for list values in
template variables, and including them in the spec significantly complicates
things.

3. Make percent-encoding context-sensitive.

There are just too many cases where the 'escape everything but unreserved'
rule gets in the way; for example, if my template is
"http://example.com/user/{email}", I'm going to have percent-encoded @
signs in my URIs whether I like it or not -- even though they're not
required to be percent-encoded there. This is a relatively simple thing to
do, as long as we also...

4. Allow exceptions to percent-encoding.

We need a syntax that allows characters to be excepted from encoding, even
in context. As a straw-man, I suggest preceding the expression with the
characters that are excepted, like:

    http://example.com/{/path}
    http://example.com/thing{?&=query_args}

and so forth.

5. If we keep operators at all, mint special ones for the common cases.

E.g., something to handle encoded form query values "out of the box":
   http://example.com/thing{-?a=foo&b=bar&c=baz}
and likewise with matrix parameters.

Let's see if that shakes things up...

--
Mark Nottingham     http://www.mnot.net/

Mark Nottingham | 15 Sep 2008 22:42

Re: URI Templates: done or dead?


Most of my use cases are for doing things like putting them into  
headers and content so that people can build new/dynamic protocols for  
machines with them. Making them more friendly for humans to read is a  
side effect; what I'm interested in is making them easier/more  
intuitive for humans to *mint*, because they'll usually be the ones  
writing them (just as with HTML, although I think the collective crowd  
of authors will be a bit more technical, but only a bit).

My observation is that for those protocol-building cases, it's usually  
a light integration; someone needs to create a URI with a particular  
piece of data in it, and they don't want to constrain its form, so  
they need a template. They don't need list data or complex operators,  
or if they do, they can specify some pre-processing.

I don't disagree that there may be a place for "whiteboardable"  
templates, but it's a secondary use case for me.

Cheers,

On 16/09/2008, at 4:26 AM, Mike Schinkel wrote:

> Roy gave me his comments in response to my fervent advocacy for URI
> Templates to be optimized for humans. Mark, it sounds like you have a
> similar position to mine, which is that there is a strong need for a
> very human-friendly URI Template specification. I envision many uses
> for URI Templates in Content Management Systems and Application
> Frameworks but I think Roy envisions using them in Apache, routers,
> proxies, etc. Unfortunately, I don't think any URI template
> specification will come to a resolution as long as there are those
> competing and evidently incompatible objectives.

--
Mark Nottingham     http://www.mnot.net/

Roy T. Fielding | 16 Sep 2008 04:28

Re: URI Templates: done or dead?


On Sep 15, 2008, at 4:57 AM, Mark Nottingham wrote:
> There hasn't been a lot of discussion or activity on URI Templates  
> recently, which either means it's very stable, or very nearly dead.

We should just remind the authors that they have several outstanding
comments on the spec and see if they are still interested in editing.

> If it's very stable, we should ship it and be done with it. If it's  
> nearly dead (and I do get a whiff of that; while I continuously  
> hear people clamouring for it to be finished, not many seem to be  
> willing to use it in its current state; YMMV), we should at least  
> try to revive it.

I won't use it in its current state because it isn't finished yet.
The prose is, at best, an outline.  The operators aren't even defined
in words -- the reader has to guess why they exist.  The examples seem
to be obsessed with the most irrelevant corner cases instead of teaching
the common cases first.  And it is far too focused on the Python language
as a means of definition.  None of these are technical issues.
I am not griping about the lack of completion because I have a similar
list of issues with the HTTPbis spec that I haven't done yet either.

Technically, the mechanism is caught halfway between being concise and
being human friendly, which means it is currently neither.  My opinion
is that we have IETF specs for the purpose of defining interoperable
protocols, not to define user interfaces, and so the argument that these
things should be end-user readable is unfounded and potentially very costly.
They only need to be readable by the folks who are defining applications.

> My continuing concerns with the -03 draft are that it's too
> complex, not human-friendly, and it makes the common, simple use
> cases hard. The first example in the spec
> ( http://www.example.com/users/{userid} ) holds up well, but it goes
> quickly downhill from there;
> ( http://www.example.com/?{-join|&|query,number} ) looks
> like line noise, IMHO.

Then please let's drop the idea of using English words as function
names and go back to the use cases that really matter.  I need a
way to describe substitutions for variable values, value prefixes,
URI inserts, ordered value lists, and unordered substitutions
within path segments (path ;param=value) and queries (form &var=value).
Only one of those (URI inserts) needs a raw substitution.

I could use some other tricks as well, but the above is what I know
is needed.  Joe had a lot more use cases that I have probably forgotten.

> I believe there are a few things we can do to make URI Template  
> more broadly useful and useable, without sacrificing too much  
> functionality (at least in the 80% case).
>
> 1. Reduce or drop operators.
>
> As mentioned above, they don't read well; they're obviously  
> intended for machines, not people. The expansion for a template  
> should be blindingly obvious, but the operator syntax seems to want  
> to get in the way rather than help. Furthermore, the vast majority  
> of use cases for templates are for simple template substitution,  
> not operations like 'neg' and 'opt'.

Actually, the vast majority use case is unordered form key=value
substitution.  Complete path segment replacement is second, followed
by URI inserts ("insert this value without further encoding").

> 2. Drop list values.
>
> Again, the majority of use cases out there have no need for list  
> values in template variables, and including them in the spec  
> significantly complicates things.

I think it is complicated because the introduction of list-only
operators (typed functions) is unnecessary.  Complex values can be
addressed in an orthogonal manner when the value is substituted,
mainly by defaulting to the most common form, and more complex
behavior can be defined only when applicable (i.e., a prefix on
the variable name can indicate how to translate a list into
numbered parameters or even associative array key=value sets).
The important thing to note is that compound values are only
interesting when templates are embedded within computer language
processing, so we could easily allow such things to be language
specific by reserving non-alphanumeric prefixes on variable names
for that purpose.

> 3. Make percent-encoding context-sensitive.
>
> There are just too many cases where the 'escape everything but  
> unreserved' rule gets in the way; for example, if my template is
> "http://example.com/user/{email}", I'm going to have percent-encoded
> @ signs in my URIs whether I like it or not -- even though
> they're not required to be percent-encoded there. This is a
> relatively simple thing to do, as long as we also...

URI inserts could do that.  E.g., use {+email} instead of {email}.
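
To make the difference concrete, here is a minimal Python sketch of the
two behaviours.  The function names are hypothetical, and the "+" form
is modelled as a raw insert, following the description "insert this
value without further encoding" earlier in this message:

```python
from urllib.parse import quote


def expand_simple(value):
    """{var}: percent-encode everything except unreserved characters."""
    return quote(value, safe="")


def expand_insert(value):
    """{+var}: assumed 'URI insert', the value is copied in unchanged."""
    return value


email = "joe@example.com"
encoded = "http://example.com/user/" + expand_simple(email)
inserted = "http://example.com/user/" + expand_insert(email)
```

With a raw insert, the template author takes on the responsibility for
the value already being legal at that position in the URI.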

> 4. Allow exceptions to percent-encoding.
>
> We need a syntax that allows characters to be excepted from  
> encoding, even in context. As a straw-man, I suggest preceding the  
> expression with the characters that are excepted, like:
>
>    http://example.com/{/path}
>    http://example.com/thing{?&=query_args}
>
> and so forth.

That is much more complex.  Dynamically changing the transcoding
algorithm is far more expensive than just using a different operator
for non-encoded insertion.

> 5. If we keep operators at all, mint special ones for the common  
> cases.
>
> E.g., something to handle encoded form query values "out of the box":
>   http://example.com/thing{-?a=foo&b=bar&c=baz}
> and likewise with matrix parameters.

Something like

     var   = "value";
     undef = null;
     empty = "";
     list  = [ "val1", "val2", "val3" ];
     keys  = [ "key1", "val1", "key2", "val2", "key3", "val3" ];
     path  = "/foo/bar"
     x     = "1024";
     y     = "768";

{var}                     value
{var=default}             value
{undef=default}           default
{var:3}                   val
{x,y}                     1024,768
{?x,y}                    ?x=1024&y=768
{?x,y,empty}              ?x=1024&y=768&empty=
{?x,y,undef}              ?x=1024&y=768
{;x,y}                    ;x=1024;y=768
{;x,y,empty}              ;x=1024;y=768;empty
{;x,y,undef}              ;x=1024;y=768
{/list,x}                 /val1/val2/val3/1024
{+path}/here              /foo/bar/here
{+path,x}/here            /foo/bar,1024/here
{+path}{x}/here           /foo/bar1024/here
{+empty}/here             /here

I think the above covers all of the common cases without making
the uncommon cases impossible.  The common case is that the delimiters
(";", "?", and "/") are omitted when none of the listed variables are
defined, which matches good URI practice.  Likewise, the substitution
handler for ";" (path parameters) will omit the "=" when its value is empty,
whereas the handler for "?" (form queries) will not omit the "=".
Multiple variables and list values have their values joined with ","
if there is no predefined joining mechanism for the operator.

I think this mechanism is simple and readable when used with simple
examples because the single-character operators match the URI generic
syntax delimiters.  Only one operator inserts unencoded values; all
of the others encode any characters other than unreserved.
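
The rules above can be sketched in a few lines of Python.  This is only
an illustration under stated assumptions, not a reference
implementation: the names expand and expand_template are hypothetical,
variable-name prefixes for lists and key sets are not handled, and a
default value may not contain a comma because the variable list is
split on "," first.

```python
import re
from urllib.parse import quote


def expand(expr, ctx):
    """Expand the text between '{' and '}' against a dict of variables."""
    op = expr[0] if expr[0] in "+/;?" else ""
    items = []  # (name, expanded value) for every variable that is defined
    for spec in expr[len(op):].split(","):
        spec, has_default, default = spec.partition("=")
        name, _, length = spec.partition(":")
        value = ctx.get(name)
        if value is None:
            if not has_default:
                continue  # undefined and no default: omit entirely
            value = default
        values = value if isinstance(value, list) else [value]
        if length:
            values = [v[:int(length)] for v in values]
        if op != "+":  # only "+" inserts values without encoding
            values = [quote(v, safe="") for v in values]
        items.append((name, ("/" if op == "/" else ",").join(values)))
    if not items:
        return ""  # no variable defined: the delimiter is omitted too
    if op == "?":
        return "?" + "&".join(f"{n}={v}" for n, v in items)
    if op == ";":
        return "".join(f";{n}={v}" if v else f";{n}" for n, v in items)
    if op == "/":
        return "".join("/" + v for _, v in items)
    return ",".join(v for _, v in items)


def expand_template(template, ctx):
    """Replace every {...} expression in a template string."""
    return re.sub(r"\{([^{}]+)\}", lambda m: expand(m.group(1), ctx), template)


CTX = {
    "var": "value", "undef": None, "empty": "",
    "list": ["val1", "val2", "val3"],
    "path": "/foo/bar", "x": "1024", "y": "768",
}
```

Calling expand_template("{+path}{x}/here", CTX) walks the template once
and substitutes each expression in place.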

The mechanism does become harder to read when we do very unusual
things and add all the bells and whistles, like

{var,undef,empty,list}    value,,val1,val2,val3
{/var:3,undef,list,empty} /val/val1/val2/val3/
{;var,undef,empty,list}   ;var=value;empty;list=val1,val2,val3
{?var,undef,empty,list}   ?var=value&empty=&list=val1,val2,val3
{?var,undef,empty,@list}  ?var=value&empty=&list1=val1&list2=val2&list3=val3
{?var,undef,empty,%keys}  ?var=value&empty=&key1=val1&key2=val2&key3=val3

but we don't need to care if complex cases are hard to read.

The mechanism is extremely simple to implement.  There is always a
variable list (one variable need not be special-cased).
Any of the variables can be prefixed.  Any of the substitutions
can have a default when undefined.

The ABNF is something like

  instruction   = "{" [ operator ] variable-list "}"
  operator      = "/" / "+" / ";" / "?" / op-reserve
  variable-list =  varspec *( "," varspec )
  varspec       =  [ var-type ] varname [ ":" prefix-len ] [ "=" default ]
  var-type      = "@" / "%" / type-reserve
  varname       = ALPHA *( ALPHA | DIGIT | "_" )
  prefix-len    = 1*DIGIT
  default       = *( unreserved / reserved )
  op-reserve    = <anything else that isn't ALPHA or operator>
  type-reserve  = <anything else that isn't ALPHA, ",", or operator>

as a quick pass (I haven't checked it).  It is extremely easy to
parse and perform the substitutions within a single pass loop.
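
As a rough illustration of how little parsing the grammar needs, here
is a Python sketch that splits one instruction into its operator and
variable specs.  The names VarSpec and parse_instruction are
hypothetical, the reserved operator and type characters are not
accepted, and defaults are assumed not to contain "," since the list is
split on commas first.

```python
import re
from typing import NamedTuple, Optional


class VarSpec(NamedTuple):
    var_type: str               # "", "@" or "%"
    name: str
    prefix_len: Optional[int]   # None when no ":" part is given
    default: Optional[str]      # None when no "=" part is given


VARSPEC = re.compile(
    r"(?P<type>[@%]?)"
    r"(?P<name>[A-Za-z][A-Za-z0-9_]*)"
    r"(?::(?P<len>[0-9]+))?"
    r"(?:=(?P<default>[^,{}]*))?"
)


def parse_instruction(text):
    """Parse '{' [operator] variable-list '}' into (operator, [VarSpec])."""
    if not (text.startswith("{") and text.endswith("}")):
        raise ValueError(f"not an instruction: {text!r}")
    body = text[1:-1]
    op = body[0] if body[:1] in ("/", "+", ";", "?") else ""
    specs = []
    for raw in body[len(op):].split(","):
        m = VARSPEC.fullmatch(raw)
        if m is None:
            raise ValueError(f"bad varspec: {raw!r}")
        length = int(m["len"]) if m["len"] else None
        specs.append(VarSpec(m["type"], m["name"], length, m["default"]))
    return op, specs
```

Because each varspec is matched independently, substitution can happen
in the same loop that does the parsing.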

....Roy

Joe Gregorio | 16 Sep 2008 05:09

Re: URI Templates: done or dead?


On Mon, Sep 15, 2008 at 7:57 AM, Mark Nottingham <mnot <at> mnot.net> wrote:
> There hasn't been a lot of discussion or activity on URI Templates recently,
> which either means it's very stable, or very nearly dead.

Neither, I have a list of open issues that were raised from the last
I-D and I have yet to address them, but I do plan on addressing them,
and that plan is slowly working its way higher on my to do list.

In the interim I have been keeping track of implementations:

   http://code.google.com/p/uri-templates/wiki/Implementations

   Thanks,
   -joe

>
> If it's very stable, we should ship it and be done with it. If it's nearly
> dead (and I do get a whiff of that; while I continuously hear people
> clamouring for it to be finished, not many seem to be willing to use it in
> its current state; YMMV), we should at least try to revive it.
>
> My continuing concerns with the -03 draft are that it's too complex, not
> human-friendly, and it makes the common, simple use cases hard. The first
> example in the spec ( http://www.example.com/users/{userid} ) holds up well,
> but it goes quickly downhill from there; (
> http://www.example.com/?{-join|&|query,number} ) looks like line noise,
> IMHO.
>
> I believe there are a few things we can do to make URI Template more broadly
> useful and useable, without sacrificing too much functionality (at least in
> the 80% case).
>
> 1. Reduce or drop operators.
>
> As mentioned above, they don't read well; they're obviously intended for
> machines, not people. The expansion for a template should be blindingly
> obvious, but the operator syntax seems to want to get in the way rather than
> help. Furthermore, the vast majority of use cases for templates are for
> simple template substitution, not operations like 'neg' and 'opt'.
>
> 2. Drop list values.
>
> Again, the majority of use cases out there have no need for list values in
> template variables, and including them in the spec significantly complicates
> things.
>
> 3. Make percent-encoding context-sensitive.
>
> There are just too many cases where the 'escape everything but unreserved'
> rule gets in the way; for example, if my template is
> "http://example.com/user/{email}", I'm going to have percent-encoded @ signs
> in my URIs whether I like it or not -- even though they're not required to
> be percent-encoded there. This is a relatively simple thing to do, as long
> as we also...
>
> 4. Allow exceptions to percent-encoding.
>
> We need a syntax that allows characters to be excepted from encoding, even
> in context. As a straw-man, I suggest preceding the expression with the
> characters that are excepted, like:
>
>   http://example.com/{/path}
>   http://example.com/thing{?&=query_args}
>
> and so forth.
>
> 5. If we keep operators at all, mint special ones for the common cases.
>
> E.g., something to handle encoded form query values "out of the box":
>  http://example.com/thing{-?a=foo&b=bar&c=baz}
> and likewise with matrix parameters.
>
>
> Let's see if that shakes things up...
>
>
> --
> Mark Nottingham     http://www.mnot.net/
>
>

-- 
Joe Gregorio http://bitworking.org

Phillips, Addison | 16 Sep 2008 05:21

RE: URI Templates: done or dead?

>   varname       = ALPHA *( ALPHA | DIGIT | "_" )

We have pretty good knowledge of what makes a good Unicode identifier. If we're going to assign variable
names in a new pattern language, why are we limiting it to alphanum? The software we are linking to (the part
generating the variables that get substituted in) may not--indeed probably does not--have that same limitation.

While the result needs to be a valid URI, there doesn't seem to be a reason for the pattern language itself to
be limited in this way. Just because your personal examples are all ASCII doesn't make that the right
solution for the world. For that matter, path values and so forth are plain text Unicode and encoded to URI
as appropriate as the URI is assembled from the template. The replacement syntax should probably
consider the character vs. bytes problem, especially in the query part, since the template syntax is
heavily character oriented.
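
One place where the character vs. bytes question shows up is a
length-limited substitution such as the proposed {var:3}.  The Python
sketch below contrasts the two readings; the function names are
hypothetical, and nothing in the thread says which reading is intended:

```python
from urllib.parse import quote


def prefix_by_characters(value, n):
    """Keep the first n characters, then percent-encode as UTF-8."""
    return quote(value[:n], safe="")


def prefix_by_bytes(value, n):
    """Keep the first n UTF-8 bytes, then percent-encode them."""
    return quote(value.encode("utf-8")[:n], safe="")


city = "\u0142\u00f3d\u017a"  # a non-ASCII path value, four characters long
by_chars = prefix_by_characters(city, 3)
by_bytes = prefix_by_bytes(city, 3)
```

Cutting by bytes can split a multi-byte sequence, leaving a
percent-encoded fragment that no longer decodes as valid UTF-8.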

But you knew I was going to say that :-).

Addison

Addison Phillips
Globalization Architect -- Lab126
Chair -- W3C Internationalization WG

Internationalization is not a feature.
It is an architecture.


> -----Original Message-----
> From: uri-request <at> w3.org [mailto:uri-request <at> w3.org] On Behalf Of
> Roy T. Fielding
> Sent: Monday, September 15, 2008 7:29 PM
> To: Mark Nottingham
> Cc: URI; Joe Gregorio; David Orchard; Marc Hadley
> Subject: Re: URI Templates: done or dead?
> 
> 
> On Sep 15, 2008, at 4:57 AM, Mark Nottingham wrote:
> > There hasn't been a lot of discussion or activity on URI Templates
> > recently, which either means it's very stable, or very nearly dead.
> 
> We should just remind the authors that they have several outstanding
> comments on the spec and see if they are still interested in editing.
> 
> > If it's very stable, we should ship it and be done with it. If it's
> > nearly dead (and I do get a whiff of that; while I continuously
> > hear people clamouring for it to be finished, not many seem to be
> > willing to use it in its current state; YMMV), we should at least
> > try to revive it.
> 
> I won't use it in its current state because it isn't finished yet.
> The prose is, at best, an outline.  The operators aren't even defined
> in words -- the reader has to guess why they exist.  The examples seem
> to be obsessed with the most irrelevant corner cases instead of teaching
> the common cases first.  And it is far too focused on the Python language
> as a means of definition.  None of these are technical issues.
> I am not griping about the lack of completion because I have a similar
> list of issues with the HTTPbis spec that I haven't done yet either.
> 
> Technically, the mechanism is caught halfway between being concise and
> being human friendly, which means it is currently neither.  My opinion
> is that we have IETF specs for the purpose of defining interoperable
> protocols, not to define user interfaces, and so the argument that these
> things should be end-user readable is unfounded and potentially very costly.
> They only need to be readable by the folks who are defining applications.
> 
> > My continuing concerns with the -03 draft are that it's too
> > complex, not human-friendly, and it makes the common, simple use
> > cases hard. The first example in the spec
> > ( http://www.example.com/users/{userid} ) holds up well, but it goes
> > quickly downhill from there;
> > ( http://www.example.com/?{-join|&|query,number} ) looks
> > like line noise, IMHO.
> 
> Then please let's drop the idea of using English words as function
> names and go back to the use cases that really matter.  I need a
> way to describe substitutions for variable values, value prefixes,
> URI inserts, ordered value lists, and unordered substitutions
> within path segments (path ;param=value) and queries (form &var=value).
> Only one of those (URI inserts) needs a raw substitution.
> 
> I could use some other tricks as well, but the above is what I know
> is needed.  Joe had a lot more use cases that I have probably forgotten.
> 
> > I believe there are a few things we can do to make URI Template
> > more broadly useful and useable, without sacrificing too much
> > functionality (at least in the 80% case).
> >
> > 1. Reduce or drop operators.
> >
> > As mentioned above, they don't read well; they're obviously
> > intended for machines, not people. The expansion for a template
> > should be blindingly obvious, but the operator syntax seems to
> > want to get in the way rather than help. Furthermore, the vast
> > majority of use cases for templates are for simple template
> > substitution, not operations like 'neg' and 'opt'.
> 
> Actually, the vast majority use case is unordered form key=value
> substitution.  Complete path segment replacement is second,
> followed by URI inserts ("insert this value without further
> encoding").
> 
> > 2. Drop list values.
> >
> > Again, the majority of use cases out there have no need for list
> > values in template variables, and including them in the spec
> > significantly complicates things.
> 
> I think it is complicated because the introduction of list-only
> operators (typed functions) is unnecessary.  Complex values can be
> addressed in an orthogonal manner when the value is substituted,
> mainly by defaulting to the most common form, and more complex
> behavior can be defined only when applicable (i.e., a prefix on
> the variable name can indicate how to translate a list into
> numbered parameters or even associative array key=value sets).
> The important thing to note is that compound values are only
> interesting when templates are embedded within computer language
> processing, so we could easily allow such things to be language
> specific by reserving non-alphanumeric prefixes on variable names
> for that purpose.
> 
> > 3. Make percent-encoding context-sensitive.
> >
> > There are just too many cases where the 'escape everything but
> > unreserved' rule gets in the way; for example, if my template is
> > "http://example.com/user/{email}", I'm going to have
> > percent-encoded @ signs in my URIs whether I like it or not --
> > even though they're not required to be percent-encoded there.
> > This is a relatively simple thing to do, as long as we also...
> 
> URI inserts could do that.  E.g., use {+email} instead of {email}.
> 
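To make the {email} versus {+email} difference concrete, here is a minimal Python sketch. It assumes that the plain form encodes everything outside the RFC 3986 unreserved set and that the "+" form additionally leaves reserved characters alone; the function names are made up for illustration and are not part of any draft.

```python
from urllib.parse import quote

# RFC 3986 reserved characters (gen-delims + sub-delims).
# Assumption: the "+" operator passes these through untouched.
RESERVED = ":/?#[]@!$&'()*+,;="


def expand_plain(value: str) -> str:
    """{var}: percent-encode everything except unreserved characters."""
    return quote(value, safe="")


def expand_raw(value: str) -> str:
    """{+var}: keep reserved characters, percent-encode the rest."""
    return quote(value, safe=RESERVED)


print(expand_plain("joe@example.com"))  # joe%40example.com
print(expand_raw("joe@example.com"))    # joe@example.com
```

The template author picks the behaviour by choosing the operator, so the encoder itself never has to inspect the surrounding URI context.
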
> > 4. Allow exceptions to percent-encoding.
> >
> > We need a syntax that allows characters to be excepted from
> > encoding, even in context. As a straw-man, I suggest preceding
> > the expression with the characters that are excepted, like:
> >
> >    http://example.com/{/path}
> >    http://example.com/thing{?&=query_args}
> >
> > and so forth.
> 
> That is much more complex.  Dynamically changing the transcoding
> algorithm is far more expensive than just using a different
> operator for non-encoded insertion.
> 
> > 5. If we keep operators at all, mint special ones for the common
> > cases.
> >
> > E.g., something to handle encoded form query values "out of the
> > box":
> >   http://example.com/thing{-?a=foo&b=bar&c=baz}
> > and likewise with matrix parameters.
> 
> Something like
> 
>      var   = "value";
>      undef = null;
>      empty = "";
>      list  = [ "val1", "val2", "val3" ];
>      keys  = [ "key1", "val1", "key2", "val2", "key3", "val3" ];
>      path  = "/foo/bar";
>      x     = "1024";
>      y     = "768";
> 
> {var}                     value
> {var=default}             value
> {undef=default}           default
> {var:3}                   val
> {x,y}                     1024,768
> {?x,y}                    ?x=1024&y=768
> {?x,y,empty}              ?x=1024&y=768&empty=
> {?x,y,undef}              ?x=1024&y=768
> {;x,y}                    ;x=1024;y=768
> {;x,y,empty}              ;x=1024;y=768;empty
> {;x,y,undef}              ;x=1024;y=768
> {/list,x}                 /val1/val2/val3/1024
> {+path}/here              /foo/bar/here
> {+path,x}/here            /foo/bar,1024/here
> {+path}{x}/here           /foo/bar1024/here
> {+empty}/here             /here
> 
> I think the above covers all of the common cases without making
> the uncommon cases impossible.  The common case is that the
> delimiters (";", "?", and "/") are omitted when none of the listed
> variables are defined, which matches good URI practice.  Likewise,
> the substitution handler for ";" (path parameters) will omit the
> "=" when its value is empty, whereas the handler for "?" (form
> queries) will not omit the "=".  Multiple variables and list values
> have their values joined with "," if there is no predefined joining
> mechanism for the operator.
> 
> I think this mechanism is simple and readable when used with simple
> examples because the single-character operators match the URI
> generic syntax delimiters.  Only one operator inserts unencoded
> values; all of the others encode any characters other than
> unreserved.
> 
> The mechanism does become harder to read when we do very unusual
> things and add all the bells and whistles, like
> 
> {var,undef,empty,list}    value,,val1,val2,val3
> {/var:3,undef,list,empty} /val/val1/val2/val3/
> {;var,undef,empty,list}   ;var=value;empty;list=val1,val2,val3
> {?var,undef,empty,list}   ?var=value&empty=&list=val1,val2,val3
> {?var,undef,empty,@list}  ?var=value&empty=&list1=val1&list2=val2&list3=val3
> {?var,undef,empty,%keys}  ?var=value&empty=&key1=val1&key2=val2&key3=val3
> 
> but we don't need to care if complex cases are hard to read.
> 
> The mechanism is extremely simple to implement.  There is always a
> variable list (one variable need not be special-cased).
> Any of the variables can be prefixed.  Any of the substitutions
> can have a default when undefined.
> 
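As a rough check on the "simple to implement" claim, here is one possible Python sketch of the expansion rules, inferred only from the example tables above. The names expand, _expand_one and VARSPEC are hypothetical, as is the treatment of cases the examples leave open (for instance "@" or "%" prefixes under operators other than "?" and ";"). Defaults containing "," are not handled, because the sketch naively splits the variable list on commas.

```python
import re
from urllib.parse import quote

RESERVED = ":/?#[]@!$&'()*+,;="   # RFC 3986 gen-delims + sub-delims
# [var-type] varname [":" prefix-len] ["=" default]
VARSPEC = re.compile(r"([@%]?)([A-Za-z][A-Za-z0-9_]*)(?::(\d+))?(?:=(.*))?$")


def expand(template, variables):
    """Expand every {...} instruction in template using variables."""
    return re.sub(r"\{([^{}]*)\}",
                  lambda m: _expand_one(m.group(1), variables), template)


def _expand_one(body, variables):
    op = body[0] if body[:1] in ("/", "+", ";", "?") else ""
    safe = RESERVED if op == "+" else ""        # only "+" inserts raw values
    enc = lambda s: quote(str(s), safe=safe)
    named = op in (";", "?")                    # these emit key=value pairs
    items = []                                  # (key, encoded value) pairs

    for spec in body[len(op):].split(","):
        m = VARSPEC.match(spec)
        if m is None:
            raise ValueError(f"bad varspec: {spec!r}")
        vtype, name, prefix_len, default = m.groups()
        value = variables.get(name)
        if value is None:
            value = default
        if value is None:
            continue                            # undefined, no default: skip
        if isinstance(value, str):
            if prefix_len:
                value = value[:int(prefix_len)]
            items.append((name, enc(value)))
        elif vtype == "%":                      # [k1, v1, k2, v2, ...]
            items += [(enc(k), enc(v)) for k, v in zip(value[::2], value[1::2])]
        elif vtype == "@":                      # name1=..., name2=...
            items += [(f"{name}{i}", enc(v)) for i, v in enumerate(value, 1)]
        elif named:                             # whole list under one key
            items.append((name, ",".join(enc(v) for v in value)))
        else:                                   # list flattened into items
            items += [(name, enc(v)) for v in value]

    if not items:
        return ""                               # delimiter omitted entirely
    if op == "?":
        return "?" + "&".join(f"{k}={v}" for k, v in items)
    if op == ";":
        return "".join(f";{k}={v}" if v else f";{k}" for k, v in items)
    if op == "/":
        return "".join("/" + v for _, v in items)
    return ",".join(v for _, v in items)        # no operator, or "+"
```

With the variable definitions from the message loaded into a dict (undef mapped to None), expand("{?x,y,empty}", v) gives "?x=1024&y=768&empty=" and expand("{/list,x}", v) gives "/val1/val2/val3/1024", matching the tables.
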
> The ABNF is something like
> 
>   instruction   = "{" [ operator ] variable-list "}"
>   operator      = "/" / "+" / ";" / "?" / op-reserve
>   variable-list =  varspec *( "," varspec )
>   varspec       =  [ var-type ] varname [ ":" prefix-len ]
>                    [ "=" default ]
>   var-type      = "@" / "%" / type-reserve
>   varname       = ALPHA *( ALPHA / DIGIT / "_" )
>   prefix-len    = 1*DIGIT
>   default       = *( unreserved / reserved )
>   op-reserve    = <anything else that isn't ALPHA or operator>
>   type-reserve  = <anything else that isn't ALPHA, ",", or
>                    operator>
> 
> as a quick pass (I haven't checked it).  It is extremely easy to
> parse and perform the substitutions within a single pass loop.
> 
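For what it's worth, a left-to-right parser for that grammar could look roughly like the Python sketch below. The class and function names are hypothetical, and the op-reserve and type-reserve productions are simply rejected rather than reserved. As in the grammar itself, a "," inside a default is ambiguous with the variable-list separator, and the sketch does not try to resolve that.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Union

OPERATORS = ("/", "+", ";", "?")
VAR_TYPES = ("@", "%")


@dataclass
class VarSpec:
    name: str
    var_type: str = ""
    prefix_len: Optional[int] = None
    default: Optional[str] = None


@dataclass
class Instruction:
    operator: str
    variables: List[VarSpec]


def parse_varspec(text: str) -> VarSpec:
    """varspec = [ var-type ] varname [ ":" prefix-len ] [ "=" default ]"""
    var_type = text[0] if text[:1] in VAR_TYPES else ""
    rest, eq, default = text[len(var_type):].partition("=")
    name, colon, length = rest.partition(":")
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
        raise ValueError(f"bad varname in {text!r}")
    if colon and not re.fullmatch(r"[0-9]+", length):
        raise ValueError(f"bad prefix-len in {text!r}")
    return VarSpec(name, var_type,
                   int(length) if colon else None,
                   default if eq else None)


def parse(template: str) -> List[Union[str, Instruction]]:
    """Split a template into literal strings and parsed instructions."""
    parts: List[Union[str, Instruction]] = []
    pos = 0
    while pos < len(template):
        start = template.find("{", pos)
        if start == -1:                       # no more instructions
            parts.append(template[pos:])
            break
        if start > pos:                       # literal text before "{"
            parts.append(template[pos:start])
        end = template.find("}", start)
        if end == -1:
            raise ValueError("unterminated instruction")
        body = template[start + 1:end]
        op = body[0] if body[:1] in OPERATORS else ""
        specs = [parse_varspec(s) for s in body[len(op):].split(",")]
        parts.append(Instruction(op, specs))
        pos = end + 1
    return parts
```

Each instruction comes back as an operator plus a list of VarSpec records, so a single variable is just a list of length one and needs no special case.
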
> ....Roy

