Martin J. Dürst | 1 May 09:19

Re: A new RFC for Web Addresses/Hypertext References: Background wrt LEIRIs

Hello Henry,

Many thanks for this very good overview. I'm cross-posting this to the 
IRI list (public-iri <at> w3.org) because Lisa at one point proposed to have 
this kind of discussion there, as well as to the Apps Discuss list 
(discuss <at> apps.ietf.org) to reach out to the relevant people in the IETF. 
I have also copied Lisa and Alex directly. I guess this is overall a bit 
too aggressive a cross-posting (but please tell me if you think I have 
missed somebody important). However, I hope we can converge quickly on 
where to move forward with what bits of the discussion/work.

On 2009/04/29 0:31, Henry S. Thompson wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> There are currently five documents in this space (that I am aware of):
>
>   [URI] The current RFC governing URIs:
>     http://tools.ietf.org/html/rfc3986
>
>   [IRI] The current RFC governing IRIs:
>     http://tools.ietf.org/html/rfc3987
>
>   [IRI-BIS] The most recent draft of a planned update for the RFC
>             governing IRIs:
>     http://tools.ietf.org/html/draft-duerst-iri-bis-04
>
>   [LEIRI] A W3C Note defining Legacy Extended IRIs (extracted from [IRI-BIS]):
>     http://www.w3.org/TR/leiri/
>

Martin J. Dürst | 1 May 11:18

Re: A new RFC for Web Addresses/Hypertext References: Background wrt LEIRIs

Hello Dan,

[same added cross-postings as for previous mail.]

On 2009/05/01 1:33, Dan Connolly wrote:
> On Tue, 2009-04-28 at 16:31 +0100, Henry S. Thompson wrote:
> [...]
>>   [WEBADDR] A preliminary draft of a possible RFC for Web Addresses
>>             (extracted from HTML5 [1]):
>>     http://www.w3.org/html/wg/href/draft.html [not yet in RFC format,
>>                                                converted version expected
>>                                                RSN]
> [...]
>> I am sure that the above summaries can be improved.  In particular it
>> would be helpful to have clear statements from their respective
>> authors/owners as to what the _requirements_ for the three new
>> documents ([IRI-BIS], [LEIRI] and [WEBADDR]) are.  Only after we have
>> those would it make sense to turn to the question of whether we can
>> merge some or all of them.
>
> Good question. I had hoped to document this in the draft by
> now.
>
> One of the main requirements the design in [WEBADDR] takes
> on is the non-western search engine problem.
>
> My understanding is that MS IE implemented a pre-IRI,
> pre-unicode convention that form submission data should
> go in the encoding that the form page was encoded in,
> and the servers bought into this.
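As a rough illustration of the convention described above, the same form value percent-encodes to different bytes depending on the encoding of the page. This is only a minimal Python sketch using the standard library's urllib.parse; the sample value and the choice of Shift_JIS as the page encoding are assumptions for illustration and are not taken from [WEBADDR].

```python
from urllib.parse import quote, unquote

# Hypothetical form value typed into a search box.
query = "日本"

# Percent-encode using the (assumed) legacy encoding of the form page.
legacy = quote(query, encoding="shift_jis")

# Percent-encode using UTF-8, which is what IRI-to-URI conversion uses.
utf8 = quote(query, encoding="utf-8")

print(legacy)
print(utf8)

# A server recovers the text only if it decodes with the matching encoding.
assert unquote(legacy, encoding="shift_jis") == query
assert unquote(utf8, encoding="utf-8") == query
```

The two escaped strings differ, so a server that expects one convention would misread a query sent in the other.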

Giovanni Campagna | 1 May 11:37

Re: A new RFC for Web Addresses/Hypertext References: Background wrt LEIRIs

2009/5/1 "Martin J. Dürst" <duerst <at> it.aoyama.ac.jp>:
> [...]
>
>> [WEBADDR] had in some ways a similar origin to [LEIRI], starting out
>>   as a section of the HTML5 spec which addressed the process by which
>>   existing browsers process strings to produce URIs which can be
>>   dereferenced.
>
> Yes indeed. It changes a space to %20, the same as for LEIRIs.
>
>>   It differs from [LEIRI] in the exact set of
>>   characters which it escapes,
>
> Has anybody done an analysis?
>
> It seems to provide more detail about '[' and ']', escaping them depending
> on context. It could be that that's also necessary for LEIRIs.
>
> But "any occurrences of percent-encoding in the Web address will be
> double-encoded at this step." looks extremely scary.

I did such an analysis. You find it at
<http://lists.w3.org/Archives/Public/www-archive/2009Apr/0064.html>
(originally sent to the WHATWG mailing list, forwarded by Ian Hickson
to www-archive <at> w3.org)
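The double-encoding concern quoted above can be sketched in a few lines. This is a minimal Python illustration using urllib.parse.quote, not the algorithm from [WEBADDR]; the sample address is hypothetical.

```python
from urllib.parse import quote

# Hypothetical input that already contains a percent-encoded space
# and also a raw space.
address = "/docs/my%20file name.html"

# Naive escaping treats "%" as data, so "%20" becomes "%2520".
double_encoded = quote(address, safe="/")

# Listing "%" as safe leaves existing escapes alone and
# escapes only the raw space.
escaped_once = quote(address, safe="/%")

print(double_encoded)  # /docs/my%2520file%20name.html
print(escaped_once)    # /docs/my%20file%20name.html
```

Leaving "%" untouched is itself a trade-off: a stray "%" that is not part of a valid escape would pass through unescaped.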

Giovanni

Sam Johnston | 2 May 11:07

Re: On the proliferation of well-known URLs; draft-nottingham-site-meta-01

Brad,

Sounds like a good idea, though I'd avoid using characters which are likely to break scripts/filters/etc.

Perhaps something like "/__meta__" would be a better bet?

Sam

On Wed, Apr 29, 2009 at 7:50 PM, Brad Fitzpatrick <brad <at> danga.com> wrote:
I'd like to discuss the proliferation of well-known URLs, notably the new one proposed in draft-nottingham-site-meta-01.

I talked to Eran Hammer-Lahav about this, but he suggested I bring it up here:

I object to the use of /host-meta for two reasons:

1) I feel that /host-meta is too casual of a name and prone to collisions.  It matches /^[\w\-]+$/, which I think is a subset of a fair number of sites' usernames.

2) There are already too many well-known URLs cluttering up the namespace:

/robots.txt
/favicon.ico
/crossdomain.xml

We can't fix those, but rather than make another one (and don't kid yourself: /host-meta won't be the last one) and make the situation worse, I propose we do the respectful thing and make a well-known directory to put this and all future well-known files in.  e.g.:

/;well_known/host-meta

i.e. put something ugly and weird in there, like a semicolon, to minimize the chance that it interferes with people's existing URL structure.
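The collision argument can be checked with a short Python sketch. The pattern is the one given in the message; the helper name could_be_username and the use of re.ASCII (to mimic ASCII-only username rules) are assumptions for illustration.

```python
import re

# Pattern from the message: word characters and hyphens only.
# re.ASCII is an assumption, to mimic ASCII-only username rules.
USERNAME_LIKE = re.compile(r"[\w\-]+", re.ASCII)


def could_be_username(path: str) -> bool:
    """Return True if the path is a single segment that looks like a username."""
    segment = path.lstrip("/")
    return USERNAME_LIKE.fullmatch(segment) is not None


# Print each candidate path with whether it could collide with a username.
for path in ["/host-meta", "/__meta__", "/robots.txt", "/;well_known/host-meta"]:
    print(path, could_be_username(path))
```

Under these assumptions, /host-meta and /__meta__ look like usernames, while /robots.txt (because of the dot) and /;well_known/host-meta (because of the semicolon and the extra slash) do not.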

Hopefully when the next spec decides to add a new well-known URL, they put it under /;well_known/.  "But host-meta is the final one, forever!", you say.  I doubt it.  XRD will become passé, or people will object to doing two HTTP requests when they really want to do one, so yet another well-known URL will be born.  Let's give it a future home now.

Thoughts?

- Brad


_______________________________________________
Apps-Discuss mailing list
Apps-Discuss <at> ietf.org
https://www.ietf.org/mailman/listinfo/apps-discuss


Mark Baker | 2 May 14:30

Re: On the proliferation of well-known URLs; draft-nottingham-site-meta-01

On Wed, Apr 29, 2009 at 10:50 AM, Brad Fitzpatrick <brad <at> danga.com> wrote:
> I'd like to discuss the proliferation of well-known URLs, notably the new
> one proposed in draft-nottingham-site-meta-01.
>
> I talked to Eran Hammer-Lahav about this, but he suggested I bring it up
> here:
>
> I object to the use of /host-meta for two reasons:
>
> 1) I feel that /host-meta is too casual of a name and prone to collisions.
> It matches /^[\w\-]+$/, which I think is a subset of a fair number of sites'
> usernames.

Perhaps, but I don't see any cause for concern here;

http://www.google.com/search?q=inurl:host-meta

>
> 2) There are already too many well-known URLs cluttering up the namespace:
>
> /robots.txt
> /favicon.ico
> /crossdomain.xml
>
> We can't fix those, but rather than make another one (and don't kid
> yourself: /host-meta won't be the last one)

I agree, but because there will be people who don't know about the
protocol, and that will be the case no matter which protocol is used.

> and make the situation worse, I
> propose we do the respectful thing and make a well-known directory to put
> this and all future well-known files in.  e.g.:

The Web doesn't have directories, it has URIs, and both what you
describe, and what's in the draft, define a well known URI per domain.
 So I really don't see any practical difference here.

Mark.

Brad Fitzpatrick | 2 May 18:11

Re: On the proliferation of well-known URLs; draft-nottingham-site-meta-01

On Sat, May 2, 2009 at 5:30 AM, Mark Baker <distobj <at> acm.org> wrote:
> On Wed, Apr 29, 2009 at 10:50 AM, Brad Fitzpatrick <brad <at> danga.com> wrote:
> > I'd like to discuss the proliferation of well-known URLs, notably the new
> > one proposed in draft-nottingham-site-meta-01.
> >
> > I talked to Eran Hammer-Lahav about this, but he suggested I bring it up
> > here:
> >
> > I object to the use of /host-meta for two reasons:
> >
> > 1) I feel that /host-meta is too casual of a name and prone to collisions.
> > It matches /^[\w\-]+$/, which I think is a subset of a fair number of sites'
> > usernames.
>
> Perhaps, but I don't see any cause for concern here;
>
> http://www.google.com/search?q=inurl:host-meta

Yet.  But once we pick something easily-creatable, you'd see a lot more bogus ones, or you'd see resistance to people supporting it, because host-meta is already in the namespace that they use for other things.  I'm not making up this concern:  I'm already hearing that from people when I talk about having them implement this.

> > 2) There are already too many well-known URLs cluttering up the namespace:
> >
> > /robots.txt
> > /favicon.ico
> > /crossdomain.xml
> >
> > We can't fix those, but rather than make another one (and don't kid
> > yourself: /host-meta won't be the last one)
>
> I agree, but because there will be people who don't know about the
> protocol, and that will be the case no matter which protocol is used.

I don't follow.  I'm proposing a common prefix, not protocol, for well-known URIs cluttering up the namespace.  Unrelated to host-meta, I want a company in the future that _needs_ to have their little file at the top-level of a domain to be able to follow in our footsteps and use /;wellknown/new-file.txt

> > and make the situation worse, I
> > propose we do the respectful thing and make a well-known directory to put
> > this and all future well-known files in.  e.g.:
>
> The Web doesn't have directories, it has URIs,

Yes, thanks.  :)  I'm proposing that our well-known URI that we're creating here has a prefix which other well-known URIs in the future can share, to stop the build-up of well-known URIs in the namespace that sites have effectively allocated for other things, like users:

http://twitter.com/host-meta
http://profiles.google.com/host-meta
http://identi.ca/host-meta

Those don't exist at present, but it's not crazy to think there are sites out there where URLs like those could exist.

> and both what you
> describe, and what's in the draft, define a well known URI per domain.
> So I really don't see any practical difference here.

There's no functional difference.  The difference is only not cluttering up people's namespaces.

The practical difference is not annoying sites.

Brad Fitzpatrick | 2 May 18:13

Re: On the proliferation of well-known URLs; draft-nottingham-site-meta-01



On Sat, May 2, 2009 at 2:07 AM, Sam Johnston <samj <at> samj.net> wrote:
> Brad,
>
> Sounds like a good idea, though I'd avoid using characters which are likely to break scripts/filters/etc.

That's kinda the point.

> Perhaps something like "/__meta__" would be a better bet?

That's even worse than host-meta in the sense that it's more common of a username pattern (more sites allow "_" than "-" in a username).  And one of my main concerns is the number of sites that give users a URL like site.com/USERNAME


Eran Hammer-Lahav | 4 May 17:18

RE: On the proliferation of well-known URLs; draft-nottingham-site-meta-01

This makes sense but I don’t have strong views here. My only concern is to get this adopted for host-meta so whatever proves to be the path of least resistance works for me.

On the business of using odd characters, isn’t choosing any ‘*/’ prefix going to solve your concerns? No site I know allows using ‘/’ for usernames, short URLs, etc. I’m afraid making the prefix “ugly” will cause people not to use it and will make documenting it confusing to people who will not be expecting something like that.

EHL

From: apps-discuss-bounces <at> ietf.org [mailto:apps-discuss-bounces <at> ietf.org] On Behalf Of Brad Fitzpatrick
Sent: Saturday, May 02, 2009 9:13 AM
To: Sam Johnston
Cc: apps-discuss <at> ietf.org
Subject: Re: On the proliferation of well-known URLs; draft-nottingham-site-meta-01

On Sat, May 2, 2009 at 2:07 AM, Sam Johnston <samj <at> samj.net> wrote:
> Brad,
>
> Sounds like a good idea, though I'd avoid using characters which are likely to break scripts/filters/etc.

That's kinda the point.

> Perhaps something like "/__meta__" would be a better bet?

That's even worse than host-meta in the sense that it's more common of a username pattern (more sites allow "_" than "-" in a username).  And one of my main concerns is the number of sites that give users a URL like site.com/USERNAME

Nicolas Williams | 4 May 18:01

Re: On the proliferation of well-known URLs; draft-nottingham-site-meta-01

On Sat, May 02, 2009 at 09:11:14AM -0700, Brad Fitzpatrick wrote:
> On Sat, May 2, 2009 at 5:30 AM, Mark Baker <distobj <at> acm.org> wrote:
> > > 2) There are already too many well-known URLs cluttering up the
> > namespace:
> > >
> > > /robots.txt
> > > /favicon.ico
> > > /crossdomain.xml
> > >
> > > We can't fix those, but rather than make another one (and don't kid
> > > yourself: /host-meta won't be the last one)
> >
> > I agree, but because there will be people who don't know about the
> > protocol, and that will be the case no matter which protocol is used.
> 
> I don't follow.  I'm proposing a common prefix, not protocol, for well-known
> URIs cluttering up the namespace.  Unrelated to host-meta, I want a company
> in the future that _needs_ to have their little file at the top-level of a
> domain to be able to follow in our footsteps and use
> /;wellknown/new-file.txt

The well-known prefix should be used for new things like robots.txt,
host-meta, ...  That relies on developers to get that right, but I think
eventually enough will get it right for this to make a difference, thus
this idea seems worthwhile.

It would be nice to have a registry of well-known URI components.
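A registry of this kind could be as simple as a table of names that all resolve under one shared prefix. The following Python sketch is purely hypothetical: the prefix follows the "/;well_known/" spelling proposed earlier in the thread, and the function names register and well_known_uri are invented for illustration.

```python
# Assumed prefix, following the "/;well_known/" spelling proposed in the thread.
WELL_KNOWN_PREFIX = "/;well_known/"

# Hypothetical registry mapping a registered name to a short description.
registry = {}


def register(name, description):
    """Add a name to the registry, rejecting empty, nested, or duplicate names."""
    if not name or "/" in name:
        raise ValueError("name must be a single non-empty path segment")
    if name in registry:
        raise ValueError("name already registered: " + name)
    registry[name] = description


def well_known_uri(host, name):
    """Build the URI for a registered name on the given host."""
    if name not in registry:
        raise KeyError(name)
    return "http://" + host + WELL_KNOWN_PREFIX + name


register("host-meta", "site-wide metadata (draft-nottingham-site-meta-01)")
print(well_known_uri("example.com", "host-meta"))
# http://example.com/;well_known/host-meta
```

Because every registered name lives under the one prefix, a site only has to keep a single path segment out of its own namespace.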

> > The Web doesn't have directories, it has URIs,
> 
> Yes, thanks.  :)  I'm proposing that our well-known URI that we're creating
> here has a prefix which other well-known URIs in the future can share, to
> stop the build-up of well-known URIs in the namespace that sites have
> effectively allocated for other things, like users:
> 
> http://twitter.com/host-meta
> http://profiles.google.com/host-meta
> http://identi.ca/host-meta
> 
> Those don't exist at present, but it's not crazy to think there are sites
> out there where URLs like those could exist.

Exactly.  Why should some bright eyed developer come up with the Next
Big Thing that steps all over some other peoples' URIs when said
developer could use a reserved part of the namespace at no extra cost to
themselves?

Nico

Nicolas Williams | 4 May 18:06

Re: On the proliferation of well-known URLs; draft-nottingham-site-meta-01

On Mon, May 04, 2009 at 08:18:17AM -0700, Eran Hammer-Lahav wrote:
> This makes sense but I don’t have strong views here. My only concern
> is to get this adopted for host-meta so whatever proves to be the path
> of least resistance works for me.
> 
> On the business of using odd characters, isn’t choosing any ‘*/’
> prefix going to solve your concerns? No site I know allows using ‘/’
> for usernames, short URLs, etc. I’m afraid making the prefix “ugly”
> will cause people not to use it and will make documenting it confusing
> to people who will not be expecting something like that.

But sites do often allow users to have files, so "*/" falls down.