Peter Constable | 1 Jul 2008 17:33

extlang & deprecation (was draft updated

> From: ltru-bounces <at> ietf.org [mailto:ltru-bounces <at> ietf.org] On Behalf Of
> Martin Duerst
> Sent: Friday, June 20, 2008 12:01 AM
> Subject: Re: [Ltru] draft updated

> >Just letting you and the WG know that I continue to *strongly* favor
> >making "zh-cmn" and "zh-yue" (and the rest) the preferred forms, and
> >"cmn" and "yue" the pre-deprecated ones.
>
> [technical hat on] Seconded.  Regards, Martin.

I haven't commented on this yet, nor has there been much discussion. I'm guessing that there are some
differences of opinion here. Indeed, it strikes me that opinions may correlate to a large degree with
opinions on whether to keep use of extlang or not.

I think there are three approaches that can be taken wrt extlang, only two of which have really been seen so far.
For language productions X-Y and Y, where X denotes a macrolanguage that encompasses Y:

1. only allow one of X-Y or Y to be used
2. allow either X-Y or Y, but deprecate one of them
3. allow either X-Y or Y without deprecation

#1, of course, we have spent months on without reaching any consensus. Hence, I think #1 is dead.

#2 is currently on the table, but I see indicators that we could get stuck on it. (In fact, I'm guessing that
one reason there haven't been responses on John's and Martin's comments is that people are afraid of
getting into the same rut we were in wrt #1.)

I also see another potential concern wrt #2: I think there is an implicit assumption that X-Y and Y be
considered semantically equivalent in terms of what they denote (although they may have slightly
[...]

Phillips, Addison | 1 Jul 2008 18:47

Re: extlang & deprecation (was draft updated

Peter wrote:
>
> I also see another potential concern wrt #2: I think there is an
> implicit assumption that X-Y and Y be considered semantically
> equivalent in terms of what they denote (although they may have
> slightly different behaviours in certain matching scenarios). Yet,
> with one or the other deprecated, there's some likelihood that some
> implementations will assume that the deprecated form can be largely
> disregarded in designing matching behaviour.

Disregarded may not be the right word. In my implementation, I canonicalize the deprecated form away
before matching. A "well-formed" 4646 or 3066 implementation might not do that and arrive at different
results. Besides, what you're proposing below is that both forms be treated as semantically equivalent,
which is tantamount to "disregarding" one of the forms.
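
A minimal Python sketch of the canonicalize-then-match approach described above. The `PREFERRED` table is a hypothetical stand-in for registry data, and the direction of the mapping (extlang form to bare form) is an assumption made only for illustration, since the thread has not settled which form would be deprecated. `basic_filter` follows RFC 4647 basic filtering; all function names are illustrative.

```python
# Hypothetical stand-in for registry data: deprecated prefix -> preferred prefix.
# The direction shown here is an assumption; the WG has not agreed on it.
PREFERRED = {
    "zh-cmn": "cmn",
    "zh-yue": "yue",
}


def canonicalize(tag: str) -> str:
    """Lowercase the tag and replace a deprecated leading prefix, if any."""
    lowered = tag.lower()
    for deprecated, preferred in PREFERRED.items():
        if lowered == deprecated or lowered.startswith(deprecated + "-"):
            return preferred + lowered[len(deprecated):]
    return lowered


def basic_filter(language_range: str, tag: str) -> bool:
    """RFC 4647 basic filtering: the range equals the tag, or is a prefix of
    it that ends on a subtag boundary. Comparison is case-insensitive."""
    r, t = language_range.lower(), tag.lower()
    return r == "*" or t == r or t.startswith(r + "-")


def match_canonical(language_range: str, tag: str) -> bool:
    """Canonicalize both sides first, then apply unmodified basic filtering."""
    return basic_filter(canonicalize(language_range), canonicalize(tag))
```

With this table, `match_canonical("zh-cmn", "cmn-Hans")` succeeds, while `basic_filter("zh-cmn", "cmn-Hans")` on the raw strings does not. The matching code itself is unchanged; only the inputs differ.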

>
> Btw, it should be noted that both #1 and #2 lead to consideration
> of cherry picking.

I don't see why.

What we're doing here is recommending one form over another for equivalent tags. Let me draw a parallel
here. I work with a group of developers who code in Java. Like most development teams, we have coding style
guidelines. One of them has to do with "if" statements. Both of the following forms (a) and (b) are
completely equivalent, but one of them is preferred:

a) if (boolean) {
      // code
   } else {
      // code
[...]

John Cowan | 1 Jul 2008 19:48

Re: extlang & deprecation (was draft updated

Peter Constable scripsit:

> Now, let me propose an elaboration of #3 for adoption. This elaboration
> is captured by three points:
> 
> (a) that both X-Y and Y are freely allowed,
> (b) that at the level of the language production X-Y and Y must always
> be considered a match (regardless of which is part of a tag or of a
> language range), but

I don't understand what (b) means, particularly in contrast with (c).
The whole point in allowing both is that in some contexts X-Y works
better with naive matching, and in some contexts Y works better.

> (c) that how X and Y compare in matching is a separate consideration
> (perhaps with some suggestions but ultimately left to implementations).

I assume that this proposal is still in the context of only allowing a
small number of macrolanguages (plus 'sgn') as Xs in X-Y?

-- 
[W]hen I wrote it I was more than a little              John Cowan
febrile with foodpoisoning from an antique carrot       cowan <at> ccil.org
that I foolishly ate out of an illjudged faith          http://ccil.org/~cowan
in the benignancy of vegetables.  --And Rosta

Shawn Steele | 1 Jul 2008 20:18

Re: extlang & deprecation (was draft updated

> This is where #3 is a non-starter for me: it requires us to
> change all of the matching schemes in ways that are
> incompatible with our previous tenets.

We already have to.  There's no way you can compare cmn with zh-HK or vice versa (if you want to) without
modifying the doc.

> I didn't have to change the matching code because I
> canonicalize tags and ranges before matching. With a deprecation,
> the registry provides all of the information for this using
> the same mechanism I already use for mapping

The registry could provide similar equivalence for you, without deprecation, since you find the registry useful.

> "Better" may not be the right word. "Preferred" would better
> describe the situation. Either form may be better for *your*
> application (or mine), depending on circumstances.

That's not the definition of Deprecate:

  From OED & Webster online:

  Deprecate dep-re-cate:

  1. trans. To pray against (evil); to pray for deliverance from; to seek to avert by prayer. arch.
  3. trans. To plead earnestly against; to express an earnest wish against (a proceeding); to express
earnest disapproval of (a course, plan, purpose, etc.).

  And:

[...]

Phillips, Addison | 1 Jul 2008 20:44

Re: extlang & deprecation (was draft updated

>
> > This is where #3 is a non-starter for me: it requires us to
> > change all of the matching schemes in ways that are
> > incompatible with our previous tenets.
>
> We already have to.  There's no way you can compare cmn with zh-HK
> or vice versa (if you want to) without modifying the doc.

(laughing) Sure you can: they don't match. Not even if you canonicalize in the other direction (to zh-cmn).

You can create, of course, a proprietary matching scheme that recognizes that there is some relationship
between these two tags/ranges. But this is somewhat similar to Mark's famous handling of Breton and
French. That is, "en-US" and "en-boont" don't match, even though, obviously, there is some kind of
relationship between them.
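
The non-matches above can be checked against RFC 4647 basic filtering, sketched here in Python. The function name `basic_filter` is illustrative and not taken from any library.

```python
def basic_filter(language_range: str, tag: str) -> bool:
    """RFC 4647 basic filtering: the range equals the tag, or is a prefix of
    it that ends on a subtag boundary. Comparison is case-insensitive."""
    r, t = language_range.lower(), tag.lower()
    return r == "*" or t == r or t.startswith(r + "-")


# (range, tag) pairs from the discussion. None of them match,
# with or without the extlang form.
pairs = [
    ("cmn", "zh-HK"),
    ("zh-HK", "cmn"),
    ("zh-cmn", "zh-HK"),
    ("zh-HK", "zh-cmn"),
    ("en-US", "en-boont"),
]
results = {pair: basic_filter(*pair) for pair in pairs}
```

Every entry in `results` is `False`. A shared prefix such as "zh" or "en" only produces a match when the range itself is that prefix, as in `basic_filter("zh", "zh-HK")`.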

>
> > I didn't have to change the matching code because I
> > canonicalize tags and ranges before matching. With a deprecation,
> > the registry provides all of the information for this using
> > the same mechanism I already use for mapping
>
> The registry could provide similar equivalence for you, without
> deprecation, since you find the registry useful.

Ah... but I already have a validating implementation that produces the "right" result via tag
canonicalization (something I'm permitted to do to a tag or range). The distinction is that we'd be
introducing a *separate* process only done in tag matching? Ick.

I find the registry useful for validation, but don't ignore the fact that I don't need the registry to peel
off the macrolanguage from an extlang in a well-formed implementation. Requiring the registry for
[...]

Peter Constable | 1 Jul 2008 20:53

Re: extlang & deprecation (was draft updated

> From: John Cowan [mailto:cowan <at> ccil.org]

> > (b) that at the level of the language production X-Y and Y must
> always
> > be considered a match (regardless of which is part of a tag or of a
> > language range), but
>
> I don't understand what (b) means, particularly in contrast with (c).

For X-Y-Z (Z being some string) and Y-Z: (b) X-Y-Z and Y-Z are always a match, but (c) whether X-Z and Y-Z
match is a separate matter.
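
One possible reading of (b), sketched in Python. It relies on the RFC 4646 grammar, in which a three-letter alphabetic subtag directly after a two- or three-letter primary language subtag is an extlang. The function names are hypothetical, and the sketch simplifies: it handles a single extlang and ignores grandfathered tags. Point (c), how X-Z and Y-Z compare, is deliberately left out.

```python
import re

PRIMARY = re.compile(r"[a-z]{2,3}")   # 2*3ALPHA primary language subtag
EXTLANG = re.compile(r"[a-z]{3}")     # 3ALPHA extended language subtag


def drop_macrolanguage(tag: str) -> str:
    """Reduce a tag of the form X-Y-Z to Y-Z; leave other tags unchanged
    (apart from lowercasing)."""
    subtags = tag.lower().split("-")
    if (
        len(subtags) >= 2
        and PRIMARY.fullmatch(subtags[0])
        and EXTLANG.fullmatch(subtags[1])
    ):
        subtags = subtags[1:]
    return "-".join(subtags)


def same_language_production(a: str, b: str) -> bool:
    """Treat X-Y-Z and Y-Z as the same tag, per point (b)."""
    return drop_macrolanguage(a) == drop_macrolanguage(b)
```

Under these assumptions `same_language_production("zh-yue-HK", "yue-HK")` holds, while "zh-HK" and "yue-HK" stay distinct. Script and region subtags are never mistaken for an extlang because they are four and two characters long respectively.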

> I assume that this proposal is still in the context of only allowing a
> small number of macrolanguages (plus 'sgn') as Xs in X-Y?

That choice is left open. Part of what I was saying is that I think there's less potential need to consider
cherry picking in that particular regard for #3 than there is for #1 or #2: since X-Y and Y equivalence would
be guaranteed, then there really isn't a need to say up front that only a small set of macrolanguages are
candidates for X.

Peter

John Cowan | 1 Jul 2008 20:57

Re: extlang & deprecation (was draft updated

Shawn Steele scripsit:

> That's not the definition of Deprecate:

Dictionary definitions of "deprecate" aren't relevant, because it is a term
of art among programmers and standardizers.  From Wikipedia:

        In computer software standards and documentation, the term
        deprecation is applied to software features that are superseded
        and should be avoided. Although deprecated features remain
        in the current version, their use may raise warning messages
        recommending alternate practices, and deprecation may indicate
        that the feature will be removed in the future. Features are
        deprecated -- rather than being removed -- in order to provide
        backward compatibility and give programmers using the feature
        time to bring their code into compliance with the new standard.
                http://en.wikipedia.org/wiki/Deprecation

In our case, however, deprecated tags and subtags remain available for
use indefinitely: we never remove anything.  The same is true for
Linnaean species names:

        An example in paleontology would be Brontosaurus, a deprecated
        term for the genus Apatosaurus.

-- 
John Cowan    http://ccil.org/~cowan    cowan <at> ccil.org
SAXParserFactory [is] a hideous, evil monstrosity of a class that should
be hung, shot, beheaded, drawn and quartered, burned at the stake,
buried in unconsecrated ground, dug up, cremated, and the ashes tossed
[...]

Phillips, Addison | 1 Jul 2008 21:03

Re: extlang & deprecation (was draft updated

>
> For X-Y-Z (Z being some string) and Y-Z: (b) X-Y-Z and Y-Z are
> always a match, but (c) whether X-Z and Y-Z match is a
> separate matter.

X-Z and Y-Z would match in *extended* filtering if X-Z is the range and Y-Z is equivalent to X-Y-Z, but at no
other time.
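
A Python sketch of RFC 4647 extended filtering (section 3.3.2), to illustrate the case above. The function name is illustrative, and "zh", "yue" and "HK" stand in for X, Y and Z. Rewriting Y-Z as X-Y-Z before matching is assumed to happen elsewhere.

```python
def extended_filter(language_range: str, tag: str) -> bool:
    """RFC 4647 extended filtering of one tag against one range."""
    range_subtags = language_range.lower().split("-")
    tag_subtags = tag.lower().split("-")

    # The first subtags must match unless the range starts with a wildcard.
    if range_subtags[0] != "*" and range_subtags[0] != tag_subtags[0]:
        return False

    r, t = 1, 1
    while r < len(range_subtags):
        if range_subtags[r] == "*":
            r += 1                  # wildcard: move on to the next range subtag
        elif t >= len(tag_subtags):
            return False            # tag ran out before the range did
        elif range_subtags[r] == tag_subtags[t]:
            r += 1                  # subtags agree: advance both
            t += 1
        elif len(tag_subtags[t]) == 1:
            return False            # a singleton in the tag cannot be skipped
        else:
            t += 1                  # skip an unmatched, non-singleton tag subtag
    return True
```

Here `extended_filter("zh-HK", "zh-yue-HK")` succeeds because the "yue" subtag is skipped, whereas `extended_filter("zh-HK", "yue-HK")` fails at the first subtag.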

It is useful to note that in lookup, you cannot obtain the Y-Z resource itself by requesting X-Z. You can
obtain its parent resource X if you canonicalize to X-Y-Z and then fall back, but not Y-Z (or even X-Y-Z).
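
The fallback behaviour can be sketched with a simplified version of RFC 4647 lookup (section 3.4) in Python. The function name and the sample tag lists are illustrative. Wildcards in the range and the default value are not handled.

```python
from typing import Iterable, Optional


def lookup(language_range: str, available: Iterable[str]) -> Optional[str]:
    """Return the best available tag for the range by truncating the range
    from the end, or None when nothing matches."""
    by_lower = {tag.lower(): tag for tag in available}
    subtags = language_range.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in by_lower:
            return by_lower[candidate]
        subtags.pop()
        # A single-character subtag left at the end is removed as well.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return None
```

With `available = ["zh", "yue-HK"]`, both `lookup("zh-HK", available)` and `lookup("zh-yue-HK", available)` fall back to "zh"; neither reaches "yue-HK". With only "yue-HK" available, `lookup("zh-HK", ["yue-HK"])` returns `None`.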

In basic filtering you can never obtain Y-Z by requesting X-Z. Neither can you obtain X-Z by requesting Y-Z
(or X-Y-Z). As a reminder of why: consider Suppress-Script.

>
>
> > I assume that this proposal is still in the context of only
> allowing a
> > small number of macrolanguages (plus 'sgn') as Xs in X-Y?
>
> That choice is left open. Part of what I was saying is that I think
> there's less potential need to consider cherry picking in that
> particular regard for #3 than there is for #1 or #2: since X-Y and
> Y equivalence would be guaranteed, then there really isn't a need
> to say up front that only a small set of macrolanguages are
> candidates for X.
>

Multiple tag choices for the same meaning are Bad News, regardless of what form they take. We should create as
few as possible (but no fewer). The "equivalence guarantee" you seek is, I fear, fraught with
[...]

Shawn Steele | 1 Jul 2008 21:20

Re: extlang & deprecation (was draft updated

It's the "and should be avoided" part that trips me up.

- Shawn

-----Original Message-----
From: John Cowan [mailto:cowan <at> ccil.org]
Sent: Tuesday, July 01, 2008 11:58 AM
To: Shawn Steele
Cc: Phillips, Addison; Peter Constable; LTRU Working Group
Subject: Re: [Ltru] extlang & deprecation (was draft updated

Shawn Steele scripsit:

> That's not the definition of Deprecate:

Dictionary definitions of "deprecate" aren't relevant, because it is a term
of art among programmers and standardizers.  From Wikipedia:

        In computer software standards and documentation, the term
        deprecation is applied to software features that are superseded
        and should be avoided. Although deprecated features remain
        in the current version, their use may raise warning messages
        recommending alternate practices, and deprecation may indicate
        that the feature will be removed in the future. Features are
        deprecated -- rather than being removed -- in order to provide
        backward compatibility and give programmers using the feature
        time to bring their code into compliance with the new standard.
                http://en.wikipedia.org/wiki/Deprecation

In our case, however, deprecated tags and subtags remain available for
[...]

Phillips, Addison | 1 Jul 2008 21:24

Re: extlang & deprecation (was draft updated

Use the RFC 2119 meaning of "SHOULD" here:

   This word, or the adjective "RECOMMENDED", mean that there
   may exist valid reasons in particular circumstances to ignore a
   particular item, but the full implications must be understood and
   carefully weighed before choosing a different course.

If you think you want to use the deprecated form, think about the implications (sometimes you won't get the
matching you desire) before proceeding. Since we make clear what those considerations are, people can
make an informed decision.

Of course, we still have a disagreement (so far between Mark/Addison and Martin/John) about which way to
point the deprecating arrow, but that's a small matter.

Addison

Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.

> -----Original Message-----
> From: Shawn Steele [mailto:Shawn.Steele <at> microsoft.com]
> Sent: Tuesday, July 01, 2008 12:20 PM
> To: John Cowan
> Cc: Phillips, Addison; Peter Constable; LTRU Working Group
> Subject: RE: [Ltru] extlang & deprecation (was draft updated
>
> It's the "and should be avoided" part that trips me up.
[...]

