Addison Phillips | 1 Jan 18:21

Re: Re: Security and nationality

Frank Ellermann wrote:
>> --
>> Languages and language variations are often closely associated with
>> specific social, national, or ethnic affinities. Thus, language tags
>> used in content negotiation, like other information exchanged on the
>> Internet, might be a source of concern because they might be used to
>> infer information about the sender and thus identify potential targets
>> for surveillance.
>> --
> 
> Okay.  Jukka's proposal is also fine (maybe a bit long).  I like the
> keyword "private" in Stephane's version (Jukka has "personal", you've
> only "information").
> 

One's language preference in a request header is hardly "private" 
information. The whole point of having an Accept-Language header is 
sharing that information in requests. "Personal" is probably better in 
this context.

If we modify my paragraph to say "... infer personal information...", 
would that fix it? Or do you think that some other instance of 
"information" is the problem?

Also, should we note that highly idiosyncratic Language Preference Lists 
(to use the 4647 term) might act as a signature for the user?
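
To illustrate the "signature" concern, here is a minimal sketch, assuming an observer simply hashes the raw preference list. The function name preference_signature and the sample header values are hypothetical, not taken from any specification.

```python
import hashlib


def preference_signature(accept_language: str) -> str:
    """Reduce a language preference list to a short, stable identifier.

    Hypothetical helper: items are lower-cased and trimmed of surrounding
    whitespace, and their order is kept because order is part of the
    preference.
    """
    items = [item.strip().lower() for item in accept_language.split(",") if item.strip()]
    canonical = ",".join(items)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


if __name__ == "__main__":
    # A list shared by many users says little about any one of them;
    # an unusual list yields a value that few other users would share.
    print(preference_signature("en-US, en;q=0.9"))
    print(preference_signature("fi, se;q=0.8, yi;q=0.5, en;q=0.1"))
```

The hash itself is incidental; the point is only that an unusual list stays the same across requests and so can be recognised again.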

Addison


Nicolas Krebs | 2 Jan 23:16

Re: Security and nationality


>Date: Thu, 28 Dec 2006 13:46:52 +0200 (EET)
>From: "Jukka K. Korpela" <jkorpela <at> cs.tut.fi>
>To: ltru <at> ietf.org

A little off topic, on Accept-Language header: 

>In the so-called real world, we also have poorly written software that 
>does very nasty things at times. In particular, the Thunderbird email 
>program, following an old Netscape/Mozilla tradition, automatically 
>inserts an X-Accept-Language header, with values taken from the settings 
>in the Mozilla/Firefox web browser. It even includes that header into 
>Usenet postings. Thus, information intended for selecting between 
>different language versions of web pages is thereby sent, without 
>informing the user, in all outgoing email and Usenet messages. This is of 
>course all wrong, but it's probably not formally wrong by any 
>specification.

- http://www.cs.tut.fi/~jkorpela/headers.html#X-Accept-Language 
- RFC 4021 section 2.1.27 http://www.ietf.org/rfc/rfc4021.txt 
   http://tools.ietf.org/html/rfc4021#section-2.1.27
- news:448bd88f$0$163$a3f2974a <at> nnrp1.numericable.fr
  http://groups.google.com/group/fr.comp.mail/msg/aee199862e8e1a56
- http://bugzilla.mozilla.org/show_bug.cgi?id=234033#c47
- http://www.w3.org/International/questions/qa-lang-priorities

Nicolas Krebs | 2 Jan 23:22

Re: Re: two identical singleton extension tags issue


>Date: Thu, 21 Dec 2006 10:15:43 +0100
>From: Stephane Bortzmeyer <bortzmeyer <at> nic.fr>
>To: John Cowan <cowan <at> ccil.org>
>Copy: LTRU Working Group <ltru <at> ietf.org>

>On Wed, Dec 20, 2006 at 12:02:11PM -0500,
> John Cowan <cowan <at> ccil.org> wrote 
> a message of 24 lines which said:
>
>> What is more, we should publish the regex in 4646bis 
>
>-1
>
>Or even -X with X >> 1 :-)
>
>We publish the ABNF, which is the normative definition of the
>grammar. Publishing the grammar in yet another format is a recipe for
>subtle inconsistencies between them.

Could it be INFORMATIVE, as in RFC 4287 Appendix B?

>> so that people can use it in declarative contexts like XML schemas.
>
>Maybe we could suggest a very simplified regexp such as:
>
>[a-z0-9]+(-[a-z0-9]+)* 
>
>(case-insensitive, /i in Perl, re.IGNORECASE in Python)
>
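
The simplified pattern quoted above can be tried out in Python. This is only a sketch: the names SIMPLE_TAG and looks_like_tag are made up for illustration, and the pattern is much looser than the normative ABNF (it places no limit on subtag length or position).

```python
import re

# Simplified pattern from the quoted message; NOT the normative ABNF.
SIMPLE_TAG = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*", re.IGNORECASE)


def looks_like_tag(tag: str) -> bool:
    """Return True if the whole string matches the simplified pattern."""
    return SIMPLE_TAG.fullmatch(tag) is not None


if __name__ == "__main__":
    for candidate in ["en", "zh-Hant-TW", "x-private", "en--US", "-en", "en_US"]:
        print(candidate, looks_like_tag(candidate))
```

fullmatch is used rather than match so that trailing garbage such as "en_US" is rejected instead of matching only the leading "en".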

Addison Phillips | 3 Jan 01:26

Re: Security and nationality

It might not have occurred to you, but mailers might use this 
information to send bounce, error, and other messages in the language 
that the user prefers.

Interestingly, it can also help mail servers determine the mail encoding 
for those messages that arrive unlabeled (any hint is better than none 
at all).
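
The language-selection idea can be sketched as follows. This is a rough illustration, not a description of any real mailer: the function bounce_language and the SUPPORTED tuple are hypothetical, and q-values are ignored, so the header order is taken as the preference order.

```python
from email import message_from_string

# Hypothetical set of languages for which the mailer has message templates.
SUPPORTED = ("en", "fr", "fi")


def bounce_language(raw_message: str, default: str = "en") -> str:
    """Pick a language for a bounce or error message from the sender's headers."""
    msg = message_from_string(raw_message)
    header = msg.get("Accept-Language") or msg.get("X-Accept-Language") or ""
    for item in header.split(","):
        tag = item.split(";")[0].strip().lower()  # drop any ";q=..." parameter
        primary = tag.split("-")[0]               # compare on the primary subtag only
        if primary in SUPPORTED:
            return primary
    return default


if __name__ == "__main__":
    raw = "From: user@example.org\nX-Accept-Language: fr-FR, en\n\nHello\n"
    print(bounce_language(raw))
```

Matching on the primary subtag alone is the crudest possible lookup; a real implementation would more likely use the matching schemes in RFC 4647.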

Addison

Nicolas Krebs wrote:
>> Date: Thu, 28 Dec 2006 13:46:52 +0200 (EET)
>> From: "Jukka K. Korpela" <jkorpela <at> cs.tut.fi>
>> To: ltru <at> ietf.org
> 
> A little off topic, on Accept-Language header: 
> 
>> In the so-called real world, we also have poorly written software that 
>> does very nasty things at times. In particular, the Thunderbird email 
>> program, following an old Netscape/Mozilla tradition, automatically 
>> inserts an X-Accept-Language header, with values taken from the settings 
>> in the Mozilla/Firefox web browser. It even includes that header into 
>> Usenet postings. Thus, information intended for selecting between 
>> different language versions of web pages is thereby sent, without 
>> informing the user, in all outgoing email and Usenet messages. This is of 
>> course all wrong, but it's probably not formally wrong by any 
>> specification.
> 
> - http://www.cs.tut.fi/~jkorpela/headers.html#X-Accept-Language 
> - RFC 4021 section 2.1.27 http://www.ietf.org/rfc/rfc4021.txt 

Doug Ewell | 3 Jan 20:16

Unresolved issues for draft-4645bis

Now that the holidays are over and many people have returned from e-mail 
hiatus, I'd really like to see the following issues resolved:

1.  Decide how the new Registry, as initialized by RFC 4645bis, should 
handle inverted names:

    a.  Use only the uninverted names from ISO 639-3
    b.  Use only the inverted names from ISO 639-3
    c.  Use all names from ISO 639-3
    d.  Some other possibility

2.  Decide, based on which option above is chosen, whether any 
corresponding changes need to be made in the region subtags, some of 
which have a single Description in inverted form.

3.  Determine whether everyone is satisfied with the current wording in 
draft-4646bis-02, Section 3.1.4 about the first of multiple 
"Description" fields corresponding to the ISO 639-3 Reference Name, or 
whether there is still a desire for a discrete "Reference-Name" field.

These issues MUST all be resolved before I can prepare a new 
draft-4645bis-01.  The deadline for this draft to go to IETF Last 
Call -- not just WG Last Call -- is 29 days away, so we should resolve 
them soon.  I ask the co-chairs to help gauge the hum of the WG.

--
Doug Ewell  *  Fullerton, California, USA  *  RFC 4645  *  UTN #14
http://users.adelphia.net/~dewell/
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages

Addison Phillips | 3 Jan 21:27

Re: Unresolved issues for draft-4645bis

Doug Ewell wrote:
> Now that the holidays are over and many people have returned from e-mail 
> hiatus, I'd really like to see the following issues resolved:
> 
> 1.  Decide how the new Registry, as initialized by RFC 4645bis, should 
> handle inverted names:
> 
>    a.  Use only the uninverted names from ISO 639-3
>    b.  Use only the inverted names from ISO 639-3
>    c.  Use all names from ISO 639-3
>    d.  Some other possibility

I think (c) is the most reasonable solution. Why should we get into the 
business of choosing and editing the names at initial registration? The 
more we dabble with this, the more I favor following ISO 639.

> 
> 2.  Decide, based on which option above is chosen, whether any 
> corresponding changes need to be made in the region subtags, some of 
> which have a single Description in inverted form.

... which choice obviates this problem.

> 
> 3.  Determine whether everyone is satisfied with the current wording in 
> draft-4646bis-02, Section 3.1.4 about the first of multiple 
> "Description" fields corresponding to the ISO 639-3 Reference Name, or 
> whether there is still a desire for a discrete "Reference-Name" field.

I think "Reference-Name" as a field is a bad idea. I'm happy to make the 

Doug Ewell | 3 Jan 21:54

Re: Unresolved issues for draft-4645bis

Addison Phillips <addison at yahoo dash inc dot com> wrote:

>>    a.  Use only the uninverted names from ISO 639-3
>>    b.  Use only the inverted names from ISO 639-3
>>    c.  Use all names from ISO 639-3
>>    d.  Some other possibility
>
> I think (c) is the most reasonable solution. Why should we get into 
> the business of choosing and editing the names at initial 
> registration? The more we dabble with this, the more I favor following 
> ISO 639.

Great, and I do not disagree (although that does require a change to the 
first paragraph of draft-4646bis-02, Section 3.1.4).  What do others 
think?

BTW, for those who are deciding, there are approximately 1,400 inverted 
forms in ISO/FDIS 639-3.  The overall size impact of adding these to the 
Registry is not great.

> I think "Reference-Name" as a field is a bad idea. I'm happy to make 
> the first name that appears in a Description field "turn out" to be 
> the Reference Name. In which case, I think it should *exactly* match 
> the reference name, (un)inversions and all.

Great, and I do not disagree.  What do others think?

> I think that this should also be extended to all ISO based records, 
> not just language/extlangs. (I think this is mostly the case anyway.)


Doug Ewell | 3 Jan 22:32

Re: Well-formed vs. regular (Was: I-D ACTION:draft-ietf-ltru-4646bis-01.txt)

Addison Phillips <addison at yahoo dash inc dot com> wrote:

>> This is not perfect yet in -02 which says:
>>
>>    grandfathered = langtag   ; well-formed grandfathered tags
>>                  / irregular ; grandfathered that don't match langtag
>>
>> The "well-formed" should be replaced by "regular".
>
> Except that "regular" doesn't really convey any information here. Yes, 
> it is an antonym for 'irregular', but that doesn't really add 
> information. I think we should just omit "well-formed".

I agree.  This would define "langtag" and "regular" circularly, in terms 
of each other, much like writing:

grandfathered = A   ; grandfathered tags that aren't B
              / B   ; grandfathered tags that aren't A

Are the comments really necessary at all, or are the definitions of 
"langtag" and "irregular" sufficient (or even better) on their own?

--
Doug Ewell  *  Fullerton, California, USA  *  RFC 4645  *  UTN #14
http://users.adelphia.net/~dewell/
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages

John Cowan | 3 Jan 23:24

Definition of "grandfathered" considered lame

In the ABNF in 4646bis-02, a Language-Tag (the root production) can be
a langtag, a privateuse, or a grandfathered.  But a grandfathered can
be a langtag or an irregular.  Consequently, the grammar is technically
ambiguous.

I propose that the Language-Tag production be changed to refer to
irregular directly rather than grandfathered.  The grandfathered
production needs to be kept because there is a reference to it in 3.1.2.
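
The ambiguity, and the effect of the proposed change, can be shown with a toy model. This is a sketch under loud assumptions: the regular expressions below are crude stand-ins for the real langtag and privateuse productions, IRREGULAR lists only two sample tags, and the function derivations is invented for illustration.

```python
import re

# Toy stand-ins for the real ABNF productions (assumptions, not the draft's grammar).
LANGTAG = re.compile(r"[a-z]{2,3}(-[a-z0-9]{2,8})*", re.IGNORECASE)
PRIVATEUSE = re.compile(r"x(-[a-z0-9]{1,8})+", re.IGNORECASE)
IRREGULAR = {"i-klingon", "i-default"}  # sample tags that do not fit langtag


def derivations(tag: str, fixed: bool = False) -> list:
    """List every way Language-Tag can derive `tag`.

    fixed=False models Language-Tag = langtag / privateuse / grandfathered,
    with grandfathered = langtag / irregular.
    fixed=True models the proposal: Language-Tag refers to irregular directly.
    """
    is_langtag = LANGTAG.fullmatch(tag) is not None
    is_irregular = tag.lower() in IRREGULAR
    paths = []
    if is_langtag:
        paths.append("Language-Tag -> langtag")
    if PRIVATEUSE.fullmatch(tag):
        paths.append("Language-Tag -> privateuse")
    if fixed:
        if is_irregular:
            paths.append("Language-Tag -> irregular")
    else:
        if is_langtag:
            paths.append("Language-Tag -> grandfathered -> langtag")
        if is_irregular:
            paths.append("Language-Tag -> grandfathered -> irregular")
    return paths


if __name__ == "__main__":
    for tag in ["en-US", "i-klingon", "x-foo"]:
        print(tag, derivations(tag), derivations(tag, fixed=True))
```

In this model every tag that matches langtag has two derivations under the current grammar and exactly one under the proposed one, while irregular and private-use tags are unaffected.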

On the editorial side, the term "well-formed tag" is not really defined
anywhere.  It should be defined in one place and then used.  This would
be improved by not talking about "well-formed processors", but rather
about "non-validating processors", which is what is meant.

-- 
A rabbi whose congregation doesn't want         John Cowan
to drive him out of town isn't a rabbi,         http://www.ccil.org/~cowan
and a rabbi who lets them do it                 cowan <at> ccil.org
isn't a man.    --Jewish saying

John Cowan | 3 Jan 23:36

Re: Unresolved issues for draft-4645bis

Doug Ewell scripsit:

> >I think (c) is the most reasonable solution. Why should we get into 
> >the business of choosing and editing the names at initial 
> >registration? The more we dabble with this, the more I favor following 
> >ISO 639.
> 
> Great, and I do not disagree (although that does require a change to the 
> first paragraph of draft-4646bis-02, Section 3.1.4).  What do others 
> think?

I reluctantly agree.  The names are a mess, and we might as well capture
all that we can, as it may help somebody.

> >I think "Reference-Name" as a field is a bad idea. I'm happy to make 
> >the first name that appears in a Description field "turn out" to be 
> >the Reference Name. In which case, I think it should *exactly* match 
> >the reference name, (un)inversions and all.
> 
> Great, and I do not disagree.  What do others think?

In the ISO 639 world, only the entries in 639-3 have a reference name.
In particular, the language-collection codes of 639-2 are not replicated
in 639-3 and have no reference names.  I don't care how you sort the
fields, but I am very much against giving guarantees about the order of
Description fields in 4646bis.  You'd have to say "The first Description
field is the reference name provided by the standard, if it provides
one at all."  Since we do not document which subtags come from which
standards, this is of very little use.


