Doug Ewell | 1 Aug 04:37

Re: Duplicate Busters: Survey #1 [bwo] and [bxx]

Joan Spanne <ISO639 dash 3 at sil dot org> wrote:

> I will take action to resolve the matter for each of the four (Aruá; 
> Awa; Borna; Murik) by early next week.

I'm delighted to see that the RA is willing to resolve these conflicts
within ISO 639-3, so we don't have to deviate from them.  I'll hold off
on any action here until the new 639-3 files come out.

> I am inclined to accept Doug's recommendations, with the exception of 
> [Aruá]. Precedent within the standard uses a state or province level 
> geographic qualifier, so those would be [arx] "Aruá (Rodonia State)" 
> and [aru] "Aruá (Amazonas State)". If they were geographically 
> proximal to the district level, the next choice of qualifier would be 
> classification based (the highest level where they are distinct).

That is perfectly fine.  I wanted to stay consistent with the precedent,
but couldn't find state-level identifications in the Ethnologue pages
(though I see it now for 'arx').

--
Doug Ewell  *  Thornton, Colorado, USA  *  RFC 4645  *  UTN #14
http://www.ewellic.org
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages

_______________________________________________
Ltru mailing list
Ltru <at> ietf.org
https://www.ietf.org/mailman/listinfo/ltru

Doug Ewell | 1 Aug 07:02

Duplicate Busters: Survey #2

This is the second of two surveys being sent to both LTRU and 
ietf-languages on the subject of removing certain duplicate Description 
fields in the Language Subtag Registry.  Some of these issues affect the 
current Registry, while others affect only the proposed RFC 4646bis 
Registry being considered by LTRU.

Whereas the first survey dealt with eliminating duplicates across 
records by adding differentiating text, this survey deals with removing 
essentially duplicate Description fields within a record.  "Essentially 
duplicate" in this sense means either of two things:

1.  Two Description fields are identical, except for different 
punctuation marks (hyphens or apostrophes), or one contains letters with 
diacritical marks while the other is a pure-ASCII equivalent (i.e. all 
diacritical marks stripped).  No other types of spelling differences are 
considered (such as Kirghiz vs. Kyrgyz, or Dhivehi vs. Divehi).  The 
premise is that both Description fields convey the exact same content, 
but using slightly different typography.  The goal is to pick one and 
discard the other.

2.  Two Description fields are identical, except that one includes a 
parenthetical comment signifying a region or individual/macrolanguage 
status, and the other does not.  In each case, the description with 
comment is the ISO 639-3 name, while the description without comment is 
the ISO 639-1 and/or -2 name.  The premise is that the commented names 
convey the same content, but are less likely to be confused with other 
similarly named languages.  The goal (I hope) is to pick the commented 
(639-3) name and discard the uncommented (639-2) name; a reasonable 
alternative would be to continue to list both names.
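
The two criteria above can be sketched in code. What follows is a minimal Python sketch, not part of the survey itself: the function names (fold_typography, strip_comment, classify_pair) are hypothetical, and the exact set of punctuation marks treated as equivalent is an assumption based on the examples discussed in this thread. It also assumes Description values arrive in the Registry's escaped form (e.g. &#x16D;) and decodes them with html.unescape before comparing.

```python
import html
import re
import unicodedata


def fold_typography(description: str) -> str:
    """Criterion 1: ignore diacritics, hyphens and apostrophe-like marks."""
    # Decompose so that diacritics become separate combining characters.
    decomposed = unicodedata.normalize("NFD", description)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Assumption: ASCII apostrophe, U+2019 and U+02BB all count as apostrophes.
    no_apostrophes = re.sub(r"['\u2019\u02BB]", "", no_marks)
    # Assumption: a hyphen and a space are interchangeable.
    return re.sub(r"[-\s]+", " ", no_apostrophes).strip().casefold()


def strip_comment(description: str) -> str:
    """Criterion 2: drop a trailing parenthetical comment, if any."""
    return re.sub(r"\s*\([^)]*\)\s*$", "", description)


def classify_pair(a: str, b: str):
    """Return 'typography', 'comment', or None for two Description values."""
    a, b = html.unescape(a), html.unescape(b)
    if fold_typography(a) == fold_typography(b):
        return "typography"
    if strip_comment(a) == strip_comment(b):
        return "comment"
    return None
```

Under these assumptions, classify_pair("Hangul", "Hang&#x16D;l") falls under criterion 1, classify_pair("Malay (macrolanguage)", "Malay") falls under criterion 2, and a genuine spelling difference such as "Kirghiz" vs. "Kyrgyz" matches neither.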


Felix Sasaki | 1 Aug 07:27

Using BCP 47 for translation information

There is a proposal to use BCP 47 for translation information in HTML5. See
http://lists.w3.org/Archives/Public/public-html/2008Aug/0005.html
and the start of the thread at
http://lists.w3.org/Archives/Public/public-html/2008Jul/0427.html
Comments are very welcome.

Felix

Phillips, Addison | 1 Aug 07:34

Re: Using BCP 47 for translation information

Note: I made a partially responding comment on the "winter" (www-international <at> w3.org) list in response
to Leif Halvard Silli here:

http://lists.w3.org/Archives/Public/www-international/2008JulSep/0023.html

Addison

Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.

> -----Original Message-----
> From: ltru-bounces <at> ietf.org [mailto:ltru-bounces <at> ietf.org] On
> Behalf Of Felix Sasaki
> Sent: Thursday, July 31, 2008 10:27 PM
> To: LTRU Working Group
> Subject: [Ltru] Using BCP 47 for translation information
> 
> There is a proposal to use BCP 47 for translation information in
> HTML5. See
> http://lists.w3.org/Archives/Public/public-html/2008Aug/0005.html
> and the start of the thread at
> http://lists.w3.org/Archives/Public/public-html/2008Jul/0427.html
> Comments are very welcome.
> 
> Felix

John Cowan | 1 Aug 07:54

Re: Duplicate Busters: Survey #2

Doug Ewell scripsit:

[snip]

In each case keep the first except as noted.

> ---
> 
> Type: script
> Subtag: Ethi
> Description: Ge&#x2BB;ez
> Description: Ge'ez

Keep the second.

> ---
> 
> Type: script
> Subtag: Hang
> Description: Hangul
> Description: Hang&#x16D;l
> Description: Hangeul

Keep the first and third.

-- 
John Cowan  cowan <at> ccil.org  http://ccil.org/~cowan
And now here I was, in a country where a right to say how the country should
be governed was restricted to six persons in each thousand of its population.
For the nine hundred and ninety-four to express dissatisfaction with the

Kent Karlsson | 1 Aug 10:50

Re: Duplicate Busters: Survey #1

Doug Ewell wrote:
> This is the first of two surveys that are being distributed 

I agree with Doug's suggested changes in the "#1" list, in particular
if 639-3 RA misses out on some name changes (promised for next week).

	/kent k

Kent Karlsson | 1 Aug 10:50

Re: Duplicate Busters: Survey #2

Doug Ewell wrote:
> 1.  Two Description fields are identical, [...]
> or one contains letters with 
> diacritical marks while the other is a pure-ASCII
> equivalent (i.e. all 
> diacritical marks stripped).  [...]  The 
> premise is that both Description fields convey
> the exact same content, 
> but using slightly different typography. ...

I do **NOT** agree with the position that removing diacritical
marks would be "slightly different typography". It is a difference
in spelling, much the same as differences in spelling that you
excluded from your list ["(such as Kirghiz vs. Kyrgyz, or Dhivehi
vs. Divehi)"] and thus want to keep as multiple names.

As for the other items in your "#2" list, keep just the ISO 639-3
names. (Don't generalise my statement here. As you know, I think
some of the items not on this "#2" list need spell correction.)

	/kent k

Frank Ellermann | 1 Aug 17:27

Re: Duplicate Busters: Survey #2

Doug Ewell wrote:

 [set 1]
> The goal is to pick one and discard the other.

For ASCII vs. non-ASCII "spelling" differences I'd
doubt that this is a good goal.

 [set 2]
> the description without comment is the ISO 639-1
> and/or -2 name.

IOW the "relevant" name for all Internet protocols,
Web standards, etc. using RFC 1766, 3066, or 4646
tags.  A quite significant number of existing tags.

> Type: language
> Subtag: ms
> Description: Malay (macrolanguage)
> Description: Malay

If the 4646bis proponents invent some kind of scope
field indicating "macrolanguage" the longer name is
not strictly necessary.

If they'd invent a flag (*) they could even indicate
that this is not the main entry for Malay IFF there
will be a new "individual" Malay dupe.

I prefer the shorter description, assuming that the

CE Whitehead | 1 Aug 17:30

Re: Duplicate Busters: Survey #2

Hi, I think I understand Kent Karlsson to be saying that Hangul,
Hang&#x16D;l, and Hangeul should all be kept as Description fields.  If
so, then I am in agreement (but I'm not an expert, so I'd be willing to
listen to other arguments; I did not get John Cowan's reasoning for
retaining just the first and third of these three fields).  I think it's
better to list the different ways the names are spelled, as it makes the
description more easily recognized.  (It would help me to see a
particular spelling I was used to.)

 
--C. E. Whitehead
cewcathar <at> hotmail.com


> From: "Doug Ewell" <doug <at> ewellic.org>
> Subject: Duplicate Busters: Survey #2

> Type: script
> Subtag: Hang
> Description: Hangul
> Description: Hang&#x16D;l
> Description: Hangeul
 
> (Technically I should not be including Hangeul, which is a different
> transcription of the same Korean word, not a genuinely different
> name.  Make your own judgment.)

 


From: "Kent Karlsson" kent.karlsson14 <at> comhem.se

To: "'LTRU Working Group'" <ltru <at> ietf.org>, <ietf-languages <at> iana.org>
> Doug Ewell wrote:
> > 1. Two Description fields are identical, [...]
> > or one contains letters with
> > diacritical marks while the other is a pure-ASCII
> > equivalent (i.e. all
> > diacritical marks stripped). [...] The
> > premise is that both Description fields convey
> > the exact same content,
> > but using slightly different typography. ...
 
> I do **NOT** agree with the position that removing diacritical
> marks would be "slightly different typography". It is a difference
> in spelling, much the same as differences in spelling that you
> excluded from your list ["(such as Kirghiz vs. Kyrgyz, or Dhivehi
> vs. Divehi)"] and thus want to keep as multiple names.

ISO639-3 | 1 Aug 18:42

Re: Duplicate Busters: Survey #2 [rup]


I have no problem with adding the hyphen, so that ISO 639-3 agrees with 639-2. It came without the hyphen from Ethnologue into the ISO/DIS 639-3, and apparently no one noticed the issue until now.

-Joan


"Frank Ellermann" <nobody <at> xyzzy.claranet.de>
Sent by: ietf-languages-bounces <at> alvestrand.no

2008-08-01 10:27 AM

Please respond to
Frank Ellermann <hmdmhdfmhdjmzdtjmzdtzktdkztdjz <at> gmail.com>

To
ietf-languages <at> alvestrand.no
cc
ltru <at> ietf.org
Subject
Re: Duplicate Busters: Survey #2




...

> Type: language
> Subtag: rup
> Description: Macedo Romanian
> Description: Macedo-Romanian

That is stupid.  Pick the currently registered name,
or convince ISO 639 to toss a coin.  Note what you
have done manually in 4645bis.

...

Frank

_______________________________________________
Ietf-languages mailing list
Ietf-languages <at> alvestrand.no
http://www.alvestrand.no/mailman/listinfo/ietf-languages

