Frank Ellermann | 1 Oct 01:55

Re: RFC 4690 on Review and Recommendations for Internationalized Domain Names (IDNs)

rfc-editor <at> rfc-editor.org wrote:

> RFC 4690

> Title:      Review and Recommendations for Internationalized
>             Domain Names (IDNs)
> Author:     J. Klensin, P. Faltstrom,
>             C. Karp, IAB
> Status:     Informational
> Date:       September 2006
[...]
> I-D Tag:    draft-iab-idn-nextsteps-06.txt
> URL:        http://www.rfc-editor.org/rfc/rfc4690.txt

Pointer for info, it has references to RFCs 4645 and 4646.

Frank

Doug Ewell | 1 Oct 03:33

Re: Re: Suppress-Script batch 1

Mark Davis wrote:

> We are arguing about angels on the head of a pin. I'm saying, it is 
> MORE informative to say
>
> Don't use script subtags when they are not necessary or inappropriate, 
> such as for non-written content. Such content may include spoken or 
> sung language.
>
> than it is to say.
>
> Don't use script subtags when they are not necessary.

I'm fine with this, and in particular with the de-parenthesizing of the 
"spoken and sung" part.

--
Doug Ewell  *  Fullerton, California, USA  *  RFC 4645  *  UTN #14
http://users.adelphia.net/~dewell/
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages 

Doug Ewell | 1 Oct 03:58

Re: Ltru Digest, Vol 19, Issue 88

Mark Davis <mark dot davis at icu dash project dot org> wrote:

> Let's not make up problems where they don't exist. I see no 
> groundswell, nor any realistic prospect of a groundswell, of people 
> tagging languages with unnecessary scripts.

+1 to that.

> Now, this is all getting away from Karen's quite reasonable desire to 
> provide people more clarity on the fact that for material that is only 
> non-written it isn't a good idea to include a script subtag. I could 
> even see, in some applications, tagging records with, for example, 
> zh-Zxxx, meaning unwritten Chinese. Then I could search records for zh 
> (meaning any zh, no matter whether spoken, or written in traditional, 
> or written in simplified, or written in pinyin), or filter that to 
> only non-written, or only simplified, or only romanized.

Hold on, Zxxx is defined as "Code for unwritten languages."  I don't see 
that as meaning the same as "linguistic content that is not written." 
Then again, I'm the literalist who argued for years that "sign language 
as used in the United States" wasn't the same as "American Sign 
Language."
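
A minimal sketch of the record filtering described in the quoted text, assuming records carry tags of the simple shape language[-script][-region]. The record layout, the sample data, and the helper names parse_tag and filter_records are hypothetical, and this is not a full RFC 4646 parser. It takes the "instance is unwritten" reading of Zxxx, which is the reading under dispute here.

```python
def parse_tag(tag):
    """Split a simple tag into (language, script, region).

    Assumption: only language[-script][-region] is handled; variants,
    extensions, and private-use subtags are ignored.
    """
    parts = tag.split("-")
    language = parts[0].lower()
    script = None
    region = None
    for part in parts[1:]:
        # A script subtag is four letters and comes before any region.
        if len(part) == 4 and part.isalpha() and script is None and region is None:
            script = part.title()
        # A region subtag is two letters or three digits.
        elif region is None and (
            (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit())
        ):
            region = part.upper()
    return language, script, region


def filter_records(records, language, script=None):
    """Return records in the given language, optionally limited to one script."""
    matches = []
    for record in records:
        rec_language, rec_script, _ = parse_tag(record["tag"])
        if rec_language != language.lower():
            continue
        if script is not None and rec_script != script.title():
            continue
        matches.append(record)
    return matches


# Hypothetical sample records, for illustration only.
RECORDS = [
    {"id": 1, "tag": "zh"},
    {"id": 2, "tag": "zh-Hant"},
    {"id": 3, "tag": "zh-Hans"},
    {"id": 4, "tag": "zh-Latn"},
    {"id": 5, "tag": "zh-Zxxx"},
    {"id": 6, "tag": "ja"},
]

all_chinese = filter_records(RECORDS, "zh")             # any zh, written or not
unwritten_only = filter_records(RECORDS, "zh", "Zxxx")  # only non-written
simplified_only = filter_records(RECORDS, "zh", "Hans") # only simplified
```

Searching on the bare language subtag keeps the "all content in Chinese" query simple, while the script subtag narrows it only when the caller asks.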

--
Doug Ewell  *  Fullerton, California, USA  *  RFC 4645  *  UTN #14
http://users.adelphia.net/~dewell/
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages


David Conrad | 1 Oct 04:39

Re: Re: Submission: draft-ietf-ltru-4645bis-00

>> http://www.iana.org/numbers.html, which is no better.
> I hope that IANA won't screw-up this link as long as
> it exists.

Me too.  (:-))

Rgds,
-drc

Karen_Broome | 1 Oct 05:07

Re: Re: Suppress-Script batch 1


"Mark Davis" <mark.davis <at> icu-project.org> wrote on 09/30/2006 12:49:55 PM:

>
> Let's not make up problems where they don't exist. I see no
> groundswell, nor any realistic prospect of a groundswell, of people
> tagging languages with unnecessary scripts.

However, when it comes to tagging audiovisual content, I do see this quite frequently -- at least with Chinese language classification. (And I've analyzed audiovisual language data from many sources and studios.) The fault is not always an ill-informed classifier. Developers often advise against having separate or filtered language lists for spoken and written languages as they want to be able to do a basic query for all content in Mandarin or "Chinese" and don't quite understand the issue. "ISO doesn't make this distinction, why should we?"

Or classifiers are stuck with a shrink-wrapped system architecture that assumes written content and a single language list. So when a classifier is faced with inappropriate choices for a dubbed or spoken work, they often choose a "scripted" form of Mandarin over suggesting a system redesign. It's not the role of the working group to design systems, but maybe giving this issue some attention might help developers who implement RFC 4646bis in systems that deal with both spoken and written language.

Regards,

Karen Broome
Sony Pictures Entertainment

Martin Duerst | 1 Oct 03:45

Suppress-Script: fix registration description

When preparing the proposal to add a Suppress-Script to Japanese,
I discovered the following contradiction in RFC 4646:

In 3.4., Stability of IANA Registry Entries:

   8.   The field 'Suppress-Script' MAY be added or removed via the
        registration process.

In 3.5., Registration Procedure for Subtags:

   Only subtags of type 'language' and 'variant' will be considered for
   independent registration of new subtags.  Handling of subtags needed
   for stability and subtags necessary to keep the registry synchronized
   with ISO 639, ISO 15924, ISO 3166, and UN M.49 within the limits
   defined by this document are described in Section 3.3.  Stability
   provisions are described in Section 3.4.

   This procedure MAY also be used to register or alter the information
   for the 'Description', 'Comments', 'Deprecated', or 'Prefix' fields
   in a subtag's record as described in Section 3.4.  Changes to all
   other fields in the IANA registry are NOT permitted.

This means that 3.4 says that Suppress-Script can be added or removed,
but 3.5 doesn't allow any changes to Suppress-Script.
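
The mismatch can be made concrete with a small sketch that encodes only the two passages quoted above as sets of field names. The constant and function names are hypothetical, and the rest of Section 3.4 is not modeled.

```python
# Field that Section 3.4, item 8 (as quoted above) says MAY be added
# or removed via the registration process.
SECTION_3_4_ITEM_8 = {"Suppress-Script"}

# Fields that Section 3.5 (as quoted above) allows the registration
# procedure to register or alter; all other fields are NOT permitted.
SECTION_3_5_ALTERABLE = {"Description", "Comments", "Deprecated", "Prefix"}


def contradictory_fields(permitted_by_3_4, alterable_by_3_5):
    """Return fields that 3.4 lets the process change but 3.5 forbids."""
    return sorted(permitted_by_3_4 - alterable_by_3_5)


conflicts = contradictory_fields(SECTION_3_4_ITEM_8, SECTION_3_5_ALTERABLE)
```

Adding 'Suppress-Script' to the list in 3.5 would empty this difference.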

My assumption as a technical contributor is that 3.4 is what we wanted,
and the fact that Suppress-Script is missing in 3.5 is an oversight.

As a co-chair, I think that we have to eliminate this contradiction,
both in RFC 4646bis, and preferably also via an erratum for RFC 4646.

Regards,     Martin.

#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst <at> it.aoyama.ac.jp     

Frank Ellermann | 1 Oct 12:59

Re: Suppress-Script: fix registration description

Martin Duerst wrote:

> 3.4 is what we wanted, and the fact that Suppress-Script is
> missing in 3.5 is an oversight.
[...]
> I think that we have to eliminate this contradiction, both in
> RFC 4646bis, and preferably also via an erratum for RFC 4646.

Yes.  I really don't understand why some prefer a one-step or
two-step standards process.  Three-step is a sound minimum.

Frank

Doug Ewell | 1 Oct 18:32

Re: RFC 4690 on Review and Recommendations for Internationalized Domain Names (IDNs)

Frank Ellermann <nobody at xyzzy dot claranet dot de> wrote:

>> RFC 4690
>
>> URL:        http://www.rfc-editor.org/rfc/rfc4690.txt
>
> Pointer for info, it has references to RFCs 4645 and 4646.

I wonder why they chose RFC 4645 as an example of "the most recent IETF 
work in this area" instead of 4647.  We should be relieved that the 
initial LSR contents were removed from 4645.

--
Doug Ewell  *  Fullerton, California, USA  *  RFC 4645  *  UTN #14
http://users.adelphia.net/~dewell/
http://www1.ietf.org/html.charters/ltru-charter.html
http://www.alvestrand.no/mailman/listinfo/ietf-languages

Mark Davis | 1 Oct 20:25

Re: Re: Suppress-Script batch 1

I'm all for expanding the guidance to avoid using inappropriate script (or region, for that matter) subtags. I was never against adding more information about the issue; just against making it a MUST (for the reasons I outlined earlier).

Mark


Mark Davis | 1 Oct 20:30

Re: Re: Ltru Digest, Vol 19, Issue 88

Script tags are applicable to instances of languages. A particular instance (document, record, span) may be Latn or Latf, for example. Our interpretation of Zxxx is the same: marking an instance as being unwritten. The other interpretation wouldn't be particularly useful.

Mark


_______________________________________________
Ltru mailing list
Ltru <at> ietf.org
https://www1.ietf.org/mailman/listinfo/ltru

