John C Klensin | 1 Sep 2002 06:28

Re: Document Status?

The rarity of Dave's and my agreeing is again noted for the
record.  Lest we get too carried away by it, one observation
below.

--On Saturday, August 31, 2002 10:13 AM -0700 Dave Crocker
<dhc <at> dcrocker.net> wrote:

>...
> Personally, I believe that publishing as Experimental is
> unnecessary and that it would be disastrous.
> 
> Experimental makes sense when a technology is not well
> understood.  That is not the problem, here.  The problem,
> here, is about making difficult decisions, not about
> understanding them.
>...

Well, I think there is a problem that may border on
understanding, and it is tied up with my major personal
objection to the general style of the work that has come out of
the WG.  I believe that IDNA (and the supporting documents) are,
with the exceptions and qualifications Dave and I are pushing
on, a reasonably well-understood solution to _some_ problem.
I'm not sure I know what that problem is, who cares about it,
and whether it is important enough to justify changes to the way
the DNS works and is interpreted.  Re those changes, we can
debate how significant they will be, and there are differences
of opinion about who will be impacted and how much pain they
will feel, but I think it is relatively certain that the pain
level will be non-zero.

Adam M. Costello | 1 Sep 2002 07:24

Re: Document Status?

"JFC (Jefsey) Morfin" <jefsey <at> jefsey.com> wrote:

> 1. would it not be a good occasion of getting rid of the odd phrase
> about domain/host names and to introduce a stable wording such as
> "internet name" and "international internet names" or "multilingual
> internet names" which corresponds to the compromise we actually use?

Various people have had various discussions trying to come to a common
understanding of the precise meanings of "domain name" versus "host
name", with very little success.  Just about the only thing everyone
agrees on is that every host name is a domain name, but some domain
names might not be host names.  Settling this type-of-names issue is
beyond the scope and ability of this working group.  IDNA is a mechanism
for allowing non-ASCII characters to be used in domain names, whatever
those might be.  Internationalized domain names can be used wherever
domain names can be used, wherever that might be, except in non-IN
class DNS resource records (this exclusion is stated in the forthcoming
idna-11 draft).
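
A minimal sketch of the mechanism described above, under stated
assumptions: a non-ASCII name is mapped to an ASCII-compatible encoding
(ACE) for use in the DNS and mapped back for display.  It uses Python's
built-in "idna" codec, which follows the IDNA specification as later
published in RFC 3490.  The "xn--" prefix it emits was assigned after this
thread, and the name used here is a made-up example.

```python
# Illustration only: Python's built-in "idna" codec (RFC 3490 ToASCII/ToUnicode).
# The domain name below is hypothetical; "\u00fc" is the letter u-umlaut.
name = "b\u00fccher.example"

# ToASCII: each non-ASCII label becomes an ASCII-compatible (ACE) label.
ace = name.encode("idna")
print(ace)                  # b'xn--bcher-kva.example'

# ToUnicode: an IDNA-aware application can map the ACE form back for display.
restored = ace.decode("idna")
print(restored == name)     # True
```

Labels that are already plain ASCII (here "example") pass through
unchanged, which is why the mechanism does not need to know what kind of
name it is handling.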

> - I am concerned about using a concept (international) for another
> (multilingual) when the international concept may become another issue
> with national DNS views.

I don't know exactly what the difference is between internationalization
and multilingualization.  I think one reason the latter term was
not used is that domain names have no language tag.  Maybe there
were other reasons, or maybe it was arbitrary.

> 2.  I am confused about the implications of the proposed change of
> part 7.

JFC (Jefsey) Morfin | 1 Sep 2002 14:40

(unknown)

Thank you for your responses.

 > On 07:24 01/09/02, Adam M. Costello said:
 > Various people have had various discussions trying to come to a common
 > understanding of the precise meanings of "domain name" versus "host
 > name", with very little success.  Just about the only thing everyone
 > agrees on is that every host name is a domain name, but some domain
 > names might not be host names.  Settling this type-of-names issue is
 > beyond the scope and ability of this working group.

This is my point. As you say IDNA is only a mechanism. It may apply to any 
semantic made of labels linked by dots. I also feel there are several 
issues related to the specific use of the names. In removing references to 
the usage of the names, we would have a stable universal system. Special 
adaptations for particular uses, if there are some, would be understood as 
particular cases. Would this not be more logical and easier to maintain?

 > > - I am concerned about using a concept (international) for another
 > > (multilingual) when the international concept may become another issue
 > > with national DNS views.

 > I don't know exactly what the difference is between internationalization
 > and multilingualization.  I think one reason the latter term was
 > not used is that domain names have no language tag.  Maybe there
 > were other reasons, or maybe it was arbitrary.

From my understanding, one of the issues is about real-life typographies
(language oriented).

I am in the Eurolinc bootstrap (European languages as Minc and Ainc). 

vinton g. cerf | 1 Sep 2002 14:53

Re: Document Status?

One working definition of internationalization is that the encoding/expression is "understood" by
speakers of all languages. There is global agreement, I believe, that block Latin characters can be used
by anyone in any country to express the name of a destination country in a postal address. So for example
"UNITED STATES" or "FRANCE" or "AUSTRALIA", "JAPAN", "VIETNAM" are all considered acceptable in every
country. This agreement allows, for example, that the destination address, except for the name of the
country, can be rendered in a language local to the target country and does not have to be understood by the
postal service in the originating country. Consequently, someone sending a letter from the US to a
recipient in Vietnam can write the destination address in Vietnamese and the US postal service need only
understand the characters "VIETNAM" at the bottom of the destination address.

Multilingualization is more focused on what is sometimes called "localization" - that is, the characters
used in rendering a local language can be used (e.g. for domain names or for filling out forms etc) and these
renderings need not be universally understood.

This definitional distinction helps (me anyway) to appreciate that the creation of multilingual domain
names may not necessarily contribute to universal ability to use the resulting strings because it may be
difficult to impossible to render or enter arbitrary character sets at the user interface to a local
service. We have collectively probably created some confusion for ourselves by using the term
"internationalized domain names" to cover both concepts. It strikes me that the IDNA documents are more
aimed at localization/multilingualization than internationalization, using the "definition" in the
first paragraph above. 

Concerns about how cut/paste will work are germane to the discussion about the utility of IDNs because such
actions may be the ONLY way in which someone may be able to enter special character strings into text
intended to represent an IDN. Something like this happens to me regularly as I compose email to friends
whose names involve the use of characters with various accent markings. Since I don't know how to enter
these from my simple ASCII keyboard, I usually end up cutting and pasting the characters. This works
because the text of email is permitted to be pretty general in its encoding. I don't know how that would work
out if I were dealing with non-Latin character sets. I know I would need special software to render Hangul
or Kanji, for instance, but I assume that the rendering packages also serve to make highlighting and

Patrik Fältström | 1 Sep 2002 18:03

Re: Document Status?

--On 2002-09-01 08.53 -0400 "vinton g. cerf" <vinton.g.cerf <at> wcom.com> wrote:

> I know I would need special software to render Hangul or Kanji, for
> instance, but I assume that the rendering packages also serve to make
> highlighting and cut/paste work.

The copy and paste problem is difficult, but not as hard as people believe
(I think).

I know how copy and paste works on the Apple Macintosh platform, and as that
has been around and worked that way for decades(!) I take for granted it
works the same way in, for example, Windows.

When doing "copy", the software "sending" the copied information identifies
the selection and calls a routine which notifies the operating system that
data exists in the paste buffer. The information passed includes things
like what type(s) the data can be fetched as, the size(s), etc. Note that
several alternatives can be stored there.

It looks like the content-type mechanism in email. Very precise tagging of
the data.

Now, some other application has a menu which is to be drawn. The menu
includes an item called "paste". Before doing the actual drawing, it calls
a routine to check (a) if there is something in the paste buffer, and (b)
if the data is of a type which it can interpret. If both are true, the menu
item "paste" is _not_ shadowed.
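
The copy and paste steps described above can be modelled roughly as
follows.  This is a toy sketch, not any platform's real clipboard API: the
class name PasteBuffer, its methods, and the type tags "text" and
"styled-text" are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PasteBuffer:
    """Toy model of a paste buffer holding several typed alternatives."""
    flavors: dict = field(default_factory=dict)  # type tag -> bytes

    def copy(self, alternatives):
        # A new copy replaces whatever was in the buffer before.
        self.flavors = dict(alternatives)

    def can_paste(self, accepted_types):
        # The check a receiving application makes before enabling "paste":
        # (a) is there anything, and (b) is any of it a type it understands?
        return any(t in self.flavors for t in accepted_types)

    def paste(self, accepted_types):
        # Return the first alternative the receiver understands,
        # in the receiver's own order of preference.
        for t in accepted_types:
            if t in self.flavors:
                return t, self.flavors[t]
        return None


buf = PasteBuffer()
# The sender offers the same selection under two (hypothetical) type tags.
buf.copy({"text": b"hello", "styled-text": b"<b>hello</b>"})

print(buf.can_paste(["picture"]))        # False: "paste" stays shadowed
print(buf.paste(["picture", "text"]))    # ('text', b'hello')
```

The point of the model is that the receiver, not the sender, decides which
alternative to take, which is what makes the comparison with content-type
tagging in email apt.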

The paste operation happens, and it can either grab data which is already
generated by the sender application, or the sender application is called

Stephen Dyer | 1 Sep 2002 19:34

Re: Document Status?

At 00:28 01/09/2002 -0400, John C Klensin wrote:

>Well, I think there is a problem that may border on
>understanding, and it is tied up with my major personal
>objection to the general style of the work that has come out of
>the WG.

Dear John & All,

I have followed this group with interest and often mystification. It 
appears that members of the group have widely different understandings of 
the scope and objective. The multi-lingual and cross-cultural aspects have 
exacerbated this situation.

(I do not blame or criticise the IDN group - it's not their fault if the 
initial question is wrong, and they were asked to look at a narrow 
technical aspect of a much wider problem.)

My "view from the hill" is that overall we want to get to a position where 
the Internet can be comfortably and reliably used by the users of different 
character sets and languages.

That simply expressed goal is a long way off, but we have at our disposal a 
set of possible technologies and protocols. It seems to me that IDN has 
been a process of trying to rationalise these with the character sets and 
produce a single, fully inter-operable answer.

I think we need an additional process - a plan for evolving 
internationalization over an extended period. There is however a need for 
some visible progress here because the current hegemony of the English 

Simon Josefsson | 1 Sep 2002 19:42

Re: Document Status?

Patrik Fältström <paf <at> cisco.com> writes:

> (a1) The email program understands IDNA, but the address book program does
> not. As it understands IDNA, it will display (if the script and font exist)
> the correct Unicode characters, and not the ACE encoded string. Now, the copy
> operation happens, and if I were the email programmer I would put two (2)
> things in the paste buffer: One "email address" which is the ACE encoded
> string. Same thing as what is passed in SMTP or POP. One which is the
> address in Unicode (or local script, which will be named as part of the
> tag). The address book which fetches data from the paste buffer gets the
> string, notices it is ACE encoded, and can choose to decode that if it
> can/knows how, etc.

At least in X11, cut'n'paste works by transferring charset-tagged but
otherwise opaque character arrays.  What you are proposing seems to
require a cut'n'paste protocol to be implemented in both the MUA and
the address book application.  The protocol must specify how the
structure containing the raw string and the ACE encoded string is
encoded and identified by both applications.  Will IDNA define this
protocol for X11, MacOS, Windows etc?

Assuming IDNA will limit itself to not requiring modifications to
cut'n'paste operations in various operating systems, you will only be
able to cut'n'paste charset-tagged but opaque text strings.  Whether the
strings are to be ACE encoded or raw encoded is not specified anywhere
as far as I can tell, and different implementations will choose
different strategies.  If the application is running in a Unicode
environment, it might (only might!) make sense to transfer the raw
Unicode encoding, but if it is running in a non-Unicode environment
the IDNA specification leaves you in the cold as to how to implement

JFC (Jefsey) Morfin | 1 Sep 2002 18:07

Re: Document Status?

On 14:53 01/09/02, vinton g. cerf said:
>One working definition of internationalization is that the 
>encoding/expression is "understood" by speakers of all languages. There is 
>global agreement, I believe, that block Latin characters can be used by 
>anyone in any country to express the name of a destination country in a 
>postal address. So for example "UNITED STATES" or "FRANCE" or "AUSTRALIA", 
>"JAPAN", "VIETNAM" are all considered acceptable in every country. This 
>agreement allows, for example, that the destination address, except for 
>the name of the country, can be rendered in a language local to the target 
>country and does not have to be understood by the postal service in the 
>originating country. Consequently, someone sending a letter from the US to 
>a recipient in Vietnam can write the destination address in Vietnamese and 
>the US postal service need only understand the characters "VIETNAM" at the 
>bottom of the destination address.

Absolutely correct. This is what is used as a default international set by 
common sense,  postal agreements and EDI. You may note that this is also 
the way international mnemonics work (JFK, CDG, LAX, ... and ISO 3166 2/3 
letters we use in ccTLDs, or as X.121 DNICs or telephone numbers, etc.).

They are usually organized in such a way that O, I, 0 and 1 cannot be
confused. As you note, they are often printed in uppercase.

This means that we are using a 28 character set (0-9, A-Z, dot and dash).
In adding colon, slash, comma/crosshatch and star we may have a 32-key touch
pad for future telephone sets?

That reasoning in line with EDI, common language, etc. makes the current 
domain names the international default.


Patrik Fältström | 1 Sep 2002 20:26

Re: Document Status?

--On 2002-09-01 19.42 +0200 Simon Josefsson <jas <at> extundo.com> wrote:
> At least in X11, cut'n'paste works by transferring charset-tagged but
> otherwise opaque character arrays.

Ok. Good.

> What you are proposing seems to
> require a cut'n'paste protocol to be implemented in both the MUA and
> the address book application.

Not at all.

What I say is that one should send the ACE encoded string in the paste
buffer. Further, that is what will happen when an application doesn't know
anything about IDNA at all. In cases like MacOS where one can have
alternative forms of the data, it is possible to define a new type for the
Unicode version of the domain name.

> Whether the
> strings are to be ACE encoded or raw encoded is not specified anywhere
> as far as I can tell, and different implementations will choose
> different strategies.

IDNA says that if no negotiation exists between two entities which exchange
domain names between them, ACE encoding should be used. There is no
difference between a protocol which uses IP or the paste buffer. It is the
same thing.
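
The rule stated above (ACE unless something else has been negotiated) can
be sketched as follows.  The function names choose_form and display_form
and the peer_understands_unicode flag are hypothetical, and the sketch
handles bare domain names only, not full email addresses.  It relies on
Python's built-in "idna" codec, whose "xn--" prefix was assigned after this
thread.

```python
def choose_form(unicode_name, peer_understands_unicode=False):
    """Pick the form to hand to a peer (a protocol or the paste buffer).

    Without any negotiation, fall back to the ACE form, which an
    application that knows nothing about IDNA can still carry around.
    """
    if peer_understands_unicode:
        return unicode_name
    return unicode_name.encode("idna").decode("ascii")


def display_form(text):
    """What an IDNA-aware receiver might do with whatever it was handed."""
    try:
        # Plain ASCII labels come back unchanged; ACE labels are decoded.
        return text.encode("ascii").decode("idna")
    except UnicodeError:
        # Not ASCII, or not valid ACE: leave the string as it is.
        return text


name = "b\u00fccher.example"            # made-up name; \u00fc is u-umlaut
print(choose_form(name))                 # xn--bcher-kva.example
print(display_form(choose_form(name)) == name)   # True
```

A receiver that is not IDNA-aware simply skips display_form and stores the
ACE string as opaque text, which is the case Patrik describes for an
application that knows nothing about IDNA.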

> In general, cut'n'paste of IDNA in the real world is not well defined,
> since IDNA only solves the IDNA problem for Unicode, and the real

Soobok Lee | 1 Sep 2002 20:59

Re: Document Status?

On Sun, Sep 01, 2002 at 06:03:00PM +0200, Patrik Fältström wrote:
> 
> (a) I get an email with IDNA encoded sender address. I want to add that to
> some address book software. That implies copy and paste from email program to
> address book program. The email address has ACE encoded labels in it.
>
> (a1) The email program understands IDNA, but the address book program does
> not. As it understands IDNA, it will display (if the script and font exist)
> the correct Unicode characters, and not the ACE encoded string. Now, the copy
> operation happens, and if I were the email programmer I would put two (2)
> things in the paste buffer: One "email address" which is the ACE encoded
> string. Same thing as what is passed in SMTP or POP. One which is the
> address in Unicode (or local script, which will be named as part of the
> tag). The address book which fetches data from the paste buffer gets the
> string, notices it is ACE encoded, and can choose to decode that if it
> can/knows how, etc.
>

I often run xterm and then launch MUTT (or PINE).
Even if MUTT became IDNA-aware in the future, copy & paste operations
grab the IDN-like strings directly from the xterm, not from MUTT.
So MUTT has no opportunity to hand the ACE-encoded IDN to the
receiving application or the clipboard. A text-based MUA does not have
any copy & paste support of its own; xterm does the whole job.

Consistent IDNA-specific and IDNA-aware copy&paste operations, if we make any,
should be implementable and meaningful also in xterm which has been regarded
as a purely textual application.

Soobok Lee