Paul Hoffman | 28 Nov 2012 04:28

An implementer's lament

http://annevankesteren.nl/2012/11/idna-hell
JFC Morfin | 18 Nov 2012 02:28

Re: Updating RFC 5890-5893 (IDNA 2008) to Full Standard

At 00:15 17/11/2012, Vint Cerf wrote:
I am loath to return to the debates of the 2008-2010 period but the strong utility of canonical forms that are unambiguous as to identity (i.e., between A-Label and U-Label) should not be underestimated. Mapping has the unfortunate side-effect of making things "equivalent" when they are not in fact identical. I think many who were in favor of the IDNA2008 formulation were persuaded that this powerful feature was worth some breakage with regard to backward compatibility. It is obvious that there is a value judgment here and people's opinions varied.
vint


Anne,

As you have noticed, there are still some misunderstandings between two points of view (the WG's and Mark's). Still more confusing is the misunderstanding of our third-party IUse point of view, even though our support unlocked the situation and permitted the rough consensus.


1. To clarify the "IUse" term

We call "IUsers" those who wish to intelligently "interuse" the Internet. Their interest is in relating better with other parties in the digisphere in every possible way, including through the Internet. They do not want to be particularly bound to Internet protocols and practices; however, they certainly want Internet best practice and clear logic to remain in use throughout the whole digital ecosystem (WDE). Intelligent use consists of obtaining more from any technology being used, in a smart, consistent, cross-technology manner, through an IUI (Intelligent Use Interface) shaped to better address the specific and generic needs of categories of users.

- Until August 29, this was a concept that was supported but not clarified by the IETF: RFC 3935 assigns the IETF the goal to make the Internet work better (but does not explain what working better means).

- Since August 29, the IETF strives to influence an Internet that is to be acceptable for the global market. This means that IUse (Intelligent Use) is now a market of people who wish to better use the digisphere resources, including the Internet, with the assistance of an IUI, at two sets of layers above the OSI model, providing:

   (1) extended value services of intellition (filtering the logical noise and assisting in the enunciation area) and
   (2) facilitation semiotic services (filtering the semantic noise and assisting in the comprehension area).

As such we are interested in multilinguistics (the cybernetics of the linguistic diversity and mecalanguages - the way computers speak and think, including natural languages - as being the protocols of exchange within our anthroporobotic world).

- IUsers may liaise with the IETF through the iucg <at> ietf.org mailing list, and the IUCG (Internet Users Contributing Group - http://iucg.org/wiki) the charter of which is to contribute in reminding IETF participants as to what their community/market is looking for, trying to help with contributions, objecting to positions they consider as detrimental for their plans or reducing the capacity of the core Internet technology to interact efficiently with its periphery (IUI) to better support us, the users, and in informing on their own lead users' projects and intents.
 
Being a FLOSS initiative, IUsers lack research and notoriety funding. It happens that some concepts, work and people have gathered around my positions and initiatives (cf. http://iutf.org/wiki/JEDI). The present handling of multilinguistic issues by the IETF is more or less acceptable to me. This was not obtained easily, as the consensus I wished for could only be sometimes obtained
_______________________________________________
Idna-update mailing list
Idna-update <at> alvestrand.no
http://www.alvestrand.no/mailman/listinfo/idna-update
Paul Hoffman | 16 Nov 2012 20:18

Re: Updating RFC 5890-5893 (IDNA 2008) to Full Standard

On Nov 16, 2012, at 6:09 AM, John C Klensin <klensin <at> jck.com> wrote:

> I don't want to drag this out, but even that change implies that
> we dismissed the "backward compatibility" issues as unimportant.
> That wasn't the case.  

I am someone who, often vocally, disagreed with the way IDNA2008 went with respect to backward
compatibility. Having said that, I think Mark's characterization of the people who were promoting
IDNA2008 as "people who did not feel that it was an important concern" is simply wrong. The long
discussions about backward compatibility on the mailing list very much showed that the authors were
concerned about it and were willing to incorporate changes for backward compatibility that had WG
consensus (of which I was often on the wrong side).

We have IDNA2003 and IDNA2008 in deployment, both partially. We knew that this would happen, we talked
about it, and we did IDNA2008 anyway. Name-calling at this point is not helpful to developers and end users
of the two protocols.

--Paul Hoffman
Anne van Kesteren | 15 Nov 2012 17:28

Re: Updating RFC 5890-5893 (IDNA 2008) to Full Standard

On Wed, Nov 14, 2012 at 3:23 PM, JFC Morfin <jefsey <at> jefsey.com> wrote:
> May I remind that all what IUsers need is the ability to "filter out the
> filters" as an option, i.e. that the browser transparently transmit the user
> entry. The reason why is that I should get to the same place whatever the
> application or the browser on my machine, and for that I do prefer to use
> the same IDNEngine for my browsers and applications. IDNEngines are partly
> documented in RFC 5895 (partly) because IDNA2008 does not support some key
> features (like majuscules).

What is an "IUser"? Also, what other than "a" (U+0061) would "A"
(U+FF21) map to? Host names have been case-insensitive from the start,
the Turkish I is not going to change that.
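
The mapping in question can be sketched in a few lines of Python. This is only an illustration: `map_for_lookup` is a hypothetical helper, and NFKC normalization followed by lowercasing is an assumed stand-in for a local mapping step, not the full rule set of RFC 5895 or any browser.

```python
import unicodedata

def map_for_lookup(label: str) -> str:
    """Hypothetical pre-lookup mapping: compatibility-normalize, then lowercase.

    Assumption: NFKC + str.lower() approximates a local mapping step.
    Real mapping specifications contain further rules.
    """
    return unicodedata.normalize("NFKC", label).lower()

# U+FF21 is FULLWIDTH LATIN CAPITAL LETTER A; NFKC folds it to U+0041,
# and lowercasing then gives U+0061.
print(repr(map_for_lookup("\uff21")))

# Python's str.lower() is locale-independent, so ASCII "I" always maps
# to "i", whatever the user's language settings are.
print(repr(map_for_lookup("I")))
```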

Also, the focus on end users over stability of URLs found in markup and
elsewhere feels like a distraction. Most users, for better or worse,
use a search engine these days to get to a particular domain. They no
longer enter addresses in the address bar.

-- 
http://annevankesteren.nl/
Marcos Sanz | 6 Nov 2012 16:43

Lookup for reserved LDH labels

Dear all,

I had this small discussion with Mark and Markus and, despite our threefold 
homonymy, we couldn't get to common ground, so I've decided to get a 
reading of the standard directly from the IDNA2008 editors.

According to my interpretation (cf. RFC 5891, Section 5) the lookup 
protocol relies on the assumption that names that are already present in 
the DNS are valid. And, in fact, I have a bunch of domains in my database 
with hyphens in the third and fourth position, so-called reserved LDH 
labels that are not XN-labels (see nomenclature in RFC 5890, Section 
2.3.2.1). Take for instance "ad--acta.de". My expectation would be that 
the protocol doesn't fail on those*. Mark however reminded me of the 
restrictions in RFC 5891, Section 4.2.3.1. But those are for the 
registration, which I am not interested in at the moment. If at all 
relevant, we'd have Section 5.4:

"Putative U-labels with any of the following characteristics MUST be 
rejected prior to DNS lookup:
[...]
 o  Labels containing '--' (two consecutive hyphens) in the third and 
fourth character positions."

On my side, I claim that that restriction simply does not apply because 
"ad--acta.de" is not a "putative U-label", in fact it is no U-label at all 
(cf. U-Label definition in RFC 5890, Section 2.3.2.1).

Thus, the protocol should never fail on lookup for "ad--acta.de". Is that 
correct?

Best regards,
Marcos

* FWIW idnkit-2.2 works according to my expectations, ICU does not.
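
For readers following the nomenclature, here is a minimal Python sketch of how the label classes discussed above can be told apart. The function name `classify_ldh_label` and the returned strings are hypothetical, and the sketch only classifies labels; it does not settle what a lookup implementation must do with each class.

```python
import re

# LDH: ASCII letters, digits and hyphens, no hyphen at either end.
_LDH = re.compile(r"^[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?$")

def classify_ldh_label(label: str) -> str:
    """Hypothetical classifier following the RFC 5890 label nomenclature."""
    if len(label) > 63 or not _LDH.match(label):
        return "not an LDH label"
    if label[2:4] != "--":
        return "NR-LDH label"            # no "--" in positions 3 and 4
    if label[:2].lower() == "xn":
        return "XN-label"                # reserved, with the IDNA prefix
    return "reserved LDH label (not an XN-label)"

for name in ("ad--acta", "xn--bcher-kva", "example"):
    print(name, "->", classify_ldh_label(name))
```

Under this classification "ad--acta" is reserved but carries no "xn--" prefix, which is the case the question is about.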
RFC Errata System | 9 Aug 2012 11:06

[Editorial Errata Reported] RFC5892 (3312)


The following errata report has been submitted for RFC5892,
"The Unicode Code Points and Internationalized Domain Names for Applications (IDNA)".

--------------------------------------
You may review the report below and at:
http://www.rfc-editor.org/errata_search.php?rfc=5892&eid=3312

--------------------------------------
Type: Editorial
Reported by: Patrik Fältström <paf <at> netnod.se>

Section: A and A.1

Original Text
-------------
In A:

Code point:

The code point, or code points, to which this rule is to be
applied.  Normally, this implies that if any of the code points in
a label is as defined, then the rules should be applied.  If
evaluated to True, the code point is OK as used; if evaluated to
False, it is not OK.

In A.1:

Rule Set:
  False;
  If Canonical_Combining_Class(Before(cp)) .eq.  Virama Then True;
  If RegExpMatch((Joining_Type:{L,D})(Joining_Type:T)*\u200C
    (Joining_Type:T)*(Joining_Type:{R,D})) Then True;

Corrected Text
--------------
In A:

Code point:

The code point, or code points, to which this rule is to be
applied.  Normally, this implies that if any of the code points in
a label is as defined, then the rules should be applied.  If
evaluated to True, the code point is OK as used; if evaluated to
False, it is not OK.

For the rule to be evaluated to True for the label, it MUST be
evaluated to True for every occurrence of Code point in the
label.

In A.1:

Rule Set:
  False;
  If Canonical_Combining_Class(Before(cp)) .eq.  Virama Then True;
  If cp .eq. \u200C And RegExpMatch((Joining_Type:{L,D})(Joining_Type:T)*cp
    (Joining_Type:T)*(Joining_Type:{R,D})) Then True;

Notes
-----
The original text did not make it clear whether the actual rule is to be applied once for every occurrence of
the code point in the label. This is a regular expression that can be interpreted in multiple ways, plus it
does not take into account the case where more than one U+200C exists in a label.
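
The "every occurrence" clause can be illustrated with a small Python sketch. The names `label_passes` and `virama_branch` are hypothetical, and only the Virama branch of the rule is modelled here (the regular-expression branch needs Joining_Type data that the Python standard library does not expose).

```python
import unicodedata

ZWNJ = "\u200c"
VIRAMA_CCC = 9  # Canonical_Combining_Class value named "Virama"

def virama_branch(label: str, i: int) -> bool:
    """True if the code point just before position i has combining class Virama."""
    return i > 0 and unicodedata.combining(label[i - 1]) == VIRAMA_CCC

def label_passes(label: str, code_point: str, rule) -> bool:
    """Evaluate `rule` at every occurrence of `code_point`; all must be True.

    If the code point does not occur at all, the rule does not apply and
    this returns True.
    """
    positions = [i for i, ch in enumerate(label) if ch == code_point]
    return all(rule(label, i) for i in positions)

ka, virama = "\u0915", "\u094d"  # DEVANAGARI LETTER KA, DEVANAGARI SIGN VIRAMA
print(label_passes(ka + virama + ZWNJ + ka, ZWNJ, virama_branch))
print(label_passes(ka + virama + ZWNJ + ka + ZWNJ, ZWNJ, virama_branch))
```

The second label has one ZWNJ that satisfies the rule and one that does not, so it is rejected as a whole.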

Instructions:
-------------
This errata is currently posted as "Reported". If necessary, please
use "Reply All" to discuss whether it should be verified or
rejected. When a decision is reached, the verifying party (IESG)
can log in to change the status and edit the report, if necessary. 

--------------------------------------
RFC5892 (draft-ietf-idnabis-tables-09)
--------------------------------------
Title               : The Unicode Code Points and Internationalized Domain Names for Applications (IDNA)
Publication Date    : August 2010
Author(s)           : P. Faltstrom, Ed.
Category            : PROPOSED STANDARD
Source              : Internationalized Domain Names in Applications (Revised)
Area                : Applications
Stream              : IETF
Verifying Party     : IESG
debug@test1.org | 6 Aug 2012 04:31

confusing notation in the ZERO WIDTH NON-JOINER contextual rule

Hi,

RFC5892 contains the following rule about the contextual validity of U+200C:

> If RegExpMatch((Joining_Type:{L,D})(Joining_Type:T)*\u200C
>         (Joining_Type:T)*(Joining_Type:{R,D})) Then True;

By intuition, I understand that "\u200C" within the regex means the code
point in question. So, a feasible interpretation would be:

(*) The code point MUST occur between Joining_Type:{L,D} and
Joining_Type:{R,D}, where arbitrary occurrences of Joining_Type:T MAY be
in between.

On the other hand, the statement literally defines just a regex that
should match the string somewhere (with no reference to "cp" as in other
rules), such that the rule would be satisfied already if any single U+200C
fulfills the requirement.

The literal interpretation sounds stupid, but I found both variants
within IDNA2008 implementations.

For instance, consider the Perl module Net::IDN::UTS46 on CPAN. Here,
it's taken literally and hence the sequence

  U+0628 U+200C U+0627 U+200C U+0627

is considered to be valid, although U+0627 is Joining_Type:R and thus
the second U+200C doesn't meet the requirement (*).

On the other hand, the (probably more reliable) implementation idnkit-2
from the Japan Registry reports a CONTEXTJ rule violation for the same
string. Now, who is right?
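
To make the difference concrete, here is a minimal Python sketch of the two readings. Everything in it is an assumption for illustration: the function names are hypothetical, `JOINING_TYPE` is a hand-written two-entry table (U+0628 is Joining_Type D, U+0627 is R) standing in for the full Unicode joining-type data, and neither function is taken from Net::IDN::UTS46 or idnkit.

```python
import re
import unicodedata

ZWNJ = "\u200c"
# Hypothetical mini-table; real code would load the full Joining_Type data.
JOINING_TYPE = {"\u0628": "D", "\u0627": "R"}

def _encode(label: str) -> str:
    """One letter per code point: joining type, or 'Z' for U+200C itself."""
    return "".join("Z" if ch == ZWNJ else JOINING_TYPE.get(ch, "U") for ch in label)

def valid_literal(label: str) -> bool:
    """Literal reading: the regex only has to match somewhere in the label."""
    return re.search(r"[LD]T*ZT*[RD]", _encode(label)) is not None

def valid_per_occurrence(label: str) -> bool:
    """Reading (*): every U+200C must sit in a valid context."""
    types = _encode(label)
    for i, ch in enumerate(label):
        if ch != ZWNJ:
            continue
        if i > 0 and unicodedata.combining(label[i - 1]) == 9:
            continue  # preceded by a Virama: allowed
        before = re.search(r"[LD]T*$", types[:i])
        after = re.match(r"T*[RD]", types[i + 1:])
        if not (before and after):
            return False
    return True

seq = "\u0628\u200c\u0627\u200c\u0627"
print(valid_literal(seq), valid_per_occurrence(seq))
```

For the sequence above the literal reading accepts the label, while the per-occurrence reading rejects it because the second U+200C follows a Joining_Type:R character.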

regards, Sebastian
Abdulrahman I. ALGhadir | 1 Jul 2012 07:43

how did the idna theory start?

Greetings all,

 

It might be a little odd to ask this question now; I know it is late, and I tried my best to search for the answer. What is the main reason for not supporting Unicode directly in the DNS protocol, instead of solving this issue in the current hack-and-slash way?

 

I tried to visualize the possible reasons, but I could not make any of them hold, because for each one I found a scenario that contradicts it:

 

1)      It is hard to update the Internet; the old legacy machines would be impossible to maintain:
Well, considering the current support for IPv6 RRs, DNSSEC RRs, … and other records within the protocol, I don’t think it would be hard to use Unicode as the base encoding in DNS servers.

2)      It is bad to increase the size of the zone, which would slow down caching and increase paging, making responses slower:
Again, with ICANN allowing the new gTLDs and supporting DNSSEC (big chunks of hashes), these things have already increased the size.

3)      From a technical standpoint, it is hard to support other languages within the zone; it is hard to work with them without understanding them:
Aren’t IDNs considered to be hashes?
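
On the last point, a short Python sketch may help. It uses the `punycode` codec from the Python standard library; the helper names `to_a_label` and `to_u_label` are hypothetical, and the sketch skips all IDNA validity checks. It shows that the ASCII form of a label is a reversible encoding and not a hash, since the original Unicode string can be recovered from it.

```python
ACE_PREFIX = "xn--"

def to_a_label(u_label: str) -> str:
    """Hypothetical helper: Punycode-encode a label and add the ACE prefix.

    Assumption: the input contains at least one non-ASCII character and is
    already valid and lowercased; no IDNA checks are performed here.
    """
    return ACE_PREFIX + u_label.encode("punycode").decode("ascii")

def to_u_label(a_label: str) -> str:
    """Hypothetical helper: strip the ACE prefix and Punycode-decode."""
    return a_label[len(ACE_PREFIX):].encode("ascii").decode("punycode")

encoded = to_a_label("bücher")
print(encoded)
print(to_u_label(encoded))
```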

 

 

I am looking forward to your answers!

 

AbdulRahman,

Paul Hoffman | 21 Mar 2012 02:09

WG Last Call for draft-ietf-websec-strict-transport-sec

Greetings again. The websec WG is in WG Last Call for draft-ietf-websec-strict-transport-sec, an
interesting document that is likely to be widely deployed in web browsers and servers. There are a few
places in the document that touch (slam?) into IDNA 2003 and IDNA 2008, so I thought this list should pay
attention now rather than later. Specifically, please see section 8 (just the beginning), section 9, and
section 13.

The WG LC ends April 9. Please send any comments to the websec WG mailing list, not here. Thanks!

--Paul Hoffman
JFC Morfin | 13 Mar 2012 03:07

RE: Draft on IDN Tables in XML

At 19:28 12/03/2012, Shawn Steele wrote:
>I don't see a relationship between the proposed XML behavior and 
>JFC's ideas.  Nor am I interested in JFC's ML stuff, sorry.

Noted.
jfc
J-F C. Morfin | 12 Mar 2012 04:07

RE: Draft on IDN Tables in XML

At 03:13 12/03/2012, Shawn Steele wrote:
>What kinds of applications are expected to consume this 
>data?  What's the target?

Shawn,
The browsers want to use them. To validate the IDNs. This is why we 
need to get them synced.
jfc  
