1 Jul 02:56
Re: Problems in normalisation and matching
James Seng <jseng <at> pobox.org.sg>
2002-07-01 00:56:34 GMT
Hi Dan,

I remember the "dot issues" were extensively discussed by the Nameprep Design Team. It was decided that dots other than U+002E should be included, because there are IMEs which generate these dots in place of the normal dot (it becomes a hassle to switch in and out of the IME just for the dot).

Now, some may say the IME is out of scope, but on the other hand we really don't need to rehash a topic which has been concluded. Let's move forward.

If we can agree on the above, then the "many problems" you point out are really just misunderstandings of the Nameprep/IDNA relationship.

First, Nameprep/Stringprep is designed to handle domain names on a _per label_ basis. Before an IDN goes through Nameprep, it has already been broken up into its individual labels, so Nameprep isn't the place to fix this.

The place where IDNs get broken down into labels is IDNA. What IDNA now specifies is that, to break an IDN down into its labels, you look for this set of separators (U+002E, U+3002, U+FF0E, U+FF61). (See IDNA Requirement 1)

Comparison is also done on a per label basis. Two IDNs are considered equivalent if and only if all their corresponding labels are equivalent. The separators are also irrelevant during comparison. (See IDNA Requirement 4)

If the individual labels need to be pieced back together into an FQDN, then IDNA has already clearly specified that U+002E should be used. (See IDNA Requirement 2)

-James Seng
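
The split, compare and rejoin steps described above can be sketched roughly as follows. This is only an illustration under assumptions: the function names are hypothetical, and `prepare_label` is a stand-in that merely lowercases a label, not an implementation of Nameprep.

```python
import re

# The four label separators listed in the message:
# U+002E, U+3002, U+FF0E, U+FF61.
SEPARATORS = re.compile("[\u002e\u3002\uff0e\uff61]")


def split_labels(name):
    """Break a domain name into labels on any of the four separators."""
    return SEPARATORS.split(name)


def prepare_label(label):
    """Stand-in for Nameprep (assumption): only lowercases the label.

    Real Nameprep also maps, normalises and checks for prohibited
    characters; none of that is attempted here.
    """
    return label.lower()


def names_equivalent(a, b):
    """Compare two names label by label; the separators used do not matter."""
    labels_a = [prepare_label(x) for x in split_labels(a)]
    labels_b = [prepare_label(x) for x in split_labels(b)]
    return labels_a == labels_b


def join_labels(labels):
    """Piece labels back together using U+002E only."""
    return "\u002e".join(labels)
```

Because splitting happens before any per-label processing, the stand-in for Nameprep never sees a separator, which is the point made above about where the dots are handled.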