Marcos Sanz/Denic | 1 Dec 2006 21:35

RE: IDNAbis Goals

Vint,

> We probably should take into account as much as we are able the
> registered domain names as opposed to those that might possibly
> have been registered under the earlier rules. Also, one wonders to what
> extent those IDNs that have been registered have been part
> of the domain name parking business as opposed to domain names for what
> I will call functioning Internet destinations (not only
> web sites but other services also). It may be that not many
> registrations fall into the area of backward incompatibility.

That is a new, broken definition of "backward incompatibility".

Gentlemen, if the work of this group would render invalid some existing
IDN (never mind whether "parked" or "functioning", or at the second or
eighth level), I think it's in scope to determine a mechanism for
support/migration of those.

Speaking of "work" and "group": are there plans to make an IETF working
group out of this friendly circle? It would help bring the work to the
attention of a wider range of people and attract noticeably more input.

Best
Marcos
Erik van der Poel | 2 Dec 2006 00:51

Re: prohibiting previously mapped and unmapped characters

OK, thanks to Mark Davis, my IDN character frequency results have been
made available on the Web:

http://macchiato.com/idn/idn-unmapped-sorted.html
http://macchiato.com/idn/idn-mapped-sorted.html

There are several caveats/notes:

These URLs are for documents that Google was actually able to fetch
from the Web quite recently. The sample was a large portion of the
main index. This means that it is only a subset of domain names that
have actually been registered.

I recommend MSIE 7 if you wish to try the links. Firefox is more
strict about the links it will follow.

Some of the domains are wildcard domains. No attempt has been made to
distinguish between wildcard and normal domains. Wildcard means that
if bar.com is a wildcard domain, then foo.bar.com, blah.bar.com and
blurfl.bar.com all work just fine. You can type anything there.

Some of the URLs take you to "parked" domains, which are really just
ads for those domain names and other services. No attempt has been
made to distinguish between parked and normal domains.

Some domain names and Web sites may be offensive to some. No attempt
has been made to filter out potentially offensive material.

The first table contains both unmapped and mapped characters. The IDNA
process maps characters to themselves, to nothing or to something else

Erik van der Poel | 2 Dec 2006 01:18

Re: prohibiting previously mapped and unmapped characters

One correction: 0.0188% of all the URLs in the sample contained
character sequences in their domain names that were mapped to
something else in the IDNA and Nameprep processes, but not the
Punycode process. This includes the various versions of the dot (CJK,
full-width, etc), characters mapped to nothing and any sequences
affected by normalization and case-mapping, excluding ASCII
case-mapping.
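
The kind of check described above can be sketched in Python. This is only an illustration, not the code used to produce the 0.0188% figure: the helper names are hypothetical, it looks at a single label (so the dot variants are out of scope), and it applies only the mapping and NFKC steps of Nameprep, skipping the prohibition and bidi checks.

```python
import stringprep
from unicodedata import ucd_3_2_0  # Unicode 3.2 data, the version Nameprep is defined on


def nameprep_map(label: str) -> str:
    """Apply only the Nameprep mapping (tables B.1 and B.2) and NFKC steps."""
    kept = (
        stringprep.map_table_b2(ch)          # case folding plus compatibility mapping
        for ch in label
        if not stringprep.in_table_b1(ch)    # B.1: characters mapped to nothing
    )
    return ucd_3_2_0.normalize("NFKC", "".join(kept))


def ascii_lower(label: str) -> str:
    """Lowercase A-Z only, so plain ASCII case-mapping is not counted."""
    return "".join(ch.lower() if "A" <= ch <= "Z" else ch for ch in label)


def is_mapped(label: str) -> bool:
    """True if the label is changed by anything other than ASCII case-mapping."""
    return nameprep_map(label) != ascii_lower(label)


for sample in ["Example", "B\u00dcCHER", "ex\u00adample", "\uff45xample"]:
    print(repr(sample), is_mapped(sample))
```

The four samples are a plain ASCII label, a label with a non-ASCII uppercase letter, one containing a soft hyphen (mapped to nothing), and one starting with a full-width letter (changed by NFKC).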

Erik

On 12/1/06, Erik van der Poel <erikv <at> google.com> wrote:
> 0.0188% of the domain names
> are mapped to different strings by the IDNA process, from the links
> found in HTML to the domain names passed to DNS.
Mark Davis | 2 Dec 2006 03:39

Fwd: prohibiting previously mapped and unmapped characters



An interesting one is $sum(7,18,-9.4).x42.com, which essentially takes everything in front of the x42 and evaluates it. So all kinds of strange characters can be there.

Mark

On 12/1/06, Erik van der Poel <erikv <at> google.com > wrote:
OK, thanks to Mark Davis, my IDN character frequency results have been
made available on the Web:

http://macchiato.com/idn/idn-unmapped-sorted.html
http://macchiato.com/idn/idn-mapped-sorted.html

...
 

 

_______________________________________________
Idna-update mailing list
Idna-update <at> alvestrand.no
http://www.alvestrand.no/mailman/listinfo/idna-update
Patrik Fältström | 2 Dec 2006 08:19

Re: prohibiting previously mapped and unmapped characters

On 2 dec 2006, at 03.39, Mark Davis wrote:

> An interesting one is
> $sum(7,18,-9.4).x42.com<http://$sum%287,18,-9.4%29.x42.com/>,
> which essentially takes everything in front of the x42 and  
> evaluates it. So
> all kinds of strange characters can be there.

This is a site by a friend of mine. He has experimented with all
kinds of things regarding the HTTP protocol, especially
everything the implementations allow (as compared to the standards).

     paf
Martin Duerst | 4 Dec 2006 10:53

RE: IDNAbis Goals

I have to say that I agree with Vint to some extent,
and with Marcos to some extent. See below for details.

At 05:35 06/12/02, Marcos Sanz/Denic wrote:
>Vint,
>
>> We probably should take into account as much as we are able the
>> registered domain names as opposed to those that might possibly
>> have been registered under the earlier rules. Also, one wonders to what
>> extent those IDNs that have been registered have been part
>> of the domain name parking business as opposed to domain names for what
>> I will call functioning Internet destinations (not only
>> web sites but other services also).

In the case of IDNs, one should be careful when talking about
"functioning destinations". There is a large number of registrations
that have been made in good faith, and that are just not activated
yet because before IE7 and before top level IDNs (the two main
milestones I identified for myself when attending the ICANN
meeting in Kuala Lumpur, about two years ago), deployment didn't
make sense. While it is difficult to find hard criteria to
distinguish these from domains that have just been bought
for speculation, there is clearly such a distinction.

>> It may be that not many
>> registrations fall into the area of backward incompatibility.
>
>That is a new, broken definition of "backward incompatibility".
>
>Gentlemen, if the work of this group would render invalid some existing
>IDN (never mind if "parked" or "functioning" or at second or eighth
>level), I think it's in scope to determine a mechanism for
>support/migration of those.

I think that for most cases, the actually registered domain names
are among those that still will be allowed under any kind of new rules.
The discussion here is just about fringe cases. A particular fringe
case is tests and other registrations made just to prove a point.

A very good example is the now infamous paypal homograph attack.
http://www.p&#1072;ypal.com (where &#1072; is U+0430, CYRILLIC SMALL
LETTER A) was registered not for inherent interest
in this domain, but just to prove a point, in early 2005.
For good reasons, this registration was quickly removed.
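
To make the homograph concrete, here is a small Python sketch. The `scripts_in` helper is hypothetical and uses the first word of each character's Unicode name as a rough stand-in for the Script property, which the standard library does not expose.

```python
import unicodedata


def scripts_in(label: str) -> set[str]:
    """Rough script guess: the first word of each character's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in label}


spoof = "p\u0430ypal"    # second letter is U+0430 CYRILLIC SMALL LETTER A
genuine = "paypal"       # all Latin

print(spoof == genuine)               # the strings differ although they look alike
print(sorted(scripts_in(genuine)))
print(sorted(scripts_in(spoof)))      # a mixed-script label
print(spoof.encode("idna"))           # the ACE ("xn--") form that goes to the DNS
```

A mixed-script result like this is one of the signals applications can use when deciding whether to display a label in native form or as punycode.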

Another example would be a registration including one of the
sequences in
http://www.unicode.org/reports/tr15/#Corrigendum_5_Sequences
These are totally theoretical, yet allowed (and not normalized
away) on a strict reading of stringprep. The only reason I can
imagine that anybody would make such a registration is to
check what exactly happens for such sequences, or to try to
claim that they exist in practice to somehow influence
the update of IDNA in an unproductive way.

It would be a bad idea to predispose the work ahead by such
marginal issues.

As for migration, the world won't run out of domain names
any time soon. So offering somebody a better alternative for what
was probably a bad choice in the first place will help
everybody, and should keep everybody happy.

Regards,     Martin.

#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst <at> it.aoyama.ac.jp     
John C Klensin | 4 Dec 2006 17:32

RE: IDNAbis Goals

I agree with Martin (below), but want to make one additional
observation...

With the understanding that I don't believe it will be necessary
to do anything radical and that, to my knowledge, no one is
proposing it at this point, I think we need to understand and
accept that:

	(1) The purpose of IDNs is in use and usability, not in
	what can be registered, or has been registered, or that
	might be registered.
	
	(2) If IDNs are successful in satisfying some real need
	(and that has _not_ yet been demonstrated, IMO), then,
	however many users (and registrations) there are of IDNs
	today, they are a tiny, tiny fraction of the user and
	registrations numbers that will be associated with them
	a few years out.  
	
	(3) Conversely, if IDNs are not going to be successful
	in their current form, or any compatible modification of
	their current form, everything we are doing now will
	turn out to have been a brave effort to save the
	unsalvageable.  In the long term, registrations will
	taper off and decrease because no one will care and all
	of the machinery is a lot of clutter and an unnecessary
	risk from attacks.

Now, given that, if we concluded that IDNs were not useful today
and/or posed high risk, and that some change would create
something that was useful, then we would be justified in making
that change, even if it involved, not only a change of prefix
but a fundamental change in the underlying algorithms.  Not only
would users benefit, but even registrars and registries would
benefit because, regardless of the short-term pain of a
transition, the implications of (2) and (3) would suggest that
the money would lie in moving toward a system that people would
want to use.

Again, I'm not suggesting that anything drastic will be
necessary. I don't think it will be.  But I also don't think it
is useful to base today's decisions on counting up the number of
names that are registered today and trying to classify them.  If
IDNs are actually important and changes are necessary, there are
not enough registrations today to be significant.

However, the reality of the proposals being made now is that, as
Martin suggests, we are talking about making changes that would
impact a small number of fringe cases.  Even most of those are
fringe cases that violate recommendations that have been on the
books since around the time the standards were approved.
Because applications software has been modified over the last
year or two to enforce those recommendations (by display of
punycode, rather than more natural characters and glyphs) even
when registries do not, the practical value of many of the
fringe-case names has already been severely reduced in the
marketplace.  Distorting the outcome of this work to accommodate
those fringe cases would be, IMO, not in the best long-term
interests of the Internet, and especially not in the best
long-term interests of those users who need IDNs most.   Put
differently and more bluntly, it would be truly stupid.

     john

p.s. If one wants to think about this strictly from a registry
perspective, I would suggest that, long-term,  only those IDNs
that one can count on having displayed to users in native-script
form (in the overwhelming number of cases for which they are
likely to be accessed) count in any way at all.  The market for
punycode strings that will almost always display as punycode
strings, possibly with highlighting in whatever color is
localized as representing "warning of evil", is, inevitably,
going to be very limited.  We already know that the browser
vendors, who feel some obligation to protect users, will do
exactly that with names they consider suspicious.  While their
algorithms differ, fringe-case names almost certainly meet
reasonable criteria for "suspicious".    

From lots of prior experience with transitions of Internet
applications, we can also predict that, if we tell an
applications developer to stop supporting a particular case, the
code is unlikely to come out unless that developer is either
convinced that it will never be used or that it is actively
harmful and risky.

So the transition strategy for marginal names is (i) see if they
have a long-term future in display in native form and (ii) if
not, let them expire when they expire.  The decision as to
whether to offer preferential registrations of better choices as
an alternative is a business decision, but I suggest it might be
a good one for several reasons.

--On Monday, 04 December, 2006 18:53 +0900 Martin Duerst
<duerst <at> it.aoyama.ac.jp> wrote:

> I have to say that I agree with Vint to some extent,
> and with Marcos to some extent. See below for details.
> 
> At 05:35 06/12/02, Marcos Sanz/Denic wrote:
>> Vint,
>> 
>>> We probably should take into account as much as we are able
>>> the registered domain names as opposed to those that might
>>> possibly have been registered under the earlier rules. Also, one
>>> wonders to what extent those IDNs that have been registered have
>>> been part of the domain name parking business as opposed to domain
>>> names for what I will call functioning Internet destinations (not
>>> only web sites but other services also).
> 
> In the case of IDNs, one should be careful when talking about
> "functioning destinations". There is a large number of
> registrations that have been made in good faith, and that are
> just not activated yet because before IE7 and before top level
> IDNs (the two main milestones I identified for myself when
> attending the ICANN meeting in Kuala Lumpur, about two years
> ago), deployment didn't make sense. While it is difficult to
> find hard criteria to distinguish these from domains that have
> just been bought for speculation, there is clearly such a
> distinction.
> 
>>> It may be that not many
>>> registrations fall into the area of backward incompatibility.
>> 
>> That is a new, broken definition of "backward
>> incompatibility".
>> 
>> Gentlemen, if the work of this group would render invalid
>> some existing IDN (never mind if "parked" or "functioning"
>> or at second or eighth level), I think it's in scope to
>> determine a mechanism for support/migration of those.
> 
> I think that for most cases, the actually registered domain
> names are among those that still will be allowed under any
> kind of new rules. The discussion here is just about fringe
> cases. A particular fringe case is tests and other
> registrations made just to prove a point.
> 
> A very good example is the now infamous paypal homograph
> attack. http://www.p&#1072;ypal.com was registered not for
> inherent interest in this domain, but just to prove a point,
> in early 2005. For good reasons, this registration was quickly
> removed.
> 
> Another example would be a registration including one of the
> sequences in
> http://www.unicode.org/reports/tr15/#Corrigendum_5_Sequences
> These are totally theoretical, yet allowed (and not normalized
> away) on a strict reading of stringprep. The only reason I can
> imagine that anybody would make such a registration is to
> check what exactly happens for such sequences, or to try to
> claim that they exist in practice to somehow influence
> the update of IDNA in an unproductive way.
> 
> It would be a bad idea to predispose the work ahead by such
> marginal issues.
> 
> As for migration, the world doesn't run out of domain names
> soon. So offering somebody a better alternative for what
> was probably a bad choice in the first place will help
> everybody, and should keep everybody happy.
> 
> Regards,     Martin.
> 
> 
> 
> #-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
> #-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst <at> it.aoyama.ac.jp
Michel Suignard | 6 Dec 2006 02:35

FW: UTC Agenda Item: IDNA proposal

I am forwarding this message, which Ken tried to send before he was
subscribed to the list.

Michel
------------- Begin Forwarded Message -------------

Date: Thu, 30 Nov 2006 18:35:00 -0800 (PST)
From: Kenneth Whistler <kenw <at> sybase.com>
...
Patrick,

Following up on your drafted tables, I have built a utility
that lets me experiment with various criteria, to produce
tables that are easier to manipulate and compare.

For first results, see:

http://www.unicode.org/~whistler/SPLlLoLmMnMcNdStableCaseNFKC.txt

and

http://www.unicode.org/~whistler/SPXIDContStableCaseNFKC.txt

SPLlLoLmMnMcNdStableCaseNFKC.txt, as the name I hope suggests,
consists of all Unicode characters of General_Category =
[Ll Lo Lm Mn Mc Nd], constrained to those code points which
are also stable under lowercasing ( cp = lowercase(cp) ) and
which are also stable under NFKC normalization ( cp = NFKC(cp) ).

SPXIDContStableCaseNFKC.txt repeats the same general scheme,
but starts with all Unicode characters of XID_Continue = True,
then constrained to those code points which are also stable
under lowercasing and those which are also stable under
NFKC normalization.

These are plain text files, fielded with spaces, to simplify
sorting on various fields with simple sort utilities and
to simplify searching with grep and comparison with diff.

Lines have the form:

000E0 gc=Ll sc=Latn LATIN SMALL LETTER A WITH GRAVE
^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
fixed column fields character name
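
For anyone who wants to reproduce something close to these files, the filter can be approximated with Python's unicodedata module. This is a sketch under stated assumptions, not the utility described above: it uses whatever Unicode version the local Python ships, `is_stable` and `table_lines` are hypothetical names, and the sc= field is omitted because the standard library has no Script property.

```python
import unicodedata

# General_Category values used as the starting set.
ALLOWED_CATEGORIES = {"Ll", "Lo", "Lm", "Mn", "Mc", "Nd"}


def is_stable(cp: int) -> bool:
    """True if cp is in the allowed categories and unchanged by lowercasing and NFKC."""
    ch = chr(cp)
    return (
        unicodedata.category(ch) in ALLOWED_CATEGORIES
        and ch.lower() == ch                           # cp = lowercase(cp)
        and unicodedata.normalize("NFKC", ch) == ch    # cp = NFKC(cp)
    )


def table_lines(first: int, last: int):
    """Yield lines in the fielded format above, minus the sc= field."""
    for cp in range(first, last + 1):
        if is_stable(cp):
            ch = chr(cp)
            name = unicodedata.name(ch, "<unnamed>")
            yield f"{cp:05X} gc={unicodedata.category(ch)} {name}"


for line in table_lines(0x00DF, 0x00E3):
    print(line)
```

Uppercase letters drop out because of their category, and characters such as U+00B5 MICRO SIGN drop out because NFKC changes them.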

I've elided all the unified Han characters, which under
any criteria need to all be included. I've also elided all
the Hangul syllables, which also need to all be included.
I think we can just take those as given and not have to
deal with the 10's of thousands of extra lines of redundant
material they represent, and instead focus on the issues
for the non-Han and non-Hangul characters.

I haven't bothered special-casing uppercase Latin A-Z,
as again we all know those are a special case to be
dealt with.

The files are in code point order, and I've used zero-extended
5 digit fields for the code point, to make sorting on the
code point easy. The second field (with General_Category
values) and the third field (with Script values) have
unique values, so it is easy to use grep or grep -v to
either pull out a specific subset of records by attribute
or to exclude some specific subset of records by attribute,
to examine the results more carefully.
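
The same selection can be done programmatically. The sketch below assumes only the line format shown earlier; the `Record`, `parse_line` and `select` names are hypothetical, and the second and third sample lines are written in the style of the files rather than copied from them.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class Record(NamedTuple):
    code_point: int
    gc: str
    sc: str
    name: str


def parse_line(line: str) -> Record:
    """Split one 'CCCCC gc=XX sc=Xxxx NAME...' line into its four fields."""
    cp, gc, sc, name = line.rstrip("\n").split(" ", 3)
    return Record(int(cp, 16), gc.split("=", 1)[1], sc.split("=", 1)[1], name)


def select(records: Iterable[Record], *, gc: Optional[str] = None,
           sc: Optional[str] = None, invert: bool = False) -> Iterator[Record]:
    """Keep matching records, like grep; invert=True behaves like grep -v."""
    for rec in records:
        hit = (gc is None or rec.gc == gc) and (sc is None or rec.sc == sc)
        if hit != invert:
            yield rec


SAMPLE = [
    "000E0 gc=Ll sc=Latn LATIN SMALL LETTER A WITH GRAVE",
    "00430 gc=Ll sc=Cyrl CYRILLIC SMALL LETTER A",   # illustrative line
    "10330 gc=Lo sc=Goth GOTHIC LETTER AHSA",        # illustrative line
]
records = [parse_line(line) for line in SAMPLE]

for rec in select(records, sc="Goth", invert=True):   # everything except Gothic
    print(f"{rec.code_point:05X} {rec.sc} {rec.name}")
```

Excluding a whole script this way is the operation needed when considering which scripts to drop from the repertoire.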

You can, of course, easily write a transducer for these
files that would reformat into HTML tables and/or convert
the code point to UTF-8 for display of the actual characters
with fonts in a browser.

XID_Continue is the Unicode character property that summarizes
the basic recommendation for characters appropriate for
use in identifiers (cf. UAX #31).

If you diff SPLlLoLmMnMcNdStableCaseNFKC.txt and
SPXIDContStableCaseNFKC.txt, you'll find that the former is
a proper subset of the latter -- in other words using the
criterion General_Category = [Ll Lo Lm Mn Mc Nd] as the
starting point for defining the appropriate set of characters
is *more* restricted than XID_Continue. And in particular,
XID_Continue also allows the following subtypes that
General_Category = [Ll Lo Lm Mn Mc Nd] omits:

  A. U+00B7 MIDDLE DOT (a special case)
  B. Some connector punctuation (U+005F LOW LINE and a
     few others that are similar in function)
  C. Ethiopic digits (which are gc=No, instead of gc=Nd)
  D. Number letters (gc=Nl), which are letterlike numberforms
     that would be appropriate in identifiers
  E. Two letterlike symbols (gc=So) that are grandfathered
     in to maintain identifier definition stability for
     characters whose General_Category was changed at
     a certain point in the history of the standard.
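
A quick way to look at this difference without the two files is to lean on Python's own identifier rules, which are based on XID_Start and XID_Continue. This is only a sketch: the helper names are hypothetical, the result reflects the Unicode version built into the local Python, and the case and NFKC stability filters are not applied here.

```python
import unicodedata

BASE_CATEGORIES = {"Ll", "Lo", "Lm", "Mn", "Mc", "Nd"}


def is_xid_continue(ch: str) -> bool:
    """Python accepts ch after the first identifier character iff it is XID_Continue."""
    return ("a" + ch).isidentifier()


def xid_only(ch: str) -> bool:
    """True for characters XID_Continue admits but the category list omits."""
    return is_xid_continue(ch) and unicodedata.category(ch) not in BASE_CATEGORIES


for ch in ["\u00b7", "_", "a", "-"]:
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), xid_only(ch))
```

U+00B7 and U+005F correspond to cases A and B in the list above; a plain letter is covered by both criteria, and a hyphen by neither.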

We could discuss whether including any of these subsets in
the StringPrep output repertoire would be desirable. In
particular, I don't think C and D would hurt anything. But
none of them are really high priority, and many of the
characters in D are for historic scripts only.

Given that the relationship between the General_Category = 
[Ll Lo Lm Mn Mc Nd] criterion and the more lenient
XID_Continue = True criterion can now be quantified exactly
by comparing these two files, I think it would then
be productive to next examine:

SPLlLoLmMnMcNdStableCaseNFKC.txt

to make the case for paring it down further simply by the
omission of characters in it which otherwise seem inappropriate
for domain names (and similar identifiers that StringPrep
would be used for). In particular, the next chunk that could
easily be eliminated algorithmically would be to
drop the historic-only scripts. The list could easily be
pared down, for example, by dropping cuneiform scripts:

sc=Xsux  (Sumero-Akkadian cuneiform)
sc=Ugar  (Ugaritic cuneiform)
sc=Xpeo  (Old Persian cuneiform)

other archaic alphabets and syllabaries:

sc=Goth  (Gothic alphabet)
sc=Ital  (Old Italic alphabet)
sc=Cprt  (Cypriot syllabary)
sc=Linb  (Linear-B syllabary)
sc=Phnx  (Phoenician alphabet)
sc=Khar  (Kharoshthi abjad)
sc=Phag  (Phags-pa alphabet)
sc=Glag  (Glagolitic alphabet)

and conscripts with no current usage:

sc=Shaw  (Shavian conscript alphabet)
sc=Dsrt  (Deseret conscript alphabet)

I don't think anybody would shed any tears if those weren't
available for domain names, etc.

More controversial ones might be:

sc=Ogam  (Ogham, which has a devoted following in Ireland)
sc=Runr  (Runic, which has much current usage, despite 
            being officially archaic)
sc=Cher  (Cherokee, which has little current use and is
            a problem for confusables, but whose elimination
            could be a cause celebre and be taken as discriminatory)

After that the pickings get slim, and I don't think you can
make a very good case for eliminating any more scripts
qua scripts.

If we could get consensus somewhere along these lines, I think
we could then examine what remains for the next priority
collections of characters to omit systematically. For
example, while many, many combining marks are clearly
required for many languages, there are identifiable
subsets whose usage is restricted and not required for
normal orthography. Examples include Hebrew annotation
marks and Arabic Koranic annotation marks, whose usage is
primarily for annotating religious texts for chanting and
singing. Also combining marks used only in musical notation.
Such characters are harder to identify by Unicode properties,
and would best be handled by specifying a small number
of ranges of code points that would be restricted, instead.
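
A range-based restriction of this kind is simple to implement. In the sketch below the three ranges are illustrative assumptions picked from the Unicode code charts, not a proposed list, and `is_restricted` is a hypothetical name.

```python
from bisect import bisect_right

# Illustrative ranges only (inclusive, sorted by start); not a proposed list.
RESTRICTED_RANGES = [
    (0x0591, 0x05AF),    # Hebrew cantillation marks
    (0x06D6, 0x06ED),    # Arabic Koranic annotation signs
    (0x1D165, 0x1D169),  # musical symbol combining stems and tremolos
]
_STARTS = [start for start, _ in RESTRICTED_RANGES]


def is_restricted(cp: int) -> bool:
    """Binary-search the sorted ranges for one that contains cp."""
    i = bisect_right(_STARTS, cp) - 1
    return i >= 0 and cp <= RESTRICTED_RANGES[i][1]


print(is_restricted(0x0591))   # HEBREW ACCENT ETNAHTA: inside the first range
print(is_restricted(0x05B0))   # HEBREW POINT SHEVA: ordinary pointing, outside
```

Keeping the list as a few sorted ranges makes it easy to review by eye and to diff between revisions.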

Comments?

By the way, if there is additional information or a different
format that folks would find more useful for fiddling with
these data files, just let me know. It is easy to adjust
the output formatting or to incorporate listing of
additional properties for characters, if having them
explicitly listed would assist any in making these
decisions. At the moment, it seems to me that General_Category
and Script are really the crucial ones that folks are
most concerned with and which seem most useful as filtering
criteria.

Regards,

--Ken

P.S. I am assuming that "idna-update <at> alvestrand.no" is simply
a mailing list that is set up to automatically distribute
this discussion to the relevant group. If not, I need to
know, so I can manually cc this to the relevant participants.

------------- End Forwarded Message -------------
Soobok Lee | 7 Dec 2006 05:01

[lsb <at> lsb.org: [EAI] (summary) display of RightToLeft chars in localparts and hostnames]


I found this section in stringprep2003:

<quote from section 5.8>
 5.8 Change display properties or are deprecated

   The following characters can cause changes in display or the order in
   which characters appear when rendered, or are deprecated in Unicode.

   200E; LEFT-TO-RIGHT MARK
   200F; RIGHT-TO-LEFT MARK
   202A; LEFT-TO-RIGHT EMBEDDING
   202B; RIGHT-TO-LEFT EMBEDDING
   202C; POP DIRECTIONAL FORMATTING
   202D; LEFT-TO-RIGHT OVERRIDE
   202E; RIGHT-TO-LEFT OVERRIDE
   206A; INHIBIT SYMMETRIC SWAPPING
   206B; ACTIVATE SYMMETRIC SWAPPING
   206C; INHIBIT ARABIC FORM SHAPING
   206D; ACTIVATE ARABIC FORM SHAPING
</quote>

My suggestion for the new stringprep200x is to move these chars
  to the "mapped to nothing" list. That is, how about silently deleting
  them instead of prohibiting them and returning an error?

Reason:
  As Harald's bidi draft (referenced below) explains, browser/email client
 implementors will somehow have to settle on their own preferred
 display order for IDN bidi labels and localparts, regardless of
 whether or not the IETF recommends some specific display order.

 For that purpose, the bidi formatting chars above would be used
 to surround the major IRI delimiters in preparation for display.

 When such strings are copied and pasted, chars in u+200e~u+206d may
 end up in the copy buffer. stringprep2003 then prohibits them,
 but it would be better for a future stringprep200x to delete them.
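
The difference between the two behaviours can be sketched in Python. The code points are the ones quoted above; `strip_display_controls` is a hypothetical helper showing the proposed "map to nothing" treatment, while `stringprep.in_table_c8` is the standard-library lookup for the current prohibited table C.8.

```python
import stringprep

# The code points quoted above from section 5.8 of stringprep2003.
DISPLAY_CONTROLS = {
    0x200E, 0x200F,
    0x202A, 0x202B, 0x202C, 0x202D, 0x202E,
    0x206A, 0x206B, 0x206C, 0x206D,
}


def currently_prohibited(text: str) -> bool:
    """Current behaviour: any character in table C.8 makes stringprep fail."""
    return any(stringprep.in_table_c8(ch) for ch in text)


def strip_display_controls(text: str) -> str:
    """Proposed behaviour (sketch): silently delete the listed characters."""
    return "".join(ch for ch in text if ord(ch) not in DISPLAY_CONTROLS)


pasted = "first\u202a.\u202csecond"        # a dot wrapped in LRE ... PDF
print(currently_prohibited(pasted))       # the label would be rejected today
print(strip_display_controls(pasted))     # the label after silent deletion
```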

Soobok

----- Forwarded message from Soobok Lee <lsb <at> lsb.org> -----

Date: Thu, 7 Dec 2006 11:04:25 +0900
From: Soobok Lee <lsb <at> lsb.org>
To: ima <at> ietf.org

On Thu, Dec 07, 2006 at 09:59:07AM +0900, Soobok Lee wrote:
> On Thu, Dec 07, 2006 at 09:48:17AM +0900, Soobok Lee wrote:
> > 
> > http://www.ietf.org/internet-drafts/draft-alvestrand-idna-bidi-00.txt (page 6)
> > and http://www.unicode.org/reports/tr36/#Bidirectional_Text_Spoofing
> > 
> > If you read above references, you can understand why this:
> > (storage order)
> >   LocalRTL <at> FirstRTL.SecondRTL.com
> > 
> > (old display order)
> >   LTRdnoceS.LTRtsriF <at> LTRlacoL.com
> > 
> > (  <at>  and dot are neutral chars wrt RtL and LtR direction)
> > 
> > I remember that there had been some discussion about whether
> >   we should do "RtoL stopper chars"(this may be not the right tech term, sorry) 
> >   around  delimiters like  <at>  or dot.
> > 
> > this may not have any definitive right answer, but we may have better choose 
> >  one anyway.
> 
> one of my self-answer is "for display preparation, insert RtoL stopper around 
> all special chars in IRI/URL".
> 
> If we follow this:
> 
>   (new display order)
>    LTRlacoL <at> LTRtsriF.LTRdnoceS.com
> 
>   here each   <at>  and dot have preceding unseen(transparent) RtoL stopper char.

For the RtoL stopper, we can use an "LRE dot PDF" sequence.

But, we still have 3 problems:

1) The above choice would still leave the input-time display order looking like the (old display
order), since we can't expect input method editors for BIDI chars to determine intelligently
when to insert such surrounding stoppers in an entry form while the user is typing.
So we should provide a consistent user experience across storage order,
input-time order, old display order, and new display order. But it won't be a trivial
task.

2) Moreover, when we copy and paste hostname and localpart strings in (new display order),
they may contain hidden LRE and PDF chars, which are *prohibited* in stringprep.
  (http://www.ietf.org/rfc/rfc3454.txt section 5 and section 5.8)

When IDN hostnames contain prohibited chars, stringprep will fail on them
and return an error.
To prevent this from happening, the bidi LRE/PDF chars should not be copied by mouse
operations.
3) localparts may contain dots, for example, [OSAMA].[BIN].[LADEN] <at> free.af
How to display dot-containing bidi localparts complicates this problem.
I guess a localpart dot should be handled the same way as a hostname dot.

I welcome any criticism/suggestion.

Soobok

> 
> Soobok
> 
> _______________________________________________
> IMA mailing list
> IMA <at> ietf.org
> https://www1.ietf.org/mailman/listinfo/ima

----- End forwarded message -----
Soobok Lee | 7 Dec 2006 06:00

Re: [lsb <at> lsb.org: [EAI] (summary) display of RightToLeft chars in localparts and hostnames]


This is another problem with bidi localparts:

> > > (storage order)
> > >   LocalRTL <at> FirstRTL.SecondRTL.com
> > >
> > > (old display order)
> > >   LTRdnoceS.LTRtsriF <at> LTRlacoL.com
> >

after applying    " s/([. <at> ])/(LRE)\1(PDF)/g ";  
we get:
      (new storage order for display)
        LocalRTL(LRE) <at> (PDF)FirstRTL(LRE).(PDF)SecondRTL(LRE).(PDF)com

When stringprep200x is applied, the 2nd and 3rd LRE will be deleted/prohibited,
but the first LRE, which is part of the localpart string,
would remain *attached* unless we apply some kind of preprocessing
to delete it before using the address as a recipient address when composing/replying.
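
The substitution and the resulting problem can be reproduced with a short Python sketch. The function names are hypothetical, and the real "@" is used where this archive shows "<at>".

```python
import re

LRE = "\u202a"   # LEFT-TO-RIGHT EMBEDDING
PDF = "\u202c"   # POP DIRECTIONAL FORMATTING


def wrap_delimiters(address: str) -> str:
    """Display preparation: surround every '.' and '@' with LRE ... PDF."""
    return re.sub(r"([.@])", lambda m: LRE + m.group(1) + PDF, address)


def strip_wrappers(text: str) -> str:
    """Preprocessing before reuse: delete the embedding controls again."""
    return text.replace(LRE, "").replace(PDF, "")


stored = "LocalRTL@FirstRTL.SecondRTL.com"
shown = wrap_delimiters(stored)

# Splitting a pasted copy at '@' leaves the first LRE attached to the localpart.
localpart = shown.split("@", 1)[0]
print(localpart.endswith(LRE))
print(strip_wrappers(shown) == stored)
```

Deleting the controls before the split, rather than relying on stringprep to act on the hostname labels only, avoids the stray LRE in the localpart.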

This problem occurs when the displayed address below is copied and pasted into
the recipient address entry form of an EAI-capable MUA for replying/composing.

> >   (new display order)
> >    LTRlacoL <at> LTRtsriF.LTRdnoceS.com

Soobok
