Erik van der Poel | 5 May 23:34

homographs in TrueType fonts

I have written a small program that parses a number of TrueType font 
tables to determine which pairs of Unicode codepoints end up using the 
same glyphs. The ASCII part of the table is included below. Each line 
has a codepoint, its glyph, the other codepoint of the pair, and the 
number of fonts in which that pair is identical.

U+2044 and U+2215 use the same glyph as the slash (U+002F) in a few East 
Asian fonts. Note also that the capital letters I and O have homographs, 
although some apps present domain names in lower case, so those 
homographs would stand out in those apps. For the complete table, see:

http://nameprep.org/tt-hg.html

Erik

0021(!);01C3;2
0022(");02BA;4
0022(");05F4;12
0027(');0060;1
0027(');02B9;4
0027(');05F3;12
0027(');2032;6
0028(();FD3E;3
0029());FD3F;3
002C(,);201A;9
002D(-);2010;12
002D(-);2012;1
002D(-);2013;2
002F(/);2044;3
002F(/);2215;4
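
A minimal sketch of how the lines above could be read back into a lookup
table. It is not the program described in the message, which is not shown
here. The line format (hex codepoint, glyph in parentheses, hex codepoint
of the other member of the pair, font count, separated by semicolons) is
assumed from the excerpt, and the names LINE_RE, SAMPLE, parse_line and
homographs_by_base are hypothetical.

```python
import re
from collections import defaultdict

# Assumed format, inferred from the excerpt: HEX(glyph);HEX;count
LINE_RE = re.compile(r"^([0-9A-Fa-f]{4,6})\((.)\);([0-9A-Fa-f]{4,6});(\d+)$")

# The ASCII rows quoted in the message, copied verbatim.
SAMPLE = """\
0021(!);01C3;2
0022(");02BA;4
0022(");05F4;12
0027(');0060;1
0027(');02B9;4
0027(');05F3;12
0027(');2032;6
0028(();FD3E;3
0029());FD3F;3
002C(,);201A;9
002D(-);2010;12
002D(-);2012;1
002D(-);2013;2
002F(/);2044;3
002F(/);2215;4
"""


def parse_line(line):
    """Return (codepoint, glyph, other_codepoint, font_count), or None
    if the line does not match the assumed format."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    base, glyph, other, fonts = m.groups()
    return int(base, 16), glyph, int(other, 16), int(fonts)


def homographs_by_base(text):
    """Group the pairs by their first codepoint, keeping file order."""
    table = defaultdict(list)
    for line in text.splitlines():
        rec = parse_line(line)
        if rec is None:
            continue  # skip blank or unrecognised lines
        base, _glyph, other, fonts = rec
        table[base].append((other, fonts))
    return dict(table)


if __name__ == "__main__":
    for base, pairs in homographs_by_base(SAMPLE).items():
        others = ", ".join(f"U+{cp:04X} ({n} fonts)" for cp, n in pairs)
        print(f"U+{base:04X} {chr(base)!r}: {others}")
```

Grouping by the first codepoint makes it easy to ask, for a given ASCII
character, which other codepoints share its glyph and in how many fonts.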

Mark Davis | 7 May 02:01

Re: homographs in TrueType fonts

Erik, I updated the file at
http://unicode.org/reports/tr36/draft/confusables.txt incorporating your
list (and others).

There is also the file
http://unicode.org/reports/tr36/draft/confusables-raw.txt, which contains
raw data; it is not reconciled, and not closed. The items with a number or
'skip' in the third field are usually from your data. I do remove some of
the data where you have identicals because they are basically font bugs.

Anyway, comments welcome.

Mark

----- Original Message ----- 
From: "Erik van der Poel" <erik <at> vanderpoel.org>
To: <idn <at> ops.ietf.org>
Sent: Thursday, May 05, 2005 14:34
Subject: [idn] homographs in TrueType fonts

> I have written a small program that parses a number of TrueType font
> tables to determine which pairs of Unicode codepoints end up using the
> same glyphs. The ASCII part of the table is included below. Each line
> has a codepoint, its glyph, the other codepoint of the pair, and the
> number of fonts in which that pair is identical.
>
> U+2044 and U+2215 use the same glyph as the slash (U+002F) in a few East
> Asian fonts. Note also that the capital letters I and O have homographs,
> although some apps present domain names in lower case, so those
> homographs would stand out in those apps. For the complete table, see:

JFC (Jefsey) Morfin | 11 May 06:08

a way toward homograph resolution ? (was "improving WG operation")

On 04:43 11/05/2005, Randy Presuhn said:
>From: "JFC (Jefsey) Morfin" <jefsey <at> jefsey.com>
> > To: "Hallam-Baker, Phillip" <pbaker <at> verisign.com>
> > Cc: <ietf <at> ietf.org>
> > Sent: Tuesday, May 10, 2005 5:29 PM
> > Subject: RE: improving WG operation
>...
> > They do not only delete. I suggest you just come to the WG-ltru where
> > they have decided to document RFC 2277 charsets into RFC 3066 langtags. So
> > you can enjoy charset conflicts, something you never thought about, I
> > presume. You cannot stop progress.
>...
>
>I guess Jefsey is upset because the WG rejected his proposal
>to expand our scope to include charsets.  The ltru WG is most
>emphatically *not* confusing charsets with language tags.

I am not upset :-). On the contrary, I find it extremely interesting that some 
people were able to rename charsets "scripts" in order to insert charsets 
into language descriptions while claiming they don't (cf. above). Obviously 
they are unhappy when I expose the trick. Anyway, the result is great fun: 
people will be prevented from accessing a page they know how to read if they 
do not know the language.

This cacologic however might be a good way to solve the IDN homograph issue 
and the phishing problem.

If we revert from those famous "scripts" to what they are, i.e. Unicode 
partitions, hence stable and well documented charsets 
(http://www.unicode.org/Public/4.1.0/ucd/Scripts.txt), using them browsers 

Randy Presuhn | 11 May 08:18

Re: a way toward homograph resolution ?

Hi -

Let it suffice for me to say that I believe the gentleman is mistaken.
I do not intend to waste additional bandwidth on this thread.
Those interested in ltru and its work will find our charter at
http://www.ietf.org/html.charters/ltru-charter.html and our archives at
http://www.ietf.org/mail-archive/web/ltru/index.html

Randy, ltru co-chair

----- Original Message ----- 
> From: "JFC (Jefsey) Morfin" <jefsey <at> jefsey.com>
> To: <ietf <at> ietf.org>
> Cc: <idn <at> ops.ietf.org>
> Sent: Tuesday, May 10, 2005 9:08 PM
> Subject: a way toward homograph resolution ? (was "improving WG operation")
>

> On 04:43 11/05/2005, Randy Presuhn said:
> >From: "JFC (Jefsey) Morfin" <jefsey <at> jefsey.com>
> > > To: "Hallam-Baker, Phillip" <pbaker <at> verisign.com>
> > > Cc: <ietf <at> ietf.org>
> > > Sent: Tuesday, May 10, 2005 5:29 PM
> > > Subject: RE: improving WG operation
> >...
> > > They do not only delete. I suggest you just come to the WG-ltru where
> > > they have decided to document RFC 2277 charsets into RFC 3066 langtags. So
> > > you can enjoy charset conflicts, something you never thought about, I
> > > presume. You cannot stop progress.
> >...


RE: a way toward homograph resolution ? (was "improving WG operation")


> This cacologic however might be a good way to solve the IDN 
> homograph issue 
> and the phishing problem.

I have been spending most of my time on the phishing problem for three
years. I have yet to see a phishing gang use the DNS IDN loophole for a
phishing attack.

This is probably because the issue was an administrative one: the cert
should never have been issued, and in the wake of the paper the CAs I have
talked to have all corrected the issue.

The lookalike DNS name problem was known before the design of SSL
started, remember Micros0ft.com?

Today the phishing gangs use bigbank-security.com or bigbank-corp.com or
something similar. They are not going to use IDN DNS names until the
application support is much much more comprehensive by which time the
strategy will have changed.

So in summary, no: 'solving' the homograph issue is irrelevant to current
phishing issues, and by the time it is relevant I hope that we will no
longer think it is a good idea to try to train users to recognise DNS or
X.500 names as security indicators. We need to make security much more
informative and usable if we want it to be used.

_______________________________________________
Ietf mailing list
Ietf <at> ietf.org

JFC (Jefsey) Morfin | 11 May 16:21

RE: a way toward homograph resolution ? (was "improving WG operation")

On 15:29 11/05/2005, Hallam-Baker, Phillip said:
> > This cacologic however might be a good way to solve the IDN
> > homograph issue and the phishing problem.
>
>I have been spending most of my time on the phishing problem for three
>years. I have yet to see a phishing gang use the DNS IDN loophole for a
>phishing attack.

Dear Phillip,
I am afraid you are right, owing to the low interest in the IDN solution 
(although punycode is of interest). Why not document your experience for the 
ccTLDs? We are very concerned about this because we can do nothing about it 
and people believe we can.

What the "techies" say is "don't worry, we have known about the problem for a 
long time" :-). True, this is one of the reasons why I objected to IDNA. But 
IDNA is still here? Help welcome!

>This is probably because the issue was an administrative one: the cert
>should never have been issued, and in the wake of the paper the CAs I have
>talked to have all corrected the issue.

CA?

>The lookalike DNS name problem was known before the design of SSL
>started, remember Micros0ft.com?
>
>Today the phishing gangs use bigbank-security.com or bigbank-corp.com or
>something similar. They are not going to use IDN DNS names until the
>application support is much much more comprehensive by which time the

