Julien ÉLIE | 9 Oct 2011 21:27

Experiment with UTF-8 in message-IDs


Hi all,

In the IETF working group for IMA (Internationalized eMail Address),
there is a current thread about UTF-8 in message-IDs:
    http://www.ietf.org/mail-archive/web/ima/current/threads.html#04330

Quick references in the thread:

http://www.ietf.org/mail-archive/web/ima/current/msg04430.html
http://www.ietf.org/mail-archive/web/ima/current/msg04344.html
http://www.ietf.org/mail-archive/web/ima/current/msg04345.html
http://www.ietf.org/mail-archive/web/ima/current/msg04420.html
http://www.ietf.org/mail-archive/web/ima/current/msg04422.html

RFC 5536 (USEFOR) currently allows only ASCII characters in message-IDs.

INN 2.4 and INN 2.5 have always rejected message-IDs containing
non-ASCII chars.  (I have not looked at INN 2.3 and before.)  When
a message-ID is not valid per RFC 850/1036/... and now 5536, the
article is rejected.

200 news.trigofacile.com InterNetNews server INN 2.6.0 (20110908 prerelease) ready (transit mode)
IHAVE <©@fr>
435 Syntax error in message-ID
MODE READER
200 news.trigofacile.com InterNetNews NNRP server INN 2.6.0 (20111003 prerelease) ready (posting ok)
ARTICLE <©@test>
501 Syntax error in message-ID
QUIT
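
For illustration, the kind of check that leads to the 435/501 responses above can be sketched as follows. This is a simplified approximation, not INN's actual code: it assumes a message-ID of the form <dot-atom@dot-atom>, ignores domain literals, and uses the 250-octet limit from RFC 5536. The names is_ascii_message_id and MAX_MSGID_OCTETS are made up for the example.

```python
import re

# atext characters as in RFC 5322; domain literals are deliberately not handled.
_ATEXT = r"[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~]"
_DOT_ATOM = _ATEXT + r"+(?:\." + _ATEXT + r"+)*"
_MSGID_RE = re.compile("<" + _DOT_ATOM + "@" + _DOT_ATOM + ">")

# RFC 5536 limits a message-ID to 250 octets, angle brackets included.
MAX_MSGID_OCTETS = 250


def is_ascii_message_id(msgid: str) -> bool:
    """Return True if msgid looks like an ASCII-only <left@right> message-ID."""
    try:
        octets = msgid.encode("ascii")
    except UnicodeEncodeError:
        # Any non-ASCII character (such as U+00A9) fails here.
        return False
    if len(octets) > MAX_MSGID_OCTETS:
        return False
    return _MSGID_RE.fullmatch(msgid) is not None
```

With this sketch, is_ascii_message_id("<©@fr>") is False because the copyright sign cannot be encoded as ASCII, which matches the rejection shown in the session.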

Charles Lindsey | 30 Mar 2010 15:31

Re: No subject given


>Hi!

>2010/3/24 Charles Lindsey <chl <at> clerew.man.ac.uk>:
>> BTW,
>>
>> http://www.ietf.org/internet-drafts/draft-teint-xidna-base-00.txt
>> http://www.ietf.org/internet-drafts/draft-teint-xidna-email-00.txt
>>
>> have just been published, and if we do add EAI extensions to Netnews, then
>> those mechanisms might come in handy for downgrading the Newsgroups header
>> when gatewaying to non-EAI mailing lists.

>Thanks for posting this here. There's also a proposal for newsgroup names:
>http://www.ietf.org/internet-drafts/draft-teint-xidna-newsgroup-00.txt

OK, I shall look at that Tuits permitting. But current thinking is that
News should be extended with EAI (when that has settled down), so
downgrading should only be needed when gatewaying to non-EAI email, at
which point the Newsgroups header is only for information. But since
newsgroup-names are case insensitive and pretty similar in appearance to
domain names, then punycoding them should be pretty straightforward.
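
As a rough sketch of what punycoding a newsgroup name could look like, the following applies Python's built-in "punycode" codec to each dot-separated component. The "xn--" prefix is borrowed from IDNA and is only an assumption here; the function names are hypothetical and not taken from any of the drafts.

```python
ACE_PREFIX = "xn--"  # assumption: reuse the IDNA prefix for newsgroup components


def downgrade_group_name(name: str) -> str:
    """Encode every non-ASCII component of a newsgroup name with Punycode."""
    out = []
    for component in name.split("."):
        if component.isascii():
            out.append(component)
        else:
            encoded = component.encode("punycode").decode("ascii")
            out.append(ACE_PREFIX + encoded)
    return ".".join(out)


def upgrade_group_name(name: str) -> str:
    """Reverse downgrade_group_name for components carrying the prefix."""
    out = []
    for component in name.split("."):
        if component.startswith(ACE_PREFIX):
            raw = component[len(ACE_PREFIX):].encode("ascii")
            out.append(raw.decode("punycode"))
        else:
            out.append(component)
    return ".".join(out)
```

An all-ASCII name such as comp.lang.c passes through unchanged, and upgrade_group_name(downgrade_group_name(x)) should give back x for names like local.test.υτφ8.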

But so long as newsgroup names remain within the netnews environment,
vicious normalization will be essential, because they will be routed by
the simplest of octet-for-octet comparisons.

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131            Web: http://www.cs.man.ac.uk/~chl

Charles Lindsey | 24 Mar 2010 12:59

Re: Newsgroups names with IDNA


>Hi all,

>When specifying internationalized newsgroup names, will other separators
>than the usual dot between newsgroup components be considered?

>RFC 3490 (Internationalizing Domain Names in Applications) mentions:

>   1) Whenever dots are used as label separators, the following
>      characters MUST be recognized as dots:  U+002E (full stop), U+3002
>      (ideographic full stop), U+FF0E (fullwidth full stop), U+FF61
>      (halfwidth ideographic full stop).

>Of course, newsgroup names can go on using only the usual full stop.
>Anyway, it would perhaps be interesting to allow other kinds of full stops.
>DNS uses them (but e-mail does not allow them).

I would hope we would insist on U+002E (full stop). The idea always was to
insist on NFKC normalization, which I would expect to collapse all of
those.
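
That expectation is easy to test. A quick check with Python's unicodedata module suggests that NFKC alone folds only U+FF0E to U+002E: U+FF61 becomes U+3002, and U+3002 has no compatibility mapping. The sketch below therefore adds an explicit mapping step after NFKC. DOT_VARIANTS and normalize_group_name are illustrative names, not part of any specification.

```python
import unicodedata

# The three extra separators listed in RFC 3490, mapped explicitly to U+002E.
DOT_VARIANTS = str.maketrans({"\u3002": ".", "\uff0e": ".", "\uff61": "."})


def normalize_group_name(name: str) -> str:
    """Apply NFKC, then fold any remaining dot variants to a plain full stop."""
    nfkc = unicodedata.normalize("NFKC", name)
    return nfkc.translate(DOT_VARIANTS)
```

After this step, names that differ only in which kind of full stop they use compare equal octet for octet, which is what routing needs.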

BTW,

http://www.ietf.org/internet-drafts/draft-teint-xidna-base-00.txt
http://www.ietf.org/internet-drafts/draft-teint-xidna-email-00.txt

have just been published, and if we do add EAI extensions to Netnews, then
those mechanisms might come in handy for downgrading the Newsgroups header
when gatewaying to non-EAI mailing lists.


Julien ÉLIE | 23 Mar 2010 21:48

Newsgroup names with IDNA


Hi all,

When specifying internationalized newsgroup names, will other separators
than the usual dot between newsgroup components be considered?

RFC 3490 (Internationalizing Domain Names in Applications) mentions:

   1) Whenever dots are used as label separators, the following
      characters MUST be recognized as dots:  U+002E (full stop), U+3002
      (ideographic full stop), U+FF0E (fullwidth full stop), U+FF61
      (halfwidth ideographic full stop).

Of course, newsgroup names can go on using only the usual full stop.
Anyway, it would perhaps be interesting to allow other kinds of full stops.
DNS uses them (but e-mail does not allow them).

It was just a suggestion.  Maybe silly...

-- 
Julien ÉLIE

« Aux quatre coins de l'Hexagone. » 

Charles Lindsey | 1 Feb 2010 19:06

Extending news to EAI


There is now an experimental protocol for UTF-8 headers in Email (RFC5335
and its relations). This was the product of the IMA WG. There has been
recent discussion of applying this to Netnews, and the conclusion seems to
be that the IMA WG is not the place to do this, and that a private draft
would be the way to do this. However, this list would be a reasonable
place to discuss it.

Essentially, under this protocol, UTF-8 may be freely used in Email
headers, but a downgrading mechanism is needed whenever mail passes to a
server that does not advertise the UTF8SMTP capability.
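
To make the downgrading condition concrete, here is a minimal sketch of the decision only (not of the downgrading procedure itself). It assumes the sender already has the set of EHLO keywords advertised by the next server and the header field values of the message. needs_downgrade is a hypothetical helper name.

```python
from typing import Iterable


def needs_downgrade(ehlo_keywords: Iterable[str],
                    header_values: Iterable[str]) -> bool:
    """True if the headers contain non-ASCII text and the next hop
    does not advertise the UTF8SMTP capability."""
    advertised = {keyword.upper() for keyword in ehlo_keywords}
    has_utf8 = any(not value.isascii() for value in header_values)
    return has_utf8 and "UTF8SMTP" not in advertised
```

A message whose headers are pure ASCII never needs downgrading under this rule, whatever the server advertises.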

This is much what the USEFOR WG wanted to do in its earlier days, but the
decision was then taken to postpone it until the base documents were
complete, and then to bring it up again as an Experimental protocol. So
maybe now is the time to embark on it.

It is much easier with Netnews than with Email, since the underlying
transport (whether NNTP or UUCP) is already 8-bit clean. I would not
expect it to become the norm on the Big-8 groups for quite some time, but
it would be very useful for National hierarchies, such as the Scandinavian
ones where the inability to have Newsgroup Names with their own special
characters in them is a right pain (apparently).

So the experimental protocol would start off with the extensions allowed
by RFC5335, and then add UTF-8 in the Newsgroups header. It would be up to
individual hierarchies to encourage deployment of the experiment within
their groups.

It has already been established that the existing transport mechanisms

Julien ÉLIE | 30 Jan 2010 10:55

UTF-8 newsgroup name feedback


Hi all,

For those who are interested, I have just sent a newgroup
control article for local.test.υτφ8 so that you can
retrieve it and possibly process it (changing "local"
to whatever you want):

    <news:newgroup-local-test-utf8-1264843277@news.trigofacile.com>

Note for inn-workers
--------------------
I confirm it works fine with INN provided that we remove
the check for [^a-z0-9+_\-] in a newsgroup name (enforced
by controlchan).  I suggest that we remove it in INN 2.5.2.
We can reintegrate a better check with allowed UTF-8 values
in future versions of INN.
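
To illustrate the difference, here is a sketch (in Python rather than controlchan's Perl) of the current ASCII-only component check next to one possible relaxed check. The relaxed rule, where non-ASCII characters must be Unicode letters or digits, is purely an assumption about what "allowed UTF-8 values" might mean; both function names are hypothetical.

```python
import re

# The character class quoted above: anything outside it is rejected.
_STRICT_BAD = re.compile(r"[^a-z0-9+_\-]")


def strict_component_ok(component: str) -> bool:
    """ASCII-only check equivalent to rejecting [^a-z0-9+_\\-]."""
    return component != "" and _STRICT_BAD.search(component) is None


def relaxed_component_ok(component: str) -> bool:
    """ASCII characters follow the strict rule; non-ASCII characters
    must be letters or digits (an assumed policy, for illustration)."""
    if component == "":
        return False
    for ch in component:
        if ch.isascii():
            if _STRICT_BAD.search(ch):
                return False
        elif not ch.isalnum():
            return False
    return True
```

Under the strict check the component υτφ8 is refused, while the relaxed one accepts it and still refuses uppercase ASCII or spaces.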

Note for USEFOR
---------------
In case someone wants to see what the result is, you can use
news.trigofacile.com on port 119.
The trigofacile.test.υτφ8 newsgroup has been created in UTF-8.
It is readable by everyone.

If you want to post, authenticate with user "test" and password
"test".

The group properly shows up in LIST ACTIVE and LIST NEWSGROUPS.


Julien ÉLIE | 30 Jan 2010 10:34

Fwd: DNews and newsgroups in UTF-8


Hi all,

I've just asked in fr.usenet.8bits and fr.usenet.forums.evolution whether
we could create a test newsgroup with a UTF-8 name in fr.*.

Michel Guillou reports that DNews fails to use such names:
<news:2jt6m5tsg8arj09lbhd266rd96p9s2qeqs@meta.neottia.net>

    New group {neottia.test.&#965;&#964;&#966;8}, Modflag {y}, Creator {MG}
    Description {Local, pour les tests dans un forum à nom UTF8.}
    Group creation failed, check ME feed in newsfeeds.conf and max_groups
    setting (examine dnews.log too)

The corresponding newsgroup is:

    neottia.test.υτφ8

-- 
Julien ÉLIE

« Ne crains pas d'avancer lentement, crains seulement de t'arrêter. » (proverbe 
chinois) 

Julien ÉLIE | 15 Jan 2010 23:55

Errata for RFC 5536 and 5537


Hi,

According to previous discussion on the mailing-list, I reckon that
the statuses of current open errata for RFCs 5536 and 5537 are:

1979 -> still no validation, though I believe it should be VERIFIED.
        Can someone confirm?

1980 -> it should be reworded as follows, and VERIFIED.

1981 -> it should be VERIFIED.

1982 -> I don't know; the original text is right and reads better.
        Should it be VERIFIED or REJECTED?  (with a note added to
        say that both forms are correct English)

1983 -> it should be VERIFIED.

1993 -> it should be VERIFIED.

-- 
Julien

-----------------------------------------------------------------------
RFC 5537 - Erratum 1980
-----------------------------------------------------------------------

It should be VERIFIED.


Julien ÉLIE | 9 Jan 2010 00:21

Parsing the Injection-Info: header field


Hi,

I am following up on a discussion that was initiated in news.software.nntp:

    http://groups.google.fr/group/news.software.nntp/browse_frm/thread/c704d0859dd84d38
    <news:slrnhjcc9b.u40.steve@news.mixmin.net>

According to Section 3.2.8 of RFC 5536:

   injection-info  =  "Injection-Info:" SP [CFWS] path-identity
                      [CFWS] *( ";" [CFWS] parameter ) [CFWS] CRLF

      NOTE: The syntax of <parameter> (Section 5.1 of [RFC2045], as
      amended by [RFC2231]), taken in conjunction with the folding rules
      of [RFC0822] (note: not [RFC2822] or [RFC5322]), effectively
      allows [CFWS] to occur on either side of the "=" inside a
      <parameter>.

I do not fully understand that point.
Does it mean that CFWS is allowed both before and after the "="?
Why should RFC 822 be followed here?  And why doesn't RFC 5322 mention
that?

I thought that RFC 5536 was based upon RFC 5322 and therefore would
not allow a space before "=", and would allow a space after "=" only
if <value> is a quoted-string.
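
To make the question concrete, here is a sketch of a tolerant parser that accepts white space on either side of the "=", which is the liberal reading of the NOTE. It is only an approximation: it assumes the field body is already unfolded, it does not handle comments, and it is more lenient than the <token> syntax of RFC 2045. parse_injection_info is a hypothetical name.

```python
import re

# name, optional white space, "=", optional white space, then either a
# quoted-string or a run of characters without white space or separators.
_PARAM = re.compile(
    r'\s*([^\s=;"()]+)\s*=\s*("(?:[^"\\]|\\.)*"|[^\s;"()]+)\s*(?:;|$)'
)


def parse_injection_info(body: str):
    """Split an unfolded Injection-Info body into (path_identity, params)."""
    head, _, rest = body.partition(";")
    path_identity = head.strip()
    params = {}
    pos = 0
    while pos < len(rest):
        match = _PARAM.match(rest, pos)
        if match is None:
            raise ValueError("cannot parse parameter at offset %d" % pos)
        name, value = match.group(1).lower(), match.group(2)
        if value.startswith('"'):
            # Strip the quotes and undo backslash escapes.
            value = re.sub(r"\\(.)", r"\1", value[1:-1])
        params[name] = value
        pos = match.end()
    return path_identity, params
```

A strict reading based on RFC 5322 alone would instead reject `posting-host = "192.0.2.1"` because of the space before the "=".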

Just afterwards:


Julien ÉLIE | 4 Jan 2010 08:15

External complaints about the deprecated Lines: header


Hi,

Ray, the administrator of news.eternal-september.org, is currently having issues
with his users; they complain about the fact that a news server is no longer
required to generate a Lines: header.

Lots of news readers suddenly break...
Especially when they do not use the overview data for the :lines count but
only rely on the Lines: header.
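
For what it is worth, a news reader does not have to depend on the header. Here is a minimal sketch of a fallback, assuming the reader has the headers in a dict with lowercased names and the body as raw bytes with CRLF line endings. article_line_count is a made-up name.

```python
from typing import Mapping


def article_line_count(headers: Mapping[str, str], body: bytes) -> int:
    """Use the Lines: header when it is present and numeric,
    otherwise count the lines of the body directly."""
    value = headers.get("lines", "").strip()
    if value.isascii() and value.isdigit():
        return int(value)
    if not body:
        return 0
    # A final line without a trailing CRLF still counts as a line.
    return body.count(b"\r\n") + (0 if body.endswith(b"\r\n") else 1)
```

A reader that has overview data could use the :lines item in place of the body count.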

Ray pointed me to this thread in alt.free.newsservers:
    http://groups.google.fr/group/alt.free.newsservers/browse_frm/thread/438e64aa1e740f3e

I am just relaying the issue here (in case someone wants to participate).

-- 
Julien ÉLIE

« When a newly married couple smiles, everyone knows why.
  When a ten-year married couple smiles, everyone wonders why. »

Julien ÉLIE | 31 Dec 2009 17:31

Syntax validation of articles by injecting agents


Hi,

RFC 5537 mentions that an injecting agent MUST reject any proto-article
that is not syntactically valid as defined by RFC 5536.

What is the best way to do that then?
Is it safe to implement that requirement?  RFC 5536 is said to
"reflect current practice", but if we enforce that MUST, I believe
it will break lots of news readers.

NN for instance does not generate MIME-Version: header fields
although "user agents MUST meet the definition of MIME conformance"
("a mail user agent that is MIME-conformant MUST always generate
a "MIME-Version: 1.0" header field in any message it creates").
I believe this sentence applies to news user agents too, otherwise
a reference to MIME is useless.

And what if a news reader generates an incorrect User-Agent: header
field?  or if it always adds a tail-entry which is not a path-nodot
in Path:?  All its posts will be rejected by an RFC-compliant injecting
agent...
Is that the intention?

I quite understand that it would help to have more compliant
articles.  For instance, rejecting articles with "all" in their
distribution list.
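
As an example of such a targeted check, here is a small sketch that flags "all" in a Distribution: field body. It assumes the body is already unfolded and free of comments; distribution_problems is a hypothetical helper, not something an existing injecting agent provides.

```python
def distribution_problems(field_body: str) -> list[str]:
    """Return a list of problems found in a Distribution: field body."""
    problems = []
    for name in field_body.split(","):
        name = name.strip()
        if not name:
            problems.append("empty distribution name")
        elif name.lower() == "all":
            problems.append('"all" must not be used as a distribution name')
    return problems
```

An injecting agent could reject the proto-article when the list is non-empty, or only log the problems during a transition period.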

But in some cases, people would need to upgrade their news
readers...  (and maybe change their news readers if it is