Keith Moore | 1 Aug 2004 23:13

Re: Mandatory From field, anonymity, and hacks


> >> The TLD ".invalid" already has the required
> >> property, and is registered with IANA as being guaranteed never to  
> >> resolve to anything.
> >
> > that doesn't do a thing to keep the root servers from getting hit.
> 
> It does if user agents that receive ill-considered requests to send mail  
> to such addresses recognize that particular case and don't waste time  
> looking it up.

If we were talking about a completely new protocol, I might agree with you.
But Usenet and email are both widely deployed, and it's not reasonable
to expect the installed base of Usenet and email software to change to 
avoid the impact on the root servers.  

(for the same reason, the .local TLD as used by a certain poorly designed
local name lookup system is also a bad idea.)
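For concreteness, the short-circuit proposed in the quoted text (a user agent recognising an address under ".invalid" and skipping the DNS lookup) might look like the sketch below. The function names are hypothetical and not taken from any existing mail or news software; the check only inspects the text after the last "@" and makes no attempt at full address parsing.

```python
def is_reserved_invalid(address: str) -> bool:
    """Return True if the address's domain falls under the reserved .invalid TLD.

    Hypothetical helper: takes everything after the last "@" as the domain,
    ignores a trailing root dot, and compares case-insensitively.
    """
    domain = address.rpartition("@")[2].rstrip(".").lower()
    return domain == "invalid" or domain.endswith(".invalid")


def should_lookup(address: str) -> bool:
    """Decide whether a user agent should bother resolving the domain at all."""
    return not is_reserved_invalid(address)
```

Note that this only helps for software that actually contains such a check, which is the crux of the objection about the installed base.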

Charles Lindsey | 3 Aug 2004 00:58

Re: Mandatory From field, anonymity, and hacks


On Sat, 31 Jul 2004 17:37:29 -0400, Bruce Lilly <blilly <at> erols.com> wrote:

>
> Charles Lindsey wrote:

>> It says
>>
>>       ".invalid" is intended for use in online construction of domain
>>       names that are sure to be invalid and which it is obvious at a
>>       glance are invalid.
>
> That is not a recommendation for anything -- it is a statement of intent;
> intent for use "online" and not via hard-coding -- and is certainly not
> a recommendation for use in address fields.

It is a statement of what that TLD is intended to be used for, with the  
expectation that it may well turn up online - i.e. in actual  
communications using assorted internet protocols.

However, if you want to see what the actual authors of RFC 2606 intended  
it to mean, you might care to look at
<http://purl.net/net/msgid/aacu37$clc$1 <at> krell.zikzak.de>,
which not only confirms its appropriateness within Netnews, quoting the  
very same texts I have been quoting at you, but also indicates that  
Netnews seems to have been one of the protocols particularly in mind when  
the .invalid TLD was invented.

>
> You have utterly missed the point of RFC 3696. The point is that

Charles Lindsey | 3 Aug 2004 00:58

Re: Mandatory From field, anonymity, and hacks


On Sun, 1 Aug 2004 17:13:01 -0400, Keith Moore <moore <at> cs.utk.edu> wrote:

> If we were talking about a completely new protocol, I might agree with  
> you.
> But Usenet and email are both widely deployed, and it's not reasonable
> to expect the installed base of Usenet and email software to change to
> avoid the impact on the root servers.

There is a lot that is wrong with Usenet, which is precisely why a new  
standard is needed. Yes, new features in the new standard will take time  
to settle down, and there may be a little pain in the meantime, as is  
always the case when things change. But, in this particular case, nothing  
actually breaks.

Moreover, the use of the TLD ".invalid" in munged addresses is already  
fairly widespread, and is already imposing less load on the DNS system  
than the use of other arbitrary bogus domains - a practice which is  
currently even more widespread than the use of ".invalid".

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl <at> clerew.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5

Keith Moore | 3 Aug 2004 06:12

Re: Mandatory From field, anonymity, and hacks


> But, in this particular case, nothing actually breaks.

as long as you don't consider pounding the root servers "breakage".

> Moreover, the use of the TLD ".invalid" in munged addresses is already  
> fairly widespread

that's no reason to make it worse.

I think this is what is known as a "showstopper".

ned+ietf-822 | 3 Aug 2004 06:54

Re: Has IANA gone mad?


> ned+ietf-822 <at> mrochek.com  wrote on 29.07.04 in <01LD1S0HX57M00005R <at> mauve.mrochek.com>:

> > > Kai Henningsen wrote:
> >
> > > > I found out about the datatracker about a week ago, either from
> > > > something here or possibly on ietf <at> ietf.org - I don't remember exactly.
> > > >
> > > > It may have been discussed at length on various lists, but that
> > > > obviously wasn't enough except for someone who was following one of
> > > > those lists at that time. Which, it seems, didn't include me, or Bruce,
> > > > or presumably a large number of other people.
> >
> > > It's certainly been discussed in some of the plenaries. Some ADs also
> > > make a point of letting you know about it when your I-D gets to the
> > > point of being tracked.
> >
> > And also during apps area sessions, if memory serves. I'm sure it has been
> > covered in WG chair training as well, although I haven't been there to see
> > it.
> >
> > Document authors receive automatic notification of tracker state changes
> > for their documents. This may have been extended to WG chairs by now - I
> > think that was the plan some time back.
> >
> > It was also announced on the main IETF announcement list back on 4-Nov-2002.
> > Discussion has subsequently occurred on that list and elsewhere.

> Which pretty much proves my point. It explicitly gets told to a small
> number of people, and was discussed in a small number of places.

Laird Breyer | 5 Aug 2004 04:52

RFC validation samples?


Hi everybody,

I'm new to this list. I've started implementing a mail header
scanner/parser which will eventually be released under the GPL, as
part of a wider mail classification package I'm working on (homepage
on sourceforge: http://dbacl.sourceforge.net).

I'm really a newby on mail headers, but I've come across some
discrepancies in the RFC2821/RFC2822 grammars which led me to this
group. My current subgoal is to validate each header line separately
according to the four standards 821/822/2821/2822 and any other
relevant ones. So each header line will be marked by all the standards
which apply.

What I would like to know is if there are any publicly available
validation sample messages which I can use to check correctness of my
parser.  Apologies if this has been discussed on the list before, I
have only skimmed the archives. Any other comments and pointers welcome.

Regards,
-- 
Laird Breyer.

Keith Moore | 5 Aug 2004 16:52

Re: RFC validation samples?


My suggestion is that you ask yourself - do you want a parser that 
validates email or do you want a parser that is useful for email 
readers or do you want a parser that can _correct_ malformed email? It 
can be difficult to do more than one of these at the same time.

If you want the latter, you generally need to parse things according to 
RFC 822 rather than 2822, because 822's grammar is simpler and more 
permissive and more representative of what is out there.  And you need 
to look not just at the specifications but also at common kinds of 
errors.  For instance, dates are often malformed (in a wide variety of 
ways), and "." often appears in a phrase before an address.

One way to get an idea of common kinds of errors would be to use a 
strict email syntax checker to validate a large body of stored email 
(say, from various mailing list archives).  Then you could look at the 
discrepancies you find and use that information to write a looser 
parser for use by email readers.  Or maybe people here could contribute 
to a list of common email format errors.
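A rough sketch of that survey, assuming Python's standard email package: run a few checks over stored messages and tally what fails. The two checks here (an unparseable Date field, and an unquoted "." in the phrase before an angle address) are just the examples mentioned above; the name audit, the tally labels, and the regular expression are illustrative assumptions. Note also that parsedate_tz is itself fairly lenient, so it is only a stand-in for a genuinely strict grammar-based checker.

```python
import re
from collections import Counter
from email import message_from_string
from email.utils import parsedate_tz

# Illustrative pattern: an unquoted phrase containing "." before "<addr>".
# Strictly, such a phrase would have to be a quoted-string.
DOT_IN_PHRASE = re.compile(r'^[^"<]*\.[^"<]*<[^>]*>\s*$')


def audit(raw_messages):
    """Tally simple header problems across an iterable of raw message strings."""
    problems = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        date = msg.get("Date")
        # parsedate_tz returns None when it cannot make sense of the value.
        if date is None or parsedate_tz(date) is None:
            problems["bad-date"] += 1
        if DOT_IN_PHRASE.match(msg.get("From", "")):
            problems["dot-in-phrase"] += 1
    return problems
```

Fed from a mailing list archive (for instance via the standard mailbox module), the resulting counts give a first idea of which deviations a looser parser most needs to tolerate.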

Keith

> I'm new to this list. I've started implementing a mail header
> scanner/parser which will eventually be released under the GPL, as
> part of a wider mail classification package I'm working on (homepage
> on sourceforge: http://dbacl.sourceforge.net).
>
> I'm really a newby on mail headers, but I've come across some
> discrepancies in the RFC2821/RFC2822 grammars which led me to this
> group. My current subgoal is to validate each header line separately

Nick Ing-Simmons | 5 Aug 2004 18:15

Re: RFC validation samples?


Keith Moore <moore <at> cs.utk.edu> writes:
>My suggestion is that you ask yourself - do you want a parser that 
>validates email or do you want a parser that is useful for email 
>readers or do you want a parser that can _correct_ malformed email? It 
>can be difficult to do more than one of these at the same time.

The last two can be combined to some extent.

>
>If you want the latter, you generally need to parse things according to 
>RFC 822 rather than 2822, because 822's grammar is simpler and more 
>permissive and more representative of what is out there.  And you need 
>to look not just at the specifications but also at common kinds of 
>errors.  For instance, dates are often malformed (in a wide variety of 
>ways), and "." often appears in a phrase before an address.
>
>One way to get an idea of common kinds of errors would be to use a 
>strict email syntax checker to validate a large body of stored email 
>(say, from various mailing list archives).  Then you could look at the 
>discrepancies you find and use that information to write a looser 
>parser for use by email readers.  Or maybe people here could contribute 
>to a list of common email format errors.

Those are all sensible suggestions. Several Perl modules that do this
kind of thing are widely used (by SpamAssassin, among others) and have
been "trained" (i.e. manually tweaked until they stopped complaining)
on lots of mail, so they are now very robust.
If you can read Perl, their code would give hints on common problems.


Keith Moore | 5 Aug 2004 18:38

Re: RFC validation samples?


> The other thing a "useful" library for email readers would need to do 
> is co-exist nicely with tools/library that understands MIME headers. 
> The MIME headers are often mis-formatted.

I find it helpful to think of [2]822/MIME (including 2047) as a single format.

Bruce Lilly | 6 Aug 2004 02:08

Re: RFC validation samples?


Laird Breyer wrote:

> I'm new to this list. I've started implementing a mail header
> scanner/parser which will eventually be released under the GPL, as
> part of a wider mail classification package I'm working on (homepage
> on sourceforge: http://dbacl.sourceforge.net).

If you are going to classify message content, you'll need to be able
to parse the message body in addition to the header.  For MIME
messages, you'll also need to be able to handle MIME-part headers,
boundary delimiters, and body sections, including base64 and
quoted-printable encoded body content.
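To give an idea of what that involves, here is a minimal sketch using Python's standard email package (not the C code of the package under discussion). It walks the MIME parts of a message and undoes the base64 or quoted-printable transfer encoding of each leaf part. The function name text_parts is hypothetical, and falling back to us-ascii when no charset is declared is an assumption of this sketch.

```python
from email import message_from_string


def text_parts(raw):
    """Yield (content_type, decoded_text) for each non-multipart MIME part."""
    msg = message_from_string(raw)
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        # decode=True reverses base64 / quoted-printable and returns bytes.
        payload = part.get_payload(decode=True)
        charset = part.get_content_charset() or "us-ascii"
        yield part.get_content_type(), payload.decode(charset, errors="replace")
```

A classifier would then work on the decoded text of each part rather than on the encoded bytes as they appear on the wire.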

> I'm really a newby on mail headers, but I've come across some
> discrepancies in the RFC2821/RFC2822 grammars which led me to this
> group.

There are indeed a few differences; some are because RFC 2821 is
SMTP-specific whereas RFC 2822 is intended to be a general message
format (so, for example, RFC 2822 has a very loose definition of
"domain-literal" and of the Received header field).  A very few
are genuine incompatibilities (likely to be corrected in the next
revisions, due soon).

> My current subgoal is to validate each header line separately
> according to the four standards 821/822/2821/2822 and any other
> relevant ones. So each header line will be marked by all the standards
> which apply.
