Chris Newman | 1 Jun 2007 04:41

Re: #1485: Choice of body part for transport of UTF8SMTP messages

NOTE: This is my opinion as a technical contributor, not as area director.

When an SMTP server advertises "UTF8SMTP", it is saying "I accept either RFC 
822 or UTF8SMTP messages".  So the DATA command (or BDAT/BURL command) is not 
a type label, it's merely a mechanism to carry one of the supported types. 
Indeed SMTP was originally designed as a generic transfer protocol and the fact 
that RFC 822 / MIME is the only format presently used is an accident of history.

What concerns me with UTF8SMTP content in a message/rfc822 body part is the use 
of an explicit type label in a way that is known to be incompatible with the 
current defined meaning for that type.  Perhaps we could get away with it if we 
could guarantee it would never ever leak to a non-UTF8SMTP system.  But this is 
a redefinition of the meaning of a standards track type label as opposed to an 
isolated extension to existing protocols.  I'm a lot more comfortable with 
message/utf8smtp as today's MIME compliant systems are expected to treat 
unknown message subtypes as equivalent to application/octet-stream, much as 
systems unfamiliar with application/zip treat it as an opaque object.

With message/utf8smtp, existing message/rfc822 systems can be vetted to make 
sure they're UTF-8 clean, address the UTF-8 security considerations and then 
add "message/utf8smtp" to the list of compound types they descend.
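
A minimal sketch of the behaviour described above, assuming a reader that keeps a local list of compound types it is willing to descend into. The names `DESCEND_TYPES` and `treatment` are hypothetical and do not come from any draft; the point is only that an unvetted system sees message/utf8smtp as an opaque object, while a vetted one opts in by extending its list.

```python
# Hypothetical, implementation-local list of compound message types
# that this reader knows how to parse and descend into.
DESCEND_TYPES = frozenset({"message/rfc822"})


def treatment(content_type, descend_types=DESCEND_TYPES):
    """Return how a MIME reader handles a body part of this type."""
    # Drop any parameters (";name=...") and normalise case.
    ctype = content_type.split(";", 1)[0].strip().lower()
    if ctype.startswith("multipart/") or ctype in descend_types:
        return "descend"
    if ctype.startswith("message/"):
        # Unknown message subtype: handled like application/octet-stream.
        return "opaque"
    return "leaf"


# A system vetted as UTF-8 clean opts in by extending its own list.
VETTED_TYPES = DESCEND_TYPES | {"message/utf8smtp"}
```

With these assumptions, `treatment("message/utf8smtp")` gives "opaque" on an existing system and `treatment("message/utf8smtp", VETTED_TYPES)` gives "descend" on a vetted one.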

                - Chris

Kari Hurtta wrote on 5/27/07 6:42 PM +0300:
> Harald Tveit Alvestrand <harald <at> alvestrand.no> writes in gmane.ietf.ima:
>
>> Subject: [psg.com #1485]: UTF8HDR 4.6/DSN: Choice of body part for
>> transport of UTF8SMTP messages
>>

John C Klensin | 1 Jun 2007 17:13

Re: #1485: Choice of body part for transport of UTF8SMTP messages

Hi.  

I have personally gone back and forth on this one, but, at this
point, find myself largely in agreement with Chris.  See below.

--On Thursday, 31 May, 2007 19:41 -0700 Chris Newman
<Chris.Newman <at> Sun.COM> wrote:

> NOTE: This is my opinion as a technical contributor, not as
> area director.
> 
> When an SMTP server advertises "UTF8SMTP", it is saying "I
> accept either RFC 822 or UTF8SMTP messages".  So the DATA
> command (or BDAT/BURL command) is not a type label, it's
> merely a mechanism to carry one of the supported types. Indeed
> SMTP was originally designed as a generic transfer protocol
> and the fact RFC 822 / MIME is the only format presently used
> is an accident of history.

As a historical note on that accident, some of us tried to
preserve that "generic transfer" idea for many years.  The
notion of being able to accept an envelope-only message (i.e.,
construct headers on delivery if at all) was actually critical
to the successful working of a number of gateways, just as the
ability to accept header-only (RFC822, more or less) messages
and construct an envelope in flight was essential for others. 

The issue was especially important before DRUMS deprecated the
SMTP SEND command: many of us believed that the message data for
SEND should be sent header-free since display of the headers on
[...]

Harald Alvestrand | 3 Jun 2007 22:52

Note on the design team

My AD reminded me that it is wise to keep the WG informed and updated on 
the existence of a design team, per standard IETF guidelines. So here goes:

This WG has a design team consisting of the chairs and the document 
editors. The purpose of the team is to provide an opportunity to get 
quick feedback on documents and help suggest solutions to problems 
raised in the WG.

The design team's role is not to make decisions (that is the prerogative 
of the WG), but to suggest proposals.
The team meets occasionally by Jabber.

The current members are:

Abel Yang
Alexey Melnikov
Chris Newman
Edmon Chung
Kazunori Fujiwara
Harald Alvestrand
YAO Jiankang
John Klensin
Xiaodong Lee
Yangwoo Ko
Randall Gellens
Pete Resnick
Nai-Wen Hsu
Yao Jiankang
Yoshiro Yoneya
Yangwoo Ko

Harald Alvestrand | 4 Jun 2007 00:48

Re: Re: #1485: Choices

Yuri Inglikov wrote:
> +1 for message/rfc822.
>
> I think that even if it leaks outside UTF8SMTP environment without
> downgrading (which /normally/ should not happen), it is better if older
> applications can at least partially interpret it / show most of the
> content than let them deal with unknown subtype of a message/ (which
> they certainly unable to) or deal with unstructured blob attachment
> (application/utf8smtp or anything like that). I don't quite understand
> the hesitation to extend the meaning of message/rfc822 to allow UTF8
> content in UTF8SMTP environment. Isn't it very similar to allowing any
> other incompatible thing in such environment, like address headers with
> UTF8 local parts? Any such extensions likely will cause same problems
> if leaked outside UTF8SMTP environment.

And using message/rfc822 would offer an easy and convenient means of 
leakage.

Note that in the case of signed messages, you CANNOT downgrade on the 
border; in the case of encrypted messages, you can't even detect the 
MIME type of the inner part.

> On the other hand, most applications are robust enough to deal with
> unexpected / malformed MIME content and could be able to reasonably
> recover in most cases if faced with UTF8 embedded message.
>
> Yuri Inglikov
>
> _______________________________________________
> IMA mailing list
> IMA <at> ietf.org
> https://www1.ietf.org/mailman/listinfo/ima
>
>   

Charles Lindsey | 4 Jun 2007 13:09

Re: #1485: Choice of body part for transport of UTF8SMTP messages

On Fri, 01 Jun 2007 03:41:18 +0100, Chris Newman <Chris.Newman <at> Sun.COM>
wrote:

> What concerns me with UTF8SMTP content in a message/rfc822 body part is  
> the use of an explicit type label in a way that is known to be  
> incompatible with the current defined meaning for that type.  Perhaps we  
> could get away with it if we could guarantee it would never ever leak to  
> a non-UTF8SMTP system.

Yes, that is an essential feature of any proposal to allow UTF8SMTP
content in a message/rfc822.

>  But this is a redefinition of the meaning of a standards track type  
> label as opposed to an isolated extension to existing protocols.

That might well be a valid argument were it not for the fact that we have
already changed what is allowed within multipart/* types (when within the
UTF8SMTP universe, of course). That is now solidly written into both
the utf8headers and downgrade drafts, and no one has raised any ISSUE
against it. It leads, of course, straight to a requirement to scan the
full body of each message looking for UTF-8 in suspicious places before
you can declare that it is not a UTF8SMTP message, but we agreed on all
that at the time when we abolished the Header-Type header field.

Now, since a message/rfc822 is rather similar in its structure to a
multipart with only one part, it follows that whatever mechanisms an
implementor has to deploy to deal with our extended multipart/* types is
readily adaptable to deal with an extended message/rfc822, and whatever
risks arise from UTF-8 leaking into the non-UTF8SMTP world are identical
in the two cases.

Charles Lindsey | 4 Jun 2007 16:04

Re: Re: #1485: Choices

On Sun, 03 Jun 2007 23:48:16 +0100, Harald Alvestrand  
<harald <at> alvestrand.no> wrote:

> Yuri Inglikov wrote:
>> +1 for message/rfc822.
>>
>> I think that even if it leaks outside UTF8SMTP environment without  
>> downgrading (which /normally/ should not happen), it is better if older  
>> applications can at least partially interpret it / show most of the  
>> content than let them deal with unknown subtype of a message/ (which  
>> they certainly unable to) or deal with unstructured blob attachment  
>> (application/utf8smtp or anything like that). I don't quite understand  
>> the hesitation to extend the meaning of message/rfc822 to allow UTF8  
>> content in UTF8SMTP environment. Isn't it very similar to allowing any  
>> other incompatible thing in such environment, like address headers with  
>> UTF8 local parts? Any such extensions likely will cause same problems  
>> if leaked outside UTF8SMTP environment.
> And using message/rfc822 would offer an easy and convenient means of  
> leakage.

No more than using multipart/whatever with UTF-8 in Content-Description  
fields within its parts. Whilst all sorts of weird and wonderful  
departures from standards undoubtedly exist in the present network, if we  
cannot assume that implementors of a newly deployed feature such as  
UTF8SMTP will follow our requirements, then there is not much hope for  
EAI. All that is required to prevent leakage is for MTAs that do  
downgrading to inspect all contained message/rfc822s to see if their  
headers contain 8bit (and so, recursively, for inner objects within the  
message/rfc822).
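
The recursive inspection described above could be sketched roughly as follows. This is an illustration only, not something taken from the drafts: the function name `contains_8bit_headers` is hypothetical, and it assumes that Python's standard `email` parser is an acceptable stand-in for an MTA's own MIME walker. `Message.walk()` visits the outer message, every multipart child, and the message nested inside each message/rfc822 part.

```python
from email import message_from_bytes


def contains_8bit_headers(msg):
    """Return True if any header, at any nesting depth, is non-ASCII."""
    # walk() yields the message itself, then every sub-part, including
    # the message embedded in a message/rfc822 part and its own parts.
    for part in msg.walk():
        for _name, value in part.items():
            # Raw 8-bit bytes come back from the parser as non-ASCII
            # characters, so any code point above 127 marks the header.
            if any(ord(ch) > 127 for ch in str(value)):
                return True
    return False


# An ASCII outer message carrying an inner message with a UTF-8 Subject.
RAW = (
    b"From: a@example.com\r\n"
    b"Content-Type: message/rfc822\r\n"
    b"\r\n"
    b"Subject: caf\xc3\xa9\r\n"
    b"\r\n"
    b"body\r\n"
)
NEEDS_DOWNGRADE = contains_8bit_headers(message_from_bytes(RAW))
```

Only header fields are examined here; 8-bit data in a body is a separate question (8BITMIME) and is deliberately left alone. A signed or encrypted inner part would of course defeat this check, as noted earlier in the thread.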
>

John C Klensin | 4 Jun 2007 18:56

Re: Re: #1485: Choices


--On Monday, June 04, 2007 3:04 PM +0100 Charles Lindsey 
<chl <at> clerew.man.ac.uk> wrote:

>...
>>> with   UTF8 local parts? Any such extensions likely will
>>> cause same problems   if leaked outside UTF8SMTP environment.
>> And using message/rfc822 would offer an easy and convenient
>> means of   leakage.
>
> No more than using multipart/whatever with UTF-8 in
> Content-Description fields within its parts. Whilst all sorts
> of weird and wonderful departures from standards undoubtedly
> exist in the present network, if we cannot assume that
> implementors of a newly deployed feature such as UTF8SMTP will
> follow our requirements, then there is not much hope for
> EAI. All that is required to prevent leakage is for MTAs that
> do downgrading to inspect all contained message/rfc822s to see
> if their headers contain 8bit (and so, recursively, for inner
> objects within the message/rfc822).

Charles, having started out in favor of message/rfc822 and 
gradually changed my mind, I think the concern is that there are 
ways that things leak even in what is generally a conforming 
environment.  We've had experience with that over and over 
again, to the point that it is hard to pretend that leaks cannot 
(or will not) happen.  I fear that examples abound and that, 
while any one of them can be dismissed with an understanding of 
what can be done to prevent it, their number is such that we are 
very constrained if we want to be safe about these things in 
[...]

Internet-Drafts | 4 Jun 2007 21:50

I-D ACTION:draft-ietf-eai-smtpext-06.txt

A New Internet-Draft is available from the on-line Internet-Drafts 
directories.
This draft is a work item of the Email Address Internationalization Working Group of the IETF.

	Title		: SMTP extension for internationalized email address
	Author(s)	: J. Yao, W. Mao
	Filename	: draft-ietf-eai-smtpext-06.txt
	Pages		: 20
	Date		: 2007-6-4
	
This document specifies the use of SMTP extension for
   internationalized email address delivery.  Communication with systems
   that do not implement this specification is specified in another
   document.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-eai-smtpext-06.txt

To remove yourself from the I-D Announcement list, send a message to 
i-d-announce-request <at> ietf.org with the word unsubscribe in the body of 
the message. 
You can also visit https://www1.ietf.org/mailman/listinfo/I-D-announce 
to change your subscription settings.

Internet-Drafts are also available by anonymous FTP. Login with the 
username "anonymous" and a password of your e-mail address. After 
logging in, type "cd internet-drafts" and then 
"get draft-ietf-eai-smtpext-06.txt".

A list of Internet-Drafts directories can be found in
[...]

Charles Lindsey | 5 Jun 2007 13:15

Re: Re: #1485: Choices

On Mon, 04 Jun 2007 17:56:03 +0100, John C Klensin <klensin <at> jck.com> wrote:

> Charles, having started out in favor of message/rfc822 and gradually  
> changed my mind, I think the concern is that there are ways that things  
> leak even in what is generally a conforming environment.  We've had  
> experience with that over and over again, to the point that it is hard  
> to pretend that leaks cannot (or will not) happen.  I fear that examples  
> abound and that, while any one of them can be dismissed with an  
> understanding of what can be done to prevent it, their number is such  
> that we are very constrained if we want to be safe about these things in  
> practice.

I suspect that leaks of UTF8SMTP messages into message/rfc822 objects  
visible in the non-utf-8 environment will mostly arise from current MUAs  
that manage to acquire a UTF8SMTP message somehow (not supposed to happen,  
but it will) and then try to forward it as a message/rfc822 attachment  
(clearly, such MUAs will not be aware that the attachment should have been  
message/utf8smtp or application/utf8smtp). So there is absolutely nothing  
we can do to stop this from happening; the best thing we can hope to do is  
to arrange that any UTF8SMTP-capable MTA that detects such a beast can at  
least downgrade it to something legal (unless the onward path is  
UTF8SMTP-capable, of course).

I doubt that such leaks are going to arise from MTAs that are truly  
UTF8SMTP-capable, which means that they are unlikely to arise in the  
course of DSNs that are simply returning the whole of the original  
UTF8SMTP message, since if they are smart enough to generate a DSN  
according to our documents, then they are probably smart enough to  
downgrade the returned message to something that is safe on the existing  
network (in cases where the return path is not UTF8SMTP).

John C Klensin | 5 Jun 2007 21:16

ISSUE: Trivial downgrading

Hi.

Kari and I have been having an offlist discussion that has 
inspired a thought.

Suppose one has a message that has UTF-8 headers, but there are 
no non-ASCII addresses in the headers (either forward or 
backward-pointing).  This may actually turn out to be a common 
case for people who stick with ASCII addresses (either for 
interoperability or because, e.g., their names don't happen to 
require non-ASCII characters even though their languages, written 
in extended "Latin" scripts, do).  But, if they are using 
UTF8SMTP-aware MUAs and Submission servers (MSAs), they are 
likely to be producing non-ASCII subject lines (as is the case 
today), but sending them as UTF-8 rather than as encoded words. 
There are some other cases, but this is the important one.

At present, our rules appear to send such messages through the 
entire downgrade process, generating Downgrade headers, etc.

I wonder if, to make that class of cases efficient and maximize 
its utility, we ought to have a different downgrade path that 
simply:

    * Converts the known header fields containing UTF-8 data
    into the same fields with encoded words.

    * Documents the conversion only in trace information (i.e.,
    no "Downgraded:" headers).
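
As a rough illustration of the first step, the conversion of one unstructured field value (a Subject, say) into encoded words might look like the sketch below. The function names are hypothetical, the choice of Python's standard `email.header` module is an assumption, and structured fields such as addresses would need separate handling that is not shown.

```python
from email.header import Header, decode_header, make_header


def to_encoded_words(value):
    """Return an ASCII-only form of an unstructured header value."""
    if value.isascii():
        # Nothing to downgrade; leave the field untouched.
        return value
    # RFC 2047 encoded-word form; the library picks B or Q encoding.
    return Header(value, charset="utf-8").encode()


def from_encoded_words(value):
    """Inverse operation, as a receiving MUA would apply it."""
    return str(make_header(decode_header(value)))
```

Because the transformation is reversible by any MUA that already understands encoded words, nothing is lost that a "Downgraded:" header would need to record, which seems to be the attraction of this path.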


