Re: disposition vs. media type
Frank Ellermann <nobody <at> xyzzy.claranet.de>
2006-09-19 07:27:57 GMT
Bruce Lilly wrote:
>>> A MIME-part has inline disposition if and only if it
>>> contains a Content-Disposition MIME-part header field
>>> specifying inline disposition.
>> I'd consider a MUA not displaying US-ASCII text/plain as
>> plain text as broken, YMMV.
> Media type (text/plain, etc.) and disposition (inline,
> attachment, etc.) are distinct characteristics of a MIME-part
Where they are specified. Your objection was to my remark
about a part with no Content header field at all. In that
case it's either the default message/rfc822 for a part of
multipart/digest, or default text/plain US-ASCII 7bit for
all other multiparts (RFC 2046 5.1.1).
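A minimal sketch of those defaults in Python, for illustration
only. The function name default_content_type and its arguments
are made up here; only the two default values come from the
text above.

```python
def default_content_type(explicit_type=None, parent_type=None):
    """Return the media type to assume for a body part.

    explicit_type: value of the part's Content-Type field, or None if absent.
    parent_type:   media type of the enclosing multipart, or None at top level.
    """
    if explicit_type is not None:
        # A specified type always wins; defaults apply only when it is absent.
        return explicit_type.split(";")[0].strip().lower()
    if parent_type is not None and parent_type.lower() == "multipart/digest":
        # Parts of a digest default to an embedded message.
        return "message/rfc822"
    # Everything else defaults to text/plain (US-ASCII, 7bit).
    return "text/plain"
```

With no Content header field at all, default_content_type()
gives "text/plain", and inside a multipart/digest it gives
"message/rfc822".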
For a simple message/rfc822 without Content header fields
that's ordinary mail, no tricks, no traps, no attachment.
If you and your MUA wish to look for, say, UUE, that's
your private business.
> Incidentally, RFC 2049 specifies that MIME-conforming UAs
> must treat text/plain (and all other media types) with an
> unrecognized transfer encoding as application/octet-stream
An unrecognized CTE isn't the same as no explicit CTE. The
default is 7bit (RFC 2045 6.1). For unrecognized subtypes
of text the default type is text/plain (RFC 2046 4.1.4), and
any charset other than US-ASCII must be specified (4.1.2).
From that I guess that UAs could be free to panic if they
find an 8bit octet in wannabe-default text/plain US-ASCII,
and enter "unrecognized CTE mode". But that's not what you
found in RFC 2049, where "unrecognized" IMO means "the UA
doesn't know the explicitly specified CTE", e.g. CTE x-yenc.
Clearly there must be an explicit CTE as soon as there are
8bit bytes in mail (RFC 2046 4.1.2 on page 8).
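To make the unspecified / unrecognized distinction concrete,
here is a rough Python sketch. The names classify_cte and
effective_type are hypothetical, and the set of known encodings
is simply the five that RFC 2045 defines; a real UA might know
more.

```python
# The five transfer encodings defined by RFC 2045.
KNOWN_CTES = {"7bit", "8bit", "binary", "quoted-printable", "base64"}


def classify_cte(header_value):
    """Return (cte, status) for a Content-Transfer-Encoding value.

    header_value is None when the field is absent.
    """
    if header_value is None:
        # No explicit CTE: the default applies, nothing is "unrecognized".
        return ("7bit", "unspecified")
    cte = header_value.strip().lower()
    if cte in KNOWN_CTES:
        return (cte, "recognized")
    # An explicit CTE this UA does not know, e.g. x-yenc.
    return (cte, "unrecognized")


def effective_type(media_type, cte_header):
    """Only an explicit but unknown CTE turns the part into octet-stream."""
    _, status = classify_cte(cte_header)
    if status == "unrecognized":
        return "application/octet-stream"
    return media_type
```

So effective_type("text/plain", None) stays "text/plain", while
effective_type("text/plain", "x-yenc") becomes
"application/octet-stream".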
> The same section, paragraph "(6)" states that if an
> unrecognized charset is specified, text media types should
> be treated as application/octet-stream
Yes, I'll test it with this message, let's see what our
UAs do.
> RFC 2046 states that behavior only if the subtype is
> also unrecognized
Again 4.1.4, I see it. With that text/foo; charset=bar
is a convoluted way to get application/octet-stream.
If that's an intentional difference from RFC 2049 I miss
the point. An attempt: text/html; charset=bar. For a
recognized text subtype there could be a more elaborate
fallback, e.g. Latin-1 in that case. Because RFC 2049
discusses "minimal conformance" such cases are out of
scope.
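A sketch of how the two readings could be combined, again in
Python and purely as an assumption about one possible UA. The
function treat_text_as, the lenient flag, and both "known" sets
are invented for this example; the Latin-1 fallback is only the
speculative elaboration mentioned above, not something either
RFC requires.

```python
# What this hypothetical UA is able to render.
KNOWN_SUBTYPES = {"plain", "html"}
KNOWN_CHARSETS = {"us-ascii", "iso-8859-1", "utf-8"}


def treat_text_as(subtype, charset="us-ascii", lenient=False):
    """Return (media_type, charset) describing how to handle a text/* part."""
    subtype, charset = subtype.lower(), charset.lower()
    if charset in KNOWN_CHARSETS:
        if subtype in KNOWN_SUBTYPES:
            return ("text/" + subtype, charset)
        # Unknown subtype, known charset: fall back to text/plain.
        return ("text/plain", charset)
    if lenient and subtype in KNOWN_SUBTYPES:
        # Beyond minimal conformance: guess Latin-1 for a known subtype.
        return ("text/" + subtype, "iso-8859-1")
    # Unknown charset: treat the part as opaque data.
    return ("application/octet-stream", None)
```

Here treat_text_as("foo", "bar") yields application/octet-stream,
and treat_text_as("html", "bar", lenient=True) yields text/html
with the guessed charset.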
> Perhaps you consider MIME-conforming behavior to be broken?
I just don't confuse unrecognized and unspecified... <beg>
> while there is interaction between transfer encoding and
> media type, and between charset and media type for text
> subtypes, disposition and media type are orthogonal.
If specified. Otherwise there are sound defaults with the
unsurprising effect of displaying plain text ASCII as plain
text ASCII.
> application/octet-stream with a disposition of "inline"
> doesn't make much sense.
That's up to the UA, it could start a hex viewer <shrug />
In reality we want it to ask if it should save this as a
file, if we care to name an application to do something
with it, or if we want to just ignore / skip / trash it.
And ideally (in my parallel universe) it's never accepted
by the MSA or MX, let the sender figure out how to get it
right.
> US-ASCII text/plain with a specified disposition of
> "attachment" should never be displayed inline by a
> 2183-conforming UA.
Certainly it shouldn't without asking, if it supports
RFC 2183 (not covered by RFC 2049 for obvious reasons).
That's again a case of a _specified_ disposition, and
your objection was to my remark about no Content header
field at all.
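The specified / unspecified split for dispositions could be
sketched like this in Python. The function display_action and
its return values are hypothetical, and the rule for non-text
parts without a disposition is a simplification chosen for the
example, not something taken from the RFCs.

```python
def display_action(media_type, disposition=None):
    """Return 'display' or 'ask' for a part.

    disposition is the Content-Disposition value, or None if the
    field is absent.
    """
    if disposition is not None:
        kind = disposition.split(";")[0].strip().lower()
        if kind == "inline":
            return "display"
        # "attachment", and per RFC 2183 any unrecognized disposition
        # type, is not shown automatically.
        return "ask"
    if media_type.lower() == "text/plain":
        # No disposition specified: plain text is shown as plain text.
        return "display"
    # Simplification for this sketch: ask about everything else.
    return "ask"
```

display_action("text/plain") gives "display", while
display_action("text/plain", "attachment; filename=x.txt")
gives "ask".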
Apparently that's the definition of an "attachment" in
RFC 2183: no automatic display without asking. Display
after asking is okay. Makes sense, e.g. for a text/html
attachment you could decide to display it as plain text.
For some attached text/plain source code you could wish
to look at it before you save it as a file. For ordinary
mail worms, checking if the raw B64 starts with TV or UE
(= boring, delete) is also okay.
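For reference, a short Python sketch of that prefix check.
Base64 text starting with "TV" decodes to bytes starting with
"MZ" (the DOS/Windows executable signature), and "UE" decodes
to "PK" (the ZIP signature). The function name
boring_b64_prefix is made up here.

```python
import base64


def boring_b64_prefix(raw_b64):
    """True if the base64 text starts with TV or UE.

    That corresponds to decoded data starting with MZ or PK.
    """
    return raw_b64.lstrip()[:2] in ("TV", "UE")


# The prefixes follow directly from the encoding of the signatures.
exe_start = base64.b64encode(b"MZ\x90\x00").decode("ascii")
zip_start = base64.b64encode(b"PK\x03\x04").decode("ascii")
```

Both exe_start and zip_start pass the check; ordinary encoded
text such as "SGVsbG8=" does not.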
> US-ASCII text/plain with no specified disposition can
> be displayed inline or not by an MUA. One which does
> not display it inline would be unusual, but also
> unusually safe, as it is not above the ethics of
> spammers, virus writers and other miscreants to mislabel
> other content as text/plain
That's IMO not unusually safe but unusually paranoid.
If a UA runs on top of a terminal emulation where weird
escape sequences can do bad things, then the "security
considerations" for the spammers would be "if you try
this add an explicit Content-Disposition: inline". Zero
advantage for the users of this paranoid UA, but more
gibberish annoying users of pre-MIME UAs and/or modems.
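One way a terminal-based UA might defuse escape sequences
without refusing to show plain text is to render control
characters in caret notation. This is only a sketch of that
idea in Python; the function show_safely is hypothetical and
not what any particular UA does.

```python
def show_safely(text):
    """Replace control characters with caret notation, e.g. ESC as ^[.

    Tab, LF and CR are kept so that ordinary text layout survives.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if ch in "\t\n\r":
            out.append(ch)
        elif code < 32 or code == 127:
            # NUL becomes ^@, ESC becomes ^[, DEL becomes ^?
            out.append("^" + chr(code ^ 0x40))
        else:
            out.append(ch)
    return "".join(out)
```

A clear-screen sequence such as "\x1b[2J" then shows up as the
harmless text "^[[2J", and a NUL byte shows up as "^@".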
> it apparently is beyond the sensibility of some UA
> programmers to avoid decoding and displaying mislabeled
> HTML, including scripting content
True, but my vision of displaying plain text is "as is".
If possible without crashing at NUL bytes (my UA does
this). Wrt NO-WS-CTL, we have already discussed this for some
years "elsewhere" (BTW, get your LC comments in shape
Frank