Yuri Inglikov | 11 Aug 2006 10:52

RE: rfc2231 implementations?


Does anybody have insight into why RFC 2047 explicitly prohibits encoding parameter values and quoted
strings? I have looked through the archives but cannot find the relevant discussion. At some point the use of
encoded-words inside quoted strings was removed from the initial proposal, and there must have been a
good reason that I don't immediately see...

Ultimately I am curious why, instead of extending RFC2047 to parameters, we ended up with RFC2231?

   + An 'encoded-word' MUST NOT appear within a 'quoted-string'.

   + An 'encoded-word' MUST NOT be used in parameter of a MIME
     Content-Type or Content-Disposition field, or in any structured
     field body except within a 'comment' or 'phrase'.

--
Yuri Inglikov

Bruce Lilly | 11 Aug 2006 12:40

RE: rfc2231 implementations?


On Fri August 11 2006 04:52, Yuri Inglikov wrote:
>
> Does anybody have insight why RFC2047 explicitly prohibits encoding parameter values and quoted
> strings? I am looking through archives but seems cannot find appropriate discussion. At some point using
> encoded words inside quoted strings was removed from the initial proposal and there should have been a
> good reason which I don't immediately see...

RFC 2047 specifically addresses non-protocol human-readable text (see RFCs
1958 and 2277 for discussions of protocol vs. text).  E.g. in
   From: Yuri Inglikov <Yuri.Inglikov <at> microsoft.com>
the display name "Yuri Inglikov" exists purely for human presentation; it
plays no role in message-related protocols (the protocols do use
"<Yuri.Inglikov <at> microsoft.com>", and RFC 2047 encoding is neither necessary
nor permitted there).

Likewise,
   Subject: RE: rfc2231 implementations?
plays no role in message-related protocols; it merely carries "only
human-readable content" (RFC 2822) for presentation to the recipient.

Conversely, parameter values such as in
   Content-Type: text/plain;
     charset="us-ascii"
are usually composed of protocol keywords (as in the charset parameter in the
example above from your message) or other non-text (in the RFC 1958/2277
sense of "text") content.  RFC 2231 does provide for language-tagging in case
a parameter is used to convey some human-readable text value.
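
To make the RFC 2231 form concrete, here is a minimal Python sketch of a
parameter value that carries a charset and a language tag.  The helper names
encode_ext_value and decode_ext_value and the sample filename are illustrative
assumptions, not anything taken from RFC 2231 or from this thread, and
parameter continuations (filename*0*, filename*1*, ...) are not handled.

```python
from urllib.parse import quote, unquote


def encode_ext_value(text, charset="utf-8", language=""):
    """Build an RFC 2231 style extended value: charset'language'percent-encoded."""
    # safe="" percent-encodes everything except ASCII letters, digits and "_.-~",
    # so the quote character, "%" and "*" never appear unescaped in the data part.
    return f"{charset}'{language}'{quote(text, safe='', encoding=charset)}"


def decode_ext_value(value):
    """Split an extended value into (text, language); assumes exactly the form above."""
    charset, language, encoded = value.split("'", 2)
    # An empty charset field is allowed; fall back to us-ascii in that case.
    return unquote(encoded, encoding=charset or "us-ascii"), language


if __name__ == "__main__":
    ext = encode_ext_value("na\u00efve.txt", language="en")
    print("Content-Disposition: attachment; filename*=" + ext)
    print(decode_ext_value(ext))
```

Note that the language tag travels with the value, which is the provision
for human-readable parameter text mentioned above.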

RFC 2047 encoding uses the '=' character for encoded octets, which might prove
[...]

Keith Moore | 11 Aug 2006 13:24

Re: rfc2231 implementations?


> Does anybody have insight why RFC2047 explicitly prohibits encoding parameter values and quoted
> strings?

There are two reasons.  The first is that there were objections from
the ietf-822 working group (when this was originally proposed) to
having encoded-words within a string, on the grounds that part of the
purpose of the quotes in a quoted-string was to protect the contents
from interpretation.  It was argued that encoded-words were another
kind of quoting mechanism but that they shouldn't invalidate the
original kind.

The second reason is that encoded-words were intended as human-readable
text rather than machine-readable text.  This allowed encoded-words
to avoid solving certain problems, like canonicalization and
comparison, that turned out to be significant technical hurdles to
the development of encoding schemes for machine-readable tokens
such as IDNs and internationalized email addresses (the hurdles for the
latter have not yet been satisfactorily overcome).

The filename parameter of a content-disposition field is somewhat of a
gray area because it's only a suggestion - for security and other
reasons (e.g. character set differences, restrictions on file
names imposed by the receiving system) no receiving system should
automatically use the filename parameter as a destination filename.
So for the specific case of a content-disposition filename parameter it
might be reasonable for an MUA to interpret a parameter value that
looks like an encoded-word, as if it were decoded according to RFC
2047, just as the MUA might need to transform the filename in other
ways to avoid scribbling in arbitrary directories and/or conform to
[...]
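
As a rough illustration of the lenient handling described above, the following
Python sketch treats a filename parameter that looks like an encoded-word as
RFC 2047 text, then reduces the result to a bare file name so that it can only
ever serve as a suggestion.  The function name suggested_filename, the fallback
name "attachment", and the specific clean-up rules are assumptions made for
this example; a real MUA would apply whatever restrictions its own platform
requires.

```python
import re
from email.errors import HeaderParseError
from email.header import decode_header, make_header

# Loose test for "looks like an encoded-word": =?charset?B-or-Q?text?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")


def suggested_filename(param_value, default="attachment"):
    """Return a bare, printable file name derived from a filename parameter."""
    text = param_value
    if ENCODED_WORD.search(text):
        try:
            text = str(make_header(decode_header(text)))
        except (HeaderParseError, LookupError, UnicodeError):
            text = param_value  # undecodable: keep the raw value
    # Drop any directory part, whichever path separator the sender used.
    name = text.replace("\\", "/").rsplit("/", 1)[-1]
    # Drop control and other non-printable characters.
    name = "".join(ch for ch in name if ch.isprintable()).strip()
    return name if name not in ("", ".", "..") else default


if __name__ == "__main__":
    print(suggested_filename("=?iso-8859-1?Q?na=EFve.txt?="))
    print(suggested_filename("../../etc/passwd"))
```

The result is still only a hint: the caller decides where, and whether, to
write the file.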

Yuri Inglikov | 14 Aug 2006 19:35

RE: rfc2231 implementations?


Thank you for your reply. Yes, I am concerned about the filename parameter of the Content-Disposition header,
and I wish RFC 2047 made an exception for this parameter, along the lines you describe below. Backwards
compatibility issues force me to continue using RFC 2047 for the filename. As I mentioned previously in this
thread, the next version of Exchange Server will decode RFC 2231 but will more likely than not keep
producing a filename parameter encoded with RFC 2047 (even though we have code in place to encode
RFC 2231).
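
The interoperability situation described here (accept RFC 2231 on input, keep
emitting an RFC 2047 encoded-word in the filename) can be sketched with
Python's standard email package.  This is only an illustration under stated
assumptions and says nothing about how Exchange Server is implemented; the
helper names legacy_disposition and read_filename are hypothetical, and the
producer assumes a name short enough that the encoded-word is not folded.

```python
from email import message_from_string
from email.header import Header, decode_header, make_header


def legacy_disposition(filename):
    """Emit the non-conforming but widely seen form: an encoded-word in quotes."""
    word = Header(filename, "utf-8").encode()  # an RFC 2047 encoded-word
    return f'attachment; filename="{word}"'


def read_filename(raw_headers):
    """Accept either an RFC 2231 filename* or an encoded-word filename."""
    msg = message_from_string(raw_headers)
    name = msg.get_filename()  # the default parser already handles RFC 2231
    if name and "=?" in name:
        # Lenient fallback: treat the value as RFC 2047 text.
        name = str(make_header(decode_header(name)))
    return name


if __name__ == "__main__":
    new_style = "Content-Disposition: attachment; filename*=utf-8''na%C3%AFve.txt\n\n"
    old_style = "Content-Disposition: " + legacy_disposition("na\u00efve.txt") + "\n\n"
    print(read_filename(new_style))
    print(read_filename(old_style))
```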

--
Yuri Inglikov

-----Original Message-----
From: Keith Moore [mailto:moore <at> cs.utk.edu]
Sent: Friday, August 11, 2006 4:24 AM
To: Yuri Inglikov
Cc: ietf-822 <at> imc.org; arnt <at> gulbrandsen.priv.no
Subject: Re: rfc2231 implementations?

> Does anybody have insight why RFC2047 explicitly prohibits encoding parameter values and quoted strings?

There are two reasons.  The first is that there were objections from
the ietf-822 working group (when this was originally proposed) to
having encoded-words within a string, on the grounds that part of the
purpose of the quotes in a quoted-string was to protect the contents
from interpretation.  It was argued that encoded-words were another
kind of quoting mechanism but that they shouldn't invalidate the
original kind.

The second reason is that encoded-words were intended as human-readable
text rather than machine-readable text.  This allowed encoded-words
[...]

