John Gardiner Myers | 13 Jun 1997 21:08

Accept-Language: proposal

I have a need to localize the human-readable text inside of DSN's and
other automated responses to 822-format messages.  Here is a proposal I
would like to float here, prior to issuing an internet-draft.

Rough definition:

Define an "Accept-Language" header with the following syntax.

accept-language = "Accept-Language" ":" 1#Language-tag
			; Language-tag as defined in RFC 1766

This header discloses the language preference(s) of the sender. 
Languages listed earlier in the list are preferred.
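
A minimal sketch of how a receiver might parse this header value, assuming
the RFC 1766 tag grammar (1*8ALPHA subtags joined by "-") and a plain
comma-separated list.  The function name is hypothetical, and 822 comments
and header folding are not handled.

```python
import re

# RFC 1766: Language-Tag = Primary-tag *( "-" Subtag ), each 1*8ALPHA
LANGUAGE_TAG = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z]{1,8})*")


def parse_accept_language(value):
    """Return the language tags in preference order (earlier = preferred)."""
    tags = []
    for element in value.split(","):
        element = element.strip()
        if not element:
            continue  # the "#" list rule tolerates empty elements
        if not LANGUAGE_TAG.fullmatch(element):
            raise ValueError(f"not an RFC 1766 language tag: {element!r}")
        tags.append(element.lower())  # language tags are case-insensitive
    if not tags:
        raise ValueError("1#Language-tag requires at least one tag")
    return tags
```

Because the proposed syntax is only a subset of the HTTP one, a value
carrying HTTP-style quality values such as "en;q=0.8" is rejected here
rather than interpreted.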

The existence of this header also indicates that the sender is capable
of handling text encoded in the utf-8 charset [RFC 2044], when properly
MIME labeled.
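
As an illustration of "properly MIME labeled", here is a sketch of an
automated response body carried as text/plain with charset utf-8.  It uses
Python's standard email package; the function name and the sample text are
assumptions made for this example only.

```python
from email.message import EmailMessage


def build_utf8_response(text):
    """Build a message whose body is labeled text/plain; charset=utf-8."""
    msg = EmailMessage()
    msg["Subject"] = "Automated response"
    # set_content adds the Content-Type and Content-Transfer-Encoding headers
    msg.set_content(text, charset="utf-8")
    return msg


response = build_utf8_response("Boîte aux lettres pleine.\n")
print(response["Content-Type"])
```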

Comments:

The syntax of this header is deliberately a subset of that in HTTP.

Since this header is presumably only going to be generated by "new"
UA's, it is not unreasonable to require these "new" UA's to support
UTF-8.  The alternative of also negotiating acceptable charset is worse.

Disadvantages:

Does not necessarily work correctly if the message goes through list
expansion.  The language preferences of the list maintainer can be

Larry Masinter | 13 Jun 1997 21:58

Re: Accept-Language: proposal

> The existence of this header also indicates that the sender is capable
> of handling text encoded in the utf-8 charset [RFC 2044], when properly
> MIME labeled.
> 

Please no.

> The syntax of this header is deliberately a subset of that in HTTP.

Please don't use something with the same syntax but different
semantics. In HTTP, accept-language and accept-charset are orthogonal,
and the reasons for making them so in HTTP apply equally to mail.

Larry
-- 
http://www.parc.xerox.com/masinter

Valdis.Kletnieks | 13 Jun 1997 22:19

Re: Accept-Language: proposal

On Fri, 13 Jun 1997 12:08:38 PDT, you said:
> The existence of this header also indicates that the sender is capable
> of handling text encoded in the utf-8 charset [RFC 2044], when properly
> MIME labeled.

Umm.. yeah, but.. this may not do what you hope.

This AIX 4.2 box I'm typing on has *some* UTF-8 support.  However,
my MUA does not support it, nor do I have fonts to cover the entire UTF-8
space.

Also, remember that UTF-8 is only an *encoding* scheme.  It is *not* an
internationalization or localization scheme.  As such, things like currency
formats, date/time preferences, and sorting/collating issues are totally
not addressed.  If the person requests Cyrillic, what's the format of the
date?

Why are you implying UTF-8 support? What does this buy you?

> Since this header is presumably only going to be generated by "new"
> UA's, it is not unreasonable to require these "new" UA's to support
> UTF-8.  The alternative of also negotiating acceptable charset is worse.

How about an alternative of "Return English if you can't supply in one of
the requested languages"?  This would eliminate a UTF-8 requirement, and
be more backward-compatible as well..
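
A sketch of that fallback rule, assuming the responder knows which
languages it can generate text in.  The function and parameter names are
hypothetical; only exact tag matches are considered.

```python
def choose_language(preferred, available, fallback="en"):
    """Pick the first requested language the responder can supply.

    preferred: tags from the sender, most preferred first.
    available: tags the responder has text for.
    fallback:  returned when nothing matches (English in this suggestion).
    """
    available_lower = {tag.lower() for tag in available}
    for tag in preferred:
        if tag.lower() in available_lower:
            return tag.lower()
    return fallback
```

For example, choose_language(["ru", "de"], ["en", "de", "fr"]) selects
"de", while a request for only "ru" against the same list yields "en".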
-- 
				Valdis Kletnieks
				Computer Systems Senior Engineer
				Virginia Tech

John Gardiner Myers | 14 Jun 1997 00:21

Re: Accept-Language: proposal

Larry Masinter wrote: 
> Please no.

Do you mean "no, please define accept-charset as well" or "no, please
call this header something other than accept-language"?  I strongly
disagree with the former, I could be convinced of the latter.

> In HTTP, accept-language and accept-charset are orthogonal,
> and the reasons for making them so in HTTP apply equally to mail.

I disagree that the reasons for accept-charset in HTTP apply equally to
mail.

1) In mail, the lack of interactivity significantly raises the
importance of using a canonical format over the importance of having the
sender avoid the cost of converting from local form to canonical form.

2) Time has passed on since accept-charset was designed.  The report of
the IAB Character Set Workshop strongly recommends transitioning to
ISO-10646 based charsets, such as UTF-8 and/or UTF-7.

Valdis.Kletnieks <at> vt.edu wrote:
> This AIX 4.2 box I'm typing on has *some* UTF-8 support.  However,
> my MUA does not support it, nor do I have fonts to cover the entire UTF-8
> space.

Your MUA also doesn't support the accept-language header.  In order for
it to support the accept-language header, it would have to be extended
to understand the UTF-8 charset.


Keith Moore | 14 Jun 1997 01:21

Re: Accept-Language: proposal

If it's not going to have the syntax used in HTTP, I'd rather
have it called by some other name.  

I don't think the UTF-8 requirement is reasonable.  
It seems like this would impose a huge implementation burden for most UAs.
With this requirement in place, most people can't use it or benefit
from it.  Without this requirement, many users could just add a header
field to outgoing mail specifying the languages and charsets that the
user can deal with, without needing UTF-8 support. 

Also, the chosen I18N approach to DSNs was to define precise
error codes and include enough information so that the error
could be presented in the user's language by the user agent.
As far as I can tell, the error codes are usually adequate for this
purpose, in that they can describe most conditions with sufficient
precision.  So I have to wonder if this is really needed for DSNs.

And while I could see something like this header being used for
email-based document requests, for it to affect DSNs would require 
that MTAs look at the header of the subject message.   This
is a layering violation.

Keith

Larry Masinter | 14 Jun 1997 02:45

Re: Accept-Language: proposal

Let me try to say this differently:

Headers should be compatible between HTTP and mail: that is,
a single message header should have the same meaning, no 
matter which protocol is being used.

Accept-Language already has a meaning in HTTP. If you want
to use it in mail, you should only use it with the same meaning.
Do not attach new meanings to the same header. If you really
need a different meaning, then use a different header name.

> I disagree that the reasons for accept-charset in HTTP apply equally to
> mail.
> 
> 1) In mail, the lack of interactivity significantly raises the
> importance of using a canonical format over the importance of having the
> sender avoid the cost of converting from local form to canonical form.
> 
> 2) Time has passed on since accept-charset was designed.  The report of
> the IAB Character Set Workshop strongly recommends transitioning to
> ISO-10646 based charsets, such as UTF-8 and/or UTF-7.

I don't see what these have to do with it; you might as well just
assume you can always send UTF-8, then. Whether or not "Accept-Language"
is present in the message doesn't have much to do with it.

In fact, "Accept-Language" is a misnomer; it should probably have been
called "Prefer-Language".

If you want to have a header that means "You can reply with UTF-8",

Chris Newman | 15 Jun 1997 08:33

Re: Accept-Language: proposal

On Fri, 13 Jun 1997, Keith Moore wrote:
> I don't think the UTF-8 requirement is reasonable.  
> It seems like this would impose a huge implementation burden for most UAs.
> With this requirement in place, most people can't use it or benefit
> from it.  Without this requirement, many users could just add a header
> field to outgoing mail specifying the languages and charsets that the
> user can deal with, without needing UTF-8 support. 

I disagree with this point.  UTF-8 is not particularly hard to support --
it's certainly much easier to support than a large table of multiple
charsets for each language and all the cross-conversions (which often use
Unicode as an intermediate).

We need to start "hooking" UTF-8 to proposals like this so that we can
transition away from the current mess of character sets we have.  In
addition, an "accept-language" header which requires UTF-8 is a far
simpler architecture than a combination of "accept-language" and
"accept-charset".  Given the IAB charset workshop guidelines I think it
would be a mistake to let "accept-charset" leak out of HTTP.

Ned Freed | 16 Jun 1997 04:15

Re: Accept-Language: proposal

> I have a need to localize the human-readable text inside of DSN's and
> other automated responses to 822-format messages.  Here is a proposal I
> would like to float here, prior to issuing an internet-draft.

> Rough definition:

> Define an "Accept-Language" header with the following syntax.

> accept-language = "Accept-Language" ":" 1#Language-tag
> 			; Language-tag as defined in RFC 1766

> This header discloses the language preference(s) of the sender.
> Languages listed earlier in the list are preferred.

I agree with Larry that use of Accept-Language: in this fashion, which doesn't
correspond to the HTTP semantics, is a showstopper. The IETF has both critiqued
and modified HTTP usage where that usage conflicts with MIME, and I think the
IETF was correct in doing this. But it is only fair that when the shoe is on
the other foot, and we're defining stuff for MIME-based but non-HTTP-based
protocols like Internet mail, that we pay attention to HTTP usage and not do
things that conflict with it.

There are two ways to solve this problem. One is to use the same field
name and keep the semantics the same. The other is to define a new field
with limited semantics.

I see problems with both approaches. The first seems excessive for the
stated application. And the second only solves this one problem without
providing a basis for other sorts of capability discovery (e.g. discovery
of support for things other than language, support for different capabilities

Jeff Stephenson (Exchange | 16 Jun 1997 17:42

RE: Accept-Language: proposal

It seems to me that it would be better to have the UA provide a
localized human-readable explanation of the error based on the enhanced
error codes, for two reasons:

1)  This proposal, to be useful, requires that MTAs be able to generate
DSNs in a large number of languages.  In practice, I suspect that you'd
find maybe 5 or 6 major languages being supported, with the result that
in many cases DSNs would still end up being generated in English.
2)  The user agent, almost by definition, is going to be localized for
any particular user; it can use the enhanced error codes to generate an
appropriate message.  Further the UA is in a position to tell the user
what, if anything, they can do about the problem using that particular
UA.
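
A rough sketch of the UA-side approach in point 2, assuming enhanced
status codes of the class.subject.detail form (RFC 1893).  The message
catalogue, its French strings, and the function name are illustrative
assumptions, not taken from any real UA.

```python
# Hypothetical localized catalogue keyed by language, then by code part.
CLASS_TEXT = {
    "en": {"2": "Success", "4": "Persistent transient failure",
           "5": "Permanent failure"},
    "fr": {"2": "Succès", "4": "Échec temporaire persistant",
           "5": "Échec permanent"},
}
DETAIL_TEXT = {
    "en": {"1.1": "Bad destination mailbox address", "2.2": "Mailbox full"},
    "fr": {"1.1": "Adresse de boîte aux lettres incorrecte",
           "2.2": "Boîte aux lettres pleine"},
}


def describe_status(status, language):
    """Turn a code such as "5.2.2" into text in the user's language."""
    cls, subject, detail = status.split(".")
    # Fall back to English when the UA has no catalogue for the language.
    lang = language if language in CLASS_TEXT else "en"
    class_text = CLASS_TEXT[lang].get(cls, status)
    # Unknown subject.detail pairs fall back to showing the raw code.
    detail_text = DETAIL_TEXT[lang].get(f"{subject}.{detail}", status)
    return f"{class_text}: {detail_text}"
```

With this approach the MTA only has to emit the numeric code; the choice
of language stays entirely inside the UA.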

I also share Keith's concern about the layering issue: this
functionality should be an SMTP extension rather than an 822 header.
But unless I'm missing something, I don't think it should be either -
enhanced error codes should be all that's required.

-- jeff

> -----Original Message-----
> From:	John Gardiner Myers [SMTP:jgmyers <at> netscape.com]
> Sent:	Friday, June 13, 1997 12:09 PM
> To:	ietf-822 <at> imc.org
> Subject:	Accept-Language: proposal
> 
> I have a need to localize the human-readable text inside of DSN's and
> other automated responses to 822-format messages.  Here is a proposal
> I

Keith Moore | 16 Jun 1997 20:21

Re: Accept-Language: proposal

> On Fri, 13 Jun 1997, Keith Moore wrote:
> > I don't think the UTF-8 requirement is reasonable.  
> > It seems like this would impose a huge implementation burden for most UAs.
> > With this requirement in place, most people can't use it or benefit
> > from it.  Without this requirement, many users could just add a header
> > field to outgoing mail specifying the languages and charsets that the
> > user can deal with, without needing UTF-8 support. 
> 
> I disagree with this point.  UTF-8 is not particularly hard to support --

It would require a huge effort to take every tool that I use to
process mail, and extend them all to use UTF-8.  This remains true
even if I only care about supporting the 8859/1 subset.

(in my case, we're talking MH, exmh, metamail, xterm, procmail,
/bin/mail, and numerous scripts)

> We need to start "hooking" UTF-8 to proposals like this so that we can
> transition away from the current mess of character sets we have.  

I strongly disagree.  I want to encourage a transition to UTF-8, but
not by "hooking" it to what would otherwise be a very simple and
useful extension, and not in such a way that it would have a severe
adverse impact on the installed base.

We haven't yet recovered from the 822-to-MIME transition.  The vast
majority of mailers out there can't yet deal with it adequately.
(They may have some support for it, but in general it seems to be
lousy.)  To raise the bar further at this point would not encourage
adoption of UTF-8, it would just degrade interoperability.  Some tools

