Andrew Bartlett | 8 May 2006 15:32

Re: Charter: internationalization is out of scope

On Thu, 2006-04-27 at 10:23 -0500, Nicolas Williams wrote:

> There are none that I can think of.
> 
> Neither Heimdal, nor MIT, nor GNU, nor Solaris implementations of
> GSS-API framework or mechanisms, nor any of the apps that use the
> GSS-API that I can think of (SSHv2 [e.g., OpenSSH], SASL [e.g., Cyrus
> SASL], NFS w/ RPCSEC_GSS [e.g., CITI] , and so on), qualify.
> 
> ALL just-use-8, which always amounts to: just-use-the-current-locale's-
> codeset.

Just as a note about Samba's place in this.

Looking at interacting with Windows, I'm expecting we will slowly start
on this.  I suspect we will assume the libs are just-send-8 and that we
will convert outside the API.  

I hope to try and do this as 'right' as practical in Samba4, while
retaining Windows wire compatibility.  In Samba3 (where we don't use
real GSSAPI, but rather fake things up), we assume Kerberos principals
are UTF8 on receipt of a ticket from a client.

Andrew Bartlett

-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org
Student Network Administrator, Hawker College  http://hawkerc.net

Nicolas Williams | 8 May 2006 22:30

I18N & L10N proposal for GSS-API (v2u2/v3)

I propose the following solution to the I18N issue:

1) Add constants for identifying character encoding schemes with the
   prefix 'GSS_C_CES_' (e.g., GSS_C_CES_UTF8).

   The type could be INTEGER or OID, I don't care much either way.

2) Add new versions of:

    - GSS_Import_name()
    - GSS_Display_name()
    - GSS_Display_status()

   that take an additional input parameter indicating the character
   encoding scheme used or to be used.

   (And don't forget the display_as_name_type input parameter of
   GSS_Display_name_ext() in the naming extensions I-D.)

3) Specify two GSS_C_CES_* constants to begin with:

   a) GSS_C_CES_UTF8

     Meaning that input human readable strings are encoded in UTF-8, and
     output human readable strings are to be encoded in UTF-8.

   b) GSS_C_CES_DEFAULT

     Meaning that the CES used/to be used is determined in a
     platform-specific manner, and typically [on POSIX/POSIX-like

Martin Rex | 8 May 2006 22:59

Re: I18N & L10N proposal for GSS-API (v2u2/v3)

I would propose to add the following three versions instead:

gss_import_name_utf8()
gss_display_name_utf8()
gss_display_status_utf8()

because I consider it too much of a burden for gssapi mechanisms
to be able to cope with every possible codepage under the sun.

With the limitation to utf8, a gssapi mechanism must only know
how to transform between UTF8 and the (canonical) network encoding
of printables (travelling within gssapi context level tokens or
being communicated out-of-band, as with the Kerberos KDC
communication).  The application designers should have to
figure out (or worry) how many or which codepages (and conversions)
are necessary to support specific environments or scenarios.

I'm just now trying to add support for arbitrary codepages at the
gssapi level at one point in our application, and it grows the
code by 8-12 MBytes ...
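
[Editorial sketch, not part of the original message.] Under the
UTF-8-only approach described above, the application converts names
from its own codepage to UTF-8 before handing them to the library.
A minimal sketch of that application-side step, assuming a POSIX
iconv(3) is available; the helper name to_utf8() is hypothetical, and
the proposed gss_import_name_utf8() is not called here because it does
not exist in any published API:

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Convert the NUL-terminated string `in`, encoded in `from_charset`,
 * to UTF-8. The result is written to `out` (capacity `out_cap` bytes)
 * and NUL-terminated.
 * Returns 0 on success, -1 on failure (unknown charset, invalid input,
 * or output buffer too small).
 */
int to_utf8(const char *from_charset, const char *in,
            char *out, size_t out_cap)
{
    if (out_cap == 0)
        return -1;

    iconv_t cd = iconv_open("UTF-8", from_charset);
    if (cd == (iconv_t)-1)
        return -1;

    /* iconv() takes char **, but it does not modify the input bytes. */
    char *in_ptr = (char *)in;
    size_t in_left = strlen(in);
    char *out_ptr = out;
    size_t out_left = out_cap - 1;   /* reserve room for the terminator */

    size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;

    *out_ptr = '\0';
    return 0;
}
```

An application would call this once per name, for example
to_utf8("ISO-8859-1", name, buf, sizeof buf), and pass buf on to
whatever UTF-8 entry point the API ends up providing.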

About L10N: Although I don't want to prevent anyone from going
down that rathole, I will continue to use exclusively the
English variant of the error message (if possible) within
our application.  For interoperability certification with our
application, I have been requiring that the gssapi library
produce meaningful error messages in English, and I manually review
them (at least for those error situations that gsstest tries).

-Martin

Nicolas Williams | 8 May 2006 23:02

Re: I18N & L10N proposal for GSS-API (v2u2/v3)

On Mon, May 08, 2006 at 10:59:00PM +0200, Martin Rex wrote:
> I would propose to add the following three versions instead:
> 
> gss_import_name_utf8()
> gss_display_name_utf8()
> gss_display_status_utf8()
> 
> because I consider it too much of a burden for gssapi mechanisms
> to be able to cope with every possible codepage under the sun.

Someone has to: the application or the mechanism.

Perhaps we should settle that first.

I vote for the burden to be on the mechanism -- there are more
applications than mechanism implementations [in my world].

Jeffrey Hutzelman | 8 May 2006 23:02

Re: I18N & L10N proposal for GSS-API (v2u2/v3)


On Monday, May 08, 2006 03:30:42 PM -0500 Nicolas Williams 
<Nicolas.Williams <at> sun.com> wrote:

> I propose the following solution to the I18N issue:
>
> 1) Add constants for identifying character encoding schemes with the
>    prefix 'GSS_C_CES_' (e.g., GSS_C_CES_UTF8).
>
>    The type could be INTEGER or OID, I don't care much either way.

How about an RFC2978 charset name?  Or, if you really want something 
numeric, the MIBenum values from the same registry.

> 3) Specify two GSS_C_CES_* constants to begin with:
>    a) GSS_C_CES_UTF8
>    b) GSS_C_CES_DEFAULT

> 4) Create an IANA registry of GSS_C_CES_ values/semantics.

See above - we should reuse the registry that already exists for this 
purpose, rather than inventing our own.  If we decide the functionality 
represented by your GSS_C_CES_DEFAULT is important, we can provide a way to 
indicate that is the desired behavior (for example, in C, one could pass 
NULL instead of a charset name).
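
[Editorial sketch, not part of the original message.] The "pass NULL
for the default" convention suggested above could look roughly like
this in C. The function name effective_charset() is hypothetical; the
only real API used is POSIX nl_langinfo(CODESET), whose return value is
the platform's own name for the locale codeset and is not guaranteed to
be a registered charset name:

```c
#include <langinfo.h>
#include <stddef.h>

/*
 * Resolve the charset an extended call would use.
 * A non-NULL name is taken as given by the caller.
 * NULL selects the codeset of the current locale; the application is
 * expected to have called setlocale(LC_CTYPE, "") beforehand if it
 * wants the user's locale rather than the "C" locale.
 */
const char *effective_charset(const char *charset_name)
{
    if (charset_name != NULL)
        return charset_name;
    return nl_langinfo(CODESET);
}
```

With this shape, a caller that wants explicit behavior passes a name
such as "UTF-8", and a caller that wants today's locale-dependent
behavior passes NULL.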

Aside from this, I agree with this proposal as it applies to GSSAPIv3.
For GSSAPIv2 we are only empowered to "clarify", not to define new API 
elements.  So I think all we can/should do is apply the same note you 
describe in your point 5, indicating that there are many implementations 

Nicolas Williams | 8 May 2006 23:29

Re: I18N & L10N proposal for GSS-API (v2u2/v3)

On Mon, May 08, 2006 at 04:02:42PM -0500, Nicolas Williams wrote:
> On Mon, May 08, 2006 at 10:59:00PM +0200, Martin Rex wrote:
> > because I consider it too much of a burden for gssapi mechanisms
> > to be able to cope with every possible codepage under the sun.
> 
> Someone has to: the application or the mechanism.
> 
> Perhaps we should settle that first.
> 
> I vote for the burden to be on the mechanism -- there are more
> applications than mechanism implementations [in my world].

Allow me to expand as to why (and then offer a concession):

 - Mechanisms MUST deal with stringprep and/or other conversion issues
   and this *cannot* be left to the application.

   I.e., mechanisms already MUST have some of the I18N burden placed on
   them.

   Particularly since mechanisms may have different stringprep
   requirements.

 - There are (or, darn it, should tend to be) more applications than
   mechanisms.

   I.e., placing more of the I18N burden on mechanisms means less code
   than placing more of the I18N burden on applications.  And less code
   is better.


Martin Rex | 9 May 2006 17:27

Re: I18N & L10N proposal for GSS-API (v2u2/v3)

Nicolas Williams wrote:
> 
> On Mon, May 08, 2006 at 10:59:00PM +0200, Martin Rex wrote:
> > I would propose to add the following three versions instead:
> > 
> > gss_import_name_utf8()
> > gss_display_name_utf8()
> > gss_display_status_utf8()
> > 
> > because I consider it too much of a burden for gssapi mechanisms
> > to be able to cope with every possible codepage under the sun.
> 
> Someone has to: the application or the mechanism.
> 
> Perhaps we should settle that first.

The two largest camps of existing mechanisms are:

 - Kerberos and Kerberos-like  either completely ignoring the issue
   or using UTF8 (on the network) like Microsoft's implementation

 - SPKM or SPKM-like using X.509 certificates -- and they do have
   a canonical encoding (inside the certs) of mostly ASCII,
   ISO-8859-1 (=t61string abuse) and UTF8STRING

I don't know how many gssapi mechanisms there are today that
offer to convert a well-defined canonical network encoding
into an arbitrary local codepage.  There are likely VERY few.
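
[Editorial sketch, not part of the original message.] For a sense of
scale, converting between two of the encodings named above is a small,
table-free job, because ISO-8859-1 byte values are the Unicode code
points U+0000 through U+00FF. A minimal sketch, with a hypothetical
function name; it covers only the ISO-8859-1 to UTF-8 direction and
says nothing about arbitrary local codepages:

```c
#include <stddef.h>

/*
 * Convert `in_len` bytes of ISO-8859-1 text to UTF-8.
 * Bytes below 0x80 are copied unchanged; bytes 0x80..0xFF become a
 * two-byte UTF-8 sequence.
 * Returns the number of bytes written to `out`, or (size_t)-1 if
 * `out_cap` is too small. The output is not NUL-terminated.
 */
size_t latin1_to_utf8(const unsigned char *in, size_t in_len,
                      unsigned char *out, size_t out_cap)
{
    size_t o = 0;

    for (size_t i = 0; i < in_len; i++) {
        unsigned char c = in[i];

        if (c < 0x80) {
            if (o + 1 > out_cap)
                return (size_t)-1;
            out[o++] = c;
        } else {
            if (o + 2 > out_cap)
                return (size_t)-1;
            out[o++] = (unsigned char)(0xC0 | (c >> 6));    /* lead byte */
            out[o++] = (unsigned char)(0x80 | (c & 0x3F));  /* continuation */
        }
    }
    return o;
}
```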

Claiming that an implementation supports the locale, while in

Nicolas Williams | 9 May 2006 17:37

Re: I18N & L10N proposal for GSS-API (v2u2/v3)

On Tue, May 09, 2006 at 05:27:38PM +0200, Martin Rex wrote:
> Nicolas Williams wrote:
> > Someone has to: the application or the mechanism.
> > 
> > Perhaps we should settle that first.
> 
> the two largest camps of existing mechanisms are:
> 
>  - Kerberos and Kerberos-like  either completely ignoring the issue
>    or using UTF8 (on the network) like Microsoft's implementation
> 
>  - SPKM or SPKM-like using X.509 certificates -- and they do have
>    a canonical encoding (inside the certs) of mostly ASCII,
>    ISO-8859-1 (=t61string abuse) and UTF8STRING
> 
> I don't know how many gssapi mechanisms there are today that
> offer to convert a well-defined canonical network encoding
> into an arbitrary local codepage.  There are likely VERY few.
> 
> Claiming that an implementation supports the locale, while in
> fact it completely ignores the problem is not going to help us.

I didn't make that claim.

> > I vote for the burden to be on the mechanism -- there are more
> > applications than mechanism implementations [in my world].
>  
> I'm strongly opposed to putting any burden besides
> iso-8859-1 and UTF8 onto the GSS-API (v3) mechanisms!


Jeffrey Altman | 9 May 2006 19:16

Re: I18N & L10N proposal for GSS-API (v2u2/v3)

Nicolas Williams wrote:

> The concession I may be willing to make:
> 
>  - REQUIRE support only for GSS_C_CES_UTF8, and REQUIRE that
>    GSS_C_CES_DEFAULT be an alias for GSS_C_CES_UTF8 if it isn't
>    something else.
> 
>    This would mean that portable GSS-APIv3 applications would have the
>    burden of doing any charset conversions or requiring that they be run
>    in UTF-8 locales only.
> 
>    However, applications intended to be portable within the POSIX world
>    would typically be able to rely on  which, for my purposes would be
>    sufficient.
> 
> 
> Eventually most OS vendors will deprecate and stop supporting and
> shipping non-Unicode locales, so technically GSS_C_CES_DEFAULT would
> ultimately become an alias for GSS_C_CES_UTF8, and which makes
> GSS_C_CES_DEFAULT ultimately nothing more than a transition crutch...
> 
> ...But a very good crutch.
> 
> Nico

From a practical perspective I don't see this proposal as being much
different than adding the "_utf8" versions that Martin is suggesting
and using the existing versions as they are used today.  Applications
will eventually migrate to the _utf8 versions and eventually we will

Jeffrey Altman | 9 May 2006 19:27

Re: I18N & L10N proposal for GSS-API (v2u2/v3)

Nicolas Williams wrote:

> But anyways, in my follow-up to the mail you were replying to I offer to
> make only GSS_C_CES_UTF8 REQUIRED, which means that you could have a
> compliant mechanism implementation that caters only for UTF-8 and does
> no conversions.
> 
> Nico

Nico:

All I see this resulting in is a new negotiation problem.  The
application and the mechanism now have to negotiate over the common
set of character-sets that both support.

If we don't support negotiation and a mechanism is passed
GSS_C_CES_KLINGON what is it supposed to do?  All it can do is
return an error indicating unsupported character-set.  At that point
the application will have to perform translation to UTF8.  While the
concept of selectable character sets has the potential to save a few
hours of application developer programming time, I don't see it
providing any real benefits.  All of the major OSs already provide
application developers with the functions to convert from locale to
UTF8 and back.  Requiring that application developers make such calls
or be UTF8 based prior to using GSSv3 is truly not a big deal in my
mind.

Jeffrey Altman (speaking as an individual)
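
[Editorial sketch, not part of the original message.] The failure path
described above can be made concrete. In this sketch the identifier
names GSS_C_CES_UTF8, GSS_C_CES_DEFAULT and GSS_C_CES_KLINGON come from
the thread, but their numeric values, the status enum, and both
function names are hypothetical stand-ins; nothing here is taken from
a real GSS-API header:

```c
#include <stddef.h>

/* Hypothetical identifiers; the numeric values are invented. */
enum ces {
    GSS_C_CES_DEFAULT = 0,
    GSS_C_CES_UTF8    = 1,
    GSS_C_CES_KLINGON = 2
};

enum ces_status { CES_OK = 0, CES_UNSUPPORTED = 1 };

/*
 * A mechanism that implements only the REQUIRED case. Following the
 * concession quoted earlier in the thread, DEFAULT is treated as an
 * alias for UTF-8; everything else is rejected.
 */
enum ces_status mech_check_ces(enum ces requested)
{
    switch (requested) {
    case GSS_C_CES_UTF8:
    case GSS_C_CES_DEFAULT:
        return CES_OK;
    default:
        return CES_UNSUPPORTED;
    }
}

/*
 * Application side: try the preferred scheme; if the mechanism rejects
 * it, fall back to UTF-8 and set *must_convert so the caller knows it
 * has to translate its strings itself.
 */
enum ces choose_ces(enum ces preferred, int *must_convert)
{
    if (mech_check_ces(preferred) == CES_OK) {
        *must_convert = 0;
        return preferred;
    }
    *must_convert = 1;
    return GSS_C_CES_UTF8;
}
```

The fallback branch is the point being argued: an application that
wants to be portable needs the UTF-8 conversion code anyway, whether
or not selectable character sets exist.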
