Arnt Gulbrandsen | 2 Aug 2007 12:27

Re: I-D ACTION:draft-ietf-imapext-i18n-11.txt


Mark Crispin writes:
> As for codepoints that are not defined in the charset; this is 
> nebulous and fungible over time.  A codepoint which is undefined 
> today may be defined tomorrow by an update to the charset.  Thus, 
> nobody should depend upon an undefined codepoint remaining undefined; 
> nor should anyone depend upon any particular error condition in that case.

Bad example. Sorry. ISO-2022-JP ESC $ B B ESC ( B or UTF-8 N A 0xCF V E 
are better. I've seen both in real mail, and I do not think IMAP 
clients should be permitted to send either with impunity. But it's a 
base IMAP issue and out of scope for this document, so I deleted the 
text.

Arnt
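
As a rough illustration of the two byte sequences mentioned above, here is a minimal Python sketch of how a receiver might check whether a string is well formed in its declared charset. The helper name is_well_formed is hypothetical, and using Python's built-in codecs as the validator is an assumption made for the example; nothing in the draft prescribes it.

```python
def is_well_formed(data: bytes, charset: str) -> bool:
    """Return True if `data` decodes cleanly in `charset` (hypothetical helper)."""
    try:
        data.decode(charset, errors="strict")
    except (UnicodeDecodeError, LookupError):
        # UnicodeDecodeError: malformed input; LookupError: unknown charset name.
        return False
    return True


# "N A 0xCF V E": 0xCF opens a two-byte UTF-8 sequence, but 'V' (0x56)
# is not a continuation byte, so the string is malformed UTF-8.
bad_utf8 = b"NA\xcfVE"

# "ESC $ B B ESC ( B": a single stray byte inside a two-byte JIS X 0208 run.
odd_iso2022jp = b"\x1b$BB\x1b(B"

print(is_well_formed(bad_utf8, "utf-8"))
print(is_well_formed(b"NAIVE", "utf-8"))
# How strictly this one is judged depends on the decoder in use.
print(is_well_formed(odd_iso2022jp, "iso2022_jp"))
```

What a server should do once it detects such input (reject the command, never match, or something else) is exactly the question left out of scope here.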

Arnt Gulbrandsen | 2 Aug 2007 12:25

Re: I-D ACTION:draft-ietf-imapext-i18n-11.txt


Timo Sirainen writes:
> Two things:
>
> 1. I thought i;unicode-casemap was supposed to be the only MUST implement:
> http://www1.ietf.org/mail-archive/web/lemonade/current/msg03559.html
>
> Is this still going to change?

Do you mean that i;ascii-casemap and i;octet should not be MUSTs? I 
don't care about i;octet, but leaving out i;ascii-casemap seems wrong 
somehow.

> 2. "Strings encoded using unknown character encodings should never  
> match when using the SEARCH command, and should sort together with 
> invalid input"
>
> What about matching invalid input with SEARCH? I think it should be 
> handled in implementation-specific way. I still receive ISO-8859-1 
> mails without charset header and I'd like to be able to SEARCH them.

I started by explaining to you why you're wrong and ended by explaining 
to myself why that text is inappropriate ;) The problem it means to 
solve is out of scope for this document. I've deleted it now.

I have one concern which is not reflected in the text as it stands:  I 
think it would be bad and ugly if the success of a search depends on 
factors the user doesn't control. If you (as user) search for the word 
münchen, the success of your search should not depend on whether your 
IMAP client chooses to send 'charset utf-8 body münchen' or 'charset 
[...]

Arnt Gulbrandsen | 2 Aug 2007 12:33

Re: I-D ACTION:draft-ietf-imapext-i18n-11.txt


Mark Crispin writes:
> The LANGUAGE extension is technically l10n and not i18n.  I won't go 
> as far as to suggest that this document be broken up into a separate 
> l10n and i18n document.  However, the document should not blur the 
> distinction and imply that l10n issues are i18n issues.  They aren't.

While I may agree (I'm not entirely sure), I would strongly prefer not to 
do that. Hardly anyone distinguishes between l10n and i18n anyway, and 
many people don't even know what l10n is. I'd rather be slightly fuzzy 
than use words the audience doesn't know.

Arnt

Timo Sirainen | 2 Aug 2007 12:42

Re: I-D ACTION:draft-ietf-imapext-i18n-11.txt

On Thu, 2007-08-02 at 12:25 +0200, Arnt Gulbrandsen wrote:
> Timo Sirainen writes:
> > Two things:
> >
> > 1. I thought i;unicode-casemap was supposed to be the only MUST implement:
> > http://www1.ietf.org/mail-archive/web/lemonade/current/msg03559.html
> >
> > Is this still going to change?
> 
> Do you mean that i;ascii-casemap and i;octet should not be MUSTs?

Yes.

> I 
> don't care about i;octet, but leaving out i;ascii-casemap seems wrong 
> somehow.

The only reason I see for i;ascii-casemap is if the server developer is lazy
and doesn't want to bother implementing it. I don't see why a client
would want to use it if i;unicode-casemap is also available. I would be
happy if it was a "server MUST implement either one or both of
i;ascii-casemap and i;unicode-casemap".

> > What about matching invalid input with SEARCH? I think it should be 
> > handled in implementation-specific way. I still receive ISO-8859-1 
> > mails without charset header and I'd like to be able to SEARCH them.
..
> I have one concern which is not reflected in the text as it stands:  I 
> think it would be bad and ugly if the success of a search depends on 
> factors the user doesn't control. If you (as user) search for the word 
[...]

Arnt Gulbrandsen | 2 Aug 2007 12:59

Re: I-D ACTION:draft-ietf-imapext-i18n-11.txt


Timo Sirainen writes:
> The only reason I see for i;ascii-casemap is if server developer is 
> lazy and doesn't want to bother implementing it. I don't see why a 
> client would want to use it if i;unicode-casemap is also available. I 
> would be happy if it was a "server MUST implement either one or both 
> of i;ascii-casemap and i;unicode-casemap".

Unfortunately RFC 4790 says this:

    In IMAP, the default collation is i;ascii-casemap, because its
    operations are understood to match IMAP's built-in operations.

I asked and tested before writing that (sometime between IETF LC and 
IESG review), but I didn't ask and test enough, so I didn't know that 
Mark's server changed to a different default collation some time ago.

My current inclination is to keep i;ascii-casemap a MUST and thereby 
avoid a discrepancy between the two documents. After all it's not 
difficult to implement i;ascii-casemap.

Arnt

Dan Karp | 2 Aug 2007 14:01

Re: I-D ACTION:draft-ietf-imapext-i18n-11.txt


> My current inclination is to keep i;ascii-casemap a MUST and thereby 
> avoid a discrepancy between the two documents. After all it's not 
> difficult to implement i;ascii-casemap.

As I've mentioned earlier, that depends on where you're starting from.  Supporting i;ascii-casemap --
especially supporting it in anything resembling an efficient manner -- would require a large amount of
work for me.

Arnt Gulbrandsen | 2 Aug 2007 15:26

Re: I-D ACTION:draft-ietf-imapext-i18n-11.txt


Dan Karp writes:
>>  My current inclination is to keep i;ascii-casemap a MUST and thereby 
>>  avoid a discrepancy between the two documents. After all it's not 
>>  difficult to implement i;ascii-casemap.
>
> As I've mentioned earlier, that depends on where you're starting from. 
>  Supporting i;ascii-casemap -- especially supporting it in anything 
> resembling an efficient manner -- would require a large amount of 
> work for me.

Don't you use either UCS-2 or UCS-4? Isn't that just mapping A-Z to a-z 
and comparing the resulting code points?

Arnt
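
For illustration, a minimal Python sketch of the case mapping discussed above. The function names are hypothetical. As I read RFC 4790, i;ascii-casemap folds a-z to A-Z (rather than A-Z to a-z) and then compares octets as i;octet does; equality and substring results are the same either way, but ordering can differ for the few octets that sit between 'Z' and 'a'. The sketch works on octet strings and assumes nothing about how a particular server stores or indexes its text.

```python
def ascii_casemap_key(octets: bytes) -> bytes:
    """Map a-z (97-122) to A-Z (65-90); leave every other octet untouched."""
    return bytes(b - 32 if 97 <= b <= 122 else b for b in octets)


def ascii_casemap_equal(a: bytes, b: bytes) -> bool:
    """Equality: compare the case-folded octet strings."""
    return ascii_casemap_key(a) == ascii_casemap_key(b)


def ascii_casemap_contains(haystack: bytes, needle: bytes) -> bool:
    """Substring match on the case-folded octet strings."""
    return ascii_casemap_key(needle) in ascii_casemap_key(haystack)


print(ascii_casemap_equal(b"Inbox", b"INBOX"))
# Non-ASCII letters are not folded: U-umlaut and u-umlaut stay distinct.
print(ascii_casemap_equal("MÜNCHEN".encode("utf-8"), "münchen".encode("utf-8")))
```

The mapping itself is small. The cost depends on what is stored: an index that keeps only text folded by some other rule cannot answer these comparisons without going back to the original octets.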

Dan Karp | 2 Aug 2007 15:34

Re: I-D ACTION:draft-ietf-imapext-i18n-11.txt


> > As I've mentioned earlier, that depends on where you're starting
> > from.  Supporting i;ascii-casemap -- especially supporting it in
> > anything resembling an efficient manner -- would require a large
> > amount of work for me.
> 
> Don't you use either UCS-2 or UCS-4? Isn't that just mapping A-Z to
> a-z and comparing the resulting code points?

Essentially it's an upcased UCS-2.  To do an octet compare would require opening every message file
containing a potential match.

Pawel Salek | 10 Aug 2007 07:27

UID SEARCH responses


Hi,

I wonder which of the alternatives is a proper ESEARCH response to the  
following search:

. uid search 28:34
* SEARCH 38782 38789 40004 40301 40421 40814 41424
. OK UID SEARCH completed

If I now issue the corresponding ESEARCH command:
. uid search return (all) 28:34

What should the IMAP server answer:
* ESEARCH (TAG ".") UID ALL 38782,38789,40004,40301,40421,40814,41424
or
* ESEARCH (TAG ".") UID ALL 38782:41424
. OK UID SEARCH completed
or maybe something else?

The argument for the first form is that it represents a compressed form  
of the ordinary search response (OK, not very much compressed this  
time), and for the latter, that it can be used as the sequence-set  
argument in a subsequent command, as demanded by the RFC.

Pawel Salek

Mark Crispin | 10 Aug 2007 07:47

Re: UID SEARCH responses


For what it's worth, UW imapd issues the
 	* ESEARCH (TAG ".") UID ALL 38782:41424
response instead of
 	* ESEARCH (TAG ".") UID ALL 38782,38789,40004,40301,40421,40814,41424
because there are no other UIDs in that range.

Pawel thought this is wrong, and that a UID range of 38782:41424 must mean 
all the possible values are in the map.

I suggested that he bring the matter up on this list.  I reviewed the 
ESEARCH RFC, and can find nothing to indicate that ESEARCH must deliver a 
UID map (as SEARCH does) rather than a UID sequence set.

If Pawel is correct on the intent, then the ESEARCH RFC is broken since it 
is ambiguous on that point; either live with the ambiguity, or deprecate 
ESEARCH and replace it with some other syntax.  The horse is already out 
of the barn.

I contend that if a client wants the map, then "UID SEARCH ALL" is the 
correct command.  The only advantage to ESEARCH is getting the tag, which 
I don't think is much of an advantage for that particular command.

-- Mark --

http://panda.com/mrc
Democracy is two wolves and a sheep deciding what to eat for lunch.
Liberty is a well-armed sheep contesting the vote.
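
To make the two readings concrete, here is a minimal Python sketch of the range-compressing behaviour described above. It is an illustration under stated assumptions, not UW imapd's actual code: the function name compress_uids is hypothetical, and the mailbox is modelled simply as a list of the UIDs that currently exist.

```python
def compress_uids(matches, existing):
    """Coalesce matched UIDs into a sequence-set string.

    Two matched UIDs end up in the same range when no existing UID that
    did NOT match lies between them. Matched UIDs that are not present in
    `existing` are ignored.
    """
    matched = set(matches)
    ranges = []         # list of [first, last] pairs
    open_range = False  # True while the previous existing UID matched
    for uid in sorted(existing):
        if uid in matched:
            if open_range:
                ranges[-1][1] = uid      # extend the current range
            else:
                ranges.append([uid, uid])
                open_range = True
        else:
            open_range = False           # a non-matching UID closes the range
    return ",".join(str(a) if a == b else f"{a}:{b}" for a, b in ranges)


uids = [38782, 38789, 40004, 40301, 40421, 40814, 41424]

# The mailbox holds no other UIDs in that span, so one range covers them all.
print(compress_uids(uids, uids))             # 38782:41424

# With an extra, non-matching UID 39000 in the mailbox, the range must split.
print(compress_uids(uids, uids + [39000]))   # 38782:38789,40004:41424
```

Under this reading the result is usable as a sequence-set argument to a later command, but a client cannot recover the exact list of matching UIDs from it without knowing which UIDs exist in the mailbox.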
