Lyndon Nerenberg | 18 May 2001 23:24

BINARY version 3

I've submitted the -03 revision to the I-D editor. You can take an
early peek via:

 http://research.messagingdirect.com/draft-nerenberg-imap-binary-03.txt

The major change is the replacement of the horrid BINARYSTRUCTURE goop
with the new BINARY.SIZE attribute.

--lyndon

Jutta Degener | 21 May 2001 21:26

offset and size domain in partial BINARY fetch

Lyndon Nerenberg wrote:
>  http://research.messagingdirect.com/draft-nerenberg-imap-binary-03.txt

I'm worried about the <partial> addressing in BINARY.  This one here:

#    BINARY[<section_binary>]<<partial>>
#         Requests that the specified body section be transmitted after
#         performing CTE-related decoding.
#
#         When performing a partial FETCH, the offset argument refers to
#         the offset into the ENCODED body section.

This seems next-to-useless, since I can't predict where to legally
"break" in a base64- or quoted-printable encoded document without
examining the document itself.  (In base64, line length may vary;
in quoted-printable, encoded length depends on the data.)

What my client _would_ need would be a way of getting "the next block
of decoded binary".  That is, in my ideal world, the offset argument
would refer to the offset into the DECODED body section, not the
encoded one.

I suspect the current definition comes from a desire to reduce the
burden on the server, but I think we've already abandoned that strategy
when we admitted that the binary SIZE would have to be exact, rather
than just an educated guess. I'd rather have something useful but
potentially expensive than something fast, but useless.
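
The difficulty with encoded offsets described above can be illustrated with a minimal Python sketch. It is not part of the original message; the helper names (`b64_wrapped`, `encoded_offset`) and the sample payload are hypothetical. It shows that the same decoded bytes sit at different encoded positions depending on the base64 line length, and that quoted-printable length depends on the data itself.

```python
import base64
import quopri


def b64_wrapped(data: bytes, line_len: int) -> bytes:
    """Base64-encode data, breaking lines with CRLF every line_len characters."""
    raw = base64.b64encode(data)
    lines = [raw[i:i + line_len] for i in range(0, len(raw), line_len)]
    return b"\r\n".join(lines) + b"\r\n"


def encoded_offset(decoded_offset: int, line_len: int) -> int:
    """Encoded position of the 4-character group holding decoded_offset.

    Only valid when the line length is known and constant, which a
    client cannot assume for an arbitrary message.
    """
    chars = (decoded_offset // 3) * 4       # base64 characters before the group
    return chars + 2 * (chars // line_len)  # plus one CRLF per completed line


payload = bytes(range(256)) * 4  # 1024 bytes of illustrative binary data

enc76 = b64_wrapped(payload, 76)
enc60 = b64_wrapped(payload, 60)

# Same decoded content, different encoded sizes and offsets.
print(len(enc76), len(enc60))
print(encoded_offset(300, 76), encoded_offset(300, 60))

# Quoted-printable: two 100-byte inputs, very different encoded lengths.
qp_text = quopri.encodestring(b"a" * 100)
qp_bin = quopri.encodestring(bytes([0xFF]) * 100)
print(len(qp_text), len(qp_bin))
```

A client that only knows the encoded size has no way to choose between the two base64 offsets without reading the encoded body first.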

Thoughts?


Mark Crispin | 21 May 2001 21:32

re: offset and size domain in partial BINARY fetch

I agree with Jutta's analysis.  I don't see much point in the BINARY extension
at all if partial fetches use the encoded section values.

A server is probably going to cache the decoded binary content anyway, rather
than decode on the fly (or do that more than once).  Nor does the requirement
for encoded section offset/size save the server any work; it probably adds
more work.  Remember that both QP and B64 are stateful; you have to know the
state at a particular byte in order to decode correctly.

Not that I see much point in the BINARY extension anyway; its benefit seems to
be primarily psychological.  But it seems more or less harmless as long as
it's designed right, and I'll probably implement it.

Lyndon Nerenberg | 22 May 2001 19:18

Re: offset and size domain in partial BINARY fetch

>>>>> "Jutta" == Jutta Degener <jutta.degener <at> mediagate.com> writes:

    Jutta> What my client _would_ need would be a way of getting "the
    Jutta> next block of decoded binary".  That is, in my ideal world,
    Jutta> the offset argument would refer to the offset into the
    Jutta> DECODED body section, not the encoded one.

Good point. The use of encoded offsets is a holdover from the
previous draft, where there was no guarantee of an accurate
(or any) decoded size being reported. Now that an accurate size
is required from the server, I'll change the offset to refer to
the decoded object.

--lyndon

Lyndon Nerenberg | 22 May 2001 23:30

Re: offset and size domain in partial BINARY fetch

>>>>> "Jutta" == Jutta Degener <jutta.degener <at> mediagate.com> writes:

    Jutta> I'm worried about the <partial> addressing in BINARY.
           [...]
    Jutta> This seems next-to-useless

Yes. This was a holdover from when the reference was to the
encoded section. Based on the feedback from -03 I've made
the following changes:

* Offsets now refer to the DEcoded content.
* Removed partials from BINARY.SIZE.
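
With decoded offsets and an exact decoded size, a client can walk a section in fixed-size blocks. The following Python sketch is an assumption-laden illustration: it builds command strings in the BINARY[<section>]<<partial>> form quoted earlier in the thread, using the IMAP4rev1 `<offset.count>` partial notation. The exact -04 wire syntax is not quoted here, command tags are omitted, and `binary_fetch_commands` is a hypothetical helper name.

```python
def binary_fetch_commands(msg_seq, section, decoded_size, block_size):
    """Yield FETCH commands that cover decoded_size bytes in blocks.

    decoded_size is assumed to come from a prior BINARY.SIZE fetch of
    the same section; offsets are offsets into the DECODED content.
    """
    for offset in range(0, decoded_size, block_size):
        yield f"FETCH {msg_seq} BINARY[{section}]<{offset}.{block_size}>"


for command in binary_fetch_commands(1, "2", 2500, 1024):
    print(command)
```

The last block requests more bytes than remain; the sketch relies on the usual IMAP partial-fetch behaviour of truncating at the end of the data.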

These have been rolled up into a -04 version. Additional changes
include:

* UNKNOWN-TRANSFER-ENCODING changed to UNKNOWN-CTE.
* msg_att grammar added.
* <section_binary> element removed. BINARY now uses
  <section>, and text has been added to clarify what
  it means to binary fetch things like message headers.

The preview version is at http://research.messagingdirect.com/

--lyndon

Arnt Gulbrandsen | 29 May 2001 22:35

draft-ietf-imapext-thread

The thread extension offers no way to get threading information for only
some mail (typically, the newly arrived messages).  Isn't this a problem?

The thread=references model specifies an algorithm to be used, in detail,
and that algorithm is one that makes several passes over the entire set of
messages.  The algorithm is complex, and it's not clear to me whether it
can be adapted to be incremental without possibly changing its output, and
whether such changes are permitted by the draft.

So, if a server does incremental References:-based threading on delivery,
can it advertise thread=references?

--Arnt

Mark Crispin | 29 May 2001 22:57

re: draft-ietf-imapext-thread

On Tue, 29 May 2001 22:35:43 +0200, Arnt Gulbrandsen wrote:
> The thread extension offers no way to get threading information for only
> some mail (typically, the newly arrived messages).  Isn't this a problem?

Did you notice that the last argument specifies what messages are to be
threaded?  So if you only want to thread new messages, you can do:
	THREAD REFERENCES UTF-8 NEW

However, usually ALL is used, because you want to thread in context.

> The thread=references model specifies an algorithm to be used, in detail,
> and that algorithm is one that makes several passes over the entire set of
> messages.  The algorithm is complex, and it's not clear to me whether it
> can be adapted to be incremental without possibly changing its output, and
> whether such changes are permitted by the draft.

That's the problem -- a newly-arrived message can completely change the thread
tree.  It's therefore quite possible that an "incremental" change (one that
simply says what is different from the previous tree) is larger in size than
the previous tree.

To complicate matters, there is no obvious algorithm for determining how a
newly arrived message would change the tree; without such an algorithm, you
have to rethread on the server, and then do a comparison between the old and
new tree.

This has been the flaw in the arguments for "subsetting"; you can only
"subset" in the simplest cases.  It turns out that the only meaningful
subsetting is on a static tree, which makes subsetting much less valuable than
its proponents claim.  That's why subsetting hasn't been particularly

Arnt Gulbrandsen | 30 May 2001 11:18

Re: draft-ietf-imapext-thread

Mark Crispin <MRC <at> CAC.Washington.EDU>
> Did you notice that the last argument specifies what messages are to be
> threaded?  So if you only want to thread new messages, you can do:
> 	THREAD REFERENCES UTF-8 NEW

The sound you hear is me kicking myself.

> > The thread=references model specifies an algorithm to be used, in detail,
> > and that algorithm is one that makes several passes over the entire set of
> > messages.  The algorithm is complex, and it's not clear to me whether it
> > can be adapted to be incremental without possibly changing its output, and
> > whether such changes are permitted by the draft.
> 
> That's the problem -- a newly-arrived message can completely change the thread
> tree.  It's therefore quite possible that an "incremental" change (one that
> simply says what is different from the previous tree) is larger in size than
> the previous tree.
> 
> To complicate matters, there is no obvious algorithm for determining how a
> newly arrived message would change the tree; without such an algorithm, you
> have to rethread on the server, and then do a comparison between the old and
> new tree.

I don't think we're talking about the same thing here. You seem to assume
that clients want thread information about the entire mailbox, and that
the question is whether servers can save bandwidth by transmitting changes
to the tree, rather than resending the entire tree.  Right?

Well, I am concerned about whether the server can do thread processing at
message delivery time, so that processing a thread request becomes as

Mark Crispin | 30 May 2001 20:59

Re: draft-ietf-imapext-thread

The problem is that threads are not like other types of data.  A message that
was not previously in a thread could join the thread as the result of the
addition of another message.  Put another way, a message's "thread status"
depends upon other messages.

Consequently, "thread status" is not at all per-message metadata.  It is
mailbox metadata.  If you subset the messages considered by the THREAD
command, the resulting tree is metadata for that subset, and not for any other
set (including a superset of that subset).

There isn't any known algorithm to update thread metadata with new messages,
either on a global or a per-message basis.  The only known way is to
recalculate the entire thread tree; and if the client needs to know it, to
transmit the entire tree to the client.
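
The dependence of "thread status" on other messages can be seen in a deliberately simplified Python sketch. It is NOT the thread=references algorithm from the draft: it only groups messages that are connected through References-style links, ignores parent/child ordering and subject handling, and uses made-up message ids. The function name `thread_groups` is hypothetical.

```python
def thread_groups(messages):
    """Group message ids that are linked, directly or not, by references.

    messages maps each message id present in the considered set to the
    list of ids it references (which may be absent from the set).
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for mid, refs in messages.items():
        find(mid)
        for ref in refs:
            parent[find(mid)] = find(ref)

    groups = {}
    for mid in messages:
        groups.setdefault(find(mid), set()).add(mid)
    return {frozenset(g) for g in groups.values()}


# "c" references "b", which is not in the considered set.
before = thread_groups({"a": [], "c": ["b"]})

# "b" arrives and references "a": "a" and "c" are now in one thread,
# although neither message changed.
after = thread_groups({"a": [], "c": ["b"], "b": ["a"]})

print(before)
print(after)
```

The same effect appears when the considered set is a subset of the mailbox: the grouping computed for `{"a", "c"}` says nothing reliable about the grouping for the full set.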

