Laird Breyer | 1 Jul 02:33

Re: gzip/deflate compression/encoding


I think this thread is getting bogged down in technical points without
a clear overview. Rather than respond to each good point you've
raised, I'd like to try another line which broadly summarizes my view
and is perhaps more useful to the list.

I generally agree with you that double compression is useless, where
we differ is where the single compression step should be applied.  You
propose that it should occur in the MIME format, while I still believe
it should be a transparent service at a level much closer to the OS.

Since XML data seems to be the main beneficiary mentioned thus far of
the CTE (37% base64 overhead aside), let me ask what the proponents
aim to achieve in this instance?  

Other media types such as audio/video/images are already compressed so
won't benefit substantially from a CTE. Is the idea of CTE compression
simply a way to pass the burden of XML compression onto another
system level?

XML is by design a textual format. An explicit goal of this format
is the ability for humans and other programs to easily read and
modify it. The internet message format has similar goals, although
MIME more generally may not.

If space is such a serious concern, I ask: why shouldn't XML applications
simply read and write XML directly in compressed form when needed
(you already gave the example of OpenOffice)?
There would be no need for MIME readers and writers to be adapted at
all, and the physical benefits would be comparable.

Bruce Lilly | 1 Jul 03:26

Re: gzip/deflate compression/encoding


On Thu June 30 2005 20:33, Laird Breyer wrote:

> I generally agree with you that double compression is useless, where
> we differ is where the single compression step should be applied.  You
> propose that it should occur in the MIME format, while I still believe
> it should be a transparent service at a level much closer to the OS.

Although it's a perhaps subtle distinction, my position is that an
end-to-end solution is preferable to multiple hop-by-hop partial
solutions, and that MIME is the most appropriate mechanism for
supporting a general end-to-end solution.

> Since XML data seems to be the main beneficiary mentioned thus far of
> the CTE (37% base64 overhead aside), let me ask what the proponents
> aim to achieve in this instance?  

I'm not particularly a compression proponent and I'm certainly not
an XML proponent.  The gains would be the usual ones for compression;
shorter transmission time, etc.  For email, it probably doesn't
matter much, but for highly interactive protocols (HTTP, IM, etc.)
it may be a significant factor.  As a non-proponent of XML, I'd
personally say pick a more efficient representation, but that's
just me.

> Other media types such as audio/video/images are already compressed so
> won't benefit substantially from a CTE.

A few points:
1. some specific image (etc.) formats can benefit from compression

Frank Ellermann | 1 Jul 05:47

Re: gzip/deflate compression/encoding


Charles Lindsey wrote:

> Are the usual audio, image, etc. formats truly 8bit clean
> (i.e. are they guaranteed not to contain NUL or naked CR or
> LF)?

Of course not, this thread is about introducing yenc as CTE,
in pseudo-REXX (untested, ignoring trailing SP or HT issues):

   CRLF = x2c( 0D0A )
   BAD  = CRLF || x2c( 0 ) || '='
   OUT  = ''
   LIM  = 500 /* REXX idiosyncrasy on my side ;-) */

   do while INPUT \== '' /* strict comparison */
      parse var INPUT TOP 2 INPUT
      TOP = d2c(( c2d( TOP ) + 64 ) // 256 )

      if sign( pos( TOP, BAD )) then OUT = OUT || '='
      OUT = OUT || TOP

      if LIM <= length( OUT ) then do
         call charout /* stdout */, OUT || CRLF
         OUT = ''
      end
   end
   if OUT \== '' then call charout /* stdout */, OUT || CRLF /* flush the last partial line */

> If not, then you are back to the 37+% expansion of base64.


Charles Lindsey | 1 Jul 13:06

Re: gzip/deflate compression/encoding


In <01LQ2KJ1KAXY00004T <at> mauve.mrochek.com> ned+ietf-822 <at> mrochek.com writes:

>Repeating for the Nth time: Media types are supposed to be used to identify
>types of media, not general compression schemes or other stuff. Section 2.2.1
>of RFC 2048 is very clear about this, and this requirement remains unchanged in
>section 4.1 of draft-freed-media-type-reg-04.txt. As such, any attempt to
>register such a media type will be immediately rejected.

>If you want to play around with general compression schemes and MIME in a
>standards-compliant way, nothing prevents you from using x-gzip or whatever
>as a CTE.

I don't want to play around with x-gzip. I am trying to see what options
are available within the present MIME setup.

The problem appears to be that there might be a largish number of
compression algorithms one might wish to use, combined with a variety of
ways of encoding them into 8bit. That would seem to require a rather large
number of new CTEs to be registered.

One would like to be able to write

   Content-Transfer-Encoding: base64; compression=gzip

but that is incompatible with the present syntax, and it is doubtful it
could be introduced as an extension without upping the MIME-Version
number.

Essentially, we have three bits of information to convey:

Charles Lindsey | 1 Jul 18:34

Re: gzip/deflate compression/encoding


In <01LQ2M0CNBXC00004T <at> mauve.mrochek.com> ned+ietf-822 <at> mrochek.com writes:

>> Are the usual audio, image, etc. formats truly 8bit clean (i.e. are they
>> guaranteed not to contain NUL or naked CR or LF)?

>Of course they aren't.

Yes, that's what I thought.

>> If not, then you are
>> back to the 37+% expansion of base64.

>Doesn't follow. A binary-to-8bit encoding has enough characters available that
>the overhead can be limited to at most 1-2%.

Sure, but there is no currently recognized/standardized encoding with that
property (except for yEnc, which I don't think is a serious candidate
yet).
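
As a rough check of the 37% figure, here is a minimal Python sketch. It
assumes the base64 output is wrapped at 76 characters with a CRLF after
each line; the helper name base64_mime is made up for illustration.

```python
import base64

def base64_mime(data: bytes) -> bytes:
    """Base64-encode data and wrap it into 76-character lines ending in CRLF."""
    raw = base64.b64encode(data)
    lines = [raw[i:i + 76] for i in range(0, len(raw), 76)]
    return b"".join(line + b"\r\n" for line in lines)

payload = bytes(57000)  # any content works; only the length matters here
ratio = len(base64_mime(payload)) / len(payload)
print(f"expansion factor: {ratio:.3f}")
```

The 4/3 growth from base64 itself combines with the 78/76 growth from
line endings, which lands a little under 1.37.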

-- 
Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131 Fax: +44 161 436 6133   Web: http://www.cs.man.ac.uk/~chl
Email: chl <at> clerew.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5

ned+ietf-822 | 1 Jul 23:24

Re: gzip/deflate compression/encoding


> In <01LQ2KJ1KAXY00004T <at> mauve.mrochek.com> ned+ietf-822 <at> mrochek.com writes:

> >Repeating for the Nth time: Media types are supposed to be used to identify
> >types of media, not general compression schemes or other stuff. Section 2.2.1
> >of RFC 2048 is very clear about this, and this requirement remains unchanged in
> >section 4.1 of draft-freed-media-type-reg-04.txt. As such, any attempt to
> >register such a media type will be immediately rejected.

> >If you want to play around with general compression schemes and MIME in a
> >standards-compliant way, nothing prevents you from using x-gzip or whatever
> >as a CTE.

> I don't want to play around with x-gzip. I am trying to see what options
> are available within the present MIME setup.

There's only one: Define one or more new CTEs.

> The problem appears to be that there might be a largish number of
> compression algorithms one might wish to use,

"Might wish" != "really need"

> combined with a variety of ways of encoding them into 8bit.

There is certainly no reason for there to be more than one of these.

> That would seem to require a rather large
> number of new CTEs to be registered.


Bruce Lilly | 2 Jul 00:03

Re: gzip/deflate compression/encoding


On Thu June 30 2005 15:28, ned+ietf-822 <at> mrochek.com wrote:

> Doesn't follow. A binary-to-8bit encoding has enough characters available that
> the overhead can be limited to at most 1-2%.

for random data.

The yEnc method will yield a 2x expansion for a bitmap image of
an elephant in a fog (every pixel has a value of 0xD6).  That's
considerably worse than a 37% increase.
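
The doubling can be reproduced with a short Python sketch. It assumes
the commonly published yEnc rule (add 42 modulo 256, then escape NUL,
LF, CR and '=' as '=' followed by the value plus 64) and ignores line
breaks; the function names are made up for illustration.

```python
CRITICAL = (0x00, 0x0A, 0x0D, 0x3D)  # NUL, LF, CR, '='

def yenc_encode_byte(b: int) -> bytes:
    """Encode one octet: offset by 42, escaping critical results with '='."""
    e = (b + 42) % 256
    if e in CRITICAL:
        return bytes([0x3D, (e + 64) % 256])
    return bytes([e])

def yenc_encode(data: bytes) -> bytes:
    """Encode a whole buffer, without inserting any line breaks."""
    return b"".join(yenc_encode_byte(b) for b in data)

fog = bytes([0xD6]) * 1000  # every pixel has the value 0xD6
encoded = yenc_encode(fog)
print(len(encoded) / len(fog))
```

Since 0xD6 + 42 wraps around to 0 (NUL), every input octet needs the
two-octet escape form.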

ned+ietf-822 | 2 Jul 00:33

Re: gzip/deflate compression/encoding


> On Thu June 30 2005 15:28, ned+ietf-822 <at> mrochek.com wrote:

> > Doesn't follow. A binary-to-8bit encoding has enough characters available that
> > the overhead can be limited to at most 1-2%.

> for random data.

> The yEnc method will yield a 2x expansion for a bitmap image of
> an elephant in a fog (every pixel has a value of 0xD6).  That's
> considerably worse than a 37% increase.

No, for any data. It is relatively easy to design a scheme that limits the
overhead to 1-2% no matter what the input. The question is which is worth more,
some measure of yEnc compatibility or having this guarantee. I'm personally
somewhat ambivalent on this and would appreciate hearing other opinions.

				Ned

Bruce Lilly | 2 Jul 05:02

Re: gzip/deflate compression/encoding


On Fri July 1 2005 18:33, ned+ietf-822 <at> mrochek.com wrote:
> It is relatively easy to design a scheme that limits the
> overhead to 1-2% no matter what the input.

Maybe, depending on the constraints.  The minimum constraints on the
output are:
o CRLF only for line endings, no lone CR or lone LF
o no NUL
o line length <= 998 octets
(for that is the definition of 8bit).  The input of course is an
unconstrained sequence of octets.
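
Those three output constraints can be expressed as a small checker. This
is only a Python sketch under the assumptions listed above; the name
violates_8bit and the wording of the messages are made up for
illustration.

```python
def violates_8bit(body: bytes, max_line: int = 998) -> list:
    """Return a list of the 8bit constraints that body breaks (empty if none)."""
    problems = []
    if b"\x00" in body:
        problems.append("NUL")
    # Remove proper CRLF pairs; any CR or LF left over is a lone one.
    leftover = body.replace(b"\r\n", b"")
    if b"\r" in leftover:
        problems.append("lone CR")
    if b"\n" in leftover:
        problems.append("lone LF")
    # Line length is counted exclusive of the terminating CRLF.
    if any(len(line) > max_line for line in body.split(b"\r\n")):
        problems.append(f"line longer than {max_line} octets")
    return problems

print(violates_8bit(b"hello\r\nworld\r\n"))
print(violates_8bit(b"bad\nline"))
```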

About 1.4% expansion should be possible with only those constraints,
a fairly simple algorithm (faster decode than encode), and moderate
encoder memory requirements (an 84 octet input buffer).

Adding constraints makes things more difficult.  For example,
constraining line length to 76 octets as is the case with
the other CTEs necessitates an expansion by 78/76 which is >2%
overhead exclusive of any other considerations.

Staying within 2% expansion while avoiding lone CR, lone LF, and NUL
implies a line length of about 250 octets.
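
The line-ending part of these figures is simple arithmetic: each line of
N octets carries a 2-octet CRLF. A minimal Python sketch (the function
name is made up for illustration):

```python
def crlf_overhead(line_length: int) -> float:
    """Fraction added by one CRLF per line of line_length octets."""
    return 2 / line_length

for n in (76, 250, 998):
    print(n, f"{crlf_overhead(n):.2%}")
```

At 76 octets the CRLF cost alone is about 2.6%, and at 998 octets about
0.2%. One possible reading of the 250-octet figure, which is an
assumption and not stated above, is that roughly 1.2% goes to escaping
and the remaining 0.8% is what 2/250 costs in line endings.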

Other constraints include amount of memory an encoder or decoder
might require, restrictions on leading or trailing whitespace,
avoiding other troublesome output sequences, whether or not a
stream can be encoded on the fly, and complexity of encoder and/or
decoder.

Tony Hansen | 7 Jul 20:58

OPES and Email


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Heads up, email experts!

I don't know if you saw the recent I-D that came out, entitled
OPES SMTP Use Cases (draft-ietf-opes-smtp-use-cases-02.txt).

As the abstract says:

   The Open Pluggable Edge Services (OPES) framework is application
   agnostic.  Application specific adaptations extend that framework.
   This document describes OPES SMTP use cases and deployment scenarios
   in preparation for SMTP adaptation with OPES.

The intent is to provide access from within email to the OPES filters
being used with HTTP. (Virus scanners are an example.)

I'm co-chair of the OPES group, and am aware of its potential
ramifications to the email world. So, I'd appreciate it if you could
review the use-cases document. Discussion about the document is
occurring on the ietf-openproxy <at> imc.org mailing list. (If you'd rather
not join that list, I can channel your comments.)

Thanks!

	Tony Hansen
	tony <at> att.com
-----BEGIN PGP SIGNATURE-----

