Bob McElrath <mcelrath+filterproxy <at> draal.physics.wisc.edu>
2003-04-23 18:21:38 GMT
dar santos [junedar <at> hotmail.com] wrote:
> Yes. Got it running. It seems so weird that I'd rather not spend my time
> tracing the problem. I reinstalled woody and used a different http source to
> update apt-get, and then there it was. Maybe the initial source I got had some
> un-updated packages. Well, anyway. Congrats. It's impressive and fast, considering
> that it's based on perl. Its performance is comparable to any C-based
> application. Great work. Now I can start on what I had initially
> planned: porting it to win32.
I'm glad you like it. :)
> Anyway, I noticed the Imagecomp module (based on
> imagemagick convert, I presume). If I'm not mistaken, it is not activated by
> default and the user has to enable it.
That module was contributed without a config page, and I have never
really used it so it is kind of decaying... I wrote a config page for
it though and put it in CVS. It will be in the next release, when I get
around to it...
> I don't know if you would be interested, and correct me if I'm wrong, but the whole
> idea of http compression (text/html, xml, etc.), like the implementation in
> mod-gzip, would be less significant to those using dialup. Most modems
> have hardware compression, and there is also software-based compression
> (Stac, LZS, MPPC). Compressing already compressed content (gzip encoded)
> would be useless, if not add overhead to the browsing process. I came across
> some data where, instead of gzip compression, html and other content is parsed
> or rewritten (I don't know if this is the right term). I tested it
> several times and it's really impressive: the size reduction is up to 40-50%.
> And still the output is plain html (not compressed). If you are
> interested in this, I would gladly look up the data again and send it.
On the contrary, the speedup over a modem is simply astounding. I don't
fully understand why, but it is visibly faster (by my measurements, 5
times faster or more). You have to have FilterProxy running on a
fast-connected server so that it can feed compressed stuff over the
modem. Just try it. ;)
I think gzip is a more efficient algorithm for compressing text than any
used by a modem (typical compression ratios for HTML are 5x to 10x).
Not only that, but modems suffer from latency. It can take 300ms to
fetch a 0-byte file from a server. Image-heavy pages are the pits over
a modem. By removing ads, FilterProxy typically reduces the number of
connections your browser needs to make to render the page, thereby
speeding it up significantly.
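
Both effects above can be checked with a quick experiment. The following is a minimal sketch in Python (not FilterProxy code, which is Perl): `gzip_ratio` measures how much a block of HTML shrinks under gzip, and `round_trip_cost` is a deliberately crude model that assumes requests are made one after another at 300ms each. The sample page and both function names are hypothetical, chosen only for illustration.

```python
import gzip


def gzip_ratio(data: bytes, level: int = 9) -> float:
    """Return original size divided by gzip-compressed size."""
    compressed = gzip.compress(data, compresslevel=level)
    return len(data) / len(compressed)


def round_trip_cost(num_requests: int, latency_s: float = 0.3) -> float:
    """Seconds spent on per-request latency alone, assuming serial requests."""
    return num_requests * latency_s


# Hypothetical sample page: one table row repeated many times.
# This is far more repetitive than real HTML, so the ratio it gives
# will be well above what a typical page achieves.
sample_html = (
    '<tr><td class="item"><a href="/story">Headline</a></td></tr>\n' * 200
).encode("ascii")

if __name__ == "__main__":
    print(f"gzip ratio on sample: {gzip_ratio(sample_html):.1f}x")
    print(f"latency, 20 requests: {round_trip_cost(20):.1f} s")
    print(f"latency,  8 requests: {round_trip_cost(8):.1f} s")
```

To get a realistic number, feed `gzip_ratio` the bytes of a saved web page instead of the synthetic sample. The latency model ignores parallel connections and keep-alive, so treat it as an upper bound on what removing requests can save.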
It is possible to also parse HTML and rewrite it to be smaller. The
typical HTML file contains a lot of whitespace, comments, etc. that can
be removed without changing the appearance of the page. However,
parsing HTML is extremely CPU intensive. In my tests it would take
several seconds to do this on a modern CPU. There are other tools out
there that do this (even perl modules). If you are interested in
pursuing this, I would definitely accept such a module, but I think it
would be slow.
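
As a rough illustration of the kind of rewriting discussed above, here is a naive sketch in Python that removes comments and collapses runs of whitespace. It uses regular expressions rather than a real HTML parser, so it is not safe in general: it would damage `<pre>` blocks, inline scripts, and anything that relies on conditional comments. The function name `shrink_html` is hypothetical.

```python
import re

# Matches an HTML comment, including ones that span several lines.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
# Matches any run of spaces, tabs, or newlines.
WHITESPACE_RE = re.compile(r"\s+")


def shrink_html(html: str) -> str:
    """Drop comments, then collapse each whitespace run to one space."""
    without_comments = COMMENT_RE.sub("", html)
    return WHITESPACE_RE.sub(" ", without_comments).strip()


if __name__ == "__main__":
    page = """
    <html>
      <!-- navigation starts here -->
      <body>
        <p>  Hello,
             world.  </p>
      </body>
    </html>
    """
    small = shrink_html(page)
    print(small)
    print(f"{len(page)} bytes -> {len(small)} bytes")
```

How much this saves depends entirely on the page, and the saving overlaps with what gzip already removes, since whitespace runs compress very well.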
The slowness of parsing HTML is why I chose a regex-based method to
strip ads. If I used a full HTML parser (like the perl module
HTML::Parser) it would be extremely slow.
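
To make the regex-based approach concrete, here is a minimal sketch in Python. It is not FilterProxy's actual rule set or code; the ad-server patterns (`ads.example.com`, `/banners/`) and the function name are assumptions made up for this example. It deletes any `<img>` tag whose `src` matches one of those patterns in a single pass over the text, without building a parse tree.

```python
import re

# Hypothetical ad patterns; real filter rules would be user-configured.
AD_IMG_RE = re.compile(
    r'''<img\b[^>]*\bsrc\s*=\s*["']?[^"'>\s]*(?:ads\.example\.com|/banners/)[^>]*>''',
    re.IGNORECASE,
)


def strip_ad_images(html: str) -> str:
    """Remove <img> tags whose src points at a known ad location."""
    return AD_IMG_RE.sub("", html)


if __name__ == "__main__":
    page = (
        '<p>News</p>'
        '<img src="http://ads.example.com/b1.gif" width="468">'
        '<img src="/photos/cat.jpg">'
    )
    print(strip_ad_images(page))
```

The trade-off is the usual one: a regex is fast but can be fooled by unusual markup, such as a `>` inside an attribute value, which a full parser would handle correctly.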
> Thanks very much, and I'll write as soon as I can manage to make your program
> run on win32, or win64, that is.
Bob McElrath [Univ. of Wisconsin at Madison, Department of Physics]
"You measure democracy by the freedom it gives its dissidents, not the
freedom it gives its assimilated conformists." -- Abbie Hoffman