Martin Bähr | 9 Jul 13:33

www.consortiuminfo.org/standardsblog/ blocks wwwoffle?

hi,

is anyone here able to access www.consortiuminfo.org/standardsblog/ through
wwwoffle?  i keep getting the message below, and i just don't get why they
would block a proxy, or how they would even detect one.

what does wwwoffle do to the request? is it possible to make wwwoffle send out
the request exactly as it received it from the browser?

----------------- error message: ---------------------
Precondition Failed

We're sorry, but we could not fulfill your request for
/standardsblog/article.php?story=20080708052706429 on this server.

We have established rules for access to this server, and any person or robot that violates these rules will
be unable to access this site.

To resolve this problem, please try the following steps:

    * Ensure that your computer is free of viruses, Trojan horses, spyware or any other sort of malicious software.
    * If you are using any sort of personal firewall or browser privacy software, check to ensure that its
settings do not cause your web browser to inadvertently violate any of the rules listed below.
    * If you are behind a Web proxy or corporate firewall, the proxy must conform to the HTTP specification with
respect to proxy servers. Contact your network administrator if the trouble persists, or bypass the
proxy and connect directly if possible.
    * Disable any download accelerators you may be using. They don't speed up your downloads anyway; in most
cases, they actually run slower!
    * If all else fails, try using a different Web browser, such as Firefox.


jidanni | 1 Mar 19:10

DontCache still ends up in outgoing

Complaint: offline both
$ wwwoffle SomeDontGetURL
$ wwwoffle SomeDontCacheURL
both say
Requesting ThatURL
and return $?=0 to the shell,
even though WWWOFFLE intends to do no such fetching.

At least one can do
  # grep 'not to get' /var/log/syslog
  wwwoffles[5218]: The URL 'http://example.net/f.jpg' matches one in the
  list not to get.
to know about the former, but what about the latter?
The latter still ends up in http://localhost:8080/index/outgoing/
Clicking on it there, still here offline, says
  Your request for URL
  http://en.wikipedia.org/w/index.php?title=List_of_thinking_errors&action=edit
  failed because it is on the list of hosts and/or paths that are not to
  be cached and cannot be requested when offline.
Well, OK, then it should be barred from ending up in outgoing too.
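For the DontGet case, the syslog check above can be scripted. This is only a rough Python sketch: it assumes the message wording is exactly what is quoted above and that each message sits on one log line; the function name is made up for illustration.

```python
import re

# Wording copied from the syslog message quoted above (assumed stable).
NOT_TO_GET = re.compile(r"The URL '([^']+)' matches one in the list not to get")

def dont_get_urls(log_lines):
    """Return the URLs that the log says matched the 'not to get' list."""
    urls = []
    for line in log_lines:
        match = NOT_TO_GET.search(line)
        if match:
            urls.append(match.group(1))
    return urls

if __name__ == "__main__":
    import sys
    # Usage: python3 thisfile.py < /var/log/syslog
    for url in dont_get_urls(sys.stdin):
        print(url)
```

It does nothing for the DontCache case, which is the one that still lands in outgoing.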

OK, to check for the latter one would do, after fetching,
# less +/not\ possible /var/log/syslog
(note I use maximum debug level for my messages)

Anyway, if the shell returned 1 and a message for both, one could much
more easily tell which of one's command line requests one had better make
other plans for (fetching by hand, as they are on our DontGet and
DontCache lists), rather than thinking WWWOFFLE will remember to fetch
them for us when indeed it has no such plans, and that will be last we

jidanni | 27 Nov 19:46

executives still use WWWOFFLE to keep track of what's already read

Gentlemen, assuming that one has "the works" in the latest computing
equipment and network connections, why would one still use WWWOFFLE?

Well, certainly one cannot keep track of which articles one has already
read on a site with many articles. So with
Purge
{
 age=-1
}
one will always know which articles one has already read by looking
at their link colors (etc., as I mentioned in one of my previous postings).

Karsten Kruse | 6 Oct 20:25

404 with wwwoffle, page delivered without

Hi,

when i visit a certain URL i get error 404 with wwwoffle and the page 
without a proxy.

Here is the URL:

http://www.instructables.com/id
/Bicyle-Power-for-Your-Television,-Laptop,-or-Cell-/?relatedLink

(note that i broke the URL to fit it in here, it should read
...id/Bi...)

Can anyone confirm and/or explain that? Here is my environment:

Server
- NetBSD 4.0_BETA2 i386
- wwwoffle version 2.9a

Client
- Debian 4.0r1 i386
- Mozilla Firefox 2.0.0.5

Thanks for your time.

MFG,

Karsten Kruse


Ian Stirling | 2 Sep 13:48

Not quite wwwoffle - for mobiles and low bandwidth use.

I've used wwwoffle for some years, and it 'just works' for me.

So, naturally, when I realised a need for a possibly related bit of 
software, and lacking results from Google, I started wondering if anyone 
had thought of, or knows of, an implementation or proof of concept with 
wwwoffled.

Basically, it's a two-part web proxy to drastically reduce web usage 
bandwidth.
One part resides on a mobile device with a (usually) poor bandwidth 
link, but a relatively large amount of storage, that may occasionally be 
plugged into a high speed network.

The other part is on a server, connected via a fast connection to the 
internet.

To quote a page I wrote describing this.

-------------------------------------

"This is a brief page describing a web proxy optimised for use on 
devices with a reasonable amount of persistent storage, and very limited 
bandwidth.

Once, each page linked to a subpage of contents, which remained static, 
and could be easily refreshed if it changed based on dates in the HTTP 
headers.

Now, this is the case in the minority of popular sites. Most sites now 
have a substantial fraction of pages with some non-static content.

jidanni | 1 Sep 15:32

Dan Jacobson does not fall for "SpamBLK"

e> To: exp@...
e> Subject: [WWWOFFLE-Users] only-same-host-frames
e> The person you tried to send this email to is using SpamBLK...
e> If you don't click the link above your previous email will be deleted.

Wait, it must be a trap Dan! Don't click! It must be a spam-bot
subscribed to the list, hoping to harvest "confirmed live ones". Well,
I'll have him know that I just happen to be College Educated -- no
easy mark. Ha!

jidanni | 1 Sep 03:06

only-same-host-frames

Gentlemen, it's me again with another brilliant idea.
You know when those sites pull in those ad frames,
URL='http://news.com.com/Developing+nations+losing+spam+battle,+
Default Recursive Fetch options: stylesheets=0 images=0 frames=2
Frame=http://view.atdmt.com/M0N/iview/cntcmssc0770000080m0n/dire
Frame=http://view.atdmt.com/MRT/iview/cntnkinf0250006355mrt/dire

Well, there needs to be an only-same-host-frames variable, on the
model of only-same-host-images, for those of us that use frames=yes.
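The filtering rule being asked for can be sketched in a few lines of Python. This is not WWWOFFLE code and only-same-host-frames does not exist as an option; the function name and the example.org URLs are made up for illustration.

```python
from urllib.parse import urlsplit

def same_host_frames(page_url, frame_urls):
    """Keep only the frame URLs whose host matches the host of the page."""
    page_host = (urlsplit(page_url).hostname or "").lower()
    return [
        url for url in frame_urls
        if (urlsplit(url).hostname or "").lower() == page_host
    ]

if __name__ == "__main__":
    # Hypothetical page and frames: one ad frame on another host, one local frame.
    page = "http://news.example.org/article.html"
    frames = [
        "http://ads.example.net/iview/banner",
        "http://news.example.org/frames/nav.html",
    ]
    print(same_host_frames(page, frames))
```

The comparison is on the exact host name, so a frame on a sibling subdomain of the same site would also be dropped; whether that is wanted is a design choice.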

Andy Rabagliati | 19 Aug 16:43

SSL certificates

Folks,

  I have a slightly different use-case for wwwoffle - I scoop websites
  on one machine with wwwoffle, tar up the files, and pass them to
  another machine via UUCP.

  It allows me to provide web services for disconnected networks.

  I am upgrading to the latest version of wwwoffle, and have bumped into
  a problem or two.

  First, using Ubuntu Feisty and wwwoffle 2.9a-2 the creation of root
  certificates is not reliable on startup - I often get an empty file.
  There seems to be some discussion of this on the list, and maybe I
  need a newer version.

  Second, the path to those certificates seems to be hardcoded into the
  binary, at /etc/wwwoffle/certificates.

  I don't really like this, as I create dynamic wwwoffle instances on
  the fly, in /var/tmp/wwwoffle2345/* as a different user (uucp) and now
  it tangles up with my 'upstream' wwwoffle instance on the same
  machine.

  First prize for me is a way to disable all the SSL stuff completely,
  as all this happens unattended so SSL is not that necessary.

  For the moment I now run the master wwwoffle as the uucp user (ugh)
  and all instances share the certificates.


Joshua Fein | 29 Jul 16:58

Re: Release of sho_title0.9

sho_title0.94

changes:

Three new tags:  SIZEGT, TYPE and COPY.
Allow selection of objects based on size and mime type, display size and type information.
Provide mechanism by which selected objects can be exported outside WWWOFFLE.

minor bugfixes: formatting, documentation.

Available at:

http://sourceforge.net/project/platformdownload.php?group_id=182008

OR

https://sourceforge.net/project/platformdownload.php?group_id=182008

(Both URLs do EXACTLY the same thing!)

Paul Slootman | 19 Jul 12:25

extra spaces being added

While investigating
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=395009 (which is a
weird bug...) I noticed that pages received via wwwoffle were larger
than directly received pages. Wwwoffle is adding spaces before />
strings.

To demonstrate:

# echo "<bla />"|wwwoffle-write --addheader http://foo.bar/bla

# wwwoffle-read http://foo.bar/bla | cat -vet
HTTP/1.0 200 OK^M$
Content-Type: text/html^M$
^M$
<bla />$

# http_proxy=http://localhost:8080/ GET http://foo.bar/bla | cat -vet
<bla  />$

Now this shouldn't be a problem for properly written browsers (I've
checked that it doesn't happen with e.g. text/plain content), but I
suspect an off-by-one error somewhere...
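To check a whole page rather than a one-line test, the two bodies (direct and via the proxy) can be compared mechanically. A small Python sketch, assuming both bodies contain the same "/>" closers in the same order; the function names are made up for illustration.

```python
import re

# Captures the run of spaces/tabs immediately before each "/>".
BEFORE_CLOSER = re.compile(r"([ \t]*)/>")

def spaces_before_closers(body):
    """Return, for each '/>' in body, how many spaces or tabs precede it."""
    return [len(run) for run in BEFORE_CLOSER.findall(body)]

def added_spaces(direct, proxied):
    """Per '/>' closer, how many more spaces the proxied body has."""
    pairs = zip(spaces_before_closers(direct), spaces_before_closers(proxied))
    return [after - before for before, after in pairs]

if __name__ == "__main__":
    # The two bodies from the demonstration above.
    print(added_spaces("<bla />", "<bla  />"))
```

A result of all ones would fit the "one extra space per />" pattern seen above.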

Paul Slootman

Miernik | 3 Jul 19:31

is it possible to have freshness information in response headers?

Here is what I want to do: I want to turn off add-cache-info which I
currently use to see if the page I am browsing is fresh (x minutes ago
added to the page HTML), and instead I want some Firefox extension to
present that information in the Firefox status bar.

That's because the AddCacheInfo thing often breaks the page layout.

For that I would need WWWOFFLE to have the "x minutes ago" information
in some HTTP header in the response it gives to the browser. Is that
possible, and if not now, could you consider implementing it?
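On the client side, the "x minutes ago" figure could be derived from any header that carries the fetch time. A rough Python sketch of that arithmetic, assuming (and this is only an assumption, not confirmed WWWOFFLE behaviour) that the Date header of the cached response reflects when the page was fetched; the function name is made up for illustration.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def minutes_since(http_date, now=None):
    """Whole minutes elapsed between an HTTP date string and 'now' (UTC)."""
    fetched = parsedate_to_datetime(http_date)
    if fetched.tzinfo is None:
        # Dates given as "-0000" parse as naive; treat them as UTC.
        fetched = fetched.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return int((now - fetched).total_seconds() // 60)

if __name__ == "__main__":
    # Example with a fixed "now" so the result does not depend on the clock.
    fixed_now = datetime(2008, 7, 3, 18, 1, 0, tzinfo=timezone.utc)
    print(minutes_since("Thu, 03 Jul 2008 17:31:00 GMT", fixed_now))
```

A browser extension would do the same calculation in its own language; the point is only that one timestamp header is enough.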

-- 
Miernik
http://miernik.name/