Marcin Dziwnowski | 3 Nov 2009 15:33

Downloading old articles

Hello,

I would like to download some (ok, quite a lot) old messages for my
archive. My goal is to find and save every article posted by me to a
particular group during the last 3 years. Having the archive of the
group for that time locally would be a nice bonus.

With Google Groups broken it's easier to grep the damn posts than to
find something there. Leafnode stores articles conveniently for
grepping, so I tried using fetchnews for downloading. Unfortunately,
after a certain number of posts it just gives up:
# fetchnews -vvvv -x 235000
 backing up from 1706789 to 1471789
 considering articles 1471789 - 1706793
 0 articles fetched, 0 killed
No information about why it does not attempt to download the missing
posts is given (or maybe leafnode logs it somewhere?). I am pretty
confident, but not 100% sure, that the server leafnode downloads from
has more articles and it allows downloading them.

Is there a way to make fetchnews copy more articles? Or maybe some
smarter tool for the job? Preferably a tool that would run under
Linux.

I apologize if this mail is off-topic.
-- 
_______________________________________________
leafnode-list mailing list
leafnode-list@...
https://www.dt.e-technik.uni-dortmund.de/mailman/listinfo/leafnode-list

Whiskers | 3 Nov 2009 15:57

Re: Downloading old articles

On Tue, 3 Nov 2009 15:33:37 +0100 Marcin Dziwnowski
<m.dziwnowski@...> wrote:

[...]

> I am pretty
> confident, but not 100% sure, that the server leafnode downloads from
> has more articles and it allows downloading them.

Telnet into the server and send it the GROUP command specifying the group 
you're interested in; "Assuming the group specified exists, the server 
returns to the client the numbers of the first and last articles currently 
in the group, along with an estimate of the number of messages in the 
group. The server's internal article pointer is also set to the first 
message in the group." 
<http://www.tcpipguide.com/free/t_NNTPCommands-2.htm>.

If you then fetch the first article the date of it will indicate how far 
back the articles are retained for that group on that server.
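The same check can be scripted instead of typed into telnet. This is a minimal Python sketch, assuming the standard NNTP reply format "211 count first last group" and a server that needs no authentication. The function names are made up for illustration, and any host or group name passed to ask_group is the reader's own; nothing here is part of leafnode.

```python
import socket


def parse_group_reply(line):
    """Parse an NNTP GROUP reply such as '211 1234 3000234 3002322 misc.test'."""
    parts = line.split()
    if len(parts) < 5 or parts[0] != "211":
        raise ValueError("GROUP failed: " + line.strip())
    count, first, last = (int(p) for p in parts[1:4])
    return {"count": count, "first": first, "last": last, "group": parts[4]}


def ask_group(host, group, port=119, timeout=30):
    """Send GROUP to a server and return the parsed reply (hypothetical helper)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with sock.makefile("rwb") as stream:
            stream.readline()  # discard the server greeting line
            stream.write(("GROUP " + group + "\r\n").encode("ascii"))
            stream.flush()
            reply = stream.readline().decode("ascii", "replace")
            stream.write(b"QUIT\r\n")
            stream.flush()
    return parse_group_reply(reply)
```

Comparing the "first" number with the range fetchnews reports shows whether the server still holds the older articles. Sending HEAD with that first article number would then return its headers, including the Date line mentioned above.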

> Is there a way to make fetchnews copy more articles? Or maybe some
> smarter tool for the job? Preferably a tool that would run under
> Linux.
> 
> I apologize if this mail is off-topic.

I think the easiest way would be to use a newsreader to connect directly 
to the server (not to your Leafnode) and subscribe to that group and 
download all the messages.  You might have to over-ride a 'maximum number 
of messages' setting first.  Slrn+slrnpull would create a local spool in 
[...]

Matthias Andree | 4 Nov 2009 02:51

Re: Downloading old articles

On 03.11.2009 at 15:33, Marcin Dziwnowski  
<m.dziwnowski@...> wrote:

> I would like to download some (ok, quite a lot) old messages for my
> archive. My goal is to find and save every article posted by me to a
> particular group during the last 3 years. Having the archive of the
> group for that time locally would be a nice bonus.
>
> With Google Groups broken it's easier to grep the damn posts than to
> find something there. Leafnode stores articles conveniently for
> grepping, so I tried using fetchnews for downloading. Unfortunately,
> after a certain number of posts it just gives up:
> # fetchnews -vvvv -x 235000
>  backing up from 1706789 to 1471789
>  considering articles 1471789 - 1706793
>  0 articles fetched, 0 killed
> No information about why it does not attempt to download the missing
> posts is given (or maybe leafnode logs it somewhere?). I am pretty
> confident, but not 100% sure, that the server leafnode downloads from
> has more articles and it allows downloading them.

Probably nothing offered. Does the server support XOVER? If not, try  
usexhdr=1 (in the config file, under the server=... line). Otherwise
please see the section "TROUBLESHOOTING" in the README file that comes  
with leafnode.

> Is there a way to make fetchnews copy more articles? Or maybe some
> smarter tool for the job? Preferably a tool that would run under
> Linux.

[...]

Marcin Dziwnowski | 5 Nov 2009 03:59

Re: Downloading old articles

On Tue, Nov 3, 2009 at 3:33 PM, Marcin Dziwnowski
<m.dziwnowski@...> wrote:

> # fetchnews -vvvv -x 235000
>  backing up from 1706789 to 1471789
>  considering articles 1471789 - 1706793
>  0 articles fetched, 0 killed
> No information about why it does not attempt to download the missing
> posts is given

As it turns out the server I am using probably just can't cope with
listing the demanded 235 000 posts all at once. The articles are
there, the xover command is recognized but it's just too much.

Is there a way to make fetchnews download posts in "batches", ten,
maybe twenty thousand every run? Without overloading the server with
listing three, four, five or six hundred thousand articles first?
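To make the batching idea concrete, here is a small Python sketch. It is not a fetchnews feature and the helper name is hypothetical: it only splits the article range from the fetchnews output quoted above into fixed-size windows and formats the "XOVER first-last" range command a client could send for each window. The batch size of 20000 is just the upper figure from the question.

```python
def batch_ranges(first, last, size):
    """Yield inclusive (low, high) article-number windows covering first..last."""
    if size <= 0:
        raise ValueError("size must be positive")
    low = first
    while low <= last:
        high = min(low + size - 1, last)
        yield low, high
        low = high + 1


# Range reported by fetchnews in this thread: 1471789 - 1706793
batches = list(batch_ranges(1471789, 1706793, 20000))

# One overview request per window instead of one request for everything
commands = ["XOVER %d-%d" % (low, high) for low, high in batches]
```

Each window asks the server for at most 20000 overview lines, so no single request covers the whole backlog.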

> Or maybe some smarter tool for the job?

The "smarter tool" term was very unfortunate, Matthias, I'm sorry.
-- 
_______________________________________________
leafnode-list mailing list
leafnode-list@...
https://www.dt.e-technik.uni-dortmund.de/mailman/listinfo/leafnode-list
http://leafnode.sourceforge.net/

Matthias Andree | 5 Nov 2009 09:28

Re: Downloading old articles

On 05.11.2009 at 03:59, Marcin Dziwnowski  
<m.dziwnowski@...> wrote:

> On Tue, Nov 3, 2009 at 3:33 PM, Marcin Dziwnowski
> <m.dziwnowski@...> wrote:
>
>> # fetchnews -vvvv -x 235000
>>  backing up from 1706789 to 1471789
>>  considering articles 1471789 - 1706793
>>  0 articles fetched, 0 killed
>> No information about why it does not attempt to download the missing
>> posts is given
>
> As it turns out the server I am using probably just can't cope with
> listing the demanded 235 000 posts all at once. The articles are
> there, the xover command is recognized but it's just too much.

Would XHDR work? Try   usexhdr = 1   in the .../config file below your  
"server = example.pl" line.

> Is there a way to make fetchnews download posts in "batches", ten,
> maybe twenty thousand every run? Without overloading the server with
> listing three, four, five or six hundred thousand articles first?

I figured that in leafnode 1 at least (didn't check leafnode 2)  
maxfetch=12345 overrides larger fetchnews -x 98765 values. I'm not sure if  
I'd consider that a bug or feature and if I want to fix it. Given  
leafnode-1's low release frequency and really long propagation into  
distributions I have some reservations about touching it.

[...]

Marcin Dziwnowski | 5 Nov 2009 15:51

Re: Downloading old articles

On Thu, Nov 5, 2009 at 9:28 AM, Matthias Andree <matthias.andree@...> wrote:

> Would XHDR work? Try   usexhdr = 1   in the .../config file below your
> "server = example.pl" line.

/var/log/news/news.notice says:
fetchnews[6630]: config: unknown line 12: "usexhdr = 1"

Guess I should compile leafnode instead of using distro's version and
try this? I'll be back ;)

Marcin Dziwnowski | 6 Nov 2009 00:09

Re: Downloading old articles

On Thu, Nov 5, 2009 at 3:51 PM, Marcin Dziwnowski
<m.dziwnowski@...> wrote:

>> Would XHDR work? Try   usexhdr = 1   in the .../config file below your
>> "server = example.pl" line.
>
> /var/log/news/news.notice says:
> fetchnews[6630]: config: unknown line 12: "usexhdr = 1"
>
> Guess I should compile leafnode instead of using distro's version and
> try this? I'll be back ;)

I don't understand it, same thing:
config: unknown line 12: "usexhdr = 1"

Robert Grimm | 6 Nov 2009 10:55

Re: Downloading old articles

Marcin Dziwnowski <m.dziwnowski@...> wrote:
> I don't understand it, same thing:
> config: unknown line 12: "usexhdr = 1"

Apparently it is "noxover = 1" in leafnode 1.

Rob
-- 
In C we had to code our own bugs. 
In C++ we can inherit them.

Marcin Dziwnowski | 12 Nov 2009 19:12

Re: Downloading old articles

On Fri, Nov 6, 2009 at 10:55 AM, Robert Grimm <lists@...> wrote:

> Apparently it is "noxover = 1" in leafnode 1.

... and it doesn't help. Leafnode still doesn't download more articles.
Well, the fault is more on the server side than leafnode's, but is
there another way around it?

Werner Geuens | 12 Nov 2009 21:50

Re: Downloading old articles

On Thursday 12 November 2009 19:12:44 Marcin Dziwnowski wrote:
> On Fri, Nov 6, 2009 at 10:55 AM, Robert Grimm <lists@...> wrote:
> > Apparently it is "noxover = 1" in leafnode 1.
>
> ... and it doesn't help. Leafnode still doesn't download more articles.
> Well, the fault is more on the server side than leafnode's, but is
> there another way around it?

I haven't been reading the complete history of this thread.

But can't you use fetchnews -x 10000 to get the last 10000 articles?

Replace 10000 with whatever number you want. Make it gradually larger in 
subsequent calls.

Articles you already have won't be downloaded again, so...

Or am I seriously missing the point of this subject?
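
The suggestion above can be sketched as a short Python script. It uses only the fetchnews -x option that appears in this thread; the function names, the starting value, the step and the target of 235000 are illustrative assumptions, and the script only builds the command lines unless run_all is called explicitly.

```python
import subprocess


def growing_fetch_commands(start, target, step):
    """Build fetchnews command lines with -x growing from start up to target."""
    if start <= 0 or step <= 0:
        raise ValueError("start and step must be positive")
    commands = []
    n = start
    while n < target:
        commands.append(["fetchnews", "-x", str(n)])
        n += step
    # Finish with the full depth that was originally requested
    commands.append(["fetchnews", "-x", str(target)])
    return commands


def run_all(commands):
    """Run each command in turn, stopping at the first one that fails."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# 10000, 20000, ... up to the 235000 used earlier in the thread
plan = growing_fetch_commands(10000, 235000, 10000)
```

As noted earlier in the thread, a maxfetch setting in the leafnode 1 config overrides larger -x values, so that setting may need raising before the deeper runs have any effect.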

-- 
When in doubt, use brute force.
                -- Ken Thompson
