Gregory Maxwell | 6 Mar 2006 05:11

Text access on toolserver?

What is the status of getting text access back on Toolserver?
Is there anything I can do to make it happen?

The lack of text access is killing most of my projects other than toy
statistics gathering.

Daniel Kinzler | 14 Mar 2006 16:53

Multi-Master replication (was: Yahoo-Cluster and Replication)

Hi again

Yesterday, Kate told me that the problem with replication from the Asian
cluster is that MySQL can only connect to one replication master. I have
googled a bit, and it appears that this is not true (at least for MySQL
5.1): http://dev.mysql.com/doc/refman/5.1/en/replication-intro.html says:

Multiple-master replication is possible, but raises issues not present
in single-master replication. See Section 6.15, “Auto-Increment in
Multiple-Master Replication”.

http://dev.mysql.com/doc/refman/5.1/en/replication-auto-increment.html
talks about problems with auto-increment keys, which should not affect
us, since the databases replicated from each master would be disjoint.

I have not dug deeper yet, but I'm probably not the best person to
fiddle with that anyway.

I hope one of you can have a look at it - having no data from the Asian
wikis on the toolserver is a serious problem.

Regards,
Daniel

-- 
Homepage: http://brightbyte.de

Gregory Maxwell | 21 Mar 2006 00:37

Access to ipblocks table.

Could we get the ipblocks table made visible on the toolserver, minus the
ipb_address column?

This column needs to be omitted because autoblock IPs are stored in
it. Without this column the table contains no information which isn't
available to the general public, as far as I can tell.

Ideally we'd keep that column and use a view which nulls it for rows
where ipb_auto is 1. However, I understand that views in MySQL 5 are
still pretty limited and we lose indexes... For my applications I'd
rather lose the ability to see IP blocks entirely than lose indexes.
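
To make the masking rule concrete, here is a rough sketch in PHP that applies
it to rows that have already been fetched. The column names ipb_id,
ipb_address and ipb_auto come from the ipblocks table; the function name, the
array layout and the sample addresses are made up for illustration. A view
would express the same rule inside the database, for example by selecting
IF(ipb_auto = 1, NULL, ipb_address) in place of the raw column.

```php
<?php
// Hypothetical helper (not part of MediaWiki): hides the address of
// autoblock rows and leaves every other row untouched.
function mask_autoblock_address(array $row)
{
    // ipb_auto = 1 marks an autoblock, whose address must stay private.
    if (isset($row['ipb_auto']) && (int)$row['ipb_auto'] === 1) {
        $row['ipb_address'] = null;
    }
    return $row;
}

// Sample rows with made-up values, shaped like mysql_fetch_assoc() output.
$rows = array(
    array('ipb_id' => 1, 'ipb_address' => '192.0.2.10', 'ipb_auto' => 0),
    array('ipb_id' => 2, 'ipb_address' => '192.0.2.99', 'ipb_auto' => 1),
);
$masked = array_map('mask_autoblock_address', $rows);
```

Masking after the fetch keeps the table's indexes usable, but it only helps if
the unmasked table is never exposed directly, so it illustrates the rule
rather than replacing a server-side view.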

Thanks.

Leo Büttiker | 22 Mar 2006 22:03

Troubles with reading Articles

Hi all,
For a toolserver project I want to read all Wikipedia (pwiki_de) articles and
parse them for geoinformation. After some trouble I have now fixed nearly all
the bugs, but I still have trouble opening the articles.

I open the article with the help of the MediaWiki functions in the following
way:
 $title = Title::newFromID($page_id);
 $art = new Article($title);
 $text = $art->getContent(true);

For some articles this works quite well, but for others it doesn't return any
text. I think there's a problem with the compression of the database (in a
local environment with a Wikipedia dump it works), but I couldn't find a
workaround. Any suggestions?

Thanks
Leo

Rob Church | 22 Mar 2006 22:10

Re: Troubles with reading Articles

Some text is stored compressed in the databases, and some is on
external storage, a feature of MediaWiki which Wikimedia sites use;
this is not available at the present time.

Rob Church

On 22/03/06, Leo Büttiker <leo.buettiker <at> hsr.ch> wrote:
> Hi all,
> For a toolserver project I want to read all Wikipedia (pwiki_de) articles and
> parse them for geoinformation. After some trouble I have now fixed nearly all
> the bugs, but I still have trouble opening the articles.
>
> I open the article with the help of the MediaWiki functions in the following
> way:
>  $title = Title::newFromID($page_id);
>  $art = new Article($title);
>  $text = $art->getContent(true);
>
> For some articles this works quite well, but for others it doesn't return any
> text. I think there's a problem with the compression of the database (in a
> local environment with a Wikipedia dump it works), but I couldn't find a
> workaround. Any suggestions?
>
> Thanks
> Leo

Stefan F. Keller | 22 Mar 2006 22:20

Re: Troubles with reading Articles

Rob

> this is not available at the present time.

Does this mean that Wikimedia sites (like de) use compression but don't
offer decompression? As a user I've never seen compressed content, so this is
weird... we really need a workaround somehow.

-- Stefan K.

Rob Church | 22 Mar 2006 22:25

Re: Troubles with reading Articles

Well, on Wikimedia sites it'll be handled within the software, won't
it? Old versions which are compressed will be decompressed for viewing
and for calculating diffs. The same is true for external storage.

On the toolserver, we have the problem that it's a different setup; we
don't have access to the external storage grid (that I know of and at
the moment; Kate has ways and means). Compressed stuff should still be
okay to view, but I bet there's less of it.

Rob Church

Leo Büttiker | 22 Mar 2006 22:45

Re: Troubles with reading Articles

Thanks Rob
I think I have a problem with the compressed stuff; I can't open it. With a
previous version of MediaWiki I received a warning from PHP; now they have
turned this off, but it still doesn't work. I can't say exactly how many
articles are affected, but there are a lot of them. By the way, I only need
the current revision, no old stuff.

On Wednesday, 22 March 2006 22:25, Rob Church wrote:
> Well, on Wikimedia sites it'll be handled within the software, won't
> it? Old versions which are compressed will be decompressed for viewing
> and for calculating diffs. The same is true for external storage.
>
> On the toolserver, we have the problem that it's a different setup; we
> don't have access to the external storage grid (that I know of and at
> the moment; Kate has ways and means). Compressed stuff should still be
> okay to view, but I bet there's less of it.
>
>
> Rob Church

Gregory Maxwell | 22 Mar 2006 22:49

Re: Troubles with reading Articles

On 3/22/06, Leo Büttiker <leo.buettiker <at> hsr.ch> wrote:
> Thanks Rob
> I think I have a problem with the compressed stuff; I can't open it. With a
> previous version of MediaWiki I received a warning from PHP; now they have
> turned this off, but it still doesn't work. I can't say exactly how many
> articles are affected, but there are a lot of them. By the way, I only need
> the current revision, no old stuff.

I can't speak for dewiki, but the vast majority of the new edits in
enwiki are just going straight to external storage... making
toolserver not very useful.

What are the flags on the revision you are trying to read, and what is
the raw output of the text column? If it is just 'cluster' something or
other, then your problem is external storage.

FlaBot | 23 Mar 2006 00:44

Re: Troubles with reading Articles

External storage is not a good solution for the toolserver.

I use a bloody little hack to get all the content:

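// Identify this client to the web servers.
ini_set('user_agent', 'TOOLSERVER');
// Current replication lag in seconds.
$replag=file_get_contents("http://tools.wikimedia.de/~interiot/cgi-bin/replag?raw");

// $fields[1] is used as the language code and $fields[4] as the page title.
// Escape the title so that quotes in it cannot break the query.
$title=mysql_real_escape_string(utf8_encode($fields[4]),$db);
// Fetch the text row of the latest revision of a non-redirect article.
$result1=mysql_query("select t.old_text,t.old_flags
from ".$fields[1]."wiki_p.text as t,".$fields[1]."wiki_p.page as p,".$fields[1]."wiki_p.revision as r
where p.page_title='".$title."'
and p.page_namespace=0
and p.page_is_redirect=0
and p.page_latest=r.rev_id
and r.rev_text_id=t.old_id",$db);
if ($result1 == false) die("failed");
$fields1 = mysql_fetch_row($result1);

// Fall back to the live wiki when the text is in external storage, is stored
// as an object, has no flags, or the replica lags by 300 seconds or more.
if (preg_match('/external/',$fields1[1]) or preg_match('/object/',$fields1[1]) or strlen($fields1[1])==0 or $replag>=300){
$text1=file_get_contents("http://".$fields[1].".wikipedia.org/w/index.php?title=".urlencode($fields[4])."&action=raw");
}
else {
// Only rows flagged gzip are compressed; other rows are used as they are.
$text1=preg_match('/gzip/',$fields1[1]) ? gzinflate($fields1[0]) : $fields1[0];
}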


OK, that works, but it is not very fast and it puts load on the wiki servers.

But I can't find a better solution.

Greetings, Flacus/Flabot
