Tim Starling | 1 Nov 2006 04:31

Re: Doing backups and avoiding downtime

George Herbert wrote:
> This may be an argument for using PostgreSQL instead of MySQL for your site
> - the PostgreSQL "pg_dump" command doesn't cause a read or write lock on the
> database, so nobody using it gets blocked.  Things do slow down a bit, but
> you can keep right on reading or updating through the dump.
> 
> Even mysqlhotcopy has a moderate lock window.

mysqldump on InnoDB is the same, if you use the right options. No locking, 
just multi-versioned copy-on-write tables. That's how we used to do backups 
before we had slaves that we could stop.
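
The message does not name "the right options". A minimal sketch, assuming they are mysqldump's `--single-transaction` (dump from one consistent InnoDB snapshot instead of locking tables) plus `--quick` (stream rows instead of buffering them); the helper name, the database name `wikidb` and the output file are placeholders, not anything from this thread:

```python
import shlex

def build_dump_command(database, single_transaction=True):
    """Return the mysqldump argument list for a dump that avoids table locks."""
    cmd = ["mysqldump"]
    if single_transaction:
        # Read everything inside one consistent-read transaction.
        # This is only a consistent snapshot for transactional engines (InnoDB).
        cmd.append("--single-transaction")
    # Stream rows to stdout rather than buffering whole tables in memory.
    cmd.append("--quick")
    cmd.append(database)
    return cmd

if __name__ == "__main__":
    # "wikidb" is a hypothetical database name.
    print(shlex.join(build_dump_command("wikidb")) + " | gzip > wikidb.sql.gz")
```

MyISAM tables in the same database would not be covered by the snapshot, so this only matches the description above if the wiki tables really are InnoDB.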

-- Tim Starling
MHart | 1 Nov 2006 15:32

Re: Doing backups and avoiding downtime

> We've been using mysqldump to do daily full database backups in case
> our hardware on our DB server fails. This causes some problems because
> for a short period of 4 minutes or so, the site is inaccessible
> because mysqldump has the db locked.

I do a full mysqldump of 100+ wikis and 25+ blogs. It takes an average of
55 seconds to dump and gzip. It auto-runs at around midnight Eastern time. 
Never a problem. Saved to a RAID 1 device daily and all backups are copied 
to an offsite RAID 5 device weekly.

How big is your resulting dump, and how powerful is the machine doing the dump? 4 minutes
sounds excessive. Gzipped, my backups are 156 MB; unzipped, 1.2 GB. Around
the time of the backup, I have between 17 and 21 meg of bandwidth hitting
the site: around 2600 hits, just under 1000 page views. If I moved my
backup to 6am Eastern, when I have the least traffic, that would drop to about
2 meg of bandwidth usage.

- MHart
Travis Derouin | 1 Nov 2006 16:23

Re: Doing backups and avoiding downtime

You're right, looking at the logs, it is more like 1.5 minutes. Funny!
I still hear complaints about that small amount of downtime though.
The perceived downtime is likely longer because incoming connections get
backed up during the dump, and there's probably a bottleneck once the
dump has finished.

Our uncompressed dump is 1.2 GB for one wiki.
Emmanuel Engelhart | 1 Nov 2006 16:33

Re: Problem with mwdumper or the last frwiki dump

Thank you Felipe for your answer.

The solution was the following:
max_allowed_packet = 16M in my.cnf
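
A small sketch of how that setting could be checked before starting a long import. It assumes the option sits in the `[mysqld]` section (the message does not say which section was edited); `parse_size`, `max_allowed_packet` and `SAMPLE` are hypothetical names used only for illustration:

```python
import configparser

UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def parse_size(value):
    """Convert a my.cnf size such as '16M' to a number of bytes."""
    value = value.strip()
    suffix = value[-1].upper()
    if suffix in UNITS:
        return int(value[:-1]) * UNITS[suffix]
    return int(value)

def max_allowed_packet(cnf_text, section="mysqld"):
    """Return max_allowed_packet in bytes, or None if it is not set."""
    # my.cnf allows options without a value (e.g. "skip-networking"),
    # and "%" has no special meaning there, so interpolation is disabled.
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    parser.read_string(cnf_text)
    raw = parser.get(section, "max_allowed_packet", fallback=None)
    return None if raw is None else parse_size(raw)

# Assumed layout of the relevant my.cnf fragment.
SAMPLE = """
[mysqld]
max_allowed_packet = 16M
"""

if __name__ == "__main__":
    print(max_allowed_packet(SAMPLE))
```

Real my.cnf files can also contain `!include` directives, which this sketch does not handle.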

But I have another question:
Can mwdumper generate gzipped article text to insert into
the old revision table?
How can I do that?

Best regards

Emmanuel

2006/10/31, Felipe Ortega <glimmer_phoenix@...>:
> You shouldn't need to specify a -classpath argument for mysql-connector.
>
> mwdumper.jar works fine when you simply run it with JRE 1.5 and the appropriate .bz2 archive.
>
> Try exactly the command in the README file, or in:
>
> http://www.mediawiki.org/wiki/MWDumper
>
> All the best.
>
> Felipe.
>
> Emmanuel Engelhart <emmanuel@...> wrote: Hi
>
> I have downloaded the latest frwiki dump, in particular:

Felipe Ortega | 1 Nov 2006 18:09

Re: Problem with mwdumper or the last frwiki dump

As far as I know, it doesn't.

You can get XML dumps with mwdumper, apply some basic filter options (see the
available documentation), and produce gzip-compressed output.

But you cannot filter individual tables. The XML dump must already have been
filtered. I don't know how the XML dump process (as opposed to the recovery) is
implemented (perhaps Brion or someone else could help you... if they're not too
busy).

Regards,

Felipe.

Emmanuel Engelhart <emmanuel@...> wrote:
Thank you Felipe for your answer.

The solution was the following:
max_allowed_packet = 16M in my.cnf

But I have another question:
Can mwdumper generate gzipped article text to insert into
the old revision table?
How can I do that?

Best regards

Emmanuel

2006/10/31, Felipe Ortega :

Felipe Ortega | 1 Nov 2006 18:17

Odd selection of contents in Wikipedia dumps

A very simple question:

Why does the stub-meta-history-xml.gz version of the dumps NOT include page_len information in table 'page'?

On the other hand, the latest-pages-meta-history-xml.7z version does include that info, as do the
page.sql.gz archives....

It just makes things messier: if you want that info in a stub dump, you have to 'paste' it in manually
from page.sql.gz. Crazy....

Felipe.

 		
George Herbert | 1 Nov 2006 20:58

Re: Doing backups and avoiding downtime

On 11/1/06, Travis Derouin <travis@...> wrote:
>
> You're right, looking at the logs, it is more like 1.5 minutes. Funny!
> I still hear complaints about that small amount of downtime though.
> The perceived downtime is likely longer because incoming connections get
> backed up during the dump, and there's probably a bottleneck once the
> dump has finished.
>
> Our uncompressed dump is 1.2 GB for one wiki.
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l@...
> http://mail.wikipedia.org/mailman/listinfo/wikitech-l
>

That does sound sort of slow.

As Tim points out, the ultimate solution to this is a slave DB which you can
stop without affecting the primary which is serving to the live wiki.

Are you doing the dump-to-/tmp trick others noted early in the thread?  A
RAM disk is far better than real disk for dump speed, if you have the RAM
available... and 1.2 GB isn't all that much RAM these days.
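
A rough sketch of that idea: write the compressed dump to a fast scratch directory first, and only move it to slower persistent storage once the dump command has exited. Whether `/tmp` is actually RAM-backed depends on the system, and `dump_via_scratch` and its parameters are hypothetical names, not something from this thread:

```python
import gzip
import os
import shutil
import subprocess
import tempfile

def dump_via_scratch(cmd, scratch_dir, final_path):
    """Run cmd, gzip its stdout into scratch_dir, then move the file to final_path."""
    fd, scratch_path = tempfile.mkstemp(suffix=".sql.gz", dir=scratch_dir)
    os.close(fd)
    try:
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
        with gzip.open(scratch_path, "wb") as out:
            # Copy in chunks so the dump is never held in memory all at once.
            shutil.copyfileobj(proc.stdout, out)
        proc.stdout.close()
        if proc.wait() != 0:
            raise RuntimeError("dump command exited with code %d" % proc.returncode)
        # The slow write to persistent storage happens after the dump is done.
        shutil.move(scratch_path, final_path)
    finally:
        # Clean up the scratch file if anything failed before the move.
        if os.path.exists(scratch_path):
            os.remove(scratch_path)
    return final_path
```

In practice `cmd` would be the mysqldump invocation and `scratch_dir` a RAM-backed mount with enough free space for the compressed dump.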

-- 
-george william herbert
george.herbert@...
Brion Vibber | 1 Nov 2006 21:03

Re: Problem with mwdumper or the last frwiki dump

Emmanuel Engelhart wrote:
> But I have another question:
> Can mwdumper generate gzipped article text to insert into
> the old revision table?

Not at this time.

> How can I do that?

Write some Java code that does that? :)
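
A possible starting point for the compression step, sketched in Python rather than Java and not part of mwdumper. It assumes that MediaWiki's compressed revision text (the `gzip` flag in `old_flags`) is a raw DEFLATE stream as produced by PHP's `gzdeflate()`, not a file in gzip format; verify that against your MediaWiki version before relying on it. The function names are hypothetical:

```python
import zlib

def deflate_revision_text(text):
    """Compress text as a raw DEFLATE stream (no zlib or gzip header)."""
    # A negative wbits value tells zlib to omit the header and checksum.
    compressor = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
    return compressor.compress(text.encode("utf-8")) + compressor.flush()

def inflate_revision_text(data):
    """Reverse deflate_revision_text()."""
    return zlib.decompress(data, -zlib.MAX_WBITS).decode("utf-8")

if __name__ == "__main__":
    blob = deflate_revision_text("'''Exemple''' de texte d'article. " * 20)
    print(len(blob), inflate_revision_text(blob)[:20])
```

The flag value stored alongside each compressed row would also have to match what MediaWiki expects, which this sketch does not cover.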

-- brion vibber (brion  <at>  pobox.com)
Timwi | 1 Nov 2006 23:26

Re: Special page names case-insensitive and localisable

Tim Starling wrote:
> 
> I've just committed a change to make special page names case-insensitive and 
> localisable. The default name for a special page can be changed, but a 
> redirect from the English name will always be kept. At present, there are no 
> local sets of names committed, although one has been proposed for German.

So what would be the canonical name of the page that is currently 
[[Special:Recentchanges]]? You see, you said "the English name", but in 
my dictionary, "Recentchanges" is not an English word. It should be 
[[Special:Recent changes]]. The same applies to more than half of all 
the special pages. :)

Timwi
Tian-Jian "Barabbas" Jiang | 2 Nov 2006 13:59

Re: [Wikitech-l] Wikimania 2007 Hacking Days call-for-help

Dear Mr. Starling,

    I am very sorry about that; it seems my information was out of date. It is good to know that the Wikipedia sites already use Lucene, which makes further applications of cross-lingual information retrieval more feasible.

    Best Regards,
Mike

2006/10/31, Tim Starling <tstarling@...>:
Tian-Jian "Barabbas" Jiang <at> Gmail wrote:
>           o Site searching (Is there any plan to make Lucene available
>             for other languages besides English?)

It already is.

-- Tim Starling

_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@...
http://mail.wikipedia.org/mailman/listinfo/wiki-research-l
