Re: upcoming 1.17 deployment and the xml dumps
Ariel T. Glenn <ariel@...
2011-02-09 22:14:28 GMT
We halted them because we can have bad data creep in during times when
the codebase is badly broken. I don't want to have to walk through and
determine later which 30 or 50 wiki dumps those are and toss them, so I
have them on hold until things are sorted out or until we have a date for
deployment that is a number of days off.
A dump with errors isn't better than no dump, because bad data can be
carried forward into subsequent dumps, even with the revision length
check in the code.
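As a rough illustration of why a length check alone can let bad text
through, here is a minimal sketch. It is not the actual dump code: the
function names and sample strings are made up for the example, and it
assumes lengths are counted in UTF-8 bytes.

```python
import hashlib


def length_matches(prefetched_text: str, expected_len: int) -> bool:
    """Hypothetical length check: compare the byte length of text taken
    from a previous dump against the length recorded for the revision."""
    return len(prefetched_text.encode("utf-8")) == expected_len


def md5_matches(prefetched_text: str, authoritative_text: str) -> bool:
    """Hypothetical content check: compare MD5 digests of the prefetched
    text and the authoritative text."""
    prefetched = hashlib.md5(prefetched_text.encode("utf-8")).hexdigest()
    authoritative = hashlib.md5(authoritative_text.encode("utf-8")).hexdigest()
    return prefetched == authoritative


# Made-up sample data: same length, different content.
good_text = "The quick brown fox"
bad_text = "The quick brown fix"
recorded_len = len(good_text.encode("utf-8"))

print(length_matches(bad_text, recorded_len))  # corrupted text still passes
print(md5_matches(bad_text, good_text))        # the hash comparison catches it
```

Any corruption that preserves the byte count passes the first check, so
once it is in one dump it can be copied into the next.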
The only certain check involves doing an md5sum of the revision text,
something that can only be accomplished right now by retrieving the text
from the database, thus making prefetch from the previous dump file a
After a brief meeting just now about deployment, it appears we are going
to make another stab at testing tomorrow at this time. (Check
http://techblog.wikimedia.org/ in a couple of hours for the details.)
After that we should have several days of a break; if that pans out,
I'll happily crank dumps back up for that interval.
On Wed, 09-02-2011, at 13:44 -0800, Jamie Morken wrote:
> Hi Ariel,
> I don't really understand why the dumps need to be halted as I thought
> the mediawiki code and database dump code were basically two separate
> entities already*. I guess the 1.17 branch code changes the structure
> of the database causing potential errors in the database dump? I also
> don't understand the "precautionary" logic of halting the dumps, as a
> dump with errors is better than no dump in the case where there is a
> limited supply of recent dumps due to the RAID server failure as well.
> If it's only a couple-day halt as you mentioned, that's probably
> irrelevant, but it sounds like it may be a longer period of limited
> testing from your last wikitech email, which makes me wonder if it is
> even worth halting the dumps in the first place. Also, wouldn't
> potential dump errors be detected better if the dumps continue to be
> produced and are checked for errors, rather than halted?
> ----- Original Message -----
> From: "Ariel T. Glenn" <ariel@...>
> Date: Saturday, February 5, 2011 10:56 pm
> Subject: [Xmldatadumps-l] upcoming 1.17 deployment and the xml dumps
> To: xmldatadumps-l@..., wikitech-l@...
> > A little bit before the scheduled deployment of the 1.17 branch on our
> > production servers, I will be halting production of XML dumps.
> > Deployment is set for Tuesday Feb 8 at 07:00 UTC, so a few hours
> > before that I'll start shutting down processes.
> > This is a precautionary measure; after the deployment and any hasty
> > fixes that may be needed, I will be doing some testing to ensure
> > dumps are not impacted, before we restart them. Barring some bizarre
> > problem, we should be back up and running within a day or two.
> > Ariel