K. Peachey | 1 Dec 01:48 2010

Re: Commons ZIP file upload for admins

On Tue, Nov 30, 2010 at 9:40 PM, Roan Kattouw <roan.kattouw <at> gmail.com> wrote:
> We don't necessarily want ZIP uploads at Wikimedia, but it's not
> unreasonable to want to upload OpenOffice documents. Since the OO
> formats are ZIP-like, blocking ZIPs blocks those too.
>
> Roan Kattouw (Catrope)
That said, if features like this get implemented in code, they would
probably be wanted beyond just WMF, so we shouldn't reduce discussion of
such features to a yes or no just because it's something the Foundation
may or may not want.
-Peachey
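
To illustrate the point that OpenOffice formats are ZIP-like: OpenDocument packages are ZIP archives that carry a `mimetype` entry naming the document type, so an upload check could in principle tell them apart from arbitrary ZIPs. The following is a minimal Python sketch of that idea, not MediaWiki's actual upload verification; the function name `classify_upload` and its return labels are made up for illustration, and it only recognises the `application/vnd.oasis.opendocument.` family.

```python
import io
import zipfile

# Prefix shared by OpenDocument MIME types (text, spreadsheet, presentation, ...).
ODF_MIME_PREFIX = "application/vnd.oasis.opendocument."


def classify_upload(data: bytes) -> str:
    """Return 'not-zip', 'generic-zip', or the declared OpenDocument MIME type.

    Hypothetical helper: it only inspects the package's first entry and does
    not validate anything else inside the archive.
    """
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return "not-zip"
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        names = archive.namelist()
        # OpenDocument packages store an entry called "mimetype" first.
        if names and names[0] == "mimetype":
            declared = archive.read("mimetype").decode("ascii", "replace").strip()
            if declared.startswith(ODF_MIME_PREFIX):
                return declared
    return "generic-zip"
```

This is a format sniff, not a safety check: an archive that declares an OpenDocument type can still contain arbitrary other files, which is part of why blocking or allowing ZIP-based formats is a policy question rather than a purely technical one.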
Skip Garner | 1 Dec 17:35 2010

Re: [Ticket#2010112810016598] I would like to provide a different search engine for Wikimedia

Dear Wikiteam,
  Guy Chapman requested that I post to the mailing list to ask how we can proceed with getting a copy of Wikipedia
so that we can offer it as a database in our free search service, in response to the request in the following
paragraph.  He made me aware of its size, but that is not an issue.  I would like to obtain a copy and then
establish a routine for automated synced downloads like we do for the other databases we have in our system.
  I have had several requests to add Wikipedia to our eTBLAST text similarity search engine.  This is to
improve reference finding as well as novelty assessment.  Our search tool is widely used, widely
published and is free.  Please see etblast.org or http://en.wikipedia.org/wiki/ETBLAST.  I would like
to create a searchable copy of Wikipedia locally with links back to Wikipedia for hits, and of course
acknowledge Wikimedia.  We do this for several open text datasets and are prepared to keep a local, synced
copy of Wikipedia, if you are interested.  I am certain that our mutual users would like and benefit from our
working together.

Cheers, and thank you,
Skip

----- Original Message -----
From: "Wikipedia information team" <info-en <at> wikimedia.org>
To: "Skip Garner" <garner <at> vbi.vt.edu>
Cc: "Dominik L. Borkowski" <dom <at> vbi.vt.edu>, "Johnny Sun" <szhaohui <at> vbi.vt.edu>
Sent: Wednesday, December 1, 2010 9:43:25 AM
Subject: Re: [Ticket#2010112810016598] I would like to provide a different search engine for Wikimedia

Dear Skip Garner,

Thank you for your email.  Our response follows your message. 

11/29/2010 16:23 - Skip Garner wrote:

> Guy,

Diederik van Liere | 1 Dec 18:34 2010

Re: [Ticket#2010112810016598] I would like to provide a different search engine for Wikimedia

Dear Skip,

You can always use the different dump files to host a local version of
Wikipedia. These dump files are available at
download.wikimedia.org. However, at this moment there are some
hardware issues and the site is currently not available. Given the
task, I think that the
[language-code][wikiproject]-pages-meta-current.xml.bz2 are the most
interesting files.
You can find a complete dump of August 2009 as part of Amazon's AWS
public datasets at http://aws.amazon.com/publicdatasets/.

I have posted a step-by-step tutorial on the Wiki research mailing list
explaining how to get access to those files.
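
As a rough illustration of working with a pages-meta-current dump once it has been downloaded, the Python sketch below streams the compressed XML and yields one (title, text) pair per page without loading the whole file into memory. It assumes the usual export layout of `<page>` elements containing a `<title>` and a revision `<text>`; the function names are illustrative and not part of any Wikimedia tooling.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace such as '{http://...}page' down to 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(dump_path):
    """Yield (title, text) pairs from a bz2-compressed XML dump.

    Assumes one current revision per page, as in a pages-meta-current file.
    """
    with bz2.open(dump_path, "rb") as stream:
        title, text = None, ""
        for _event, element in ET.iterparse(stream, events=("end",)):
            name = local_name(element.tag)
            if name == "title":
                title = element.text
            elif name == "text":
                text = element.text or ""
            elif name == "page":
                yield title, text
                title, text = None, ""
                element.clear()  # release the finished page's subtree
```

A caller would loop over `iter_pages(path_to_dump)` and feed each page to its own indexer, keeping the title so that search hits can link back to Wikipedia.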

Best,

Diederik

On Wed, Dec 1, 2010 at 11:35 AM, Skip Garner <garner <at> vbi.vt.edu> wrote:
> Dear Wikiteam,
>  Guy Chapman requested that I post to the mailing list to ask how we can proceed with getting a copy of
> Wikipedia so that we can offer it as a database in our free search service, in response to the request in the
> following paragraph.  He made me aware of its size, but that is not an issue.  I would like to obtain a copy
> and then establish a routine for automated synced downloads like we do for the other databases we have in
> our system.
>  I have had several requests to add Wikipedia to our eTBLAST text similarity search engine.  This is to
> improve reference finding as well as novelty assessment.  Our search tool is widely used, widely
> published and is free.  Please see etblast.org or http://en.wikipedia.org/wiki/ETBLAST.  I would
> like to create a searchable copy of Wikipedia locally with links back to Wikipedia for hits, and of course

Nicolas Vervelle | 1 Dec 23:11 2010

Re: Creating a Media handler extension for molecular files ?

Hi Brion and others,

On Tue, Nov 23, 2010 at 12:46 AM, Nicolas Vervelle <nvervelle <at> gmail.com> wrote:

> On Mon, Nov 22, 2010 at 11:57 PM, Brion Vibber <brion <at> pobox.com> wrote:
>
>> On Mon, Nov 22, 2010 at 1:03 PM, Nicolas Vervelle <nvervelle <at> gmail.com> wrote:
>>
>> > Molecular files exist in several formats: pdb, cif, mol, xyz, cml, ...
>> > Usually they are detected as simple MIME types (either text/plain or
>> > application/xml) by MediaWiki and not as more precise types (even if
>> > these types exist: chemical/x-pdb, chemical/x-xyz, ...).
>> > It seems that to register a Media handler, I have to add an entry to
>> > $wgMediaHandlers[]: $wgMediaHandlers['text/plain'] = 'MolecularHandler';
>> > Will it be a problem to use such a general MIME type to register the
>> > handler? Especially for files of the same MIME type that are not
>> > molecular files?
>> >
>>
>> You'd want to make sure the type detection correctly identifies your files
>> so you can associate the handler types, or it's going to make things
>> confusing.
>>
>> For XML files, you should usually be able to add to the $wgXMLMimeTypes
>> array, which by default recognizes the root elements for HTML, SVG, and
>> Dia vector drawings -- see the entries in DefaultSettings.php as
>> examples. It
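
The root-element idea described in the quoted reply can be sketched outside MediaWiki. The Python below is only an illustration of the technique (look at the XML root element, then map it to a more precise MIME type); it is not MediaWiki's detection code. The table name, the "namespace:element" key format, and the CML entry are assumptions made for the example.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical lookup table: "namespace:root-element" -> MIME type.
ROOT_ELEMENT_MIME_TYPES = {
    "http://www.w3.org/2000/svg:svg": "image/svg+xml",
    # Assumed entry for Chemical Markup Language files.
    "http://www.xml-cml.org/schema:cml": "chemical/x-cml",
}


def mime_type_from_xml_root(data: bytes, default: str = "application/xml") -> str:
    """Guess a MIME type from the root element of an XML document."""
    try:
        # The first "start" event is always the root element.
        for _event, element in ET.iterparse(io.BytesIO(data), events=("start",)):
            tag = element.tag
            break
        else:
            return default
    except ET.ParseError:
        return default
    if tag.startswith("{"):
        # ElementTree writes namespaced tags as "{namespace}local".
        namespace, _, local = tag[1:].partition("}")
        key = f"{namespace}:{local}"
    else:
        key = tag
    return ROOT_ELEMENT_MIME_TYPES.get(key, default)
```

With detection this specific, a handler could be registered for `chemical/x-cml` rather than for all of `application/xml`, which avoids catching unrelated XML files. Plain-text formats such as pdb or xyz have no root element, so they would need a different kind of check.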

jidanni | 2 Dec 04:32 2010

Re: No more syntax errors!

Glad you are finally checking.
I still will check that I can at least see my Main Page after updating though.
Neil Kandalgaonkar | 2 Dec 15:46 2010

Making usability part of the development process

Hi there -- I don't post much here, but I was the programmer on the 
Multimedia Usability Project, which primarily focused on making uploads 
easier. The outside funding for that project just ended, so I think it's 
a good time to talk about what (if anything) we will do in the future 
along these lines.

Going forward, we ought not to think about usability as the 
responsibility of a few people in San Francisco. I have been asking myself 
how we could end the need for usability projects, and instead make that 
part of everyone's practices.

What makes you a usability engineer? My personal belief is that it isn't 
(primarily) a matter of having special knowledge.

You become a usability software engineer when you see five average users 
utterly fail to accomplish the task you wanted them to be able to 
accomplish.

Programming is a hubristic enterprise, but for UI, these negative 
feelings are essential: watching ordinary users get angry and frustrated 
dealing with what you've created, even feeling a certain shame and 
embarrassment that you got it so wrong. Only then do you see how large 
the conceptual gap is between you and the average user -- but you also 
usually come out of the experience with an immediate understanding of 
how to fix things.

So is there a way to have *everybody* who develops software for end 
users in our community have that experience? Maybe.

At the WMF, for these Usability Projects, we had to do formal studies 

Dmitriy Sintsov | 2 Dec 18:23 2010

null revisions

Hi!
From looking at the DB schema, I cannot find an efficient way of getting the 
list of null revisions, or the opposite (a list without null revisions), with 
LIMIT paging (for a custom API). When I GROUP, then ORDER and LIMIT, it is 
extremely slow.
It seems that I have to use a very inefficient GROUP BY rev_text_id (MySQL 
also does not offer FIRST / LAST aggregate functions), and there 
is no index on rev_text_id by default :-( I wish there was a field like 
rev_minor_edit but for detecting null revisions, such as those 
generated by XML import / export. They confuse the logic of my wiki 
synchronization script. However, even if I were able to persuade people to 
include these features in the schema, 1.15, which customers use, was 
already released some time ago anyway :-( So probably the core patch is 
the only efficient way to solve my problem?
Dmitriy
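
One possible alternative to GROUP BY rev_text_id, sketched here in Python against an in-memory SQLite table rather than a real MediaWiki database: if each revision row records its parent in a populated rev_parent_id column (an assumption the thread does not confirm for 1.15), a null revision can be treated as one that reuses its parent's rev_text_id, and paging can be done on rev_id instead of an offset. The demo table is a simplified stand-in for the real schema.

```python
import sqlite3


def find_null_revisions(conn, after_rev_id=0, limit=50):
    """Return up to `limit` (rev_id, rev_page, rev_text_id) rows whose text
    is identical to their parent's, starting after `after_rev_id`."""
    sql = """
        SELECT child.rev_id, child.rev_page, child.rev_text_id
        FROM revision AS child
        JOIN revision AS parent ON parent.rev_id = child.rev_parent_id
        WHERE child.rev_text_id = parent.rev_text_id
          AND child.rev_id > ?
        ORDER BY child.rev_id
        LIMIT ?
    """
    return conn.execute(sql, (after_rev_id, limit)).fetchall()


def build_demo_db():
    """Create a tiny, simplified revision table for experimentation."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE revision (
               rev_id INTEGER PRIMARY KEY,
               rev_page INTEGER NOT NULL,
               rev_text_id INTEGER NOT NULL,
               rev_parent_id INTEGER NOT NULL DEFAULT 0)"""
    )
    rows = [
        (1, 10, 100, 0),
        (2, 10, 101, 1),
        (3, 10, 101, 2),  # same text as its parent: a null revision
        (4, 11, 200, 0),
        (5, 11, 200, 4),  # same text as its parent: a null revision
        (6, 11, 201, 5),
    ]
    conn.executemany("INSERT INTO revision VALUES (?, ?, ?, ?)", rows)
    return conn
```

The next page is requested by passing the last rev_id seen as `after_rev_id`, so the database never has to skip over earlier rows. Whether this is fast on a large wiki depends on the indexes available, and imported revisions may not always have a usable parent id.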
Trevor Parscal | 2 Dec 19:35 2010

Re: Making usability part of the development process

+1

Cheap hallway testing is so incredibly useful that I dedicated my time 
in Berlin last year to giving a crash course in it. I am not sure it was 
effective in inspiring or educating people on how to do this, but 
everyone is welcome to revisit the slides here:

http://wikitech.wikimedia.org/index.php?title=File:Trevor_Parscal_-_Wikimedia_Developers_Workshop_-_Berlin_2010.pdf

Yesterday we had our first "on our own" series of user tests, conducted 
by Parul Vora. While she is trained in the kung-fu of user testing, she 
personally helped me put this set of slides together. I also pulled from 
my own experiences being involved in this kind of testing earlier in my 
career.

My general pitch is, we should all be in the habit of doing whatever it 
takes to view users as they interact with our creations. I often use my 
wife, and now sometimes my 3-year-old daughter, to help me. Usually just 
showing someone a picture of a screen and asking "how would you do X?" 
is amazingly revealing. Higher-fidelity testing is great, but it's 
designed to squeeze the last bit of juice out of the lemon. In my 
experience the majority of it comes out quite easily in even the most 
casual of circumstances.

My secondary pitch, which is not captured in these slides but was 
verbalized in Berlin, was my view that we should user-test APIs with 
developers. This would especially be useful for our public HTTP API, but 
even PHP and JavaScript APIs could benefit from this. This differs from 
posting to the list and saying "does anyone have any better ideas". 
Instead we would design APIs around use-cases, and then observe users in 

Bryan Tong Minh | 2 Dec 19:38 2010

Re: null revisions

On Thu, Dec 2, 2010 at 6:23 PM, Dmitriy Sintsov <questpc <at> rambler.ru> wrote:
> So probably the core patch is
> the only efficient way to solve my problem?
>
You can always supply a database patch with your extension to add
indices you need to core tables.

Bryan
Dmitriy Sintsov | 2 Dec 19:43 2010

Re: null revisions

* Bryan Tong Minh <bryan.tongminh <at> gmail.com> [Thu, 2 Dec 2010 19:38:47 
+0100]:
> On Thu, Dec 2, 2010 at 6:23 PM, Dmitriy Sintsov <questpc <at> rambler.ru>
> wrote:
> > So probably the core patch is
> > the only efficient way to solve my problem?
> >
> You can always supply a database patch with your extension to add
> indices you need to core tables.
>
Indices are not hard to add, that's true. However, even with indexes the 
GROUP BY rev_text_id query on a large revision set is slow. I will probably 
have to patch Revision::newNullRevision to set a new field value 
there (for existing rows the new field can be filled with an 
UPDATE, but new null revisions will keep being created).
Dmitriy
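
The "new field filled with UPDATE" idea could look roughly like the Python sketch below, run against SQLite for illustration only. The column name rev_is_null and the index name are hypothetical, the backfill assumes a populated rev_parent_id column, and MySQL may need the UPDATE rewritten as a multi-table join because it restricts subqueries on the table being updated.

```python
import sqlite3


def add_and_backfill_null_flag(conn):
    """Add a hypothetical rev_is_null column and mark existing null revisions.

    Returns the number of rows flagged by the backfill.
    """
    conn.execute(
        "ALTER TABLE revision ADD COLUMN rev_is_null INTEGER NOT NULL DEFAULT 0"
    )
    cursor = conn.execute(
        """
        UPDATE revision
        SET rev_is_null = 1
        WHERE EXISTS (
            SELECT 1 FROM revision AS parent
            WHERE parent.rev_id = revision.rev_parent_id
              AND parent.rev_text_id = revision.rev_text_id
        )
        """
    )
    # Index chosen so that filtering on the flag and paging by rev_id is cheap.
    conn.execute("CREATE INDEX rev_is_null_id ON revision (rev_is_null, rev_id)")
    conn.commit()
    return cursor.rowcount


def page_of_null_revisions(conn, after_rev_id=0, limit=50):
    """Return rev_ids of flagged revisions, paged by the last rev_id seen."""
    rows = conn.execute(
        "SELECT rev_id FROM revision WHERE rev_is_null = 1 AND rev_id > ? "
        "ORDER BY rev_id LIMIT ?",
        (after_rev_id, limit),
    )
    return [row[0] for row in rows]
```

The backfill only covers rows that exist when it runs; code that creates null revisions afterwards (the role Revision::newNullRevision plays in the message above) would still have to set the flag itself.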
