Gergo Tisza | 18 Apr 00:39 2014

MediaViewer URL format

Hi all,

we have just deployed a new URL format for MediaViewer [0]; I am submitting it here for comments and for the benefit of people who have to do something similar in other contexts.

MediaViewer stores the name of the image in the hash part of the URL so one can share links to a page with a specific image open in the lightbox. (We considered using the History API [1] to change the path or the query part, but that degrades poorly.) I have looked at three options:
  1. Just put the file name as-is (with spaces replaced by underscores) in the URL fragment part.
    Pro: readable file names in URLs, easy to generate.
    Con: technically not a valid URI. [2] (It would be a valid IRI, probably, but browser support for that is not so great, so non-ASCII bytes might get encoded in unexpected ways.) Creates nasty usability and security issues (injection vulnerabilities, RTL characters, characters which break autolinking). Would make it very hard to introduce more complex URL formats later, as file names can contain pretty much any character.
  2. Use percent encoding (with underscores for spaces).
    Pro: this is the standard way of encoding fragments. [2][3] Always results in a valid URI. Readable file names in Firefox. Easy to generate on-wiki (e.g. with {{urlencode}})
    Con: Non-Latin filenames look horrible in any browser that's not Firefox.
  3. Use MediaWiki anchor encoding (like percent encoding, but use a dot instead of a percent sign).
    This would have the advantage that links can be generated in wikitext very conveniently, using the [[#...]] syntax. Unfortunately the way MediaWiki does anchor encoding is intrinsically broken (the dot itself does not get encoded, but it does get decoded when followed by suitable characters, so file names cannot be round-tripped safely), so this is not an option.
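
To illustrate the round-trip problem with option 3, here is a simplified Python model of dot-based anchor encoding. It is a sketch under assumptions, not MediaWiki's actual implementation; dot_encode and dot_decode are hypothetical names, and the file names are made up.

```python
import re
from urllib.parse import quote, unquote


def dot_encode(name):
    """Percent-encode, then use '.' instead of '%'. A literal '.' is left as-is."""
    return quote(name, safe="").replace("%", ".")


def dot_decode(anchor):
    """Treat every '.' followed by two hex digits as an encoded byte."""
    as_percent = re.sub(r"\.([0-9A-Fa-f]{2})", r"%\1", anchor)
    return unquote(as_percent)


# Two different (hypothetical) file names end up with the same anchor,
# so decoding cannot tell them apart.
literal_dots = "Caf.C3.A9.jpg"
accented = "Café.jpg"
same_anchor = dot_encode(literal_dots) == dot_encode(accented)
round_trip = dot_decode(dot_encode(literal_dots))
```

In this model the name with literal dots decodes to the accented name, which is the kind of ambiguity that rules the scheme out.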
We went with option 2, so URLs look like this:

One issue that we ran into is that window.location.hash behaves weirdly with percent-encoded hashes in Firefox [4], but that's easy to avoid once you know about it. Other than that, it seems to work reliably.
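
For readers doing something similar elsewhere, option 2 can be sketched in a few lines of Python. This is only an illustration of percent-encoding a file name for use in a fragment, not the MediaViewer code; the function names and the sample file name are hypothetical, and any prefix the real URLs carry is left out.

```python
from urllib.parse import quote, unquote


def filename_to_fragment(file_name):
    """Replace spaces with underscores, then percent-encode as UTF-8.

    safe="" means reserved characters such as ':' and '/' are encoded too,
    so the result contains only unreserved characters and %XX escapes.
    """
    return quote(file_name.replace(" ", "_"), safe="")


def fragment_to_filename(fragment):
    """Reverse the percent-encoding; underscores are kept as underscores."""
    return unquote(fragment)


fragment = filename_to_fragment("Fájl:Példa kép.jpg")
restored = fragment_to_filename(fragment)
```

Because a literal percent sign is itself encoded (as %25), decoding is unambiguous, unlike the dot-based scheme.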

Multimedia mailing list
Multimedia <at>
Fabrice Florin | 17 Apr 23:00 2014

Media Viewer Launches on First Pilot Sites

Hi folks,

I’m happy to let you know that Media Viewer has just deployed on our first pilot sites today!

1. First Pilots
We just enabled Media Viewer by default on the Catalan, Hungarian and Korean Wikipedias, as well as
English Wikivoyage. Next Thursday, we plan to deploy to more pilot sites: Czech, Estonian, Finnish,
Hebrew, Polish, Romanian, Thai, Slovak, and Vietnamese. Try it out for yourself on the Hungarian Wikipedia:

2. First Metrics
We jumped from 100 image views per day to 1k/day, about a 10x increase. And on Commons it was
much higher, due to the ‘View Expanded’ button: from 240 image views per day to 24k/day yesterday,
a 100x increase! You can track the adoption of this tool on these first metrics dashboards.

3. Share your feedback
Please let us know what you think of Media Viewer — and join other beta users from around the world on this
discussion page:

If you’re short on time, please take this quick survey to let us know how Media Viewer works for you:

Many thanks to all the team and community members who made this launch possible!


Fabrice — for the Multimedia Team

P.S.: If you haven’t tried Media Viewer yet, follow the test tips on this demo page on


Fabrice Florin
Product Manager, Multimedia
Wikimedia Foundation

Mark Holmquist | 17 Apr 20:47 2014

Pilots, bugfixes, and more - oh my! (your weekly Multimedia update)

Good morning, Multimedia-l! This is your (mostly) weekly update on the
doings of the WMF Multimedia team.

== Pilot program ==

We've begun busily breaking^Wembettering several WMF sites with a default-
on MultimediaViewer extension. Last Thursday, we turned it on for the
MediaWiki wiki, where developers have their documentation and discussions,
and this week we'll be enabling it for several small Wikipedias and the
English Wikivoyage. See our release plan [0] for more.

== Cards from last week ==

Last week, we fixed a lot of bugs. [1-8]

We also worked on a few things related to the user survey [9] that we've
now launched on all pilot sites, and finished up a small feature from the
week before [10].

This is to be expected - releases are meant to be bug-focused and not
too heavy on feature additions. You'll also see some attacks on tech debt
and some scope increases during our next few weeks, as we deal with the
last few issues on the project before we shift to others.

== Cards for this week ==

This week is a little lighter, but we expect there to be even more bugs
and scope increases coming in. We're going to try fixing some issues with
a subset of images [11], as well as improve metrics coverage [12-13], link
directly to CC license deeds [14], and deal with some other small issues
as we go.

== Feedbacks ==

If you have feedback for the Multimedia team, you should poke us, either
on this mailing list or on our IRC channel, #wikimedia-multimedia. We'll
be looking for thoughts on our near-future plans in the coming weeks in
particular, so keep an eye out and don't be afraid to speak up :)




Mark Holmquist
Software Engineer, Multimedia
Wikimedia Foundation
Gilles Dubuc | 17 Apr 17:53 2014

Re: [Ops] Caching API responses

Including the multimedia list, since the discussion is now broader. Gergo, Mark, I encourage you to read the backlog:

On Thu, Apr 17, 2014 at 4:31 PM, Brad Jorsch (Anomie) wrote:
On Thu, Apr 17, 2014 at 4:13 AM, Gilles Dubuc wrote:
When the user opens media viewer, but there are 4 API calls per image

When I tried it just now, I saw 6 queries: one to prop=imageinfo to fetch a number of different props, one to meta=filerepoinfo, one to list=imageusage, one to prop=globalusage, and two more to prop=imageinfo to fetch the URLs for two different sizes of the image.

Also, getting really off-topic here: the "guprop[]=url&guprop[]=namespace" and "&iunamespace[]=0&iunamespace[]=100" parameters that I see in your original queries don't actually work; they give the same results as if guprop and iunamespace were omitted entirely. The API should give a warning about that (filed as bug 64057).
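
For comparison, the API's usual convention for multi-value parameters is a single pipe-separated value rather than PHP-style "[]" arrays. A minimal Python sketch of building such query strings follows; the file title is a made-up example, the helper name multi is hypothetical, and the other parameters of the original requests are omitted.

```python
from urllib.parse import urlencode


def multi(*values):
    """Join several values into one pipe-separated API parameter value."""
    return "|".join(str(v) for v in values)


# prop=globalusage with two guprop values
globalusage_query = urlencode({
    "action": "query",
    "prop": "globalusage",
    "titles": "File:Example.jpg",  # hypothetical title
    "guprop": multi("url", "namespace"),
})

# list=imageusage restricted to namespaces 0 and 100
imageusage_query = urlencode({
    "action": "query",
    "list": "imageusage",
    "iutitle": "File:Example.jpg",  # hypothetical title
    "iunamespace": multi(0, 100),
})
```

The pipe is percent-encoded as %7C in the resulting query string, which is expected.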

Brad Jorsch (Anomie)
Software Engineer
Wikimedia Foundation

Ops mailing list

Jean-Frédéric | 16 Apr 23:54 2014

Subtitle support


I understand this is not among the multimedia team's priorities, but I wanted to put subtitle support on Wikimedia projects on the radar.

I just created [[bug:64031]] to keep track of this:

The two main features really impairing the subtitle workflow right now (in my opinion at least) are:

* Integration with Translate extension.
The bug discussion seems to indicate that not much needs to be done, but the path forward is unclear.

* Amara / UniversalSubtitles integration
Amara / Universal Subtitles is an awesome tool to easily add subtitles to videos.
As part of the 2011 (?) multimedia Beta, we used to have it integrated on Wikimedia Commons, but not anymore (not sure when that was discontinued). Where did it go? What would it take to have it back?

I hope this is in scope of this list :)


Gilles Dubuc | 16 Apr 13:22 2014

Filtering out outliers in data used to generate tsvs

Including the analytics team in case they have a magical solution to our problem.

Currently, our graphs display the mean and standard deviation of metrics, as provided in the "mean" and "std" columns of our tsvs, which are generated from EventLogging data. However, we already see that extreme outliers can make the standard deviation and mean skyrocket and, as a result, make the graphs useless for some metrics. See France, for example, where a single massive value was able to skew the map into making the country look problematic. There's no performance issue with France, but the graph suggests there is because of that one outlier.

Ideally, instead of using the mean for our graphs, we would be using what is called the "trimmed mean", i.e. the mean of all values excluding the upper and lower X percentiles. Unfortunately, MariaDB doesn't provide that as a function, and calculating it with SQL can be surprisingly complicated, especially since we often have to group values for a given column. The best alternative I could come up with so far for our geographical queries was to exclude values that differ from the mean by more than X times the standard deviation. That flattens the mean somewhat, but it's not ideal, because I think that in the context of our graphs it makes things look like they perform better than they really do.
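
To make the idea concrete, here is a minimal Python sketch of a per-group trimmed mean. The function names, the 10% trim fraction and the sample numbers are all made up for illustration; they are not taken from our data.

```python
import statistics
from collections import defaultdict


def trimmed_mean(values, trim_fraction=0.05):
    """Mean after dropping the lowest and highest trim_fraction of values.

    trim_fraction must be below 0.5, otherwise nothing would be left.
    """
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # values cut from each end
    return statistics.mean(ordered[k:len(ordered) - k])


def trimmed_mean_by_group(rows, trim_fraction=0.05):
    """rows is an iterable of (group_key, value) pairs, e.g. (country, ms)."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    return {key: trimmed_mean(vals, trim_fraction) for key, vals in groups.items()}


# Hypothetical timings with one massive outlier in the "FR" group.
sample = [("FR", v) for v in (100, 110, 120, 130, 140, 150, 160, 170, 180, 90000)]
plain = statistics.mean(v for _, v in sample)              # 9126
trimmed = trimmed_mean_by_group(sample, 0.1)["FR"]         # 145
```

With one value cut from each end, the single outlier no longer dominates the result.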

I think the main issue at the moment is that we're using a shell script to pipe a SQL request directly from db1047 to a tsv file. That limits us to one giant SQL query, and since we don't have the ability to create temporary tables on the log database with the research_prod user, we can't preprocess the data in multiple queries to filter out the upper and lower percentiles. The trimmed mean would be kind of feasible as a single complicated query if it weren't for the GROUP BY.

So, my latest idea for a solution is to write a python script that will import the section (last X days) of data from the EventLogging tables that we're interested in into a temporary sqlite database, then proceed with removing the upper and lower percentiles of the data, according to any column grouping that might be necessary. And finally, once the data preprocessing is done in sqlite, run similar queries as before to export the mean, standard deviation, etc. for given metrics to tsvs. I think using sqlite is cleaner than doing the preprocessing on db1047 anyway.
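
A rough Python sketch of what that pipeline could look like is below. It is only an outline under assumptions: the table and column names, the function name export_trimmed_stats and the sample rows are invented, and the import from EventLogging is replaced by an in-memory list. SQLite has no built-in standard deviation function, so the sketch derives it from the averages of x and x squared.

```python
import csv
import io
import sqlite3


def export_trimmed_stats(rows, out, trim_fraction=0.05):
    """rows: iterable of (country, duration); out: writable text stream."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE timing (country TEXT, duration REAL)")
    db.executemany("INSERT INTO timing VALUES (?, ?)", rows)

    # Step 1: per group, delete the lowest and highest trim_fraction of rows.
    counts = db.execute(
        "SELECT country, COUNT(*) FROM timing GROUP BY country"
    ).fetchall()
    for country, n in counts:
        k = int(n * trim_fraction)
        if k == 0:
            continue
        for order in ("ASC", "DESC"):
            db.execute(
                "DELETE FROM timing WHERE rowid IN ("
                " SELECT rowid FROM timing WHERE country = ?"
                f" ORDER BY duration {order} LIMIT ?)",
                (country, k),
            )

    # Step 2: a simple aggregate query over the cleaned data, written as TSV.
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["country", "mean", "std", "n"])
    query = (
        "SELECT country, AVG(duration), AVG(duration * duration), COUNT(*)"
        " FROM timing GROUP BY country ORDER BY country"
    )
    for country, mean, mean_sq, n in db.execute(query):
        # Population standard deviation: sqrt(E[x^2] - E[x]^2)
        std = max(mean_sq - mean * mean, 0.0) ** 0.5
        writer.writerow([country, round(mean, 2), round(std, 2), n])
    db.close()


# Hypothetical data: one outlier in "FR".
sample = [("FR", v) for v in (100, 110, 120, 130, 140, 150, 160, 170, 180, 90000)]
buffer = io.StringIO()
export_trimmed_stats(sample, buffer, trim_fraction=0.1)
tsv_text = buffer.getvalue()
```

Splitting the work into a trimming step and an aggregation step is what keeps each query short; the same structure would apply with a file-backed SQLite database instead of ":memory:".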

It's quite an undertaking: it basically means rewriting all our current SQL => TSV conversion. The ability to use more steps in the conversion means that we'd be able to have simpler, more readable SQL queries. It would also be a good opportunity to clean up the giant performance query with a bazillion JOINs, which can actually be divided into several data sources all used in the same graph.

Does that sound like a good idea, or is there a simpler solution out there that someone can think of?

Fabrice Florin | 16 Apr 09:19 2014

Your feedback on Media Viewer

Hi folks,

We’d love to hear what you think of Media Viewer, our new multimedia browser, as we get ready to release it more widely in coming weeks.

Is Media Viewer useful to you? What do you like most? least? How can we improve this tool? Are there any critical improvements we should consider before launch?

Here are three ways you can share your feedback about this new viewing experience:

1. Join our IRC chat
We’re hosting a live IRC chat in a few hours, this Wed. Apr. 9 at 18:00 UTC on #wikimedia-office. 

All are welcome! Drop by to meet the team, share your comments, ask questions about this release, or make suggestions for improvement.

2. Discuss this tool
Meet other beta users from around the world on our Media Viewer discussion page. Here, we talk about new features, bugs and ideas with our community.

3. Take a quick survey
Can you tell us how Media Viewer works for you? It only takes a minute and means a lot to us.

Hope to hear from you on one of these channels. Your feedback will help us improve the tool and launch it more smoothly.

Speak to you soon,

Fabrice — for the Multimedia Team

P.S.: If you haven’t tried Media Viewer yet, visit this test page on


Fabrice Florin
Product Manager, Multimedia
Wikimedia Foundation

Derk-Jan Hartman | 12 Apr 14:28 2014

Retina screens and MMV

Are retina screens being handled by MMV in any way?
I don't have one of those machines, so I have trouble verifying. (Brion has one, right?)

Gergo Tisza | 10 Apr 23:36 2014

Incident tracking

Hi team,

one piece of feedback from Aaron's recent retrospective (not sure if we have a public link for it?) that I felt was very good advice was that we should track the number and type of incidents (an incident being when something maintained by the multimedia team breaks in production), so that we can get an overview of how good our QA/CI practices are and what needs to be improved.

With that in mind I created this page:
Please expand it whenever something comes up (or even with older events if you still remember them).
Fabrice Florin | 10 Apr 21:26 2014

Media Viewer Launches on

Hi folks,

I am happy to let you know that we have just launched Media Viewer 0.2 on our first pilot site, where it is now enabled by default for all users (previously, it was only available as a Beta Feature).

Media Viewer aims to improve the multimedia viewing experience on Wikipedia and Wikimedia sites, to display images in larger size and with less clutter — as well as invite more people to use our images.

We invite you to try out this new tool today, which you can do on this test page:

Please let us know what you think on this discussion page:

You can learn more about this new feature here:

After this first pilot, we plan to enable Media Viewer by default for these next pilot sites:
• April 17 - Confirmed: Catalan, Hungarian, Korean, English Wikivoyage 
• April 24 - Proposed: Czech, Estonian, Finnish, Hebrew, Polish, Romanian, Thai, Slovak, Vietnamese

Based on these first pilot results, we plan wider releases on larger wikis in the following weeks, with a goal to deploy to all wikis next month. Our release schedule will be based on new findings at each stage of deployment. If this product performs well and meets user needs, we may accelerate the deployment pace -- or we may slow it down for some sites, as needed. 

More details are available on our updated Release Plan:

To discuss this release and review the final product together, we invite you to join our next IRC chat, on Wed. Apr. 9 at 18:00 UTC (11am PT). We also invite you to try out the tool on your own wikis, where it is available for early testing as a Beta Feature in your user preferences, as described above.

Please let us know if you have any questions, suggestions or comments about this release. And many thanks to all the community members who helped create this feature with us in recent months! 

We look forward to bringing a richer multimedia experience to your community very soon. 

Regards as ever,

on behalf of the Multimedia Team


Fabrice Florin
Product Manager, Multimedia
Wikimedia Foundation

Aaron Arcos | 10 Apr 20:42 2014

April-Sprint-3 planning notes...

Hi there folks,

End of April-Sprint-2 and the beginning of April-Sprint-3. Here are the notes,
previous wall and current wall. As always, feel free to correct/add
anything missing.

The team worked very hard and took care of most of the blocking issues
for the pilot release today, April 10. There are still some wrinkles to iron out,
but everything should be taken care of in time for deployment.

Gi11es is bug-on-duty person this week.