1 Nov 01:05
1 Nov 06:25
wiki "blame" functionality
I've been working on a project to provide an annotation facility for wiki articles, sort of like the "blame" function of your favourite source control system. I've got a prototype of this with the first 1450 or so articles from enwiki here: http://hewgill.com/~greg/wikiblame/
Moving your mouse over text highlights in yellow all the text that was written in the same revision as whatever your mouse is pointing at. Clicking on text brings up the corresponding revision diff on Wikipedia. I wrote a bit about how this works here: http://ghewgill.livejournal.com/118086.html
I reckon it would take me something on the order of a couple of years to run this through all enwiki articles on my own machine. Larger articles take a few minutes to process; the largest articles (e.g. Anarchism) take an hour or more.
I wonder whether there would be any interest in making this function more generally available?
Greg Hewgill [[User:Ghewgill]] http://hewgill.com
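For readers unfamiliar with the idea, here is a minimal sketch of revision attribution. It is not the prototype's actual algorithm (that is described in the linked post); it assumes whitespace tokenisation and uses Python's difflib to carry ownership forward across revisions. The function name `blame` and the `(rev_id, text)` input shape are made up for illustration.

```python
from difflib import SequenceMatcher


def blame(revisions):
    """Attribute each token of the latest revision to the revision that introduced it.

    revisions: list of (rev_id, text) pairs, oldest first (hypothetical input shape).
    Returns a list of (token, rev_id) pairs for the newest text.
    """
    tokens = []   # tokens of the previous revision
    owners = []   # owners[i] is the revision that introduced tokens[i]
    for rev_id, text in revisions:
        new_tokens = text.split()
        # Assume every token is new, then inherit ownership for unchanged runs.
        new_owners = [rev_id] * len(new_tokens)
        matcher = SequenceMatcher(a=tokens, b=new_tokens, autojunk=False)
        for a_start, b_start, size in matcher.get_matching_blocks():
            new_owners[b_start:b_start + size] = owners[a_start:a_start + size]
        tokens, owners = new_tokens, new_owners
    return list(zip(tokens, owners))


history = [
    (1, "the quick fox"),
    (2, "the quick brown fox"),
    (3, "a quick brown fox jumps"),
]
for token, rev in blame(history):
    print(rev, token)
```

Diffing every revision against its predecessor is the expensive part, which would be consistent with long processing times on articles with many revisions.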
1 Nov 12:18
Title Rewrite
Most people have probably forgotten about it by now, but back at the
start of March[1] there was a discussion about a partial rewrite of the
Title system (though I use rewrite too heavily when I'm talking about a
big backend project, even when 90% of the system stays the same). It was
branched off the "[Wikitech-l] Case insensitive links (not just
titles)." thread.
The branch failed twice back when I was working on it in March.
Primarily I ran into the twilight zone known as branching and merging
using SVN on Windows... eugh, and the branch always got messed up and
was unmaintained.
Now, I've got my own laptop running Ubuntu, and branching has gotten a
fair bit easier.
I've restarted the titlerewrite branch again. Anyone interested can
check it out and discuss the project as well.
Just to recap for those who didn't see the old discussions, as I
remember there were 3 goals:
A) Make MediaWiki understand the concept of titles like "iPod"
By "understand", I mean [[IPod]] and [[iPod]] are still the same page,
and they still show up as IPod in the url and database. However,
MediaWiki also has "iPod" stored so it knows what the actual styled
format of the title is. Basically this is handled by adding a column
into the database.
The difference between this and {{DISPLAYTITLE:...}} is that
displaytitle is a parser hack, while this is an actual integrated part
of the Title backend. If you rename IPod to iPod on Wikipedia with this
rewrite, then Special:Allpages and everywhere else will display "iPod"
instead of "IPod" as they currently do.
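To illustrate goal A, here is a rough sketch of the idea, not MediaWiki code: the normalised key stays what it is today (first letter capitalised), and the styled form is stored next to it, as the extra database column would be. `TitleStore`, `normalize_key` and the in-memory dict are hypothetical stand-ins.

```python
def normalize_key(title):
    """Return the lookup key: underscores for spaces, first letter upper-cased."""
    title = title.strip().replace(" ", "_")
    return title[:1].upper() + title[1:]


class TitleStore:
    """Toy stand-in for the page table with an extra 'styled title' column."""

    def __init__(self):
        self._styled = {}  # key -> styled title as the editor typed it

    def save(self, styled_title):
        key = normalize_key(styled_title)
        self._styled[key] = styled_title.strip()
        return key

    def display(self, title):
        # [[IPod]] and [[iPod]] resolve to the same key, so both show "iPod".
        key = normalize_key(title)
        return self._styled.get(key, key.replace("_", " "))


store = TitleStore()
print(store.save("iPod"))     # key used in the URL and database
print(store.display("IPod"))  # styled form shown to readers
```

Because the styled form lives with the title record rather than in the parsed page text, any listing that reads titles can show it without parsing the page.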
1 Nov 18:27
Re: new extension for embedded music scores
On Tuesday 28 October 2008 13:18:06 River Tarnell wrote:
> i have written a new extension to embed music scores in MediaWiki pages:
> https://secure.wikimedia.org/wikipedia/mediawiki/wiki/Extension:ABC
>
> unlike the Lilypond extension, this uses a simple input language (ABC) that
> is much easier to validate for security. ABC is mostly used to transcribe
> Irish trad and other simple tunes, but it recently gained support for more
> advanced features, e.g. multiple staves and lyrics. this is supported in
> the extension using the 'abcm2ps' tool.
>
> unlike the existing ABC extension (AbcMusic), it doesn't support opening
> arbitrary files as ABC input (which is a potential security issue), and has
> several additional features:
>
> - The original ABC can be downloaded easily
> - The score can be downloaded as PDF, PostScript, MIDI or Ogg Vorbis
> - A media player can be embedded in the page to play the media file
>
> i believe the ABC format is suitable for transcribing the majority of
> scores currently on Wikimedia projects. although it can't handle all of
> them, it is better than the current situation. plus, as ABC is simple, and
> existing ABC scores are easily available, it's easier for novice users to
> contribute.
>
> i would be interested to hear peoples' thoughts on enabling this extension
> on Wikimedia.
One caveat only: in my tests, abc2midi didn't support the same ABC format abcm2ps did. I don't think it is a big problem, and if it turns out to be, it could be solved simply by adding an option to disable sound rendering.
1 Nov 21:05
2 Nov 00:23
Re: Title Rewrite
That sounds very nice! How far away is this from being ready for the MediaWiki trunk? -- -- Remember the dot http://en.wikipedia.org/wiki/User:Remember_the_dot
2 Nov 02:07
Re: Search options and namespace selection
FT2 wrote:
> I'd consider also using a second trick Google does. It tries to pick out a
> few specific useful links and highlight them first. In our case, pages with
> <text> in the title, or sequentially in the text as a string, may be more
> likely to be high quality hits, than pages that "just had the words
> scattered somewhere in the text".
That's what the ranking algorithm is for -- title matches are already more highly-ranked.
-- brion
2 Nov 02:08
Re: Search options and namespace selection
Leon Weber wrote:
[snip]
> In conclusion, we should have a search interface containing three basic
> elements:
> - a big search box -- Google has a simple front page with a centralised
> search box for a reason
> - some common options, like those proposed by FT2:
That's what Robert's refactoring does, which is why I'm recommending it be further polished up.
-- brion
2 Nov 08:16
much better sleep
Hello, sleep is indeed much better once your data is in multiple datacenters. Thanks to Rob and Mark for driving the new Tampa datacenter project. -- -- Domas Mituzas -- http://dammit.lt/ -- [[user:midom]]
2 Nov 09:52
Re: new extension for embedded music scores
On Thu, Oct 30, 2008 at 5:42 AM, Juliano F. Ravasi <ml@...> wrote:
> Brion Vibber wrote:
>> * Garbage collection for unused output files? Previews or changed pages
>> could leave the system littered with millions of output files which will
>> never be used again.
>>
>> It'd be nice to solve these for math and timelines as well -- perhaps a
>> unified system?
>
> I was thinking about this problem, but for Graphviz (which is based on
> Timeline, which was the source of the problem).
>
> Instead of opening a tag and dumping graph/timeline code in the article,
> we have a namespace where articles are parsed as graph/timeline code
> (some extra syntax needed in order to put some wikitext in the same
> article). Then, in article pages, you insert a link like [[Graph:Graph
> title|options|description]] or [[Timeline:Timeline title|...]], and they
> are inserted just like images are.
>
> This has a few advantages:
> - Graphs or timelines (or music scores) that appear in multiple pages
> are maintained in a single central article, changes in it propagates to
> all pages that include that object, just like it is with images;
> - It becomes easier to edit article pages, because you have only
> wikitext to look at, and not a mix of wikitext, dot syntax, timeline
> syntax, etc...
> - Perfect garbage collection: we produce no garbage. In-disk files are
> linked to special article pages; if someone deletes the article, MW just
> wipes all files with that given hash and you are done. Any file without
> a matching article in that special namespace is easily marked as garbage.
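The garbage-collection point in the quoted proposal could be sketched roughly as follows. This is only an illustration of the idea, not code from any extension: the SHA-1 naming scheme, the `.png` suffix, and the names `output_name` and `find_garbage` are all assumptions.

```python
import hashlib


def output_name(source):
    """Name a rendered file after a hash of its source text (assumed scheme)."""
    return hashlib.sha1(source.encode("utf-8")).hexdigest() + ".png"


def find_garbage(articles, files_on_disk):
    """Return rendered files that no article in the special namespace accounts for.

    articles: dict mapping a title such as "Graph:Example" to its source text.
    files_on_disk: iterable of rendered file names.
    """
    live = {output_name(source) for source in articles.values()}
    return sorted(set(files_on_disk) - live)


articles = {"Graph:Example": "digraph { a -> b }"}
files = [output_name("digraph { a -> b }"), "stale-preview.png"]
print(find_garbage(articles, files))
```

Since each file name derives from exactly one article's current source, deleting or editing the article is enough to make its old output show up in the garbage list.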