Vadim Shlyakhov | 22 Apr 18:11 2015

Parsoid to render #Widget parser function result

Hi,

As before, I'm experimenting with a wiki that runs quite an old MediaWiki (http://www.cruiserswiki.org/wiki/Special:Version).

We have the Widgets extension installed, and Parsoid does not seem to render the result of the #widget parser function.

For example, at http://www.cruiserswiki.org/wiki/User:Vadim/Sandbox3 we have:

 <nowiki>{{</nowiki>#Widget:Dmh2Deg|d=41|m=14.74|h=N}}

{{#Widget:Dmh2Deg|d=41|m=14.74|h=N}}

 <nowiki>{{</nowiki>#Widget:Dmh2Deg|d=09|m=11.93|h=E}}

{{#Widget:Dmh2Deg|d=09|m=11.93|h=E}}

which renders to
{{#Widget:Dmh2Deg|d=41|m=14.74|h=N}}

41.245666666667

{{#Widget:Dmh2Deg|d=09|m=11.93|h=E}}

9.1988333333333


but Parsoid gives:

{{#Widget:Dmh2Deg|d=41|m=14.74|h=N}}

{{#Widget:Dmh2Deg|d=09|m=11.93|h=E}}

In the HTML source, however, I see:

<!-- ENCODED_CONTENT PCEtLQotLT48IS0tCi0tPjQxLjI0NTY2NjY2NjY2Nw== -->

and

<!--  ENCODED_CONTENT PCEtLQotLT48IS0tCi0tPjkuMTk4ODMzMzMzMzMzMw== -->

which gives 41.245666666667 and 9.1988333333333 if base64-decoded.
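
The decoding step can be checked with a short Node.js sketch. The helper name `decodeEncodedContent` is hypothetical, and the regular expression assumes the comment has exactly the `ENCODED_CONTENT <base64>` shape quoted above. Note that the decoded payload also carries two empty HTML comments ahead of the number, which the sketch strips.

```javascript
// Hypothetical helper: extract and decode the base64 payload of an
// "ENCODED_CONTENT" HTML comment like the ones quoted above.
function decodeEncodedContent(comment) {
    const match = /<!--\s*ENCODED_CONTENT\s+([A-Za-z0-9+\/=]+)\s*-->/.exec(comment);
    if (!match) {
        return null; // not an ENCODED_CONTENT comment
    }
    // Buffer is a Node.js global; decode the base64 payload as UTF-8 text.
    const decoded = Buffer.from(match[1], 'base64').toString('utf8');
    // The payload itself starts with empty HTML comments; drop them.
    return decoded.replace(/<!--[\s\S]*?-->/g, '').trim();
}

const sample = '<!-- ENCODED_CONTENT PCEtLQotLT48IS0tCi0tPjQxLjI0NTY2NjY2NjY2Nw== -->';
console.log(decodeEncodedContent(sample)); // prints 41.245666666667
```

So the widget output does reach the Parsoid HTML, but only inside a comment rather than as visible content.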

I'm quite confused about this behaviour.

I wonder if someone could help me to solve this issue.

Thanks

Vadim
_______________________________________________
Wikitext-l mailing list
Wikitext-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitext-l
Vadim Shlyakhov | 22 Apr 17:56 2015

Parsoid crashes if no mediatype received

Hello,

Some older versions of MediaWiki (1.16) do not return "mediatype" in "iiprop" for media files. The Parsoid worker then crashes when it tries to parse the reply from the server.

Here is a workaround for the issue:

--- ext.core.LinkHandler.js-    2015-04-22 15:30:04.000000000 +0300
+++ ext.core.LinkHandler.js    2015-04-22 17:03:11.000000000 +0300
@@ -1011,7 +1012,11 @@
         // Add (read-only) information about original file size (T64881)
         img.addAttribute( 'data-file-width', info.width );
         img.addAttribute( 'data-file-height', info.height );
-        img.addAttribute( 'data-file-type', info.mediatype.toLowerCase() );
+        if (info.mediatype) {
+            img.addAttribute( 'data-file-type', info.mediatype.toLowerCase() );
+        }
     }
 
     if ( hasImageLink ) {



Would it be possible to include this fix in Parsoid?

Regards
Vadim
_______________________________________________
Vadim Shlyakhov | 1 Mar 15:28 2015

Parsoid to preserve custom tag

Hi,

As before, I'm experimenting with a wiki that runs quite an old MediaWiki (http://www.cruiserswiki.org/wiki/Special:Version).

Some tags there do not seem to be translated by Parsoid (http://www.cruiserswiki.org/wiki/Sandbox):

<imagemap>
Image:NewcastleRoot.jpg|thumb|center|frame|Route From Newcastle - ''Click on name or areas''
rect 40 465 140 535 [[Sydney]]
rect 90 420 172 450 [[Newcastle]]
rect 132 359 215 415 [[Port Stephens]]
rect 177 275 265 335 [[Port Macquarie]]
rect 190 195 275 253 [[Coffs Harbour]]
rect 160 1 270 40 [[Brisbane]]
rect 510 288 687 360 [[Lord Howe Island]]
poly 124 600 196 421 271 239 272 1 648 1 648 600 [[Tasman Sea]]
</imagemap>

==gallery test==
<gallery widths="185px" heights="140px" perrow="4">
Image:TrapaniPurgatorio.jpg|Chiesa del Purgatorio, Trapani
Image:EriceView.jpg|View from Erice
Image:EriceCastle.jpg|Norman castle, Erice
Image:EriceChurch.jpg|Norman church, Erice
</gallery>

Which Parsoid renders to something like:

<imagemap> Image:NewcastleRoot.jpg|thumb|center|frame|Route From Newcastle - Click on name or areas rect 40 465 140 535 Sydney rect 90 420 172 450 Newcastle rect 132 359 215 415 Port Stephens rect 177 275 265 335 Port Macquarie rect 190 195 275 253 Coffs Harbour rect 160 1 270 40 Brisbane rect 510 288 687 360 Lord Howe Island poly 124 600 196 421 271 239 272 1 648 1 648 600 Tasman Sea </imagemap>
gallery test

<gallery widths="185px" heights="140px" perrow="4"> Image:TrapaniPurgatorio.jpg|Chiesa del Purgatorio, Trapani Image:EriceView.jpg|View from Erice Image:EriceCastle.jpg|Norman castle, Erice Image:EriceChurch.jpg|Norman church, Erice </gallery>

Apparently I'm missing something important. Any ideas?

Thanks
Vadim
_______________________________________________
Daniel Friesen | 23 Feb 22:56 2015

Getting a sqlite database of extension metadata

I've published the first step of that MediaWiki Extension tool. So far it's only a script that can populate a database with extension metadata from extensions in Gerrit.

So, if you have any use for your own sqlite database of extension metadata, here's how to get one (assuming you have node and git installed).

$ git clone -b v0.1.1 https://github.com/redwerks/mediawiki-extensionservice.git
$ cd mediawiki-extensionservice/
$ npm install
$ npm install -g sequelize-cli
$ npm install sqlite3
$ mkdir storage/
$ echo "STORAGE_DIR=./storage" >> .env
$ echo "DATABASE_TYPE=sqlite" >> .env
$ echo "DATABASE_STORAGE=./storage/db.sqlite" >> .env
$ sequelize db:migrate
$ bin/cron.js

You'll have to wait a few minutes for it to finish. But at the end you can use whatever sqlite tools you have to look at the database in `./storage/db.sqlite`.

The Extensions table will contain a list of extensions. Besides the extid and composerName indexes, each row will have a data column containing JSON with data on the extension.

Some of this data is only available for extensions containing an extension.json file (The .php file is not parsed).
- name: The name of the extension (English text for "namemsg" -> "name" -> final fallback to the extension's dirname)
- description: The extension description (English text for "descriptionmsg" -> "description" -> empty)
- versionHint: The "version" in extension.json.

This data will always be available:
- repository: The git repository url.
- composerName: The "name" in composer.json if present.
- sources: An array of some of the possible ways to install the extension.
  - git-master if the repository has a master branch (basically everything)
  - git-stable if the repository has a HEAD that points to something other than master (in this case a "stableBranch" will also be present)
  - git-rel if the repository has REL#_## branches (basically everything)
  - git-tag if the repository has #.#.# or v#.#.# tags
  - composer if the repository has a valid composer.json with a name.
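
As a rough illustration of consuming the data column, the sketch below parses one row's JSON and picks an install source. The sample row is made up, and the assumption that "sources" is an array of plain strings such as "git-tag" is mine; check the actual JSON in your database before relying on that shape. `pickSource` and its preference order are hypothetical.

```javascript
// Made-up example of what a row's "data" column might contain.
// Field names follow the list above; the values and the exact shape
// of "sources" (plain strings) are assumptions.
const rowData = JSON.stringify({
    name: 'ExampleExtension',
    description: '',
    repository: 'https://example.org/ExampleExtension.git',
    sources: ['git-master', 'git-rel']
});

// Hypothetical preference order for choosing how to install.
const PREFERRED = ['composer', 'git-tag', 'git-rel', 'git-stable', 'git-master'];

function pickSource(dataJson) {
    const data = JSON.parse(dataJson);
    const sources = Array.isArray(data.sources) ? data.sources : [];
    // Return the first preferred source the extension offers, if any.
    const found = PREFERRED.find((s) => sources.includes(s));
    return found || null;
}

console.log(pickSource(rowData)); // prints git-rel
```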

-- ~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://danielfriesen.name/]
_______________________________________________
Subramanya Sastry | 20 Feb 18:41 2015

DOM Spec change: Upcoming changes to data-mw output for <ref> tags


As part of T88290, we are going to be making some changes to the data-mw spec for <ref> tags.

So far, the data-mw attribute for <ref> tags had the entire HTML for the reference represented in the body.html property in data-mw. However, in order to reduce the size of the HTML that we generate (and reduce network load and parsing load on clients, especially visual editor), we have been working on a change where we add a reference to the HTML via the body.id attribute in data-mw.

https://gerrit.wikimedia.org/r/#/c/191593/ is the patch that Marc has been working on.

An example at the end of this email will show the specific change and how it looks. We will update the DOM spec page[1] shortly.

So, once this patch is reviewed, tested, and deployed (most likely Feb 25 or Mar 2 unless there are concerns / problems that show up), Parsoid will only be emitting an id-based reference to the HTML. However, Parsoid will continue to accept both data-mw.body.html and data-mw.body.id for serialization.

That said, because of the specifics of Parsoid's selective serializer implementation, if a <ref>'s content has been edited, Parsoid expects to see *some* edit in the wrapper HTML of the <ref> itself. If you continue to send Parsoid data-mw.body.html back, all will work fine (since that will register as an edit). But if you send Parsoid data-mw.body.id back, you should change the value of that id to a different value.
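
For clients that send data-mw.body.id back, the id change could look roughly like the sketch below. `markRefEdited` is a hypothetical helper, and appending a suffix is only one way to produce "a different value"; this email does not specify how the new id should be chosen, so treat the scheme as an assumption.

```javascript
// Hypothetical helper: given the data-mw JSON string of an edited <ref>,
// return a new JSON string whose body.id differs from the original, so
// that the <ref> wrapper registers as changed.
function markRefEdited(dataMwJson, suffix) {
    const dataMw = JSON.parse(dataMwJson);
    if (dataMw.body && typeof dataMw.body.id === 'string') {
        // The suffix scheme is an assumption; any different value should do.
        dataMw.body.id = dataMw.body.id + suffix;
    }
    // If the client still uses body.html, the object is returned unchanged.
    return JSON.stringify(dataMw);
}

const original = '{"name":"ref","body":{"id":"mw-reference-text-cite_note-1"},"attrs":{}}';
console.log(markRefEdited(original, '-edited'));
```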

This update is a bit late in coming -- we lost track of it amidst other work -- but as far as we know, only VE is affected by this change, and they have already fixed their code. Flow has confirmed they are not affected. Let us know if there are any questions or concerns.

Subbu and Marc.

[1] https://www.mediawiki.org/wiki/Parsoid/MediaWiki_DOM_spec

--------------------------------------------------------------------------------------------------------------------------------------------------------

Wikitext
--------
A <ref>
 This is a '''[[bolded link]]''' and this is a {{echo|transclusion}}
</ref>

<references />

Current HTML
------------
<p>A <span about="#mwt2" class="reference" id="cite_ref-1" rel="dc:references" typeof="mw:Extension/ref" data-mw='{"name":"ref","body":{"html":"This is a &lt;b data-parsoid=&#39;{\"dsr\":[19,40,3,3]}&#39;>&lt;a rel=\"mw:WikiLink\" href=\"./Bolded_link\" title=\"Bolded link\" data-parsoid=&#39;{\"stx\":\"simple\",\"a\":{\"href\":\"./Bolded_link\"},\"sa\":{\"href\":\"bolded link\"},\"dsr\":[22,37,2,2]}&#39;>bolded link&lt;/a>&lt;/b> and this is a &lt;span about=\"#mwt3\" typeof=\"mw:Transclusion\" data-parsoid=&#39;{\"pi\":[[{\"k\":\"1\",\"spc\":[\"\",\"\",\"\",\"\"]}]],\"dsr\":[55,76,null,null]}&#39; data-mw=&#39;{\"parts\":[{\"template\":{\"target\":{\"wt\":\"echo\",\"href\":\"./Template:Echo\"},\"params\":{\"1\":{\"wt\":\"transclusion\"}},\"i\":0}}]}&#39;>transclusion&lt;/span>\n"},"attrs":{}}'><a href="#cite_note-1">[1]</a></span></p>

<ol class="references" typeof="mw:Extension/references" about="#mwt5" data-mw='{"name":"references","attrs":{}}'>
<li about="#cite_note-1" id="cite_note-1"><span rel="mw:referencedBy"><a href="#cite_ref-1">↑</a></span> This is a <b><a rel="mw:WikiLink" href="./Bolded_link" title="Bolded link">bolded link</a></b> and this is a <span about="#mwt3" typeof="mw:Transclusion" data-mw='{"parts":[{"template":{"target":{"wt":"echo","href":"./Template:Echo"},"params":{"1":{"wt":"transclusion"}},"i":0}}]}'>transclusion</span>
</li>
</ol>

New HTML
--------
<p>A <span about="#mwt2" class="reference" id="cite_ref-1" rel="dc:references" typeof="mw:Extension/ref" data-mw='{"name":"ref","body":{"id": "mw-reference-text-cite_note-1"},"attrs":{}}'><a href="#cite_note-1">[1]</a></span></p>

<ol class="references" typeof="mw:Extension/references" about="#mwt5" data-mw='{"name":"references","attrs":{}}'>
<li about="#cite_note-1" id="cite_note-1"><span rel="mw:referencedBy"><a href="#cite_ref-1">↑</a></span> <span id="mw-reference-text-cite_note-1" class="mw-reference-text">This is a <b><a rel="mw:WikiLink" href="./Bolded_link" title="Bolded link">bolded link</a></b> and this is a <span about="#mwt3" typeof="mw:Transclusion" data-mw='{"parts":[{"template":{"target":{"wt":"echo","href":"./Template:Echo"},"params":{"1":{"wt":"transclusion"}},"i":0}}]}'>transclusion</span> </span>
</li>
</ol>

_______________________________________________
Nitin Gupta | 14 Feb 06:49 2015

Parsoid template expansion

Hi,

I created a nodejs service to convert wikitext to HTML, with a frontend (written in Golang) which reads a Wikipedia dump and feeds wikitext to this service[1]. However, after doing all this I discovered that Parsoid needs to contact Wikimedia servers for template expansion. Since I want to convert the entire Wikipedia dump to HTML, I do not want to keep hitting Wikimedia servers with template expansion requests.

So, are there any plans to add support to Parsoid for doing this expansion offline? Once the Wikipedia dump is downloaded, I want the entire process of converting to HTML to be offline (of course, I don't need images).


Thanks,
Nitin

_______________________________________________
Vadim Shlyakhov | 5 Feb 11:25 2015

http://www.cruiserswiki.org/wiki/

Hello there,

I have a bit of a problem with parsing http://www.cruiserswiki.org/wiki/ on my machine.

When I use Parsoid with this wiki, none of the external links on a page seem to be converted into <a> elements.

For example, on the page http://www.cruiserswiki.org/wiki/Cesme the external link
[http://en.wikipedia.org/wiki/Cesme Çesme] (near the bottom of the page) appears in the Parsoid output as is, instead of being converted to something like: <a href="http://en.wikipedia.org/wiki/Cesme" class="external text" rel="nofollow" target="_blank">

I have Ubuntu 14.04 on my machine and initially I tried version 0.2.0 of Parsoid from http://parsoid.wmflabs.org:8080/deb . Then I cloned Parsoid from the git repository. But the result is the same.

Any feedback would be highly appreciated.

Regards
Vadim
_______________________________________________
Emmanuel Engelhart | 15 Nov 13:07 2014

Questions about Parsoid caching HTTP headers

Hi

I'm currently trying to create a cache for "mwoffliner": a cache for images
(thumbnails) and a cache for Parsoid output. For the images/thumbnails
it's pretty straightforward thanks to the "last-modified" header.

Unfortunately, for the Parsoid output, this seems to be more
complicated. Gabriel's htmldumper relies only on the oldid value, but
I'm not really satisfied by this approach because I want to be able to
download a new version of the HTML for the same oldid if necessary (for
example if the HTML output was improved by a Parsoid fix).

There is an "age" header, but I don't really understand the fundamental
difference from "last-modified". Do we have the same information here,
just presented in another way? If yes, why is that better than
"last-modified"?

There is in addition the "x-varnish" header, but this is IMO internal
information I should not rely on (and BTW, from time to time we get
responses with two "x-varnish" header entries, which looks pretty weird
to me - see PS).

Finally, my question: could we introduce a "last-modified" HTTP header?

Regards
Emmanuel

PS: Here is an example of a request with two "x-varnish" headers:

$ curl -I "http://parsoid-lb.eqiad.wikimedia.org/dewiki/Almer%C3%ADa?oldid=133672544"
HTTP/1.1 200 OK
X-Powered-By: Express
Vary: Accept-Encoding
Access-Control-Allow-Origin: *
Cache-Control: s-maxage=2592000
content-revision-id: 133672544
X-Parsoid-Performance: duration=4063; start=1416051524354
Content-Type: text/html; charset=UTF-8
X-Varnish: 735376643 735208307
Via: 1.1 varnish
Date: Sat, 15 Nov 2014 12:03:47 GMT
X-Varnish: 1047669169
Age: 1499
Via: 1.1 varnish
Connection: keep-alive
X-Cache: cp1058 hit (6), cp1058 frontend miss (0)
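
On the "age" question, a small sketch may help show how the two headers relate. In HTTP caching, Age is the cache's estimate, in seconds, of how long ago the response was generated or validated at the origin, so subtracting it from Date approximates the generation time. The helper name `estimateGeneratedAt` is hypothetical, and the result is only an estimate, not a substitute for a real "last-modified" value.

```javascript
// Estimate when a cached response was generated: Date minus Age (seconds).
function estimateGeneratedAt(dateHeader, ageHeader) {
    const servedAt = Date.parse(dateHeader); // milliseconds since the epoch
    const ageSeconds = parseInt(ageHeader, 10);
    if (Number.isNaN(servedAt) || Number.isNaN(ageSeconds)) {
        return null; // missing or malformed header
    }
    return new Date(servedAt - ageSeconds * 1000);
}

// Values taken from the curl output above.
const generated = estimateGeneratedAt('Sat, 15 Nov 2014 12:03:47 GMT', '1499');
console.log(generated.toISOString()); // prints 2014-11-15T11:38:48.000Z
```

In this capture the estimate falls within a second of start plus duration from the X-Parsoid-Performance header.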

-- 
Kiwix - Wikipedia Offline & more
* Web: http://www.kiwix.org
* Twitter: https://twitter.com/KiwixOffline
* more: http://www.kiwix.org/wiki/Communication

_______________________________________________
Andre Klapper | 3 Nov 14:18 2014

Google Code-In 2014: Become a mentor and add tasks!

Hi Parsoid & VisualEditor crew,

Google Code-In (GCI) will soon take place again - a contest for 13-17
year old students to contribute to free software projects. 

Wikimedia wants to take part again. 
Last year's GCI results were surprisingly good - see
https://www.mediawiki.org/wiki/Google_Code-in_2013

We need your help:

1) Go to 
https://www.mediawiki.org/wiki/Google_Code-in_2014#Mentors.27_corner and
read the information there. If something is unclear, ask!

2) Add yourself to the table of mentors on 
https://www.mediawiki.org/wiki/Google_Code-in_2014#Contacting_Wikimedia_mentors
- the more mentors are listed the better our chances are that Google
accepts us.

3) Please take ten minutes and go through open recent tickets in
https://bugzilla.wikimedia.org in your area of interest. If you see
self-contained, non-controversial issues with a clear approach which you
can recommend to new developers and would mentor: Add the task to
https://www.mediawiki.org/wiki/Google_Code-in_2014#Proposed_tasks

Until Sunday November 12th, we need at least five tasks from each of
these categories (plus some less technical beginner tasks as well):
* Code: Tasks related to writing or refactoring code
* Documentation/Training: Tasks related to creating/editing documents
and helping others learn more - no translation tasks
* Outreach/research: Tasks related to community management,
outreach/marketing, or studying problems and recommending solutions
* Quality Assurance: Tasks related to testing and ensuring code is of
high quality
* User Interface: Tasks related to user experience research or user
interface design and interaction

Google wants every organization to have 100+ tasks available on December
1st. Last year, we had 273 tasks in the end.

Note that you could also create rather generic tasks, for example fixing
two interface messages from the list of dependencies of
https://bugzilla.wikimedia.org/show_bug.cgi?id=38638

Helpful Bugzilla links:

* Reports that were proposed for GCI last year and are still open:
https://bugzilla.wikimedia.org/buglist.cgi?quicksearch=ALL%20whiteboard%3Agci2014

* Open Parsoid tickets created in the last six months (if I got your
products and components right):
https://bugzilla.wikimedia.org/buglist.cgi?bug_status=UNCONFIRMED&bug_status=NEW&bug_status=REOPENED&chfield=[Bug%20creation]&chfieldfrom=-6m&keywords=easy&keywords_type=nowords&resolution=---&product=Parsoid

* 8 existing Parsoid "easy" tickets (are they still valid? Are they
really self-contained, non-controversial issues with a clear approach?
Could some of them be GCI tasks that you would mentor? If so, please tag
them as described above!):
https://bugzilla.wikimedia.org/buglist.cgi?bug_status=UNCONFIRMED&bug_status=NEW&bug_status=REOPENED&keywords=easy&keywords_type=allwords&resolution=---&product=Parsoid

* Open VisualEditor tickets created in the last six months (if I got
your products and components right):
https://bugzilla.wikimedia.org/buglist.cgi?bug_status=UNCONFIRMED&bug_status=NEW&bug_status=REOPENED&chfield=[Bug%20creation]&chfieldfrom=-6m&keywords=easy&keywords_type=nowords&resolution=---&product=VisualEditor

* Zero existing VisualEditor "easy" tickets (are they still valid? Are
they really self-contained, non-controversial issues with a clear
approach? Could some of them be GCI tasks that you would mentor? If so,
please tag them as described above!):
https://bugzilla.wikimedia.org/buglist.cgi?bug_status=UNCONFIRMED&bug_status=NEW&bug_status=REOPENED&keywords=easy&keywords_type=allwords&resolution=---&product=VisualEditor

Could you imagine mentoring some of these tasks?

Thank you for your help in reaching out to new contributors and making
GCI a success again! Please ask if you have questions.

Cheers,
andre

PS: And in a future Phabricator world, Bugzilla tickets with the 'easy'
keyword will become Phabricator tasks with the 'easy' project.
-- 
Andre Klapper | Wikimedia Bugwrangler
http://blogs.gnome.org/aklapper/

_______________________________________________
Chris Croome | 17 Jun 14:40 2014

Problems after switching to the .deb version of parsoid

Hi

I maintain the server running the wiki at
https://wiki.transitionnetwork.org/ and 3 months ago it was upgraded to
MediaWiki 1.22.x and the VisualEditor was installed from source, which
was quite straightforward. Details here:

- https://trac.transitionnetwork.org/trac/ticket/706

Yesterday I upgraded to MediaWiki 1.23.0 and switched to using the
Debian repo for parsoid as per:

- https://www.mediawiki.org/wiki/Parsoid/Setup#Ubuntu_.2F_Debian

Since then I have been unable to get the VisualEditor working. When
you click on 'Edit' nothing happens. Looking at the HTTP requests, there
is a GET to load.php and a 304 in response, but nothing actually happens
in the web browser. I have tried with Firefox 30 and the Debian Wheezy
Chromium.

There is nothing of note in the webserver logs or the parsoid.log.

It appears to be working correctly on the command line:

  curl http://localhost:8142/localhost/Sandbox -d wt="Hello ''world''" -d body=1
    <body data-parsoid='{"dsr":[0,15,0,0]}'><p data-parsoid='{"dsr":[0,15,0,0]}'>Hello <i data-parsoid='{"dsr":[6,15,2,2]}'>world</i></p></body>

I'm not sure what to try next.

More details, including the Nginx config, can be found in the comments I
have posted to this ticket:

- https://trac.transitionnetwork.org/trac/ticket/736

Any pointers would be greatly appreciated, I fear I have made a simple
mistake somewhere...

All the best

Chris

-- 
Webarchitects Co-operative
http://webarchitects.coop/
+44 114 276 9709
@webarchcoop

_______________________________________________
Pratik Lahoti | 9 Jun 19:37 2014

Markup cleansing by clearing all linguistic elements

Hi,

I am working on the mass migration tools project as a part of Google Summer of Code. One part of the project is to import old translations into the Translate extension.

We are done with a basic import by splitting the old pages on double newlines (\n\n) and some more alignment based on h2 headers. We are now thinking of improving the alignment.

Has any work been done on the subject mentioned above? For each unit, what I would like to do is clear all the linguistic elements and leave only the bare markup. Then I could compare the markup of the source and target units and align accordingly.
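
To make the idea concrete, here is a minimal sketch of the split-then-compare approach. It is not an existing API: `splitUnits`, `markupSkeleton` and the token list are hypothetical, the regular expression covers only a few common wikitext constructs, and the split treats any run of two or more newlines as a unit boundary.

```javascript
// Split a page into units on blank lines, as the basic import does.
function splitUnits(wikitext) {
    return wikitext.split(/\n{2,}/).filter((unit) => unit.trim() !== '');
}

// Very rough "bare markup" extraction: keep only a few common wikitext
// tokens and drop everything else. The token list is an assumption and
// far from complete.
const MARKUP_TOKEN = /\[\[|\]\]|\{\{|\}\}|'{2,5}|={2,6}|<\/?[a-zA-Z][^>]*>/g;

function markupSkeleton(unit) {
    return unit.match(MARKUP_TOKEN) || [];
}

const source = "== Intro ==\n\nThis is a '''[[bolded link]]''' and a {{echo|transclusion}}.";
const target = "== Einleitung ==\n\nDies ist ein '''[[fetter Link]]''' und eine {{echo|Transklusion}}.";

// Units whose skeletons match are candidates for alignment.
const sourceShapes = splitUnits(source).map((u) => markupSkeleton(u).join(' '));
const targetShapes = splitUnits(target).map((u) => markupSkeleton(u).join(' '));
console.log(sourceShapes);
console.log(targetShapes);
```

A regex like this will miss nested templates, tables and parser functions, so a real implementation would more likely work from a parsed representation of the page.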

Are there any APIs available which already do this? Please guide me in accomplishing this task.

--
Warm Regards,
Pratik Lahoti
GSoC Intern | Wikimedia
User:BPositive
