Jan Dudík | 4 Mar 2010 14:18

How to create many redirects?

I want to create many redirects.
I have a list with lines of the form

* Article name * redirect name

(any other separator might be used instead of *)

I want something like

python any_module.py -file:listofredirs.txt -summary:hello

How can I do it?

Jan

--
Zai Lynch | 4 Mar 2010 14:31

Re: How to create many redirects?

I always did that with pagefromfile.py (though you'd have to modify your input file a little).

http://meta.wikimedia.org/wiki/Pywikipediabot/pagefromfile.py
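
The change to the input file can be scripted. Below is a minimal sketch, not part of pywikipedia: it assumes each line has the form "* Article name * redirect name", that "Article name" is the redirect target and "redirect name" is the page to create, and that pagefromfile.py is fed its usual {{-start-}} / {{-stop-}} blocks with the title in ''' quotes. The function names are made up for illustration.

```python
def parse_redirect_line(line, sep="*"):
    """Split '* Article name * redirect name' into (target, redirect).

    Returns None for lines that do not hold exactly two non-empty fields.
    """
    fields = [part.strip() for part in line.split(sep)]
    fields = [part for part in fields if part]
    if len(fields) != 2:
        return None
    return fields[0], fields[1]


def to_pagefromfile(pairs):
    """Render (target, redirect) pairs as pagefromfile.py input text."""
    blocks = []
    for target, redirect in pairs:
        blocks.append(
            "{{-start-}}\n"
            "'''" + redirect + "'''\n"
            "#REDIRECT [[" + target + "]]\n"
            "{{-stop-}}"
        )
    return "\n".join(blocks) + "\n"


def convert(lines, sep="*"):
    """Convert an iterable of list lines, skipping malformed ones."""
    pairs = [parse_redirect_line(line, sep) for line in lines]
    return to_pagefromfile([pair for pair in pairs if pair is not None])
```

The text returned by convert() would be saved to a file and passed to pagefromfile.py with -notitle, so that the ''' title line is not copied into the redirect page itself. If the separator is not *, pass it as sep.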

--zai


_______________________________________________
Pywikipedia-l mailing list
Pywikipedia-l <at> lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l

Santiago M. Mola | 4 Mar 2010 17:35

Working on arbitrary revisions

Hi,

Currently, I have to work with old revisions of articles. I found
that most methods in Page are focused on working with the latest
revision. So it'd be very convenient to add a new Revision class where
methods like section(), isDisambig(), userName(), editTime(), etc.,
live. Then Page could properly get (and cache) any revision. These
methods in Page could be shortcuts for the corresponding methods of the
latest revision's Revision object.

What do you think?

Also, is there already a good way of working on arbitrary revisions
that I have missed?

Thanks,
--

-- 
Santiago M. Mola
Jabber ID: cooldwind <at> gmail.com
Santiago M. Mola | 4 Mar 2010 23:16

Re: Working on arbitrary revisions

On Thu, Mar 4, 2010 at 5:35 PM, Santiago M. Mola <cooldwind <at> gmail.com> wrote:
>
> Currently, I have to work with old revisions of articles. I found
> that most methods in Page are focused on working with the latest
> revision. So it'd be very convenient to add a new Revision class where
> methods like section(), isDisambig(), userName(), editTime(), etc.,
> live. Then Page could properly get (and cache) any revision. These
> methods in Page could be shortcuts for the corresponding methods of the
> latest revision's Revision object.
>

Never mind. I switched to the rewrite branch and it looks like it already
suits my needs ;-)

Best regards,
--

-- 
Santiago M. Mola
Jabber ID: cooldwind <at> gmail.com
Santiago M. Mola | 7 Mar 2010 13:07

Improvements to the rewrite branch

Hi there,

I'm currently using the rewrite branch for a project. This project is
not a bot, but a tool for vandalism analysis.

Here I'll explain how I used it and what changes I made, so it may be
useful for the new design of the rewrite. Also, I'd like to get
recommendations about my approaches so I can make them suitable for
integration with pywikipedia.

First of all, my main unit of information is Edit. An Edit is an
object composed of a Page and two consecutive revision IDs of that
page. Edit supports operations such as getting the edit
comment, user, timestamp and the old and the new text.
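
To make the description concrete, here is a minimal sketch of such an Edit object. It is not my actual code and does not use the pywikipedia classes: Page and Revision below are simplified stand-ins, and all field names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Revision:
    # Stand-in for a stored revision; field names are assumptions.
    revid: int
    user: str
    timestamp: str
    comment: str
    text: str


@dataclass
class Page:
    # Stand-in for a page that keeps its fetched revisions by revision ID.
    title: str
    revisions: Dict[int, Revision] = field(default_factory=dict)


@dataclass
class Edit:
    """A page plus two consecutive revision IDs of that page."""
    page: Page
    old_revid: int
    new_revid: int

    def _new(self):
        # The newer revision carries the metadata of the edit itself.
        return self.page.revisions[self.new_revid]

    @property
    def comment(self):
        return self._new().comment

    @property
    def user(self):
        return self._new().user

    @property
    def timestamp(self):
        return self._new().timestamp

    @property
    def old_text(self):
        return self.page.revisions[self.old_revid].text

    @property
    def new_text(self):
        return self._new().text
```

The Edit itself stores only IDs; everything else is looked up through the Page, so the revision data lives in one place.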

I had to implement a method similar to BaseSite.loadrevisions():
given a list of edits that carry their revision IDs but
NOT their Page, fetch them and associate them with their Page object.
This method retrieves all the revisions and creates Page objects for them
plus Revision objects, which are stored in the corresponding
Page._revisions dict.

Then, I have to store all this info on disk for later use. So I wrote
a function for exporting my list of edits to XML, using MediaWiki's
export format 0.4. To ease this process, I added a to_element() method
to Page and Revision objects. to_element() returns an Element object
(from the ElementTree API) representing the object. So, exporting is
as easy as iterating over all Pages, calling their to_element()
method and appending the result to a common root. What do you think about
this? Should it be included in pywikipedia? Do you prefer a different
approach for exporting to XML?
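
A rough sketch of the idea, written as free functions over plain values rather than as the actual to_element() methods. The element names follow the usual page/revision layout of the export format, but this is a simplification: the real format also declares an XML namespace and version on the root and has more fields, all omitted here. The function names are illustrative only.

```python
import xml.etree.ElementTree as ET


def revision_to_element(revid, timestamp, username, comment, text):
    """Build a <revision> element from plain values."""
    rev = ET.Element("revision")
    ET.SubElement(rev, "id").text = str(revid)
    ET.SubElement(rev, "timestamp").text = timestamp
    contributor = ET.SubElement(rev, "contributor")
    ET.SubElement(contributor, "username").text = username
    ET.SubElement(rev, "comment").text = comment
    ET.SubElement(rev, "text").text = text
    return rev


def page_to_element(title, revision_elements):
    """Build a <page> element holding a title and its revisions."""
    page = ET.Element("page")
    ET.SubElement(page, "title").text = title
    page.extend(revision_elements)
    return page


def export_pages(page_elements):
    """Append all page elements to a common root and serialize it."""
    root = ET.Element("mediawiki")
    root.extend(page_elements)
    return ET.tostring(root, encoding="unicode")
```

Because each object only builds its own element, the exporter does not need to know anything about the internals of a page or revision.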

For importing again from XML, I adapted the old XmlDump. My version
yields Page objects instead of revisions. Of course this might be a
performance nightmare when working with XML dumps with full history,
so it can be modified to yield Revision objects.

I think the Revision class should include a page attribute, containing
the Page object that the Revision belongs to. That would be of use,
for example, when writing an XmlDump yielding Revisions and, in
general, for more applications that are Revision oriented.
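
As an illustration of a Revision-oriented reader, the sketch below streams an XML export with ElementTree's iterparse and yields one revision at a time, each with a page attribute. It is not the XmlDump code: DumpPage and DumpRevision are hypothetical stand-ins, and the input is assumed to use un-namespaced tags (a real dump declares a namespace, so the tag comparisons would need adjusting).

```python
import io
import xml.etree.ElementTree as ET


class DumpPage:
    # Stand-in for a Page; only the title is kept.
    def __init__(self, title):
        self.title = title


class DumpRevision:
    def __init__(self, page, revid, text):
        self.page = page      # back-reference to the owning page
        self.revid = revid
        self.text = text


def iter_revisions(source):
    """Yield DumpRevision objects one at a time from an XML export."""
    page = None
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "title":
            # <title> closes before any <revision> of the same page.
            page = DumpPage(elem.text)
        elif elem.tag == "revision":
            yield DumpRevision(page, int(elem.findtext("id")),
                               elem.findtext("text") or "")
            elem.clear()  # drop the revision's subtree once consumed


def revisions_from_bytes(data):
    """Convenience wrapper for in-memory XML."""
    return list(iter_revisions(io.BytesIO(data)))
```

All revisions of one page share the same page object, so a consumer can group by page without the reader ever holding a full history in memory.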

And last but not least, currently it's easy to end up with multiple
Page objects representing the same page, but with different object
state. Do you think that BaseSite should implement a Page factory or
some way to "create a Page object for this title if it doesn't exist
or give me the one that already exists"?
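
What I have in mind is something like the sketch below. Site and Page here are hypothetical stand-ins, not the rewrite's BaseSite and Page, and the method name page() is made up.

```python
class Page:
    # Stand-in for a real Page class; only the title matters here.
    def __init__(self, site, title):
        self.site = site
        self.title = title


class Site:
    """Hypothetical site object that hands out one Page per title."""

    def __init__(self, name):
        self.name = name
        self._pages = {}

    def page(self, title):
        """Return the cached Page for this title, creating it on first use."""
        try:
            return self._pages[title]
        except KeyError:
            page = self._pages[title] = Page(self, title)
            return page
```

Two things this sketch leaves open: titles would have to be normalized before the lookup so that equivalent spellings map to the same object, and a plain dict keeps every Page alive forever, so a weakref.WeakValueDictionary might be a better fit for long-running tools.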

Well, that's all at the moment.

Best regards,
--

-- 
Santiago M. Mola
Jabber ID: cooldwind <at> gmail.com
Pablo Recio Quijano | 15 Mar 2010 21:38

Doubt about files hierarchy

Hi

I use Python frequently, and today I started working with pywikipediabot, which is a very good library, by the way.

But I think the workflow is far from the Python way. Let me explain:

To write a script that uses this environment, you need to put the code in the main directory of pywikipediabot, or create links to that directory. But usually, when you use a third-party module in Python, you should be able to "install" the module and load it with a simple

import pywikipediabot

or

from pywikipediabot import wikipedia

And you should be able to do this in any directory on your system, without any extra configuration or files. I think this would be a nice feature, because it respects the Python way and makes the module much easier to distribute using 'distutils'[1] or even Debian packages.

[1] http://docs.python.org/distutils/index.html
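
Until something like that exists, the "links to that directory" workaround can also be done from inside a script by extending the import path. This is only a sketch: the checkout location is a made-up example, and it does not address whether the framework then finds its configuration when run from another working directory.

```python
import os
import sys


def add_checkout_to_path(checkout_dir):
    """Put a pywikipedia checkout directory at the front of sys.path."""
    checkout_dir = os.path.abspath(os.path.expanduser(checkout_dir))
    if checkout_dir not in sys.path:
        sys.path.insert(0, checkout_dir)
    return checkout_dir


# Hypothetical location of the checkout; adjust to your system.
PYWIKIPEDIA_DIR = add_checkout_to_path("~/pywikipedia")
# After this, "import wikipedia" can find wikipedia.py in that directory,
# provided the checkout really is there.
```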

Regards,
Pablo Recio

Nicolas Dumazet | 15 Mar 2010 22:37

Re: Doubt about files hierarchy

Hello,

2010/3/16 Pablo Recio Quijano <rikutheronin <at> gmail.com>:
> Hi
>
> I use Python frequently, and today I started working with pywikipediabot,
> which is a very good library, by the way.
>
> But I think the workflow is far from the Python way. Let me explain:
>
> To write a script that uses this environment, you need to put the code in
> the main directory of pywikipediabot, or create links to that directory.
> But usually, when you use a third-party module in Python, you should be
> able to "install" the module and load it with a simple
>
> import pywikipediabot
>
> or
>
> from pywikipediabot import wikipedia
>
> And you should be able to do this in any directory on your system, without
> any extra configuration or files. I think this would be a nice feature,
> because it respects the Python way and makes the module much easier to
> distribute using 'distutils'[1] or even Debian packages.

You are right. pywikipedia started out as a collection of user scripts,
and users then began to factor out and gather common procedures. Its
current architecture has many, many flaws.
A few "crazy" people (Russell, me, some others?) no longer use the
trunk/ version of pywikipedia, but the rewrite/ branch.
The rewrite only uses the MediaWiki API to edit and retrieve data, and
is structured as a normal Python module with submodules. It even, oh
magic, includes some tests.

The original idea was to abandon trunk/ to use the rewrite, but we
lack manpower and (at least for me) time to actually do the conversion
work of all existing scripts. But I know for a fact that code is
working, and cleaner. Please give it a try :)

Regards,
>
> [1] http://docs.python.org/distutils/index.html
>
> Regards,
> Pablo Recio

--

-- 
Nicolas Dumazet — NicDumZ

Merlijn van Deen | 16 Mar 2010 12:11

Rewrite available as nightly package

Dear all,


In response to Nicolas' e-mail:
> The original idea was to abandon trunk/ to use the rewrite, but we
> lack manpower and (at least for me) time to actually do the conversion
> work of all existing scripts. But I know for a fact that code is
> working, and cleaner. Please give it a try :)

I decided to clean up the nightlies page: I removed all the clutter (spelling, threadedhttp, pywikiparser) and added the rewrite (in other words: only the 'pywikipedia' and 'rewrite' packages remain).


Best regards,
Merlijn van Deen / valhallasw
Chris Watkins | 26 Mar 2010 07:04

Copy pages, or create a new page with pywikipediabot?

I want to copy (not move) a few hundred pages from one namespace to another. Any ideas how?

I think I can do it in two stages - if I can create a page. I'd create each page with only the page name, then convert that to a transclusion from the mainspace article:

python replace.py -regex ".*" "{{subst:PAGENAME}}" -file:pagelist
python replace.py -regex ".*" "{{subst::\\1}}" -file:pagelist

But I get the error "Page [[blah blah]] not found" when I run the first command. I also tried with "" as the search string in the first line, but it's the same. And ideally I'd like it to exclude cases where the page already exists.

Any solution? Thanks!

--
Chris Watkins

Appropedia.org - Sharing knowledge to build rich, sustainable lives.

blogs.appropedia.org
community.livejournal.com/appropedia
identi.ca/appropedia
twitter.com/appropedia
Zai Lynch | 26 Mar 2010 09:11

Re: Copy pages, or create a new page with pywikipediabot?

This is another case where I'd use "pagefromfile.py".
http://meta.wikimedia.org/wiki/Pywikipediabot/pagefromfile.py

Modify your pagelist so it has a format like

{{-start-}}
'''Namespace:Pagename1'''
{{subst::Pagename1}}
{{-stop-}}
{{-start-}}
'''Namespace:Pagename2'''
{{subst::Pagename2}}
{{-stop-}}
etc.

(Note the double colon in {{subst::Pagename1}}; you need it if your articles are in the main namespace.)

You can do that easily with some clever search-and-replace in the text editor of your choice.

Then call the bot with
python pagefromfile.py -notitle -file:pagelist
This also automatically skips pages that already exist (unless you call the script with the -force parameter).
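
If search-and-replace in an editor is awkward, the same conversion can be scripted. A minimal sketch, assuming the pagelist holds one main-namespace page name per line; the function name and the target namespace argument are made up for illustration and are not part of pywikipedia:

```python
def build_copy_entries(pagenames, target_namespace):
    """Render pagefromfile.py input that copies pages via subst."""
    entries = []
    for name in pagenames:
        name = name.strip()
        if not name:
            continue  # skip blank lines in the list
        entries.append(
            "{{-start-}}\n"
            "'''" + target_namespace + ":" + name + "'''\n"
            "{{subst::" + name + "}}\n"
            "{{-stop-}}"
        )
    return "\n".join(entries) + "\n"
```

Write the returned text to the file you pass to pagefromfile.py with -file:.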

Hope that helps!
--zai
