William Lewis | 6 Jul 20:07

URL blacklist

How difficult would it be to create a URL blacklist? It's getting ridiculous dealing with the same spammers
coming and posting the same links over and over again.
Philip Neustrom | 6 Jul 20:23

Re: URL blacklist

Pretty easy. There was one before, maintained as a wiki page. I
forget what it was called.

If someone sends a working patch I will apply it. I like the idea of
using the text from a wiki page to run the regexps, line by line.
There was code before, I think called blacklist.py. It may still work.
Maybe wikispot.org sysadmins can just keep that file updated as an
interim solution?

Can you email me regexps for some of the URL patterns you've seen?

Sent from my phone

On Jul 6, 2010, at 11:07 AM, William Lewis <wlewis@...> wrote:

> How difficult would it be to create a URL blacklist? It's getting  
> ridiculous dealing with the same spammers coming and posting the  
> same links over and over again.
> _______________________________________________
> Sycamore-Dev mailing list
> sycamore-dev@...
> http://www.projectsycamore.org/
> https://tools.cernio.com/pipermail/sycamore-dev/
> https://tools.cernio.com/mailman/listinfo/sycamore-dev
Graham Freeman | 9 Jul 09:59

Re: URL blacklist


On 06 Jul 10, at 11:23 , Philip Neustrom wrote:

> There was code before, i think called blacklist.py. May still work.   
> Maybe wikispot.org sysadmins can just keep that file updated as an  
> interim solution?

Happy to, if someone else provides the URLs in the appropriate (regex?) format.

-G

Sean Robinson | 10 Jul 23:06

Re: URL blacklist

On Fri, Jul 9, 2010 at 12:59 AM, Graham Freeman <graham.freeman-D8QiBNiZVM7QT0dZR+AlfA@public.gmane.org> wrote:
> On 06 Jul 10, at 11:23 , Philip Neustrom wrote:
>
>> There was code before, i think called blacklist.py. May still work.
>> Maybe wikispot.org sysadmins can just keep that file updated as an
>> interim solution?
>
> Happy to, if someone else provides the URLs in the appropriate (regex?) format.
>
> -G


  After looking at blacklist.py (but not running it), it appears that the list of restricted URLs is hard-coded into the source file.  Is anyone interested in making this use a wiki page for the list of URLs?

  Again judging from blacklist.py, it appears to take a list of URLs (one per line), strip leading and trailing whitespace from each, and join them into a pipe ("|") separated string that is compiled into the regexp.  So, a simple list of blacklisted URLs, one per line, should be sufficient.

  Here is a list of the spam URLs I have seen on my two spots:
http://customwritingservices.org/
http://cvresumewritingservices.org/professional-resume.php
http://writing-help.org/
http://essay-writer.org/
http://custom-paper-writing.com/
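
  For illustration, here is a minimal standalone sketch of the join-and-compile step described above, using the URLs from this list.  It is not the code from blacklist.py: the function name build_blacklist_re and the "literal" switch are made up for this example.  With literal=True each entry is escaped so that "." in a URL matches only a dot; with the default, entries are treated as regexps.

```python
import re


def build_blacklist_re(page_text, literal=False):
    """Compile one regexp from a blacklist with one entry per line.

    Hypothetical helper for illustration only. With literal=True each
    entry is escaped so regexp metacharacters match only themselves.
    """
    entries = [line.strip() for line in page_text.splitlines()]
    entries = [e for e in entries if e]  # skip blank lines
    if literal:
        entries = [re.escape(e) for e in entries]
    if not entries:
        return None  # nothing to match against
    return re.compile("|".join(entries))


SPAM_URLS = """
http://customwritingservices.org/
http://cvresumewritingservices.org/professional-resume.php
http://writing-help.org/
http://essay-writer.org/
http://custom-paper-writing.com/
"""

blacklist_re = build_blacklist_re(SPAM_URLS, literal=True)

spam_edit = "Great essays at http://essay-writer.org/ today!"
clean_edit = "See http://pythonwifi.wikispot.org for details."

print(blacklist_re.search(spam_edit) is not None)
print(blacklist_re.search(clean_edit) is not None)
```

  Escaping is a design choice: a list of plain URLs is easier to maintain, while unescaped entries let one line cover a whole family of spam domains.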

--
Sean Robinson
WiFi Radar - http://wifi-radar.berlios.de
Python WiFi - http://pythonwifi.wikispot.org

Ryan Tucker | 11 Jul 01:01

Re: URL blacklist

On Sat, Jul 10, 2010 at 5:06 PM, Sean Robinson
<seankrobinson@...> wrote:
> After looking at blacklist.py (but not running it), it appears that the
> list of restricted URLs are hard coded into the source file.  Is anyone
> interested in making this use a wiki page for the list of URLs?

If someone's looking for inspiration on how to retrieve the list from
a wiki page, the spell-checking code does this.  There's probably some
good room for code reuse (or modularization!) there, for sure.  -rt

Sean Robinson | 16 Jul 02:03

Re: URL blacklist

On Sat, Jul 10, 2010 at 4:01 PM, Ryan Tucker <rtucker-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> On Sat, Jul 10, 2010 at 5:06 PM, Sean Robinson <seankrobinson-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>> After looking at blacklist.py (but not running it), it appears that the list of restricted URLs are hard coded into the source file.  Is anyone interested in making this use a wiki page for the list of URLs?
>
> If someone's looking for inspiration on how to retrieve the list from
> a wiki page, the spell-checking code does this.  There's probably some
> good room for code reuse (or modularization!) there, for sure.  -rt


  The following is an attempt at a new blacklist.py.  I am seeking comments and criticism to see if I understood what I was reading in the spell check code and whether I am using PageEditor correctly, etc.  This is an early draft of non-working code, but I would appreciate feedback about my approach.

# -*- coding: utf-8 -*-
# blacklist against wiki spammers

# WikiSpot admins can add a URL to 'Global Blacklist Page' to disallow an
# edit which contains that URL.

import re
from Sycamore import config
from Sycamore.PageEditor import PageEditor
from Sycamore.request import RequestDummy
from Sycamore.security import Permissions

# get the global blacklist page contents (module level, so there is no "self")
blacklist_re = None
request = RequestDummy()
blacklist_page = PageEditor(config.global_blacklist_page, request)
if blacklist_page:
    blacklist = blacklist_page.get_raw_body()
    # one pattern per line; skip blank lines so no empty alternative is created
    patterns = [s.strip() for s in blacklist.split("\n") if s.strip()]
    if patterns:
        blacklist_re = re.compile("|".join(patterns))

class SecurityPolicy(Permissions):
    def save(self, editor, newtext, datestamp, **kw):
        # allow the save when there is no blacklist or nothing matches
        if blacklist_re is None:
            return True
        match = blacklist_re.search(newtext)
        if match:
            print("blacklist match: %s" % match.group())
        return match is None


--
Sean Robinson
WiFi Radar - http://wifi-radar.berlios.de
Python WiFi - http://pythonwifi.wikispot.org

Philip Neustrom | 16 Jul 02:26

Re: URL blacklist

Looks basically good.

One note, though:  This is going to grab the blacklist page on a
per-wiki basis, rather than being global.  You'll want to specify a
wiki id in there to grab from the hub (primary) wiki.

-p

Sean Robinson | 18 Jul 00:39

Re: URL blacklist

On Thu, Jul 15, 2010 at 5:26 PM, Philip Neustrom <philipn <at> gmail.com> wrote:
> Looks basically good.
>
> One note, though:  This is going to grab the blacklist page on a
> per-wiki basis, rather than being global.  You'll want to specify a
> wiki id in there to grab from the hub (primary) wiki.
>
> -p

  How do I create multiple wikis in a local Sycamore install?  I would like to test using the hub wiki blacklist page from child wikis.

  Below is a second version that works within limits and has been tested on a local Sycamore install.  I would again appreciate comments.

# -*- coding: utf-8 -*-
# blacklist against wiki spammers

# WikiSpot admins can add a URL to 'Global Blacklist' to disallow an
# edit which contains that URL.

import re

from Sycamore.security import Permissions

class SecurityPolicy(Permissions):
    def save(self, editor, newtext, datestamp, **kw):
        request = editor.request
        # do not enforce URL blacklisting in blacklist page
        if request.pagename == request.config.page_global_blacklist:
            return True

        # kept as a local import, as in the draft
        from Sycamore.PageEditor import PageEditor

        blacklist_page = PageEditor(request.config.page_global_blacklist, request)
        if not blacklist_page:
            return True  # no blacklist page, nothing to enforce

        # one pattern per line; skip blank lines so no empty alternative is created
        blacklist = blacklist_page.get_raw_body()
        patterns = [s.strip() for s in blacklist.split("\n") if s.strip()]
        if not patterns:
            return True

        match = re.compile("|".join(patterns)).search(newtext)
        if match:
            print("blacklist match: %s" % match.group())
        return match is None
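
  One thing to watch with joining every line on "|": a blank line in the blacklist page becomes an empty alternative, and an empty alternative matches any text, so every save would be refused.  The standalone sketch below shows the effect and one possible fix.  It does not touch Sycamore; join_naive and join_filtered are names invented for this example.

```python
import re


def join_naive(page_text):
    # Every line, blank or not, becomes an alternative in the pattern.
    return "|".join(s.strip() for s in page_text.strip().split("\n"))


def join_filtered(page_text):
    # Drop blank lines first so no empty alternative is produced.
    return "|".join(s.strip() for s in page_text.split("\n") if s.strip())


# A blacklist page with a blank line between two entries.
page = "http://writing-help\\.org/\n\nhttp://essay-writer\\.org/"
clean_text = "An ordinary edit with no links."
spam_text = "Buy now at http://essay-writer.org/"

naive_re = re.compile(join_naive(page))
filtered_re = re.compile(join_filtered(page))

# The naive pattern matches the clean text through its empty alternative.
print(naive_re.search(clean_text) is not None)
# The filtered pattern matches only text containing a listed URL.
print(filtered_re.search(clean_text) is not None)
print(filtered_re.search(spam_text) is not None)
```

  An entirely empty page has the same effect, since an empty pattern also matches every string, so it is probably worth skipping the check when no patterns remain after filtering.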

--
Sean Robinson
WiFi Radar - http://wifi-radar.berlios.de
Python WiFi - http://pythonwifi.wikispot.org

Philip Neustrom | 19 Jul 10:01

Re: URL blacklist

You'll want to call request.switch_wiki() like this:

    orig_wiki = request.config.wiki_name
    request.switch_wiki(config.wiki_name)  # global 'base' wiki
    # ... do stuff here ...
    request.switch_wiki(orig_wiki)
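
One way to make sure the original wiki is always restored, even if the work in the middle raises, is to wrap the switch in a context manager. This is only a sketch: FakeRequest and FakeConfig are stand-ins written for this example (a real Sycamore request does much more), on_wiki is a made-up helper name, and "hub" is a placeholder wiki name. The only call assumed from Sycamore is request.switch_wiki() as used above.

```python
from contextlib import contextmanager


class FakeConfig(object):
    """Hypothetical stand-in for request.config."""

    def __init__(self, wiki_name):
        self.wiki_name = wiki_name


class FakeRequest(object):
    """Hypothetical stand-in for a Sycamore request object."""

    def __init__(self, wiki_name):
        self.config = FakeConfig(wiki_name)

    def switch_wiki(self, wiki_name):
        # In this stand-in, switching just swaps in a config for that wiki.
        self.config = FakeConfig(wiki_name)


@contextmanager
def on_wiki(request, wiki_name):
    """Temporarily switch the request to another wiki, then switch back."""
    orig_wiki = request.config.wiki_name
    request.switch_wiki(wiki_name)
    try:
        yield request
    finally:
        # Runs on normal exit and when the body raises.
        request.switch_wiki(orig_wiki)


request = FakeRequest("pythonwifi")
with on_wiki(request, "hub") as r:
    inside = r.config.wiki_name  # read the blacklist page here
after = request.config.wiki_name

print(inside)
print(after)
```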

Sean Robinson | 25 Jul 00:20

Re: URL blacklist

  I have not been able to make a working wiki farm to adequately test these patches.  But, I believe they should work.  Can someone try them on a testing wiki farm and let us all know whether it works or not?

  Regexes (strings) can be added to Global Blacklist, one per line.  If any regex matches the text being saved, the save is refused for lack of permissions.
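
  Since anyone who can edit the blacklist page can add a regex, a single malformed line could make the combined pattern fail to compile.  The sketch below is one possible way to guard against that by compiling each line on its own and setting aside the ones that fail.  It is standalone and is not what the attached patches do; compile_patterns and may_save are names made up for this example.

```python
import re


def compile_patterns(page_text):
    """Return (compiled, rejected) for a page with one regexp per line."""
    compiled, rejected = [], []
    for line in page_text.splitlines():
        pattern = line.strip()
        if not pattern:
            continue  # ignore blank lines
        try:
            compiled.append(re.compile(pattern))
        except re.error:
            # Keep the bad line so an admin can be told about it.
            rejected.append(pattern)
    return compiled, rejected


def may_save(newtext, compiled):
    """True if no blacklist pattern matches the text being saved."""
    return not any(p.search(newtext) for p in compiled)


# Two valid patterns and one with an unbalanced parenthesis.
page = "essay-writer\\.org\ncustom-(paper|essay)-writing\\.com\nbroken(pattern\n"
compiled, rejected = compile_patterns(page)

print(len(compiled))
print(rejected)
print(may_save("visit http://custom-paper-writing.com/ now", compiled))
print(may_save("a normal edit", compiled))
```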

--
Sean Robinson
WiFi Radar - http://wifi-radar.berlios.de
Python WiFi - http://pythonwifi.wikispot.org

Attachment (blacklist-wikipage.patch): application/octet-stream, 1954 bytes
Attachment (blacklist-config.patch): application/octet-stream, 427 bytes
