Sergey Kishchenko | 20 Jan 2009 23:54

Twill is not working as expected

Hi,

I am trying to test Google OpenID authentication with twill, but I get an unexpected URLError because of the quotes in the refresh URL that Google inserts in the response header (Refresh: 0; url='http://www.google.com/...' , notice the single quotes). I don't know whether this is Google's bug or twill's feature :). I propose this simple patch:

--- /usr/lib/python2.5/site-packages/twill-0.9-py2.5.egg/twill/other_packages/_mechanize_dist/_request.py    2009-01-21 00:34:17.000000000 +0200
+++ /home/sergey/dev/sandbox/_request.py    2009-01-21 00:25:35.000000000 +0200
@@ -31,7 +31,10 @@ class Request(urllib2.Request):
         if not _rfc3986.is_clean_uri(url):
             warn("url argument is not a URI "
                  "(contains illegal characters) %r" % url)
-        urllib2.Request.__init__(self, url, data, headers)
+        if url[0]=="'" and url[-1]=="'":
+            url = url[1:-1]
+
+        urllib2.Request.__init__(self, url, data, headers)
         self.selector = None
         self.unredirected_hdrs = {}
         self.visit = visit


Can anyone help me with a less dirty solution?
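
One possible less ad hoc direction, as a minimal sketch only (this is not mechanize's or twill's actual code): rather than stripping quotes inside Request.__init__, the Refresh value could be parsed where the header is read, removing one pair of matching quotes from the URL part. The helper name parse_refresh and the regular expression are assumptions made for illustration; the sketch targets Python 3 and uses only the standard re module.

```python
import re

# Hypothetical pattern for "<delay>; url=<target>". Not taken from mechanize.
_REFRESH_RE = re.compile(
    r"^\s*(?P<delay>[\d.]+)\s*"
    r"(?:[;,]\s*url\s*=\s*(?P<url>.*))?$",
    re.IGNORECASE,
)


def parse_refresh(value):
    """Split a Refresh header value into (delay_seconds, url_or_None)."""
    match = _REFRESH_RE.match(value)
    if match is None:
        raise ValueError("unrecognised Refresh value: %r" % (value,))
    delay = float(match.group("delay"))
    url = match.group("url")
    if url is not None:
        url = url.strip()
        # Remove one pair of matching single or double quotes, if present.
        # The length check avoids an IndexError on empty or one-character values.
        if len(url) >= 2 and url[0] == url[-1] and url[0] in "'\"":
            url = url[1:-1]
        url = url or None
    return delay, url
```

With a value shaped like the one in the report, parse_refresh("0; url='http://example.com/'") returns (0.0, 'http://example.com/'); example.com stands in for the elided Google URL.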

Regards,
Sergey Kishchenko

_______________________________________________
twill mailing list
twill@...
http://lists.idyll.org/listinfo/twill
John J Lee | 22 Jan 2009 20:52

Twill is not working as expected (fwd)

This appears to be something that should be fixed in mechanize, so 
forwarding to wwwsearch-general so it doesn't get lost.

Thanks

John

---------- Forwarded message ----------
Date: Wed, 21 Jan 2009 00:54:42 +0200
From: Sergey Kishchenko <voidwrk@...>
To: twill@...
Subject: [twill] Twill is not working as expected

Hi,

I am trying to test Google OpenID authentication with twill, but I get an
unexpected URLError because of the quotes in the refresh URL that Google
inserts in the response header (Refresh: 0; url='http://www.google.com/...' ,
notice the single quotes). I don't know whether this is Google's bug or
twill's feature :). I propose this simple patch:

--- /usr/lib/python2.5/site-packages/twill-0.9-py2.5.egg/twill/other_packages/_mechanize_dist/_request.py    2009-01-21 00:34:17.000000000 +0200
+++ /home/sergey/dev/sandbox/_request.py    2009-01-21 00:25:35.000000000 +0200
@@ -31,7 +31,10 @@ class Request(urllib2.Request):
         if not _rfc3986.is_clean_uri(url):
             warn("url argument is not a URI "
                  "(contains illegal characters) %r" % url)
-        urllib2.Request.__init__(self, url, data, headers)
+        if url[0]=="'" and url[-1]=="'":
+            url = url[1:-1]
+
+        urllib2.Request.__init__(self, url, data, headers)
         self.selector = None
         self.unredirected_hdrs = {}
         self.visit = visit

Can anyone help me with a less dirty solution?

Regards,
Sergey Kishchenko
Hotsyk | 28 Jan 2009 20:45

Form parsing error - using Python's own HTMLParser.py and not BeautifulSoup/tidy?

Hello.

I had a similar problem with nose+twill and traced it to incorrect
parsing of the <br/> tag (without a space). I changed these tags
to <br /> (with a space) and the error went away.

I checked your page and found some <br/> tags there too. I'm not sure
this is exactly your problem, but IMHO it is worth a try.
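
If editing the page by hand is not practical, the same substitution could be applied to the HTML text before it reaches the parser. This is a minimal sketch under that assumption; space_out_br is a hypothetical helper, not a twill or nose function, and whether the substitution clears a given parse error depends on the page.

```python
import re

# Matches <br/> written without whitespace before the slash, in any letter case.
_TIGHT_BR = re.compile(r"<br/>", re.IGNORECASE)


def space_out_br(html):
    """Rewrite every <br/> as <br /> (note: upper-case <BR/> is lower-cased)."""
    return _TIGHT_BR.sub("<br />", html)
```

Tags that already contain the space are left untouched, so the function is safe to apply more than once.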

Sincerely,
Volodymyr Hotsyk
Hotsyk | 28 Jan 2009 21:01

Re: Form parsing error - using Python's own HTMLParser.py and not BeautifulSoup/tidy?

 Hello.

 I had a similar problem with nose+twill and traced it to incorrect
 parsing of the <br/> tag (without a space). I changed these tags
 to <br /> (with a space) and the error went away.

 I checked your page and found some <br/> tags there too. I'm not sure
 this is exactly your problem, but IMHO it is worth a try.

 Sincerely,
 Volodymyr Hotsyk

P.S. Resending to make clear that this is a reply to this message:
http://lists.idyll.org/pipermail/twill/2008-January/000856.html

>Hello to all on the mailing list from a new member.
>
>I am trying to use twill to automate the use of my mobile phone operator's SMS
>web portal, to allow me to send text messages from the command line of my
>laptop, using its nice, big keyboard, rather than the tiny, fiddly keypad of my
>mobile.
>
>Using twill, I can successfully log in and follow the link to the SMS-sending
>page, but then twill crashes when it attempts to parse the forms on that page.
>When it crashes, the error seems to be in Python's own HTMLParser.py script.
>That puzzles me, because I have BeautifulSoup and tidy installed, and can prove
>(I think) that they are both being used by the fact that no exceptions are
>raised when commands are issued after requiring them in the config. If these
>superior HTML-parsing modules are being used, why is Python's HTML parser being
>called at all?
>
>twill has successfully parsed all other HTML pages (with forms) that I have
>thrown at it. There seems to be something particularly nasty about the HTML on
>this particular page (perhaps inserted deliberately by the mobile provider to
>prevent just this sort of automation). If twill simply can't handle it, then
>I'm happy to accept that. My concern is that there might be something wrong with
>my (pretty new) Python or twill installation, which is causing an avoidable
>exception to occur.
>
>Could anyone please suggest what is going wrong?
>
>(As the SMS-sending page is only accessible after logging in, for the purposes
>of illustration I have copied the HTML of that page and have saved it to a file
>on my own server. This copy still causes twill to crash in the same manner as
>when using the live version.)
>
>--- Start of text dump ---
>
>  -= Welcome to twill! =-
>
>current page:  *empty page*
>>> config require_tidy 1
>current page:  *empty page*
>>> config require_BeautifulSoup 1
>current page:  *empty page*
>>> config
>current configuration:
>        acknowledge_equiv_refresh : True
>        allow_parse_errors : True
>        readonly_controls_writeable : False
>        require_BeautifulSoup : True
>        require_tidy : True
>        use_BeautifulSoup : True
>        use_tidy : True
>        with_default_realm : False
>
>current page:  *empty page*
>>> go http://www.saytheword.org.uk/send-text-preparing.htm
>==> at http://www.saytheword.org.uk/send-text-preparing.htm
>current page: http://www.saytheword.org.uk/send-text-preparing.htm
>>> showlinks
>8>< - - - SNIP! I've cut this bit out to save space, but no exceptions
>are raised. - - - ><8
>>> showforms
>Traceback (most recent call last):
>  File "/usr/bin/twill-sh", line 8, in <module>
>    load_entry_point('twill==0.9', 'console_scripts', 'twill-sh')()
>  File "/usr/lib/python2.5/site-packages/twill-0.9-py2.5.egg/twill/shell.py", line 383, in main
>    shell.cmdloop(welcome_msg)
>  File "/usr/lib/python2.5/cmd.py", line 142, in cmdloop
>    stop = self.onecmd(line)
>  File "/usr/lib/python2.5/cmd.py", line 219, in onecmd
>    return func(arg)
>  File "/usr/lib/python2.5/site-packages/twill-0.9-py2.5.egg/twill/shell.py", line 42, in do_cmd
>    print '\nERROR: %s\n' % (str(e),)
>  File "/usr/lib/python2.5/HTMLParser.py", line 59, in __str__
>    result = self.msg
>AttributeError: 'ParseError' object has no attribute 'msg'
>
>--- End of text dump ---
>
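
Reading the traceback above: the failure is raised while the shell formats the error with str(e), because the exception's __str__ reads self.msg and this ParseError object has no such attribute, so the underlying parse error is hidden. The following is a minimal sketch of that failure mode and one defensive way to print such an exception. LegacyParseError and describe are hypothetical names for illustration and are not part of twill or HTMLParser.py; the sketch targets Python 3.

```python
class LegacyParseError(Exception):
    """Stand-in for an exception whose __str__ assumes a 'msg' attribute."""

    def __str__(self):
        # Raises AttributeError when 'msg' was never set on the instance.
        return self.msg


def describe(exc):
    """Return str(exc), falling back to the class name if str() itself fails."""
    try:
        return str(exc)
    except Exception:
        return "%s (unprintable: str() failed)" % type(exc).__name__


broken = LegacyParseError()
print(describe(broken))   # fallback text instead of a second traceback

working = LegacyParseError()
working.msg = "bad tag"
print(describe(working))  # the message itself
```

A fallback like this only keeps the error report from crashing; it does not fix whatever made the page unparseable in the first place.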
