subscriptions | 2 Mar 2007 13:30

(no subject)

Hi folks,

I'm new to twill and, first of all, happy to have found it. I have had a project like this in mind for a long
time and I'm glad to find it already realized.

I'm currently trying out its capabilities in a restricted corporate environment with lots of proxies and
authentication.

I have already found out how to deal with the authenticating proxies, and I'm now working on the login to a
restricted web site.

I can't log in to the site with twill, although I can access it via urllib2 with the following script:

import urllib2
passmgr = urllib2.HTTPBasicAuthHandler()
passmgr.add_password("LDAP Intranet-Login:", 'webportal', 'myname', 'mypw')

opener = urllib2.build_opener(passmgr)
urllib2.install_opener(opener)

f = urllib2.urlopen('http://webportal/twiki/bin/view/Main/WebHome')
buf = f.read()
print buf
f.close()

If I use the same credentials with an "add_auth" command in twill, it doesn't work (I get a 403 Forbidden).

What I was wondering about is this: twill uses HTTPPasswordMgr where my script uses HTTPBasicAuthHandler,
and my script installs the handler.
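
For reference, the relationship between the two classes can be sketched with the Python 3 standard library, where urllib2 lives on as urllib.request. That substitution is an assumption of this sketch; the thread itself concerns Python 2.4 and mechanize's forked copies, which may behave differently. In the standard library a basic-auth handler holds a password manager (CPython stores it on the handler's passwd attribute) and forwards add_password() to it, so both spellings end up storing the same data. The lookup is keyed on the realm, so the realm string has to match what the server sends.

```python
import urllib.request

REALM = "LDAP Intranet-Login:"
URL = "http://webportal/twiki/bin/view/Main/WebHome"

# Explicit password manager, handed to the handler.
mgr = urllib.request.HTTPPasswordMgr()
mgr.add_password(REALM, "webportal", "myname", "mypw")
explicit = urllib.request.HTTPBasicAuthHandler(mgr)

# Handler created on its own, as in the script above: it builds a
# password manager internally and forwards add_password() to it.
implicit = urllib.request.HTTPBasicAuthHandler()
implicit.add_password(REALM, "webportal", "myname", "mypw")

# Both managers resolve the same credentials for the page URL.
explicit_creds = explicit.passwd.find_user_password(REALM, URL)
implicit_creds = implicit.passwd.find_user_password(REALM, URL)

# A realm that was never registered yields no credentials.
wrong_realm = mgr.find_user_password("Some other realm", URL)
```

Nothing here touches the network; it only shows that the handler and the manager are two views of the same credential store.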

subscriptions | 2 Mar 2007 13:44

Question about HTTP basic authentication handling

(Sorry, forgot the heading)


subscriptions | 2 Mar 2007 14:55

Bug? in HTTP Basic Authentication Handling

Hi folks,

I dug into the twill and mechanize code (and figured out the basic flow of data) concerning HTTP basic
authentication.

What I found seems curious: the password handlers are coded in _auth.py within mechanize. If I put debug
code in the "add_password" function of HTTPPasswordMgr, that code is executed when I issue an "add_auth"
command.

The corresponding find_user_password function is never called (at least in my case). I'm currently
trying to figure out why...

Regards,
Andrew
John J Lee | 2 Mar 2007 14:57

Re: Question about HTTP basic authentication handling

On Fri, 2 Mar 2007, subscriptions@... wrote:
[...]
> I'm currently not able to log in with twill into a site, while I can 
> access the site via urllib2 with the following script:
>
> import urllib2
> passmgr = urllib2.HTTPBasicAuthHandler()
> passmgr.add_password("LDAP Intranet-Login:", 'webportal', 'myname', 'mypw')
[...]
> If I use the same credentials within a "add_auth" command in twill it 
> won't work (I get a 403 access forbidden).
>
> What I was wondering about is: twill uses HTTPPasswordMgr where my 
> script uses HTTPBasicAuthHandler, and my script installs the handler.
>
> I'm not deep into the inner workings of urllib2 and mechanize - any 
> pointer would be helpful.

mechanize._auth.HTTPBasicAuthHandler has a 
mechanize._auth.HTTPPasswordMgr (or a subtype).  Those classes are 
forked copies of the urllib2 classes of the same names.

I don't know the twill code.  If Titus doesn't pop up with the solution, 
you can try to locate the problem by writing the corresponding mechanize 
code.  Something like (UNTESTED):

import mechanize

br = mechanize.Browser()
# using 3 argument form here -- optional 4th arg is realm ("LDAP Intranet-Login:")
br.add_password('webportal', 'myname', 'mypw')
br.open('http://webportal/twiki/bin/view/Main/WebHome')

John J Lee | 2 Mar 2007 15:11

Re: Bug? in HTTP Basic Authentication Handling

On Fri, 2 Mar 2007, subscriptions@... wrote:
[...]
> What I found seems to be curious: the password handlers are coded in 
> _auth.py within mechanize. If I put debug code in the "add_password" 
> function of HTTPPasswordMgr then the corresponding code is executed if 
> I issue an "add_auth" command.

They are forked because the urllib2 code was buggy in 2.4.  I submitted 
the fixes to SF, and they're there in Python 2.5.  Also, ISTR I added a 
new proxy password manager to make the front-end proxy auth interface 
friendlier (probably I should have written a new one from scratch rather 
than deriving it from the urllib2 code...).

> The corresponding find_user_password function is never called (at least 
> in my case). I'm currently trying to figure out why...

That suggests you're not getting the expected 401 response from the 
server.
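
A small local experiment can illustrate why a missing 401 means the password lookup never runs. This is a sketch using the Python 3 standard library (urllib.request in place of urllib2 and mechanize, which is an assumption and not the code under discussion). It starts a throwaway server on 127.0.0.1; the names RecordingPasswordMgr and fetch and the /forbidden path are hypothetical. The server answers /forbidden with a plain 403 and everything else with a 401 challenge until correct credentials arrive.

```python
import base64
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

REALM = "LDAP Intranet-Login:"
EXPECTED = "Basic " + base64.b64encode(b"myname:mypw").decode("ascii")


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path.startswith("/forbidden"):
            # Refuse outright: no challenge, so no chance to authenticate.
            self.send_response(403)
            self.send_header("Content-Length", "0")
            self.end_headers()
        elif self.headers.get("Authorization") == EXPECTED:
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            # Challenge the client; this is what triggers the auth handler.
            self.send_response(401)
            self.send_header("WWW-Authenticate", 'Basic realm="%s"' % REALM)
            self.send_header("Content-Length", "0")
            self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


class RecordingPasswordMgr(urllib.request.HTTPPasswordMgr):
    """Password manager that records every credential lookup."""

    def __init__(self):
        super().__init__()
        self.lookups = []

    def find_user_password(self, realm, authuri):
        self.lookups.append((realm, authuri))
        return super().find_user_password(realm, authuri)


def fetch(path):
    """Return (HTTP status, number of find_user_password calls)."""
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = "http://127.0.0.1:%d" % server.server_port
    mgr = RecordingPasswordMgr()
    mgr.add_password(REALM, base, "myname", "mypw")
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),  # ignore any proxy settings
        urllib.request.HTTPBasicAuthHandler(mgr),
    )
    try:
        with opener.open(base + path, timeout=5) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        status = err.code
    finally:
        server.shutdown()
        server.server_close()
    return status, len(mgr.lookups)
```

Calling fetch("/protected") goes through one challenge and one lookup, while fetch("/forbidden") ends with a 403 and no lookup at all, which matches the symptom described in this thread.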

Turn on logging of HTTP request and response headers -- either in twill, 
or using a simple mechanize script like the one I posted.  To turn on the 
logging you need in mechanize:

import mechanize
hh = mechanize.HTTPHandler()
hsh = mechanize.HTTPSHandler()
hh.set_http_debuglevel(1)
hsh.set_http_debuglevel(1)
opener = mechanize.build_opener(hh, hsh)

response = opener.open(url)
# etc.


John

subscriptions | 2 Mar 2007 15:56

Re: Bug? in HTTP Basic Authentication Handling

Hi John,

thanks for answering. I dug a lot deeper into the code now... and I installed the latest mechanize code separately and tried it out. The latest mechanize version works correctly; I can access the site easily with a simple script.

Then I did something ugly ;-) - I copied the latest mechanize code over the mechanize code bundled with twill to see if the bug is purely within mechanize. It still doesn't work with twill, so I suspect that the 2.4-specific tweaking (I use 2.4) on twill's side is somehow the cause.

I tried to figure out whether the BasicAuth handler is called and arrived at the following code:

_opener.py (Lines 176 - 180):

        # In Python >= 2.4, .open() supports processors already, so we must
        # call ._open() instead.
        urlopen = getattr(urllib2.OpenerDirector, "_open",
                          urllib2.OpenerDirector.open)
        response = urlopen(self, req, data)

Here I'm lost... I have no idea why you copy most of the urllib2 code and then reuse some portions of it... But I don't have the Python source of urllib2, so I can't dig deeper.

What I also found interesting is that twill installs its own HTTPHandler (for whatever reason; I don't know what WSGI is...). Maybe this is the cause (_browser.py)?

def build_http_handler():
    from mechanize._urllib2 import HTTPHandler

    class MyHTTPHandler(HTTPHandler):
        def http_open(self, req):
            return self.do_open(wsgi_intercept.WSGI_HTTPConnection, req)

    return MyHTTPHandler
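
The shape of that override can be reproduced with the Python 3 standard library to see what it changes. This is a sketch under assumptions: urllib.request stands in for mechanize._urllib2, and CountingConnection, CountingHTTPHandler, HelloHandler and fetch_with_custom_connection are hypothetical names. The handler swaps only the connection class passed to do_open(); the rest of the opener chain is left as it was.

```python
import http.client
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class CountingConnection(http.client.HTTPConnection):
    """Stand-in for a replacement connection class; counts instances."""

    created = 0

    def __init__(self, *args, **kwargs):
        CountingConnection.created += 1
        super().__init__(*args, **kwargs)


class CountingHTTPHandler(urllib.request.HTTPHandler):
    def http_open(self, req):
        # Same shape as MyHTTPHandler: only the connection class changes.
        return self.do_open(CountingConnection, req)


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_with_custom_connection():
    """Fetch one page from a throwaway local server via the custom handler."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/" % server.server_port
    # Passing an HTTPHandler subclass instance replaces the default one.
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({}),  # ignore any proxy settings
        CountingHTTPHandler(),
    )
    try:
        with opener.open(url, timeout=5) as resp:
            return resp.read()
    finally:
        server.shutdown()
        server.server_close()
```

In this stand-in, one request creates exactly one CountingConnection, so a debug counter of this kind is one way to check whether a custom handler is being reached at all.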

This will have to wait until next week since I'm leaving the corporate environment now.

Regards,

Andrew

   

>On Fri, 2 Mar 2007, subscriptions-sES7kRODLsuUo3SRjZBpwA@public.gmane.org wrote:
>[...]
>> What I found seems to be curious: the password handlers are coded in
>> _auth.py within mechanize. If I put debug code in the "add_password"
>> function of HTTPPasswordMgr then the corresponding code is executed if
>> I issue an "add_auth" command.
>
>They are forked because the urllib2 code was buggy in 2.4.  I submitted
>the fixes to SF, and they're there in Python 2.5.  Also, ISTR I added a
>new proxy password manager to make the front-end proxy auth interface
>friendlier (probably I should have written a new one from scratch rather
>than deriving it from the urllib2 code...).
>
>> The corresponding find_user_password function is never called (at least
>> in my case). I'm currently trying to figure out why...
>
>That suggests you're not getting the expected 401 response from the
>server.
>
>Turn on logging of HTTP request and response headers -- either in twill,
>or using a simple mechanize script like the one I posted.  To turn on the
>logging you need in mechanize:
>
>import mechanize
>hh = mechanize.HTTPHandler()
>hsh = mechanize.HTTPSHandler()
>hh.set_http_debuglevel(1)
>hsh.set_http_debuglevel(1)
>opener = mechanize.build_opener(hh, hsh)
>
>response = opener.open(url)
># etc.
>
>
>John


_______________________________________________
twill mailing list
twill@...
http://lists.idyll.org/listinfo/twill
John J Lee | 2 Mar 2007 16:30

Re: Bug? in HTTP Basic Authentication Handling

If you don't mind, please don't top-post, it's not really appropriate on 
open-source mailing lists.

On Fri, 2 Mar 2007, subscriptions@... wrote:
[...]
> I tried to figure out if the BasicAuth handler is called and came to the following code:
> 
> _opener.py (Lines 176 - 180):
> 
>         # In Python >= 2.4, .open() supports processors already, so we must
>         # call ._open() instead.
>         urlopen = getattr(urllib2.OpenerDirector, "_open",
>                           urllib2.OpenerDirector.open)
>         response = urlopen(self, req, data)
> 
> Here I'm lost... no idea why you copy most of the code of urllib2 and 
> then reuse some portions of it...

Because mechanize exports a superset of the urllib2 interface.  The number 
of extensions and fixes has grown over time.  I may just fork the whole 
thing later, but now isn't the time.
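
The comment in the quoted snippet gives the reason for the getattr call: where the base class already has an inner _open(), that method is used, and otherwise the code falls back to open(). A minimal, self-contained sketch of that idiom follows, written for Python 3. The class and function names (OldStyleDirector, NewStyleDirector, raw_open_for, OnOld, OnNew) are hypothetical stand-ins and not part of urllib2 or mechanize.

```python
class OldStyleDirector:
    """Mimics a base class that only has open()."""

    def open(self, req):
        return "open(%s)" % req


class NewStyleDirector:
    """Mimics a base class whose open() wraps an inner _open()."""

    def open(self, req):
        return "processed(%s)" % self._open(req)

    def _open(self, req):
        return "_open(%s)" % req


def raw_open_for(base_cls):
    # Prefer the inner _open() when the base class has one; otherwise
    # fall back to open(). Same getattr-with-default shape as the snippet.
    return getattr(base_cls, "_open", base_cls.open)


class OnOld(OldStyleDirector):
    def fetch(self, req):
        # The looked-up function is called with self passed explicitly.
        return raw_open_for(OldStyleDirector)(self, req)


class OnNew(NewStyleDirector):
    def fetch(self, req):
        return raw_open_for(NewStyleDirector)(self, req)
```

With these stand-ins, OnOld().fetch("r") goes through open(), while OnNew().fetch("r") skips the wrapping open() and calls _open() directly, so the processing step is not applied twice.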

> But i don't have the python source of urllib2 and I can't dig deeper.

You do have the source of urllib2 -- look in your Python installation 
e.g. C:\Python24\Lib\urllib2.py, or /usr/{local/,}lib/python2.4/urllib2.py.

John
Titus Brown | 2 Mar 2007 18:10

Re: Bug? in HTTP Basic Authentication Handling

>   Hi John,
>
>   thanks for answering. I digged a lot deeper now into the code... and I
>   installed the latest mechanize code separately and tried it out. The
>   latest mechanize version works correct, I can access the site easily
>   with a simple script.
>
>   I did then something ugly ;-) - I copied the latest mechanize code
>   over the mechanize code provided by twill to see if the bug is purely
>   within mechanize. It still doesn't work with Twill, so I suggest that
>   somehow the 2.4er tweaking (I use 2.4) from the side of twill is the
>   cause.  

[ ... ]

>   What I found also interesting is that twill installs a own HTTPHandler
>   (for whatever reason, I don't know what WSGI is...). Maybe this is the
>   cause (_browser.py)?

Actually, I suspect that *this* code is the problem:

twill/_browser.py,

    def __init__(self, *args, **kwargs):
        ...

        # fix basic auth.
        self.handler_classes['_basicauth'] = FixedHTTPBasicAuthHandler

You did exactly the right thing in trying out the new mechanize code
with twill; that helps me nail down the problem.  I'll see what I can do
about it this weekend.

cheers,
--titus

p.s. Is there any way you could send plain text instead of HTML?  My
mail reader doesn't support HTML very nicely, which is generally a good
thing but is causing problems with your e-mail ;)
Andrew Smart | 2 Mar 2007 20:18

Re: Bug? in HTTP Basic Authentication Handling

> 
> Actually, I suspect that *this* code is the problem:
> 
> twill/_browser.py,
> 
>     def __init__(self, *args, **kwargs):
>     	...
> 
>         # fix basic auth.
>         self.handler_classes['_basicauth'] = FixedHTTPBasicAuthHandler

I found these lines too, and I tried commenting out this "class overwrite".
It didn't help either; the behaviour is the same as before. Since
find_user_password isn't called but add_password *is* called, I guess the
hiccup lies somewhere in the processing of the open handlers.

To shorten your debugging: I added debug code to the specialized member
functions of all the ...handlers within twill and mechanize. None of the
functions defined there is called while processing, so I guess the handlers
aren't called at all. Maybe this helps you.

> You did exactly the right thing in trying out the new 
> mechanize code with twill; that helps me nail down the 
> problem.  I'll see what I can do about it this weekend.

Thanks. I'll do anything to sort out this problem. Happy hunting. I saw some
messages regarding HTTP basic auth and twill; maybe my problem is also the
root of some of the problems other people experienced.

You can send me any debugging code next week; I'm willing to try it out in
my corporate environment.

Sorry for my non-open-source mail style today. I'm using my personal mail
account through web-based mail software (I can't access my POP3 directly from
within the corporate environment), and the web-based UI sucks.

Cheers,
Andrew
Matt Singer | 5 Mar 2007 01:08

Global Forms

I usually ask questions that end up being my own stupidity, but...
 
The scenario is that I go to this page on Amazon and the first part of showforms() gives this....
 

Form #0
## ## __Name__________________ __Type___ __ID________ __Value__________________
1     GIO_SERIALIZED_HIDDE ... hidden    (None)       BnpOfQFcmPC/8C+zojtZ7n86znyWHLswEkPj ...
2     category                 hidden    (None)
3     newCategory              hidden    (None)
4     itemType                 hidden    (None)
5     todo                     hidden    (None)
6     productType              hidden    (None)
7     isPopUp                  hidden    (None)
8     origCategory             hidden    (None)
9     origItemType             hidden    (None)
10    startEditInPopUp         hidden    (None)
 
Now when I go to this page in a browser, this form contains the info for the next POST (viewed with ieinspector).

If I try to do

formvalue(0, "newCategory", "thecategory")

I get an exception in mechanize saying that it can't find the form.  Tracing it, this appears to be because this is the "global_form", and clicked() in browser.py does not look at it.  Is there a way to do this, or am I doing something incorrectly?
 
Thanks
 
 
 
 