4 Nov 2010 00:19
Should PEP 3333 be Python 3-only? What about transcoding?
As I've been tidying up wsgiref in the stdlib for PEP 3333, I've been noticing that there's a bit of an issue with the PEP as far as CGI variables are concerned. Currently, the CGI example is the same as it is in PEP 333, which means that it's correct code for Python 2.x, but wrong for 3.x due to the environment transcoding issue. (See http://bugs.python.org/issue10155 for details.)

There are other code sample differences, too. In effect, PEP 3333 is still using Python 2 code samples, because it's trying to cover every version of Python from 2.1 through 3.2. Should we ditch that, and say, "hey, if you want Python 2.x code samples, go see PEP 333"? That will simplify a couple of things, but still won't address the transcoding issue.

Specifically, the problem is that on Python 3, os.environ contains *unicode*, not bytes masquerading as unicode. Unfortunately, this means that it very possibly contains garbage for CGI variables, as the web server puts bytes in the environment, then Python converts those bytes to unicode using the system encoding + surrogateescape. To get back to bytes, then, we have to encode using the same combination, then decode those bytes as latin-1 to get back to a WSGI-compatible string.

The hitch is this: not everything in os.environ comes from an HTTP request, and therefore may not be decodable in such a fashion. For...
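A minimal sketch of the round trip described above, under stated assumptions: the helper name to_wsgi_string and the sample path are hypothetical, and the example passes the encoding explicitly ("utf-8") to simulate what Python 3 does when it builds os.environ. In Python terms, the str is encoded back to the server's original bytes with the system encoding + surrogateescape, and those bytes are then decoded as latin-1.

```python
import sys


def to_wsgi_string(value, fs_encoding=None):
    """Turn an os.environ-style str into a latin-1 "bytes-as-str" value.

    Hypothetical helper for illustration; fs_encoding defaults to the
    encoding Python reports for the filesystem/environment.
    """
    encoding = fs_encoding or sys.getfilesystemencoding()
    # Step 1: recover the exact bytes the web server put in the environment.
    raw = value.encode(encoding, "surrogateescape")
    # Step 2: present those bytes as a str whose code points are all <= 255.
    return raw.decode("iso-8859-1")


# Simulate the server placing non-UTF-8 bytes in a CGI variable ...
server_bytes = b"/caf\xe9"
# ... and Python 3 decoding them while populating os.environ.
seen_by_python = server_bytes.decode("utf-8", "surrogateescape")

wsgi_value = to_wsgi_string(seen_by_python, "utf-8")
```

Here wsgi_value.encode("iso-8859-1") gives back the server's original bytes. The sketch only works when the value really was produced by that decoding step, so it does nothing to resolve the hitch that some environment entries do not come from the HTTP request at all.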
_______________________________________________
Web-SIG mailing list