Manlio Perillo | 1 Oct 17:47 2007

hop-by-hop headers handling

Hi, I have another question about error handling.

The WSGI spec only says that applications *must* not generate hop-by-hop
headers, but says nothing about how a WSGI server should handle them.

In the previous version of nginx mod_wsgi I just ignored these headers,
but in the latest revisions, I raise an exception.

Thanks   Manlio Perillo
_______________________________________________
Web-SIG mailing list
Web-SIG@...
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/gcpw-web-sig%40m.gmane.org

Manlio Perillo | 2 Oct 21:30 2007

Multiple message-header fields handling

The HTTP 1.1 protocol (section 4.2) says that:
"""Multiple message-header fields with the same field-name MAY be 
present in a message if and only if the entire field-value for that 
header field is defined as a comma-separated list [i.e., #(values)]."""

This can happen, as an example, with the Cookie header.

My question is: how should this be handled in WSGI?

As an example, Nginx stores all the headers in an associative array, 
where, of course, only the "last seen" header appears.

However, common multiple message-headers are stored in the request struct.

Since the WSGI environment is a dictionary with keys and values of type 
str, should an implementation:
"""combine the multiple header fields into one "field-name: field-value" 
pair, without changing the semantics of the message, by appending each 
subsequent field-value to the first, each separated by a comma."""
?

Nginx does not do this (and I don't know what Apache does).

Another question: when a header has an empty field value, what should 
be set in the environment: an empty string or None?

Thanks  Manlio Perillo

Phillip J. Eby | 2 Oct 21:44 2007

Re: Multiple message-header fields handling

At 09:30 PM 10/2/2007 +0200, Manlio Perillo wrote:
>The HTTP 1.1 protocol (section 4.2) says that:
>"""Multiple message-header fields with the same field-name MAY be
>present in a message if and only if the entire field-value for that
>header field is defined as a comma-separated list [i.e., #(values)]."""
>
>This can happen, as an example, with the Cookie header.
>
>My question is: how should this be handled in WSGI?
>
>As an example, Nginx stores all the headers in an associative array,
>where, of course, only the "last seen" header appears.
>
>However, common multiple message-headers are stored in the request struct.
>
>Since the WSGI environment is a dictionary with keys and values of type
>str, should an implementation:
>"""combine the multiple header fields into one "field-name: field-value"
>pair, without changing the semantics of the message, by appending each
>subsequent field-value to the first, each separated by a comma."""
>?

If that's the only way to make the headers work, then the server may do so.

>Another question: when a header has an empty field value, what should
>be set in the environment: an empty string or None?

If a value exists in the environ, it *must* be a string -- never 
None.  And if the header exists, then a value should be in the 
environ.  Therefore, it should be an empty string.
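
A minimal sketch of that rule, assuming a gateway that receives each request header as a (name, value) pair and that a missing value may arrive as None. The helper name set_http_header is hypothetical, and the special cases for Content-Type and Content-Length (which WSGI stores without the HTTP_ prefix) are left out.

```python
def set_http_header(environ, name, value):
    """Store one request header in a WSGI environ dict (hypothetical helper)."""
    # WSGI names request headers HTTP_<NAME>, upper-cased, '-' replaced by '_'.
    key = "HTTP_" + name.upper().replace("-", "_")
    # A header that is present but empty is stored as '', never as None.
    environ[key] = "" if value is None else value
    return key


environ = {}
set_http_header(environ, "X-Empty", None)
set_http_header(environ, "Accept", "text/html")
```

After these calls every value in environ is a str, including the one for the empty header.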

Phillip J. Eby | 2 Oct 21:45 2007

Re: hop-by-hop headers handling

At 05:47 PM 10/1/2007 +0200, Manlio Perillo wrote:
>Hi, I have another question about error handling.
>
>The WSGI spec only says that applications *must* not generate hop-by-hop
>headers, but says nothing about how a WSGI server should handle them.
>
>In the previous version of nginx mod_wsgi I just ignored these headers,
>but in the latest revisions, I raise an exception.

Raising an exception is indeed preferable.
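
As a rough sketch of the stricter behaviour, assuming the check runs on the header list that the application passes to start_response: the function name check_response_headers and the choice of ValueError are illustrative only and are not taken from nginx mod_wsgi. wsgiref.util.is_hop_by_hop is the standard-library helper that recognises the HTTP/1.1 hop-by-hop header names.

```python
from wsgiref.util import is_hop_by_hop


def check_response_headers(headers):
    """Reject hop-by-hop headers in an application's response header list."""
    for name, _value in headers:
        if is_hop_by_hop(name):
            raise ValueError("hop-by-hop header not allowed: %s" % name)
    return headers


# An ordinary entity header passes through unchanged.
check_response_headers([("Content-Type", "text/plain")])

# A hop-by-hop header such as Connection is refused.
try:
    check_response_headers([("Connection", "close")])
    rejected = False
except ValueError:
    rejected = True
```

Failing loudly here surfaces the application bug during development, whereas silently dropping the header hides it.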

Alex Botero-Lowry | 2 Oct 21:50 2007

Re: Multiple message-header fields handling

On Tue, Oct 02, 2007 at 09:30:46PM +0200, Manlio Perillo wrote:
> The HTTP 1.1 protocol (section 4.2) says that:
> """Multiple message-header fields with the same field-name MAY be 
> present in a message if and only if the entire field-value for that 
> header field is defined as a comma-separated list [i.e., #(values)]."""
> 
> This can happen, as an example, with the Cookie header.
> 
> My question is: how should this be handled in WSGI?
> 
> As an example, Nginx stores all the headers in an associative array, 
> where, of course, only the "last seen" header appears.
> 
> However, common multiple message-headers are stored in the request struct.
> 
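Initially I used such a solution (cookies were a special property of the response
object), but I ended up just throwing together a custom dict that looks like:

class ResponseHeaders(dict):
    """Dict that collects repeated assignments to one header in a list."""

    def __setitem__(self, item, val):
        if item in self:
            iv = self[item]
            if isinstance(iv, list):
                # Third and later values: extend the existing list.
                iv.append(val)
            else:
                # Second value: promote the single value to a list.
                iv = [iv, val]
            dict.__setitem__(self, item, iv)
        else:
            # First value: store it as-is.
            dict.__setitem__(self, item, val)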
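
For illustration, here is how such a dict behaves, together with one way to flatten it into the list of (name, value) tuples that start_response expects. The to_wsgi_list helper is hypothetical and not part of the original message. Note that dict's constructor and update() bypass an overridden __setitem__, so only item assignment accumulates values.

```python
class ResponseHeaders(dict):
    """Same class as above, repeated so this example is self-contained."""

    def __setitem__(self, item, val):
        if item in self:
            iv = self[item]
            if isinstance(iv, list):
                iv.append(val)
            else:
                iv = [iv, val]
            dict.__setitem__(self, item, iv)
        else:
            dict.__setitem__(self, item, val)


def to_wsgi_list(headers):
    """Flatten to the list of (name, value) tuples WSGI expects (hypothetical)."""
    result = []
    for name, value in headers.items():
        values = value if isinstance(value, list) else [value]
        for single in values:
            result.append((name, single))
    return result


headers = ResponseHeaders()
headers["Content-Type"] = "text/plain"
headers["Set-Cookie"] = "a=1"
headers["Set-Cookie"] = "b=2"
headers["Set-Cookie"] = "c=3"
flat = to_wsgi_list(headers)
```

Each Set-Cookie value ends up as its own tuple in flat, so the server can emit one header line per cookie instead of folding them.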

Robert Brewer | 2 Oct 21:47 2007

Re: Multiple message-header fields handling

Manlio Perillo wrote:
> The HTTP 1.1 protocol (section 4.2) says that:
> """Multiple message-header fields with the same field-name MAY be
> present in a message if and only if the entire field-value for that
> header field is defined as a comma-separated list [i.e., #(values)]."""
>
> This can happen, as an example, with the Cookie header.
>
> My question is: how should this be handled in WSGI?
>
> As an example, Nginx stores all the headers in an associative array,
> where, of course, only the "last seen" header appears.
>
> However, common multiple message-headers are stored in the request struct.
>
> Since the WSGI environment is a dictionary with keys and values of type
> str, should an implementation:
> """combine the multiple header fields into one "field-name: field-value"
> pair, without changing the semantics of the message, by appending each
> subsequent field-value to the first, each separated by a comma."""
> ?

Yes, it should. As you note, it's part of the HTTP spec that such headers
can be combined without changing the semantics. Here's a list of the
headers that need to be folded:

comma_separated_headers = ['ACCEPT', 'ACCEPT-CHARSET', 'ACCEPT-ENCODING',
    'ACCEPT-LANGUAGE', 'ACCEPT-RANGES', 'ALLOW', 'CACHE-CONTROL',
    'CONNECTION', 'CONTENT-ENCODING', 'CONTENT-LANGUAGE', 'EXPECT',
    'IF-MATCH', 'IF-NONE-MATCH', 'PRAGMA', 'PROXY-AUTHENTICATE', 'TE',
    'TRAILER', 'TRANSFER-ENCODING', 'UPGRADE', 'VARY', 'VIA', 'WARNING',
    'WWW-AUTHENTICATE']

The only tricky one is Cookie, because e.g. Konqueror sends them on
multiple lines, but they're not foldable.

See http://kristol.org/cookie/errata.html
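
A sketch of how a gateway might apply that list while building the environ, assuming the server exposes the raw request headers as a list of (name, value) pairs. The function name build_http_environ is hypothetical. Two choices below are assumptions of this sketch, not something the list above settles: repeated Cookie headers are joined with "; " instead of a comma, and any other repeated header simply keeps the last value seen.

```python
COMMA_SEPARATED_HEADERS = frozenset([
    'ACCEPT', 'ACCEPT-CHARSET', 'ACCEPT-ENCODING', 'ACCEPT-LANGUAGE',
    'ACCEPT-RANGES', 'ALLOW', 'CACHE-CONTROL', 'CONNECTION',
    'CONTENT-ENCODING', 'CONTENT-LANGUAGE', 'EXPECT', 'IF-MATCH',
    'IF-NONE-MATCH', 'PRAGMA', 'PROXY-AUTHENTICATE', 'TE', 'TRAILER',
    'TRANSFER-ENCODING', 'UPGRADE', 'VARY', 'VIA', 'WARNING',
    'WWW-AUTHENTICATE',
])


def build_http_environ(raw_headers):
    """Fold repeated request headers into WSGI HTTP_* variables."""
    environ = {}
    for name, value in raw_headers:
        upper = name.upper()
        key = "HTTP_" + upper.replace("-", "_")
        if key not in environ:
            environ[key] = value
        elif upper in COMMA_SEPARATED_HEADERS:
            # List-valued header: append with a comma, as HTTP allows.
            environ[key] = environ[key] + ", " + value
        elif upper == "COOKIE":
            # Assumption: cookies are not comma-foldable, so join with '; '.
            environ[key] = environ[key] + "; " + value
        else:
            # Assumption: for anything else, keep the last value seen.
            environ[key] = value
    return environ


environ = build_http_environ([
    ("Accept", "text/html"),
    ("Cookie", "a=1"),
    ("Accept", "text/plain"),
    ("Cookie", "b=2"),
])
```

With this input, the two Accept lines become one comma-separated value and the two Cookie lines become one semicolon-separated value.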

> Nginx does not do this (and I don't know what Apache does).
>
>
> Another question: when a header has an empty field value, what should
> be set in the environment: an empty string or None?

An empty string, or omit them entirely:

"""The following variables must be present, unless their value would
be an empty string, in which case they may be omitted, except as
otherwise noted below...

HTTP_ Variables
""".


Robert Brewer
fumanchu-Q+9y+cpEbCIdnm+yROfE0A@public.gmane.org

Manlio Perillo | 2 Oct 22:03 2007

Re: Multiple message-header fields handling

Manlio Perillo wrote:
> [...]
> As an example, Nginx stores all the headers in an associative array, 
> where, of course, only the "last seen" header appears.
> 

A correction: Nginx stores "raw" headers in a list of key/value pairs, 
and not in an associative array.

This means that when I iterate over the headers, I see all the multiple 
message-headers, but I only store the last header in the WSGI environment.
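
The effect described here can be reproduced in a few lines of Python. The header values are made up for the example, and the loop only mimics the behaviour described, not the actual module code.

```python
raw_headers = [
    ("Accept-Language", "it"),
    ("Accept-Language", "en;q=0.5"),
]

environ = {}
for name, value in raw_headers:
    key = "HTTP_" + name.upper().replace("-", "_")
    # Plain assignment overwrites any earlier value for the same key.
    environ[key] = value
```

After the loop, environ holds only the second Accept-Language value; the first one is lost.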

 > [...]

Regards  Manlio Perillo

Manlio Perillo | 2 Oct 22:11 2007

Re: Multiple message-header fields handling

Phillip J. Eby wrote:
> At 09:30 PM 10/2/2007 +0200, Manlio Perillo wrote:
>> The HTTP 1.1 protocol (section 4.2) says that:
>> """Multiple message-header fields with the same field-name MAY be
>> present in a message if and only if the entire field-value for that
>> header field is defined as a comma-separated list [i.e., #(values)]."""
>>
>> This can happen, as an example, with the Cookie header.
>>
>> My question is: how should this be handled in WSGI?
>>
>> As an example, Nginx stores all the headers in an associative array,
>> where, of course, only the "last seen" header appears.
>>
>> However, common multiple message-headers are stored in the request struct.
>>
>> Since the WSGI environment is a dictionary with keys and values of type
>> str, should an implementation:
>> """combine the multiple header fields into one "field-name: field-value"
>> pair, without changing the semantics of the message, by appending each
>> subsequent field-value to the first, each separated by a comma."""
>> ?
> 
> If that's the only way to make the headers work, then the server may do so.
> 

Nginx does not combine headers, so I have to do it myself (and this 
will complicate the implementation)...

However, IMHO the word here should be "must" rather than "may", and 
this should be stated explicitly in the WSGI spec.

 > [...]

Thanks and regards   Manlio Perillo

Manlio Perillo | 2 Oct 22:27 2007

Re: Multiple message-header fields handling

Robert Brewer wrote:
>
 > [...]
> As you note, it's part of the HTTP spec that such headers
> can be combined without changing the semantics. Here's a list of the
> headers that need to be folded:
> 
> comma_separated_headers = ['ACCEPT', 'ACCEPT-CHARSET', 'ACCEPT-ENCODING',
>     'ACCEPT-LANGUAGE', 'ACCEPT-RANGES', 'ALLOW', 'CACHE-CONTROL',
>     'CONNECTION', 'CONTENT-ENCODING', 'CONTENT-LANGUAGE', 'EXPECT',
>     'IF-MATCH', 'IF-NONE-MATCH', 'PRAGMA', 'PROXY-AUTHENTICATE', 'TE',
>     'TRAILER', 'TRANSFER-ENCODING', 'UPGRADE', 'VARY', 'VIA', 'WARNING',
>     'WWW-AUTHENTICATE']
> 

Note that some of these headers are response headers, and it is the 
responsibility of the WSGI application, not the WSGI gateway, to fold 
them properly.

> The only tricky one is Cookie, because e.g. Konqueror sends them on
> multiple lines, but they're not foldable.
> 
> See http://kristol.org/cookie/errata.html
> 

This is a mess...

Note: in some tests, I have seen Firefox sending a Cookie on multiple lines.

 > [...]

Thanks and regards   Manlio Perillo

Phillip J. Eby | 2 Oct 22:36 2007

Re: Multiple message-header fields handling

At 10:03 PM 10/2/2007 +0200, Manlio Perillo wrote:
>Manlio Perillo wrote:
> > [...]
> > As an example, Nginx stores all the headers in an associative array,
> > where, of course, only the "last seen" header appears.
> >
>
>A correction: Nginx stores "raw" headers in a list of key/value pairs,
>and not in an associative array.
>
>This means that when I iterate over the headers, I see all the multiple
>message-headers, but I only store the last header in the WSGI environment.

That's definitely an error.
