Re: Inspect HTML response
Jython is failing to write a Unicode string to a file using the default
ASCII codec. We can work around that by encoding explicitly
(print >> file, response.text.encode("UTF-8")), but as Gary
has pointed out, the underlying problem is that you're not first
decoding the gzipped body.
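The failure mode is easy to reproduce in a few lines of plain Python, independent of The Grinder (a minimal sketch; the Jython runtime does the ASCII encode implicitly when writing to a file):

```python
# A string containing U+FFFD, the character named in the error message.
s = u"\ufffd"

# The default ASCII codec cannot represent it, so encoding fails...
try:
    s.encode("ascii")
except UnicodeEncodeError as e:
    print("ascii codec fails: %s" % e)

# ...whereas encoding explicitly to UTF-8 succeeds.
encoded = s.encode("UTF-8")
print(repr(encoded))  # the three-byte UTF-8 form of U+FFFD
```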
response.getText() returns a Unicode String created by decoding the
bytes received from the wire using the charset from the Content-Type
response header, falling back to ISO-8859-1 if there is no such header.
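That charset selection can be sketched as follows. This is an illustration of the documented behaviour, not The Grinder's actual code, and `charset_from_content_type` is a hypothetical helper name:

```python
import re

def charset_from_content_type(header, default="ISO-8859-1"):
    """Extract the charset parameter from a Content-Type header value,
    falling back to ISO-8859-1 when absent."""
    match = re.search(r"charset=([\w.:-]+)", header or "", re.IGNORECASE)
    return match.group(1) if match else default

print(charset_from_content_type("text/html; charset=UTF-8"))  # UTF-8
print(charset_from_content_type("text/html"))                 # ISO-8859-1
print(charset_from_content_type(None))                        # ISO-8859-1
```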
Interestingly, \ufffd is U+FFFD REPLACEMENT CHARACTER, which decoders
emit in place of an incoming byte sequence that is invalid or
unrepresentable in Unicode. It appears here because the content has not
been gunzipped, so the decoder is being fed raw compressed bytes.
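You can watch U+FFFD appear when a decoder in "replace" mode is fed gzipped bytes as if they were text, which is exactly what is happening here (a plain-Python sketch of the principle):

```python
import gzip

# Stand-in for a gzipped response body.
compressed = gzip.compress(b"<html>hello</html>")

# Decoding the raw gzip bytes as UTF-8 text yields replacement characters,
# because the gzip magic bytes are not valid UTF-8.
garbled = compressed.decode("utf-8", errors="replace")
print(u"\ufffd" in garbled)  # True
```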
You can ask The Grinder to handle the Content-Encoding header and
gunzip the body for you. The simplest thing to do is to uncompress
everything, which can be done by adding the following to the top of
your script:

connectionDefaults.useContentEncoding = True

Gary's right to say this might unnecessarily burn a lot of CPU if you're
not interested in every response (which is why The Grinder doesn't
gunzip by default), so extending the 1-in-100 approach, you could say:
logResponse = grinder.runNumber % 100 == 0

# Do this for all of the URLs.
def saveHtmlToFile(prefix, response, testId):
    encoding = response.getHeader("Content-Encoding")
    if encoding is not None and encoding.find("gzip") >= 0:
        # The body has not been decoded; this is the raw gzipped stream.
        inputStream = response.getInputStream()

    filename = "%s-%d- page-%d.html" %   # format arguments elided in the original
    file = open(filename, "w")
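For the manual route above, the still-compressed stream has to be gunzipped before its bytes are decoded as text. In plain Python the round trip looks like this (a sketch of the principle only; under Jython you would wrap the response's input stream in java.util.zip.GZIPInputStream instead of using the gzip module):

```python
import gzip
import io

# Stand-in for the raw, still-gzipped response body.
body = gzip.compress(b"<html>hello</html>")

# Gunzip the stream first, then decode the bytes using the response charset.
with gzip.GzipFile(fileobj=io.BytesIO(body)) as stream:
    html = stream.read().decode("ISO-8859-1")

print(html)  # <html>hello</html>
```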
On 08/02/13 19:38, Sean Tiley wrote:
> Thanks Phil, When I make that adjustment I am getting the following
> error when writing the file..
> 2013-02-08 14:30:57,409 ERROR Tiley-HP-0 thread-0 [ run-0 ]: aborted
> run - Jython exception, <type 'exceptions.UnicodeEncodeError'>:
> 'ascii' codec can't encode character u'\ufffd' in position 1: ordinal
> not in range(128) [calling TestRunner]
> net.grinder.scriptengine.jython.JythonScriptExecutionException: <type
> 'exceptions.UnicodeEncodeError'>: 'ascii' codec can't encode character
> u'\ufffd' in position 1: ordinal not in range(128)
> File ".\sean3.py", line 176, in saveHtmlToFile
>   print >> file, response.text
> This seems really hard. I understand that there are a bunch of sub
> requests going on and that there are images etc., but if I only log the
> request that gets text I get the same issue; clearly there is
> something I am missing.
> On Fri, Feb 8, 2013 at 2:07 PM, Philip Aston <philipa@...
> <mailto:philipa@...>> wrote:
> OK, we can make that simpler.
> Check out the writeToFile function in
> Extending to fit your interface:
> def saveHtmlToFile(prefix, response, testId):
>     filename = "%s-%d- page-%d.html" %
>     file = open(filename, "w")
>     print >> file, response.text
>     file.close()
>     return filename
> - Phil
> On 08/02/13 16:27, Sean Tiley wrote:
>> Thanks Phil,
>> I have the following
>> def saveHtmlToFile(prefix, response, testId):
>>     inputStream = response.getInputStream()
>>     filename = "%s-%d- page-%d.html" %
>>     file = open(filename, "w")
>>     i = 1
>>     cc = inputStream.available()
>>     while i <= cc:
>>         c = inputStream.read()
>>         file.write("%c" % c)
>>         i += 1
>>     return filename
>> When this writes out from the byte array I get not text but a
>> bunch of weird characters: ...9???K?0 ???C?I/mus8? ??? 7?)
>> Is there something else I need to do to get the English / decoded
>> text? Is the result the way it is because I am using https?
>> On Fri, Feb 8, 2013 at 11:12 AM, Philip Aston <philipa@...
>> <mailto:philipa@...>> wrote:
>> toString() just gives you the headers. getText() will give
>> you the body as a String, getData() will give it you as a
>> byte array.
>> There are also some simple helper methods in HTTPUtilities -
>> - Phil
>> On 08/02/13 15:07, Sean Tiley wrote:
>>> I am using The Grinder (3.11) to generate load for a web site
>>> I am testing (https). I have used the proxy to record some
>>> basic interactions with the site, and now I have been trying
>>> to find a way to inspect the HTML response using The Grinder,
>>> and for the life of me I cannot find or figure it out.
>>> I can certainly log info from the response using
>>> But I believe this is just the headers. I want to check the
>>> HTML response for a string if possible.
>>> Can this be done?
>>> Any thoughts greatly appreciated.
>>> Sean Tiley
>>> sean.tiley@... <mailto:sean.tiley@...>
>> Sean Tiley
> Sean Tiley