John Millikin | 6 Apr 03:42 2008

Re: Time a for JSON parser in the standard library?

I've written a rough draft of a PEP for standard library inclusion, attached to this email. Comments/improvements welcome - I tried to leave most of the differences between modules in the "Issues" section.

PEP: XXX
Title: A JSON handling library
Version: $Revision$
Last-Modified: $Date$
Author: John Millikin <jmillikin <at> gmail.com>
Discussions-To: web-sig <at> python.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 05-Apr-2008
Python-Version: 2.6
Post-History: XXX

Abstract
========

This PEP describes a proposed library for parsing and generating
data in the `JSON` [1]_ format. JSON stands for "JavaScript Object
Notation", and is described by RFC 4627 [2]_.

Rationale
=========

JSON is a widely-used data interchange format, often used for sending
data to and from a web browser using JavaScript. Its simplicity and
ease of use have led to various implementations with varying degrees
of compliance with the RFC. By bundling a capable implementation in
Python's standard library, I hope to reduce or eliminate the need
for choosing a JSON library.

Existing Public libraries
=========================

* Bob Ippolito's simplejson [3]_
* Deron Meranda's demjson [4]_
* John Millikin's jsonlib [5]_
* Alan Kennedy mentioned on web-sig [6]_ that he has written
  an implementation for Jython, but I couldn't find source code for
  it.

Each of these has a different API, a different degree of strictness,
and a different quality of error handling.

Module Interface
================

Parsing
-------

Encoding Autodetection
''''''''''''''''''''''

The RFC requires that JSON be encoded in one of the Unicode encodings
(UTF-8, UTF-16, or UTF-32). Because the first two characters of a valid
JSON expression are always from the ASCII set, the pattern of null
bytes in the first four bytes of the input reliably identifies the
encoding. Functions for autodetecting encoding exist in jsonlib and
demjson.
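The detection rule can be sketched in a few lines. This is a minimal
illustration of the null-byte table in section 3 of RFC 4627, not the
code used by jsonlib or demjson; the name ``detect_json_encoding`` is
hypothetical, and byte-order marks are deliberately ignored.

```python
def detect_json_encoding(data):
    """Guess the Unicode encoding of JSON bytes from the null-byte pattern.

    Hypothetical helper: follows the table in RFC 4627 section 3 and
    assumes the input starts with two ASCII characters and has no BOM.
    """
    head = data[:4]
    if len(head) < 4:
        # Two characters in UTF-16 or UTF-32 need at least four bytes,
        # so anything shorter can only be UTF-8.
        return "utf-8"
    nulls = tuple(byte == 0 for byte in head)
    if nulls == (True, True, True, False):
        return "utf-32-be"   # 00 00 00 xx
    if nulls == (True, False, True, False):
        return "utf-16-be"   # 00 xx 00 xx
    if nulls == (False, True, True, True):
        return "utf-32-le"   # xx 00 00 00
    if nulls == (False, True, False, True):
        return "utf-16-le"   # xx 00 xx 00
    return "utf-8"           # xx xx xx xx
```

The returned names are valid Python codec names, so the result can be
passed directly to ``bytes.decode``.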

Parsing API
'''''''''''

A JSON expression may be parsed using the ``parse`` function::

  parse (bytes_or_string)

If the input is a ``bytes`` object, the encoding should be auto-detected
as above. If input has been received in a non-standard encoding, it can
be manually decoded and passed to ``parse`` as a string. The return
value is either a sequence or mapping, depending on the input.
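A rough sketch of the proposed ``parse`` function follows. It assumes
the ``json`` module found in current Python standard libraries as a
stand-in back end (an assumption; no such module is part of this
draft). That module's ``loads`` accepts ``str`` and, since Python 3.6,
``bytes`` in UTF-8, UTF-16, or UTF-32.

```python
import json


def parse(bytes_or_string):
    """Sketch of the draft's ``parse``: decode, then insist on a container.

    Assumption: the standard ``json`` module does the actual decoding,
    including encoding detection for bytes input.
    """
    value = json.loads(bytes_or_string)
    if not isinstance(value, (list, dict)):
        # The draft says the result is always a sequence or a mapping.
        raise ValueError("top-level JSON value must be an array or object")
    return value
```

Input in any other encoding would be decoded by the caller and passed
in as a string, as described above.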

Serialization
-------------

Python objects may be serialized using the ``generate`` function::

  generate (obj, indent = None, ascii_only = True, encoding = 'utf-8')

``indent`` is used to control pretty-printing. If ``None``, no pretty
printing will be performed and the output will be maximally compact.
If ``indent`` is a string, that string will be used for indenting
nested values. The only values allowed in ``indent`` are those that
are valid JSON whitespace; these are U+0009, U+000A, U+000D, and U+0020.

``ascii_only`` controls whether the output may contain characters above
the ASCII set. If ``True``, all non-ASCII characters must be escaped
using \\uXXXX syntax. Otherwise, non-ASCII characters will be included
without escaping. Depending on the output encoding and values of the
characters, this might be more size-efficient.

``encoding`` specifies how the output is to be encoded. If ``None``,
the output will be a Unicode string. By default, JSON is encoded in
UTF-8.
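The three parameters can be illustrated with a thin wrapper. This is
only a sketch under stated assumptions: it delegates to ``json.dumps``
from the ``json`` module in current Python standard libraries (not part
of this draft), where ``ensure_ascii`` plays the role of ``ascii_only``.

```python
import json

# The four characters the draft allows inside ``indent``.
JSON_WHITESPACE = set("\t\n\r ")


def generate(obj, indent=None, ascii_only=True, encoding="utf-8"):
    """Sketch of the draft's ``generate`` on top of ``json.dumps``."""
    if indent is not None and not set(indent) <= JSON_WHITESPACE:
        raise ValueError("indent may contain only JSON whitespace")
    # No indent means maximally compact output: no spaces after separators.
    separators = (",", ":") if indent is None else (",", ": ")
    text = json.dumps(obj, indent=indent, ensure_ascii=ascii_only,
                      separators=separators)
    # encoding=None returns a Unicode string; otherwise return bytes.
    return text if encoding is None else text.encode(encoding)
```

For example, ``generate(["é"])`` returns ASCII-only bytes with the
character escaped, while ``generate(["é"], ascii_only=False)`` returns
the UTF-8 encoding of the character itself.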

Note: this is the set of options generally supported by implementations.
For a full treatment of other options, see `Options for Serialization`_.

Other
-----

XXX Should the encoding autodetection function be a part of the
public API?

Issues
======

Representation of Fractional Numbers
------------------------------------

The author of jsonlib feels that fractional numbers should be parsed
into an instance of ``decimal.Decimal``, to avoid issues with values
that cannot be represented exactly by the ``float`` type
[7]_.

  The spec does not require a decimal, but I dislike losing information
  in the parsing stage. Any implementation in the standard library
  should, in my opinion, at least offer a parameter for lossless parsing
  of number values.

The author of simplejson disagrees [8]_, saying that:

  Practically speaking I've tried using decimal instead of float for
  JSON and it's generally The Wrong Thing To Do. The spec doesn't say
  what to do about numbers, but for proper JavaScript interaction you
  want to do things that approximate what JS is going to do: 64-bit
  floating point.

demjson appears to have some sort of float precision detection
mechanism, and returns instances of ``float`` only if they can
represent a value exactly.
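The difference between the two positions can be made concrete. The
sketch below assumes the ``json`` module in current Python standard
libraries (not part of this draft), whose ``loads`` function takes a
``parse_float`` callable; it shows one way a "parameter for lossless
parsing" could look, not a decision on the issue.

```python
import json
from decimal import Decimal

text = '[1.1, 12345678901234567890.123456789]'

# Default behaviour: fractional numbers become binary floats.
as_float = json.loads(text)

# Lossless alternative: every fractional literal is handed to Decimal
# as the original string, so no digits are lost.
as_decimal = json.loads(text, parse_float=Decimal)
```

With ``parse_float=Decimal`` the second value keeps all of its digits;
as a ``float`` it is rounded to the nearest representable double.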

Serializing User-defined Types
------------------------------

There should be some way for a user to specify how types not known
to the JSON library should be serialized. For example, Django
needs to serialize types related to date and time.

* simplejson supports a ``default`` parameter to ``dump`` and
  ``dumps``, which should be a callable that accepts a value and
  returns a serializable object.
* demjson supports a ``json_equivalent`` method of objects to
  encode, or users may subclass the ``demjson.JSON`` class and
  override the ``encode_default`` method.
* jsonlib supports an ``on_unknown`` parameter to ``write``, which
  acts like simplejson's ``default``.
* Alan Kennedy's implementation checks for a ``__json__`` method of
  objects to serialize [6]_.
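As an illustration of the callable-hook style, here is a minimal sketch
for the date and time case. It assumes the ``default`` parameter of
``json.dumps`` in the ``json`` module of current Python standard
libraries, which follows the simplejson convention described above; the
function name ``encode_unknown`` and the choice of ISO 8601 strings are
hypothetical.

```python
import datetime
import json


def encode_unknown(value):
    """Return a serializable stand-in for types the encoder rejects."""
    if isinstance(value, (datetime.datetime, datetime.date, datetime.time)):
        return value.isoformat()
    # Anything else is still an error, as it would be without the hook.
    raise TypeError("cannot serialize %r" % (value,))


event = {"name": "release", "when": datetime.datetime(2008, 4, 5, 19, 42)}
encoded = json.dumps(event, default=encode_unknown, sort_keys=True)
```

The hook is called only for values the encoder cannot handle itself, so
strings, numbers, lists, and mappings are unaffected.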

Options for Serialization
-------------------------

There are options supported by only a few of the implementations:

``allow_nan``
  In ``simplejson``, allows Infinity and NaN to be serialized. These
  values are not supported by JSON, but are supported in JavaScript.

``check_circular``
  In ``simplejson``, allows the check for self-referential containers
  to be disabled.

``coerce_keys``
  In ``jsonlib``, forces non-string mapping keys to strings.

``default``
  In ``simplejson``, provides a hook for serializing user-defined
  types.

``indent``
  In ``simplejson``, an integer specifying the indentation level in
  spaces.

``on_unknown``
  In ``jsonlib``, serves the same purpose as simplejson's ``default``.

``separators``
  In ``simplejson``, allows the user to override the separators used
  for delimiting array and object values. There is no check performed
  as to whether this would produce invalid JSON. I think having this
  parameter is insane.

``skipkeys``
  In ``simplejson``, skips serializing mapping items with non-string
  keys.

``sort_keys``
  In ``jsonlib``, sorts mapping keys to provide consistent output for
  unit testing.

``strict``
  In ``demjson``, serves the same purpose as simplejson's
  ``allow_nan``.
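A few of these options can be tried out with the ``json`` module in
current Python standard libraries, which keeps the simplejson parameter
names (an assumption relative to this draft, which predates it). The
snippet is a sketch of the behaviours described above, including the
unchecked ``separators`` case.

```python
import json

# sort_keys gives stable output, which is convenient in unit tests.
stable = json.dumps({"b": 1, "a": 2}, sort_keys=True)

# By default non-finite floats are written out, although they are
# not valid JSON.
non_finite = json.dumps([float("inf")])

# separators is not validated, so invalid JSON can be produced.
odd = json.dumps([1, 2], separators=(";", "="))


def strict_dumps(obj):
    """Serialize, rejecting NaN and Infinity with a ValueError."""
    return json.dumps(obj, allow_nan=False)
```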

Non-string Object Keys
----------------------

JSON allows only strings to be used as object keys. demjson in loose
mode allows non-string keys to be parsed, and simplejson will
automatically coerce some types to strings. simplejson has an option
for skipping non-string keys, and jsonlib has an option for coercing
them.
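Both behaviours, coercing and skipping, can be seen in the ``json``
module of current Python standard libraries, which follows simplejson
here (an assumption relative to this draft). A short sketch:

```python
import json

# Some non-string keys are silently coerced to strings.
coerced = json.dumps({2: "a", None: "b"})

# Keys that cannot be coerced, such as tuples, are dropped on request.
skipped = json.dumps({(1, 2): "x", "k": "v"}, skipkeys=True)


def dumps_strictly(mapping):
    """Without skipkeys, an uncoercible key raises TypeError."""
    return json.dumps(mapping)
```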

"Raw" atoms
-----------

JSON expressions must have an array or object as the outer-most
value -- that is, the expressions ``true``, ``42``, and ``"spam"``
are not valid JSON. Strict-mode demjson and jsonlib raise exceptions
when parsing or generating such an expression; simplejson does not.

This "feature" is widely supported, but it might just be a non-obvious
bug.

Trailing Commas
---------------

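The text ``[1, 2, 3,]`` is valid in both JavaScript and Python, but
is invalid JSON. In Python, it is a list of three items. In JavaScript
the result depends on the engine: the ECMAScript specification gives
an array of length three, but some browsers (notably older versions
of Internet Explorer) produce an array of length four with the items
``[1, 2, 3, undefined]``.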

Alan Kennedy mentioned that his parser has an option to support
reading these, so presumably he has a use case for it. He didn't
mention what it was parsed as.
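The Python side of this comparison is easy to check. The sketch below
uses the ``json`` module of current Python standard libraries as an
example of a parser that rejects trailing commas (an assumption; this
draft predates it), with a hypothetical ``is_valid_json`` helper.

```python
import json

# As a Python literal the trailing comma is ignored: three items.
python_list = [1, 2, 3,]


def is_valid_json(text):
    """Return True if ``text`` parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except ValueError:
        return False
    return True
```

Here ``is_valid_json('[1, 2, 3,]')`` is false, while the same text
without the final comma is accepted.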

Function Names
--------------

There is no real agreement on what the public functions should be
named. simplejson uses ``load``/``loads`` and ``dump``/``dumps``,
modeled after the ``pickle`` module. demjson uses ``decode`` and
``encode``. jsonlib uses ``read`` and ``write``, modeled after the
``python-json`` module.

This PEP uses ``parse`` and ``generate`` because that is what the
``email`` module uses.

Module Name
-----------

Probably ``json``, but there's been no actual discussion or consensus
on it that I know of.

Lint for JSON
-------------

demjson comes with lint-like functionality. It would be nice to have
this available in the standard library as well, so that invalid JSON
could be detected without having to actually parse it.

Resources
=========

* `Comparing JSON modules for Python`__, by Deron Meranda.

  __ http://deron.meranda.us/python/comparing_json_modules/

References
==========

.. [1] Introducing JSON, contains general description of JSON and a list
   of implementations.
   (http://json.org/)

.. [2] RFC 4627
   (http://www.ietf.org/rfc/rfc4627.txt)

.. [3] http://pypi.python.org/pypi/simplejson/

.. [4] http://pypi.python.org/pypi/demjson/

.. [5] http://pypi.python.org/pypi/jsonlib/

.. [6] http://mail.python.org/pipermail/web-sig/2008-March/003332.html

.. [7] http://mail.python.org/pipermail/web-sig/2008-March/003343.html

.. [8] http://mail.python.org/pipermail/web-sig/2008-March/003336.html

Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

_______________________________________________
Web-SIG mailing list
Web-SIG@...
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/gcpw-web-sig%40m.gmane.org
Robert Brewer | 6 Apr 04:01 2008

Re: Time a for JSON parser in the standard library?

John Millikin wrote:
> I've written a rough draft of a PEP for standard library inclusion,
> attached to this email. Comments/improvements welcome - I tried to
> leave most of the differences between modules in the "Issues" section.

Re: Representation of Fractional Numbers, there are two solutions. If you return decimals, people using
JS on the other end are going to call float(d). If you return floats, people not using JS on the other end are
going to go use a different library. I suggest the former is more acceptable than the latter for a stdlib
offering. Allowing the caller of parse() to choose would be even better.

Robert Brewer
fumanchu@...


John Millikin | 6 Apr 05:54 2008

Re: Time a for JSON parser in the standard library?

(messed up CC on last email, re-sending to list)

On Sat, Apr 5, 2008 at 7:01 PM, Robert Brewer <fumanchu <at> aminus.org> wrote:

> Re: Representation of Fractional Numbers, there are two solutions. If you
> return decimals, people using JS on the other end are going to call float(d).
> If you return floats, people not using JS on the other end are going to go
> use a different library. I suggest the former is more acceptable than the
> latter for a stdlib offering. Allowing the caller of parse() to choose
> would be even better.

I don't understand what you mean, here. generate ([decimal.Decimal ('1.1')]) -> '[1.1]', so a JavaScript user calling eval() on it would get a standard JavaScript float object without having to call float() explicitly.
Robert Brewer | 6 Apr 08:45 2008

Re: Time a for JSON parser in the standard library?

John Millikin wrote:
> On Sat, Apr 5, 2008 at 7:01 PM, Robert Brewer <fumanchu <at> aminus.org> wrote:
> > Re: Representation of Fractional Numbers, there are two solutions. If you
> > return decimals, people using JS on the other end are going to call float(d).
> > If you return floats, people not using JS on the other end are going to go
> > use a different library. I suggest the former is more acceptable than the
> > latter for a stdlib offering. Allowing the caller of parse() to choose
> > would be even better.
> 
> I don't understand what you mean, here. generate ([decimal.Decimal ('1.1')])
> -> '[1.1]', so a JavaScript user calling eval() on it would get a standard
> JavaScript float object without having to call float() explicitly.

Sorry, I wasn't describing what anyone would do in Javascript. Pythonistas receiving JSON numbers from a
JS *sender*, who want Python floats, can call float(d) if they like if you hand them a Decimal object.
Annoying but easy. People receiving JSON numbers from, say, a Python sender, can't call Decimal(f) if you
hand them a float instance, at least not reliably. So they'll either go use some other jsonlib (bad) or
start passing numbers in strings (worse).
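
The asymmetry described here can be checked directly. This is a small
sketch using current Python, where Decimal(f) is allowed but converts
the exact binary value of the float rather than the decimal digits that
were originally sent:

```python
from decimal import Decimal

sent = Decimal("1.1")

# Decimal -> float: "annoying but easy", and gives the expected float.
as_float = float(sent)

# float -> Decimal: yields the float's exact binary expansion,
# which is not the value that was originally sent.
round_tripped = Decimal(as_float)
```

Here as_float equals 1.1, but round_tripped is not equal to
Decimal("1.1").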


Robert Brewer
fumanchu <at> aminus.org

Ronny Pfannschmidt | 7 Apr 15:28 2008

[proposal] merging jsonrpc into xmlrpc

Hi,

since json-rpc and xml-rpc basically do the same 
and the only difference is the content-type (json is more concise),
i propose to create a single xml/json-rpc module.

I made a similar proposal to stdlib-sig,
they told me to ask in web-sig about the details because of json.

---
Ronny Pfannschmidt

Graham Dumpleton | 8 Apr 00:36 2008

Re: [proposal] merging jsonrpc into xmlrpc

2008/4/7 Ronny Pfannschmidt <Ronny.Pfannschmidt@...>:
> Hi,
>
>  since json-rpc and xml-rpc basically do the same
>  and the only difference is the content-type (json is more concise),
>  i propose to create a single xml/json-rpc module.
>
>  I made a similar proposal to stdlib-sig,
>  they told me to ask in web-sig about the details because of json.

To repeat something I said on list a while back, this time phrased
properly to correctly refer to JSON-RPC. :-)

The problem with the JSON-RPC 1.0 specification was that it wasn't
always as clear as it could have been. As a result different server side
implementations interpreted or implemented it differently, as did the
JavaScript clients. I'll admit that it has been a while since I looked
at it and maybe things have improved, but certainly it used to be the
case that finding a JavaScript library that talked to a specific
server side implementation wasn't always easy. End result was that the
JavaScript library would often only work with the specific web
framework it was originally designed for and nothing else.

The problem areas were different interpretations of what could be
supplied in an error response, and whether an integer, string or
arbitrary object should be used as the id attribute in a request.
Finally, some JavaScript clients would only work with a server side
implementation which provided introspection methods, as they would
dynamically create a JavaScript proxy object based on a call of the
introspection methods.

Unfortunately the JSON-RPC 1.1 draft specification didn't necessarily
make things better. Rather than creating a proper layered
specification which separated lower level transport and encoding
concerns from higher level application concepts such as introspection,
they bundled it all together. Thus they try to enforce that a server
must support introspection even though doing so may be totally
impractical depending on what the JSON-RPC server adapter is hooking
in to. They also introduced all this muck about having to support both
positional and named parameters at the same time. The JSON-RPC 1.1
specification was also never really completed and left out details,
such as standard error codes, that they were proposing be
specified.

Thus my question is, what version of the JSON-RPC specification are
you intending to support? Also what form would the error response take
so that it works with a suitable number of JSON-RPC clients? Are you
prepared to go and test it with a sufficient range of clients to make
sure Python implemented server side interops properly?

Graham

Massimo Di Pierro | 8 Apr 19:09 2008

google appengine

You probably read that google has released appengine:

     http://www.youtube.com/watch?v=bfgO-LXGpTM

but they have disabled video responses. So here is mine anyway.

     http://www.vimeo.com/875433

Massimo

Alan Kennedy | 8 Apr 23:01 2008

Re: [proposal] merging jsonrpc into xmlrpc

[Ronny]
>>  since json-rpc and xml-rpc basically do the same
>>  and the only difference is the content-type (json is more concise),
>>  i propose to create a single xml/json-rpc module.

[Graham]
>  The problem with the JSON-RPC 1.0 specification was that it wasn't
>  always as clear as it could have been.
>
>  Unfortunately the JSON-RPC 1.1 draft specification didn't necessarily
>  make things better.

>  The JSON-RPC 1.1
>  specification was also never really completed and left out details,
>  such as standard error codes, that they were proposing be
>  specified.

All valid concerns.

I think that the JSON-RPC initiative lost its way a little. They tried
to model things such as encoding and decoding an object graph, using
object references, etc, which IMHO is a step too far for the usages
JSON-RPC would get, and is more CORBA than XML-RPC.

The maintainer of the JSON-RPC.org site was looking for someone to
take it over for a while; I think someone might have taken it over
last year.

[Graham]
>  Are you
>  prepared to go and test it with a sufficient range of clients to make
>  sure Python implemented server side interops properly?

Interestingly, the reference implementation for JSON-RPC is a server
written in python[1].

http://json-rpc.org/wiki/python-json-rpc

Perhaps python's best interests in this case are better served by
letting that reference implementation drive the JSON-RPC standards
process[2]?

If that is the case, then it is counter-productive to add a competing
module to the python standard library.

Regards,

Alan.

[1] But it's a shame they didn't write it on WSGI: then their services
could have run on the Google compute cloud ;-)

[2] Perhaps some pythonista from Web-SIG is most appropriate to advise
how JSON-RPC should move forward? After all, we're more accustomed to
server-side stuff than those javascript folks ;-)

Ian Bicking | 9 Apr 00:06 2008

Re: [proposal] merging jsonrpc into xmlrpc

Alan Kennedy wrote:
> [1] But it's a shame they didn't write it on WSGI: then their services
> could have run on the Google compute cloud ;-)

Indeed.  After seeing a BaseHTTPServer JSON-RPC server go up on the 
Python Cookbook I wrote a WSGI server and made it into a tutorial: 
http://pythonpaste.org/webob/jsonrpc-example.html (but it's not a 
maintained library -- at least I won't be maintaining it).

> [2] Perhaps some pythonista from Web-SIG is most appropriate to advise
> how JSON-RPC should move forward? After all, we're more accustomed to
> server-side stuff than those javascript folks ;-)

Let it die?  It is more complicated than necessary, when instead you 
could just make each function a URL of its own, and POST the arguments 
and get back the response, with 500 Server Error for errors.  It's hard 
to spec that up because it's too simple.
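
A minimal sketch of that idea as a WSGI application, assuming JSON
request and response bodies. The FUNCTIONS registry, the /add path and
the call() test helper are hypothetical names for illustration; this is
not how OHM or the WebOb tutorial is written.

```python
import io
import json

# Hypothetical registry: each URL path maps to one plain function.
FUNCTIONS = {"/add": lambda a, b: a + b}


def application(environ, start_response):
    func = FUNCTIONS.get(environ.get("PATH_INFO", ""))
    if func is None:
        status, body = "404 Not Found", {"error": "unknown function"}
    elif environ.get("REQUEST_METHOD") != "POST":
        status, body = "405 Method Not Allowed", {"error": "use POST"}
    else:
        try:
            # The POST body is a JSON array of positional arguments.
            length = int(environ.get("CONTENT_LENGTH") or 0)
            raw = environ["wsgi.input"].read(length)
            args = json.loads(raw.decode("utf-8"))
            status, body = "200 OK", {"result": func(*args)}
        except Exception as exc:
            # Any failure is reported as a plain 500 response.
            status, body = "500 Internal Server Error", {"error": str(exc)}
    payload = json.dumps(body).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(payload)))])
    return [payload]


def call(path, args):
    """Invoke the application in-process, without a real HTTP server."""
    raw = json.dumps(args).encode("utf-8")
    environ = {"REQUEST_METHOD": "POST", "PATH_INFO": path,
               "CONTENT_LENGTH": str(len(raw)),
               "wsgi.input": io.BytesIO(raw)}
    seen = {}

    def start_response(status, headers):
        seen["status"] = status

    body = b"".join(application(environ, start_response))
    return seen["status"], json.loads(body.decode("utf-8"))
```

For example, call("/add", [1, 2]) returns a 200 status with the result
3, and calling with the wrong number of arguments returns a 500.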

OHM (http://pythonpaste.org/ohm/) follows this model of exposing a service.

   Ian

Alan Kennedy | 9 Apr 03:02 2008

Re: [proposal] merging jsonrpc into xmlrpc

[Alan]
>> [2] Perhaps some pythonista from Web-SIG is most appropriate to advise
>> how JSON-RPC should move forward? After all, we're more accustomed to
>> server-side stuff than those javascript folks ;-)

[Ian]
>  Let it die?  It is more complicated than necessary, when instead you could
> just make each function a URL of its own, and POST the arguments and get
> back the response, with 500 Server Error for errors.  It's hard to spec that
> up because it's too simple.
>
>  OHM (http://pythonpaste.org/ohm/) follows this model of exposing a service.

Mmmm, very RESTful.

Access to the requested HTTP method is fundamental to RESTful services.

I find it interesting that Java's HttpServletRequest has a
.getMethod(), but no .setMethod(). Which means that one has to
implement method overrides[1] by carrying the override value through
means other than the request object itself.

Whereas in WSGI, I can simply do: environ['REQUEST_METHOD'] =
environ['HTTP_X_HTTP_METHOD_OVERRIDE']
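
As a sketch, the same assignment can be packaged as WSGI middleware.
WSGI exposes request headers CGI-style, so the X-HTTP-Method-Override
header arrives under the key HTTP_X_HTTP_METHOD_OVERRIDE. The names
method_override and echo_method are hypothetical, and honouring the
override only on POST requests is an assumption of this example:

```python
def method_override(app):
    """Wrap a WSGI app so a POST may carry an overriding HTTP method."""
    def middleware(environ, start_response):
        override = environ.get("HTTP_X_HTTP_METHOD_OVERRIDE")
        if override and environ.get("REQUEST_METHOD") == "POST":
            # Rewrite the method in place; the wrapped app never
            # needs to know an override header was involved.
            environ["REQUEST_METHOD"] = override.upper()
        return app(environ, start_response)
    return middleware


def echo_method(environ, start_response):
    """Tiny demo app that reports the method it was called with."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ["REQUEST_METHOD"].encode("ascii")]


wrapped = method_override(echo_method)
```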

I've heard WSGI described as "python's servlet API". It's not that; it's better.

Regards,

Alan.

[1] http://code.google.com/apis/gdata/basics.html#Updating-an-entry

