Re: WebSockets
Good morning,
On May 6, 2010, at 22:16 , John Carlson wrote:
> I was wondering how many people here have heard of WebSockets.
Interesting timing, as I was just talking about this with some
coworkers. Here's a bit of hopefully helpful info, though I'm not sure
I understand how relevant this discussion is to capabilities.
> http://en.wikipedia.org/wiki/Web_Sockets
>
> "WebSockets is a technology providing for bi-directional, full-
> duplex communications channels, over a single Transmission Control
> Protocol (TCP) socket, designed to be implemented in web browsers
> and web servers. The WebSockets API is being standardized by the W3C
> and the WebSocket protocol is being standardized by the IETF."
On the commercial side, the Kaazing folks have been at the leading
edge of this for a while, so you might want to check out their stuff (in
terms of software and demos as well as the various presentations and
articles they've written). [ObDisclosure: I know them but have no
monetary relationship, etc.]
On the open source side, there's the Java-based jWebSocket.org and the
JavaScript-based Node.js websockets support. Surely there are more, but
those three (counting Kaazing) are the ones I've personally used.
> What's the scalability of this? How many sockets can the typical
> host support? I guess I should look at TCP. 2^16 it looks like.
> About the size of a typical football stadium? More if clients can
> relay data.
One has to plan for and deal with the long-lived connections... that
can be a bit of a heretical shock to operators and developers used to
web 1.0 setups. These implementations all use some form of modern
non-blocking I/O to manage such large numbers of concurrent connections
efficiently. [Netty seems to be supplanting Mina in the Java space and
Node.js pretty much completely owns the server-side JavaScript space.]
At the sysadmin level, it's mostly network-related tuning, like
raising ulimit so that there are enough file descriptors.
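To make the non-blocking I/O and file descriptor points a bit more
concrete, here's a minimal Python sketch. It is not how Kaazing,
jWebSocket or Node.js do it internally; it just shows one thread
servicing many sockets through a readiness selector, plus how to read
the per-process descriptor limit that ulimit controls. The function
names are made up for illustration, local socket pairs stand in for
real client connections, and the resource module is Unix-only.

```python
import resource
import selectors
import socket


def fd_limits():
    """Return the (soft, hard) per-process limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def drain_ready(selector, timeout=1.0):
    """One pass of the event loop: read from every socket that has data."""
    received = {}
    for key, _events in selector.select(timeout):
        # key.data is the connection id we attached at registration time.
        received[key.data] = key.fileobj.recv(4096)
    return received


def demo(n_connections=50):
    """Service n_connections sockets from a single thread (illustrative)."""
    selector = selectors.DefaultSelector()
    clients = []
    for conn_id in range(n_connections):
        # A local socket pair stands in for an accepted client connection.
        server_side, client_side = socket.socketpair()
        server_side.setblocking(False)
        selector.register(server_side, selectors.EVENT_READ, data=conn_id)
        clients.append(client_side)

    for conn_id, client in enumerate(clients):
        client.send(b"hello %d" % conn_id)

    received = {}
    while len(received) < n_connections:
        batch = drain_ready(selector)
        if not batch:  # nothing became readable before the timeout
            break
        received.update(batch)

    # Clean up every descriptor we opened.
    for key in list(selector.get_map().values()):
        selector.unregister(key.fileobj)
        key.fileobj.close()
    for client in clients:
        client.close()
    selector.close()
    return received
```

Each connection costs one descriptor on the server side (two here,
because the demo holds both ends), which is why the soft limit
reported by fd_limits() becomes the first wall you hit.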
The practical limit is roughly bounded by an envelope of: number of
concurrent connections, data load (messages per second, size of
messages), link capacity, OS-level network stack implementation
quality, etc. Of course, what your code is doing with the
connections/data matters, too (e.g., basic routing vs. lots of
processing, and don't forget those "hidden" taxes like the cost of XML
transcoding).
All that said, hundreds of thousands of connections per commodity host
is quite possible (and Kaazing has publicly stated that they've done 1M
connections per host).
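As a rough way to play with that envelope, here's a back-of-the-
envelope Python sketch relating connection count, message rate,
message size and link capacity. The function names and the sample
numbers are made up for illustration, and it counts payload bytes
only (no TCP/IP or framing overhead, no CPU cost), so treat the
results as optimistic upper bounds.

```python
def required_mbps(connections, msgs_per_sec, msg_bytes):
    """Payload bandwidth in megabits per second for a given load."""
    return connections * msgs_per_sec * msg_bytes * 8 / 1_000_000


def max_connections(link_mbps, msgs_per_sec, msg_bytes):
    """Connections a link can carry at the given per-connection load."""
    link_bytes_per_sec = link_mbps * 1_000_000 // 8
    return int(link_bytes_per_sec // (msgs_per_sec * msg_bytes))


# Hypothetical load: 100,000 connections, one 200-byte message per second each.
print(required_mbps(100_000, 1, 200))   # payload Mbit/s needed
# Hypothetical 1 Gbit/s link carrying that same per-connection load.
print(max_connections(1000, 1, 200))    # connections before the link saturates
```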
> Is stateless communication crumbling?
Everything old is new again. It really depends on what you're doing
and what you need. The pendulum swung extremely far to the
"stateless" direction with the explosion of Web 1.0. But as those
systems have gotten more complex in the much more dynamic "Web 2.0"
world, you can see the lengths to which such systems must go to
support such a dynamic, personalized interaction model effectively and
efficiently. [A loose analogy of this progression
is: broadcast tv to cable tv with on-demand programming to the internet
+flickr+youtube+torrents+email+chat+....]
A classic example (in the "don't start a land war in Asia" sense)
that all of the websocket projects show is chat, precisely because
it's so easy and efficient to do and because it's a service that
companies always seem to want to add to their web sites. Doing it with
the naive web 1.0 RPC approach brings all of the horrible polling
latency and page refresh effects for the user, and nasty scaling
problems if your site is at all popular. The various pre-websocket
Web 2.0 solutions like long-polling have basically all of the hard
costs of websockets without the fundamental benefits, and so they've
been stuck in a sort of purgatory where they tease well but never
really satisfy.
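To put rough numbers on the naive polling problem, here's a small
Python sketch. The function names and figures are hypothetical, and
it assumes events arrive at uniformly random times within a poll
interval, so the mean wait before a client sees one is half the
interval.

```python
def polling_requests_per_sec(users, poll_interval_s):
    """Server requests per second when every user polls on a fixed interval."""
    return users / poll_interval_s


def mean_polling_delay_s(poll_interval_s):
    """Average wait before a poll picks up a new message."""
    return poll_interval_s / 2


def push_deliveries_per_sec(users, seconds_between_events):
    """Deliveries per second when the server pushes only on actual events."""
    return users / seconds_between_events


# Hypothetical chat site: 10,000 users polling every 5 seconds, each
# actually receiving a message about once every 100 seconds.
print(polling_requests_per_sec(10_000, 5))   # load, whether or not anyone talks
print(mean_polling_delay_s(5))               # added latency per message
print(push_deliveries_per_sec(10_000, 100))  # load with a pushed connection
```

Shortening the poll interval cuts the latency but raises the request
rate in proportion, which is the scaling trap with polling.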
Have fun,
John