Stelian Ionescu | 1 Oct 2009 11:01

New patches: 30-Sep-2009


commit 75d8e243888f1b95e77f4ae43891f86a81a3edc8
Author: Stelian Ionescu <sionescu <at> cddr.org>
Date:   Wed Sep 30 19:08:15 2009 +0200

    WALK-DIRECTORY: provide restart ignore-file-system-error in case FN signals an error

 src/os/os-unix.lisp |   15 ++++++++++-----
 1 files changed, 10 insertions(+), 5 deletions(-)

commit 6c3be885b07a97e8938e33d3a654c59ae2a3a5c0
Author: Stelian Ionescu <sionescu <at> cddr.org>
Date:   Wed Sep 30 16:54:44 2009 +0200

    WALK-DIRECTORY: rename :ORDER to :DIRECTORIES

 src/os/os-unix.lisp |   14 ++++++++------
 1 files changed, 8 insertions(+), 6 deletions(-)

commit 08c0102e96331d778c3fe22f58eeaae515eee07c
Author: Stelian Ionescu <sionescu <at> cddr.org>
Date:   Wed Sep 30 16:01:08 2009 +0200

    More fixes to WALK-DIRECTORY.

 src/os/os-unix.lisp |   98 +++++++++++++++++++++++++--------------------------
 1 files changed, 48 insertions(+), 50 deletions(-)

An updated tarball of IOLib's source can be downloaded here:
http://common-lisp.net/project/iolib/files/snapshots/iolib-20090930.tar.gz

Stelian Ionescu | 3 Oct 2009 11:00

New patches: 2-Oct-2009


commit 18499d242d2dffe80fbeae14642cfc712afa3d51
Author: Stelian Ionescu <sionescu <at> cddr.org>
Date:   Fri Oct 2 13:52:02 2009 +0200

    WALK-DIRECTORY: fix :IF-DOES-NOT-EXIST

 src/os/os-unix.lisp |   32 +++++++++++++++-----------------
 1 files changed, 15 insertions(+), 17 deletions(-)

commit 964aed931fab2f82b4ca5ca2c354b8e02e070a8a
Author: Stelian Ionescu <sionescu <at> cddr.org>
Date:   Fri Oct 2 13:51:36 2009 +0200

    WALK-DIRECTORY: fix recursion.

 src/os/os-unix.lisp |   29 +++++++++++++++--------------
 1 files changed, 15 insertions(+), 14 deletions(-)

commit d47be62eafdda89325110da3c39c51ca8b60be4c
Author: Stelian Ionescu <sionescu <at> cddr.org>
Date:   Fri Oct 2 12:57:28 2009 +0200

    Add DELETE-FILES.

 src/os/os-unix.lisp |   19 +++++++++++++++++++
 src/os/pkgdcl.lisp  |    3 +--
 2 files changed, 20 insertions(+), 2 deletions(-)

commit 50a8a8b27ba02d78ac8b353b6964c8cf4edf83b6

Red Daly | 18 Oct 2009 05:05

c10k HTTP server with iolib

Hi,

I was wondering how best to use IOLib to write an efficient HTTP server that can handle perhaps 10,000+ simultaneous connections.  It seems like iolib has all the right ingredients: a system-level sockets interface and an I/O multiplexer that uses epoll/kqueue for efficiently querying sockets.  There is quite a bit of code already written, so I was hoping for some advice about how this would be best implemented.

Here is a possible architecture for a server that can handle tons of connections at once:

Some lisp thread originally sets up a passive-socket (server socket) to listen for connections on some port.  There are a few worker threads (on the order of the number of processors in the machine).  When a connection is received, one of these worker threads will dequeue an active socket with ACCEPT.

However, after the initial connection all the HTTP headers and content must be read from the socket.  Presumably not all the data will be ready as soon as a connection is received, and read operations will block if allowed to block.  While it waits for the full HTTP request to come across the wire, the worker thread could be accepting new connections or processing older ones where the full request is available.  To quickly send HTTP responses off, writes to sockets should also never block--so if we try to send more bytes than a socket's buffer can accept, we should handle that asynchronously so the worker can get on to the next thing to do.

So, a worker thread will either
1) be processing a request (arbitrary lisp code to respond appropriately to an HTTP request).  When a response is ready, it should be written to some non-blocking gray stream.

2) be waiting for the next of the following events:

    a)  socket writable.  Some gray stream that was written to in (1) but would have blocked now has enough room in its buffers to allow more data to be sent immediately (i.e. a "would block" error was returned by an earlier send call).

    b) socket readable.  Some socket that we are listening to (gray stream) has more data available.  When this data is sufficient to respond to the request, this connection is now eligible for the processing described in (1).

    c)  socket accepted.  Some socket is available in the queue of the passive socket.  We can now begin listening to the socket's read events for processing as in (2.b) or process the socket as in (1).

My question about IOLib is how this sort of processing model could be implemented on top of iolib.  Do passive sockets generate epoll/kqueue events when a new connection is available to accept?  If so, it seems like the multiplexer could be used to listen for events 2.a, 2.b, and 2.c all simultaneously.
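
For illustration only, here is a minimal sketch of the event model above, written in Python with the standard selectors module (which picks epoll or kqueue where available) rather than with iolib; it says nothing about iolib's own API.  At the OS level a listening socket is reported readable when a connection is waiting in its accept queue, so events 2.a, 2.b and 2.c can share one loop.  The names make_listener, drop and poll_once are made up for this sketch, and the "request processing" is just an echo.

```python
import selectors
import socket


def make_listener(host="127.0.0.1", port=0):
    """Create a non-blocking listening (passive) socket."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(128)
    srv.setblocking(False)
    return srv


def drop(sel, outbox, sock):
    """Forget a connection and close it."""
    sel.unregister(sock)
    sock.close()
    del outbox[sock]


def poll_once(sel, listener, outbox, timeout=1.0):
    """Handle one batch of events; outbox maps socket -> unsent bytes."""
    for key, mask in sel.select(timeout):
        sock = key.fileobj
        if sock is listener:
            # 2.c: the passive socket is readable, so a connection is queued.
            try:
                conn, _peer = listener.accept()
            except BlockingIOError:
                continue
            conn.setblocking(False)
            outbox[conn] = bytearray()
            sel.register(conn, selectors.EVENT_READ)
            continue
        if mask & selectors.EVENT_READ:
            # 2.b: more request data is available.
            try:
                data = sock.recv(4096)
            except BlockingIOError:
                data = None
            except ConnectionError:
                data = b""
            if data == b"":          # peer closed the connection
                drop(sel, outbox, sock)
                continue
            if data:
                outbox[sock] += data  # stand-in for real processing: echo
        if outbox[sock]:
            # 2.a: send what the kernel accepts now, keep the rest for later.
            try:
                sent = sock.send(outbox[sock])
            except BlockingIOError:
                sent = 0
            except ConnectionError:
                drop(sel, outbox, sock)
                continue
            del outbox[sock][:sent]
        # Ask for writability events only while unsent data remains.
        wanted = selectors.EVENT_READ
        if outbox[sock]:
            wanted |= selectors.EVENT_WRITE
        sel.modify(sock, wanted)
```

A driver would create a selectors.DefaultSelector(), register the listener for EVENT_READ, and call poll_once in a loop.  Interest in writability is requested only while unsent data remains, because an idle socket is almost always writable and would otherwise wake the loop constantly.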

I see there are some gray stream implementations in the code right now, though I have not figured out how to use them.   How do I, for example, create a stream with an underlying socket?  Could these sockets work with the multiplexer implementation to accomplish the processing model described?

I think that sums it up.  Thanks for the great library!

-Red

Matthew Mondor | 18 Oct 2009 11:04

Re: c10k HTTP server with iolib

On Sat, 17 Oct 2009 20:05:52 -0700
Red Daly <reddaly <at> gmail.com> wrote:

> I was wondering how best to use IOLib to write an efficient HTTP server that
> can handle perhaps 10,000+ simultaneous connections.  It seems like iolib
> has all the right ingredients: a sytem-level sockets interface and a io
> multiplexer that uses epoll/kqueue for efficiently querying sockets.  There
> is quite a bit of code already written so I was hoping for some advice about
> how this would be best implemented.

Note that the following is about the kqueue backend, using SBCL and a
version of iolib dating from more than a year ago.  It might not apply
when using the /dev/poll, epoll, or even select backends.  Also, please
forgive me if I'm stating the obvious, as I have no knowledge of your
background :)

I tried using iolib on NetBSD (which supports kqueue), along with the
multiplexer.  I wrote a very simplistic I/O-bound server around it to
measure performance (no worker threads, but non-blocking I/O in a
single-threaded process, a model which I had previously used
successfully for high-performance C+kqueue(2) and
JavaScript+libevent(3) servers on the same OS).

The performance was unfortunately pretty bad compared to using C+kqueue
(i.e. on the order of a few hundred served requests per second versus
thousands, and nearly a thousand with JS), so I made sure the kqueue
backend was being used (it was), and then looked at the code (after
being warned that the multiplexer was a less tested part of iolib).
What I noticed at the time was that timers were not dispatched to
kqueue but to a custom scheduler, and that a kevent(2) syscall was
issued per FD add/remove/state change event.

kqueue allows a single kevent(2) syscall in the main loop to handle
all additions/removals/state changes/notifications of descriptors,
signals and timers.  That is part of what makes it so performant, in
addition to only requiring the caller to iterate over new state changes
rather than a full descriptor table.
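
As a rough sketch of that batching idea (not iolib code, and not my test server), Python's select module exposes kqueue on the BSDs and macOS: a whole list of changes can be handed to one control() call, which maps to one kevent(2) syscall that both applies the changes and returns any ready events.  The helper names read_interest and submit_and_wait are invented for this example, and the functions can only be called on systems where select.kqueue exists.

```python
import select


def read_interest(fds):
    """Build one 'add read filter' change per descriptor (nothing is sent yet)."""
    return [
        select.kevent(fd, filter=select.KQ_FILTER_READ, flags=select.KQ_EV_ADD)
        for fd in fds
    ]


def submit_and_wait(kq, changes, max_events=64, timeout=0.0):
    """One kevent(2) call: apply every queued change and collect ready events."""
    return kq.control(changes, max_events, timeout)
```

A main loop built this way queues its additions, removals and state changes in a list while handling the previous batch of events, then passes the whole list to the next submit_and_wait call instead of issuing one syscall per change.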

I admit that I haven't looked at the iolib kqueue backend code again
lately (it may have improved), and didn't try to fix it myself
(library portability being of limited value in my case, and using
complete C+PHP and C+JavaScript solutions for work, my adventure
into CL and iolib was experimental and a hobby, but I can confirm my
growing love for CL. :)

Another potential performance issue I've noticed is the interface
itself, i.e. all the sanity checking which, to be as (allegro?)
compatible as possible, has to force a distinction between various
socket types (bind/listen/accept vs read/write sockets for instance),
adding overhead.  Also, unlike BSD accept(2), which gives immediate
access to the client's address as it's stored into a supplied sockaddr
object, with iolib one has to perform a separate syscall to obtain the
client address/port, as the interface did not cache that address.  I
honestly didn't check whether iolib made this possible, but the BSD
sockets API also allows asynchronous non-blocking accept(2)/connect(2),
which is important for non-blocking I/O-bound proxies.
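
To show what the BSD behaviour looks like in practice, here is a small Python sketch (Python's socket.accept() wraps accept(2) and returns the peer address from that same call).  It is not iolib code; accept_with_peer is a name invented for the example.

```python
import select
import socket


def accept_with_peer(listener, timeout=0.0):
    """Non-blocking accept: return (conn, peer_address), or None if nothing is queued."""
    ready, _, _ = select.select([listener], [], [], timeout)
    if not ready:
        return None
    try:
        # The peer address comes back with the new socket; no getpeername() needed.
        conn, peer = listener.accept()
    except BlockingIOError:
        return None
    conn.setblocking(False)
    return conn, peer
```

With timeout=0.0 the function never waits, so a single-threaded loop can call it whenever the listening socket is reported readable and carry on with other connections otherwise.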

In the case of my test code, there also was some added overhead
as I wrote a general-purpose TCP server library which the minimal test
HTTP server could use.  CLOS was used (which itself has some overhead
over struct/closure/lambda-based code because of dynamic runtime
dispatching, although SBCL was pretty good at optimizing CLOS code
compared to other implementations).  It also used a custom buffer to
be able to use file descriptors directly instead of streams (especially
since non-blocking I/O was used), although similar code using a
libevent(3) stub class in non-JIT/interpreted JavaScript using
SpiderMonkey was still faster (note that I've not tested iolib's own
buffering against mine, however).  libevent(3) is also able to use a
single-syscall kevent(2)-based loop, which greatly helps performance.

At the time I didn't look into this as I had no idea, but CFFI itself
appears to incur some overhead compared to UFFI; only looking at the
resulting assembly and microbenchmarks showed me this.  It probably
was a non-issue compared to the numerous kevent(2) syscalls.  Another
overhead, probably insignificant since it is CPU-bound, could be
iolib's use of CLOS (I noticed CLOS to be from 1.5 to 10 times slower
in some struct+lambda vs class+method tests, depending on task and CL
implementation).

Another factor was that it was among my first Common Lisp tests, so
the code was probably clumsy :)
In case it can be useful, the test code can be found at:
http://cvs.pulsar-zone.net/cgi-bin/cvsweb.cgi/mmondor/mmsoftware/cl/test/httpd.lisp?rev=1.10;content-type=text%2Fplain
Which uses:
http://cvs.pulsar-zone.net/cgi-bin/cvsweb.cgi/mmondor/mmsoftware/cl/lib/rw-queue/
http://cvs.pulsar-zone.net/cgi-bin/cvsweb.cgi/mmondor/mmsoftware/cl/lib/server/

Even if iolib's multiplexer can't suit your needs with your favorite
backend, that still doesn't make iolib useless, especially in the
case of application servers.  For instance:

As I was playing with ECL more recently, and since it supports POSIX
threads and an SBCL-compatible simple BSD sockets API contrib library,
I wrote a simple multithreaded test server where a pool of ready
threads accept new connections themselves, serve the client, and then
go back to accept mode when done.  This was actually to test ECL
itself, and is very minimal (it isn't flexible and doesn't even
implement input timeouts!), but it can serve to demonstrate the idea,
which could also be implemented using SBCL and its native sockets, or
iolib.  The performance was very decent for an application-type server
(also note that the bugs mentioned in the comments have since been
fixed in ECL):
http://cvs.pulsar-zone.net/cgi-bin/cvsweb.cgi/mmondor/mmsoftware/cl/test/ecl-server2.lisp?rev=1.10;content-type=text%2Fplain
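
The same idea can be sketched in a few lines of Python (this is not the ECL code linked above, only an illustration of the model): every thread in a fixed pool calls accept() on the shared listening socket, serves the client it got, and loops back to accept.  The names worker, start_pool and hello are made up for the example, and the handler sends a fixed reply instead of parsing the request.

```python
import socket
import threading


def worker(listener, handle, stop):
    """Accept a client, serve it, go back to accepting; repeat until stopped."""
    while not stop.is_set():
        try:
            conn, peer = listener.accept()
        except socket.timeout:
            continue            # wake up regularly to check the stop flag
        except OSError:
            break               # listener was closed
        with conn:              # closes the connection when the handler returns
            handle(conn, peer)


def start_pool(listener, handle, n_threads=4):
    """Start n_threads workers sharing one listening socket."""
    stop = threading.Event()
    listener.settimeout(0.2)
    threads = [
        threading.Thread(target=worker, args=(listener, handle, stop), daemon=True)
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    return stop, threads


def hello(conn, peer):
    """Toy handler: read the request and answer with a fixed page."""
    conn.recv(1024)
    conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok")
```

The kernel hands each queued connection to one of the threads blocked in accept(), so no multiplexer is involved; setting the stop event returned by start_pool lets the workers exit on their next timeout.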

The above does not require an efficient multiplexer.  The method it
uses is similar to mmlib/mmserver|js-appserv and Apache; generally,
a manager thread/process uses heuristics to grow and shrink the
process/thread pool as necessary.  In the case of ECL, a
libevent(3)/kqueue(2) C-based main loop could even invoke CL functions
if optimal multiplexing was a must, as ECL compiles CL to C (SBCL's
compiler is more efficient however, especially where CLOS is involved).

In general, CPU-bound applications (HTTP application and database
servers often are) use a pool of processes if optimal reliability and
security is a must (this permits privilege separation, avoids resource
leaks by occasionally recycling the process, confines a bug to the
affected instance, and makes the need for reentrant and thread-safe
libraries a non-issue) or threads (generally with languages minimizing
buffer overflows and supporting a GC, with a master process managing
the threaded processes), whereas I/O-bound applications are the ones
needing optimal multiplexing with non-blocking asynchronous I/O, often
in a single thread/process (i.e. frontend HTTP servers/proxies such as
lighttpd or nginx, IRCD, etc).

For very busy dynamic sites, as load grows, a farm of CPU-bound
application servers can be set up, with a few frontend I/O-bound HTTP
servers proxying dynamic requests to them (via FastCGI, or most
commonly today HTTP, especially with Keep-Alive support) and performing
load balancing (which is sometimes done at an upper layer).  In this
sense it is not necessary for a single all-purpose HTTP server to
handle both very efficient multiplexing and CPU-bound worker threads
simultaneously (the latter are usually better kept separate for the
purpose of redundancy and application-specific configuration)...

That said, if you want to implement an I/O-bound server, I hope the
backends you'll need to use provide better performance than the kqueue
one did for me back then.  Working on improving it would be interesting
but I'm afraid I don't have the time or motivation to take up the task
at the moment.  As for the interface-specific improvements, I can (and
did) suggest a few changes but have no authority to change the API,
which seems to have been thought out with valid compatibility concerns.

-- 
Matt

Matthew Mondor | 18 Oct 2009 11:46

Re: c10k HTTP server with iolib

On Sun, 18 Oct 2009 05:04:36 -0400
Matthew Mondor <mm_lists <at> pulsar-zone.net> wrote:

> Another factor was that it was among my first Common Lisp tests, so
> the code was probably clumsy :)
> In case it can be useful, the test code can be found at:
[...]

Oh, obviously, adding the webeconomybs page to the httpd was done
afterwards for fun; the performance tests were done using a static
result page at the time. :)

-- 
Matt

Stelian Ionescu | 23 Oct 2009 11:00

New patches: 22-Oct-2009


commit e8b89bfbc6bf79704cc4af90facb2b4dc43033bb
Author: Stelian Ionescu <sionescu <at> cddr.org>
Date:   Thu Oct 22 14:49:11 2009 +0200

    Various improvements to WALK-DIRECTORY.

 src/os/os-unix.lisp |   61 +++++++++++++++++++++-----------------------------
 1 files changed, 26 insertions(+), 35 deletions(-)

commit 845705a181458892c8ee1b9c0b5269ef79a61da3
Author: Stelian Ionescu <sionescu <at> cddr.org>
Date:   Thu Oct 22 14:48:26 2009 +0200

    Some optimizations in CSTRING code.

 src/syscalls/unix-syscall-path-strings.lisp |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

commit 403eae5d5937c259f9c1dba3389a95a5ce560564
Author: Stelian Ionescu <sionescu <at> cddr.org>
Date:   Thu Oct 22 14:46:53 2009 +0200

    Fix typos.

 src/syscalls/unix-syscall-path-strings.lisp |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

commit 85d0c5ea877db9bbcbc679c912ea551ad1031398
Author: Stelian Ionescu <sionescu <at> cddr.org>
Date:   Mon Oct 19 16:44:23 2009 +0200

    Whitespace.

 src/iolib.os.asd |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

commit e0f7110bfc785654882e609b32e8366c6a2f5039
Author: Stelian Ionescu <sionescu <at> cddr.org>
Date:   Sun Oct 18 04:36:53 2009 +0200

    Pass CONSTANTP the current environment when using it in a macro.

 src/sockets/common.lisp         |    4 ++--
 src/sockets/make-socket.lisp    |   10 ++++++----
 src/sockets/socket-methods.lisp |    8 ++++----
 3 files changed, 12 insertions(+), 10 deletions(-)

commit 08b41f407b167824b7453b52cabdf684072b4703
Author: Stelian Ionescu <sionescu <at> cddr.org>
Date:   Sun Oct 18 03:10:50 2009 +0200

    Update my email address in the LICENCE file.

 LICENCE |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

An updated tarball of IOLib's source can be downloaded here:
http://common-lisp.net/project/iolib/files/snapshots/iolib-20091022.tar.gz

Gitweb URL:
http://repo.or.cz/w/iolib.git

