Avi Bab | 1 Jul 12:44 2010

deadlock in libevent-2.0.5-beta

 

Running on Linux with pthreads.

 

One thread (CBTcpProxyListenerThread below) adds bufferevents (with option BEV_OPT_THREADSAFE) to an event_base.

A second thread (CBTcpProxySenderThread) dispatches on the event_base.

 

bufferevents are removed from the event_base either by a third thread or by the CBTcpProxySenderThread by calling bufferevent_free (without calling bufferevent_disable first – is this a misuse?).

 

The deadlock happens under fairly high load: ~6000 bufferevents are added and removed per second. Each one is triggered for write ~10 times per second (which gives ~60,000 triggers per second).

 

Here’s the deadlock stack:

Thread 1 (CBTcpProxyListenerThread)
#0  0x00000030c1c0c758 in __lll_mutex_lock_wait () from /opt/breach-proxy/lib64/libpthread.so.0
#1  0x00000030c1c087fa in _L_mutex_lock_908 () from /opt/breach-proxy/lib64/libpthread.so.0
#2  0x00000030c1c08682 in pthread_mutex_lock () from /opt/breach-proxy/lib64/libpthread.so.0
#3  0x00002aaab13039c8 in evthread_posix_lock (mode=0, _lock=0x15f27b20) at evthread_pthread.c:71
#4  0x00002aaab10e0f05 in event_add (ev=0x15f27d58, tv=0x0) at event.c:1815
#5  0x00002aaab10ed356 in _bufferevent_add_event (ev=0x15f27d58, tv=0x15f27e40) at bufferevent.c:824
#6  0x00002aaab10edf3e in be_socket_enable (bufev=0x15f27cc0, event=4) at bufferevent_sock.c:548
#7  0x00002aaab10ec296 in bufferevent_enable (bufev=0x15f27cc0, event=4) at bufferevent.c:418
#8  0x00002aaaaaac398d in CBTcpProxySenderThread::AddEvents (this=0x15f27490, pxcn=@0x15e649d0) at sender_thread.cpp:56
#9  0x00002aaaaaac4a83 in CBTcpProxy::AddEvents (this=0x7fff061d3a70, pxcn=@0x15e649d0) at tcpproxy.cpp:465
#10 0x00002aaaaaabc920 in complete_pxcn_after_server_connect (fd=236, what=4, arg=0x15e649d0) at listener_thread.cpp:234
#11 0x00002aaab10dedca in event_process_active_single_queue (base=0x15c04650, activeq=0x15c066f0) at event.c:1232
#12 0x00002aaab10df35d in event_process_active (base=0x15c04650) at event.c:1290
#13 0x00002aaab10df98c in event_base_loop (base=0x15c04650, flags=0) at event.c:1483
#14 0x00002aaab10df3c6 in event_base_dispatch (event_base=0x15c04650) at event.c:1317
#15 0x00002aaaaaabdae3 in CBTcpProxyListenerThread::run (this=0x15c04510) at listener_thread.cpp:41

 

Thread 2 (CBTcpProxySenderThread)
#0  0x00000030c1c0c758 in __lll_mutex_lock_wait () from /opt/breach-proxy/lib64/libpthread.so.0
#1  0x00000030c1c087fa in _L_mutex_lock_908 () from /opt/breach-proxy/lib64/libpthread.so.0
#2  0x00000030c1c08682 in pthread_mutex_lock () from /opt/breach-proxy/lib64/libpthread.so.0
#3  0x00002aaab13039c8 in evthread_posix_lock (mode=0, _lock=0x15f27b20) at evthread_pthread.c:71
#4  0x00002aaab10e188e in event_del (ev=0x2aaabc0678a0) at event.c:2015
#5  0x00002aaab10ee095 in be_socket_destruct (bufev=0x2aaabc067890) at bufferevent_sock.c:581
#6  0x00002aaab10ec956 in _bufferevent_decref_and_unlock (bufev=0x2aaabc067890) at bufferevent.c:600
#7  0x00002aaab10ed85f in bufferevent_writecb (fd=321, event=4, arg=0x2aaabc067890) at bufferevent_sock.c:306
#8  0x00002aaab10df1a7 in event_persist_closure (base=0x15f275e0, ev=0x2aaabc067928) at event.c:1184
#9  0x00002aaab10ded6d in event_process_active_single_queue (base=0x15f275e0, activeq=0x15f27b00) at event.c:1227
#10 0x00002aaab10df35d in event_process_active (base=0x15f275e0) at event.c:1290
#11 0x00002aaab10df98c in event_base_loop (base=0x15f275e0, flags=0) at event.c:1483
#12 0x00002aaab10df3c6 in event_base_dispatch (event_base=0x15f275e0) at event.c:1317
#13 0x00002aaaaaac3ab2 in CBTcpProxySenderThread::run (this=0x15f27490) at sender_thread.cpp:35

 

Thanks,

Avi

 

xiaobing jiang | 1 Jul 17:31 2010

the usage of ev_pncalls in struct event

Hi all,
    What is ev_pncalls in struct event used for? Why not use ev_ncalls
directly?

Two questions:
1. In libevent 1.4.14 it seems to be used only in event_process_active(),
but in libevent 2 it is used in event_signal_closure(). Why?
2.         ev->ev_pncalls = &ncalls;  // from event_process_active()
Here, ncalls is a stack variable.

           if (ev->ev_ncalls && ev->ev_pncalls) {
               /* Abort loop */
               *ev->ev_pncalls = 0;     // ev->ev_pncalls points to a stack variable
           }
Will this write corrupt the stack?

thanks!
***********************************************************************
To unsubscribe, send an e-mail to majordomo <at> freehaven.net with
unsubscribe libevent-users    in the body.
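
The pattern asked about above can be shown in isolation. What follows is a minimal sketch, not libevent's code: struct toy_event, toy_run(), toy_del() and the two callbacks are hypothetical names invented for illustration. It shows a loop that publishes the address of its local counter so that a callback can abort the remaining iterations, and why the write through that pointer stays inside a live stack frame as long as the "calls pending" field is nonzero only while the loop is running.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct event: only the fields under discussion. */
struct toy_event {
    short ncalls;                    /* calls still pending; nonzero only inside toy_run() */
    short *pncalls;                  /* address of toy_run()'s local counter while it loops */
    void (*cb)(struct toy_event *);  /* user callback */
    int fired;                       /* how many times cb was invoked */
};

/* Plays the role of the quoted "Abort loop" snippet. */
static void toy_del(struct toy_event *ev)
{
    if (ev->ncalls && ev->pncalls) {
        /* Only reached while toy_run() is on the stack, so the target is live. */
        *ev->pncalls = 0;
    }
}

/* Run the callback n times unless a callback aborts the loop via toy_del(). */
static void toy_run(struct toy_event *ev, short n)
{
    short ncalls = n;                /* the stack variable in question */

    ev->pncalls = &ncalls;
    while (ncalls) {
        ncalls--;
        ev->ncalls = ncalls;
        if (ncalls == 0)
            ev->pncalls = NULL;      /* last call: nothing left to abort */
        ev->fired++;
        ev->cb(ev);
    }
    /* Assumption of this sketch: reset both fields so no stale pointer survives. */
    ev->ncalls = 0;
    ev->pncalls = NULL;
}

/* A callback that deletes its own event on the first invocation. */
static void cb_delete_self(struct toy_event *ev)
{
    toy_del(ev);
}

/* A callback that does nothing. */
static void cb_noop(struct toy_event *ev)
{
    (void)ev;
}
```

Running toy_run() with three pending calls and cb_delete_self stops after the first call; with cb_noop all three calls happen. Whether the real library resets the pointer on every exit path is not something this sketch establishes.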

Joachim Bauch | 2 Jul 16:09 2010

Re: deadlock in libevent-2.0.5-beta

Hi,

On 01.07.2010 12:44, Avi Bab wrote:
> Running on Linux with pthreads.
>
> One thread (CBTcpProxyListenerThread below) adds bufferevents (with
> option BEV_OPT_THREADSAFE) to an event_base.
>
> A second thread (CBTcpProxySenderThread) dispatches on the event_base.

have you tried creating your bufferevent with BEV_OPT_DEFER_CALLBACKS
and BEV_OPT_UNLOCK_CALLBACKS?

This unlocks the bufferevents while executing the callbacks and should
prevent the deadlock from happening.

Best regards,
   Joachim

Nick Mathewson | 2 Jul 19:48 2010

Re: deadlock in libevent-2.0.5-beta

On Thu, Jul 1, 2010 at 6:44 AM, Avi Bab <avib <at> breach.com> wrote:
>
>
> Running on Linux with pthreads.
>
>
>
> One thread (CBTcpProxyListenerThread below) adds bufferevents (with option
> BEV_OPT_THREADSAFE) to an event_base.
>
> A second thread (CBTcpProxySenderThread) dispatches on the event_base.
>
>
>
> bufferevents are removed from the event_base either by a third thread or by
> the CBTcpProxySenderThread by calling bufferevent_free (without calling
> bufferevent_disable first – is this a misuse?).
>
>
>
> The deadlock happens under fairly high load: ~6000 bufferevents are added and
> removed per second. Each one is triggered for write ~10 times per second
> (which gives ~60,000 triggers per second).

The stack traces look like they aren't the whole story.  It seems the
two threads you listed are both trying to acquire the lock for the
event base, and blocking on it.  But what's the stack of the thread
that's actually holding the lock?

-- 
Nick
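
One generic way to answer the "who is holding the lock?" question, sketched here under stated assumptions: wrap the mutex so that it records the last call site that acquired it, and use an error-checking pthread mutex so that a relock by the owner or an unlock by a non-owner returns an error instead of hanging. The names struct tracked_lock, tracked_init(), tracked_lock_at() and tracked_unlock() are hypothetical; this is plain pthreads and is not how libevent implements its locks.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* Hypothetical debugging wrapper around a pthread mutex. */
struct tracked_lock {
    pthread_mutex_t mu;
    const char *last_site;   /* label of the most recent successful acquirer */
};

/* Initialise as an error-checking mutex; returns 0 or an errno value. */
static int tracked_init(struct tracked_lock *l)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(&l->mu, &attr);
    pthread_mutexattr_destroy(&attr);
    l->last_site = NULL;
    return rc;
}

/* Acquire the lock and remember who took it.
 * Returns EDEADLK if the calling thread already holds it. */
static int tracked_lock_at(struct tracked_lock *l, const char *site)
{
    int rc = pthread_mutex_lock(&l->mu);
    if (rc == 0)
        l->last_site = site;  /* written while holding the lock */
    return rc;
}

/* Release the lock. Returns EPERM if the caller does not hold it.
 * last_site is left in place on purpose: in a hang it names the
 * acquirer that never released. */
static int tracked_unlock(struct tracked_lock *l)
{
    return pthread_mutex_unlock(&l->mu);
}
```

When the process hangs, printing last_site from a debugger names the code path that took the lock most recently, which is the stack the two blocked traces cannot show.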

Zhou Li | 3 Jul 08:10 2010

Re: deadlock in libevent-2.0.5-beta

I ran into this deadlock too. It happened under very high load, just as you said. I think the cause is that the call to write(th_notify_fd[1]) blocked (sorry, I don't remember exactly where th_notify_fd is written).


In event.c line 2597:

    /*
      This can't be right, can it?  We want writes to this socket to
      just succeed.
      evutil_make_socket_nonblocking(base->th_notify_fd[1]);
    */

When I uncommented the evutil_make_socket_nonblocking() call in this block, the deadlock disappeared.



Avi Bab | 4 Jul 10:29 2010

RE: deadlock in libevent-2.0.5-beta


Indeed, it seems that something, at some point, failed to release the lock.
At the time of the deadlock the third thread (the ReceiverThread) is dispatching on a different event_base.

This third thread does do some manipulation on bufferevents that are registered with the Sender's event_base:

void CBTcpProxy::ClosePXCN(BS_PXCN& pxcn, SIDE closing_peer) const
{
...
	struct bufferevent* passive_peer_be = (closing_peer == SIDE_CLIENT ? pxcn._write_server : pxcn._write_client);
	struct evbuffer* passive_out = bufferevent_get_output(passive_peer_be);
	bool delete_now = false;
	
	evbuffer_lock(passive_out);
	{
		if(evbuffer_get_length(passive_out) > 0)
		{
			bufferevent_setcb(passive_peer_be, NULL, &closing_cb, &event_cb, (void*)(&pxcn));		
		}
		else
		{
			bufferevent_disable(passive_peer_be, EV_READ|EV_WRITE);
			delete_now = true;
		}
	}
	evbuffer_unlock(passive_out);

	if(delete_now)
	{
		delete &pxcn; //destructor calls bufferevent_free.
	}
}

/* Will be called by the SenderThread when the output buffer is empty. */
void closing_cb(struct bufferevent *bev, void *ctx)
{
	PBS_PXCN p_pxcn = (PBS_PXCN)ctx;
	delete p_pxcn;
}

This is the only interaction with the third thread. 
I do not see a relation to the deadlock.

Thanks,
Avi


Avi Bab | 4 Jul 12:40 2010

RE: deadlock in libevent-2.0.5-beta

 

This made a great improvement – the deadlock still appears, but only at even higher loads.

 

Thanks,

Avi

 

 

From: owner-libevent-users <at> freehaven.net [mailto:owner-libevent-users <at> freehaven.net] On Behalf Of Zhou Li
Sent: Saturday, July 03, 2010 9:11 AM
To: libevent-users <at> freehaven.net
Subject: Re: [Libevent-users] deadlock in libevent-2.0.5-beta

 

I met such deadlock too. It happened under very high load just as you said. I think the cause is that the call write(th_notify_fd[1]) got blocked ( sorry I didn't remember the exact position of this call to write th_notify_fd).

 

In event.c line 2597:

 

    /*

      This can't be right, can it?  We want writes to this socket to

      just succeed.

      evutil_make_socket_nonblocking(base->th_notify_fd[1]);

    */

 

When I uncommented this block of code, the deadlock disappeared.

 

 


 

Avi Bab | 4 Jul 13:04 2010

RE: deadlock in libevent-2.0.5-beta

 

Prior to this modification the deadlock occurred when the thread that deletes events from the event_base was consuming ~100% CPU (each thread has a dedicated CPU).

Now it happens when the thread that adds events to the event_base consumes ~100% CPU.

 

Avi

 

 


 

Avi Bab | 4 Jul 13:05 2010

RE: deadlock in libevent-2.0.5-beta

Apologies – my previous message seems to be incorrect.

 


 

Nick Mathewson | 4 Jul 15:43 2010

Re: deadlock in libevent-2.0.5-beta

On Sat, Jul 3, 2010 at 2:10 AM, Zhou Li <echou327 <at> gmail.com> wrote:
> I met such deadlock too. It happened under very high load just as you said.
> I think the cause is that the call write(th_notify_fd[1]) got blocked (
> sorry I didn't remember the exact position of this call to write
> th_notify_fd).
> In event.c line 2597:
>     /*
>       This can't be right, can it?  We want writes to this socket to
>       just succeed.
>       evutil_make_socket_nonblocking(base->th_notify_fd[1]);
>     */
> When I uncommented this block of code, the deadlock disappeared.
>

This change isn't correct, though.  th_notify_fd[1] is used to tell
the main thread (the one running event_base_loop) to wake up.  The
code that writes to it doesn't check for EAGAIN, so making that
socket nonblocking means that some attempts to wake up the main thread
will just get lost.
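
To make the EAGAIN point concrete, here is a minimal sketch of what a nonblocking wakeup write has to handle. It uses an ordinary POSIX pipe rather than libevent's internals, and make_nonblocking(), notify_wakeup() and drain_wakeups() are hypothetical helpers, not functions from the library. The assumption behind it is that the reader drains every queued byte whenever it wakes, so a full pipe means a wakeup is already pending and EAGAIN can be treated as success rather than ignored blindly.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Put fd into nonblocking mode; returns 0 on success, -1 on failure. */
static int make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Ask the loop thread to wake up without ever blocking the caller.
 * Returns 0 if a wakeup is pending afterwards, -1 on a real error. */
static int notify_wakeup(int wfd)
{
    const char byte = 0;

    for (;;) {
        ssize_t n = write(wfd, &byte, 1);
        if (n == 1)
            return 0;                /* wakeup queued */
        if (n < 0 && errno == EINTR)
            continue;                /* interrupted by a signal: retry */
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return 0;                /* pipe full: unread wakeups already queued */
        return -1;                   /* anything else is a genuine failure */
    }
}

/* Loop-thread side: discard all queued wakeup bytes.
 * Returns the number of bytes that were read. */
static long drain_wakeups(int rfd)
{
    char buf[256];
    long total = 0;

    for (;;) {
        ssize_t n = read(rfd, buf, sizeof(buf));
        if (n > 0) {
            total += n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        break;                       /* EAGAIN (empty), EOF, or error */
    }
    return total;
}
```

With both ends of a pipe set nonblocking, notify_wakeup() can be called far more often than the pipe can buffer without the caller ever blocking, and a single drain_wakeups() call empties it. Whether that is an acceptable fix for the library itself depends on how its loop consumes the notification, which this thread does not settle.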