I have been using SQLObject with Webware for Python on my production website since 2007.
For about a year now (maybe more...), the Python process has occasionally frozen completely under high load, roughly once every week to a month.
I upgraded to SQLObject 1.3.2 with the latest MySQLdb, and the problem is still there.
All signals are ignored; the process can only be killed with kill -9.
strace shows the Python process stuck in a futex system call.
Recently, I attached gdb to the frozen Python process: all threads are waiting on a semaphore, except one that is probably waiting for a MySQL response inside the _mysql.so extension of python-mysql.
Unfortunately, the gdb pystack macro doesn't work for me, so I can't get the Python stack. gdb seems to use all the CPU for a very long time, and I haven't waited long enough for it to finish...
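As a possible alternative to the gdb pystack macro, the Python-level stacks can be collected from inside the process with sys._current_frames(). The following is only a sketch under assumptions: format_thread_stacks and start_stack_dumper are hypothetical helper names, and the dumper thread needs the GIL to run, so once the process is fully frozen the file would only hold the last dump written before the freeze.

```python
import sys
import threading
import time
import traceback


def format_thread_stacks():
    """Return the Python-level stack of every running thread as one string."""
    # Map thread ids to names so the dump is easier to read.
    names = dict((t.ident, t.name) for t in threading.enumerate())
    chunks = []
    for thread_id, frame in sys._current_frames().items():
        chunks.append("Thread %s (%s):" % (thread_id, names.get(thread_id, "?")))
        chunks.append("".join(traceback.format_stack(frame)))
    return "\n".join(chunks)


def start_stack_dumper(path, interval=60.0):
    """Hypothetical helper: rewrite `path` with all thread stacks every `interval` seconds."""
    def loop():
        while True:
            with open(path, "w") as f:
                f.write(format_thread_stacks())
            time.sleep(interval)

    t = threading.Thread(target=loop, name="stack-dumper")
    t.daemon = True  # do not keep the process alive just for the dumper
    t.start()
    return t
```

Calling start_stack_dumper("/tmp/stacks.txt") once at application start-up (the path is just an example) would leave a recent snapshot of what each thread was doing shortly before a freeze.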
Most threads are waiting on a semaphore, with a stack like the one below. I don't know whether it is the Python GIL or some other lock related to MySQL or SQLObject. The GIL would explain why signals have no effect.
#0 0x00110416 in __kernel_vsyscall ()
#1 0x005a8865 in sem_wait@@GLIBC_2.1 () from /lib/libpthread.so.0
#2 0x0067eafb in PyThread_acquire_lock (lock=0x8dc7028, waitflag=1) at Python/thread_pthread.h:349
#3 0x00682eb8 in lock_PyThread_acquire_lock (self=0x854c430, args=0xb7f3102c) at Modules/threadmodule.c:46
#4 0x0060d7ed in PyCFunction_Call (func=0x8eeb0ec, arg=0xb7f3102c, kw=0x0) at Objects/methodobject.c:108
#5 0x0065ac72 in PyEval_EvalFrameEx (f=0x93d7ff4, throwflag=0) at Python/ceval.c:3564
#6 0x00659fcd in PyEval_EvalFrameEx (f=0x8e2e53c, throwflag=0) at Python/ceval.c:3650
#7 0x0065b6bf in PyEval_EvalCodeEx (co=0x8922410, globals=0x891e3e4, locals=0x0, args=0x9cfa2dc, argcount=1, kws=0x9cfa2e0,
kwcount=0, defs=0x891af18, defcount=1, closure=0x0) at Python/ceval.c:2831
#8 0x00659844 in PyEval_EvalFrameEx (f=0x9cfa12c, throwflag=0) at Python/ceval.c:3660
#9 0x00659fcd in PyEval_EvalFrameEx (f=0x8dccfa4, throwflag=0) at Python/ceval.c:3650
#10 0x0065b6bf in PyEval_EvalCodeEx (co=0x87ad848, globals=0x87a924c, locals=0x0, args=0x8d43818, argcount=2, kws=0x0,
Except one thread (I still have to check all 25 threads...):
#0 0x00110416 in __kernel_vsyscall ()
#1 0x005a952b in read () from /lib/libpthread.so.0
#2 0x00295338 in vio_read () from /usr/lib/mysql/libmysqlclient_r.so.15
#3 0x002953ae in vio_read_buff () from /usr/lib/mysql/libmysqlclient_r.so.15
#4 0x002967ab in ?? () from /usr/lib/mysql/libmysqlclient_r.so.15
#5 0x00296b9b in my_net_read () from /usr/lib/mysql/libmysqlclient_r.so.15
#6 0x0028fe39 in cli_safe_read () from /usr/lib/mysql/libmysqlclient_r.so.15
#7 0x00290c35 in ?? () from /usr/lib/mysql/libmysqlclient_r.so.15
#8 0x0028f1e4 in mysql_real_query () from /usr/lib/mysql/libmysqlclient_r.so.15
#9 0x00243f53 in _mysql_ConnectionObject_query (self=0x941160c, args=0x978138c) at _mysql.c:2008
#10 0x0060d7ed in PyCFunction_Call (func=0x9c300cc, arg=0x978138c, kw=0x0) at Objects/methodobject.c:108
#11 0x0065ac72 in PyEval_EvalFrameEx (f=0x940d924, throwflag=0) at Python/ceval.c:3564
#12 0x00659fcd in PyEval_EvalFrameEx (f=0x9b12c64, throwflag=0) at Python/ceval.c:3650
#13 0x00659fcd in PyEval_EvalFrameEx (f=0x9b7cdf4, throwflag=0) at Python/ceval.c:3650
#14 0x0065b6bf in PyEval_EvalCodeEx (co=0x8d2ead0, globals=0x8d304f4, locals=0x0, args=0x9240fec, argcount=2, kws=0x9240ff4,
kwcount=0, defs=0x8d37358, defcount=1, closure=0x0) at Python/ceval.c:2831
Note: I also get some "ProgrammingError: Commands out of sync; you can't run this command now" errors from time to time.
This is why I suspect faulty connection management between my threads.
Could it be related to the fact that I'm using the sqlhub.processConnection feature of SQLObject?
Reading the code, I don't understand the call path from dbConnection to sqlhub.
How is the connection pool managed?
Is it thread safe?
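For reference, here is a minimal sketch of the one-connection-per-thread pattern these questions are about. It does not show how SQLObject manages its pool. sqlite3 from the standard library is used purely as a stand-in for MySQLdb, and get_connection, worker and run_workers are hypothetical names. The assumption behind it is that "Commands out of sync" can appear when two threads issue commands on the same MySQL connection, which a thread-local connection would avoid.

```python
import sqlite3
import threading

# Each thread sees its own attributes on this object.
_local = threading.local()


def get_connection(database=":memory:"):
    """Return a connection private to the calling thread, creating it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = sqlite3.connect(database)  # stand-in for MySQLdb.connect(...)
        _local.conn = conn
    return conn


def worker(results, index):
    """Run one query on this thread's own connection and store the value."""
    conn = get_connection()
    try:
        results[index] = conn.execute("SELECT 1").fetchone()[0]
    finally:
        conn.close()


def run_workers(n=4):
    """Start n threads that each use their own connection; return their results."""
    results = [None] * n
    threads = [threading.Thread(target=worker, args=(results, i)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With this pattern no connection object is ever touched by more than one thread, so no locking around queries is needed.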
Thanks in advance for your support