Recently we have noticed RabbitMQ crashing because the server runs out of memory. Memory usage looks fine for a while, then starts to climb until all RAM is consumed, at which point RabbitMQ fails. I thought it might be a RabbitMQ issue, but we have just hit this scenario again and this time I noticed one of the tasks using 100% of the CPU. I killed it and memory usage went back to normal.
Looking at the logs of that task, we see failures coming from librabbitmq as it tries to reconnect and fails. Is it possible that each such failure leaves some RAM unreleased, until there's nothing left for poor ol' rabbit?
[2013-06-16 23:02:28,472: CRITICAL/MainProcess] Couldn't ack 209L, reason:ConnectionError('Operation on closed connection',)
Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/kombu/transport/base.py", line 100, in ack_log_error
    self.ack()
  File "/usr/local/lib/python2.6/dist-packages/kombu/transport/base.py", line 95, in ack
    self.channel.basic_ack(self.delivery_tag)
  File "/usr/local/lib/python2.6/dist-packages/librabbitmq/__init__.py", line 86, in basic_ack
    delivery_tag, multiple)
ConnectionError: Operation on closed connection
[2013-06-16 23:02:28,472: CRITICAL/MainProcess] Couldn't ack 210L, reason:ConnectionError('Operation on closed connection',)
Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/kombu/transport/base.py", line 100, in ack_log_error
    self.ack()
  File "/usr/local/lib/python2.6/dist-packages/kombu/transport/base.py", line 95, in ack
    self.channel.basic_ack(self.delivery_tag)
  File "/usr/local/lib/python2.6/dist-packages/librabbitmq/__init__.py", line 86, in basic_ack
    delivery_tag, multiple)
ConnectionError: Operation on closed connection
[2013-06-17 00:03:04,767: ERROR/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/celery/worker/consumer.py", line 395, in start
    self.consume_messages()
  File "/usr/local/lib/python2.6/dist-packages/celery/worker/consumer.py", line 486, in consume_messages
    handlermap[fileno](fileno, event)
  File "/usr/local/lib/python2.6/dist-packages/kombu/connection.py", line 291, in drain_nowait
    self.drain_events(timeout=0)
  File "/usr/local/lib/python2.6/dist-packages/librabbitmq/__init__.py", line 198, in drain_events
    self._basic_recv(timeout)
ChannelError: Bad frame read
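
To check whether it is really the worker process (and not RabbitMQ itself) whose memory keeps growing after these errors, something like the rough sketch below could be left running against the worker's PID. It is not part of Celery or librabbitmq. The helper names (`parse_rss_kb`, `read_rss_kb`, `is_steadily_growing`, `watch`) are made up for illustration, and it assumes a Linux host where `/proc/<pid>/status` exposes a `VmRSS` line.

```python
import re
import time


def parse_rss_kb(status_text):
    """Return the VmRSS value in kB from the text of /proc/<pid>/status, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid):
    """Read the current resident set size of a process (Linux only, assumption)."""
    with open("/proc/%d/status" % pid) as f:
        return parse_rss_kb(f.read())


def is_steadily_growing(samples, min_growth_kb=0):
    """True if every sample exceeds the previous one by more than min_growth_kb."""
    if len(samples) < 2:
        return False
    return all(b - a > min_growth_kb for a, b in zip(samples, samples[1:]))


def watch(pid, interval=60, count=10):
    """Sample the RSS of `pid` every `interval` seconds, `count` times."""
    samples = []
    for _ in range(count):
        rss = read_rss_kb(pid)
        samples.append(rss)
        print("pid %d rss=%s kB" % (pid, rss))
        time.sleep(interval)
    return samples
```

Calling `watch(worker_pid)` and passing the result to `is_steadily_growing` would show whether the worker's resident memory rises on every sample. Lining the timestamps up with the "Couldn't ack" and "Connection to broker lost" entries above might then confirm or rule out the link I'm guessing at.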