tinti | 2 Dec 2010 10:23

Celery crashed

Hi all,

what happened to my server yesterday?

Celeryd crashed and this is what has been logged:

2010-12-01 17:03:50,492 INFO Task mgmtapp.tasks.TaskMod_account[2eafc1ed-91b1-4eaa-a302-cc04bce96edf] succeeded in 1.19395208359s: None
2010-12-01 19:10:47,650 WARNING Traceback (most recent call last):
2010-12-01 19:10:47,933 WARNING File "/opt/app/manage.py", line 11, in <module>
2010-12-01 19:10:47,948 WARNING execute_manager(settings)
2010-12-01 19:10:47,953 WARNING File "/usr/lib/python2.6/site-packages/django/core/management/__init__.py", line 438, in execute_manager
2010-12-01 19:10:48,067 WARNING utility.execute()
2010-12-01 19:10:48,067 WARNING File "/usr/lib/python2.6/site-packages/django/core/management/__init__.py", line 379, in execute
2010-12-01 19:10:48,068 WARNING self.fetch_command(subcommand).run_from_argv(self.argv)
2010-12-01 19:10:48,068 WARNING File "/usr/lib/python2.6/site-packages/django/core/management/base.py", line 191, in run_from_argv
2010-12-01 19:10:48,118 WARNING self.execute(*args, **options.__dict__)
2010-12-01 19:10:48,127 WARNING File "/usr/lib/python2.6/site-packages/django/core/management/base.py", line 218, in execute
2010-12-01 19:10:48,133 WARNING output = self.handle(*args, **options)
2010-12-01 19:10:48,156 WARNING File "/usr/lib/python2.6/site-packages/djcelery/management/commands/celeryd.py", line 20, in handle
2010-12-01 19:10:48,215 WARNING worker.run(*args, **options)

Ask Solem | 2 Dec 2010 10:43

Re: Celery crashed


On Dec 2, 2010, at 10:23 AM, tinti wrote:

> Hi all,
> 
> what happened to my server yesterday?
> 
> Celeryd crashed and this is what has been logged:
> 
> [...]
> 2010-12-01 19:10:48,728 WARNING (class_id, method_id))
> 2010-12-01 19:10:48,735 WARNING amqplib.client_0_8.exceptions
> 2010-12-01 19:10:48,735 WARNING .
> 2010-12-01 19:10:48,735 WARNING AMQPChannelException
> 2010-12-01 19:10:48,735 WARNING :
> 2010-12-01 19:10:48,736 WARNING (404, u"NOT_FOUND - no exchange 'celerycrq' in vhost '/'", (60, 40), 'Channel.basic_publish')
> 2010-12-01 19:10:51,220 INFO Celerybeat: Shutting down...
> 2010-12-01 22:06:51,596 WARNING celery@ldap1 v2.1.3 is starting.
> 
> [...]
> 

That is a mystery, as the exchange should have been synchronously declared just before.
Do you have a RabbitMQ cluster? What version of RabbitMQ?

Still, it shouldn't have crashed just because it couldn't send a broadcast command reply.
I will provide a patch for that in 2.1.4.


Ask Solem | 2 Dec 2010 13:25

Eventlet support in master and master branch frozen.

People,

The master branch is now frozen, and no new features will be added to 2.2.0.
This will be a great release, so be sure to try it out as soon as you can so we can
get it tested enough to release.

Eventlet pool support is in too, and it's not just running in another thread or process:
it runs directly in the consumer process.

A -P, --pool argument to celeryd has been added, and you can currently
choose between processes, eventlet and gevent.

See examples/eventlet:
    https://github.com/ask/celery/tree/master/examples/eventlet/

Small video of celeryd + eventlet making 100 async web requests:
    http://bit.ly/euPmiH

Note: gevent support may not be working correctly. It's not as complete as eventlet support
yet, but it should be a small matter to complete.

Hunt bugs!

-- 
{Ask Solem,
 +47 98435213 | twitter.com/asksol }.

Joaquin Cuenca Abela | 2 Dec 2010 15:24

Celery tasks got stuck

Hi,

I'm running several celeryd processes like:

$ celeryd -Q parse-rss -c 6 --time-limit=60 --soft-time-limit=55
$ celeryd -Q fetch-page -c 6 --time-limit=60 --soft-time-limit=55
[...]

The 6 celeryd processes for fetch-page consume tasks for 1 or 2 hours,
but then they block. In my fetch-page logs, just before they get stuck
I get:

http://pastebin.com/eyQLmSNK

I'm using:

$ celeryd --version
2.1.1

Any ideas on what may be going wrong?

When I check with ps, all these celeryd processes look normal (in the S state).

-- 
Joaquin Cuenca Abela

dmitry b | 2 Dec 2010 20:55

queue size and rabbitmq memory consumption

Hi,

We are using celeryd/rabbitmq combo to process massive amounts of
data.  At times we might be queuing tens of millions of tasks.  Does
anyone know if the system is robust enough to handle such load or
should we throttle the rate at which we schedule tasks (I'd rather not
as we have more than a single source of tasks and I'd hate to have to
put throttling everywhere)?  We do see rabbitmq using massive amounts
of memory, which leads to the question: does it keep the entire contents of
its queues in memory?

Thanks
Dmitry

Harel Malka | 2 Dec 2010 21:09

Re: queue size and rabbitmq memory consumption

Upgrade to the latest rabbit version, which has a new persister. The
pre 2.x versions had a broken persister which caused me grief on more
than one occasion. It seemed that Rabbit was limited by the amount of
RAM on the machine.
The new rabbits use disk to persist messages, which is more sensible.
Disk space is cheap and plentiful.
Harel

On Thu, Dec 2, 2010 at 7:55 PM, dmitry b <dmitry.maven@...> wrote:
> Hi,
>
> We are using celeryd/rabbitmq combo to process massive amounts of
> data.  At times we might be queuing tens of millions of tasks.  Does
> anyone know if the system is robust enough to handle such load or
> should we throttle the rate at which we schedule tasks (I'd rather not
> as we have more than a single source of tasks and I'd hate to have to
> put throttling everywhere)?  We do see rabbitmq using massive amounts
> of memory, which leads to the question: does it keep the entire contents of
> its queues in memory?
>
>
> Thanks
> Dmitry
>

dmitry b | 2 Dec 2010 22:28

Re: queue size and rabbitmq memory consumption

We are already running 2.1.1.  So tens of millions of tasks on a queue
shouldn't present a problem?  The reason I ask is that we are
experiencing system instability manifested by the kernel killing
processes when hitting an out-of-memory condition.  So before we start
digging for the cause, I'd like to rule out a huge queue as the most
likely culprit.

Thanks
D.

On Dec 2, 12:09 pm, Harel Malka <ha...@...> wrote:
> Upgrade to the latest rabbit version which has a new persister. The
> pre 2.x versions had a broken persister which caused me grief on more
> than one occasion. Rabbit was limited by the amount of ram on the
> machine it seemed like.
> The new rabbits use disk to persist messages which is more sensible.
> space is cheap and plentiful.
> Harel
>
> On Thu, Dec 2, 2010 at 7:55 PM, dmitry b <dmitry.ma...@...> wrote:
> > Hi,
>
> > We are using celeryd/rabbitmq combo to process massive amounts of
> > data.  At times we might be queuing tens of millions of tasks.  Does

Harel Malka | 3 Dec 2010 00:12

Re: Re: queue size and rabbitmq memory consumption

I ran big queues, but not THAT big. I did run into serious problems in
pre 2.x like I said, but those were resolved with the introduction of
the new persister and rabbit not being bound by RAM. From experience,
the rabbitmq guys are happy to help in their forums.

Regardless of the problem, I think if you reach that amount of
messages, that might be your cue to bring up a LOT more workers to
consume the queue...
The rabbit might get a tummy ache ;o)

On Thu, Dec 2, 2010 at 9:28 PM, dmitry b <dmitry.maven@...> wrote:
> we are already running 2.1.1.  So tens of millions of tasks on a queue
> shouldn't present a problem?  The reason I ask is that we are
> experiencing system instability manifested by the kernel killing
> processes when hitting an out-of-memory condition.  So before we start
> digging for the cause, I'd like to rule out a huge queue as the most
> likely culprit.
>
> Thanks
> D.
>
> On Dec 2, 12:09 pm, Harel Malka <ha...@...> wrote:
>> Upgrade to the latest rabbit version which has a new persister. The
>> pre 2.x versions had a broken persister which caused me grief on more
>> than one occasion. Rabbit was limited by the amount of ram on the
>> machine it seemed like.
>> The new rabbits use disk to persist messages which is more sensible.
>> space is cheap and plentiful.
>> Harel
>>

Ask Solem | 3 Dec 2010 00:26

Re: Re: queue size and rabbitmq memory consumption


On Dec 3, 2010, at 12:12 AM, Harel Malka wrote:

> I ran big queues, but not THAT big. I did run into serious problems in
> pre 2.x like I said, but those were resolved with the introduction of
> the new persister and rabbit not being bound by RAM. From experience,
> the rabbitmq guys are happy to help in their forums.
> 
> Regardless of the problem, I think if you reach that amount of
> messages, that might be your cue to bring up a LOT more workers to
> consume the queue...
> The rabbit might get a tummy ache ;o)

Also, are you sure you don't have lots of uncollected results and events?

See:

rabbitmqctl list_queues -p $your_vhost name messages consumers memory
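
To make the output of that command easier to scan, here is a minimal sketch that parses it and sorts the queues by memory use. It assumes the columns come back tab-separated in the order requested (name, messages, consumers, memory) and that banner lines such as "Listing queues ..." can be skipped. The function name and the sample data are made up for illustration and do not come from anyone's real broker.

```python
def parse_list_queues(output):
    """Parse `rabbitmqctl list_queues name messages consumers memory` output.

    Returns a list of dicts sorted by memory use, largest first.
    Assumes tab-separated columns in the order given on the command line.
    """
    rows = []
    for line in output.splitlines():
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # skip banner/footer lines such as "Listing queues ..."
        name, messages, consumers, memory = fields
        try:
            rows.append({
                "name": name,
                "messages": int(messages),
                "consumers": int(consumers),
                "memory": int(memory),
            })
        except ValueError:
            continue  # skip rows whose numeric columns do not parse
    return sorted(rows, key=lambda row: row["memory"], reverse=True)


# Hypothetical sample output, for illustration only.
SAMPLE = (
    "Listing queues ...\n"
    "celery\t120\t6\t250000\n"
    "celeryevent\t90000\t0\t48000000\n"
    "...done.\n"
)

if __name__ == "__main__":
    for row in parse_list_queues(SAMPLE):
        print(row["name"], row["messages"], row["consumers"], row["memory"])
```

A queue with many messages and zero consumers near the top of this list would be a candidate for the uncollected results or events mentioned above.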

Results can be disabled, and events in 2.2 will be transient and non-durable so you don't
have to collect them.
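
As a rough sketch of disabling results globally, assuming the Celery 2.x setting name CELERY_IGNORE_RESULT (check the configuration reference for your version before relying on it):

```python
# celeryconfig.py, or Django's settings.py when using django-celery.
# Assumption: the 2.x-era setting name; newer releases may spell it differently.

# Do not store task return values, so no result messages pile up on the broker.
CELERY_IGNORE_RESULT = True
```

With this set, task return values are not stored, so only enable it if nothing waits on the results.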

-- 
{Ask Solem,
 +47 98435213 | twitter.com/asksol }.

Ask Solem | 4 Dec 2010 08:51

Re: Celery tasks got stuck


On Dec 2, 3:24 pm, Joaquin Cuenca Abela <joaq...@...>
wrote:
> Hi,
>
> I'm running several celeryd processes like:
>
> $ celeryd -Q parse-rss -c 6 --time-limit=60 --soft-time-limit=55
> $ celeryd -Q fetch-page -c 6 --time-limit=60 --soft-time-limit=55
> [...]
>
> The 6 celeryd processes for fetch-page consume tasks for 1 or 2 hours,
> but then they block. In my fetch-page logs, just before they get stuck
> I get:
>
> http://pastebin.com/eyQLmSNK
>
> I'm using:
>
> $ celeryd --version
> 2.1.1
>
> Any ideas on what may be going wrong?
>
> when I check with ps, all these celeryd tasks look normal (in the S state).
>

Could you please upgrade to the latest version (2.1.4)?
A bug related to this was fixed recently.
