assafgordon | 18 Oct 03:31 2014

Celery with Amazon/AWS Spot-Instances

Hello,

I'd like to bump a previously discussed topic: using Celery with AWS spot instances (which can terminate abruptly, without any notification or clean shutdown).

Is anyone using such a setup, and can share tips and advice?
Previous discussions did not seem to reach a definitive conclusion:
  https://groups.google.com/d/msg/celery-users/VcghhZZM_zI/p9yf0x18hcYJ
  https://groups.google.com/d/msg/celery-users/CwOVXrRxE5s/pbXYbWFGZnUJ
  https://groups.google.com/d/msg/celery-users/QEeTLt9tl9Y/ewxFk04Xc9QJ

The last message is mine, but it wasn't answered.
It seems that using RabbitMQ with "late acks" is not sufficient (unless I missed something).
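
For context, this is roughly what I mean by "late acks" (a minimal sketch with Celery 3.x setting names; the broker host and task are placeholders, not my real setup):

from celery import Celery

app = Celery('proj', broker='amqp://guest@rabbit-host//')   # placeholder broker host
app.conf.CELERY_ACKS_LATE = True               # acknowledge only after the task completes
app.conf.CELERYD_PREFETCH_MULTIPLIER = 1       # don't prefetch extra work onto an instance that may vanish

@app.task(acks_late=True)                      # or enable it per task
def crunch(chunk_id):
    # the message stays unacknowledged while this runs, so if the spot instance
    # is terminated mid-task the broker can redeliver it to another worker
    pass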

Alternatively,
if others are using Celery with Amazon: how do you deploy it, and with which methods (only on-demand instances, or perhaps Elastic Beanstalk or EMR)?

Any comments will be appreciated.
Thanks!
 - Gordon


Mathieu Longtin | 17 Oct 22:45 2014

Delivery guarantees of queues

I couldn't find this information in the doc.

What kind of delivery guarantees are there on message delivery with a Redis backend? 

If a worker gets killed -9 while working on a job, will another worker pick it up?

(Yes, I know Redis can lose data, but my Redis server has been up for three months. In our case, workers tend to get killed more frequently, especially if they use too much memory.)
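
For reference, my current understanding (please correct me if it's wrong) is that redelivery with a Redis broker depends on late acks plus the transport's visibility timeout: the task must stay unacknowledged while it runs, and unacked messages are only restored after the timeout. A sketch of the settings I mean (Celery 3.x names):

# celeryconfig.py -- redelivery sketch for a Redis broker (Celery 3.x setting names)
BROKER_URL = 'redis://localhost:6379/0'
CELERY_ACKS_LATE = True                  # keep the message unacked until the task returns
BROKER_TRANSPORT_OPTIONS = {
    'visibility_timeout': 3600,          # seconds before an unacked message is redelivered
}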

Thanks

Mathieu Longtin | 17 Oct 17:38 2014

Auto-reloading of task module

Hi, 

I'm trying to start workers in auto reload mode, but they never reload. I tried this:

env CELERYD_FSNOTIFY=stat celery -A tasks worker --loglevel=info --autoreload

It didn't help. Am I missing something?
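
One thing I wasn't sure about (an assumption on my part, not a confirmed cause): does the autoreloader only watch task modules the worker has imported? If so, would listing them explicitly help, e.g.:

# celeryconfig.py -- sketch; assumes the tasks live in a module named "tasks"
CELERY_IMPORTS = ('tasks',)   # --autoreload watches imported task modules
# CELERYD_FSNOTIFY=stat only switches the watcher to stat() polling; it doesn't
# change which modules are monitored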

Mathieu Longtin | 17 Oct 17:37 2014

Tracking and retrying failed tasks

Hi,

I'm investigating Celery to replace Qless. I'm processing a few million tasks a day, so scale is important. Celery piqued my interest because it seems more mature and better maintained.

I don't care about a task's result unless it fails. If it fails, I want to keep the task around, see it in a dashboard, and re-submit it for retry once I fix the problem. I can do that easily with Qless.

One thing that I can't seem to figure out is how to get a list of failed tasks. I set up the tasks like this:

@app.task(store_errors_even_if_ignored=True, ignore_result=True)

Once a task fails, I can see it in the Redis result backend, but I don't see any way to force a retry, nor is there any part of the celery command to explore that.
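
To make the question concrete, what I'm hoping for is something like this sketch, which pulls failed task ids straight out of the result backend (assumptions: a Redis result backend, the default 'celery-task-meta-' key prefix, and CELERY_RESULT_SERIALIZER = 'json'; as far as I can tell the stored metadata has the state and exception but not the original args, so resubmitting would need the args recorded separately):

# sketch: list failed task ids from a Redis result backend
import json
import redis

r = redis.StrictRedis(host='localhost', port=6379, db=0)
failed = []
for key in r.keys('celery-task-meta-*'):
    meta = json.loads(r.get(key))
    if meta.get('status') == 'FAILURE':
        failed.append(key[len('celery-task-meta-'):])   # the key suffix is the task id

print('%d failed tasks' % len(failed))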

I looked at Flower, but Flower only sees tasks that failed on its watch, nothing that failed before it started. I'm guessing it wouldn't be able to keep 200K failed tasks in memory either.

Any pointers?

Thanks

phill | 16 Oct 14:59 2014

Confirming - It's okay to migrate from 3.0 to 3.1 on the existing backend with pending work in the queue?

I'm migrating from Celery 3.0 to 3.1, running on the Redis backend with pickle task serialization. I've done some testing where I just bring down the 3.0 workers (with jobs pending in the queue) and bring up 3.1, and things seem to work fine. It's hard to cover a large diversity of cases in my testing, though, so I also wanted confirmation that this is a reasonable way to migrate. The "What's New" guide is thorough and doesn't mention this, so I'm presuming it's fine, but I was looking for positive confirmation that it's advisable.

If it's not, I can try to drain the 3.0 queue entirely before upgrading, or attempt to migrate the tasks over, but that definitely complicates things, so I'm hoping this simple path is an acceptable one.
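
For what it's worth, one thing that seems worth pinning explicitly on the 3.1 side is the serializer configuration, since CELERY_ACCEPT_CONTENT is new in 3.1 (a sketch, not a confirmed migration requirement):

# celeryconfig.py -- serializer settings pinned for the upgrade (sketch)
CELERY_TASK_SERIALIZER = 'pickle'
CELERY_RESULT_SERIALIZER = 'pickle'
CELERY_ACCEPT_CONTENT = ['pickle']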

Gregory Taylor | 14 Oct 18:42 2014

Celery fails to reconnect to Redis after service disruptions

We've had service disruptions for our Redis broker the last two nights. One was network maintenance, the other was a software upgrade that caused Redis to refuse connections for about a minute. 

The thing that troubled us is that Celery seems to be very inconsistent about recovering from network/service issues when using Redis as a broker. This is very likely something configuration-related; we're just not sure where we've gone wrong. If we restart the Celery processes, everything goes back to normal. That puts us in a tough situation, in that Celery (as configured) requires manual intervention to recover.

Here are some version numbers:

django==1.7.1
celery==3.1.15
redis==2.10.3
hiredis==0.1.4
billiard==3.3.0.18
kombu==3.0.23

Here are my Python-land settings: https://gist.github.com/gtaylor/c61e9b4802b094d3aeb4

Here is my supervisor unit (with ansible Jinja2 template variables included): https://gist.github.com/gtaylor/798c370894998377741d

Our gunicorn app servers seem to recover automatically when Redis comes back up, but it seems 50/50 with celery. Sometimes we recover from minor disruptions just fine, other times we have to restart the celery workers. 
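
One knob that might be relevant (a sketch, not a confirmed diagnosis): by default the broker-connection retry loop gives up after a bounded number of attempts, which can be raised or made unlimited.

# settings sketch (Celery 3.1 names)
BROKER_CONNECTION_RETRY = True         # retry the broker connection if it drops (the default)
BROKER_CONNECTION_MAX_RETRIES = None   # None (or 0) keeps retrying instead of giving up after 100 attempts
BROKER_CONNECTION_TIMEOUT = 10         # seconds to wait for each connection attempt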

Any ideas would be appreciated!

phill | 13 Oct 21:17 2014

Do database connections really get closed after each task, or before?

I'm running Celery 3.0.x (working on a 3.1 upgrade though!). I'm seeing a number of sleeping database connections, which caused me to look into how connections are closed in djcelery. I found a bit of conflicting evidence. Ask says they're closed after task execution ( https://groups.google.com/forum/#!topic/celery-users/_oxdeICeU58 ), but it looks to me like it might be more accurate to say they're closed _before_ task execution: https://github.com/celery/django-celery/blob/master/djcelery/loaders.py#L113

Am I reading that correctly? We recently added more worker capacity, so is it plausible that the sleeping connections I'm seeing are from idle workers that won't close their connections until the next task runs?
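
One possible workaround, in case the pre-task cleanup is indeed what leaves idle workers holding connections: close the Django connection in a task_postrun handler as well (a hypothetical sketch, not djcelery's own behaviour).

# sketch: close the DB connection after every task, not just before the next one
from celery.signals import task_postrun
from django.db import connection

@task_postrun.connect
def close_db_connection(**kwargs):
    connection.close()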

Thanks in advance - Phill

Jacky Wang | 11 Oct 05:27 2014

Celery signal does not work well when specifying a sender

I tried to add a task_success signal to the task scan, but the handler (monitor) is not called after scan executes successfully. I don't know why.

The code is as follows (app and logger are defined elsewhere in the project):

import re

from celery import signals

@app.task()
def scan(text):
    pattern = r'.*\.mp3'
    logger.info('scan text %s', text)
    return re.search(pattern, text).group()

@signals.task_success.connect(sender=scan)
def monitor(sender, **kwargs):
    logger.info('task scan completed - %s', kwargs['result'])

I have inspected the internals of the signal module and found that it maintains a dict of <lookup key, receiver> entries, where the lookup key consists of (id(receiver), id(sender)).

Here are the related logs. The first block shows the signal module recording the receiver (here, monitor). Notice that the id of the sender is 62589472:

Worker-1 3304   Append receivers - look up key: <(62608664L, 62589472L)>,
sender: <<@task: proj.text.tasks.scan of WAETask:0x3d10828> <@task: proj.text.tasks.scan of WAETask:0x3bb0828>>,
receiver: <<weakref at 0000000003BB7548; to 'function' at 0000000003BB5518 (monitor)>>

The second block records the sender id during execution of celery.utils.dispatch.signal.Signal.send():
Worker-1 3304 - Sender is <<@task: proj.text.tasks.scan of WAETask:0x3b8ad68>>
Worker-1 3304 - Sender id is <62655008>
 
The sender ids are different: 62589472 and 62655008. It looks like there are two instances of the sender, 'proj.text.tasks.scan'.
Why does this happen?
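
One workaround sketch (adapted from the code above; it avoids depending on the sender's object identity) would be to connect without a sender and filter on the task name inside the handler:

# sketch: filter by task name instead of relying on sender identity
from celery import signals
from celery.utils.log import get_task_logger

logger = get_task_logger(__name__)

@signals.task_success.connect
def monitor(sender=None, result=None, **kwargs):
    if sender.name == 'proj.text.tasks.scan':
        logger.info('task scan completed - %s', result)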

Environment:
Version: v3.1.15
OS: Windows-7-6.1.7601-SP1
Python: 2.7

Configuration
transport:   amqp://guest:**@localhost:5679//
results:     mongodb://localhost:27017/
concurrency: 4 (prefork)

phill | 10 Oct 16:35 2014

Best Pattern for "One at a time" without running into big chain recursion error?

I've read a number of the threads on the large-chain-recursion issue ( https://github.com/celery/celery/issues/1078 )....

The pattern I run into sometimes is that I want to get a bunch of things done (hundreds or low thousands) and I either don't want to saturate a worker pool, or don't want to clobber some external resource the tasks will contend with. For instance, I may want to send a few thousand updated records to some external API which will rate-limit me if I get chatty, and which might be slow.

We started creating chains, but ran into the pickle recursion issue (which I understand is a problem even with json, because the worker pools will pickle to each other). I _don't_ need the results of these tasks to be published forward (and if I do, I can store their results somewhere to be picked up later), in case that helps.

I'm curious what pattern people use for this. I can create worker pools with a single worker and use groups, but then I end up with a proliferation of rarely used pools, which is no fun. I can chain the tasks manually (I used to do this before Canvas) and have one call the next as it finishes, but that's fragile (if any task fails, the chain breaks).

Are there good workarounds for the recursion error that might get me there? Can I create a chain of smaller chains and avoid this somehow?
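
One alternative pattern to a long chain (a sketch; the queue name and rate limit are made up): give the chatty task its own queue with a rate limit, enqueue the jobs independently, and consume that queue with a single-process worker so at most one runs at a time.

# sketch: a rate-limited task on a dedicated queue instead of one long chain
from celery import Celery

app = Celery('proj', broker='redis://localhost:6379/0')   # placeholder broker

@app.task(rate_limit='10/m')          # throttle calls to the external API
def push_record(record_id):
    pass                              # call the external API here

# enqueue independently; the single worker below drains the queue one task at a time
for record_id in range(1000):         # placeholder ids
    push_record.apply_async(args=[record_id], queue='external_api')

# worker side, one process so at most one task runs at a time:
#   celery -A proj worker -Q external_api -c 1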


Curious how others solve these kinds of problems.

Thanks in advance!

Phill

GianMario Mereu | 10 Oct 15:22 2014

strange behaviour in chain

Hello everybody,

I'm facing strange behavior with chain. I have defined two tasks:

@app.task
def download_province(volte=[1, 2], provincia='Lodi'):
    # ... do stuff ...
    return True


@app.task
def create_snapshots(provincia='Lodi'):
    # ... do stuff ...
    return True


then I create the chain in this way:

task_result = chain( 
    tasks.download_province.subtask(([1], 'Lodi'), immutable=True),
    tasks.create_snapshots.subtask(('Lodi'), immutable=True)
    )()

The first subtask (download_province) is executed correctly, while the second (create_snapshots) is not. Celery raises the following error:

TypeError: create_snapshots() takes at most 1 argument (4 given)

as if the subtask's signature were not immutable. Am I making some mistake that I don't see?
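
For comparison, here is the same chain written with the .si() shortcut and an explicit one-element tuple (note the trailing comma: a bare ('Lodi') is just the string 'Lodi', which unpacks into four single-character arguments, matching the "4 given" in the error):

task_result = chain(
    tasks.download_province.si([1], 'Lodi'),
    tasks.create_snapshots.subtask(('Lodi',), immutable=True),
)()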

thanks a lot
gmario

Tore Olsen | 10 Oct 14:09 2014

Keep getting "missed heartbeat"

Hi all,

My celery setup is functioning correctly (AFAICT) but the workers keep logging:
[2014-10-10 11:37:40,065: INFO/MainProcess] missed heartbeat from celery@apns.<my-ip>

This happens very sporadically; sometimes every few seconds, sometimes after a couple of minutes.

I'd like to understand what's actually causing this, and whether it's something I can ignore or if I should do something about it.

I have three workers and a beat scheduler configured via supervisord running like this:

celery -A cargame beat -n beat.%%h -l INFO

celery worker -A cargame -Q gcm -n gcm.%%h -l INFO

celery worker -A cargame -Q celery -n default.%%h -l INFO

celery worker -A cargame -Q apns -n apns.%%h -l INFO


The workers have very low load in the staging environment where I still get missed heartbeats. (Two of the workers are for push notifications.)


I'm using redis as a broker.
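
In case it's relevant (an assumption on my part, not a confirmed diagnosis): the missed-heartbeat messages come from the worker event/gossip system rather than from task processing, so if they turn out to be harmless noise they can be silenced per worker, for example:

celery worker -A cargame -Q apns -n apns.%%h -l INFO --without-gossip --without-heartbeat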

Any input appreciated.

Regards,
Tore

