Phill | 10 Jun 17:21 2014

Limiting concurrency of grouped tasks to a subset of the available worker pool

'Ello!

Here's the thing: say I have a worker pool with 30 workers in it. 
Sometimes I want to run a group of tasks but limit the concurrency 
of that group to some smaller number, say no more than 5 of them 
running at the same time. I'm wondering if there's a handy pattern 
for doing that which I've overlooked.

First off, I know this is poor form. Ideally my worker pools are 
tuned for concurrency and my tasks are all dissolved into nice little 
chunks that can be run as fast as the pool can handle them. However, 
in practice from time to time I run into challenges where I have tasks 
that strain some external resource like a database or remote API 
connection and I can't afford to run them at a concurrency that's as 
high as some of the other tasks in the pool. Additionally, I'm trying 
to resist spinning up lots of new specialized worker pools that will 
end up sitting idle most of the time.
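
The best I can come up with inside the existing pool is faking it 
with chunks, which at least caps the number of worker slots the job 
can occupy, though each chunk then works through its items serially. 
Rough sketch, with made-up app and task names:

import math
from celery import Celery

app = Celery('tasks', broker='amqp://')

@app.task
def fetch(item):
    pass  # stand-in for my real resource-heavy task

def run_capped(items, max_parallel=5):
    # Split the calls into `max_parallel` chunks; each chunk runs its
    # items serially inside one worker slot, so at most `max_parallel`
    # of the 30 workers are ever busy with this job.
    per_chunk = int(math.ceil(len(items) / float(max_parallel)))
    return fetch.chunks([(i,) for i in items], per_chunk)()

But that fixes the batching up front, so one slow item drags out its 
whole chunk.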

Is this a problem anyone else has tackled? Curious if there's a 
convenient way to pull this off.

Thanks in advance,
Phill

Jonathan Lefman | 10 Jun 04:39 2014

Limit rate of tasks to queue

Hi. I am using a SaaS RabbitMQ service that limits the number of messages that can be put into the queue per second. I have many tasks to enqueue; more than can be put into the queue per second. What is the easiest and most reliable way to put tasks into the queue through a rate limiter?
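
The naive thing I can come up with is a publish-side throttle that 
sleeps out the remainder of each second, along these lines (just a 
sketch, assuming I already have the task signatures in hand):

import time
from itertools import islice

def enqueue_throttled(signatures, max_per_second):
    # Send at most `max_per_second` messages, then sleep through the
    # rest of the second before sending the next batch.
    it = iter(signatures)
    while True:
        batch = list(islice(it, max_per_second))
        if not batch:
            break
        started = time.time()
        for sig in batch:
            sig.apply_async()
        remaining = 1.0 - (time.time() - started)
        if remaining > 0:
            time.sleep(remaining)

That only works if everything is published from one place, though, so 
I am hoping there is something more robust.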

Thank you.

Devang | 10 Jun 04:01 2014

Using celery worker --detach option in production for Django

Hi,

In the documentation, it is mentioned that
"In a production environment you will want to run the worker in the background as a daemon - see Running the worker as a daemon -..."

However, it looks like there is a newer and easier way to daemonize the workers using the celery worker --detach option.
So from 3.1 onwards, can we skip celeryd, celery.service, supervisord, etc. for launching celery workers and simply call the celery worker command?
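
Concretely, is something along these lines considered production-ready 
on its own (app name and file paths are just examples)?

celery worker -A proj --detach --loglevel=INFO \
    --logfile=/var/log/celery/worker.log --pidfile=/var/run/celery/worker.pid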

I am trying to deploy Celery on AWS Elastic Beanstalk, so it would seem that using a simple celery worker command would be much easier than juggling custom AMIs and supervisord scripts :)

Thanks,
Devang

Joseph Wenninger | 9 Jun 09:25 2014

rabbitmq queue cleanup?

Hi!

Is there a simple way to clean up rabbitmq result queues?

I know there is an expiry timeout, but I know that my task has finished and I already have its result after querying it, so I do not need it to stay in the queue until some arbitrary timeout.

I thought AsyncResult.forget() would be good for me, but it is not implemented for the rabbit backend.
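
To illustrate the pattern I am after (this is what I would like to do; 
it just is not available on the amqp backend):

result = my_task.delay(42)   # my_task stands in for one of my real tasks
value = result.get()         # at this point I have everything I need
result.forget()              # ...so the result queue could be dropped now

Shortening the expiry would help, but that is still a timeout rather 
than an explicit cleanup.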

I'm using djcelery, celery 3.1.11 and rabbitmq-server 3.1.5-1 (debian)

What is the best way to clean up results that are no longer needed?

Best regards
Joseph

Ofir Sudai | 8 Jun 11:35 2014

A chord of tasks with individual callbacks

Hi,

I know I can create a subtask with a callback in the following manner:

s.apply_async(link=callback.s())

Also, I know I can create a chord in the following manner:

chord(tasks, callback.subtask((self.request.id,)))()

But is it possible to have a callback after each task is finished and also one callback after all tasks are finished?
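
In other words, something with this shape (not sure whether linking 
callbacks onto the header tasks plays nicely with the chord's 
synchronization; task names are placeholders):

from celery import chord

header = []
for item in items:
    sig = child.s(item)
    sig.link(per_task_callback.s())  # fires after this one task finishes
    header.append(sig)

chord(header)(final_callback.s())    # fires once, after all header tasks finish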

Thank you, Ofir.

Lewis Franklin | 6 Jun 20:23 2014

Re: Re: Worker - Active vs. Reserved

I'm still experiencing this problem. I currently have a worker with a concurrency of 12 but only one active task and 11 reserved. I have tried running with and without "-Ofair" as well as with and without acks_late, but I keep having an inordinate number of tasks sitting in "reserved" when there is ample concurrency to handle them. Any help would be greatly appreciated.
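
For reference, I'm starting the worker roughly like this (app name is 
a placeholder):

celery worker -A proj -c 12 -Ofair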


On Fri, May 23, 2014 at 10:39 PM, Loic Duros <loic.duros-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
I had the same issue as you have, but I'm not using acks_late. In any case, I don't know how you run your worker, but you might want to start your command with the -Ofair option. It worked for me.
From the docs:
-O    Apply optimization profile. Supported: default, fair

http://celery.readthedocs.org/en/latest/_modules/celery/bin/worker.html


Andrew Druchenko | 5 Jun 13:52 2014

Unable to run chained tasks on a queue specified at runtime

Hi there,
here is what I'm trying to achieve using celery canvas:

def _workflow(self, sync_request, forced=True):
    forced_queue = 'io-tasks-high-priority'
    update_last_sync_task = OpportunitiesSync._update_last_sync_timestamp.si(self, sync_request.sync_id, now)
    fetch_opps_tasks = [FetcherAsyncFacade.fetch_for.si(f, user, sync_request.last_sync) for f in self._fetchers]
    check_results_task = check_results.s(sync_request.sync_id, "REQUIRE_SUCCESS_ANY")

    if forced:
        update_last_sync_task.set(queue=forced_queue)
        fetch_opps_tasks = group([t.set(queue=forced_queue) for t in fetch_opps_tasks]).set(queue=forced_queue)
        check_results_task.set(queue=forced_queue)

    workflow = chain(fetch_opps_tasks, check_results_task, update_last_sync_task)

    workflow.apply_async()



My celery config looks like following:


CELERY_BROKER_URL = amqp://guest:guest-TgDyIxuTZKFGs5pSqAvsZA@public.gmane.orgst.com
CELERY_BACKEND_URL = mongodb://

HA_POLICY = 'all'


LOW_PRIORITY = {'queue': 'low-priority', 'routing_key': 'low-priority'}
IO_TASKS_HIGH_PRIORITY = {'queue': 'io-tasks-high-priority', 'routing_key': 'io-tasks-high-priority'}
IO_TASKS = {'queue': 'io-tasks', 'routing_key': 'io-tasks'}


CELERY_DEFAULT_QUEUE = LOW_PRIORITY['queue']

CELERY_QUEUES = (
    Queue(LOW_PRIORITY['queue'], routing_key=LOW_PRIORITY['routing_key']),
    Queue(IO_TASKS_HIGH_PRIORITY['queue'], routing_key=IO_TASKS_HIGH_PRIORITY['routing_key']),
    Queue(IO_TASKS['queue'], routing_key=IO_TASKS['routing_key']),

)

CELERY_ROUTES = (
    {"FetcherAsyncFacade.fetch_for": IO_TASKS},
)


What I'm expecting is that all tasks in the above canvas will be run on the io-tasks-high-priority queue when needed.
However, what happens is the following:

1) The FetcherAsyncFacade.fetch_for tasks go into the IO_TASKS queue
2) The whole canvas seems to be going into the LOW_PRIORITY queue, which is the default
3) The IO_TASKS_HIGH_PRIORITY queue stays empty
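
For completeness, the only other variant I can think of is routing the 
whole chain at dispatch time instead of per signature, but I have not 
verified whether that wins over CELERY_ROUTES (untested):

if forced:
    workflow.apply_async(queue=forced_queue, routing_key=forced_queue)
else:
    workflow.apply_async()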

Any help would be appreciated. Thanks in advance!

Daniel Ampuero | 4 Jun 23:41 2014

Inspecting queues


I need to monitor queues on a broker; specifically, how many workers are listening on a queue. The Celery documentation says I can do this with:

rabbitmqctl list_queues name consumers
Is there any way to do this with API calls instead of the command line?
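
Ideally something along these lines against the RabbitMQ management 
HTTP API, if that is the sanctioned route (host, vhost, queue name and 
credentials are placeholders, and the management plugin would have to 
be enabled):

import requests

# Ask the management API for a single queue; '%2F' is the URL-encoded
# default vhost '/'. The response includes a 'consumers' count.
resp = requests.get('http://localhost:15672/api/queues/%2F/my-queue',
                    auth=('guest', 'guest'))
resp.raise_for_status()
print(resp.json()['consumers'])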

Thanks!

Murthy A | 4 Jun 16:49 2014

tasks are not completed when using supervisor

Hi everyone,

I am using celery for sending emails in the background.
Everything has worked fine until now.

I was trying to add 20+ tasks in a loop, and I am using supervisord to manage my worker.

THE PROBLEM is, only five tasks are completed and I am not sure where the other 15 tasks have gone.
Everything works fine if I am not using supervisord.

I am using Redis as the backend.


 

assafgordon | 4 Jun 00:40 2014

Detecting killed servers (AWS Spot Instances + Celery + RabbitMQ)

Hello,

I'm trying to use Celery+RabbitMQ to run analysis on Amazon AWS Spot Instances.
The issue with Spot Instances is that the entire instance can be shut down instantaneously and without notice (akin to someone pulling the power cord of a server).

For my application, such shutdown events are fine; each analysis chunk is idempotent and can be restarted without a problem.

What I'm trying to achieve is to set up Celery + RabbitMQ in such a way that a killed/lost task is properly detected and restarted.
It's even acceptable if the task is just marked as failed due to a server error; I'll add application-level logic to re-submit it.

The problem is, I'm not able to get Celery or RabbitMQ to properly detect such an event.

I've found only one thread on a similar topic:
https://groups.google.com/d/msg/celery-users/VcghhZZM_zI/p9yf0x18hcYJ
But I wasn't even able to get a FAILURE status.

I've tested it with the following "task" script:
====
from celery import Celery
from time import sleep

app = Celery('tasks', backend='amqp', broker='amqp://guest-iquZ65Jdg7V54TAoqtyWWQ@public.gmane.org//')

app.conf.update(
    BROKER_HEARTBEAT = 10,
    CELERY_ACKS_LATE = True,
    CELERYD_PREFETCH_MULTIPLIER = 1,
    CELERY_TRACK_STARTED = True,
    )

@app.task
def add(x, y):
    sleep(60)
    return x + y
===

And started it inside a virtual machine (emulating an AWS Spot instance) using:
===
celery worker -A tasks -l INFO -Q ec2
===

Then submitted a task and monitored it like so (from the host):
====
from tasks import add
from time import sleep,strftime

a = add.apply_async((1, 2), queue='ec2')

while True:
    print ("%s: %s = %s" % ( strftime("%H:%M:%S"), a.id, a.state ) )
    sleep(1)
====

Starting the "submit" script, I see:
====
18:35:57: eab3656b-2b48-4408-a8f6-1b4f45bd379f = STARTED
18:35:58: eab3656b-2b48-4408-a8f6-1b4f45bd379f = STARTED
18:35:59: eab3656b-2b48-4408-a8f6-1b4f45bd379f = STARTED
...
====
As expected.
The celery log in the virtual machine shows that the task has started.

I then kill the virtual machine running the task (simulating AWS Spot instance termination without any proper shutdown).

But the "monitor" script still shows that state as "STARTED" - the disappearance of the celery server was not detected at all (even after waiting several minutes).


Am I missing a configuration option?

Any suggestions appreciated,
 -gordon

Ofir Sudai | 2 Jun 17:31 2014

A question about chords...

Hello,

I have the following...

test.py:

@app.task(bind=True)
def parent(self, sizes):
    chord((child.s(size) for size in sizes), callback.s(sizes))()

@app.task(bind=True)
def child(self, size):
    print size
    return size

@app.task(bind=True)
def callback(self, sizes):
    print sizes, 'i am a callback'

config.py:

CELERY_ROUTES = {'test.parent': {'queue': 'test.parent'},
                 'test.child': {'queue': 'test.child'},
                 'test.callback': {'queue': 'test.callback'}}

and I'm running 3 workers, each configured to a different queue.

What I want is that the parent task will send a message for each size to the child queue, and when all of those messages have been handled successfully by the child worker, the callback will be called. Instead I'm getting an 'unlock_chord' task in a queue called 'celery'.

What am I doing wrong? Is it even possible to run a chord callback when the chord is created from inside a task? I could call .get(), but that would block the parent task and therefore the worker.
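
The only workaround I can think of is to also route Celery's internal 
unlock task, something like the following added to my routes (I'm not 
sure this is intended usage):

CELERY_ROUTES = {'test.parent': {'queue': 'test.parent'},
                 'test.child': {'queue': 'test.child'},
                 'test.callback': {'queue': 'test.callback'},
                 # route the chord synchronization task to a watched queue
                 'celery.chord_unlock': {'queue': 'test.callback'}}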

Thank you, Ofir.

