richie | 28 Feb 15:38 2015

Force .save() for all Group() tasks in celery

I know I can force a group to be saved to the celery_tasksetmeta table by calling .save() on its GroupResult. For example:

r = group(...).apply_async()
r.save()  # force save to celery_tasksetmeta

However, if I have a complex workflow that includes nested groups along the way, I can't explicitly call .save() on them:

r = group([group(...), chain(...)]).apply_async()
r.save()  # only saves the outermost group to celery_tasksetmeta

Is there a way to force or configure Celery to always implicitly call .save() for every group, so that even the nested groups are written to the celery_tasksetmeta table?
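For what it's worth, one possible workaround is sketched below. This is an untested sketch, not a built-in Celery feature: it assumes the canvas has already been applied, so that nested GroupResult objects are reachable through .results and .children of the outer result.

from celery.result import GroupResult

def save_nested_group_results(result):
    """Walk a result graph and call .save() on every GroupResult found."""
    if isinstance(result, GroupResult):
        result.save()  # persists this group to the result backend (celery_tasksetmeta)
        for child in result.results:
            save_nested_group_results(child)
    else:
        for child in (getattr(result, 'children', None) or []):
            save_nested_group_results(child)

# usage (sketch):
# r = group([group(...), chain(...)]).apply_async()
# save_nested_group_results(r)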

Thanks.

Sanket Saurav | 26 Feb 22:10 2015

Celery tasks not getting registered sporadically

Hello!

I'm trying to achieve something fairly simple. I have a Django view which, on every POST request, triggers a Celery task. Weirdly, the task invocations are failing silently and sporadically: on two successive requests, the first task is registered and the second is not.
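For context, a minimal sketch of the pattern being described (the task, view, and payload names are illustrative, not taken from the original post):

from celery import shared_task
from django.http import HttpResponse

@shared_task
def process_submission(payload):
    ...  # the actual work the view hands off

def submit(request):
    if request.method == "POST":
        # fire-and-forget from the view; the worker should pick this up
        process_submission.delay(request.POST.dict())
    return HttpResponse("ok")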

Everything is working on dev (duh!); this happens only in prod. I'm running one Celery worker and using RabbitMQ as the broker.

More details here: http://stackoverflow.com/questions/28744719/celery-tasks-failing-randomly

Any help would be greatly appreciated! Thanks!

Lennon Pulda-Grealy | 24 Feb 17:39 2015

IPython for rdb

There is a great little module I use all the time called ipdb, which gives you a pdb shell backed by IPython: you can do set_trace() but with tab completion, syntax highlighting, etc.

rdb is essential when working with Celery, but after using IPython so much, the plain pdb shell feels crippled. Eventually I may hack together an IPython-enabled rdb based on ipdb's small source (https://github.com/gotcha/ipdb/blob/master/ipdb/__main__.py), but I was wondering if anyone has done this before and has code for it, or knows a reason why it can't work (does telnet limit anything? etc.)
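For reference, standard rdb usage (per the Celery documentation) looks like the snippet below and drops you into a plain Pdb session over telnet; the open question here is whether that breakpoint could hand out an IPython shell instead.

from celery import task
from celery.contrib import rdb

@task()
def add(x, y):
    result = x + y
    rdb.set_trace()  # blocks here; telnet into the host/port announced in the worker log
    return result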




Ami | 21 Feb 14:50 2015

Duplicate tasks

Hi,

I'm using the latest Celery with a Redis broker: one worker with concurrency set to one, one beat server, no Flower. A task is scheduled from within another task:

[2015-02-20 23:30:37,238: INFO/MainProcess] Received task: send_text[223ee195-f2d1-42e0-8f35-097eb70a2555] eta:[2015-02-21 01:30:37.232401+00:00]

One hour later I see this:

[2015-02-21 00:32:07,740: INFO/MainProcess] Received task: send_text[223ee195-f2d1-42e0-8f35-097eb70a2555] eta:[2015-02-21 01:30:37.232401+00:00]

[2015-02-21 00:32:07,740: WARNING/MainProcess] QoS: Disabled: prefetch_count exceeds 65535

Another hour goes by, and when the task is due to execute, there are TWO copies:

[2015-02-21 01:30:37,888: INFO/MainProcess] Task send_text[223ee195-f2d1-42e0-8f35-097eb70a2555] retry: Retry in 26972s: AppCardDelayTask()

[2015-02-21 01:30:37,934: INFO/MainProcess] Task send_text[223ee195-f2d1-42e0-8f35-097eb70a2555] retry: Retry in 28353s: AppCardDelayTask()

It actually gets worse afterwards: each copy of the same task reschedules its execution to a later time, and again each one is duplicated. It seems that when Celery starts everything works well, but after a while duplicates start to appear.

Any suggestions on how to address? Thanks much!
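One documented knob that often comes up with ETA/countdown tasks on the Redis transport (flagging this as an assumption, not a confirmed diagnosis of this report): if a task's ETA lies further in the future than the broker's visibility timeout, the still-unacknowledged message is redelivered and can show up more than once.

# Redis transport option from the Celery docs; the value should exceed the
# longest ETA/countdown you schedule (12 hours here is purely illustrative).
BROKER_TRANSPORT_OPTIONS = {'visibility_timeout': 43200}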

A


William Viana Soares | 19 Feb 18:39 2015

Celery workers become idle

Hi,


I have an issue with Celery workers becoming idle every few minutes even though there are still tasks ready in the queue. After approximately one minute the workers become active again without any intervention. I've tried disabling prefetching and still had the same issue. The RabbitMQ stats look fine, and the tasks I'm running are mostly I/O-bound. I'm using Celery 3.1.17 with RabbitMQ 3.2.4 as the broker and gevent as the worker pool. I'm running everything on AWS, and the workers run in a Docker container.

I have very little experience with Celery, so it might be a case of misconfiguration, but I have no idea what it might be. Any ideas?
--

-William-

Doug Snyder | 18 Feb 00:13 2015

Celery Event Monitor reports no tasks even though I can confirm that there are

I'm running Celery with RabbitMQ as the broker and using logging to see how things are going.
I've reached the point where I need Celery monitoring, so I tried the curses-based Celery event monitor.
It correctly shows the worker, and shows the number of events increasing as I run my tasks, but no information
about those events is displayed, and it claims there are no tasks even though my logging confirms they are
running. From the event-monitoring docs I gathered that you need the setting:
CELERY_SEND_TASK_SENT_EVENT = True
so I've set that in my config file. The config file is supposed to be on the Python path, which I believe it is: it's in my top
directory, which has an __init__.py and allows other imports from this module, so I'm assuming that is sufficient.
Can anybody tell me anything else that might cause the event monitor to ignore tasks?
The one-line docs for the event monitor are great until they don't work (they also fail to tell you the
config that's needed for it to work).
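A hedged checklist based on the monitoring docs (this may or may not be the cause here, and the app name below is illustrative): the sent-event setting only covers task-sent events; the worker itself also has to emit events, and the monitor has to be pointed at the same app and broker.

# celeryconfig.py
CELERY_SEND_TASK_SENT_EVENT = True   # emit an event when a task message is published

# The worker must also send events at all, e.g. start it with:
#     celery -A proj worker -l info -E
# or enable events on an already-running worker:
#     celery -A proj control enable_events
# then run the curses monitor against the same app/broker:
#     celery -A proj events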

Maxime Montinet | 16 Feb 17:20 2015

Ways of checking task health ?

Hi,

I'm wondering if there is any way in Celery itself to check, given an AsyncResult, whether the task is actually still running or whether the worker has crashed. I'm not sure this information is there at all (from a ping-pong between client and worker, perhaps?), or whether I'm just stuck using local timeouts and re-queueing a task if it is lost to a worker crash.

I'm asking because my use case involves a large number of workers that may go down unexpectedly fairly often, and the tasks are relatively time-sensitive, so waiting a minute or two before actively resubmitting a task is feasible but not ideal. Before attempting to use Celery, I had a "handmade" task-dispatching system in which workers sent status updates on their current task every 5 seconds, and the client knew to requeue a task when it hadn't received any update for 10 seconds or so. So, well, I'm not sure how I'd do that with Celery.
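For what it's worth, a sketch of one thing Celery itself can tell you, using the documented inspect API (hedged: this only reflects what live workers report, so a worker that died hard simply won't answer, and absence here is not proof the task is lost; the app and broker are illustrative):

from celery import Celery

app = Celery('proj', broker='amqp://')

def is_task_active(task_id):
    """Ask live workers whether any of them is currently executing task_id."""
    active = app.control.inspect(timeout=2).active() or {}
    return any(t['id'] == task_id
               for tasks in active.values()
               for t in tasks)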

Not running with Django or anything, by the way; just regular command-line Python.

Any ideas welcome!

Dimitris Theodorou | 15 Feb 22:13 2015

Long-running tasks and dependencies

Hi,

I've had difficulty addressing a particular situation with Celery, which involves long-running tasks and dependencies and which I would like to describe here. It would be great if this thread produces some discussion with something close to a solution in the end; all the threads I have found in this list that involve task dependencies have no clear resolution. A solution such as "celery is not the tool for this" would be acceptable as well.

Problem

I would like to model the following situation:

                              +----------+
               +------------> |  Task B1 |
               |              +----------+
               |
+----------+   |              +----------+             +-----------------+
|  Task A  | --+------------> |  Task B2 | <-----------+ Result Requests |
+----------+   |              +----------+             +-----------------+
               |
               |              +----------+
               +------------> |  Task B3 |
                              +----------+

Don't think of these tasks as celery tasks yet, but as a generic "workload" that needs to be executed, the result of which needs to become available.

A couple of B tasks depend on task A. The requests for the task results are only concerned with "B" tasks, not knowing anything about the existence of A.

All tasks are long-running and fairly "heavy", i.e. there is no luxury of running them a second time if they are already running. Once they finish, their result stays usable for an extended period without re-executing the task, and future callers need only fetch the (cached) result. If no cached result exists, the task is executed and stores its result, which is then used by subsequent requests. That means the desired properties of the system are:
  1. If a request for task A is made, the system needs to respond as follows:
    1. Task already running?
       Retrieve and return the existing task_id.
    2. Cached result found?
       Return the result.
    3. No cached result and task not running?
       Create a new task and return the new task_id.

  2. If a request for a B task is made, points 1 and 2 above apply, with these additions related to its task A dependency:
    3. Is task A running?
       Create a new B task and return the new task_id. The new B task will have to wait until A completes before executing its body.
    4. Is task A's result available?
       Create a new B task and return the new task_id.
    5. Is task A's result absent?
       Create a new B task and return its task_id. The B task will have to create a new task A and wait on it as well.

  3. It should be possible for task A to trigger execution of all its dependents when it finishes, i.e. to trigger execution of all B tasks. This should work seamlessly with any B tasks already waiting on A.

  4. (bonus) A and B tasks should report their progress. B tasks in particular should report the progress of the task A they are waiting on before proceeding to report their own progress.



Solution directions

I wasn't able to model the above just with the tools Celery provides (tasks, canvas, and result backend), at least not in a straightforward way. I could definitely use some feedback and suggestions.

I have come up with some ideas for a solution, although it's not complete:

1. Create my own persistence layer to store the "cached" task results, ids, etc. It associates task results with task keys (task A, task B1, etc.), task ids (celery task ids), and other metadata.

   +-----------+-------------+-------------+-------------+---------------+
   | task_key  | task_status | task_id     | task_result | result_status |
   +-----------+-------------+-------------+-------------+---------------+
   | task_A    | done        |             | *bytes*     | good          |
   | task_B1   | running     | b7282180-.. |             | waiting       |
   | task_B3   | done        |             | *bytes*     | stale         |
   +-----------+-------------+-------------+-------------+---------------+


2. This table is populated by the Celery tasks themselves. A Celery Task subclass lets each task know its "task_key"; when invoked, it finds itself in the table and reacts accordingly.
Say task A is invoked. It performs the following, within a distributed lock (a minimal sketch follows the list):
  1. Find the "task_A" entry in the table
  2. Conditional logic:
    1. if task_status is running:
      do nothing, we are already running. (alternate implementation: raise a retry)
    2. if task_status is done and result_status is stale:
      execute body of task A
    3. if task_status is done, result_status is good:
      return the result
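A minimal sketch of that check-then-run flow. The task_registry module and its lookup/lock/mark/store helpers are hypothetical stand-ins for the custom persistence layer behind the table above, not real Celery APIs.

from celery import Task

import task_registry  # hypothetical: the custom persistence layer behind the table above

class CachedTask(Task):
    abstract = True
    task_key = None  # e.g. 'task_A', set on each concrete subclass

    def __call__(self, *args, **kwargs):
        with task_registry.distributed_lock(self.task_key):
            entry = task_registry.lookup(self.task_key)
            if entry.task_status == 'running':
                return entry.task_id                      # already in flight
            if entry.task_status == 'done' and entry.result_status == 'good':
                return entry.task_result                  # fresh cached result
            task_registry.mark_running(self.task_key, self.request.id)
        result = super(CachedTask, self).__call__(*args, **kwargs)  # stale or missing: run the body
        task_registry.store_result(self.task_key, result)
        return result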

3. Small "result fetcher" celery tasks.

Long-running tasks (the B tasks) are not directly requested by the end user. There is a buffer of C tasks in between that will check and wait on a B task (really wait, with result.get()), or immediately return the B task's result if it is found. These need to run on a gevent worker (which I hope multiplexes the result.get(); otherwise scrap this). These tasks play well with web requests and write their result to the Celery backend, meant to be consumed once and removed. In contrast, the long-running tasks never write results to the Celery backend, but to the custom task backend, where their result stays around for a long time.
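A sketch of such a fetcher task (illustrative names again, and task_registry is the same hypothetical layer as above; note that Celery generally discourages blocking on another task's result from inside a task, which is exactly why these are meant to run on a gevent worker):

from celery import Celery

import task_registry  # hypothetical, as in the previous sketch

app = Celery('proj', broker='amqp://', backend='redis://')  # illustrative

@app.task
def fetch_result(task_key):
    entry = task_registry.lookup(task_key)       # custom backend holding long-lived results
    if entry.result_status == 'good':
        return entry.task_result                 # cached: return immediately
    # otherwise block on the long-running task via the regular Celery result backend
    return app.AsyncResult(entry.task_id).get(timeout=600)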

4. Dependencies

No solution here.


Do you think this is going in the right direction? Any feedback is welcome.

Amit Kumar Srivastav | 13 Feb 08:08 2015

failing update: objects in a capped ns cannot grow -- using mongod for celery backend

u'Task package.tasks.location_mismatch[377773f4-c63c-4c98-9db9-184c6596429e] INTERNAL ERROR: OperationFailure(u'failing update: objects in a capped ns cannot grow',)'

I'm getting this message raised by the Celery result backend. The collection is capped, so once the document is created we can't change any values.

In fact, I am not able to understand what exactly the backend is updating after creating the document.
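For background (hedged, and the database/collection names below are illustrative): the MongoDB result backend writes one document per task and then updates it as the task's state changes, e.g. writing the result and traceback on completion, which can grow the document; a capped collection forbids growth, hence the error. The documented backend settings let you point it at an ordinary, non-capped collection:

CELERY_RESULT_BACKEND = 'mongodb://localhost:27017/'
CELERY_MONGODB_BACKEND_SETTINGS = {
    'database': 'celery',
    'taskmeta_collection': 'celery_taskmeta_uncapped',  # must not be a capped collection
}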




Nelson Monterroso | 12 Feb 17:50 2015

Using Celery ETA keeps workers waiting

Hey all,
We've been using Celery to run ~12 million jobs/day for about a year now, but recently ran into an issue after we started using ETA as described in the docs. What we're seeing is that a worker receives the job but appears to wait for the ETA of the current task before executing any tasks that come after it, so the queue is essentially locked. Example:

[2015-02-12 16:36:18,538: INFO/MainProcess] Received task: task_name[some-unique-id] eta:[2015-02-18 09:32:44+00:00]

The worker's concurrency is filled with these kinds of jobs, so any jobs that are expected to complete immediately appear to be sitting in line, waiting for this task (which is six days away) to finish. Is this intended? We're running Celery 3.1.10. Thanks for any insights.
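Not an answer, just a hedged mitigation sketch (it is an assumption that separating the traffic helps here; the queue and task names are illustrative): route the far-future ETA jobs to their own queue and worker so immediate jobs never sit behind them.

from kombu import Queue

CELERY_QUEUES = (
    Queue('default'),
    Queue('scheduled'),     # far-future ETA jobs only
)
CELERY_ROUTES = {
    'task_name': {'queue': 'scheduled'},
}

# run dedicated workers, e.g.:
#     celery -A proj worker -Q default
#     celery -A proj worker -Q scheduled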

Mayuresh Pise | 8 Feb 04:58 2015

Multi threaded celery task class

I am using Celery version 3.1.17 (Cipater) and I am trying to accomplish the following: create a Celery task class with __init__ and run methods. __init__ will create n threads from another threading-based class, and a work queue will be shared between the Celery task class and those threads.

class threads(threading.Thread):
    def __init__(self, workQ): ...
    def run(self): ...

class celeryTask(Task):
    def __init__(self):
        # <.. create thread objects and share workQ with them ..>
    def run(self, somework):
        # <.. push work for the threads ..>

I am facing multiple issues here:

1. After creating the Celery task class, I have to add the following line at the top of the Celery worker module; without it I get an error that the celery attribute is not found, and further experiments only worsen the error messages.

celery = Celery('BaseWorker', backend='amqp', broker='amqp://')
  2. When I import this class in my main module, the Celery task's __init__ gets executed at the import statement, and it creates n threads in the main process too. I do not understand why it executes at import time.

    from BaseWorker import BaseWorker
    ..
    basewrkObj = BaseWorker()
    ..
    basewrkObj.delay()

  3. Even after putting a message into the queue, the worker threads for the Celery instance do not seem to pick up these messages, and hence no work gets done.

I have already checked this code, and it works in a standalone threading environment. I am trying to add distributed-computing functionality to it, and I'm using Celery for that. I could not find any viable example on the net showing how a class-based Celery task can be created and used, let alone a task class that has multiple threads.

It would be great if you could point me to the right example, or let me know if you need any more info. Thanks in advance.
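Since the post asks for an example, below is a hedged sketch of one way to structure a class-based task with worker threads (all names are illustrative, not the original code). The point it illustrates is that a custom Task class is instantiated once when the module is imported and the task is registered, which is why __init__ runs at import time; creating the threads lazily, on first use inside the worker process, avoids spawning them in the client.

import threading
try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2

from celery import Celery, Task

app = Celery('BaseWorker', backend='amqp', broker='amqp://')

class Consumer(threading.Thread):
    """Worker thread that drains the shared queue."""
    def __init__(self, work_q):
        threading.Thread.__init__(self)
        self.daemon = True
        self.work_q = work_q

    def run(self):
        while True:
            item = self.work_q.get()
            # ... do the actual work on `item` here ...
            self.work_q.task_done()

class BaseWorker(Task):
    abstract = True
    _work_q = None

    @property
    def work_q(self):
        # Created lazily, so the threads start inside the worker process
        # on first task execution, not at import time in the client.
        if self._work_q is None:
            self._work_q = queue.Queue()
            for _ in range(4):               # n threads; 4 is arbitrary
                Consumer(self._work_q).start()
        return self._work_q

@app.task(base=BaseWorker, bind=True)
def push_work(self, somework):
    self.work_q.put(somework)

# client side:  push_work.delay(somework)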


