Samaneh Yahyapour | 29 Jul 16:54 2014

How to limit a producer to a RabbitMQ queue in Celery

Hi,

I have a Celery program in which multiple producers add messages to a RabbitMQ (Celery) queue.
Is there a way I can limit the number of messages a producer adds to RabbitMQ?

One way I use is to add one queue per producer and apply a rate limit on the worker, but that way I have to define the same worker multiple times under different task names, one per queue. (Is there a way for a single task to use multiple queues in Celery?)
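A hedged configuration sketch of the per-producer-queue idea (the names `process`, `producer_a`, `producer_b`, and the broker URL are placeholders, not from the post): in Celery a queue belongs to the worker, not the task, so one task definition can be consumed from several queues and needs only one rate limit.

```python
from celery import Celery
from kombu import Queue

app = Celery('tasks', broker='amqp://localhost//')

# One queue per producer; all of them feed the same task below.
app.conf.CELERY_QUEUES = (
    Queue('producer_a', routing_key='producer_a'),
    Queue('producer_b', routing_key='producer_b'),
)

# A single task definition, rate-limited once -- no need to copy it
# under a different name for each queue.
@app.task(rate_limit='10/m')   # at most 10 executions/minute per worker
def process(message):
    return message
```

A producer then picks its queue at call time, e.g. `process.apply_async(args=(data,), queue='producer_a')`, and one worker can drain every queue with `celery worker -A tasks -Q producer_a,producer_b`. Note the rate limit throttles *consumption*; RabbitMQ itself will still accept publishes as fast as producers send them.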

--
You received this message because you are subscribed to the Google Groups "celery-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to celery-users+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org.
To post to this group, send email to celery-users-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org.
Visit this group at http://groups.google.com/group/celery-users.
For more options, visit https://groups.google.com/d/optout.
sandeep | 28 Jul 12:56 2014

Celery Multiple Workers with Mysql - design

I need to build a scheduler to handle distributed crawling. I have to crawl about 50,000 URLs each day. Currently it's handled using cron scripts (Unix). I heard of Celery and am working with it to see whether it will fit into our system. Currently, I have all my URLs stored in a MySQL table. How do I configure the existing MySQL system with Celery as the scheduler? How do I create multiple workers to handle this crawling? Your suggestions will be highly appreciated.
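Not from the post, just a hedged sketch of one common shape for this (the table name, column, and batch size are assumptions): a beat-scheduled dispatcher task reads the URLs from MySQL once a day and fans them out in batches, so workers on any machine can pull from the shared queue. The batching helper is plain Python:

```python
def batch(items, size):
    """Split the day's URL list into fixed-size chunks for fan-out."""
    return [items[i:i + size] for i in range(0, len(items), size)]

# Wrapped in Celery, roughly (illustrative names, not a tested config):
#
#   @app.task
#   def dispatch_crawls():
#       urls = [row[0] for row in cursor.execute("SELECT url FROM crawl_urls")]
#       for chunk in batch(urls, 100):
#           crawl_batch.delay(chunk)   # each chunk becomes one queued task
#
# scheduled daily via CELERYBEAT_SCHEDULE, with
# `celery worker -A proj --concurrency=8` started on each crawl box --
# all workers consume the same queue, which is how the work distributes.
```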

Mazzaroth M. | 26 Jul 23:41 2014

celery.conf.update .. instance variable race condition

In my celery.py I have

@worker_init.connect
def bootstrap_pyramid(signal, sender):

I set a variable from config.app.settings here, but it appears to be too late: celery.conf.update() has already been called. How do I get a sender.app.settings value into my celery.conf.update()?

The variable in question is k_fetch_rss_posts_feeds

source:

http://pastebin.com/Sr6ie4NY

regards,
Michael


Danci Emanuel | 25 Jul 19:00 2014

Celery beat dies inexplicably.

Hi!

I am trying to send some reminders using a periodic task (from a Django app), but when I start the beat, it suddenly dies. 

My context:
  • I run celery==3.1.13
  • the settings part for the periodic task:
from celery.schedules import crontab

CELERYBEAT_SCHEDULE = {
    'name-reminders': {
        'task': 'module.submodule.task',
        'schedule': crontab(),
        'args': (),
    },
}
  • The task: all it does is print something to stdout, so I guess the task itself is not the problem.

I first start the worker:
celery worker -A path.to.app.module -l info

Then, the beat:  
celery beat -A path.to.app.module -l debug

And this is the traceback when the beat dies:

[2014-07-25 16:52:00,026: DEBUG/MainProcess] beat: Synchronizing schedule...
[2014-07-25 16:52:00,080: CRITICAL/MainProcess] beat raised exception <type 'exceptions.AttributeError'>: AttributeError("'NoneType' object has no attribute 'id'",)
Traceback (most recent call last):
  File "/home/<>/.virtualenvs/<>/local/lib/python2.7/site-packages/celery/apps/beat.py", line 112, in start_scheduler
    beat.start()
  File "/home/<>/.virtualenvs/<>/local/lib/python2.7/site-packages/celery/beat.py", line 462, in start
    interval = self.scheduler.tick()
  File "/home/<>/.virtualenvs/<>/local/lib/python2.7/site-packages/celery/beat.py", line 220, in tick
    next_time_to_run = self.maybe_due(entry, self.publisher)
  File "/home/<>/.virtualenvs/<>/local/lib/python2.7/site-packages/celery/beat.py", line 208, in maybe_due
    debug('%s sent. id->%s', entry.task, result.id)
AttributeError: 'NoneType' object has no attribute 'id'

Any help is highly appreciated. 

Thanks!

Shaumyadeep Chaudhuri | 25 Jul 15:18 2014

Multiple workers with single queue

Hi,
I want to use multiple workers across multiple systems consuming from a single queue for tasks.
I also would prefer to use django-orm as the broker.
Currently I am testing it with two workers on two systems.
What is happening is that whenever I send a task to the queue, the task is always executed by both workers.
The documentation lists this as a limitation, but says it should only happen occasionally, and with many workers.
What is the main reason for this limitation of the django orm message broker?
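For what it's worth, a hedged settings sketch (host and credentials are placeholders) of the usual workaround when the database transport misbehaves with several consumers; it is not an explanation of the limitation itself:

```python
# Django settings: point Celery at a dedicated broker instead of the
# ORM transport when multiple machines consume one queue.
BROKER_URL = 'amqp://guest:guest@localhost:5672//'   # RabbitMQ (placeholder host)

# The database transport from the question, for comparison:
# BROKER_URL = 'django://'
```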

Thanks

Darren Govoni | 25 Jul 14:52 2014

Calling result.get() after returning result.delay()? deadlock

Hi,
  I'm trying to understand a deadlock situation between 2 servers. On server A I create a workflow, call results.delay(), and return the object. Then I call result.get() on the returned object from server A.
Server B runs the task and returns, but server A is deadlocked and warns:

[code]
[2014-07-25 12:22:10,886: WARNING/MainProcess] /usr/local/lib/python2.7/dist-packages/celery/result.py:45: RuntimeWarning: Never call result.get() within a task!
See http://docs.celeryq.org/en/latest/userguide/tasks.html#task-synchronous-subtasks

In Celery 3.2 this will result in an exception being
raised instead of just being a warning.

  warnings.warn(RuntimeWarning(E_WOULDBLOCK))
[/code]

This all works when everything runs on the same server. There appear to be messages stuck in a transient queue each time.

What could be happening here? I don't think I'm actually calling .get() from within a task!
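A hedged sketch of the usual way to avoid blocking on a result inside a task: let the broker carry the continuation with a chain instead of calling .get(). Task names and the broker URL here are illustrative, not from the post, and actually running this needs a live broker:

```python
from celery import Celery, chain

app = Celery('app', broker='amqp://localhost//')

@app.task
def step(data):
    return data * 2                 # would run on server B

@app.task
def handle_result(value):
    print('workflow finished with', value)

@app.task
def run_workflow(data):
    # Instead of `step.delay(data).get()` -- which parks a worker slot
    # waiting on another worker and can deadlock -- hand the follow-up
    # to the broker:
    chain(step.s(data), handle_result.s()).apply_async()
```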

thanks.


vignesh | 24 Jul 15:17 2014

Having error queues in celery

Hi,

Is there any way in Celery to automatically put a task into another queue if its execution fails?
For example, if the task is running in a queue `x`, on exception enqueue it to a queue named `error_x`.
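A hedged sketch of one way to do this with a custom base task class; the `on_failure` hook is real Celery API, but the `source_queue` plumbing is my own convention (Celery does not tell a task which queue it came from by default):

```python
from celery import Celery, Task

app = Celery('app', broker='amqp://localhost//')

class RequeueOnError(Task):
    abstract = True

    def on_failure(self, exc, task_id, args, kwargs, einfo):
        # Resubmit the failed call to the matching error queue,
        # following the `x` -> `error_x` naming from the question.
        # A real version would also guard against re-failing tasks
        # already sitting in an error queue, to avoid a loop.
        queue = kwargs.get('source_queue', 'x')
        self.apply_async(args=args, kwargs=kwargs,
                         queue='error_%s' % queue)

@app.task(base=RequeueOnError)
def crunch(data, source_queue='x'):
    ...
```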

Thanks

Roman Kournjaev | 24 Jul 12:51 2014

Using a different formatter for logger


Hi 

I would like to create another log file for Celery with a different formatter, so I would actually have 2 different log files (one human-readable and another for sending events to logstash). Currently the Celery log file is written to /var/log/celery/worker.log.

I thought I could configure it like this:

  @after_setup_task_logger.connect
  def augment_celery_log(**kwargs):
      logger = logging.getLogger()
      handler = logging.FileHandler('celery.log', 'w')
      formatter = logging.Formatter(LogstashFormatter)

      if not logger.handlers:
          handler.setFormatter(formatter)
          logger.addHandler(handler)
          logger.propagate = 0
          logger.setLevel(logging.DEBUG)

But it seems that after that code is executed, /var/log/celery/worker.log is no longer written to, the celery.log I defined is not created anywhere, and I don't see any more logs.

Can someone tell me what I am doing wrong?
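A hedged guess at two concrete problems in the snippet above, shown as a corrected stand-alone sketch (the format string and file path are stand-ins for a real logstash formatter and target): `logging.Formatter(LogstashFormatter)` is not valid, since `Formatter` takes a format *string* while a formatter class would be instantiated directly; and grabbing the bare root logger and guarding on its existing handlers can interfere with the handlers Celery installed, which would explain worker.log going quiet.

```python
import logging
import os
import tempfile

# Attach a second, differently formatted handler next to Celery's own
# instead of touching the root logger.
log_path = os.path.join(tempfile.gettempdir(), 'celery-logstash.log')

logger = logging.getLogger('celery.task')
handler = logging.FileHandler(log_path, 'w')
# A formatter class would be instantiated directly, e.g. LogstashFormatter();
# logging.Formatter itself wants a format string:
handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
logger.addHandler(handler)      # added alongside existing handlers
logger.setLevel(logging.DEBUG)
# logger.propagate is left alone, so records still reach worker.log.
```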

Zebulon | 24 Jul 10:22 2014

Adding new jobs to queue fairly

Hi,

I have developed a Django based website where users can perform scientific calculations on some
user-entered data. The jobs are processed with Celery, and everything works perfectly. However, I am
wondering if I am handling the queuing correctly.

If user 1 sends 1000 jobs at a time, Celery processes them; however, this can take a few hours. If meanwhile user
2 sends 10 jobs, those 10 jobs are put at the end of the queue, and user 2 has to wait for all of user 1's
jobs to finish before his 10 jobs are processed. I would like Celery to interleave the jobs from user 2
with the beginning of the queue, so that processing for both users is truly concurrent, instead of
being treated as "first arrived, first served".

I run Celery with 2 workers and a concurrency of 4, but this just accelerates queue processing; it
does not increase fairness across users. How can I implement this? Currently I use AsyncResult to run
the jobs. Should I change the dispatch of the jobs at that level, or through Celery configuration?
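One dispatcher-side idea, sketched in plain Python (the function and all names are mine): interleave the pending jobs round-robin per user before queuing them, so a small batch never waits behind a large one. In Celery terms a similar effect is often approximated with one queue per user and workers started with `-Q user1,user2,...`, since a worker alternates across the queues it consumes.

```python
from collections import deque

def fair_interleave(jobs_by_user):
    """Merge per-user job lists round-robin: one job per user per pass."""
    queues = deque(deque(jobs) for jobs in jobs_by_user.values() if jobs)
    ordered = []
    while queues:
        q = queues.popleft()
        ordered.append(q.popleft())   # take this user's next job
        if q:
            queues.append(q)          # user still has jobs: back of the line
    return ordered

# With user 1's 1000 jobs and user 2's 10, the first 20 dispatched
# jobs alternate between the two users instead of all going to user 1.
```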

Many thanks in advance for your advice.
Zebulon


g | 23 Jul 22:29 2014

How should I structure this task workflow?

I need to upload large-ish files (on the order of 100 MB) on behalf of users to endpoints defined by the user.  Depending on where the file is being uploaded, this can be quick-ish or very, very slow.  Currently, the number of file uploads for one task set (or job; not sure of the best term to describe it) is on the order of 100 to 1000 at a time, but this is expected to grow by at least a factor of 10 once the initial kinks are worked out.


Depending on user settings for a particular endpoint, the max number of simultaneous uploads can vary from 1 to unlimited.


What is the best way to implement a max simultaneous upload limiter?


The concept behind chunks (doc link) seems to fit, but I think the duration of the tasks will cause problems.  I am planning to use Redis as the backend, so I think the VISIBILITY_TIMEOUT will cause problems with the chunks method.


The first thing that comes to mind is to use a cache lock and have the task try to obtain a lock.  The keys would be something along the lines of "key-1", "key-2", ... "key-N" depending on the simultaneous connection limit.  If the lock isn't obtained, retry the task shortly.  This doesn't seem very efficient because each task will be checking N cache keys and then retrying after a short pause until all tasks are finished.


The second thought is a "coordinator" task that runs one upload subtask and then submits a new "coordinator" task until there are no more uploads to make.  E.g. if 10 uploads are allowed simultaneously then there would be 10 "coordinator" tasks running repeatedly until no more uploads remain.  The problem with this one is that there will be variable downtime in the middle of uploading while the "coordinator" task requeues.  Constant uploading is ideal due to the nature of post processing by the endpoint.


Any guidance would be greatly appreciated.  How would you structure the task flow?  Would I be better off using a broker that doesn't rely on the visibility timeout and make use of the chunks method?
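To make the trade-offs concrete, here is the cache-lock idea reduced to a single process with a semaphore (all names are mine; a distributed version would typically replace the semaphore with N tokens in a Redis list, popped blockingly and pushed back when an upload finishes, rather than each task polling N keys):

```python
import threading

def run_with_limit(tasks, max_concurrent):
    """Run callables with at most `max_concurrent` in flight,
    tracking the peak concurrency actually observed."""
    sem = threading.BoundedSemaphore(max_concurrent)
    lock = threading.Lock()
    state = {'current': 0, 'peak': 0, 'results': []}

    def worker(fn):
        with sem:                       # blocks until a "slot" is free
            with lock:
                state['current'] += 1
                state['peak'] = max(state['peak'], state['current'])
            result = fn()               # the upload would happen here
            with lock:
                state['results'].append(result)
                state['current'] -= 1

    threads = [threading.Thread(target=worker, args=(t,)) for t in tasks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state['results'], state['peak']
```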

Aniello Barletta | 22 Jul 11:22 2014

Distributed workers issue

Hi all,

I have this problem with my Celery infrastructure: I'm using Celery 3.1.12, and I have a Django project with the same task.py on both servers. I want to route tasks to a specific server using the routing-key approach.

django settings.py (same configuration on both server A and B, with different <ip_address>):
...

TUNNEL_HOST = <ip_address>

CELERYBEAT_SCHEDULER = "djcelery.schedulers.DatabaseScheduler"
CELERY_RESULT_BACKEND = "amqp"
CELERY_ACCEPT_CONTENT = ['pickle']
CELERY_QUEUE_HA_POLICY = 'all'
CELERY_RESULT_DB_SHORT_LIVED_SESSIONS = True
CELERY_WORKER_DIRECT = True

BROKER_URL = 'amqp://guest:guest-mQ7lE4MOPXsTMXefkiEQoZC4tTrXolVc@public.gmane.org:5673//'

CELERY_DEFAULT_QUEUE = TUNNEL_HOST
CELERY_DEFAULT_EXCHANGE_TYPE = 'direct'
CELERY_DEFAULT_ROUTING_KEY = TUNNEL_HOST

CELERY_QUEUES = (
    Queue(TUNNEL_HOST, Exchange(TUNNEL_HOST), routing_key=TUNNEL_HOST),
)

..

celery start daemon:
..
        python /jhub/_prod/server_ptiqaaeryn_core_daemon/conf/manage.py celeryd_detach \
                --logfile=/jhub/_prod/server_ptiqaaeryn_core_daemon/logs/aerynceleryd.log \
                --pidfile=/jhub/_prod/server_ptiqaaeryn_core_daemon/aerynceleryd.pid \
                --workdir=/jhub/_prod/server_ptiqaaeryn_core_daemon/lib/python2.7/site-packages/aeryn_project/ -B -l INFO\
                --autoscale=16,4 --settings=settings "'
..

task execution (tunnel_host is the IP address; I use the string as the routing-key label):

result = task.apply_async(
    (key, user.username, suites, runParams['connectionString'],
     runParams['device'], runParams['capabilities'], service, extra_params),
    queue=tunnel_host, routing_key=tunnel_host)

If I'm connected to server A and the task is executed on A, there is no problem; but if the task needs to be executed on B (because the tunnel host refers to server B), I only get the Celery task_id back.

Where am I going wrong? Thank you in advance.
