M Trumpis | 2 Apr 21:52

TaskClient problems

Hello.. I'm having some trouble queuing up tasks in a TaskClient
object. I don't have a good handle on what's going on, but I've tried
to include all the relevant log/debug info. I'm working with an
ipcluster with two engines, on Mac OS 10.4. Python etc are built up
from macports, and IPython from trunk (the IPython build details are
listed below). When I set up a queue of tasks, most of the time the
engines encounter an error and shut down.

When I start up ipcluster, the first warning sign is this line:

2009-04-02 12:18:54-0700 [-] spawnProcess called, but the SIGCHLD
handler is not installed. This probably means you have not yet called
reactor.run, or called reactor.run(installSignalHandler=0). You will
probably never see this process finish, and it may become a zombie
process.

The errors after that point seem to depend on the task queuing.. The
ipengine log with a twisted exception is included below

Thanks.. sorry my description is lousy, but I hope the logs help!

Mike

-------------------------- IPCLUSTER STDOUT
---------------------------------------
puma:~/workywork/recon/ghosting miket$ ipcluster local
2009-04-02 12:18:54-0700 [-] Log opened.
2009-04-02 12:18:54-0700 [-] spawnProcess called, but the SIGCHLD
handler is not installed. This probably means you have not yet called
reactor.run, or called reactor.run(installSignalHandler=0). You will
[...]

Fernando Perez | 2 Apr 22:54

Re: TaskClient problems

Howdy,

On Thu, Apr 2, 2009 at 12:52 PM, M Trumpis <mtrumpis <at> berkeley.edu> wrote:
> Hello.. I'm having some trouble queuing up tasks in a TaskClient
> object. I don't have a good handle on what's going on, but I've tried
> to include all the relevant log/debug info. I'm working with an
> ipcluster with two engines, on Mac OS 10.4. Python etc are built up
> from macports, and IPython from trunk (the IPython build details are
> listed below). When I set up a queue of tasks, most of the time the
> engines encounter an error and shut down.

[...]

Just as an FYI, Mike is in our lab and this is for some of our local
work; I've already tried to help him as much as I can but I'm totally
stumped here.  I don't see the initial errors he gets upon startup
(I'm on Ubuntu 64bit, he's on OSX), and I'm not really sure what else
to try.  But if anyone has any suggestions, we'd be very grateful...

Cheers,

f

Yichun Wei | 2 Apr 23:23

Re: TaskClient problems

I had the impression that I needed Twisted 8.1 to have my code work.
The warning about reactor.run reminds me that it might be related.
Just an idea, not sure if it helps. -yichun

On Thu, Apr 2, 2009 at 12:52 PM, M Trumpis <mtrumpis <at> berkeley.edu> wrote:
> Hello.. I'm having some trouble queuing up tasks in a TaskClient
> object. I don't have a good handle on what's going on, but I've tried
> to include all the relevant log/debug info. I'm working with an
> ipcluster with two engines, on Mac OS 10.4. Python etc are built up
> from macports, and IPython from trunk (the IPython build details are
> listed below). When I set up a queue of tasks, most of the time the
> engines encounter an error and shut down.
>
> When I start up ipcluster, the first warning sign is this line:
>
> 2009-04-02 12:18:54-0700 [-] spawnProcess called, but the SIGCHLD
> handler is not installed. This probably means you have not yet called
> reactor.run, or called reactor.run(installSignalHandler=0). You will
> probably never see this process finish, and it may become a zombie
> process.
>
> The errors after that point seem to depend on the task queuing.. The
> ipengine log with a twisted exception is included below
>
> Thanks.. sorry my description is lousy, but I hope the logs help!
>
> Mike
>
> -------------------------- IPCLUSTER STDOUT
> ---------------------------------------
[...]

Fernando Perez | 2 Apr 23:30

Re: TaskClient problems

On Thu, Apr 2, 2009 at 2:23 PM, Yichun Wei <yichun.wei <at> gmail.com> wrote:
> I had the impression that I needed Twisted 8.1 to have my code work.
> The warning about reactor.run reminds me that it might be related.
> Just an idea, not sure if it helps. -yichun

Good point!  On my ubuntu box, I have this too:

In [2]: twisted.__version__
Out[2]: '8.1.0'

Mike, could you try updating your twisted and seeing if that helps?
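
A minimal sketch of that version check, not taken from the thread: it compares the installed Twisted version against 8.1.0, the version Yichun recalls needing. The helper names version_tuple and meets_minimum are made up for illustration, and the 8.1.0 threshold is an assumption based on that recollection, not a documented IPython requirement.

```python
def version_tuple(version):
    """Turn a dotted version string such as '8.1.0' into (8, 1, 0).

    Parsing stops at the first component that does not start with a digit,
    so a suffix like 'rc1' is ignored.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum="8.1.0"):
    """Return True if `installed` is at least `minimum` (assumed threshold)."""
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    try:
        import twisted
    except ImportError:
        print("twisted is not importable from this Python")
    else:
        # twisted.__version__ is the same attribute shown in the session above.
        print(twisted.__version__, meets_minimum(twisted.__version__))
```

Comparing tuples of integers avoids the string-comparison trap where '10.0.0' would sort before '8.1.0'.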

f

M Trumpis | 2 Apr 23:54

Fwd: TaskClient problems

I'll try and see on Mac OS, but I do see the same thing in Ubuntu with
twisted 8.1.0

There is no weird stdout when starting up ipcluster, but the engines
do shut down in the same way, and the twisted exception is pretty much
the same:

2009-04-02 13:58:18-0700 [-] Log opened.
2009-04-02 13:58:18-0700 [-] Using furl file:
/home/mike/.ipython/security/ipcontroller-engine.furl
2009-04-02 13:58:18-0700 [Negotiation,client] engine registration
succeeded, got id: 0
2009-04-02 13:58:36-0700 [-]
/usr/lib/python2.5/site-packages/twisted/internet/base.py:1047:
exceptions.DeprecationWarning: Reactor already running! This behavior
is deprecated since Twisted 8.0

I actually think the engines shut down when doing a mec.push in this
case, instead of anything with the TaskClient.

In Ubuntu, I rolled back to the 0.9.1 release and was able to fire off
the jobs successfully.

Mike

On Thu, Apr 2, 2009 at 2:30 PM, Fernando Perez <fperez.net <at> gmail.com> wrote:
> On Thu, Apr 2, 2009 at 2:23 PM, Yichun Wei <yichun.wei <at> gmail.com> wrote:
>> I had the impression that I needed Twisted 8.1 to have my code work.
>> The warning about reactor.run reminds me that it might be related.
>> Just an idea, not sure if it helps. -yichun
[...]

Fernando Perez | 3 Apr 00:22

Re: Fwd: TaskClient problems

Hey Mike,

On Thu, Apr 2, 2009 at 2:54 PM, M Trumpis <mtrumpis <at> berkeley.edu> wrote:
> I'll try and see on Mac OS, but I do see the same thing in Ubuntu with
> twisted 8.1.0

Weird, here's a transcript of my ipcluster session:

uqbar[~]> ipcluster local
2009-04-02 15:19:17-0700 [-] Log opened.
2009-04-02 15:19:17-0700 [-] Process ['ipcontroller',
'--logfile=/home/fperez/.ipython/log/ipcontroller'] has started with
pid=30471
2009-04-02 15:19:18-0700 [-] Process ['ipengine',
'--logfile=/home/fperez/.ipython/log/ipengine30471-'] has started with
pid=30477
2009-04-02 15:19:18-0700 [-] Process ['ipengine',
'--logfile=/home/fperez/.ipython/log/ipengine30471-'] has started with
pid=30479
2009-04-02 15:19:18-0700 [-] Engines started with pids: [30477, 30479]
^C2009-04-02 15:20:34-0700 [-] Stopping local cluster
2009-04-02 15:20:34-0700 [-] Process ['ipengine',
'--logfile=/home/fperez/.ipython/log/ipengine30471-'] has stopped with
0
2009-04-02 15:20:34-0700 [-] Process ['ipcontroller',
'--logfile=/home/fperez/.ipython/log/ipcontroller'] has stopped with 0
2009-04-02 15:20:34-0700 [-] Process ['ipengine',
'--logfile=/home/fperez/.ipython/log/ipengine30471-'] has stopped with
0
2009-04-02 15:20:34-0700 [-] Engines received signal: [0, 0]
[...]

Brian Granger | 3 Apr 00:27

Re: Fwd: TaskClient problems

Can you try to run this giving the number of engines:

ipcluster local -n 2

What you are doing should work, but maybe there is a bug.

So both of you can reproduce this?  Also, can you both check to make
sure you don't have a zombie ipython  process around?
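
One way to look for leftover processes is sketched below. It is an illustration only: it assumes a ps that accepts "-A -o pid,stat,args", the helper name find_ipython_processes and the list of program names are hypothetical, and "zombie" is read loosely as any leftover ipcluster/ipcontroller/ipengine/ipython process, with a state code starting with Z marking a true zombie.

```python
import subprocess

# Program names to look for; an assumption based on the commands in this thread.
NAMES = ("ipcluster", "ipcontroller", "ipengine", "ipython")


def find_ipython_processes(ps_output, names=NAMES):
    """Return (pid, is_zombie, command) for each matching line of ps output.

    Each line is expected to look like '<pid> <stat> <command ...>'.
    """
    found = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # skip the header row and malformed lines
        pid, stat, command = fields
        if any(name in command for name in names):
            found.append((int(pid), stat.startswith("Z"), command))
    return found


if __name__ == "__main__":
    try:
        raw = subprocess.check_output(["ps", "-A", "-o", "pid,stat,args"])
    except (OSError, subprocess.CalledProcessError) as exc:
        print("could not run ps: %s" % exc)
    else:
        text = raw.decode("utf-8", "replace")
        for pid, is_zombie, command in find_ipython_processes(text):
            label = "zombie" if is_zombie else "running"
            print("%d %s %s" % (pid, label, command))
```

Keeping the parsing separate from the ps call means the matching logic can be checked against pasted ps output from another machine.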

Brian

On Thu, Apr 2, 2009 at 3:22 PM, Fernando Perez <fperez.net <at> gmail.com> wrote:
> Hey Mike,
>
> On Thu, Apr 2, 2009 at 2:54 PM, M Trumpis <mtrumpis <at> berkeley.edu> wrote:
>> I'll try and see on Mac OS, but I do see the same thing in Ubuntu with
>> twisted 8.1.0
>
> Weird, here's a transcript of my ipcluster session:
>
> uqbar[~]> ipcluster local
> 2009-04-02 15:19:17-0700 [-] Log opened.
> 2009-04-02 15:19:17-0700 [-] Process ['ipcontroller',
> '--logfile=/home/fperez/.ipython/log/ipcontroller'] has started with
> pid=30471
> 2009-04-02 15:19:18-0700 [-] Process ['ipengine',
> '--logfile=/home/fperez/.ipython/log/ipengine30471-'] has started with
> pid=30477
> 2009-04-02 15:19:18-0700 [-] Process ['ipengine',
> '--logfile=/home/fperez/.ipython/log/ipengine30471-'] has started with
[...]

Fernando Perez | 3 Apr 00:38

Re: Fwd: TaskClient problems

On Thu, Apr 2, 2009 at 3:27 PM, Brian Granger <ellisonbg.net <at> gmail.com> wrote:
> Can you try to run this giving the number of engines:
>
> ipcluster local -n 2
>
> What you are doing should work, but maybe there is a bug.
>
> So both of you can reproduce this?  Also, can you both check to make
> sure you don't have a zombie ipython  process around?

Mmh, I'm confused: the transcript I showed is OK, everything seems to
run fine in my case.  It starts two engines, the log doesn't show any
errors, and the engines behave as I expect them to by controlling them
interactively.  Or did you see something in my log that seemed out of
place?

f

Brian Granger | 3 Apr 00:47

Re: Fwd: TaskClient problems

Mmh, now I am confused too....strike that....

When I tried "ipcluster local" I was accidentally looking at an older
log file.  It works fine for me as well.

I would check for zombies and also remove any config files in .ipython
related to the kernel (the *.ini files)
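
A cautious sketch of the config cleanup, under stated assumptions: the directory is taken to be ~/.ipython (the location that appears in the logs in this thread), and the files are renamed with a .bak suffix instead of deleted so they can be restored. The function names find_kernel_config_files and move_aside are made up for illustration.

```python
import glob
import os


def find_kernel_config_files(ipython_dir):
    """Return the sorted list of *.ini files directly inside ipython_dir."""
    return sorted(glob.glob(os.path.join(ipython_dir, "*.ini")))


def move_aside(paths, suffix=".bak"):
    """Rename each file to '<name><suffix>' and return the new paths."""
    moved = []
    for path in paths:
        target = path + suffix
        os.rename(path, target)
        moved.append(target)
    return moved


if __name__ == "__main__":
    # Only list the candidates; call move_aside() by hand once the list
    # looks right.
    ipython_dir = os.path.expanduser(os.path.join("~", ".ipython"))
    for path in find_kernel_config_files(ipython_dir):
        print(path)
```

Renaming is a swapped-in, more conservative step than the removal suggested above; deleting the .bak files afterwards has the same end result.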

Brian

On Thu, Apr 2, 2009 at 3:38 PM, Fernando Perez <fperez.net <at> gmail.com> wrote:
> On Thu, Apr 2, 2009 at 3:27 PM, Brian Granger <ellisonbg.net <at> gmail.com> wrote:
>> Can you try to run this giving the number of engines:
>>
>> ipcluster local -n 2
>>
>> What you are doing should work, but maybe there is a bug.
>>
>> So both of you can reproduce this?  Also, can you both check to make
>> sure you don't have a zombie ipython  process around?
>
> Mmh, I'm confused: the transcript I showed is OK, everything seems to
> run fine in my case.  It starts two engines, the log doesn't show any
> errors, and the engines behave as I expect them to by controlling them
> interactively.  Or did you see something in my log that seemed out of
> place?
>
> f
>
[...]

Brian Granger | 3 Apr 01:46

Re: Fwd: TaskClient problems

What resolved the issue?

Brian

On Thu, Apr 2, 2009 at 4:25 PM, Mike Trumpis <mtrumpis <at> gmail.com> wrote:
> seems to work.. here's the output:
>
> puma:~ miket$ ipcontroller
> 2009-04-02 16:23:28-0700 [-] Log opened.
> 2009-04-02 16:23:28-0700 [-] foolscap.pb.Listener starting on 63775
> 2009-04-02 16:23:28-0700 [-] foolscap.pb.Listener starting on 63776
> 2009-04-02 16:23:28-0700 [-] Adapting Controller to interface: multiengine
> 2009-04-02 16:23:28-0700 [-] Saving furl for interface [multiengine]
> to file: /Users/miket/.ipython/security/ipcontroller-mec.furl
> 2009-04-02 16:23:28-0700 [-] Adapting Controller to interface: task
> 2009-04-02 16:23:28-0700 [-] Saving furl for interface [task] to file:
> /Users/miket/.ipython/security/ipcontroller-tc.furl
> 2009-04-02 16:23:28-0700 [-] Saving furl for the engine to file:
> /Users/miket/.ipython/security/ipcontroller-engine.furl
> 2009-04-02 16:23:28-0700 [-]
> twisted.internet.protocol.DatagramProtocol starting on 50966
> 2009-04-02 16:23:28-0700 [-] Starting protocol
> <twisted.internet.protocol.DatagramProtocol instance at 0x2ad1be8>
> 2009-04-02 16:23:28-0700 [-]
> twisted.internet.protocol.DatagramProtocol starting on 50967
> 2009-04-02 16:23:28-0700 [-] Starting protocol
> <twisted.internet.protocol.DatagramProtocol instance at 0x2b80fd0>
> 2009-04-02 16:23:28-0700 [-] (Port 50966 Closed)
> 2009-04-02 16:23:28-0700 [-] Stopping protocol
> <twisted.internet.protocol.DatagramProtocol instance at 0x2ad1be8>
[...]

