Daniel Gruber | 1 Mar 2010 10:50

missing exceptions and thread safety

Hi,

while scanning the wiki I found the following:

The Job object methods should throw the following exceptions:
- "JobAlreadySuspendedException" from the suspend method when the
  job is already suspended. The DRMAA implementation has
  to make sure that suspend is called only once per job. It is not enough
  for the DRMAA implementation to rely on its own state; it should
  check the state automatically in order to avoid problems when
  the state is set outside of DRMAA. Should DRMAA deal with
  such cases?
- "JobNotSuspendedException" from the resume method (like above).
- "JobTerminatedException" when calling a method on a job
   that has already terminated.
   This applies to "suspend", "resume", "hold", "release", "terminate" and "waitStarted".

Obvious synchronization problems:
- accessing an already "deleted" JobTemplate: the same rule as for a destroyed Session should apply (InvalidJobTemplateException)
- accessing a job template while it is being "deleted" (running a job or accessing it otherwise): the same rule as for a destroyed Session should apply (InvalidJobTemplateException)
- write access to job templates must be synchronized by the DRMAA implementation
- Is there a need to expose the invalid state of a JobTemplate (that is, when a JobSession
  has been closed) as an accessible field, or should every problem be covered by the
  "InvalidJobTemplateException"?

Regards

Daniel
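
A minimal Java sketch of the job template points above, under stated
assumptions: SketchJobTemplate and its methods are made up for
illustration and are not taken from any DRMAA binding, and
InvalidJobTemplateException is unchecked here only to keep the example
short. Every access goes through the same monitor, so a concurrent
delete happens either entirely before or entirely after any read or
write.

```java
import java.util.HashMap;
import java.util.Map;

// Exception name taken from the mail above; unchecked here only for brevity.
class InvalidJobTemplateException extends RuntimeException {
    InvalidJobTemplateException(String message) {
        super(message);
    }
}

// Hypothetical job template. All public methods are synchronized on the
// template itself, so reads, writes and delete() never interleave.
class SketchJobTemplate {
    private final Map<String, String> attributes = new HashMap<>();
    private boolean deleted = false;

    synchronized void setAttribute(String name, String value) {
        ensureValid();
        attributes.put(name, value);
    }

    synchronized String getAttribute(String name) {
        ensureValid();
        return attributes.get(name);
    }

    synchronized void delete() {
        ensureValid();
        deleted = true;
        attributes.clear();
    }

    // Only called from synchronized methods, so it sees a consistent flag.
    private void ensureValid() {
        if (deleted) {
            throw new InvalidJobTemplateException("job template was deleted");
        }
    }
}
```

With this shape, the invalid state does not need to be a separate
accessible field: any access after delete() simply fails with the one
exception.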
Andre Merzky | 1 Mar 2010 11:17

Re: missing exceptions and thread safety

Quoting [Daniel Gruber] (Mar 01 2010):
> 
>    Hi,
>    while scanning the wiki I found following:
>    The Job object methods should throw following exceptions:
>    - "JobAlreadySuspendedException" from suspend method when
>      job is already suspended. The DRMAA implementation have
>      to make sure that suspend job is just called once. It is not enough
>      for the DRMAA implementation to rely on own state, it should
>      check the state automatically in order to avoid problems when
>      the state is set outside of DRMAA. Should DRMAA deal with
>      such cases?

*Can* DRMAA deal with such cases?  These are two operations which
are usually not atomic (1: check for state, 2: suspend) - so how can
a DRMAA client side library ensure that the remote state does not
change between these two calls, e.g. due to a 3rd party API call?

I guess it's ok to throw when the backend replies with that error
(job already suspended) - but requiring the DRMAA implementation to
ensure atomicity is most likely futile.

my $0.02, Andre.

>    - "JobNotSuspendedException" from the resume method (like above).
>    - "JobTerminatedException" when calling a method on a job
>       when the job is already terminated.
>       This is for "suspend" "resume" "hold" "release" "terminate"
>    "waitStarted"
>    Obvious synchronization problems:
>    - accessing an already "deleted" JobTemplate: here that same as for a
>    destroyed Session should apply (InvalidJobTemplateException)
>    - accessing a job template while "deleting" (running a job or accessing
>    otherwise): here that same as for a destroyed Session should apply
>    (InvalidJobTemplateException)
>    - write access for the job templates must be synchronized by the DRMAA
>    implementation
>    - Is there a need to make the invalid state of a JobTemplate (that is
>    when a JobSession
>      has been closed) as an accessible field or should every problem
>    covered by the
>     "InvalidJobTemplateException"?
>    Regards
>    Daniel
-- 
Nothing is ever easy.
Daniel Gruber | 1 Mar 2010 11:38

Re: missing exceptions and thread safety

On 03/01/10 11:17, Andre Merzky wrote:
> Quoting [Daniel Gruber] (Mar 01 2010):
>> Hi,
>> while scanning the wiki I found following:
>> The Job object methods should throw following exceptions:
>> - "JobAlreadySuspendedException" from suspend method when
>>   job is already suspended. The DRMAA implementation have
>>   to make sure that suspend job is just called once. It is not enough
>>   for the DRMAA implementation to rely on own state, it should
>>   check the state automatically in order to avoid problems when
>>   the state is set outside of DRMAA. Should DRMAA deal with
>>   such cases?
>
> *Can* DRMAA deal with such cases?  These are two operations which
> are usually not atomic (1: check for state, 2: suspend) - so how can
> a DRMAA client side library ensure that the remote state does not
> change between these two calls, e.g. due to a 3rd part API call?
>
> I guess it's ok to throw when the backend replies with that error
> (job already suspended) - but requiring the DRMAA implementation to
> ensure atomicity is most likely futile.
>
> my $0.02, Andre.

You're right - atomicity does not seem to be possible.

Another important thing to know is whether all DRMs throw such an exception,
or whether there are some that silently ignore the second request and simply
report again that the job is suspended (suspend is idempotent). Do we then have
a problem with the spec saying that there is an exception when some
implementations will never throw it?
We should make the exception optional if so.

Cheers

Daniel


>> - "JobNotSuspendedException" from the resume method (like above).
>> - "JobTerminatedException" when calling a method on a job
>>   when the job is already terminated.
>>   This is for "suspend" "resume" "hold" "release" "terminate" "waitStarted"
>> Obvious synchronization problems:
>> - accessing an already "deleted" JobTemplate: here that same as for a
>>   destroyed Session should apply (InvalidJobTemplateException)
>> - accessing a job template while "deleting" (running a job or accessing
>>   otherwise): here that same as for a destroyed Session should apply
>>   (InvalidJobTemplateException)
>> - write access for the job templates must be synchronized by the DRMAA
>>   implementation
>> - Is there a need to make the invalid state of a JobTemplate (that is
>>   when a JobSession has been closed) as an accessible field or should
>>   every problem covered by the "InvalidJobTemplateException"?
>> Regards
>> Daniel

Daniel Templeton | 1 Mar 2010 14:48

Re: missing exceptions and thread safety

The atomicity has to be managed at the DRM layer.  The suspend call has 
to operate like a test and set operation.  If the suspend operation 
doesn't return notification whether the job was already suspended, then 
the DRMAA implementation can't report it via an exception.  There is, 
however, no need to make the exception explicitly optional.  Exceptions 
are by definition optional.
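
To illustrate the test-and-set point, here is a rough Java sketch under
stated assumptions: MockDrm stands in for the DRM, and all class and
method names are made up for illustration. The state transition is a
single compare-and-set on the DRM side, and the DRMAA layer only
translates the answer it gets back into an exception.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, deliberately tiny state model.
enum SketchState { RUNNING, SUSPENDED }

// Stand-in for the DRM side: suspend is one atomic compare-and-set,
// so the caller learns whether this very call changed the state.
class MockDrm {
    private final AtomicReference<SketchState> state =
            new AtomicReference<>(SketchState.RUNNING);

    /** Returns true if this call suspended the job, false if it was already suspended. */
    boolean suspend() {
        return state.compareAndSet(SketchState.RUNNING, SketchState.SUSPENDED);
    }
}

// Exception name from the thread; unchecked here only for brevity.
class JobAlreadySuspendedException extends RuntimeException {
    JobAlreadySuspendedException(String message) {
        super(message);
    }
}

// Hypothetical DRMAA-side job: no separate state check, no cached state.
class SketchJob {
    private final MockDrm drm;

    SketchJob(MockDrm drm) {
        this.drm = drm;
    }

    void suspend() {
        if (!drm.suspend()) {
            throw new JobAlreadySuspendedException("DRM reports the job was already suspended");
        }
    }
}
```

If a DRM's suspend call returned nothing at all, SketchJob.suspend()
would have no answer to translate and could never throw.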

Daniel

On 03/01/10 02:38, Daniel Gruber wrote:
> On 03/01/10 11:17, Andre Merzky wrote:
>> Quoting [Daniel Gruber] (Mar 01 2010):
>>    
>>>     Hi,
>>>     while scanning the wiki I found following:
>>>     The Job object methods should throw following exceptions:
>>>     - "JobAlreadySuspendedException" from suspend method when
>>>       job is already suspended. The DRMAA implementation have
>>>       to make sure that suspend job is just called once. It is not enough
>>>       for the DRMAA implementation to rely on own state, it should
>>>       check the state automatically in order to avoid problems when
>>>       the state is set outside of DRMAA. Should DRMAA deal with
>>>       such cases?
>>>      
>>
>> *Can* DRMAA deal with such cases?  These are two operations which
>> are usually not atomic (1: check for state, 2: suspend) - so how can
>> a DRMAA client side library ensure that the remote state does not
>> change between these two calls, e.g. due to a 3rd part API call?
>>
>> I guess it's ok to throw when the backend replies with that error
>> (job already suspended) - but requiring the DRMAA implementation to
>> ensure atomicity is most likely futile.
>>
>> my $0.02, Andre.
>>
>>    
> You're right - atomicity seems not to be possible.
>
> Another important thing to know would be if all DRMs are throwing such 
> an exception
> or are there any which are silently ignore second request and telling 
> again that it
> is suspended (suspend is idempotent). Do we have than a problem with 
> the spec saying
> that there is an Exception but on some implementations there will be 
> never throw one?
> We should make the exception optional if so.
>
> Cheers
>
> Daniel
>
>
>>    
>>>     - "JobNotSuspendedException" from the resume method (like above).
>>>     - "JobTerminatedException" when calling a method on a job
>>>        when the job is already terminated.
>>>        This is for "suspend" "resume" "hold" "release" "terminate"
>>>     "waitStarted"
>>>     Obvious synchronization problems:
>>>     - accessing an already "deleted" JobTemplate: here that same as for a
>>>     destroyed Session should apply (InvalidJobTemplateException)
>>>     - accessing a job template while "deleting" (running a job or accessing
>>>     otherwise): here that same as for a destroyed Session should apply
>>>     (InvalidJobTemplateException)
>>>     - write access for the job templates must be synchronized by the DRMAA
>>>     implementation
>>>     - Is there a need to make the invalid state of a JobTemplate (that is
>>>     when a JobSession
>>>       has been closed) as an accessible field or should every problem
>>>     covered by the
>>>      "InvalidJobTemplateException"?
>>>     Regards
>>>     Daniel
>>>      
>
>
> --
>    drmaa-wg mailing list
>    drmaa-wg@...
>    http://www.ogf.org/mailman/listinfo/drmaa-wg
>    

Andre Merzky | 1 Mar 2010 15:27

Re: missing exceptions and thread safety

Hi Dan, 

Quoting [Daniel Templeton] (Mar 01 2010):
> 
> There is, however, no need to make the exception explicitly
> optional.  Exceptions are by definition optional.

This may be true in this case, but in general I expect exceptions to
be guaranteed under specific circumstances. For example, I would expect
that a suspend() on an invalid jobid will *always* cause an
exception (mandatory), not only sometimes (optional).

Best, Andre.

-- 
Nothing is ever easy.
Mariusz Mamoński | 2 Mar 2010 07:24

Re: missing exceptions and thread safety

Hi all,

1. Can we merge JobAlreadySuspendedException,
JobNotSuspendedException and JobTerminatedException into a single exception,
something like CantApplyToCurrentStateException (the OGSA-BES approach), and state
that the error message should carry the current job state?
2. Guaranteeing atomicity (with respect to operations that come from
outside) in DRMAA is almost impossible for the DRMSs I know, as usually
there is no "lock on job" operation available in the public API.
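
A rough Java sketch of what point 1 could look like, under stated
assumptions: the JobState values, the accessor and the StateCheck helper
are made up for illustration, and the exception is unchecked only for
brevity. The single exception carries the state the DRM reported, both
in the message and as a field.

```java
// Hypothetical set of job states, for illustration only.
enum JobState { QUEUED, RUNNING, SUSPENDED, TERMINATED }

// One exception for every "operation not valid in this state" case.
class CantApplyToCurrentStateException extends RuntimeException {
    private final JobState currentState;

    CantApplyToCurrentStateException(String operation, JobState currentState) {
        super("cannot apply " + operation + " to a job in state " + currentState);
        this.currentState = currentState;
    }

    JobState getCurrentState() {
        return currentState;
    }
}

// Hypothetical helper: the state passed in is whatever the DRM reported.
class StateCheck {
    static void resume(JobState reportedByDrm) {
        if (reportedByDrm != JobState.SUSPENDED) {
            throw new CantApplyToCurrentStateException("resume", reportedByDrm);
        }
    }
}
```

A caller can then tell "job was terminated" from "job was not
suspended" through getCurrentState(), without a separate exception type
for each case.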

Cheers,

On 1 March 2010 15:27, Andre Merzky <andre <at> merzky.net> wrote:
> Hi Dan,
>
> Quoting [Daniel Templeton] (Mar 01 2010):
>>
>> There is, however, no need to make the exception explicitly
>> optional.  Exceptions are by definition optional.
>
> This may be true in this case, but in general I expect exceptions to
> be guaranteed on specific circumstances. For example, I would expect
> that a suspend() on an invalid jobid will *always* cause an
> exception (mandatory), not only sometimes (optional).
>
> Best, Andre.
>
>
>
> --
> Nothing is ever easy.
> --
>  drmaa-wg mailing list
>  drmaa-wg <at> ogf.org
>  http://www.ogf.org/mailman/listinfo/drmaa-wg
>

-- 
Mariusz
Daniel Templeton | 2 Mar 2010 15:19

Re: missing exceptions and thread safety

Sounds reasonable to me.  At a minimum, all of these very specific state
exceptions should inherit from a more general exception in languages that
support it.

Daniel

On 03/01/10 22:24, Mariusz Mamoński wrote:
> Hi all,
>
> 1. Can we merge JobAlreadySuspendedException,
> JobNotSuspendedException, JobTerminatedException into one something
> like CantApplyToCurrentStateExecption (OGSA-BES approach) and state
> that the error message should bears current job state ?
> 2. Guaranteeing atomicity (concerning operation that comes from
> outside) in DRMAA is almost impossible for the DRMS i know, as usually
> there is no "lock on job" operation available in public API.
>
> Cheers,
>
> On 1 March 2010 15:27, Andre Merzky<andre <at> merzky.net>  wrote:
>    
>> Hi Dan,
>>
>> Quoting [Daniel Templeton] (Mar 01 2010):
>>      
>>> There is, however, no need to make the exception explicitly
>>> optional.  Exceptions are by definition optional.
>>>        
>> This may be true in this case, but in general I expect exceptions to
>> be guaranteed on specific circumstances. For example, I would expect
>> that a suspend() on an invalid jobid will *always* cause an
>> exception (mandatory), not only sometimes (optional).
>>
>> Best, Andre.
>>
>>
>>
>> --
>> Nothing is ever easy.
>> --
>>   drmaa-wg mailing list
>>   drmaa-wg <at> ogf.org
>>   http://www.ogf.org/mailman/listinfo/drmaa-wg
>>
>>      
>
>
>    

Daniel Gruber | 2 Mar 2010 15:28

Re: missing exceptions and thread safety

The only thing is that we lose information in languages that
do not support exception inheritance. For example, when
CantApplyToCurrentStateException is thrown while resuming,
it is unclear whether that is because the job was terminated or
because it was not suspended. Maybe we should make the more
general exceptions part of the language bindings?

Daniel

On 03/02/10 15:19, Daniel Templeton wrote:
> Sounds reasonable to me.  At a minimum, all of these very specific state
> exception should inherit from a more general exception in languages that
> support it.
>
> Daniel
>
> On 03/01/10 22:24, Mariusz Mamoński wrote:
>> Hi all,
>>
>> 1. Can we merge JobAlreadySuspendedException,
>> JobNotSuspendedException, JobTerminatedException into one something
>> like CantApplyToCurrentStateExecption (OGSA-BES approach) and state
>> that the error message should bears current job state ?
>> 2. Guaranteeing atomicity (concerning operation that comes from
>> outside) in DRMAA is almost impossible for the DRMS i know, as usually
>> there is no "lock on job" operation available in public API.
>>
>> Cheers,
>>
>> On 1 March 2010 15:27, Andre Merzky <andre <at> merzky.net> wrote:
>>> Hi Dan,
>>>
>>> Quoting [Daniel Templeton] (Mar 01 2010):
>>>> There is, however, no need to make the exception explicitly
>>>> optional.  Exceptions are by definition optional.
>>>
>>> This may be true in this case, but in general I expect exceptions to
>>> be guaranteed on specific circumstances. For example, I would expect
>>> that a suspend() on an invalid jobid will *always* cause an
>>> exception (mandatory), not only sometimes (optional).
>>>
>>> Best, Andre.
>>>
>>> --
>>> Nothing is ever easy.
>>> --
>>>   drmaa-wg mailing list
>>>   drmaa-wg <at> ogf.org
>>>   http://www.ogf.org/mailman/listinfo/drmaa-wg

Peter Tröger | 2 Mar 2010 21:12

Re: missing exceptions and thread safety


> Sounds reasonable to me.  At a minimum, all of these very specific  
> state
> exception should inherit from a more general exception in languages  
> that
> support it.

A mandatory exception hierarchy was rejected a long time ago; the
reasoning even became part of the DRMAA IDL 1.0 spec:

"Language bindings MAY decide to introduce a hierarchical ordering of  
the DRMAA exceptions through class derivation. In this case it MAY  
also happen that new exceptions are introduced for behavior  
aggregation. In this case, those exceptions SHALL be marked as  
abstract, to prevent them from being thrown."
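
As an illustration of what a language binding could do with this, a
minimal Java sketch under stated assumptions: JobStateException is a
made-up aggregation exception, the three concrete names are the ones
proposed earlier in this thread rather than spec names, and everything
is unchecked only for brevity.

```java
// Hypothetical aggregation exception. It is abstract, so it cannot be
// instantiated and therefore can never be thrown itself.
abstract class JobStateException extends RuntimeException {
    JobStateException(String message) {
        super(message);
    }
}

// Concrete exceptions, named as proposed in this thread.
class JobAlreadySuspendedException extends JobStateException {
    JobAlreadySuspendedException(String message) {
        super(message);
    }
}

class JobNotSuspendedException extends JobStateException {
    JobNotSuspendedException(String message) {
        super(message);
    }
}

class JobTerminatedException extends JobStateException {
    JobTerminatedException(String message) {
        super(message);
    }
}
```

Callers that do not care about the exact reason can catch
JobStateException, while callers that do can still catch the concrete
type.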

>> 1. Can we merge JobAlreadySuspendedException,
>> JobNotSuspendedException, JobTerminatedException into one something
>> like CantApplyToCurrentStateExecption (OGSA-BES approach) and state
>> that the error message should bears current job state ?

I might have missed something, but where did you get these exceptions
from? Another point is that your proposal is already reality. The
decision was made at the F2F meeting in July 2009:

"The former HoldInconsistentStateException,
ReleaseInconsistentStateException, ResumeInconsistentStateException,
and SuspendInconsistentStateException from DRMAA v1.0 are now expressed
as single InconsistentStateException with different meaning per
function"

I would like to ask everybody to reason ONLY about the DRMAAv2 spec in  
the wiki. Every other document is (very likely) outdated.

Thanks and best regards,
Peter.

>> 2. Guaranteeing atomicity (concerning operation that comes from
>> outside) in DRMAA is almost impossible for the DRMS i know, as  
>> usually
>> there is no "lock on job" operation available in public API.
>>
>> Cheers,
>>
>> On 1 March 2010 15:27, Andre Merzky<andre@...>  wrote:
>>
>>> Hi Dan,
>>>
>>> Quoting [Daniel Templeton] (Mar 01 2010):
>>>
>>>> There is, however, no need to make the exception explicitly
>>>> optional.  Exceptions are by definition optional.
>>>>
>>> This may be true in this case, but in general I expect exceptions to
>>> be guaranteed on specific circumstances. For example, I would expect
>>> that a suspend() on an invalid jobid will *always* cause an
>>> exception (mandatory), not only sometimes (optional).
>>>
>>> Best, Andre.
>>>
>>>
>>>
>>> --
>>> Nothing is ever easy.
>>> --
>>>  drmaa-wg mailing list
>>>  drmaa-wg@...
>>>  http://www.ogf.org/mailman/listinfo/drmaa-wg
>>>
>>>
>>
>>
>>
>
> --
>  drmaa-wg mailing list
>  drmaa-wg@...
>  http://www.ogf.org/mailman/listinfo/drmaa-wg

Peter Tröger | 2 Mar 2010 22:35

Re: missing exceptions and thread safety

GFD.133 has a good statement in the description of the control()  
function:

"This routine SHALL return once the action has been acknowledged by  
the DRM system, but does not necessarily wait until the action has  
been completed."

This supports Dan's argument: the point of synchronization, and hence
of atomicity, is the DRMS itself. This was not a problem in DRMAAv1,
since we carefully avoided demanding any kind of state saving in the
library. This changed with the new persistence features. We discussed
possible new race conditions in Hamburg, but couldn't find anything
unsolvable. The new concept demands only the storage of identifiers so
far - for sessions (if supported by the DRM) and jobs. The state still
must be retrieved from the DRM on every usage.
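
A minimal Java sketch of that rule, under stated assumptions: FakeDrm
stands in for the DRM system, and JobHandle, its methods and the state
strings are made up for illustration. The handle keeps nothing but the
job identifier and asks the DRM each time.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for the DRM system, the only place where job state lives.
class FakeDrm {
    private final Map<String, String> states = new ConcurrentHashMap<>();

    // Models any state change, including one made outside of DRMAA.
    void setState(String jobId, String state) {
        states.put(jobId, state);
    }

    String queryState(String jobId) {
        return states.getOrDefault(jobId, "UNDETERMINED");
    }
}

// Hypothetical job handle: stores the identifier only, never the state.
class JobHandle {
    private final String jobId;
    private final FakeDrm drm;

    JobHandle(String jobId, FakeDrm drm) {
        this.jobId = jobId;
        this.drm = drm;
    }

    String getJobId() {
        return jobId;
    }

    // Always a fresh query, so changes made by third parties are visible.
    String getState() {
        return drm.queryState(jobId);
    }
}
```

Because nothing is cached, a suspend issued through some other tool
shows up on the next getState() call without any extra bookkeeping in
the library.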

>>>>    The Job object methods should throw following exceptions:
>>>>    - "JobAlreadySuspendedException" from suspend method when
>>>>      job is already suspended. The DRMAA implementation have
>>>>      to make sure that suspend job is just called once. It is not  
>>>> enough
>>>>      for the DRMAA implementation to rely on own state, it should
>>>>      check the state automatically in order to avoid problems when
>>>>      the state is set outside of DRMAA. Should DRMAA deal with
>>>>      such cases?

Can you provide a link for this text? I cannot find it. It also makes
no real sense - job state should NEVER EVER be persisted in the DRMAA
library itself.

>>> *Can* DRMAA deal with such cases?  These are two operations which
>>> are usually not atomic (1: check for state, 2: suspend) - so how can
>>> a DRMAA client side library ensure that the remote state does not
>>> change between these two calls, e.g. due to a 3rd part API call?

It cannot, and that is not a problem. "Test-and-set" semantics are not
expected of the library here. The DRMS should tell the library that
suspend() is not allowed in the current state. Or in other words -
we expect the job control functions of the DRM system to act (more or
less) like the DRMAA equivalents. So far, this has worked out.

Best,
Peter.

