We are pleased to announce the formal release of SLURM 2.4.0, along
with the first development release of 2.5.
Both are available now for download at
If you are developing new code, please code against the git master
branch, which is constantly updated, so as to avoid as many conflicts
as possible.
Note to BGQ early adopters: Recently there have been a few
changes that require the runjob_mux to run as your SLURM user. The
plugin_flags must also be updated to avoid a possible
runjob_mux crash if you are starting a job and decide to turn off
the slurmctld at the same time. Please read the updated bluegene
web page, http://schedmd.com/slurmdocs/bluegene.html, and look for
"System Administration for BlueGene/Q only" for full instructions.
Thanks for all your help and support. Among other things, 2.4 brings
substantial performance enhancements and many other improvements,
many of which can be found in the RELEASE_NOTES file in the code.
As always, if you find any bugs, let us know through
or the slurm-dev list.
Below are changes for 2.4.0 and 2.5.0-pre1 since the last tag.
* Changes in SLURM 2.4.0
-- Cray - Improve support for zero compute node resource
Partition used can now be configured with no nodes.
-- BGQ - make it so srun -i<taskid> works correctly.
-- Fix parse_uint32/16 to complain if a non-digit is given.
-- Add SUBMITHOST to job state passed to Moab via sched/wiki2.
Patch by Jon
-- BGQ - Fix issue when running with AllowSubBlockAllocations=Yes
compiling with --enable-debug
-- Modify scontrol to require "-dd" option to report batch job's
from Don Albert, Bull.
-- Modify SchedulerParameters option to match documentation:
changed to "bf_resolution=". Patch from Rod Schultz, Bull.
-- Fix bug that clears job pending reason field. Patch from Don
-- In etc/init.d/slurm move check for scontrol after sourcing
/etc/sysconfig/slurm. Patch from Andy Wettstein, University of
-- Fix in scheduling logic that can delay jobs with min/max node
-- BGQ - fix issue where if a step uses the entire allocation and
the next step in the allocation only uses part of the allocation
the correct cnodes.
-- BGQ - Fix checking for IO on a block with new IBM driver V1R1M1
function didn't always work correctly.
-- BGQ - Fix issue when a nodeboard goes down and you want to
to make a larger small block and are running with sub-blocks.
-- BLUEGENE - Better logic for making small blocks around bad
-- BGQ - When using an old IBM driver cnodes that go into error
a job kill timeout aren't always reported to the system. This
handled by the runjob_mux plugin.
-- BGQ - Added information on how to setup the runjob_mux to run as
-- Improve memory consumption on step layouts with high task count.
-- BGQ - quieter debug when the real time server comes back but
still messages we find when we poll but haven't given it back to
-- BGQ - fix for if a request comes in smaller than the smallest
we must use a small block instead of a shared midplane block.
-- Fix issues on large jobs (>64k tasks) to have the correct
counter type when packing the step layout structure.
-- BGQ - fix issue where if a user was asking for tasks and
but not node count the node count is correctly figured out.
-- Move logic to always use the 1st alphanumeric node as the batch
-- BLUEGENE - fix race condition where if a nodeboard/card goes
down at the same time a block is destroyed and that block just
happens to be
smallest overlapping block over the bad hardware.
-- Fix bug when querying accounting looking for a job node size.
-- BLUEGENE - fix possible race condition if cleaning up a block
removal of the job on the block failed.
-- BLUEGENE - fix issue if a cable was in an error state make it so
check if a block is still makable if the cable wasn't in error.
-- Put node names in alphabetical order in node table.
-- If preempted job should have a grace time and preempt mode is
but job is going to be canceled because it is interactive or
it now receives the grace time.
-- BGQ - Modified documents to explain new plugin_flags needed in
in order for the runjob_mux to run correctly.
-- BGQ - change linking from libslurm.o to libslurmhelper.la to
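
As an illustration of the parse_uint32/16 change listed above (complain
if a non-digit is given), here is a minimal Python sketch of strict
unsigned parsing. It is not SLURM's C implementation; the function name
parse_uint32_strict and its error messages are hypothetical and only
show the idea of rejecting input that is not made up entirely of digits.

```python
UINT32_MAX = 0xFFFFFFFF  # largest value a 32-bit unsigned field can hold


def parse_uint32_strict(text):
    """Parse a decimal string as an unsigned 32-bit integer.

    Hypothetical helper for illustration only; it is not SLURM's
    parse_uint32, just the same idea of rejecting non-digit input.
    """
    stripped = text.strip()
    # Accept ASCII digits only, so "12abc", "-1", "1.5" and "" all fail.
    if not stripped or not all("0" <= ch <= "9" for ch in stripped):
        raise ValueError("not an unsigned integer: %r" % text)
    value = int(stripped)
    if value > UINT32_MAX:
        raise ValueError("value out of range for uint32: %r" % text)
    return value
```

A lenient parser that stops at the first non-digit would silently turn
"12abc" into 12; the strict form raises an error instead, so a
malformed configuration value is reported rather than misread.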
* Changes in SLURM 2.5.0.pre1
-- Add new output to "scontrol show configuration" of LicensesUsed.
-- Changed jobacct_gather plugin infrastructure to be cleaner and
-- Change license option count separator from "*" to ":" for
the gres option (e.g. "--licenses=foo:2 --gres=gpu:2"). The "*"
be accepted, but is no longer documented.
-- Permit more than 100 jobs to be scheduled per node (new limit is
-- Restructure of srun code to allow outside programs to utilize
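
To illustrate the license count separator change listed above (":" is
now the documented form and "*" is still accepted), here is a minimal
Python sketch. It is not SLURM code: the function name
parse_license_spec is hypothetical, and the comma-separated list form
and the default count of 1 are assumptions made for the example.

```python
def parse_license_spec(spec):
    """Parse a license request such as "foo:2,bar" into {name: count}.

    Hypothetical helper, not SLURM code. ":" is the documented
    separator; the older "*" separator is still accepted. The
    comma-separated list and the default count of 1 are assumptions.
    """
    licenses = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        name, count = item, 1
        # Try the new ":" separator first, then fall back to the old "*".
        for sep in (":", "*"):
            if sep in item:
                name, _, count_text = item.partition(sep)
                if not (count_text.isascii() and count_text.isdigit()):
                    raise ValueError("bad license count in %r" % item)
                count = int(count_text)
                break
        if not name:
            raise ValueError("missing license name in %r" % item)
        licenses[name] = licenses.get(name, 0) + count
    return licenses
```

With this sketch, "foo:2" and "foo*2" both yield a count of 2 for foo,
which mirrors the backward compatibility described in the entry.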