Moe Jette | 19 Mar 22:29 2015

Slurm versions 14.11.5 and 15.08.0-pre3 are now available


Version 14.11.5 contains quite a few bug fixes generated over the past
five weeks, including two high-impact bugs. There is a fix for the
slurmdbd daemon aborting if a node is set to a DOWN state and its
"reason" field is NULL. The other important bug fix prevents a user
from killing a job array belonging to another user.
Details about all of the changes are appended.

Version 15.08.0-pre3 represents the current state of Slurm development
for the release planned in August 2015 and is intended for development
and test purposes only. Notable enhancements include power capping
support for Cray systems and the ability for a compute node to be
allocated to multiple jobs, but restricted to one user at a time.

Both versions can be downloaded from
http://www.schedmd.com/#repos

* Changes in Slurm 14.11.5
==========================
  -- Correct the squeue command taking into account that a node can
     have NULL name if it is not in DNS but still in slurm.conf.
  -- Fix slurmdbd regression which would cause a segfault when a node is set
     down with no reason.
  -- BGQ - Fix issue with job arrays not being handled correctly
     in the runjob_mux plugin.
  -- Print FAIR_TREE, if configured, in "scontrol show config" output for
     PriorityFlags.
  -- Add SLURM_JOB_GPUS environment variable to those available in the Prolog.
  -- Load lua-5.2 library if using lua5.2 for lua job submit plugin.
  -- GRES logic: Prevent bad node_offset due to not preserving no_consume flag.

jette | 12 Feb 22:49 2015

Slurm versions 14.11.4 and 15.08.0-pre2 are now available


Slurm versions 14.11.4 and 15.08.0-pre2 are now available from

http://www.schedmd.com/#repos

Version 14.11.4 contains quite a few bug fixes generated over the past
five weeks. Several of these are related to job arrays, including one
that can cause the slurmctld daemon to abort. Version 15.08.0-pre2
represents the current state of Slurm development for the release
planned in August 2015 and is intended for development and test
purposes only. It includes some development work for burst buffers,
power management, and inter-cluster job dependencies. More details
about the changes are shown below.

* Changes in Slurm 14.11.4
==========================
  -- Make sure assoc_mgr locks are initialized correctly.
  -- Correct check of enforcement when filling in an association.
  -- Make sacctmgr print out classification correctly for clusters.
  -- Add array_task_str to the perlapi job info.
  -- Fix for slurmctld abort with GRES types configured and no CPU binding.
  -- Fix for GRES scheduling where count > 1 per topology type (or GRES types).
  -- Make CR_ONE_TASK_PER_CORE work correctly with task/affinity.
  -- job_submit/pbs - Fix possible deadlock.
  -- job_submit/lua - Add "alloc_node" to job information available.
  -- Fix memory leak in mysql accounting when usage rollup happens.
  -- If users specify ALL together with other variables using the
     --export sbatch/srun command line option, propagate the users'
     environ to the execution side.
  -- Fix job array scheduling anomaly that can stop scheduling of valid tasks.

Danny Auble | 12 Dec 20:56 2014

Slurm versions 14.11.2 and 15.08.0-pre1 are now available


Slurm versions 14.11.2 and 15.08.0-pre1 are now available. Version 
14.11.2 includes quite a few relatively minor bug fixes.

Version 15.08.0 is under active development and its release is planned 
in August 2015.  While this is the first pre-release there is already 
quite a bit of new functionality.

Both versions can be downloaded from http://schedmd.com/#repos

Highlights of the two versions follow.

* Changes in Slurm 14.11.2
==========================
  -- Fix Centos5 compile errors.
  -- Fix issue with association hash not getting the correct index which
     could result in seg fault.
  -- Fix salloc/sbatch -B segfault.
  -- Avoid huge malloc if GRES configured with "Type" and huge "Count".
  -- Fix jobs from starting in overlapping reservations that won't finish
     before a "maint" reservation begins.
  -- When node gets drained while in state mixed display its status as
     draining in sinfo output.
  -- Allow priority/multifactor to work with sched/wiki(2) if all priorities
     have no weight.  This allows for association and QOS decay limits to
     work.
  -- Fix "squeue --start" to override SQUEUE_FORMAT env variable.
  -- Fix scancel to be able to cancel multiple jobs that are space

jette | 26 Nov 19:39 2014

Slurm version 14.11.1 is now available


We have just released Slurm version 14.11.1. This includes a fix for a  
race condition that can deadlock the slurmctld daemon when job_submit  
plugins are used, plus a few minor changes as identified below. You  
can download it from:
http://www.schedmd.com/#repos

* Changes in Slurm 14.11.1
==========================
  -- Get libs correct when doing the xtree/xhash make check.
  -- Update xhash/tree make check to work correctly with current code.
  -- Remove the reference 'experimental' for the jobacct_gather/cgroup
     plugin.
  -- Add QOS manipulation examples to the qos.html documentation page.
  -- If 'squeue -w node_name' specifies an unknown host name print
     an error message and return 1.
  -- Fix race condition in job_submit plugin logic that could cause
     slurmctld to deadlock.
  -- Job wait reason of "ReqNodeNotAvail" expanded to identify unavailable
     nodes (e.g. "ReqNodeNotAvail(Unavailable:tux[3-6])").
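
For scripts that read the job wait reason, the expanded string can be
split into individual node names. The following is a minimal sketch, not
Slurm code: the function names are hypothetical, and it assumes the reason
has exactly the form shown in the example above, with a single bracketed
numeric range list. Real Slurm hostlist expressions can be richer than
this, and "scontrol show hostnames" is the supported way to expand them.

```python
import re


def expand_simple_hostlist(expr):
    """Expand 'tux[3-6]' or 'tux[1,3-6]' into individual host names.

    Hypothetical helper: handles only one bracketed list of numbers and
    numeric ranges. Anything else is returned unchanged as a single name.
    """
    match = re.fullmatch(r"([^\[\]]+)\[([0-9,\-]+)\]", expr)
    if match is None:
        return [expr]
    prefix, body = match.groups()
    hosts = []
    for part in body.split(","):
        low, _, high = part.partition("-")
        high = high or low          # a single number is a one-element range
        width = len(low)            # keep zero padding such as tux[01-03]
        for number in range(int(low), int(high) + 1):
            hosts.append(f"{prefix}{number:0{width}d}")
    return hosts


def unavailable_nodes(reason):
    """Return the nodes named in a 'ReqNodeNotAvail(Unavailable:...)' reason.

    Returns an empty list when the reason carries no node list.
    """
    match = re.fullmatch(r"ReqNodeNotAvail\(Unavailable:(.+)\)", reason)
    if match is None:
        return []
    return expand_simple_hostlist(match.group(1))


if __name__ == "__main__":
    print(unavailable_nodes("ReqNodeNotAvail(Unavailable:tux[3-6])"))
```

A plain "ReqNodeNotAvail" reason, as reported by earlier versions, yields an
empty list here, so a caller can treat both forms the same way.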
-- 
Morris "Moe" Jette
CTO, SchedMD LLC

Danny Auble | 14 Nov 00:18 2014

Slurm 14.11.0 is now available


Slurm version 14.11.0 is now available. This is a major Slurm release 
with many new features. See the RELEASE_NOTES and NEWS files in the 
distribution for detailed descriptions of the changes, a few of which 
are noted below.

Upgrading from Slurm versions 2.6 or 14.03 should proceed without loss 
of jobs or other state.  Just be sure to upgrade the slurmdbd first. 
(Upgrades from pre-releases of version 14.11 may result in job loss.)

Slurm downloads are available from http://www.schedmd.com/#repos.

Thanks to all those who helped make this release!

Highlights of changes in Slurm version 14.11.0 include:
  -- Added job array data structure and removed 64k array size restriction.
  -- Added support for reserving CPUs and/or memory on a compute node for
     system use.
  -- Added support for allocation of generic resources by model type for
     heterogeneous systems (e.g. request a Kepler GPU, a Tesla GPU, or a GPU
     of any type).
  -- Added support for non-consumable generic resources that are limited, but
     can be shared between jobs.
  -- Added support for automatic job requeue policy based on exit value.
  -- Refactor job_submit/lua interface. LUA FUNCTIONS NEED TO CHANGE! The
     lua script no longer needs to explicitly load meta-tables, but information

Danny Auble | 4 Nov 02:35 2014

Slurm versions 14.03.10 and 14.11.0-rc3 are now available


Slurm version 14.03.10 includes quite a few relatively minor bug fixes, 
and will most likely be the last 14.03 release.  Thanks to all those who 
helped make this a very stable release.

We hope to officially tag 14.11.0 before SC14.  Version 14.11.0-rc3 
includes a few bug fixes discovered in recent testing but is looking 
very stable. Thanks to everyone participating in the testing!  If you 
can, please test this release so we can attempt to fix as many issues as 
we can before we tag 14.11.0.

Just a heads up: version 15.08 is already in development, and we will
most likely tag a pre1 of it later this month.

Slurm downloads are available from http://www.schedmd.com/#repos.

Here are some snips from the NEWS file on what has changed since the 
last releases.

* Changes in Slurm 14.03.10
===========================
  -- Fix a few sacctmgr error messages.
  -- Treat non-zero SlurmSchedLogLevel without SlurmSchedLogFile as a fatal
     error.
  -- Correct sched_config.html documentation SchedulingParameters
     should be SchedulerParameters.
  -- When using gres and cgroup ConstrainDevices set correct access
     permission for the batch step.
  -- Fix minor memory leak in jobcomp/mysql on slurmctld reconfig.
  -- Fix bug that prevented preservation of a job's GRES bitmap on slurmctld

jette | 17 Oct 23:08 2014

Slurm versions 14.03.9 and 14.11.0-rc2 are now available


Slurm versions 14.03.9 and 14.11.0-rc2 are now available.
Version 14.03.9 includes quite a few relatively minor bug fixes.
Version 14.11.0-rc2 includes a few bug fixes discovered in recent testing.
Thanks to everyone participating in the testing!
Version 14.11.0 is no longer under active development, but is undergoing
testing for a planned release in early November.

Slurm downloads are available from
http://www.schedmd.com/#repos

* Changes in Slurm 14.03.9
==========================
  -- If slurmd fails to stat(2) the configuration print the string describing
     the error code.
  -- Fix for mixing core base reservations with whole node based reservations
     to avoid overlapping erroneously.
  -- BLUEGENE - Remove references to Base Partition.
  -- sview - If compiled on a non-bluegene system then used to view a BGQ fix
     to allow sview to display blocks correctly.
  -- Fix bug in update reservation. When modifying the reservation the end time
     was set incorrectly.
  -- The start time of a reservation that is in ACTIVE state cannot be
     modified.
  -- Update the cgroup documentation about release agent for devices.
  -- MYSQL - fix for setting up preempt list on a QOS for multiple QOS.
  -- Correct a minor error in the scancel.1 man page related to the
     --signal option.
  -- Enhance the scancel.1 man page to document the sequence of signals sent
  -- Fix slurmstepd core dump if the cgroup hierarchy is not completed

jette | 29 Sep 18:38 2014

Slurm User Group Meeting 2014: Presentations now online


About 70 people attended the Slurm User Group Meeting last week in
Lugano, Switzerland. There were a lot of good presentations and
discussions. Copies of the presentations are now available online at
http://slurm.schedmd.com/publications.html

NOTE: A few of the presentations are missing, but will be posted when  
available.

Danny Auble | 17 Sep 22:58 2014

Slurm versions 14.03.8 and 14.11.0-pre5 are now available


Slurm versions 14.03.8 and 14.11.0-pre5 are now available. Version 
14.03.8 includes quite a few relatively minor bug fixes.

Version 14.11.0 is under active development and its release is planned
in November 2014.  Many of its features and performance enhancements
will be discussed next week at SLUG 2014 in Lugano, Switzerland.

Note to all developers: the code freeze for new features in 14.11 will be
at the end of this month (September).

Slurm downloads are available from http://www.schedmd.com/#repos.

Highlights of the two versions follow.

* Changes in Slurm 14.03.8
==========================
  -- Fix minor memory leak when Job doesn't have nodes on it (Meaning the job
     has finished)
  -- Fix sinfo/sview to be able to query against nodes in reserved and other
     states.
  -- Make sbatch/salloc read in (SLURM|(SBATCH|SALLOC))_HINT in order to
     handle sruns in the script that will use it.
  -- srun properly interprets a leading "." in the executable name based upon
     the working directory of the compute node rather than the submit host.
  -- Fix Lustre misspellings in hdf5 guide
  -- Fix wrong reference in slurm.conf man page to what --profile option should

jette | 15 Sep 21:01 2014

Slurm User Group Meeting 2014 - Last call to register


Registration for the Slurm User Group Meeting in Lugano, Switzerland  
closes in two hours. If you plan to attend and have not yet  
registered, please do so now.

http://slurm.schedmd.com/slurm_ug_agenda.html#registration

Slurm User Group Meeting
September 23-24, Lugano, Switzerland
Find out more http://slurm.schedmd.com/slurm_ug_agenda.html

jette | 20 Aug 00:11 2014

New Slurm releases and Slurm User Group Meeting


The 2014 Slurm User Group Meeting will be held on September 23 and 24 in
Lugano, Switzerland. The meeting will include an assortment of tutorials,
technical presentations, and site reports. Prof. Felix Schürmann with the
European Human Brain Project will be our keynote speaker. Early registration
ends this week. For more information, see
http://slurm.schedmd.com/slurm_ug_agenda.html

Slurm versions 14.03.7 and 14.11.0-pre4 are now available.
Version 14.03.7 includes quite a few relatively minor bug fixes.
Version 14.11.0-pre4 includes a new job array data structure and APIs for
managing job arrays. These changes provide vastly improved scalability with
respect to job arrays. Version 14.11.0 is under active development and its
release is planned in November 2014.

Slurm downloads are available from
http://www.schedmd.com/#repos

Highlights of changes in Slurm version 14.03.7 include:
  -- Correct typos in man pages.
  -- Add note to MaxNodesPerUser and multiple jobs running on the same node
     counting as multiple nodes.
  -- PerlAPI - fix renamed call from slurm_api_set_conf_file to
     slurm_conf_reinit.
  -- Fix gres race condition that could result in job deallocation error
     message.
  -- Correct NumCPUs count for jobs with --exclusive option.
  -- When creating reservation with CoreCnt, check that Slurm uses
     SelectType=select/cons_res, otherwise don't send the request to slurmctld

