21.09

@dholt dholt released this 01 Oct 00:02

DeepOps 21.09 Release Notes

What's New

Release 21.09 is mostly a bug-fix release.

General

  • Support for DGX OS 5 in nvidia-dgx role

Slurm

  • Slurm version 21.08.1
  • HPC SDK 21.9
  • Open OnDemand v2.0.9
  • CUDA toolkit 11.4
  • Slurm Pyxis plugin 0.11.1
  • Enroot container runtime v3.2.0
  • Hwloc 2.5.0, pmix 3.2.3
  • Spack v0.16.2

K8s

  • Kubernetes version v1.20.7 (kubespray v2.16.0)
  • Helm version v3.5.4
  • GPU Operator v1.8.2 (GPU driver 470.57.02)
  • GPU Device Plugin v0.9.0
  • GPU Feature Discovery v0.4.1
  • NFS Client Provisioner v4.0.13

Changes

  • Docker version 20.10

Bugs/Enhancements

  • Improved cleanup in Slurm epilog (#965)
  • Fix disabling NVIDIA driver install on Slurm cluster install (#948)
  • Permit SFTP in default SSHD config (#980)
  • Address different possible DCGM service names depending on version (#983)
  • Fix PAM Slurm adopt/login (#989)
  • Enroot: adjust cache directory to be per-user (#997)
  • Adding proxy support for downloading of hwloc, pmix, nhc and slurm (#1002)
  • Remove broken offline deployment support and clarify documentation (#1012)
  • Grafana: add var for custom config template (#994)
  • EasyBuild: Enable both shells on all distros (#993)
  • Default to building Slurm with dynamic libs (#1021)
  • ood-wrapper: Don't install python3-passlib on CentOS 7 (#995)
  • Update ansible-role-enroot to 0.5.0 (#1030)

Upgrade steps

If you are upgrading to this version of DeepOps from a previous release, follow the upgrade section of the Slurm or Kubernetes Deployment Guide. In addition, re-run the ./scripts/setup.sh script and add any new variables from the config.example files to your existing config. For a full diff from release 21.06, run git diff 21.06 21.09 -- config.example/. If you encounter a problem, please open a GitHub issue. See the update guide for additional guidance.
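The config part of the upgrade can be sketched in shell as below. This is an illustration, not part of the DeepOps tooling: the helper name missing_config_files is hypothetical, and it assumes the existing configuration lives in a directory named config next to config.example. It only reports files that exist in config.example but not in your config; new variables inside files you already have still need the git diff shown in the comments.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with DeepOps): print every file that
# exists under the example config directory but is missing from the
# existing config directory. Paths are printed relative to the roots.
missing_config_files() {
    local example_dir="$1" config_dir="$2"
    # List example files relative to example_dir, in a stable order.
    (cd "$example_dir" && find . -type f | sort) | while read -r f; do
        # Report the file only if the existing config has no counterpart.
        [ -e "$config_dir/$f" ] || printf '%s\n' "${f#./}"
    done
}

# Possible use from the root of a DeepOps checkout (assumes the release
# tags 21.06 and 21.09 are available locally and the config directory
# is ./config):
#   git fetch --tags
#   git checkout 21.09
#   ./scripts/setup.sh
#   git diff 21.06 21.09 -- config.example/
#   missing_config_files config.example config
```

Any file the helper prints can be copied over from config.example and then adjusted; the variable-level changes come from the git diff output.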

Notes