v1.17.0: Initial Support for GKE, Slurm v5.6.3
Key New Features
- Initial Support for Kubernetes with GKE (example).
- Enable specification of all fields of module outputs
- Instructions to run the toolkit from Cloud Workstations
New Modules
- gke-cluster: module to create a Google Kubernetes Engine (GKE) cluster
- gke-node-pool: module to create a Google Kubernetes Engine (GKE) node pool
Module Improvements
- startup-script: replace example scripts with bool inputs
- custom-image: added image_storage_locations input
- custom-image: use a unique Packer SSH username to avoid clashes with previous Packer builds
- htcondor-configure: address need for SystemD override
- htcondor-configure: ensure that a central manager optimization is configured even when high availability is not enabled
- chrome-remote-desktop: updated for Slurm image support
Improvements
- Added support for OFE deployment from a configuration file
Version updates
- schedmd-slurm-gcp-v5-controller: update SchedMD modules to 5.6.3
What's Changed
- Replace startup-script examples with bool inputs by @mr0re1 in #1100
- Copy all embedded modules into deployment, use unique source for locals by @mr0re1 in #1086
- Close copy file descriptor in EmbeddedSourceReader by @mr0re1 in #1114
- Improve error match in embedded_test by @mr0re1 in #1115
- Adds a gke-cluster module to community by @nick-stroud in #1113
- DAOS docs update by @cboneti in #1116
- Simplify and relax type constraints for variables.tf by @mr0re1 in #1111
- Make every integration test into individual build config by @mr0re1 in #1112
- Fix validator test_deployment_variable_not_used by @mr0re1 in #1120
- Add basic documentation for gke-cluster module and example by @nick-stroud in #1117
- Updating packer documentation to make usage easier to find by @cboneti in #1118
- Add image_storage_locations input to modules/packer/custom-image by @mr0re1 in #1123
- Add TF definition for DAILY-test-X, PR-test-X, and PR-validation by @mr0re1 in #1119
- Add "babysit_tests" tool to automatically approve PR tests by @mr0re1 in #1106
- Solve state/world discrepancies in TF dev infra. by @mr0re1 in #1126
- Move SlurmV5 tests affected by stockouts to us-west4-c by @mr0re1 in #1124
- Improve variable references by @tpdownes in #1127
- Remove test groups, update documentation by @mr0re1 in #1128
- Fix bug in check for mixing module kinds within a group by @mr0re1 in #1130
- Update GitHub bug report template by @mr0re1 in #1131
- Remove deprecated pod_security_policy by @nick-stroud in #1133
- Add test selectors to babysit tool by @mr0re1 in #1136
- Add TF for legacy PR tests. To be removed after release by @mr0re1 in #1135
- Add SPACK_CACHE secret to spack-gromacs test by @mr0re1 in #1132
- Add instructions for connecting to the gke-cluster by @nick-stroud in #1138
- Address need for SystemD override in HTCondor module by @tpdownes in #1139
- Update TFLint and rules plugin for Google Cloud Platform by @tpdownes in #1146
- Add double quotes on variables: SC2086 – ShellCheck by @nick-stroud in #1148
- Add support for sensitive output values by @tpdownes in #1129
- Represent TerraformBackend.Config with cty.Value by @mr0re1 in #1141
- Bump github.com/otiai10/copy from 1.9.0 to 1.10.0 by @dependabot in #1143
- Bump github.com/spf13/cobra from 1.6.1 to 1.7.0 by @dependabot in #1145
- Truncate short sha length to 7 chars when filtering from cloud build by @nick-stroud in #1151
- Bump google.golang.org/api from 0.114.0 to 0.117.0 by @dependabot in #1150
- Bring develop up to date with release of v1.16.0 by @nick-stroud in #1153
- Pin google terraform provider to latest version by @nick-stroud in #1154
- Add selectors for batch and spack tests to babysit_tests tool by @nick-stroud in #1155
- Reduce the number of execution hosts in pbs test to reduce the change… by @nick-stroud in #1149
- Ensure that PBS test config explicitly uses network module by @tpdownes in #1159
- Align internal use of Toolkit GitHub refs by @tpdownes in #1160
- Move Ubuntu test and example to reduce chance of stockout by @nick-stroud in #1163
- Fix HTCondor central manager configuration by @tpdownes in #1162
- Add specialized tokenizer to handle ((HCL literals)) by @mr0re1 in #1167
- Move Slurm v5 high io test to reduce stockouts by @nick-stroud in #1168
- Gke node pool by @nick-stroud in #1140
- Make babysit_tests compatible with Python3.7 (VertexAI) by @mr0re1 in #1173
- Instructions to run the toolkit from Cloud Workstations by @cboneti in #1170
- Write group metadata to deployment folder by @tpdownes in #1169
- Update quantum example with new build instructions by @tpdownes in #1176
- Add TransformSimpleToHcl for cty.Value by @mr0re1 in #1165
- Developer setup on login is causing workstation to crash on startup by @nick-stroud in #1177
- Add conditions on Slurm partition enable_placement, exclusive, Oversu… by @mr0re1 in #1174
- Move tests to avoid stockouts by @nick-stroud in #1179
- Use a unique Packer SSH username to avoid clashes with previous Packer builds by @nick-stroud in #1184
- Bump google.golang.org/api from 0.117.0 to 0.118.0 by @dependabot in #1183
- Bump cloud.google.com/go/compute from 1.19.0 to 1.19.1 by @dependabot in #1182
- Update SchedMD modules to 5.6.3 (from 5.6.2) by @SkylerMalinowski in #1171
- Updated chrome remote desktop module for slurm image support by @saltysoup in #1181
- Add HclExpression struct by @mr0re1 in #1180
- Make "2-component" sole valid format. by @mr0re1 in #1188
- Rename HclExpression -> Expression by @mr0re1 in #1189
- Ensure that outputs are written as HCL references by @tpdownes in #1190
- Make Blueprint.Vars a Dict instead of map[string]interface{} by @mr0re1 in #1187
- OFE deployment using config file. by @ek-nag in #1157
- Updating the builder image to use newer shellcheck by @cboneti in #1191
- Minor changes to Expression by @mr0re1 in #1192
- Enable intergroup references in "advanced usage" mode by @tpdownes in #1193
- Update GKE documentation to include gke-node-pool by @nick-stroud in #1196
- Refactor resolution of RequiredApis into separate function, so rest c… by @mr0re1 in #1194
- Allow user to specify authorized networks in settings by @nick-stroud in #1197
- Limiting line length to 2K to avoid bufio.Scanner: token too long issues by @cboneti in #1198
- Fix missing edge in graph for explicit connections by @tpdownes in #1201
- Make validatorConfig.Inputs Dict by @mr0re1 in #1195
- Add "Golden copy" test runner with single test by @mr0re1 in #1202
- Ensure IGC outputs are marked sensitive to reduce output verbosity by @tpdownes in #1199
- Use struct for packer/terraform kind values by @tpdownes in #1200
- Add missing files for golden copy test by @mr0re1 in #1203
- Add golden copy tests for terraform IGC by @mr0re1 in #1204
- Add license to generated YAML files by @mr0re1 in #1206
- Update DAILY integration tests schedule to run on weekdays only. by @mr0re1 in #1208
- Remove unused DeploymentConfig.expanded by @mr0re1 in #1207
- Update PR template to remove checklist by @tpdownes in #1212
- Address security alert CVE-2023-30608 by @tpdownes in #1241
- Update version to 1.17.0 by @mr0re1 in #1266
- Cherry-pick NVIDIA compilation fix by @mr0re1 in #1271
- Release v1.17.0 by @mr0re1 in #1276
Full Changelog: v1.16.0...v1.17.0