Custom Images in the HPC Toolkit

Introduction

This module uses Packer to create an image within an HPC Toolkit deployment. Packer operates by provisioning a short-lived VM in Google Cloud on which it executes scripts to customize the boot disk for repeated use. The VM's boot disk is specified from a source image that defaults to the HPC VM Image. This Packer "template" supports customization through the following approaches, which should be combined according to the recommended use described later in this document:

startup-script metadata from raw string or file
Shell scripts uploaded from the Packer execution environment to the VM
Ansible playbooks uploaded from the Packer execution environment to the VM

These approaches can be specified independently of one another, so anywhere from 1 to 3 of them can be used simultaneously. If no scripts are supplied, the source boot disk is effectively copied to your project without customization. This can be useful when increased control over the image maintenance lifecycle is desired or when policies restrict the use of images to internal projects.

Order of execution

The startup script specified in metadata executes in parallel with the other supported methods. However, the remaining methods execute in a well-defined order relative to one another.

All shell scripts will execute in the configured order
After shell scripts complete, all Ansible playbooks will execute in the configured order

NOTE: if both startup_script and startup_script_file are specified, then startup_script_file takes precedence.

Recommended use

Because the metadata startup script executes in parallel with the other solutions, conflicts can arise, especially when package managers (yum or apt) lock their databases during package installation. Therefore, it is recommended to choose one of the following approaches:

Specify either startup_script or startup_script_file and do not specify shell_scripts or ansible_playbooks.
- This can be especially useful in environments that restrict SSH access
Specify any combination of shell_scripts and ansible_playbooks and do not specify startup_script or startup_script_file.

If any of the shell_scripts or ansible_playbooks fail by returning a code other than 0, Packer will determine that the build has failed and refuse to save the resulting disk.
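
As a minimal sketch of the second approach, the following Packer variables fragment sets only shell_scripts and ansible_playbooks and leaves both startup script variables unset. All file paths and the -vv argument are hypothetical placeholders; the attribute names are taken from the Inputs table. Required variables such as project_id and zone are omitted for brevity.

```hcl
# Hypothetical local scripts; they are listed in the order they should run.
shell_scripts = [
  "./scripts/install-packages.sh",
  "./scripts/configure-system.sh",
]

# Hypothetical playbook configuration; playbooks run after the shell scripts.
ansible_playbooks = [
  {
    playbook_file   = "./ansible/site.yml"
    galaxy_file     = "./ansible/requirements.yml"
    extra_arguments = ["-vv"]
  },
]

# startup_script and startup_script_file are intentionally left unset so that
# no metadata startup script runs in parallel with the methods above.
```

With these settings, the two shell scripts would run in the listed order, followed by the playbook, and a non-zero exit code from any of them would fail the build.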

NOTE: there is an existing issue that can cause failures of the startup_script or startup_script_file not to be detected as failures by Packer.

External access with SSH

The shell script and Ansible playbook customization solutions both require SSH access to the VM from the Packer execution environment. SSH access can be enabled in one of 2 ways:

The VM is created without a public IP address and SSH tunnels are created using Identity-Aware Proxy (IAP).
- Allow use_iap to take on its default value of true
The VM is created with an IP address on the public internet and firewall rules allow SSH access from the Packer execution environment.
- Set omit_external_ip = false (or omit_external_ip: false in a blueprint)
- Add firewall rules that open SSH to the VM

The Packer template defaults to the 1st, IAP-based solution because it is more secure (no exposure to the public internet) and because the Toolkit VPC module automatically sets up all necessary firewall rules for SSH tunneling and outbound-only access to the internet through Cloud NAT.

In either SSH solution, customization scripts should be supplied as files in the shell_scripts and ansible_playbooks settings.
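
For the second, public-IP solution, a Packer variables fragment might look like the sketch below. The tag name packer-ssh is an assumption: it stands for whatever network tag your own firewall rule targets to allow SSH from the Packer execution environment. The script path is likewise hypothetical, and the firewall rule itself must be created separately.

```hcl
# Give the build VM a public IP address instead of relying on IAP tunnels.
omit_external_ip = false

# Hypothetical network tag; it must match a firewall rule that you create
# separately to allow SSH (TCP port 22) from the Packer execution environment.
tags = ["packer-ssh"]

# Customization scripts are supplied as files (hypothetical path).
shell_scripts = ["./scripts/install-packages.sh"]
```

For the default IAP-based solution, none of these SSH-related settings need to be changed.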

Environments without SSH access

Many network environments disallow SSH access to VMs. In these environments, the metadata-based startup scripts are appropriate because they execute entirely independently of the Packer execution environment.

In this scenario, a single script should be supplied in the form of a string to the startup_script input variable. This solution integrates well with Toolkit runners. Runners operate by using a single startup script whose behavior is extended by downloading and executing a customizable set of runners from Cloud Storage at startup.

NOTE: Packer will attempt to use SSH if either shell_scripts or ansible_playbooks are set to non-empty values. Leave them at their default, empty values to ensure access by SSH is disabled.
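
A minimal sketch of a variables fragment for an environment without SSH access is shown below. It uses startup_script_file rather than a string, which the recommended use above also permits; the script path is a hypothetical placeholder. The two empty lists match the documented defaults and are written out only to make the intent explicit.

```hcl
# Hypothetical path to a local script that is uploaded as the startup script.
startup_script_file = "./scripts/customize-image.sh"

# Leave the SSH-based methods at their default, empty values so that Packer
# does not attempt an SSH connection.
shell_scripts     = []
ansible_playbooks = []
```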

Supplying startup script as a string

The startup_script parameter accepts scripts formatted as strings. In Packer and Terraform, multi-line strings can be specified using heredoc syntax in an input Packer variables file (*.pkrvars.hcl). For example, the following snippet defines a multi-line bash script followed by an integer representing the size, in GiB, of the resulting image:

startup_script = <<-EOT
  #!/bin/bash
  yum install -y epel-release
  yum install -y jq
  EOT

disk_size = 100

In a blueprint, the equivalent syntax is:

...
    settings:
      startup_script: |
        #!/bin/bash
        yum install -y epel-release
        yum install -y jq
      disk_size: 100
...

Monitoring startup script execution

When using startup script customization, Packer will print very limited output to the console. For example:

==> example.googlecompute.toolkit_image: Waiting for any running startup script to finish...
==> example.googlecompute.toolkit_image: Startup script not finished yet. Waiting...
==> example.googlecompute.toolkit_image: Startup script not finished yet. Waiting...
==> example.googlecompute.toolkit_image: Startup script, if any, has finished running.

With the default value of var.scopes (see Inputs), which includes the logging.write scope, the output of startup script execution will be stored in Cloud Logging. It can be examined using the Cloud Logging Console or with a gcloud logging read command (substituting <<PROJECT_ID>> with your project ID):

$ gcloud logging --project <<PROJECT_ID>> read \
    'logName="projects/<<PROJECT_ID>>/logs/GCEMetadataScripts" AND jsonPayload.message=~"^startup-script: "' \
    --format="table[box](timestamp, resource.labels.instance_id, jsonPayload.message)" --freshness 2h

Note that this command will print all startup script entries in the project that fall within the "freshness" window, in reverse chronological order (newest first). You may need to identify the instance ID of the Packer VM and filter further by that value using gcloud or grep. To print the entries in the order they would have appeared on your console, we recommend piping the output of this command to the standard Linux utility tac.

Example

The included blueprint demonstrates a solution that builds an image using:

The HPC VM Image as a base upon which to customize
A VPC network with firewall rules that allow IAP-based SSH tunnels
A Toolkit runner that installs a custom script

Please review the examples README for usage instructions.

License

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

 http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Requirements

No requirements.

Providers

No providers.

Modules

No modules.

Resources

No resources.

Inputs

Name	Description	Type	Default	Required
accelerator_count	Number of accelerator cards to attach to the VM; not necessary for families that always include GPUs (A2).	`number`	`null`	no
accelerator_type	Type of accelerator cards to attach to the VM; not necessary for families that always include GPUs (A2).	`string`	`null`	no
ansible_playbooks	A list of Ansible playbook configurations that will be uploaded to customize the VM image	`list(object({ playbook_file = string, galaxy_file = string, extra_arguments = list(string) }))`	`[]`	no
deployment_name	HPC Toolkit deployment name	`string`	n/a	yes
disk_size	Size of disk image in GB	`number`	`null`	no
image_family	The family name of the image to be built. Image name will also be derived from this value. Defaults to `deployment_name`	`string`	`null`	no
labels	Labels to apply to the short-lived VM	`map(string)`	`null`	no
machine_type	VM machine type on which to build new image	`string`	`"n2-standard-4"`	no
network_project_id	Project ID of Shared VPC network	`string`	`null`	no
omit_external_ip	Provision the image building VM without a public IP address	`bool`	`true`	no
on_host_maintenance	Describes maintenance behavior for the instance. If left blank this will default to `MIGRATE` except the use of GPUs requires it to be `TERMINATE`	`string`	`null`	no
project_id	Project in which to create VM and image	`string`	n/a	yes
scopes	Service account scopes to attach to the instance. See https://cloud.google.com/compute/docs/access/service-accounts.	`list(string)`	[ "https://www.googleapis.com/auth/userinfo.email", "https://www.googleapis.com/auth/compute", "https://www.googleapis.com/auth/devstorage.full_control", "https://www.googleapis.com/auth/logging.write" ]	no
service_account_email	The service account email to use. If null or 'default', then the default Compute Engine service account will be used.	`string`	`null`	no
shell_scripts	A list of paths to local shell scripts which will be uploaded to customize the VM image	`list(string)`	`[]`	no
source_image	Source OS image to build from	`string`	`null`	no
source_image_family	Alternative to source_image. Specify image family to build from latest image in family	`string`	`"hpc-centos-7"`	no
source_image_project_id	A list of project IDs to search for the source image. Packer will search the first project ID in the list first, and fall back to the next in the list, until it finds the source image.	`list(string)`	`null`	no
ssh_username	Username to use for SSH access to VM	`string`	`"packer"`	no
startup_script	Startup script (as raw string) used to build the custom VM image (overridden by var.startup_script_file if both are supplied)	`string`	`null`	no
startup_script_file	Path to local shell script that will be uploaded as a startup script to customize the VM image	`string`	`null`	no
subnetwork_name	Name of subnetwork in which to provision image building VM	`string`	n/a	yes
tags	Assign network tags to apply firewall rules to VM instance	`list(string)`	`null`	no
use_iap	Use IAP proxy when connecting by SSH	`bool`	`true`	no
use_os_login	Use OS Login when connecting by SSH	`bool`	`false`	no
wrap_startup_script	Wrap startup script with Packer-generated wrapper	`bool`	`true`	no
zone	Cloud zone in which to provision image building VM	`string`	n/a	yes

Outputs

No outputs.
