This repository defines a Terraform module, which you can use in your code by adding a `module` configuration block and setting its `source` parameter to the URL of this repository. The module builds a Kubernetes-based JupyterHub on Google Cloud as used by Brown University.
In general, this module configures JupyterHub as follows:
- Two node pools: one for the core components, one for user pods.
- Authentication (Google OAuth has been tested; others are possible); the dummy authenticator is the default.
- We currently use Infoblox to configure our DNS; we plan to make that optional in the future.
- Scale-up and scale-down cron jobs that change the number of replicas so that nodes are warm for users during class time.
- An optional shared NFS volume (for shared data, for instance).
For general Terraform examples, see the examples folder and the minimal sketch below. In practice we deploy one hub per class at Brown. Since most of the deployments are very similar, we use Terragrunt to keep configurations DRY. While our deployment repository is not public at this moment, we hope to provide an example soon.
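As a minimal sketch (the repository URL, `ref` tag, and all values here are illustrative assumptions, not tested defaults), a consuming configuration might look like this:

```hcl
# Minimal sketch of consuming this module; every value below is a placeholder.
module "jhub" {
  # Hypothetical source URL and tag; point this at the actual repository and a real release.
  source = "git::https://github.com/BrownUniversity/terraform-gcp-jupyterhub.git?ref=v1.0.0"

  # Required inputs (see the inputs table below for the full list).
  billing_account      = "XXXXXX-XXXXXX-XXXXXX"
  folder_id            = "123456789012"
  org_id               = 123456789012
  project_name         = "jhub-myclass"
  record_hostname      = "jhub-myclass"
  record_domain        = "example.brown.edu"
  jhub_helm_version    = "3.0.0"
  kubernetes_version   = "latest"
  helm_values_file     = "values.yaml"
  site_certificate     = "tls.crt"
  site_certificate_key = "tls.key"
}
```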
This module depends on you having GCP credentials of some kind. The module looks for a credentials file in JSON format, so you should export the following:

```bash
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/file.json
```

If the credentials are set correctly, the basic Google Cloud infrastructure is created successfully.
Additionally, make sure that `gcloud init` is using the appropriate service account. This is necessary because this module performs a local exec to get the cluster credentials. You also need to make sure that the `KUBECONFIG` or `KUBE_CONFIG_PATH` path is set. A typical error seen when the context is not set correctly is:

```
Error: error installing: Post "http://localhost/apis/apps/v1/namespaces/kube-system/deployments": dial tcp [::1]:80: connect: connection refused
```
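For example, assuming your kubeconfig lives in the conventional default location (an assumption; adjust the path to your setup):

```bash
# Point kubectl and Terraform's kubernetes/helm providers at your kubeconfig.
export KUBECONFIG=~/.kube/config
# Terraform's kubernetes provider also reads KUBE_CONFIG_PATH.
export KUBE_CONFIG_PATH=~/.kube/config
```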
Finally, this module also configures records in Infoblox, and therefore you'll need credentials for the server. For Brown users we recommend using the 1Password CLI to source your secrets into environment variables (ask for access to the credentials), e.g.:

```bash
export INFOBLOX_USERNAME=$(op item get infoblox --field username)
export INFOBLOX_PASSWORD=$(op item get infoblox --field password --reveal)
export INFOBLOX_SERVER=$(op item get infoblox --format json | jq -r '.urls[].href' | awk -F/ '{print $3}')
```
The following environment variables are required:

- `INFOBLOX_USERNAME`
- `INFOBLOX_PASSWORD`
- `INFOBLOX_SERVER`
Requirements:

| Name | Version |
|------|---------|
| terraform | >= 1.9.2 |
| google | 5.38.0 |
| google-beta | 5.38.0 |
| helm | 2.14.0 |
| kubernetes | 2.31.0 |
Providers:

| Name | Version |
|------|---------|
| google | 5.38.0 |
Modules:

| Name | Source | Version |
|------|--------|---------|
| external_infoblox_record | git::https://github.com/BrownUniversity/terraform-infoblox-record-a.git | v0.1.6 |
| gke_auth | terraform-google-modules/kubernetes-engine/google//modules/auth | 31.0.0 |
| jhub_cluster | git::https://github.com/BrownUniversity/terraform-gcp-cluster.git | v0.1.9 |
| jhub_helm | ./modules/helm-jhub | n/a |
| jhub_project | git::https://github.com/BrownUniversity/terraform-gcp-project.git | v0.1.6 |
| jhub_vpc | git::https://github.com/BrownUniversity/terraform-gcp-vpc.git | v0.1.4 |
| production_infoblox_record | git::https://github.com/BrownUniversity/terraform-infoblox-record-a.git | v0.1.6 |
Resources:

| Name | Type |
|------|------|
| google_compute_address.static | resource |
Inputs:

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| activate_apis | The list of APIs to activate within the project | `list(string)` | `[]` | no |
| auth_secretkeyvaluemap | Key/value map for secret variables used by the authenticator | `map(string)` | `{...}` | no |
| auth_type | Type of OAuth, e.g. google | `string` | `"dummy"` | no |
| auto_create_network | Auto-create default network | `bool` | `false` | no |
| automount_service_account_token | Enable automatic mounting of the service account token | `bool` | `true` | no |
| billing_account | Billing account ID | `string` | n/a | yes |
| cluster_name | Cluster name | `string` | `"default"` | no |
| core_pool_auto_repair | Enable auto-repair of the core-component pool | `bool` | `true` | no |
| core_pool_auto_upgrade | Enable auto-upgrade of the core-component pool | `bool` | `true` | no |
| core_pool_disk_size_gb | Size of disk for the core-component pool | `number` | `100` | no |
| core_pool_disk_type | Type of disk for the core-component pool | `string` | `"pd-standard"` | no |
| core_pool_image_type | Type of image for the core-component pool | `string` | `"COS_CONTAINERD"` | no |
| core_pool_initial_node_count | Number of initial nodes in the core-component pool | `number` | `1` | no |
| core_pool_local_ssd_count | Number of local SSDs for the core-component pool | `number` | `0` | no |
| core_pool_machine_type | Machine type for the core-component pool | `string` | `"n1-highmem-4"` | no |
| core_pool_max_count | Maximum number of nodes in the core-component pool | `number` | `3` | no |
| core_pool_min_count | Minimum number of nodes in the core-component pool | `number` | `1` | no |
| core_pool_name | Name for the core-component pool | `string` | `"core-pool"` | no |
| core_pool_preemptible | Make the core-component pool preemptible | `bool` | `false` | no |
| create_service_account | Defines whether the service account specified to run nodes should be created | `bool` | `false` | no |
| create_tls_secret | If set to true, the user passes a TLS key and certificate to create a Kubernetes secret, and uses it in their Helm chart | `bool` | `true` | no |
| default_service_account | Project default service account setting: can be one of `delete`, `deprivilege`, or `keep` | `string` | `"delete"` | no |
| disable_dependent_services | Whether services that are enabled and which depend on this service should also be disabled when this service is destroyed | `string` | `"true"` | no |
| enable_private_nodes | (Beta) Whether nodes have internal IP addresses only | `bool` | `false` | no |
| folder_id | The ID of a folder to host this project | `string` | n/a | yes |
| gcp_zone | The GCP zone to deploy into | `string` | `"us-east1-b"` | no |
| helm_deploy_timeout | Time for Helm to wait for deployment of the chart and downloading of the Docker image | `number` | `1000` | no |
| helm_values_file | Relative path and file name. Example: `values.yaml` | `string` | n/a | yes |
| horizontal_pod_autoscaling | Enable the horizontal pod autoscaling addon | `bool` | `true` | no |
| http_load_balancing | Enable the HTTP load balancing addon | `bool` | `false` | no |
| jhub_helm_version | Version of the JupyterHub Helm chart release | `string` | n/a | yes |
| kubernetes_version | The Kubernetes version of the masters. If set to `latest`, it will pull the latest available version in the selected region | `string` | n/a | yes |
| labels | Map of labels for the project | `map(string)` | `{...}` | no |
| logging_service | The logging service that the cluster should write logs to. Available options include `logging.googleapis.com`, `logging.googleapis.com/kubernetes` (beta), and `none` | `string` | `"logging.googleapis.com/kubernetes"` | no |
| maintenance_start_time | Time window specified for daily maintenance operations, in RFC3339 format | `string` | `"03:00"` | no |
| master_ipv4_cidr_block | (Beta) The IP range in CIDR notation to use for the hosted master network | `string` | `"172.16.0.0/28"` | no |
| monitoring_service | The monitoring service that the cluster should write metrics to. Automatically sends metrics from pods in the cluster to the Google Cloud Monitoring API; VM metrics are collected by Google Compute Engine regardless of this setting. Available options include `monitoring.googleapis.com`, `monitoring.googleapis.com/kubernetes` (beta), and `none` | `string` | `"monitoring.googleapis.com/kubernetes"` | no |
| network_name | Name of the VPC | `string` | `"kubernetes-vpc"` | no |
| network_policy | Enable the network policy addon | `bool` | `true` | no |
| org_id | Organization ID | `number` | n/a | yes |
| project_name | Name of the project | `string` | n/a | yes |
| range_name_pods | The range name for pods | `string` | `"kubernetes-pods"` | no |
| range_name_services | The range name for services | `string` | `"kubernetes-services"` | no |
| record_domain | The domain on the record. hostname.domain = FQDN | `string` | n/a | yes |
| record_hostname | The hostname on the record. hostname.domain = FQDN | `string` | n/a | yes |
| region | The region to host the cluster in | `string` | `"us-east1"` | no |
| regional | Whether the master node should be regional or zonal | `bool` | `true` | no |
| remove_default_node_pool | Remove the default node pool while setting up the cluster | `bool` | `false` | no |
| scale_down_command | Command for the scale-down cron job | `list(string)` | `[...]` | no |
| scale_down_name | Name of the scale-down cron job | `string` | `"scale-down"` | no |
| scale_down_schedule | Schedule for the scale-down cron job | `string` | `"1 18 * * 1-5"` | no |
| scale_up_command | Command for the scale-up cron job | `list(string)` | `[...]` | no |
| scale_up_name | Name of the scale-up cron job | `string` | `"scale-up"` | no |
| scale_up_schedule | Schedule for the scale-up cron job | `string` | `"1 6 * * 1-5"` | no |
| shared_storage_capacity | Size of the shared volume | `number` | `5` | no |
| site_certificate | File containing the TLS certificate | `string` | n/a | yes |
| site_certificate_key | File containing the TLS certificate key | `string` | n/a | yes |
| subnet_name | Name of the subnet | `string` | `"kubernetes-subnet"` | no |
| tls_secret_name | TLS secret name used in secret creation; it must match what is used by the user in the Helm chart | `string` | `"jupyterhub-tls"` | no |
| use_shared_volume | Whether to use a shared NFS volume | `bool` | `false` | no |
| user_pool_auto_repair | Enable auto-repair of the user pool | `bool` | `true` | no |
| user_pool_auto_upgrade | Enable auto-upgrade of the user pool | `bool` | `true` | no |
| user_pool_disk_size_gb | Size of disk for the user pool | `number` | `100` | no |
| user_pool_disk_type | Type of disk for the user pool | `string` | `"pd-standard"` | no |
| user_pool_image_type | Type of image for the user pool | `string` | `"COS_CONTAINERD"` | no |
| user_pool_initial_node_count | Number of initial nodes in the user pool | `number` | `1` | no |
| user_pool_local_ssd_count | Number of local SSDs for the user pool | `number` | `0` | no |
| user_pool_machine_type | Machine type for the user pool | `string` | `"n1-highmem-4"` | no |
| user_pool_max_count | Maximum number of nodes in the user pool | `number` | `20` | no |
| user_pool_min_count | Minimum number of nodes in the user pool | `number` | `1` | no |
| user_pool_name | Name for the user pool | `string` | `"user-pool"` | no |
| user_pool_preemptible | Make the user pool preemptible | `bool` | `false` | no |
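For instance, to switch from the default dummy authenticator to Google OAuth, you might add something like the following to your module configuration. The map keys here are hypothetical Helm-value paths, not verified against this module; match them to whatever your `values.yaml` expects:

```hcl
# Sketch: Google OAuth instead of the default dummy authenticator.
auth_type = "google"

# Hypothetical key names; adapt them to your chart configuration.
auth_secretkeyvaluemap = {
  "hub.config.GoogleOAuthenticator.client_id"     = "<client-id>.apps.googleusercontent.com"
  "hub.config.GoogleOAuthenticator.client_secret" = "<client-secret>"
}
```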
Outputs:

| Name | Description |
|------|-------------|
| cluster_name | Cluster name |
| hub_ip | Static IP assigned to the JupyterHub |
| location | n/a |
| project_id | Project ID |
| project_name | Project Name |
| region | n/a |
| zones | List of zones in which the cluster resides |
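As a small usage sketch (assuming the calling module block is named `jhub`, as in the earlier example), you can re-export these outputs from your own configuration:

```hcl
# Surface the hub's static IP from the calling configuration.
output "hub_ip" {
  value = module.jhub.hub_ip
}
```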
Use GitLab Flow:
- Create feature branches for features and fixes from the default branch.
- Merge only from a PR with review.
- After merging to the default branch, a release is drafted using a GitHub Action. Check the draft and publish it if you and the tests are happy.
We recommend installing the latest version of Terraform whenever you are updating this module. The current Terraform version for this module is 1.9.2. You can install Terraform with Homebrew.
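For example (one way to do it; `tfenv`, covered below, also works):

```bash
# Install Terraform from HashiCorp's Homebrew tap.
brew tap hashicorp/tap
brew install hashicorp/tap/terraform
```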
You should make sure that pre-commit hooks are installed to run the formatter, linter, etc. Install and configure the Terraform pre-commit hooks as follows.

Install dependencies:

```bash
brew bundle install
```

Install the pre-commit hook globally:

```bash
DIR=~/.git-template
git config --global init.templateDir ${DIR}
pre-commit init-templatedir -t pre-commit ${DIR}
```

To run the hooks specified in `.pre-commit-config.yaml`:

```bash
pre-commit run -a
```
| Hook name | Description |
| ------------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------- |
| `terraform_fmt` | Rewrites all Terraform configuration files to a canonical format. |
| `terraform_docs` | Inserts input and output documentation into `README.md`. |
| `terraform_tflint` | Validates all Terraform configuration files with [TFLint](https://github.com/terraform-linters/tflint). |
| `terraform_tfsec` | [TFSec](https://github.com/liamg/tfsec) static analysis of terraform templates to spot potential security issues. |
This is only needed if running tests locally. The google-cloud-sdk and 1Password CLI are included in the Brewfile, so they should already be installed.

This repo includes an `env.sh` file where you set the path to the Google credentials file and the Infoblox secrets. First, make sure you are signed in to 1Password:

```bash
eval $(op signin)
```

Then use:

```bash
source env.sh
```

to set the related environment variables. If you need to unset them, you can use:

```bash
deactivate
```
As of 2022-08, gcloud authentication needs an additional plugin to be installed. Run:

```bash
gcloud components install gke-gcloud-auth-plugin
```

See the Google Cloud documentation for more information.
This repository uses native Terraform tests to test the modules. In the tests directory you can find examples of how each module can be used, along with the test scripts.

In addition to the GCLOUD and INFOBLOX variables configured by the `env.sh` file, we also need some additional secret variables. In the example folders, rename the following files:

- `local-example.tfvars` to `secrets.auto.tfvars`
- `local-example.yaml` to `secrets.yaml`

Set the corresponding values inside the files. They should automatically be ignored via our `.gitignore` file.
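As a sketch, `secrets.auto.tfvars` might contain values like the following (the variable names come from the inputs table above; the values are placeholders):

```hcl
# Placeholder values; substitute your own organization's identifiers.
billing_account = "XXXXXX-XXXXXX-XXXXXX"
folder_id       = "123456789012"
org_id          = 123456789012
```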
Use the `terraform test` command to test the modules in this repo. You can also specify a file name to run each test individually:

```bash
terraform test -filter=tests/test-sample-jhub.tftest.hcl      # runs the test without NFS
terraform test -filter=tests/test-sample-jhub-nfs.tftest.hcl  # runs the test with NFS
```
If you need finer control when troubleshooting, you can run Terraform directly within the container specified by the Dockerfile.

First, build the Dockerfile with:

```bash
docker build -t <image_name> --platform linux/amd64 .
```

Note that `--platform linux/amd64` is necessary for ARM-based systems (e.g. Apple Silicon Macs).

Then run the Docker container with:

```bash
docker run -t -d -v $(pwd):/usr/app --platform linux/amd64 <image_name>
```

Finally, you can get a shell inside the running container with:

```bash
docker exec -it <container_name> /bin/bash
```

Follow the next section to authenticate to Google Cloud and 1Password.
Further troubleshooting will require interacting with the Kubernetes cluster directly, so you'll need to authenticate to the cluster. You can do so, for instance, as follows:

```bash
PROJECT=jhub-sample-xxxxx
ZONE=us-east1-b
gcloud container clusters get-credentials default --zone ${ZONE} --project ${PROJECT}
```

If gcloud is not authenticated, do so as follows:

```bash
gcloud auth activate-service-account <service-account> --key-file=<path-to-json-credentials> --project=$PROJECT
```
This project has three workflows enabled:

- PR labeler: when opening a PR to the main branch, a label is assigned automatically according to the name of your feature branch. The labeler follows the rules in `pr-labeler.yml`.
- Release Drafter: when merging to master, a release is drafted using the Release-Drafter Action.
- `terraform test` is run on every commit unless `[skip ci]` is added to the commit message.
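For example, to skip the test workflow on a documentation-only commit:

```bash
git commit -m "Update README [skip ci]"
```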
We aim to upgrade this package at least once a year. Use `tfenv` to manage your versions of Terraform. You can update the version in the `.terraform-version` file and run:

```bash
tfenv install
tfenv use
```

to install and use the version specified in the file. You should also update the version of Terraform specified in the `versions.tf` file.