Documentation and example cloud function for shared VPCs. #2674

Open · wants to merge 1 commit into base: develop
46 changes: 46 additions & 0 deletions community/other/cloud-function-for-shared-vpcs/README.md
@@ -0,0 +1,46 @@
<!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
## Requirements

| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 0.14.0 |
| <a name="requirement_archive"></a> [archive](#requirement\_archive) | >= 2.4.2 |
| <a name="requirement_google"></a> [google](#requirement\_google) | >= 3.83 |

## Providers

| Name | Version |
|------|---------|
| <a name="provider_archive"></a> [archive](#provider\_archive) | >= 2.4.2 |
| <a name="provider_google"></a> [google](#provider\_google) | >= 3.83 |

## Modules

No modules.

## Resources

| Name | Type |
|------|------|
| [google_cloudfunctions2_function.serviceaccount_audit_logs_watcher](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/cloudfunctions2_function) | resource |
| [google_compute_network.vpc](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network) | resource |
| [google_compute_shared_vpc_service_project.shared_vpc](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_shared_vpc_service_project) | resource |
| [google_compute_subnetwork.hpc](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_subnetwork) | resource |
| [google_logging_project_sink.logs_sink](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/logging_project_sink) | resource |
| [google_project_iam_binding.subnet_iam_policy_binding](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/project_iam_binding) | resource |
| [google_project_iam_custom_role.subnet_iampolicy_role](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/project_iam_custom_role) | resource |
| [google_pubsub_topic.log_sink_topic](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/pubsub_topic) | resource |
| [google_pubsub_topic_iam_binding.log_sink_topic_binding](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/pubsub_topic_iam_binding) | resource |
| [google_service_account.cloud_function_service_account](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/service_account) | resource |
| [google_storage_bucket.cf_source_bucket](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/storage_bucket) | resource |
| [google_storage_bucket_object.object](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/storage_bucket_object) | resource |
| [archive_file.cf_source](https://registry.terraform.io/providers/hashicorp/archive/latest/docs/data-sources/file) | data source |

## Inputs

No inputs.

## Outputs

No outputs.
<!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
@@ -0,0 +1,64 @@
# Copyright 2024 "Google LLC"
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import base64
import json
import os
import functions_framework
from google.cloud import compute_v1



@functions_framework.cloud_event
def process_log_entry(event):
    # The Pub/Sub message body is a base64-encoded audit log entry.
    data_buffer = base64.b64decode(event.data["message"]["data"])
    log_entry = json.loads(data_buffer)["protoPayload"]

    host_project = os.getenv("HOST_PROJECT")
    subnet_region = os.getenv("SUBNET_REGION")
    subnet_name = os.getenv("SUBNET_NAME")

    # Don't handle service accounts created by Google.
    if "principalEmail" not in log_entry['authenticationInfo']:
        return

    client = compute_v1.SubnetworksClient()
    request = compute_v1.GetIamPolicySubnetworkRequest(
        project=host_project,
        region=subnet_region,
        resource=subnet_name,
    )

    iam_policy = client.get_iam_policy(request=request)

    # Keep the existing members, dropping any that IAM has marked as deleted.
    members = []
    for o in iam_policy.bindings:
        members = [x for x in o.members if not x.startswith("deleted:")]
    if log_entry['methodName'] == 'google.iam.admin.v1.CreateServiceAccount':
        print("Adding " + log_entry['response']['email'] + " to list of authorized service accounts.")
        members.append("serviceAccount:" + log_entry['response']['email'])

    # Note: this assumes the subnet policy has a single binding (index 0).
    iam_policy.bindings[0].members = list(set(members))
    print("Current list of members", iam_policy.bindings[0].members)
    # Initialize request argument(s)
    request = compute_v1.SetIamPolicySubnetworkRequest(
        project=host_project,
        region=subnet_region,
        resource=subnet_name,
        region_set_policy_request_resource={"policy": iam_policy},
    )

    # Make the request
    response = client.set_iam_policy(request=request)
@@ -0,0 +1,2 @@
functions-framework==3.*
google-cloud-compute
174 changes: 174 additions & 0 deletions community/other/cloud-function-for-shared-vpcs/main.tf
Review comment (Collaborator):

If I understand correctly this terraform would essentially be a pre-step that would set up the shared vpc and cloud function to give appropriate permissions before deploying a cluster in a service project using a blueprint.

This submission would be the first of its kind. The HPC Toolkit repo does not host terraform that is not Toolkit compatible. It seems to me the way this terraform is written is meant to be a self contained deployment (aka no inputs/outputs and not modular). While the example is informative, I believe it strays from the purpose of the Toolkit to build composable modules.

Several ideas on a path forward.

  1. Modularize the code and create a blueprint that could be used with the Toolkit. This would be a non-trivial amount of work but would reuse existing modules and produce new modules that could be reused in future.
     • network/vpc module - exists
     • project/iam-custom-role module - needs development
     • project/service-account module - exists
     • pubsub/topic module - exists but expand to include log sink functionality
     • file-system/cloud-storage-bucket module - exists
     • cloud-function/serviceaccount-audit-logs-watcher module - needs development

     Possible alternative breakdown could be:

     • network/vpc module - exists
     • project/service-account module - exists
     • file-system/cloud-storage-bucket module - exists
     • cloud-function/serviceaccount-audit-logs-watcher - new module that takes in bucket, vpc, and service account and contains rest of functionality.
  2. Host this example in some other repo and refer to it from the shared vpc documentation in this PR.
  3. Maybe there is a deeper level we could place this code that is more appropriate such as /community/examples/shared-vpc-setup-helper/

I will confer with the team and decide what options are reasonable.

@@ -0,0 +1,174 @@
# Copyright 2024 "Google LLC"
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

locals {
  host_project    = "host-project-id"
  service_project = "service-project-id"
}

terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = ">= 3.83"
    }
    archive = {
      source  = "hashicorp/archive"
      version = ">= 2.4.2"
    }
  }
  required_version = ">= 0.14.0"
}


resource "google_compute_network" "vpc" {
  name                    = "vpc2"
  project                 = local.host_project
  auto_create_subnetworks = false
}

resource "google_compute_shared_vpc_service_project" "shared_vpc" {
  host_project    = local.host_project
  service_project = local.service_project
}

resource "google_compute_subnetwork" "hpc" {
  name          = "hpc"
  project       = local.host_project
  ip_cidr_range = "10.1.3.0/24"
  region        = "europe-west4"
  network       = google_compute_network.vpc.id
}


resource "google_project_iam_custom_role" "subnet_iampolicy_role" {
  project     = local.host_project
  role_id     = "subnetIamPolicyRole"
  title       = "Subnet IAM Policy Role"
  description = "This role grants access to control the IAM policy for a specific subnet"
  permissions = ["compute.subnetworks.getIamPolicy", "compute.subnetworks.setIamPolicy"]
}


resource "google_service_account" "cloud_function_service_account" {
  account_id   = "subnet-iam-assigner"
  project      = local.service_project
  display_name = "For running the Cloud Function that controls IAM permissions in host project for subnet."
}


resource "google_project_iam_binding" "subnet_iam_policy_binding" {
  project = local.host_project
  role    = google_project_iam_custom_role.subnet_iampolicy_role.id
  condition {
    expression  = "resource.name == \"${google_compute_subnetwork.hpc.id}\""
    title       = "Only access to ${google_compute_subnetwork.hpc.id}"
    description = "Restrict permissions to single subnet"
  }
  members = [
    "serviceAccount:${google_service_account.cloud_function_service_account.email}"
  ]
}

resource "google_pubsub_topic" "log_sink_topic" {
  name                       = "service-account-auditlogs"
  project                    = local.service_project
  message_retention_duration = "86600s"
}

resource "google_logging_project_sink" "logs_sink" {
  name    = "service-account-audit-logs"
  project = local.service_project
  # Can export to pubsub, cloud storage, bigquery, log bucket, or another project
  destination = "pubsub.googleapis.com/projects/${google_pubsub_topic.log_sink_topic.project}/topics/${google_pubsub_topic.log_sink_topic.name}"

  # Only export audit log entries for service account creation and deletion
  filter = "protoPayload.methodName=\"google.iam.admin.v1.DeleteServiceAccount\" OR protoPayload.methodName=\"google.iam.admin.v1.CreateServiceAccount\""

  # Use a unique writer (creates a unique service account used for writing)
  unique_writer_identity = true
}

resource "google_pubsub_topic_iam_binding" "log_sink_topic_binding" {
  project = google_pubsub_topic.log_sink_topic.project
  topic   = google_pubsub_topic.log_sink_topic.name
  role    = "roles/pubsub.publisher"
  members = [
    google_logging_project_sink.logs_sink.writer_identity
  ]
}



resource "google_storage_bucket" "cf_source_bucket" {
  name                        = "${local.service_project}-service-account-auditlog-gcf-source" # Every bucket name must be globally unique
  project                     = local.service_project
  location                    = "europe-west1"
  uniform_bucket_level_access = true
}

data "archive_file" "cf_source" {
  type        = "zip"
  source_dir  = "./cloudfunction/"
  output_path = "function-source.zip"
  excludes    = ["venv"]
}

resource "google_storage_bucket_object" "object" {
  name   = "function-source-${data.archive_file.cf_source.output_sha256}.zip"
  bucket = google_storage_bucket.cf_source_bucket.name
  source = "function-source.zip"
}



resource "google_cloudfunctions2_function" "serviceaccount_audit_logs_watcher" {
  name        = "serviceaccount-audit-log-watcher"
  location    = "europe-west1"
  project     = local.service_project
  description = "Parse service account audit logs"

  build_config {
    runtime     = "python312"
    entry_point = "process_log_entry" # Set the entry point
    source {
      storage_source {
        bucket = google_storage_bucket.cf_source_bucket.name
        object = google_storage_bucket_object.object.name
      }
    }
  }

  service_config {
    max_instance_count               = 1
    min_instance_count               = 0
    available_memory                 = "256Mi"
    timeout_seconds                  = 60
    max_instance_request_concurrency = 1
    environment_variables = {
      HOST_PROJECT  = google_compute_subnetwork.hpc.project
      SUBNET_REGION = google_compute_subnetwork.hpc.region
      SUBNET_NAME   = google_compute_subnetwork.hpc.name
    }
    service_account_email = google_service_account.cloud_function_service_account.email
  }

  event_trigger {
    trigger_region = "europe-west4"
    event_type     = "google.cloud.pubsub.topic.v1.messagePublished"
    pubsub_topic   = google_pubsub_topic.log_sink_topic.id
    retry_policy   = "RETRY_POLICY_RETRY"
  }

}
33 changes: 33 additions & 0 deletions docs/shared-vpcs.md
Review comment (Collaborator):

Given the title of this readme, I think there should be some reference to the other shared vpc example.

Maybe start by saying

"Here is an example of using shared vpc with the pre-existing-vpc module. If service projects only have permissions to access subnets then you can use the pre-existing-subnetwork module instead."

If instead you think the pre-existing-subnetwork module should always be used for shared VPC then maybe we should consider updating that example.

@@ -0,0 +1,33 @@
# Shared VPCs with HPC

The HPC Toolkit supports the use of Shared VPCs.

The module is located in `modules/network/pre-existing-subnetwork`.

The extension is built to support subnet-level permissions.

Review comment (Collaborator):

I recommend instead:

The pre-existing-subnetwork module was created to support subnet level permissions.


The subnet is referenced directly using self_link:

```yaml
- group: primary
  modules:
  - source: modules/network/pre-existing-subnetwork
    kind: terraform
    # Review comment (Collaborator): Delete `kind: terraform`. The default kind is terraform.
    settings:
      subnetwork_self_link: https://compute.googleapis.com/compute/v1/projects/{project}/regions/{region}/subnetworks/{subnetwork}
      name = name-of-subnet (optional - not used when subnetwork_self_link is defined)
      region = name-of-region (optional - not used when subnetwork_self_link is defined)
      project = name-of-project (optional - not used when subnetwork_self_link is defined)
      # Review comment (Collaborator): I suggest the following alternative format for L18-20,
      # as it is valid YAML syntax and, being commented out, prevents issues when people
      # copy and paste this block:
      #
      #       # name: name-of-subnet (optional - not used when subnetwork_self_link is defined)
      #       # region: name-of-region (optional - not used when subnetwork_self_link is defined)
      #       # project: name-of-project (optional - not used when subnetwork_self_link is defined)
    id: hpc_network
```

As described in the [google_compute_subnetwork data source documentation](https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/compute_subnetwork), if `subnetwork_self_link` is provided, then `name`, `region`, and `project` are ignored.
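
As a small illustration of the self link format used in the blueprint snippet, the following Python sketch assembles the URL from its parts. The helper name `subnetwork_self_link` and the example project, region, and subnet values are hypothetical; only the URL pattern is taken from the snippet.

```python
def subnetwork_self_link(project: str, region: str, subnetwork: str) -> str:
    """Build a self link following the pattern shown in the blueprint snippet."""
    return (
        "https://compute.googleapis.com/compute/v1/"
        f"projects/{project}/regions/{region}/subnetworks/{subnetwork}"
    )


# Hypothetical values, for illustration only.
print(subnetwork_self_link("host-project-id", "europe-west4", "hpc"))
```

The resulting string is what would go into the `subnetwork_self_link` setting, so the host project, region, and subnet name do not need to be given separately.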

Because the HPC Toolkit creates a new service account for the cluster, the cluster service account needs `roles/compute.networkUser` on the subnet in the shared VPC.

Review comment (Collaborator):

This is only true when using the service account module in a blueprint.

I would instead say:

When using the HPC toolkit creates a new service account


To automate this, you can use a Cloud Function that listens for service account creation and deletion events and uses a dedicated service account to manage access to the subnet.
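
To make the event flow concrete, here is a minimal sketch of how an audit log entry delivered through Pub/Sub could be decoded and classified. It assumes the message body is base64-encoded JSON with a `protoPayload.methodName` field, matching the two method names in the log sink filter of the example. The function name `classify_event` and the sample payload are hypothetical.

```python
import base64
import json

CREATE = "google.iam.admin.v1.CreateServiceAccount"
DELETE = "google.iam.admin.v1.DeleteServiceAccount"


def classify_event(message_data: str) -> str:
    """Return "create", "delete", or "ignore" for a base64-encoded log entry."""
    payload = json.loads(base64.b64decode(message_data))["protoPayload"]
    method = payload.get("methodName")
    if method == CREATE:
        return "create"
    if method == DELETE:
        return "delete"
    return "ignore"


# Hypothetical sample message, encoded the way the sketch expects to receive it.
sample = base64.b64encode(
    json.dumps({"protoPayload": {"methodName": CREATE}}).encode()
).decode()
print(classify_event(sample))
```

Separating classification from the IAM update keeps the decoding logic testable without any Google Cloud credentials.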

An example function is provided in `community/other/cloud-function-for-shared-vpcs/`.
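
The member-list update that such a function performs can be sketched without any Google Cloud calls. This is a simplified illustration, assuming members are plain strings such as `serviceAccount:<email>` and that removed accounts appear with a `deleted:` prefix; the helper name `updated_members` and the example addresses are hypothetical.

```python
CREATE = "google.iam.admin.v1.CreateServiceAccount"


def updated_members(members, method_name, email=None):
    """Return the new member list for the subnet's IAM binding."""
    # Drop entries that are marked as deleted, e.g. "deleted:serviceAccount:...".
    kept = {m for m in members if not m.startswith("deleted:")}
    # On a create event, authorize the newly created service account.
    if method_name == CREATE and email:
        kept.add("serviceAccount:" + email)
    # Sort so the result is deterministic and free of duplicates.
    return sorted(kept)


# Hypothetical current members of the binding.
current = [
    "serviceAccount:a@example.iam.gserviceaccount.com",
    "deleted:serviceAccount:old@example.iam.gserviceaccount.com?uid=123",
]
print(updated_members(current, CREATE, "b@example.iam.gserviceaccount.com"))
```

A delete event needs no explicit removal in this sketch, because the deleted account is filtered out by its `deleted:` prefix the next time the list is rebuilt.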