Remote sources for components #598

Open
Gowiem opened this issue May 8, 2024 · 8 comments

Comments

@Gowiem
Member

Gowiem commented May 8, 2024

Describe the Feature

This is a similar idea to what Terragrunt does with their "Remote Terraform Configurations" feature: https://terragrunt.gruntwork.io/docs/features/keep-your-terraform-code-dry/#remote-terraform-configurations

The idea would be that you could provide a URL to a given root module and use that to create a component instance instead of having that component available locally in the atmos project repo.

The benefit here is that you don't need to vendor in the code for that root module. Vendoring is great when you're going to make changes to a configuration, BUT if you're not making any changes then it just creates large PRs that are hard to review and doesn't provide much value.

Another benefit: I have team members that strongly dislike creating root modules that are simply slim wrappers of a single child module because then we're in the game of maintaining a very slim wrapper. @kevcube can speak to that if there is interest to understand more there.

Expected Behavior

Today all non-custom root module usage is done through vendoring in Atmos, so no similar expected behavior AFAIK.

Use Case

Avoid vendoring code that you're not changing, and therefore avoid polluting the atmos project with additional, unchanged code.

Describe Ideal Solution

I'm envisioning this would work like the following, with $COMPONENT_NAME.metadata.url being the only change to the schema. Maybe we need a version attribute as well, but TBD.

components:
  terraform:
    s3-bucket:
      metadata:
        url: https://github.com/cloudposse/terraform-aws-components/tree/1.431.0/modules/s3-bucket
      vars:
        ...

Running atmos against this configuration would result in atmos cloning that root module into a local temporary cache and then using that cloned root module as the source to run terraform or tofu against.
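To make the proposal concrete, here is a minimal Python sketch of the first step: splitting the `metadata.url` value into a repository, ref, and subpath, and deriving a per-URL cache directory. This is not how Atmos works today. The names `RemoteSource`, `parse_tree_url`, and `cache_dir`, and the hashed cache layout, are assumptions made for illustration, and the parser assumes the ref contains no `/` (true for a tag like 1.431.0).

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path
from urllib.parse import urlparse


@dataclass(frozen=True)
class RemoteSource:
    repo: str     # clone URL of the repository
    ref: str      # tag, branch, or commit to check out
    subpath: str  # directory of the root module inside the repository


def parse_tree_url(url: str) -> RemoteSource:
    """Split a GitHub 'tree' URL into repo, ref, and subpath.

    Assumption: the ref contains no '/', so it is exactly one path segment.
    """
    parsed = urlparse(url)
    parts = parsed.path.strip("/").split("/")
    if len(parts) < 4 or parts[2] != "tree":
        raise ValueError(f"not a GitHub tree URL: {url}")
    owner, name, _, ref, *rest = parts
    return RemoteSource(
        repo=f"{parsed.scheme}://{parsed.netloc}/{owner}/{name}.git",
        ref=ref,
        subpath="/".join(rest),
    )


def cache_dir(url: str, cache_root: Path) -> Path:
    """Derive a stable cache directory for a URL (hypothetical layout)."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return cache_root / digest
```

Keying the cache on the full URL means two versions of the same root module never share a directory, at the cost of re-downloading when only the ref changes.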

Alternatives Considered

None.

Additional Context

None.

@Gowiem
Member Author

Gowiem commented May 8, 2024

@osterman @aknysh @nitrocode -- Would be interested in your thoughts on this one!

@osterman
Member

osterman commented May 8, 2024

This has come up from a handful of people lately in the community. Clearly it's a popular feature for users migrating from a Terragrunt ecosystem.

This is supported today by combining 2 concepts.

How it's different from Terragrunt

  1. You have to run atmos vendor pull before running commands. In other words, it's not Just in Time (JIT). Unlike Terragrunt, we prefer not to rely on temporary cache folders.
  2. You'll need to ensure you vendor to unique folders, if you want to support concurrent versions of remote components.
  3. We don't support HCL-style "glue" code to dynamically construct a component from multiple remote "root" modules, but you can use terraform overrides to monkey patch behavior or add functionality. https://developer.hashicorp.com/terraform/language/files/override

How we could improve it?

  • We could add a --vendor flag to atmos terraform commands, so that it would know to vendor automatically. Then it would feel more "just in time"
  • We could add a settings.vendor.pull = true option, which would look for matching vendor.yaml components and automatically pull them at run time. If we did this, we would also have to update the atmos describe affected command. atmos describe affected is not currently aware of vendor.yaml, so for now you'll want to use component.yaml instead to determine what is affected.
  • We could make vendoring support temporary folders, so we wouldn't need to worry about overwriting versions accidentally. Then update components to refer symbolically to vendored components. This would also be a great way to restrict the allowed versions to those defined in the vendoring configuration.
  • Vendored configurations could be parameterized, so components could overwrite specific parameters in the vendor configuration, similar to how imports support context. E.g. a component could have settings.vendor.context.version: 1.2.3, which would overwrite the version while preserving the rest of the vendor configuration.

Why Not Define Vendoring in Stack Configurations?

We deliberately did not add the vendor.yaml configurations to the Stack configurations for a few reasons

  • A vendoring configuration has a lot going on. Adding it to the stack configurations would make them unnecessarily complicated and overwhelming. So instead, we added the concept of imports to vendoring as well. One vendor.yaml can import other configs.
  • With the advanced usage of imports, inheritance, and multiple inheritance, knowing what to vendor for a given component instance is non-intuitive. Also, we need to be extra cautious about two component instances running in parallel and cloning a component at different versions into the same directory. It puts a lot of onus on the user, and requires a lot of error handling in the CLI.
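The parallel-clone concern above can be illustrated with a small Python sketch. It is not Atmos code: `versioned_target` and `materialize` are hypothetical helpers, and `fetch` stands in for whatever actually downloads the component. The idea is to give each version its own directory and to stage the download in a sibling directory, renaming it into place only once it is complete.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def versioned_target(base: Path, component: str, version: str) -> Path:
    """Give every version its own directory so versions never overwrite each other."""
    return base / component / version


def materialize(target: Path, fetch: Callable[[Path], None]) -> bool:
    """Populate `target` via `fetch` without exposing a half-written directory.

    Returns True if this call created `target`, and False if it already
    existed or another process created it first.
    """
    if target.exists():
        return False
    target.parent.mkdir(parents=True, exist_ok=True)
    # Stage next to the target so the final rename stays on one filesystem.
    staging = Path(tempfile.mkdtemp(prefix=target.name + ".", dir=target.parent))
    try:
        fetch(staging)
        try:
            os.rename(staging, target)
        except OSError:
            if target.exists():
                return False  # lost the race; keep the other copy
            raise
        return True
    finally:
        # No-op after a successful rename; otherwise removes the leftover staging dir.
        shutil.rmtree(staging, ignore_errors=True)
```

This only addresses collisions on disk. Deciding which version a given component instance should use, across imports and inheritance, is the harder problem described above.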

@osterman
Member

osterman commented May 8, 2024

Another benefit: I have team members that strongly dislike creating root modules that are simply slim wrappers of a single child module because then we're in the game of maintaining a very slim wrapper. @kevcube can speak to that if there is interest to understand more there.

See: https://atmos.tools/core-concepts/components/vendoring#vendoring-modules-as-components

@kevcube
Contributor

kevcube commented May 8, 2024

See: https://atmos.tools/core-concepts/components/vendoring#vendoring-modules-as-components

@osterman My problem with this is adding potentially hundreds of lines of code to our repo that we have no immediate need to modify. Makes for a large, difficult to actually verify PR.

@osterman
Member

osterman commented May 8, 2024

@kevcube committing the files is not required if using component.yaml, and the code review burden can be mitigated using .gitattributes.

See https://sweetops.slack.com/archives/C031919U8A0/p1714589350835659

Option 1: Vendor and Commit

By default, Cloud Posse (in our engagements and refarch) vendors everything into the repositories.

Pros

  • an immutable record of components / less reliance on the remote repositories
  • super easy to test changes without a "fork bomb" and the ensuing "PR storm" as you update multiple repos
  • super easy to diverge when you want to
  • super easy to detect changes and what's affected
  • much faster than cloning all dependencies
  • super easy to "grep" (search) the repo to find where something is defined
  • No need to dereference a bunch of URLs just to find where something is defined
  • Easier for newcomers to understand what is going on

Cons

  • Reviewing PRs containing tons of vendored files sucks 👎
  • ...? I struggle to see them

Option 2: Vendoring Just In Time

Vendoring components (or anything for that matter, which is supported by atmos) can be done "Just in time", more or less like terraform init for providers and modules.

Pros:

  • only things that change or are different are in the local repo
  • PRs don't contain a bunch of duplicated files
  • It's more "DRY" (but I'd argue this is not really any more DRY than committing them. Not in practice because vendoring is completely automated)

Cons

  • It's slower to run, because everything must first be downloaded
  • It's not immutable. Remote refs can change, including tags
  • Remote sources can go away, or suffer transient errors
  • It's harder to understand what something is doing, when you have to dereference dozens of URLs to look at the code
  • Cannot just do a "code search" (grep) through the repo to see where something is defined
  • In order to determine what is affected, you have to clone everything which is slower
  • If you want to test out some change, you have to fork it and create a branch with your changes, then update your pinning
  • If you want to diverge, you also have to fork it, or vendor it in locally

Option 3: Hybrid

  • might make sense in some circumstances. Vendor & commit 3rd-party dependencies you do not control, and for everything else permit remote dependencies and vendor JIT.

@osterman
Member

Should components be able to have their own vendor.yaml that can be imported?

apiVersion: atmos/v1
kind: AtmosVendorConfig
metadata:
  name: example-vendor-config
  description: Atmos vendoring manifest
spec:
  # `imports` or `sources` (or both) must be defined in a vendoring manifest
  imports:
    - "vendor/vpc"
    - "components/terraform/**/vendor.yaml"
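As a rough illustration of what a glob import like the last entry would pick up, the Python sketch below expands glob patterns against a repository root. It is a simplification, not Atmos behavior: `resolve_imports` is a hypothetical helper that handles glob patterns only and does not model how Atmos resolves extensionless entries such as "vendor/vpc".

```python
from pathlib import Path


def resolve_imports(repo_root: Path, patterns: list[str]) -> list[Path]:
    """Expand glob patterns into a de-duplicated, ordered list of manifest files."""
    found: list[Path] = []
    for pattern in patterns:
        # Sort each pattern's matches so the result does not depend on
        # filesystem listing order.
        for match in sorted(repo_root.glob(pattern)):
            if match.is_file() and match not in found:
                found.append(match)
    return found
```

With this approach, each component directory could carry its own vendor.yaml and a single top-level manifest would discover all of them.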

@osterman
Member

Imagine a vendor.yaml with the following.

apiVersion: atmos/v1
kind: AtmosVendorConfig
metadata:
  name: example-vendor-config
  description: Atmos vendoring manifest
spec:
  # `imports` or `sources` (or both) must be defined in a vendoring manifest
  imports:
    - "vendor/vendor2"
    - "vendor/vendor3.yaml"

  sources:
    - component: "vpc"
      source: "oci://public.ecr.aws/cloudposse/components/terraform/stable/aws/vpc:{{.Version}}"
      targets:
        - "components/terraform/vpc/{{ .Version }}"
      included_paths:
        - "**/*.tf"
        - "**/*.tfvars"
        - "**/*.md"

    - component: "vpc-flow-logs-bucket"
      source: "github.com/cloudposse/terraform-aws-components.git//modules/vpc-flow-logs-bucket?ref={{.Version}}"
      targets:
        - "components/terraform/infra/vpc-flow-logs-bucket/{{.Version}}"
      excluded_paths:
        - "**/*.yaml"
        - "**/*.yml"

Then in a stack configuration for dev

components:
  terraform:
    vpc:
      settings:
        vendor:
          version: 1.2.3

    vpc-flow-logs-bucket:
      settings:
        vendor:
          version: 1.2.4
          auto: true

Then in a stack configuration for prod

components:
  terraform:
    vpc:
      settings:
        vendor:
          version: 1.2.1

    vpc-flow-logs-bucket:
      settings:
        vendor:
          version: 1.2.1
          auto: true

And then running atmos terraform plan vpc --stack use1-prod-vpc --vendor

Or using release channels,

components:
  terraform:
    vpc:
      settings:
        vendor:
          version: alpha

    vpc-flow-logs-bucket:
      settings:
        vendor:
          version: beta
          auto: true
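The per-stack version resolution imagined in this comment can be sketched in Python as follows. Atmos itself uses Go templates; the regex substitution here is a simplified stand-in that only handles the `{{.Version}}` placeholder, and `render` and `resolve_source` are hypothetical names, not part of Atmos.

```python
import re

# Matches {{.Version}} with optional inner whitespace, e.g. {{ .Version }}.
_VERSION = re.compile(r"\{\{\s*\.Version\s*\}\}")


def render(template: str, version: str) -> str:
    """Substitute the version placeholder in a source or target string."""
    return _VERSION.sub(lambda _match: version, template)


def resolve_source(source: dict, component_settings: dict) -> dict:
    """Combine one vendor.yaml source entry with a component's settings.vendor.version."""
    version = component_settings["vendor"]["version"]
    return {
        "component": source["component"],
        "source": render(source["source"], version),
        "targets": [render(target, version) for target in source["targets"]],
    }
```

Because the version is part of the target path, the dev and prod stacks resolve to different directories and can coexist in the same checkout. A release channel such as alpha is substituted the same way as a numeric version.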

@osterman
Member

How would the overrides pattern work, if you want to use vendoring as proposed?

Note

Monkey patching is an anti-pattern, but is supported

Option 1

Without any significant changes to atmos, in your vendoring config, ensure you have:

    - component: "vpc-flow-logs-bucket"
      # ...
      excluded_paths:
        - "**/*_override.tf"

Place your {xxx}_override.tf in the vendored folder and commit only the _override.tf files.

Option 2

Without any significant changes to atmos, in your vendoring config, ensure you have:

    - component: "vpc-flow-logs-bucket"
      source: "github.com/cloudposse/terraform-aws-components.git//modules/vpc-flow-logs-bucket?ref={{.Version}}"
      targets:
        - "components/terraform/infra/vpc-flow-logs-bucket/{{.Version}}"
      mixins:
        - "components/terraform/overrides/vpc-flow-logs-bucket/*_override.tf"
        - "github.com:cloudposse/my-private-repo/providers.tf?ref=v1.2.3"

In this example, local files stored in components/terraform/overrides/vpc-flow-logs-bucket are copied into the target. A remote providers file is also copied in at version 1.2.3.

You don't have to use *_override.tf; it could just be *, but it's "safer" to focus on overrides.
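The local half of the mixins idea in Option 2 amounts to copying files that match a glob into the vendored target. Here is a minimal Python sketch of that step. `apply_local_mixins` is a hypothetical helper, not an Atmos function, and the remote mixin (the providers file) is out of scope.

```python
import shutil
from pathlib import Path


def apply_local_mixins(
    overrides_dir: Path, target_dir: Path, pattern: str = "*_override.tf"
) -> list[str]:
    """Copy files matching `pattern` from `overrides_dir` into `target_dir`.

    Returns the names of the copied files in sorted order.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    copied: list[str] = []
    for src in sorted(overrides_dir.glob(pattern)):
        if src.is_file():
            # An existing file with the same name in the target is overwritten.
            shutil.copy2(src, target_dir / src.name)
            copied.append(src.name)
    return copied
```

Restricting the default pattern to *_override.tf keeps the copy from replacing vendored files outright, since Terraform merges override files into the existing configuration rather than treating them as new definitions.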


5 participants