DACRepair/terraform_state_processor

Terraform State Processor

Overview

The Terraform State Processor is a tool for extracting data from the Terraform state and reshaping it into text output. It is typically used with automation, reporting, and data analysis tools. Note: This is a read-only tool. It reads the Terraform state and does not allow direct modification of it.

The application is split into two areas: input and output.

Input: Handled by abstraction classes, written in Python, that select fields using a JSON/XMLPath-like markup.

Output: Handled by the Jinja2 template engine.

Quick Start

Installation

The supported path for this application is the provided Docker image:

> docker pull ghcr.io/dacrepair/terraform_state_processor:latest

You can also install it directly into an existing Python 3.10 environment using:

> pip install git+https://github.com/DACRepair/terraform_state_processor

Once the image has been pulled, run the tool in a container, mounting the current directory at /data:

> docker run --rm -it -v ${PWD}:/data ghcr.io/dacrepair/terraform_state_processor:latest

Configuration and usage

Help Text

Usage: main.py [OPTIONS]

  Generates text data from the terraform state.

Options:
  --tfstate TEXT     The terraform state json file path (default: ./tfstate.json).
  --processors TEXT  Add custom processors
  --template TEXT    The template you would like to generate.
  --outfile TEXT     The file to write the output to (default is to stdout).
  --help             Show this message and exit.

Arguments

Argument      Environment         Default          Description
--tfstate     TFSTATE_TFSTATE     ./tfstate.json   The path to the tfstate file.
--processors  TFSTATE_PROCESSORS                   A folder path with custom resource processors.
--template    TFSTATE_TEMPLATE                     The path to the template file (may be relative); see Templating.
--outfile     TFSTATE_OUTFILE                      A path to write the rendered output to. Prints to STDOUT if not used.
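
Each option can also be supplied through the environment variable listed above. The lookup can be sketched as follows, assuming an explicit command-line value wins over the environment variable, which in turn wins over the default. That precedence is an assumption, and resolve_option is a hypothetical helper rather than part of the tool:

```python
import os

# Environment variable names come from the table above; the tfstate default
# follows the help text. Options without a documented default map to None.
OPTIONS = {
    "tfstate": ("TFSTATE_TFSTATE", "./tfstate.json"),
    "processors": ("TFSTATE_PROCESSORS", None),
    "template": ("TFSTATE_TEMPLATE", None),
    "outfile": ("TFSTATE_OUTFILE", None),
}


def resolve_option(name, cli_value=None, environ=None):
    """Return the CLI value if given, else the environment value, else the default."""
    environ = os.environ if environ is None else environ
    env_name, default = OPTIONS[name]
    if cli_value is not None:
        return cli_value
    # Assumed precedence: environment variable next, then the default.
    return environ.get(env_name, default)
```

Under this reading, setting TFSTATE_TFSTATE in a container's environment would stand in for passing --tfstate on every run.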

State File

By default, the terraform_state_processor command looks for a tfstate.json file in the current directory; in the Docker image this is /data/tfstate.json. The file can be generated by running something like terraform show --json > tfstate.json. You can also specify an arbitrary state file with --tfstate <state file path>.
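
For orientation, the JSON written by terraform show keeps resources under values.root_module, with nested modules listed under child_modules. Below is a minimal sketch of flattening that structure using only the standard library. iter_resources and load_resources are hypothetical helpers, and this is not necessarily how the tool itself reads the file:

```python
import json


def iter_resources(module):
    """Yield every resource dict in a module and, recursively, its child modules."""
    yield from module.get("resources", [])
    for child in module.get("child_modules", []):
        yield from iter_resources(child)


def load_resources(path="tfstate.json"):
    """Read a `terraform show` JSON dump and return its resources as a list."""
    with open(path, encoding="utf-8") as handle:
        state = json.load(handle)
    root = state.get("values", {}).get("root_module", {})
    return list(iter_resources(root))
```

Each returned entry is a plain dict with keys such as type, name, and values, which is the shape the field paths in the Processing section index into.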

Processing

Processing is done through Python classes that extend the BaseResource class. Most of the field handling is done dynamically, so extending it should be straightforward. Naming should follow this standard:

# resource_types/vsphere.py

# Class names should be title-case and contain only alphanumeric characters
# (remove - and _, and spell out numerics: 1 -> One).
# Class names should not include the provider name (e.g. vsphere_virtual_machine -> VirtualMachine).

Examples:
terraform {
  required_providers {
    vsphere = {} # vsphere.py
  }
}

data "vsphere_folder" "some_folder" {} # FolderResource(BaseResource)

resource "vsphere_virtual_machine" "virtual_machine" {} # VirtualMachineResource(BaseResource)
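
The naming rules above can be expressed as a small function. This is only a sketch of the convention: class_name_for is a hypothetical helper, the Resource suffix is inferred from the examples, and the provider name is assumed to be everything before the first underscore:

```python
import re

# Spelled-out digits, following the "1 -> One" rule.
DIGITS = {
    "0": "Zero", "1": "One", "2": "Two", "3": "Three", "4": "Four",
    "5": "Five", "6": "Six", "7": "Seven", "8": "Eight", "9": "Nine",
}


def class_name_for(resource_type):
    """Turn e.g. 'vsphere_virtual_machine' into 'VirtualMachineResource'."""
    # Drop the provider prefix (assumed: everything up to the first underscore).
    _, _, remainder = resource_type.partition("_")
    # Split on - and _, then capitalize the first letter of each word.
    words = [w for w in re.split(r"[-_]", remainder) if w]
    name = "".join(w[:1].upper() + w[1:] for w in words)
    # Replace each digit with its spelled-out form.
    name = "".join(DIGITS.get(ch, ch) for ch in name)
    return name + "Resource"


print(class_name_for("vsphere_virtual_machine"))
print(class_name_for("vsphere_folder"))
```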

Writing an extension is fairly straightforward as well:

import json
from terraform_state_processor.terraform.resource_types import BaseResource, StateField


class VirtualMachineResource(BaseResource):
    __datatype__ = 'vsphere_virtual_machine'

    name = StateField("values.name")
    uuid = StateField("values.uuid")
    ip_address = StateField("values.default_ip_address")
    config_servergroups = StateField("values.extra_config.guestinfo..config_servergroups", json.loads)
    config_serverroles = StateField("values.extra_config.guestinfo..config_serverroles", json.loads)

To get a better understanding, run the command without a template specified; it will dump the JSON to the console, including the base processed data.
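
The dotted paths given to StateField can be read as lookups into a single resource entry, with the optional second argument applied to the value that is found. The following standalone sketch illustrates the idea; it is not the library's implementation. lookup is a hypothetical function, the sample entry is made up, and treating a doubled dot as a literal dot inside a key is a guess based on the guestinfo paths above:

```python
import json
import re


def lookup(entry, path, convert=None):
    """Follow a dotted path through nested dicts, optionally converting the result."""
    # Split on single dots only; a doubled dot is assumed to stand for a
    # literal "." inside a key (an assumption made for this sketch).
    parts = re.split(r"(?<!\.)\.(?!\.)", path)
    keys = [part.replace("..", ".") for part in parts]
    value = entry
    for key in keys:
        value = value[key]
    return convert(value) if convert else value


# Made-up resource entry, shaped like one item of the state's resource list.
entry = {
    "type": "vsphere_virtual_machine",
    "values": {
        "name": "web01",
        "extra_config": {"guestinfo.config_serverroles": '["web", "proxy"]'},
    },
}

print(lookup(entry, "values.name"))
print(lookup(entry, "values.extra_config.guestinfo..config_serverroles", json.loads))
```

Passing json.loads as the converter turns a JSON string stored in the state into a Python list, which is what the config_servergroups and config_serverroles fields in the example class rely on.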

Templating

By default, the application only comes with a debugging template that shows the individual entries.

You can specify a custom template by using the --template option.

A template can be located in one of several places:

  • ./templates
  • ~/.tfstate_processor/templates
  • an absolute path (specify via --template with an absolute path).
  • one of the project defaults.

If you specify a relative name or path for the template, the tool searches the locations in the order listed above; an absolute path is always used as given.

For example:

/home/user/.tfstate_processor/templates
  | - test.j2
/$CWD/templates
  | - testing
  |  | - test.j2
  | - test.j2

If you were to specify: --template test.j2, it would resolve to /$CWD/templates/test.j2
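
The search order described above can be sketched with pathlib. resolve_template is a hypothetical helper and not the tool's own code; the bundled project defaults are left out because their location is not documented here:

```python
from pathlib import Path


def resolve_template(name, search_dirs=None):
    """Return an absolute path as given, else the first match in search_dirs, else None."""
    candidate = Path(name).expanduser()
    # An absolute path is used as given, without searching.
    if candidate.is_absolute():
        return candidate
    if search_dirs is None:
        search_dirs = [
            Path.cwd() / "templates",
            Path.home() / ".tfstate_processor" / "templates",
        ]
    # The first directory that contains the file wins.
    for directory in search_dirs:
        path = Path(directory) / name
        if path.is_file():
            return path
    # The bundled project defaults would be checked here; omitted in this sketch.
    return None
```

With the layout shown above, resolve_template("test.j2") would return the copy under the working directory's templates folder, and resolve_template("testing/test.j2") would return the nested one.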

Building templates

The template file is simply a flat representation of the output, using Jinja2 to replace the areas you would like extracted from the state:

DEBUG DUMP
Terraform Format Version: {{ format_version }}
Terraform Application Version: {{ terraform_version }}

[Data Output: raw_entries]
{%- for entry in raw_entries %}
{{ entry }}
{%- endfor %}

[Processed Resources]
{%- for entry in resources %}
{{ entry }}
{%- endfor %}

Within the template, you have several available data objects:

Name               Description
format_version     The state's format version.
terraform_version  The Terraform Version that generated the state.
resources          The state resources processed by the tfstate_processor.
raw_entries        The state's raw resources in dict format (list).
env                The available environment variables.
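
To make the shape of these objects concrete, here is a rough sketch of how such a context might be assembled from a parsed state. build_context is a hypothetical helper, and for brevity it reads raw entries from the root module only; the tool's own logic may differ:

```python
import os


def build_context(state, resources):
    """Assemble the template variables listed above from a parsed state dict."""
    root = state.get("values", {}).get("root_module", {})
    return {
        "format_version": state.get("format_version"),
        "terraform_version": state.get("terraform_version"),
        # Processed resource objects, passed in by the caller.
        "resources": resources,
        # Unprocessed resource dicts (root module only in this sketch).
        "raw_entries": list(root.get("resources", [])),
        # Snapshot of the current environment variables.
        "env": dict(os.environ),
    }
```

With Jinja2 installed, a dict like this can be handed to jinja2.Template(source).render(**context), which makes each key available under its name inside the template.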

For more information on how Jinja2 works, see the Jinja Template Designer Documentation.