ADR: Key Management Improvements #1675

Open
biscoe916 opened this issue Oct 22, 2024 · 1 comment
Labels
adr Architecture Decision Records pertaining to OpenTDF

Comments

@biscoe916
Member

biscoe916 commented Oct 22, 2024

Background

OpenTDF's current key management implementation requires administrators to follow different procedures for different types of keys. KAS keys must be defined in two separate locations within the opentdf config, and other keys have their own unique management requirements. While some of this fragmentation stems from the separation between OpenTDF and vendor-specific capabilities, it creates unnecessary complexity and increases the risk of configuration errors.

The platform also lacks a standardized configuration structure for accessing key material from modern key management solutions such as:

  • Cloud provider KMS services (AWS, GCP)
  • Vault/OpenBAO
  • Hardware Security Modules (HSM)

Goals

  • Establish a unified interface for key management across all OpenTDF components
  • Simplify key configuration and reduce potential for misconfiguration
  • Enable seamless integration with external key management systems
  • Support future scalability through standardized key handling
  • Maintain backward compatibility during transition

Proposal Overview

This document proposes a consistent key management and retrieval mechanism that can be used across both OpenTDF and any vendor-specific offerings. The solution will eliminate the need for multiple configuration points while providing flexible integration with various key storage solutions.

Current State

Types of keys

KAS Keys

KAS (Key Access Server) keys can be:

  • RSA or EC format
  • Raw key material used locally via Go's standard crypto libraries
  • References to keys backed by KMS or HSM

Encrypted Search Keys

  • Symmetric keys used for secure search functionality
  • Currently managed through environment variables

Bundle Signing Keys

  • RSA 2048 keys
  • Used to cryptographically sign deployment bundles
  • Currently managed through Cosign

Policy Signing Keys

  • RSA 2048 keys
  • Used to sign policy artifacts
  • Enables trusted import/export of platform configuration

Future Key Requirements

The platform needs to accommodate additional key types:

  • EPOP (Entity Proof of Possession) keys
  • Platform policy root keys
  • Identity Provider (IDP) and PKI root certificates

Current Management Approaches

Each key type currently requires its own management approach:

  1. KAS Keys:

    • Stored as PEM-encoded files in the filesystem
    • Referenced in opentdf config.yaml
    • Requires configuration in multiple locations
  2. Vendor-specific keys:

    • Imported and managed via environment variables
    • No standardized key rotation mechanism
  3. Signing Keys (Bundle and Policy):

    • Managed through Cosign tooling
    • Requires separate key generation and management workflows

These disparate approaches create several challenges:

  • Increased operational complexity
  • No unified key rotation strategy
  • Limited integration with modern key management systems
  • Difficult to maintain consistent security policies

Proposed Solution

Overview

We propose eliminating key pre-provisioning in config.yaml in favor of a unified key management system that provides:

  1. CLI-based key management for automated processes
  2. Web-based administration through the upcoming admin UI
  3. Standardized interface for all key operations
  4. Flexible integration with external key management systems

Key Storage and Identification

The solution consists of two main components:

1. Key Storage

  • Primary storage in Platform's data store (Postgres by default)
  • Alternative storage in signed policy artifacts
  • Support for external key management systems:
    • Hardware Security Modules (HSM)
    • Cloud KMS services
    • Vault/OpenBAO

2. Key Identification

  • Internal key IDs generated and managed by the platform
    • Controlled format and size
    • Optimized for nanoTDF compatibility
  • External key references maintained separately
    • Maps to provider-specific identifiers (e.g., Vault-generated UUIDs)
    • Preserves isolation between internal and external systems
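
The separation between internal key IDs and external references could be modeled roughly as below. This is only a sketch: `KeyID`, `ExternalRef`, `KeyIndex`, and the provider names are hypothetical and not part of the OpenTDF codebase; the ADR does not define a concrete ID format.

```go
package main

import (
	"errors"
	"fmt"
)

// KeyID is a hypothetical platform-generated identifier. A short string is
// assumed here because the ADR asks for a controlled format and size.
type KeyID string

// ExternalRef is a hypothetical pointer to key material held by a provider.
type ExternalRef struct {
	Provider string // illustrative values: "vault", "gcp-kms", "hsm"
	Ref      string // provider-specific identifier, such as a Vault-generated UUID
}

// ErrUnknownKey is returned when no external reference exists for an ID.
var ErrUnknownKey = errors.New("unknown key ID")

// KeyIndex keeps external references in their own map, so internal IDs never
// embed provider-specific identifiers.
type KeyIndex struct {
	refs map[KeyID]ExternalRef
}

func NewKeyIndex() *KeyIndex {
	return &KeyIndex{refs: make(map[KeyID]ExternalRef)}
}

// Register associates an internal ID with an external reference.
func (ix *KeyIndex) Register(id KeyID, ref ExternalRef) {
	ix.refs[id] = ref
}

// Resolve returns the external reference for an internal ID.
func (ix *KeyIndex) Resolve(id KeyID) (ExternalRef, error) {
	ref, ok := ix.refs[id]
	if !ok {
		return ExternalRef{}, fmt.Errorf("%w: %s", ErrUnknownKey, id)
	}
	return ref, nil
}

func main() {
	ix := NewKeyIndex()
	ix.Register("k1", ExternalRef{Provider: "vault", Ref: "example-uuid"})
	ref, err := ix.Resolve("k1")
	if err != nil {
		panic(err)
	}
	fmt.Println(ref.Provider, ref.Ref)
}
```

Only the short internal ID would need to travel inside a TDF or nanoTDF; the provider-specific reference stays server-side.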

Key Retrieval Process

Example: Key retrieval for rewrap operation

  1. KAS receives rewrap request
  2. Validates request and permissions
  3. Leverages AccessPDP for access decision
  4. Extracts internal key ID (k1) from key access object
  5. Retrieves key configuration via keyProvider.getKeyConfiguration(k1)
  6. Uses configuration to fetch key material from appropriate backend
  7. Performs rewrap operation
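
The steps above can be sketched in Go as follows. Everything except the idea of `getKeyConfiguration` is an assumption made for illustration: the `KeyConfiguration` fields, the `Backend` interface, the `HandleRewrap` signature, and the callbacks that stand in for AccessPDP and the actual rewrap are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// KeyConfiguration is a hypothetical record describing where a key lives.
type KeyConfiguration struct {
	KeyID   string
	Backend string // illustrative values: "local", "vault", "kms"
	Ref     string // backend-specific reference to the key material
}

// KeyProvider mirrors the keyProvider.getKeyConfiguration(k1) call above.
type KeyProvider interface {
	GetKeyConfiguration(keyID string) (KeyConfiguration, error)
}

// Backend fetches key material for a configuration (hypothetical interface).
type Backend interface {
	FetchKey(cfg KeyConfiguration) ([]byte, error)
}

// RewrapRequest carries only the fields this sketch needs.
type RewrapRequest struct {
	Entity     string
	KeyID      string // internal key ID taken from the key access object
	WrappedKey []byte
}

// HandleRewrap follows steps 2 through 7. The allow and rewrap callbacks stand
// in for AccessPDP and the real cryptographic operation.
func HandleRewrap(req RewrapRequest, allow func(entity string) bool, kp KeyProvider,
	backends map[string]Backend, rewrap func(key, wrapped []byte) ([]byte, error)) ([]byte, error) {
	if req.KeyID == "" || len(req.WrappedKey) == 0 {
		return nil, errors.New("invalid rewrap request")
	}
	if !allow(req.Entity) {
		return nil, errors.New("access denied")
	}
	cfg, err := kp.GetKeyConfiguration(req.KeyID)
	if err != nil {
		return nil, fmt.Errorf("key configuration for %q: %w", req.KeyID, err)
	}
	backend, ok := backends[cfg.Backend]
	if !ok {
		return nil, fmt.Errorf("no backend registered for %q", cfg.Backend)
	}
	key, err := backend.FetchKey(cfg)
	if err != nil {
		return nil, fmt.Errorf("fetch key %q: %w", req.KeyID, err)
	}
	return rewrap(key, req.WrappedKey)
}

// In-memory stand-ins used only to exercise the flow.
type mapProvider map[string]KeyConfiguration

func (m mapProvider) GetKeyConfiguration(id string) (KeyConfiguration, error) {
	cfg, ok := m[id]
	if !ok {
		return KeyConfiguration{}, errors.New("unknown key ID")
	}
	return cfg, nil
}

type staticBackend struct{ key []byte }

func (b staticBackend) FetchKey(KeyConfiguration) ([]byte, error) { return b.key, nil }

func main() {
	kp := mapProvider{"k1": {KeyID: "k1", Backend: "local", Ref: "kas-rsa"}}
	backends := map[string]Backend{"local": staticBackend{key: []byte("demo-key")}}
	// Placeholder rewrap: returns the input unchanged, not real cryptography.
	noop := func(key, wrapped []byte) ([]byte, error) { return wrapped, nil }
	allowAll := func(string) bool { return true }

	req := RewrapRequest{Entity: "alice", KeyID: "k1", WrappedKey: []byte("wrapped")}
	out, err := HandleRewrap(req, allowAll, kp, backends, noop)
	fmt.Println(string(out), err)
}
```

Keeping the backend lookup behind the configuration record means KAS never needs to know whether a key lives in Postgres, Vault, or an HSM.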

Performance Considerations

To ensure optimal performance:

  • KeyProvider implements aggressive caching of frequently used keys
  • Preloading mechanism for most recent n keys
  • Configurable cache size and retention policies
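
One way the caching and preloading could look is a small least-recently-used cache in front of the key loader. This is a sketch under assumptions: `KeyCache`, its methods, and the LRU eviction policy are hypothetical choices, and time-based retention is left out for brevity.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

type cacheEntry struct {
	id  string
	key []byte
}

// KeyCache is a hypothetical LRU cache in front of a slower key loader.
type KeyCache struct {
	mu       sync.Mutex
	capacity int
	order    *list.List // front is the most recently used entry
	items    map[string]*list.Element
	load     func(id string) ([]byte, error)
}

// NewKeyCache builds a cache holding at most capacity keys (minimum 1).
func NewKeyCache(capacity int, load func(id string) ([]byte, error)) *KeyCache {
	if capacity < 1 {
		capacity = 1
	}
	return &KeyCache{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
		load:     load,
	}
}

// Get returns a cached key, or loads it and evicts the least recently used
// entry when the cache is over capacity. The lock is held during the load,
// which keeps the sketch simple at the cost of serializing cache misses.
func (c *KeyCache) Get(id string) ([]byte, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[id]; ok {
		c.order.MoveToFront(el)
		return el.Value.(*cacheEntry).key, nil
	}
	key, err := c.load(id)
	if err != nil {
		return nil, err
	}
	c.items[id] = c.order.PushFront(&cacheEntry{id: id, key: key})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*cacheEntry).id)
	}
	return key, nil
}

// Preload warms the cache with the given key IDs, for example the most
// recent n keys.
func (c *KeyCache) Preload(ids []string) error {
	for _, id := range ids {
		if _, err := c.Get(id); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	loads := 0
	cache := NewKeyCache(2, func(id string) ([]byte, error) {
		loads++ // counts trips to the (simulated) backend
		return []byte("material-for-" + id), nil
	})
	if err := cache.Preload([]string{"k1", "k2"}); err != nil {
		panic(err)
	}
	if _, err := cache.Get("k1"); err != nil {
		panic(err)
	}
	fmt.Println("backend loads:", loads)
}
```

Caching raw key material in memory is a trade-off; for HSM- or KMS-backed keys that never leave the device, a cache like this would more plausibly hold key configurations or handles.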

Implementation Plan

Phase 1: Core Infrastructure

  1. Create new keys table schema (see Reference)
  2. Implement CLI for key management operations
  3. Remove key definitions from config.yaml

Phase 2: Provider Integration

  1. Update platform code to use key ID references exclusively
  2. Develop crypto provider framework:
    • Build on @dmihalcik-virtru's existing provider work
    • Implement a provider interface, for example:
      standardCrypto.encrypt(clearText, keyReference)
    • Create providers for:
      • Standard crypto operations
      • HSM integration (e.g., Thales)
      • Cloud KMS (e.g., GCP KMS)
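
A provider interface in the spirit of `standardCrypto.encrypt(clearText, keyReference)` might look like the sketch below. The `CryptoProvider` interface, the `StandardCrypto` type, and the choice of RSA-OAEP with SHA-256 are assumptions made for illustration; they do not describe the existing provider work referenced above.

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"fmt"
)

// CryptoProvider is a hypothetical interface that HSM, cloud KMS, and
// standard (in-process) providers could all implement.
type CryptoProvider interface {
	Encrypt(clearText []byte, keyReference string) ([]byte, error)
	Decrypt(cipherText []byte, keyReference string) ([]byte, error)
}

// StandardCrypto performs operations locally with Go's standard library.
// Keys are held in memory here purely to keep the sketch self-contained.
type StandardCrypto struct {
	keys map[string]*rsa.PrivateKey
}

// Compile-time check that StandardCrypto satisfies the interface.
var _ CryptoProvider = (*StandardCrypto)(nil)

func NewStandardCrypto() *StandardCrypto {
	return &StandardCrypto{keys: make(map[string]*rsa.PrivateKey)}
}

// GenerateKey creates an RSA 2048 key and stores it under keyReference.
func (s *StandardCrypto) GenerateKey(keyReference string) error {
	priv, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		return err
	}
	s.keys[keyReference] = priv
	return nil
}

func (s *StandardCrypto) lookup(keyReference string) (*rsa.PrivateKey, error) {
	priv, ok := s.keys[keyReference]
	if !ok {
		return nil, fmt.Errorf("no key for reference %q", keyReference)
	}
	return priv, nil
}

// Encrypt wraps clearText with the public half of the referenced key.
func (s *StandardCrypto) Encrypt(clearText []byte, keyReference string) ([]byte, error) {
	priv, err := s.lookup(keyReference)
	if err != nil {
		return nil, err
	}
	return rsa.EncryptOAEP(sha256.New(), rand.Reader, &priv.PublicKey, clearText, nil)
}

// Decrypt unwraps cipherText with the private half of the referenced key.
func (s *StandardCrypto) Decrypt(cipherText []byte, keyReference string) ([]byte, error) {
	priv, err := s.lookup(keyReference)
	if err != nil {
		return nil, err
	}
	return rsa.DecryptOAEP(sha256.New(), rand.Reader, priv, cipherText, nil)
}

func main() {
	var provider CryptoProvider
	std := NewStandardCrypto()
	if err := std.GenerateKey("k1"); err != nil {
		panic(err)
	}
	provider = std

	ct, err := provider.Encrypt([]byte("payload key"), "k1")
	if err != nil {
		panic(err)
	}
	pt, err := provider.Decrypt(ct, "k1")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(pt))
}
```

Because callers pass only a key reference, an HSM or cloud KMS provider could implement the same interface by forwarding the operation to the device or service instead of handling raw key material.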

Risks and Mitigations

Operational Risks

1. Migration Complexity

Risk: Existing deployments may face disruption during migration to the new key management system.
Mitigation:

  • Provide backward compatibility during transition period
  • Create automated migration tools for existing key configurations
  • Document step-by-step migration procedures for different deployment types
  • Enable gradual migration by supporting both old and new systems simultaneously
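
Supporting both systems at once could be as simple as a lookup that consults the new key store first and falls back to keys loaded from the legacy configuration. The names below (`KeySource`, `MapSource`, `FallbackSource`, `ErrKeyNotFound`) are hypothetical and only illustrate the idea.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrKeyNotFound signals that a source does not know the requested key.
var ErrKeyNotFound = errors.New("key not found")

// KeySource is a hypothetical lookup interface shared by old and new systems.
type KeySource interface {
	Lookup(keyID string) ([]byte, error)
}

// MapSource is an in-memory stand-in for either system.
type MapSource map[string][]byte

func (m MapSource) Lookup(keyID string) ([]byte, error) {
	key, ok := m[keyID]
	if !ok {
		return nil, ErrKeyNotFound
	}
	return key, nil
}

// FallbackSource tries the new key store first, then the legacy config.
type FallbackSource struct {
	Primary KeySource // new unified key management
	Legacy  KeySource // keys still defined in config.yaml
}

func (f FallbackSource) Lookup(keyID string) ([]byte, error) {
	key, err := f.Primary.Lookup(keyID)
	if err == nil {
		return key, nil
	}
	// Only a genuine "not found" falls through; other failures (for example
	// an unreachable backend) are surfaced rather than masked.
	if !errors.Is(err, ErrKeyNotFound) {
		return nil, err
	}
	return f.Legacy.Lookup(keyID)
}

func main() {
	src := FallbackSource{
		Primary: MapSource{"k2": []byte("new-key")},
		Legacy:  MapSource{"k1": []byte("legacy-key")},
	}
	for _, id := range []string{"k1", "k2", "k3"} {
		key, err := src.Lookup(id)
		fmt.Println(id, string(key), err)
	}
}
```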

2. Performance Impact

Risk: Additional abstraction layers and external key fetching could impact system performance.
Mitigation:

  • Implement aggressive caching strategy
  • Allow configuration of cache sizes and retention policies

3. Configuration Errors

Risk: While simpler, the new system still requires proper configuration of external key management systems.
Mitigation:

  • Implement validation checks for key provider configurations
  • Provide clear error messages for misconfiguration
  • Include configuration examples for common scenarios
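
A validation check with clear error messages might look like the following sketch. The `ProviderConfig` fields, the set of provider types, and the rule that remote providers need an endpoint are assumptions for illustration, not a proposed schema.

```go
package main

import (
	"fmt"
	"strings"
)

// ProviderConfig is a hypothetical key provider configuration entry.
type ProviderConfig struct {
	Name     string
	Type     string // illustrative values: "local", "vault", "kms"
	Endpoint string // assumed to be required for remote provider types
}

// needsEndpoint lists the known types and whether they require an endpoint.
var needsEndpoint = map[string]bool{
	"local": false,
	"vault": true,
	"kms":   true,
}

// Validate collects every problem it finds so an administrator can fix a
// configuration in one pass instead of one error at a time.
func Validate(cfg ProviderConfig) error {
	var problems []string
	if strings.TrimSpace(cfg.Name) == "" {
		problems = append(problems, "name is required")
	}
	remote, known := needsEndpoint[cfg.Type]
	if !known {
		problems = append(problems, fmt.Sprintf("unknown provider type %q", cfg.Type))
	}
	if known && remote && strings.TrimSpace(cfg.Endpoint) == "" {
		problems = append(problems, fmt.Sprintf("endpoint is required for provider type %q", cfg.Type))
	}
	if len(problems) > 0 {
		return fmt.Errorf("invalid key provider config: %s", strings.Join(problems, "; "))
	}
	return nil
}

func main() {
	configs := []ProviderConfig{
		{Name: "primary", Type: "vault", Endpoint: "https://vault.example.internal"},
		{Name: "", Type: "kms"},
	}
	for _, cfg := range configs {
		fmt.Println(Validate(cfg))
	}
}
```

Running checks like these at startup, or when a key is registered through the CLI, would surface misconfiguration before the first rewrap request fails.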

Technical Risks

1. Integration Complexity

Risk: Different key management systems have varying APIs and capabilities.
Mitigation:

  • Design flexible provider interface
  • Implement comprehensive provider testing
  • Document provider-specific limitations
  • Maintain test suite for provider implementations

strantalis added the adr label on Oct 23, 2024

@willackerly

I'm a fan. Key IDs can be any length, right?
