The Certman Operator is used to automate the provisioning and management of TLS certificates from Let's Encrypt for OpenShift Dedicated clusters provisioned via https://cloud.redhat.com/.
At a high level, Certman Operator is responsible for:
- Provisioning Certificates after a cluster's successful installation.
- Reissuing Certificates prior to their expiry.
- Revoking Certificates upon cluster decommissioning.
GO: 1.11
Operator-SDK: 0.5.0
Hive: v1
Certman Operator currently depends on Hive, an API-driven OpenShift cluster that provides OpenShift Dedicated provisioning and management. Specifically, Hive provides a namespace-scoped CustomResourceDefinition called ClusterDeployment. Certman watches the Installed field in the spec of this CRD and attempts to provision certificates for the cluster once the field returns true. Hive is also responsible for deploying the certificates to the cluster via SyncSets.
Only Hive v1 will work with this release.
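The trigger condition described above can be sketched in Go. This is only an illustration: the ClusterDeployment struct below is a simplified, hypothetical stand-in for the Hive resource (only the Installed field comes from this document), and the map of existing requests is an assumption used to avoid requesting certificates twice.

```go
package main

import "fmt"

// ClusterDeployment is a simplified, hypothetical stand-in for the Hive
// resource. Only the Installed field is described in this document.
type ClusterDeployment struct {
	Name      string
	Installed bool
}

// shouldRequestCertificates reports whether certificates should be requested:
// the cluster has finished installing and no request is recorded for it yet.
func shouldRequestCertificates(cd ClusterDeployment, existingRequests map[string]bool) bool {
	return cd.Installed && !existingRequests[cd.Name]
}

func main() {
	// Assumed bookkeeping: clusters that already have a CertificateRequest.
	existing := map[string]bool{"cluster-a": true}

	clusters := []ClusterDeployment{
		{Name: "cluster-a", Installed: true},  // installed, request exists
		{Name: "cluster-b", Installed: true},  // installed, no request yet
		{Name: "cluster-c", Installed: false}, // still installing
	}
	for _, cd := range clusters {
		fmt.Printf("%s: request certificates = %v\n", cd.Name, shouldRequestCertificates(cd, existing))
	}
}
```

In the real operator this decision would be made inside the controller's reconcile loop against the actual Hive API types.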
- A new OpenShift Dedicated cluster is requested from https://cloud.redhat.com.
- Certman's Reconcile function watches the Installed field of the ClusterDeployment CRD (as explained above). Once the Installed field becomes true, a CertificateRequest resource is created for that cluster.
- Certman Operator will then request new certificates from Let’s Encrypt based on the populated spec fields of the CertificateRequest CRD.
- To prove ownership of the domain, Certman will attempt to answer the Let’s Encrypt DNS-01 challenge by publishing the _acme-challenge subdomain in the cluster’s DNS zone with a TTL of 1 minute.
- Certman waits for the record to propagate and then verifies the existence of the challenge subdomain using the DNS over HTTPS service from Cloudflare. Certman will retry verification up to 5 times before reporting an error.
- Once the challenge subdomain record has been verified, Let’s Encrypt can verify that you are in control of the domain’s DNS.
- Let’s Encrypt will issue certificates once the challenge has been successfully completed. Certman will then delete the challenge subdomain as it is no longer required.
- Certificates are then stored in a secret on the management cluster. Hive watches for this secret.
- Once the secret contains valid certificates for the cluster, Hive will sync the secret over to the OpenShift Dedicated cluster using a SyncSet.
- Certman Operator will reconcile all CertificateRequest resources every 10 minutes by default. During this reconciliation loop, Certman checks the validity of the existing certificates. When a certificate is within 45 days of expiry, it is reissued and the secret is updated. Reissuing certificates this early avoids certificate expiry email notifications from Let’s Encrypt.
- Updates to secrets on certificate reissuance will trigger the Hive controller’s reconciliation loop, which will force a SyncSet of the new secret to the OpenShift Dedicated cluster. OpenShift will detect that the secret has changed and will apply the new certificates to the cluster.
- When an OpenShift Dedicated cluster is decommissioned, all valid certificates are first revoked and then the secret is deleted on the management cluster. Hive will then continue deleting other cluster resources.
- As described above in dependencies, Certman Operator requires Hive for custom resources and the actual deployment of certificates. It is therefore not a suitable "out-of-the-box" solution for Let's Encrypt certificate management. For this, we recommend using either openshift-acme or cert-manager. Certman Operator is ideal for use cases where a large number of OpenShift clusters have to be managed centrally.
- Certman Operator currently only supports DNS challenges through AWS Route53. There are plans for GCP support. HTTP challenges are not supported.
- Certman Operator does not support creating a Let's Encrypt account at this time. You must already have a Let's Encrypt account and keys that you can provide to the Certman Operator.
- Certman Operator does NOT configure the TLS certificates in an OpenShift cluster. This is managed by Hive using SyncSet.
The Certman Operator relies on the following custom resource definitions (CRDs):
- CertificateRequest, which provides the details needed to request a certificate from Let's Encrypt.
- ClusterDeployment, which defines a targeted OpenShift managed cluster. The Operator ensures at all times that the OpenShift managed cluster has valid certificates for control plane and pre-defined external routes.
For local development, you can use either minishift or minikube to develop and run the operator. You will also need to install the operator-sdk.
The script hack/test/local_test.sh can be used to automate local testing by creating a minikube cluster and deploying certman-operator and its dependencies.
A ConfigMap is used to store the Certman Operator configuration. At the moment, there are two items that can be configured using the ConfigMap.
default_notification_email_address - Email address to which Let's Encrypt certificate expiry notifications should be sent.
oc create configmap certman-operator \
--from-literal=default_notification_email_address=foo@bar.com
A Secret is used to store the Let's Encrypt account URL and keys. Certman Operator uses the Let's Encrypt staging environment if the account is a staging account, and the production environment if it is a production account.
oc create secret generic lets-encrypt-account \
--from-file=private-key=private-key.pem \
--from-file=account-url=account.txt
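One way to tell a staging account from a production account is to inspect the host of the account URL. The Go sketch below assumes this approach; the document does not state how Certman Operator actually makes the decision, and the function name is hypothetical. The two host names are the public Let's Encrypt ACME v2 endpoints.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

const (
	stagingHost    = "acme-staging-v02.api.letsencrypt.org"
	productionHost = "acme-v02.api.letsencrypt.org"
)

// environmentFor returns "staging" or "production" based on the host of the
// Let's Encrypt account URL, or an error for any other host.
func environmentFor(accountURL string) (string, error) {
	// Files such as account.txt often end with a newline, so trim first.
	u, err := url.Parse(strings.TrimSpace(accountURL))
	if err != nil {
		return "", fmt.Errorf("parsing account URL: %w", err)
	}
	switch u.Hostname() {
	case stagingHost:
		return "staging", nil
	case productionHost:
		return "production", nil
	}
	return "", fmt.Errorf("unrecognized Let's Encrypt account host %q", u.Hostname())
}

func main() {
	// The account IDs below are placeholders.
	for _, raw := range []string{
		"https://acme-staging-v02.api.letsencrypt.org/acme/acct/123\n",
		"https://acme-v02.api.letsencrypt.org/acme/acct/456",
	} {
		env, err := environmentFor(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(env)
	}
}
```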
git clone git@github.com:openshift/hive.git
oc create -f hive/config/crds
oc create -f https://raw.githubusercontent.com/openshift/certman-operator/master/deploy/crds/certman_v1alpha1_certificaterequest_crd.yaml
operator-sdk up local
docker login quay.io
operator-sdk build quay.io/tparikh/certman-operator
docker push quay.io/tparikh/certman-operator
oc new-project certman-operator
oc project certman-operator
oc create -f deploy/service_account.yaml
oc create -f deploy/role.yaml
oc create -f deploy/role_binding.yaml
oc create -f deploy/operator.yaml
- certman_operator_certs_in_last_day_openshift_com reports how many certs have been issued for Openshift.com in the last 24 hours.
- certman_operator_certs_in_last_day_openshift_apps_com reports how many certs have been issued for Openshiftapps.com in the last 24 hours.
- certman_operator_certs_in_last_week_openshift_com reports how many certs have been issued for Openshift.com in the last 7 days.
- certman_operator_certs_in_last_week_openshift_apps_com reports how many certs have been issued for Openshiftapps.com in the last 7 days.
- certman_operator_duplicate_certs_in_last_week reports how many certs have had duplication issues.
Certman Operator always creates a certificate for the control plane of the clusters Hive builds. By passing a string into the pod as an environment variable named EXTRA_RECORD, Certman Operator can add an additional record to the SAN of the certificate for the API servers. This string should be the short hostname without the domain. The new record uses the same domain as the rest of the cluster.
Example
apiVersion: apps/v1
kind: Deployment
metadata:
  name: certman-operator
spec:
  template:
    spec:
      ...
      env:
      - name: EXTRA_RECORD
        value: "myapi"
The example will add myapi.<clustername>.<clusterdomain> to the certificate of the control plane.
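How the extra name is assembled can be sketched in Go as follows. This is an illustration only: the function name and the cluster name and domain values are hypothetical, and the operator's own implementation may differ. Only the EXTRA_RECORD variable and the resulting name pattern come from the text above.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// extraSAN joins the short hostname from EXTRA_RECORD with the cluster name
// and cluster domain. It returns "" when no extra record is configured.
func extraSAN(record, clusterName, clusterDomain string) string {
	record = strings.TrimSpace(record)
	if record == "" {
		return ""
	}
	return strings.Join([]string{record, clusterName, clusterDomain}, ".")
}

func main() {
	// "mycluster" and "example.com" are placeholder values for illustration.
	san := extraSAN(os.Getenv("EXTRA_RECORD"), "mycluster", "example.com")
	if san == "" {
		fmt.Println("EXTRA_RECORD not set; no additional SAN")
		return
	}
	fmt.Println("additional SAN:", san)
}
```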
Certman Operator is licensed under the Apache 2.0 license. See the LICENSE file for details.