Understanding the Role of Cert-manager in Cluster API

The Cluster API team has started cutting release candidates of the latest version of the project: Cluster API v0.3. I have been looking forward to this new version, as it adds a set of powerful capabilities, such as declarative control plane management and cluster upgrades. I plan to explore those features in future posts.

As I started testing the latest release candidate, I noticed that I had to install cert-manager into my management cluster. I wasn't quite sure why this was the case, so I decided to learn more about how Cluster API leverages cert-manager.

What is cert-manager?

Cert-manager is a controller and set of Custom Resource Definitions (CRDs) that enable automated certificate management on Kubernetes. Some of the CRDs that cert-manager installs are Issuers, CertificateRequests, and Certificates. Using these CRDs, you can request a certificate via the Kubernetes API. Once you create the request, the cert-manager controller creates and manages the certificate for you.
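To make the request flow concrete, here is a minimal sketch that builds a self-signed Issuer and a Certificate that references it, then prints them as a JSON list you could pipe to kubectl apply -f -. The resource names, the namespace, and the service name are hypothetical. The cert-manager.io/v1 API version is also an assumption, since the version depends on the cert-manager release you have installed.

```python
import json

# Assumption: adjust to the API version served by your cert-manager release.
CERT_MANAGER_API_VERSION = "cert-manager.io/v1"


def self_signed_issuer(name, namespace):
    """Build an Issuer manifest that issues self-signed certificates."""
    return {
        "apiVersion": CERT_MANAGER_API_VERSION,
        "kind": "Issuer",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"selfSigned": {}},
    }


def serving_certificate(name, namespace, issuer_name, service_name, secret_name):
    """Build a Certificate manifest for a Service's in-cluster DNS names."""
    dns_names = [
        f"{service_name}.{namespace}.svc",
        f"{service_name}.{namespace}.svc.cluster.local",
    ]
    return {
        "apiVersion": CERT_MANAGER_API_VERSION,
        "kind": "Certificate",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # cert-manager stores the issued key pair in this Secret.
            "secretName": secret_name,
            "dnsNames": dns_names,
            "issuerRef": {"kind": "Issuer", "name": issuer_name},
        },
    }


if __name__ == "__main__":
    # All names below are hypothetical placeholders.
    manifests = [
        self_signed_issuer("example-selfsigned-issuer", "example-system"),
        serving_certificate(
            "example-serving-cert",
            "example-system",
            "example-selfsigned-issuer",
            "example-service",
            "example-service-cert",
        ),
    ]
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

Once both resources exist, cert-manager issues the certificate and writes it to the Secret named in secretName.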

The most common scenario I have run into with cert-manager is issuing certificates for Ingress resources. Cluster API, however, does not use cert-manager for this.

Cert-manager and Cluster API

After installing the latest release candidate of Cluster API in a fresh management cluster, I looked for certificate requests:

$ kubectl get certificaterequests --all-namespaces
NAMESPACE             NAME                                                 READY   AGE
capi-webhook-system   capi-kubeadm-bootstrap-serving-cert-1303918272       True    109m
capi-webhook-system   capi-kubeadm-control-plane-serving-cert-1399157480   True    109m
capi-webhook-system   capi-serving-cert-3321106290                         True    109m

I noticed that all certificate requests live in the capi-webhook-system namespace. The next thing I wanted to know was the issuer used to create these certificates:

$ kubectl get issuer -n capi-webhook-system
NAME                                           AGE
capi-kubeadm-bootstrap-selfsigned-issuer       109m
capi-kubeadm-control-plane-selfsigned-issuer   109m
capi-selfsigned-issuer                         109m

There are three issuers, one for each of the certificates that Cluster API requests during installation. Judging by their names, the issuers create self-signed certificates.

Finally, I listed the Certificates in the capi-webhook-system namespace to verify that cert-manager had fulfilled the CertificateRequests:

$ kubectl get certificates -n capi-webhook-system
NAME                                      READY   SECRET                                            AGE
capi-kubeadm-bootstrap-serving-cert       True    capi-kubeadm-bootstrap-webhook-service-cert       108m
capi-kubeadm-control-plane-serving-cert   True    capi-kubeadm-control-plane-webhook-service-cert   108m
capi-serving-cert                         True    capi-webhook-service-cert                         108m

After exploring the certificaterequests, issuers and certificates created during the Cluster API installation, it is evident that Cluster API leverages cert-manager to create self-signed certificates. But how are these certificates used?

Cluster API Webhooks

Starting with version 0.3, Cluster API configures validating and mutating admission webhooks in the management cluster. The API server calls these webhooks when Cluster API resources are created or modified. Cluster API also uses conversion webhooks to convert custom resources between API versions.

The validating webhooks validate v1alpha3 Cluster API resources. For example, if I try to create a v1alpha3 Cluster that has an InfrastructureRef in another namespace, I get a validation error. This validation is implemented in the validating webhook.

$ kubectl apply -f sample.yaml
Error from server (Cluster.cluster.x-k8s.io "capi-quickstart" is invalid:
spec.infrastructureRef.namespace: Invalid value: "foo": must match metadata.namespace):
error when creating "sample.yaml": admission webhook "validation.cluster.cluster.x-k8s.io" denied the request:
Cluster.cluster.x-k8s.io "capi-quickstart" is invalid: spec.infrastructureRef.namespace:
Invalid value: "foo": must match metadata.namespace
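The check behind that error can be sketched in a few lines. This is not the Cluster API implementation (which is written in Go). It is a simplified illustration of how a validating webhook receives an AdmissionReview request and answers with an allow or deny response. The function names are hypothetical, and the error text simply mirrors the message shown above.

```python
def validate_cluster(cluster):
    """Return a list of validation errors; an empty list means the Cluster is valid."""
    errors = []
    namespace = cluster.get("metadata", {}).get("namespace")
    ref = cluster.get("spec", {}).get("infrastructureRef") or {}
    ref_namespace = ref.get("namespace")
    # Only a namespace that is set and differs from the Cluster's is rejected.
    if ref_namespace and ref_namespace != namespace:
        errors.append(
            f'spec.infrastructureRef.namespace: Invalid value: "{ref_namespace}": '
            "must match metadata.namespace"
        )
    return errors


def review(admission_review):
    """Build the AdmissionReview response for an incoming AdmissionReview request."""
    request = admission_review["request"]
    errors = validate_cluster(request["object"])
    response = {"uid": request["uid"], "allowed": not errors}
    if errors:
        response["status"] = {"code": 422, "message": "; ".join(errors)}
    return {
        "apiVersion": admission_review["apiVersion"],
        "kind": "AdmissionReview",
        "response": response,
    }
```

The API server sends the object under request.object and expects the same uid back in the response, together with the allowed verdict.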

Cluster API uses the mutating webhooks to set default values in v1alpha3 custom resources. For example, the Cluster resource defaults the namespace of the InfrastructureRef and ControlPlaneRef to the same namespace as the Cluster. Defaulting these values improves the user experience of these custom resources.
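As a rough illustration of this defaulting, the sketch below computes a JSON Patch that fills in a missing namespace on the two references and returns it in an AdmissionReview response. It assumes the incoming object already carries metadata.namespace. The function names are hypothetical, and the real Cluster API webhooks are implemented in Go rather than by hand-building patches like this.

```python
import base64
import json


def default_ref_namespaces(cluster):
    """Return JSON Patch operations that default missing reference namespaces."""
    namespace = cluster["metadata"]["namespace"]
    operations = []
    for field in ("infrastructureRef", "controlPlaneRef"):
        ref = cluster.get("spec", {}).get(field)
        # Only references that exist and lack a namespace are defaulted.
        if ref is not None and not ref.get("namespace"):
            operations.append(
                {"op": "add", "path": f"/spec/{field}/namespace", "value": namespace}
            )
    return operations


def mutate(admission_review):
    """Build an AdmissionReview response that carries the defaulting patch."""
    request = admission_review["request"]
    operations = default_ref_namespaces(request["object"])
    response = {"uid": request["uid"], "allowed": True}
    if operations:
        # The API server expects the patch as base64-encoded JSON.
        response["patchType"] = "JSONPatch"
        response["patch"] = base64.b64encode(json.dumps(operations).encode()).decode()
    return {
        "apiVersion": admission_review["apiVersion"],
        "kind": "AdmissionReview",
        "response": response,
    }
```

A mutating webhook never rewrites the object directly. It always allows the request and describes the changes as a patch, which the API server then applies.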

Finally, Cluster API specifies conversion webhooks in the CRD specifications. The conversion webhooks implement the conversion of custom resources from v1alpha2 to v1alpha3 and vice versa. The conversion allows Cluster API users to upgrade their existing management clusters from Cluster API 0.2 to 0.3.
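The exchange between the API server and a conversion webhook can be sketched as follows. The API server sends a ConversionReview with the objects to convert and the desired API version, and the webhook returns the converted objects. The relabel_only converter is a deliberately naive placeholder that only rewrites apiVersion. The actual v1alpha2 to v1alpha3 conversion also has to translate fields, which is not shown here.

```python
import copy


def relabel_only(obj, desired_api_version):
    """Placeholder converter: copy the object and rewrite only its apiVersion."""
    converted = copy.deepcopy(obj)
    converted["apiVersion"] = desired_api_version
    return converted


def convert(conversion_review, convert_object=relabel_only):
    """Build the ConversionReview response for an incoming ConversionReview request."""
    request = conversion_review["request"]
    desired = request["desiredAPIVersion"]
    converted = [convert_object(obj, desired) for obj in request["objects"]]
    return {
        "apiVersion": conversion_review["apiVersion"],
        "kind": "ConversionReview",
        "response": {
            "uid": request["uid"],
            "convertedObjects": converted,
            "result": {"status": "Success"},
        },
    }
```

The converted objects must come back in the same order as the request, so the sketch maps over request.objects one by one.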

This is great, but what does it have to do with certificates?

When configuring admission webhooks, Kubernetes requires the webhooks to expose an HTTPS service. This is precisely where the certificates come into play: three certificates for three webhook services. Each webhook service uses its certificate to serve the webhooks over HTTPS.

# Three certificates
$ kubectl get certificate -n capi-webhook-system
NAME                                      READY   SECRET                                            AGE
capi-kubeadm-bootstrap-serving-cert       True    capi-kubeadm-bootstrap-webhook-service-cert       47h
capi-kubeadm-control-plane-serving-cert   True    capi-kubeadm-control-plane-webhook-service-cert   47h
capi-serving-cert                         True    capi-webhook-service-cert                         47h

# Three webhook services
$ kubectl get service -n capi-webhook-system
NAME                                         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
capi-kubeadm-bootstrap-webhook-service       ClusterIP   10.96.81.253   <none>        443/TCP   47h
capi-kubeadm-control-plane-webhook-service   ClusterIP   10.96.18.220   <none>        443/TCP   47h
capi-webhook-service                         ClusterIP   10.96.11.28    <none>        443/TCP   47h
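To show how a serving certificate ends up behind one of these services, here is a minimal sketch of an HTTPS webhook server that loads the key pair from a mounted Secret. The tls.crt and tls.key file names are the standard keys of a TLS Secret. The certificate directory, the port, and the allow-all handler are assumptions made for illustration, and none of this is how the Cluster API controllers are actually written.

```python
import json
import os
import ssl
from http.server import BaseHTTPRequestHandler, HTTPServer


def serving_cert_paths(cert_dir):
    """Return the certificate and key paths inside a mounted TLS Secret."""
    return os.path.join(cert_dir, "tls.crt"), os.path.join(cert_dir, "tls.key")


class AllowAllHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that allows every admission request."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        admission_review = json.loads(self.rfile.read(length))
        body = json.dumps(
            {
                "apiVersion": admission_review["apiVersion"],
                "kind": "AdmissionReview",
                "response": {"uid": admission_review["request"]["uid"], "allowed": True},
            }
        ).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_https_server(cert_dir, port=9443):
    """Create an HTTPS server that presents the certificate found in cert_dir."""
    cert_file, key_file = serving_cert_paths(cert_dir)
    # Load the key pair first so a missing Secret fails before the port is bound.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    server = HTTPServer(("0.0.0.0", port), AllowAllHandler)
    server.socket = context.wrap_socket(server.socket, server_side=True)
    return server
```

Calling make_https_server("/path/to/mounted/secret").serve_forever() would start serving. The Service on port 443 would then forward traffic to whatever port the server listens on.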

Summary

I hope this post sheds light on why Cluster API v0.3 needs cert-manager in the management cluster.

In brief, Cluster API v0.3 installs admission and conversion webhooks in the management cluster to validate, default, and convert Cluster API resources. Kubernetes requires these webhooks to be served over HTTPS, which in turn requires serving certificates. Instead of creating the certificates in a one-shot manner, Cluster API leverages the cert-manager project to create and manage the webhook certificates throughout the lifetime of the management cluster.

If you are looking to test the latest Cluster API release candidates, check out the master version of the Cluster API Quick Start (docs are still under development, so you might run into sharp edges).

Did you find this post useful? Did I get something wrong? I would love to hear from you! Please reach out via @alexbrand.