kind local Kubernetes cluster with a Freenom domain and a Let’s Encrypt wildcard TLS certificate
This is progress on one part of a side/pet project that I have worked on for months, and that I previously failed to achieve with Docker Desktop on WSL.
The objective was to create a local Kubernetes cluster for testing CI/CD in a setup as close as possible to a production system (not at the infrastructure and networking level, but component-wise). Consider it a poor man's approach.
Premature conclusion: “freebies are always the most expensive”, as the time spent and brain cells killed were tremendous.
What I actually want to achieve (short term)
I want to host a Docker registry for my cluster with TLS support, instead of running it in insecure mode or using a self-signed certificate (I am guessing those might cause trouble in the future).
With that in mind, what I need is a legitimate TLS certificate (preferably wildcard), and thus a domain name (one that allows managing DNS records for the dns01 challenge), plus a Docker registry container. I am planning to put the registry inside the Kubernetes cluster.
This sounds silly in terms of storage, since an image would sit in the cluster after being pulled while also sitting in the registry inside the same cluster (and kind [or k3s] has a nice feature to load images into the cluster directly). On the other hand, I could leverage cert-manager and the DNS inside the cluster, so I am not sure yet whether this is good or bad.
What this article covers
I am covering the configuration of kind as a local Kubernetes cluster, and of cert-manager to manage TLS certificate issuance and renewal from Let’s Encrypt, using a Freenom domain that is free for one year.
Why did I choose kind as the tool for the local cluster?
- Docker Desktop: my previous experience with it on WSL told me it is not suited to my case.
- Minikube: a popular choice, but it traditionally runs in a VM (and thus needs more resources), and I understood it to support only single-node clusters (recent releases can also create multi-node clusters).
- MicroK8s: efficient, built for Ubuntu, supports multi-node clusters.
- k3s: sounds good as it is by Rancher Labs (I had also tried Rancher before), but setting up a multi-node cluster seems to need more configuration compared to the others.
- kind: Kubernetes in Docker. I actually struggled to choose between it and MicroK8s, but my take is that generic (Docker) is preferable to restricted (Ubuntu).
Apply for a Freenom domain: you can get a domain (with suffix .ml, .tk, .ga, .cf or .gq) for free for a maximum of 12 months. Just go to their website and look for a free domain if needed.
Note: Freenom allows federated sign-in with a Facebook or Google account, but the integration in a later section requires storing a username/password as a Kubernetes secret, and Facebook and Google accounts are NOT supported there.
Why did this freebie cost extra time? Because cert-manager’s ACME DNS01 challenge has built-in support only for a handful of major DNS providers, and Freenom is not among them.
Given my previous experience, I decided I had better use a more controllable environment and not rely on the deeply Windows-integrated WSL environment.
This could be yet another article, but basically WSL is a VM (even if you install multiple distros, they all sit on the same VM) using a Hyper-V virtual switch called “WSL” of the “Internal” network type, with a special localhost mapping back to the host. That sounds troublesome enough for managing the network.
So my setup is an Ubuntu Server Hyper-V VM with Docker, Helm and kubectl installed.
kind cluster configuration
Following the kind documentation with some additional settings, below is a single-node cluster setup. Instead of the default kindnet CNI, I use Calico.
The “extraPortMappings” entries are a nice feature that lets traffic reach the cluster through the host machine (e.g. port 80 on the host is routed to port 32080 on the cluster node, which I later expose as a node port in the Traefik configuration).
Note that extraPortMappings cannot be changed after the cluster is created; the whole cluster needs to be re-created in order to add a mapping.
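A minimal sketch of a kind config along these lines, written out from a shell heredoc. The file name `kind-config.yaml`, the Calico pod subnet and the 443 to 32443 mapping are assumptions for illustration; only the 80 to 32080 mapping is taken from the text above.

```bash
#!/usr/bin/env bash
# Write a single-node kind config with the default CNI disabled (for Calico)
# and host ports mapped onto node ports of the kind node.
cat > kind-config.yaml <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  disableDefaultCNI: true       # skip kindnet so that Calico can be installed
  podSubnet: "192.168.0.0/16"   # assumption: Calico's default pod CIDR
nodes:
  - role: control-plane
    extraPortMappings:
      - containerPort: 32080    # node port inside the kind node
        hostPort: 80            # port on the host (the VM)
        protocol: TCP
      - containerPort: 32443    # assumption: node port used for HTTPS
        hostPort: 443
        protocol: TCP
EOF
```

Because the mappings are fixed at creation time, it is worth listing every port you expect to need before creating the cluster.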
Below is the bash script to create the cluster, set up Calico and open the VM firewall ports.
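A rough sketch of such a script, not the exact one used here. The cluster name, the config file name, the Calico manifest version and the use of `ufw` as the VM firewall are all assumptions; swap in whatever matches your environment.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical defaults; override them through the environment if needed.
CLUSTER_NAME="${CLUSTER_NAME:-local}"
KIND_CONFIG="${KIND_CONFIG:-kind-config.yaml}"
# Assumed Calico release; check the Calico docs for the current manifest URL.
CALICO_MANIFEST="${CALICO_MANIFEST:-https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/calico.yaml}"

create_cluster() {
  # Create the cluster from the config that carries the extraPortMappings.
  kind create cluster --name "$CLUSTER_NAME" --config "$KIND_CONFIG"

  # Install Calico as the CNI (the default kindnet is disabled in the config).
  kubectl apply -f "$CALICO_MANIFEST"

  # Nodes only become Ready once the CNI is up.
  kubectl wait --for=condition=Ready nodes --all --timeout=300s

  # Assumption: the Ubuntu VM uses ufw; open the mapped host ports.
  sudo ufw allow 80/tcp
  sudo ufw allow 443/tcp
}
```

Calling `create_cluster` at the end of the script (or after sourcing it) runs the steps in order and stops at the first failure.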
I set up cert-manager with its Helm chart, and I also added Reflector to copy the secret (TLS certificate) to other namespaces in the cluster.
This is the bash script I used to install cert-manager:
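A minimal sketch of a Helm-based install, assuming the official jetstack chart repository and a `cert-manager` namespace. The release name is hypothetical, and `installCRDs=true` is the chart option commonly used around the time of writing; newer chart versions may prefer a different CRD setting.

```bash
#!/usr/bin/env bash
set -euo pipefail

install_cert_manager() {
  # Register the chart repository that publishes cert-manager.
  helm repo add jetstack https://charts.jetstack.io
  helm repo update

  # Install (or upgrade) cert-manager together with its CRDs.
  helm upgrade --install cert-manager jetstack/cert-manager \
    --namespace cert-manager \
    --create-namespace \
    --set installCRDs=true
}
```

Using `helm upgrade --install` keeps the function safe to re-run after the cluster has been re-created.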
This is the bash script I used to install Reflector:
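A comparable sketch for Reflector, assuming the emberstack chart repository. The `reflector` namespace and release name are assumptions for illustration.

```bash
#!/usr/bin/env bash
set -euo pipefail

install_reflector() {
  # Register the chart repository that publishes Reflector.
  helm repo add emberstack https://emberstack.github.io/helm-charts
  helm repo update

  # Install (or upgrade) Reflector into its own namespace (hypothetical name).
  helm upgrade --install reflector emberstack/reflector \
    --namespace reflector \
    --create-namespace
}
```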
Webhook for Freenom
The additional configuration is setting up the ACME dns01 challenge (needed for a wildcard certificate). The officially supported DNS providers are as follows (for details, please refer to the link):
For any other provider, one can find or implement a webhook to support it (the Freenom webhook is not in the reference list in the cert-manager documentation).
Here is the GitHub repo for the Freenom webhook: https://github.com/andreee94/cert-manager-webhook-freenom
I downloaded the YAML file from its readme.
Then continue to follow the readme and create YAML file(s) covering the RBAC, Secrets, ClusterIssuer and wildcard Certificate sections.
The wildcard certificate is defined in the following YAML:
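As an illustration only, a wildcard Certificate with Reflector annotations could look like the sketch below. The domain `example.tk`, the secret name, the ClusterIssuer name `letsencrypt-prod` and the target namespace list are placeholders; the ClusterIssuer itself (with the Freenom webhook solver) should be taken from the webhook's readme.

```bash
#!/usr/bin/env bash
# Write a wildcard Certificate whose secret Reflector may copy to "echoserver".
cat > wildcard-certificate.yaml <<'EOF'
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-wildcard-certificate-prod   # placeholder name
  namespace: default
spec:
  secretName: example-wildcard-tls          # placeholder secret name
  dnsNames:
    - "example.tk"                          # placeholder domain
    - "*.example.tk"
  issuerRef:
    name: letsencrypt-prod                  # hypothetical ClusterIssuer name
    kind: ClusterIssuer
  secretTemplate:
    annotations:
      # Allow Reflector to mirror the secret, and create the mirror automatically.
      reflector.v1.k8s.emberstack.com/reflection-allowed: "true"
      reflector.v1.k8s.emberstack.com/reflection-allowed-namespaces: "echoserver"
      reflector.v1.k8s.emberstack.com/reflection-auto-enabled: "true"
      reflector.v1.k8s.emberstack.com/reflection-auto-namespaces: "echoserver"
EOF
```

The `secretTemplate` annotations end up on the generated TLS secret, which is what Reflector watches.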
Then apply the YAML files with the command:
kubectl apply -f <the yaml file>
In the host VM terminal, I check the certificate with the command:
kubectl describe certificate/<xxxxxx>-wildcard-certificate-prod
The expected result shows an event at the end saying that the certificate was issued successfully.
Issue I faced
I ran into an issue where, after waiting a long time, the secret (the name defined in secretName in the wildcard certificate spec) was still not created, and the describe certificate command reported that the certificate request had been generated and then stayed stuck there forever.
I then checked the pods of the Freenom webhook with the following command:
kubectl get pods -n=cert-manager
And then checked the log of the pod with the following command:
kubectl logs -n=cert-manager pod/freenom-webhook-5f69fc9c8f-b9tr5
It showed something like the following:
Then I went to the Freenom website, logged in, checked the domain’s DNS records and manually deleted all TXT records. After that, the certificate request process seemed to run OK.
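When an issuance hangs like this, it can help to look at the intermediate cert-manager resources as well as the webhook log. A small sketch, assuming the webhook runs as a deployment named `freenom-webhook` in the `cert-manager` namespace (inferred from the pod name above, so treat it as an assumption):

```bash
#!/usr/bin/env bash
set -euo pipefail

inspect_issuance() {
  # Certificate -> CertificateRequest -> Order -> Challenge is the chain
  # cert-manager walks for ACME; a stuck Challenge usually points at DNS.
  kubectl get certificates,certificaterequests,orders,challenges --all-namespaces

  # Tail the webhook log without having to look up the generated pod name.
  kubectl logs --namespace cert-manager deployment/freenom-webhook --tail=50
}
```

Calling `inspect_issuance` prints the resource chain first and the last log lines of the webhook after it.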
Explaining (a bit) what the webhook does, according to its source code
The core action the webhook performs is creating and deleting records in DNS during the ACME dns01 challenge.
It depends on the Go package “github.com/tzwsoho/go-freenom/freenom”.
I see the login code as follows (it looks like it parses the HTML structure, which really worries me. Fingers crossed that Freenom does not change their website’s HTML structure any time soon):
This is very similar to my other article on setting up Traefik, with one difference. The bash script I used for the setup (with the Helm chart) is as follows:
The key difference is the Helm config file:
As said above, the kind cluster has extraPortMappings to redirect traffic from the host to the target port, so instead of the “expose” type of configuration, we need to use the “nodePort” configuration.
I anticipate the HTTP traffic routing to be:
port 80 (host machine) => port 32080 (extraPortMapping) => port 8080 (Traefik) => actual service (through Traefik routing rule)
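The nodePort part of the Helm values could be sketched as below. The file name `traefik-values.yaml` and the 32443 node port for HTTPS are assumptions; 32080 matches the mapping described above. Only the node port related keys are shown, not a complete values file.

```bash
#!/usr/bin/env bash
# Write Traefik Helm values that pin the entry points to fixed node ports,
# so they line up with the kind extraPortMappings.
cat > traefik-values.yaml <<'EOF'
service:
  type: NodePort        # expose Traefik through node ports
ports:
  web:
    nodePort: 32080     # host port 80 is mapped to this node port
  websecure:
    nodePort: 32443     # assumption: host port 443 is mapped to this node port
EOF
```

The file is then passed to the chart with `-f traefik-values.yaml` when installing Traefik.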
Deploy an echo server to verify that the certificate and network work
The deployment file:
Note that the TLS secret is defined in the wildcard certificate YAML file earlier and lives in the default namespace. By default it cannot be referenced by a Traefik route in the “echoserver” namespace, but because we use the Reflector annotations, the certificate secret is synced over to this target namespace. One can check the secret with the command:
kubectl -n=echoserver get secret # namespace = echoserver only
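For illustration, a Traefik route in the echoserver namespace that references the reflected secret might look like this sketch. The host name, the service name and port, and the secret name are placeholders, and the `traefik.containo.us/v1alpha1` API group is the one used by Traefik v2 charts; newer versions may use a different group.

```bash
#!/usr/bin/env bash
# Write an IngressRoute that terminates TLS with the mirrored wildcard secret.
cat > echoserver-route.yaml <<'EOF'
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: echoserver
  namespace: echoserver
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`echo.example.tk`)   # placeholder host under the wildcard
      kind: Rule
      services:
        - name: echoserver             # hypothetical Service name
          port: 80
  tls:
    secretName: example-wildcard-tls   # the secret copied in by Reflector
EOF
```

The secret name must match a secret in the same namespace as the route, which is why the mirrored copy is needed.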
Verify with a browser on the host machine (I updated the hosts file to point the URL to the VM’s IP address), and it works.
Again, freebies are very expensive, but hopefully this partial success will help someone else.
[Edit — 2022-07-24]
I recently found that the DNS record for my.freenom.com pointed to an IP that was not working. Note that my.freenom.com hosts the login page as well as the DNS record update page (the Freenom webhook logs in and adds/removes the ACME challenge record through browser-style access, so if this page fails, the DNS01 challenge fails as well).
The solution for me was to find a DNS resolution that points to a working IP and update the hosts file. I used site24x7.com to help figure that out (I used 220.127.116.11).