You are not alone: a frustrating (failed) journey setting up a private Docker registry with a self-signed certificate on Docker Desktop’s Kubernetes in WSL2

Background

This is an incomplete journey (hopefully I will reach the destination eventually) of setting up a private Docker registry inside the Kubernetes cluster of Docker Desktop on WSL2…

I want to summarize it partly in the hope that someone with experience can help, and partly to share my experience so that people trying the same thing know what may or may not work and what behavior to expect.

My setup

Environment

WSL 2 running in Windows 10 (Build 19044)

WSL 2 OS version: Ubuntu 20.04.3 LTS

Docker Desktop Version 4.5.1

Associated Docker Engine Client and Server version

What worked with the installed Docker

With the above setup, Docker can build and run containers locally, as well as log in to, push to, and pull from:

  1. Microsoft Azure docker container registry
  2. A private Docker registry on Aliyun [Alibaba Cloud in mainland China] with TLS terminated at nginx in front of the registry

Based on the above, the Docker Desktop and WSL setup can be assumed “healthy”.

Kubernetes setup

The Kubernetes cluster is the one bundled with Docker Desktop.

Result of running “kubectl cluster-info”

Worth mentioning: the Docker Desktop network

Normally, when we install Docker natively on Linux, it creates a new network interface (bridge) “docker0”, but the Docker Desktop user manual explicitly states that there is no such interface:

https://docs.docker.com/desktop/windows/networking/#there-is-no-docker0-bridge-on-windows

The result is that we can use localhost (127.0.0.1) directly to reach the container network. For example, if we publish a container on port 5000, we can access localhost:5000 from the host computer to reach the container.

Expectation of target setup

Based on my previous experience deploying private Docker registries, I expect the registry to be accessed by FQDN over TLS, and I expect to be able to log in, push, and pull container images from both the host machine and the local Kubernetes cluster.

Setup Idea 1

Given that my Kubernetes cluster uses Traefik as the ingress controller, I expected to not terminate TLS at Traefik and instead pass it through to the Docker registry.

Docker registry

The Docker registry deployment is as follows (some selector and template metadata lines are skipped for screen capture purposes):

One thing worth mentioning is the environment variable “REGISTRY_HTTP_TLS_CLIENTCAS”: I found no example online showing how to specify the client CA cert through environment variables (one can deploy a config file as an alternative).

The official Docker documentation explains how to override configuration options and lists all available values:

https://docs.docker.com/registry/configuration/#override-specific-configuration-options

Note that “clientcas” is an array of values, so I put “- /certs/ca.crt” as the value in the configuration (credit goes to a Stack Overflow post I lost track of).
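As a minimal sketch of the point above, the helper below only builds the environment assignment, with the array written as a one-item YAML list. The function name registry_clientcas_env is hypothetical, and the deployment name in the usage comment is an assumption rather than something taken from the manifest.

```shell
# Hypothetical helper: print the env assignment for the registry's client CA list.
# The value is a one-item YAML list, matching the "- /certs/ca.crt" form above.
registry_clientcas_env() {
  # $1: path of the CA certificate inside the registry container
  printf 'REGISTRY_HTTP_TLS_CLIENTCAS=- %s\n' "$1"
}

# Usage (not run here; "deployment/docker-registry" is an assumed name):
# kubectl -n docker-registry set env deployment/docker-registry \
#   "$(registry_clientcas_env /certs/ca.crt)"
```

Keeping the leading "- " inside the quoted value is what makes the registry read it as a list instead of a plain string, at least as far as I understand the override mechanism.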

Traefik

The Traefik configuration is trickier. Most of my previous experience using Traefik to serve TLS content had TLS terminated at the Traefik ingress, whereas setup idea 1 needs a passthrough. The very first struggle is that IngressRoute does not offer TLS passthrough; only IngressRouteTCP supports it.

https://doc.traefik.io/traefik/routing/providers/kubernetes-crd/#kind-ingressroute
https://doc.traefik.io/traefik/routing/providers/kubernetes-crd/#kind-ingressroutetcp

Another struggle was that the middlewares supported by IngressRouteTCP are very limited compared to IngressRoute.

Another very important property of IngressRouteTCP (the CRD for a Traefik TCP router) is that it takes precedence over the HTTP router (IngressRoute) on the same entrypoint.

https://doc.traefik.io/traefik/routing/routers/#general_1

The configuration for this trial is as follows:

Verify result

Note that I had already added an entry to the Windows hosts file (as I use Docker Desktop on Windows with WSL2 integration) pointing docker-registry.my.localvirtual.domain to 127.0.0.1.

  1. Test with curl using the following command:
curl -v --cacert ca.crt --cert client.cert --key client.key -u <username>:<password> https://docker-registry.my.localvirtual.domain:5443/v2/_catalog
Success result

  2. Test with docker login

Run the following command (the --debug switch helps a lot):

docker --debug --tlsverify --tls --tlscacert=ca.crt --tlscert=client.cert --tlskey=client.key login docker-registry.my.localvirtual.domain:5443

We would see a failure like the following:

Note that the error states “tls: bad certificate”. We also see two connection attempts, first HTTPS and then HTTP (this is by design in the docker CLI, so it is expected behavior).

But this raises the question: where does the traffic go wrong?

Checking logs

Assuming the traffic goes from the client (curl or docker CLI) to the Traefik ingress and finally to the Docker registry container, it is worth checking the logs of both:

# Traefik log (ignoring the ping and /api/overview entries that overwhelm the log)
kubectl -n=traefik logs svc/traefik -f | grep -v -E "(ping|/api/overview)"
# Docker registry log (ignoring the debug-mode "filesystem.Stat" entries that overwhelm the log)
kubectl -n=docker-registry logs svc/docker-registry -f | grep -v "filesystem.Stat"

The curl request results in logs like this:

Traefik log upon curl
Docker registry log upon curl

The docker login command generates the following logs:

Traefik log upon docker login CLI
Docker registry log upon docker login CLI

The conclusion I drew from these results is that the TLS cert is not passed from Traefik to the Docker registry.
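A possible extra check (a sketch, not one of the steps above) is to probe the endpoint with openssl s_client while offering the same client certificate, which shows the handshake independently of both curl and the docker CLI. The function name probe_registry_tls is hypothetical; the certificate file names are the ones used in the curl test.

```shell
# Hypothetical helper: open a TLS connection, offering the client certificate,
# and print the handshake details (server certificate chain, verification result).
probe_registry_tls() {
  # $1: host:port to connect to, $2: server name to send via SNI
  openssl s_client -connect "$1" -servername "$2" \
    -CAfile ca.crt -cert client.cert -key client.key </dev/null
}

# Usage (not run here):
# probe_registry_tls docker-registry.my.localvirtual.domain:5443 docker-registry.my.localvirtual.domain
```

If TLS really is passed through untouched, the server certificate printed for the Traefik entrypoint should be the registry's own certificate.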

Further control experiment: making sure the Docker registry setup is correct (removing the impact of Traefik)

I wanted to make sure the Docker registry configuration is correct, so I set up a Linux pod inside the cluster to verify that I can log in to the registry through the docker CLI.

Note that it is NOT a docker-in-docker setup: it only installs the docker CLI for login, NOT the Docker engine, so docker pull and push can NOT be verified this way.

(Side reference for docker in docker 1, 2, 3, 4)

# create a new pod in the same namespace as the Docker registry and start a shell in it
kubectl -n=docker-registry run -it --rm --restart=Never alpine --image=alpine:latest -- sh
# inside the new Alpine Linux shell, run the following (to install packages)
apk update && apk add curl busybox-extras nano docker
# add a record to /etc/hosts (the record looks like this:)
# <docker registry service IP> <hostname on the SSL cert>
# 192.168.2.3 docker-registry.my.localvirtual.domain
nano /etc/hosts
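The hosts record can also be added without an interactive editor. A minimal sketch, assuming the service is named docker-registry in the docker-registry namespace (as in the log command earlier); the helper names are hypothetical. Since kubectl is not installed in the Alpine pod, the IP lookup would run on the host.

```shell
# Run on the host: print the ClusterIP of the registry service.
registry_cluster_ip() {
  kubectl -n docker-registry get svc docker-registry -o jsonpath='{.spec.clusterIP}'
}

# Run inside the pod: append "<ip> <hostname>" to a hosts file.
add_hosts_record() {
  # $1: service IP, $2: hostname on the TLS cert, $3: hosts file (default /etc/hosts)
  printf '%s %s\n' "$1" "$2" >> "${3:-/etc/hosts}"
}

# Usage (not run here):
# add_hosts_record 192.168.2.3 docker-registry.my.localvirtual.domain
```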

Copy over the certs, then run the curl command as before (note that the service is exposed on port 9665, different from the Traefik entrypoint port 5443):

curl -v --cacert ca.crt --cert client.cert --key client.key -u <username>:<password> https://docker-registry.my.localvirtual.domain:9665/v2/_catalog

The curl request succeeds, so next run docker login:

docker --debug --tlsverify --tls --tlscacert=ca.crt --tlscert=client.cert --tlskey=client.key login docker-registry.my.localvirtual.domain:9665

Checking the log from the Docker registry, we see:

Set up the docker CLI to use the cert?

After some research, I found that to let the client supply the certificate, we need to create a folder with the following structure:

https://docs.docker.com/engine/security/certificates/
My folder setup (with my registry URL replacing the documentation’s localhost:5000)
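The folder layout can be sketched as a small helper. It assumes the three files are named ca.crt, client.cert, and client.key as in the Docker documentation linked above; the function name install_registry_certs and its arguments are hypothetical.

```shell
# Hypothetical helper: copy the CA cert, client cert, and client key into
# <base dir>/<registry host:port>/, keeping the documented file names.
install_registry_certs() {
  base_dir=$1   # e.g. /etc/docker/certs.d
  registry=$2   # e.g. docker-registry.my.localvirtual.domain:9665
  src_dir=$3    # directory holding ca.crt, client.cert, client.key
  target="$base_dir/$registry"
  mkdir -p "$target" || return 1
  for f in ca.crt client.cert client.key; do
    cp "$src_dir/$f" "$target/$f" || return 1
  done
  # keep the private key readable by its owner only
  chmod 600 "$target/client.key"
}

# Usage (not run here):
# install_registry_certs /etc/docker/certs.d docker-registry.my.localvirtual.domain:9665 .
```

Taking the base directory as a parameter means the same helper can target a different certs.d location if the Docker installation looks somewhere else.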

After this, run the login without specifying the certs and key:

docker --debug login docker-registry.my.localvirtual.domain:9665

The login succeeds:

So up to this point, I am confident that the Docker registry works, and that the “client didn’t provide a certificate” error message I saw previously is related to the docker client configuration.

So I went ahead and created the same folder structure that I had tested inside the Linux pod, then ran the following:

docker --debug login docker-registry.my.localvirtual.domain:5443

The result is as follows:

The interesting point is that it tries to access HTTP instead of HTTPS, so I tried to force TLS with the following command:

docker --debug --tls login docker-registry.my.localvirtual.domain:5443

The result is as follows:

It looks to me like it tries a default path to look for certificates, so I finally used the longer command specifying the TLS certs:

docker --debug --tlsverify --tls --tlscacert=ca.crt --tlscert=client.cert --tlskey=client.key login docker-registry.my.localvirtual.domain:5443

And I ended up with this result:

Checking the log in the Docker registry, we see the same “tls: client didn’t provide a certificate” again.

So here I suspect the docker client isn’t using the certificate, even when it is specified in the command.

After digging into the documentation more, I found that the Docker Desktop user manual has a section on adding TLS certificates:

https://docs.docker.com/desktop/windows/#adding-tls-certificates

So it implies that Docker Desktop does not monitor /etc/docker/certs.d/<url:port> for certificates but manages them under ~/.docker/certs.d/<url:port>.

After copying the certs there, the login still did not succeed and the TLS cert was still not picked up.

Checking result with tcpdump

Afterwards, I thought I should capture the TCP traffic to see whether a certificate was being sent. To avoid making the article too long, the learnings and results are as follows:

  1. Wireshark on Windows cannot see traffic inside WSL2, which is why I used tcpdump and exported a pcap file to view in Wireshark. [I saw some Stack Overflow posts mentioning this, and my own test had no luck either, but I did not study this seriously.]
  2. We can decrypt TLS traffic with (export SSLKEYLOGFILE=path/to/log), but docker login traffic cannot be decrypted this way (according to this post, because the docker CLI is written in Go and reads a config defined in code). I also tried mitmproxy but had no luck setting it up correctly for proxying with the self-signed cert.
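The capture and key-logging steps above can be sketched as follows. The helper names are hypothetical, and the sketch assumes a curl build whose TLS backend honors SSLKEYLOGFILE; tcpdump usually needs root.

```shell
# Hypothetical helper: capture traffic on one TCP port into a pcap file
# that can later be opened in Wireshark.
capture_registry_traffic() {
  # $1: TCP port to capture, $2: output pcap file
  tcpdump -i any -w "$2" "tcp port $1"
}

# Hypothetical helper: run curl with TLS session keys logged to a file,
# so Wireshark can decrypt the captured session.
curl_with_keylog() {
  # $1: key log file; remaining arguments are passed to curl unchanged
  keylog=$1
  shift
  SSLKEYLOGFILE="$keylog" curl "$@"
}

# Usage (not run here), in two terminals:
# capture_registry_traffic 5443 registry.pcap
# curl_with_keylog keys.log -v --cacert ca.crt --cert client.cert --key client.key \
#   https://docker-registry.my.localvirtual.domain:5443/v2/_catalog
```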

So, again, no luck. End of setup 1.

Setup Idea 2.0

Setup idea 2.0 is based on the article “Authenticate proxy with nginx” in the official documentation.

This time the setup is a bit different: TLS is terminated at Traefik, while for the authentication, I tried putting it:

  1. at Traefik alone
  2. at the Docker registry only
  3. at both Traefik and the Docker registry
  4. nowhere, dropping authentication altogether

None of these affects the final result: again, curl gets through (for case 4 above, without passing the login/password), and the docker CLI fails with the same error.

The config for TLS termination uses this IngressRoute + Middleware, replacing the previous IngressRouteTCP with TLS passthrough:

Note that passHostHeader is added, since the example article mentions that proxy_set_header is required.

Up to this point, I do not know enough to tell which part of the route has the problem.

Bypass Traefik and access the registry directly

I wanted to let the docker client connect to the Docker registry directly.

I changed the service type from ClusterIP to NodePort as follows:

We can then find the service port to use to access the service from the node (which is localhost in my case).
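A minimal sketch for reading the assigned port, assuming the service is named docker-registry in the docker-registry namespace and that the registry port is the first entry in the service's port list (both are assumptions about the manifest). The function name is hypothetical.

```shell
# Hypothetical helper: print the NodePort assigned to the first port of the service.
registry_node_port() {
  kubectl -n docker-registry get svc docker-registry \
    -o jsonpath='{.spec.ports[0].nodePort}'
}

# Usage (not run here):
# curl -v --cacert ca.crt --cert client.cert --key client.key \
#   "https://docker-registry.my.localvirtual.domain:$(registry_node_port)/v2/_catalog"
```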

The result is the same as before: curl can access the registry and the docker CLI cannot (with the same error).

In summary, it seems the problem is that the docker CLI does not pass the certificate correctly.

A deeper dive into the docker CLI (code)

I spent some time studying the docker CLI source code, not sure whether I still want to go down this path… Long story short, the code indicates that the docker CLI respects the docker context. I did try creating a context, and the result stayed the same…

Conclusion up to now

I have kept spending time on this, and I feel that Docker Desktop + WSL2 + local Kubernetes + self-signed certificate is a tough path that I am not convinced I should pursue further…

But the experience did help me explore a lot more tools and learn how the docker CLI works, as well as how to troubleshoot Kubernetes.

As usual, I hope this article helps someone.
