Troubleshooting Traefik routing and Jaeger for observability

Background

As a continuation of my last article on configuring Traefik: there were quite a few moments when the routing did not work as expected, so it is worth sharing how I went about troubleshooting.

Items I tried

To me, the access log is important for knowing which requests actually pass through Traefik.

By default, the Helm chart sets the log level to ERROR and disables the access log.

I overrode the Helm chart values (in my values YAML) to set the log level to DEBUG and to enable the access log.
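A sketch of such an override, assuming the official traefik/traefik Helm chart, which exposes these settings under the logs key. The file name is only illustrative, and key names can differ between chart versions, so check the values.yaml of the chart you have installed.

```yaml
# values.yaml (hypothetical file name), passed to helm with -f
logs:
  general:
    # chart default is ERROR
    level: DEBUG
  access:
    # chart default is false
    enabled: true
```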

According to the documentation for logs and access logs, the log level and the access log settings can be set through file-based configuration (a YAML file defined in a ConfigMap and mounted as a volume). However, I could not find a setting there that simply switches the access log on or off.

Inspecting the Helm chart's pod template further, I found that both the log level and the access log switch are passed as command line interface (CLI) arguments.

We can then watch the log by running a kubectl command:
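For reference, the rendered container arguments look roughly like the fragment below. This is a sketch of the relevant lines only, not the chart's full pod template. --log.level and --accesslog are Traefik static configuration flags, while the surrounding layout is an assumption about how the chart renders them.

```yaml
# excerpt of the Traefik container spec (other arguments omitted)
args:
  - "--log.level=DEBUG"
  - "--accesslog=true"
```

Dumping the deployment with kubectl get deploy <traefik deployment name> -o yaml is one way to confirm what your own installation actually passes.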

# the -f switch means "follow": the output keeps updating as new log entries come in
kubectl logs svc/<traefik service name> -f

Then we hit our site and read the log. At DEBUG level there is an overwhelming number of entries, so we look for the ones that belong to the request in question.

Several pieces of information stand out in such an entry. First, the request path is /api/device/systemTime and the X-Replaced-Path is /device/systemTime; this aligns with the middleware that rewrites the path from /device/* to /api/device/*.

Also, the forward URL goes to 10.109.0.85:8031, which is one of the pods of the target service (one can check this with kubectl get pod -o=wide).
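The middleware itself comes from the previous article and is not shown here. As a rough sketch, a rewrite of this kind could be declared as below, assuming Traefik v2 CRDs and the ReplacePathRegex middleware, which records the original path in the X-Replaced-Path header. The middleware name is hypothetical.

```yaml
# Traefik v2 CRD group; Traefik v3 uses traefik.io/v1alpha1 instead
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: device-to-api   # hypothetical name
spec:
  replacePathRegex:
    # capture everything after /device/ ...
    regex: ^/device/(.*)
    # ... and re-emit it under /api/device/
    replacement: /api/device/$1
```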

Traefik comes with a dashboard; one can follow the Traefik documentation to set it up (either in insecure mode, or in secure mode with authentication using a Kubernetes secret).

By default, an IngressRoute is added with the rule:

(PathPrefix(`/api`) || PathPrefix(`/dashboard`))

This is not enough: it does not specify a hostname, so other services matching the same pattern (most likely the /api prefix) may conflict with it. The documentation recommends also matching on the host, i.e. combining a Host(...) matcher with the path prefixes above.

And of course, if we do not want to configure DNS, we can update the hosts file to point the domain name to the Traefik load balancer IP address.

To access the dashboard, use a URL of the form:
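A sketch of a host-scoped dashboard route, assuming Traefik v2 CRDs and an entrypoint named web. The hostname traefik.example.com and the resource name are placeholders, and api@internal is Traefik's built-in dashboard/API service.

```yaml
# Traefik v2 CRD group; Traefik v3 uses traefik.io/v1alpha1 instead
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: traefik-dashboard   # hypothetical name
spec:
  entryPoints:
    - web                   # assumed entrypoint name
  routes:
    # only match dashboard/API paths on the dedicated hostname
    - match: Host(`traefik.example.com`) && (PathPrefix(`/api`) || PathPrefix(`/dashboard`))
      kind: Rule
      services:
        - name: api@internal
          kind: TraefikService
```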

http://<traefik dashboard hostname>/dashboard/

In my scenario, leaving out the trailing "/" after dashboard did not work (the request somehow got matched to another route).

The dashboard helped me see the list of entrypoints and confirm that the configuration was taking effect as expected.

Also, the routers view shows which routes have errors, and drilling down into a router shows its rule as well as the details (the middlewares applied and the cause of the error).

Screenshot: list of routes
Screenshot: error reason shown for a route
Screenshot: middlewares applied to the route

I learned about Jaeger through the Traefik documentation section Observability > Tracing > Jaeger. So far I have only used it to inspect information about particular requests and the middlewares applied to them.

I installed Jaeger with the Jaeger Operator Helm chart (installation instructions are on their GitHub).

A minimal deployment only needs a short Kubernetes YAML manifest that describes the Jaeger instance.
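A sketch of such a manifest, assuming the Jaeger Operator is already running in the cluster. With no further settings the operator uses its default all-in-one strategy with in-memory storage, which is enough for troubleshooting but not for production. The instance name jaeger is an assumption, chosen so that the generated service names (such as jaeger-query) match the hostname used later.

```yaml
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  # hypothetical name; the operator derives service names
  # such as jaeger-query and jaeger-agent from it
  name: jaeger
```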

On the Traefik side, enable Jaeger tracing through CLI arguments, for example via "additionalArguments" in the Helm values or any other way of passing CLI flags.
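A sketch of the Helm values for this, assuming Traefik v2 (these Jaeger-specific flags were removed in Traefik v3 in favour of OpenTelemetry) and a Jaeger instance named jaeger in the default namespace. The service name, namespace and ports are assumptions to adjust to your own setup.

```yaml
additionalArguments:
  # turn on the Jaeger tracing backend
  - "--tracing.jaeger=true"
  # where Traefik fetches its sampling strategy (assumed agent service and port)
  - "--tracing.jaeger.samplingServerURL=http://jaeger-agent.default.svc:5778/sampling"
  # where Traefik sends spans over UDP (assumed agent service and port)
  - "--tracing.jaeger.localAgentHostPort=jaeger-agent.default.svc:6831"
```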

Finally, add an IngressRoute that exposes the Jaeger query UI.

Add a record to the hosts file pointing "jaeger-query" to the Traefik load balancer IP address, then open a browser at http://jaeger-query.

We see the UI as follows; clicking one of the traces shows more detail.
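A sketch of such a route, assuming Traefik v2 CRDs, an entrypoint named web, and a query service called jaeger-query listening on port 16686 (the Jaeger UI's default port). The resource name is hypothetical.

```yaml
# Traefik v2 CRD group; Traefik v3 uses traefik.io/v1alpha1 instead
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: jaeger-query        # hypothetical name
spec:
  entryPoints:
    - web                   # assumed entrypoint name
  routes:
    - match: Host(`jaeger-query`)
      kind: Rule
      services:
        - name: jaeger-query   # assumed service created for the Jaeger instance
          port: 16686
```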

The UI, showing the request listing ordered by time
The request goes through 3 "frames" (spans, in Jaeger terms); each frame shows different details
Looking at the frame names, we can already see that the request first goes to the requested host, then runs through the redirect http => https middleware, and is finally routed to the target service. This confirms that the traffic takes the expected route (note that this example did not go through the URL rewrite).

Jaeger is far more powerful for traceability; I just haven't learned to use more of it yet.

Conclusion

In this article, I shared the tools I used to troubleshoot and trace Traefik routes. Hope that helps someone~

Stephen Cow Chau
