Performing Web/API Service upgrades without Downtime

Published on 06 March 2021 in Version 4.10 / Development - 3 minutes read

By leveraging Kubernetes rollouts.

This is part of a series of posts describing how the Hue Query Service is being built.

Automation done well frees the team from repetitive manual tasks while also documenting the process: team members become more productive, spend their time adding value instead, and keep their momentum.

Now, how can the project websites be refreshed automatically, without any downtime or manual steps? The main website, the documentation site and the Japanese website all run in small containers in a main Kubernetes cluster. Containers might be a bit heavyweight for this type of static website, but they allow the helpful pattern of being driven automatically via source code changes and harmonize all the services to follow the exact same flow.

The demo website also reuses the same deployment logic, as do the other database engines it offers. Those websites are also driven via code changes in GitHub, not via any UI.

For example, here are the running websites:

kubectl get pods -ngethue
NAME                         READY   STATUS    RESTARTS   AGE
docs-55bf874485-vjnlf        1/1     Running   1          8h
website-5c579d4dd-kqlvt      1/1     Running   0          60m
website-jp-964f9cc57-h97gz   1/1     Running   0          6h38m

Until recently we were performing daily restarts the “hard way”:

kubectl delete pods -ngethue `kubectl get pods -ngethue | egrep ^website | cut -d" " -f1`

This “works” but causes unnecessary downtime and “noise”:

Hammered by “website is down” notifications

Now the standard Kubernetes rollout command is used instead, and the transition is transparent for both admins and public users!

kubectl rollout restart -ngethue deployment/website
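In an automated job it can help to also wait for the rollout to complete, so that a stuck restart is noticed. The following is only a minimal sketch: the `restart_and_wait` function name, the `KUBECTL` and `NAMESPACE` variables and the 120 second timeout are assumptions made for illustration, not part of the actual Hue setup.

```shell
# Sketch: restart a deployment and block until the rollout finishes.
# KUBECTL and NAMESPACE are hypothetical overridable variables, so the
# function can be dry-run (e.g. KUBECTL=echo) without touching a cluster.
KUBECTL="${KUBECTL:-kubectl}"
NAMESPACE="${NAMESPACE:-gethue}"

restart_and_wait() {
  local deployment="$1"
  # Trigger a rolling restart of the deployment's pods.
  "$KUBECTL" rollout restart -n "$NAMESPACE" "deployment/$deployment" || return 1
  # Wait for the rollout to complete; exits non-zero if the timeout is hit.
  "$KUBECTL" rollout status -n "$NAMESPACE" "deployment/$deployment" --timeout=120s
}

# Usage (requires kubectl access to the cluster):
#   restart_and_wait website
```

A non-zero exit code from `restart_and_wait` could then be used to fail the daily job and alert the team.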

First diagram from the Kubernetes documentation demoing a rollout

The new website instance/pod starts and is swapped with the old one once it is ready:

kubectl get pods -ngethue
NAME                         READY   STATUS    RESTARTS   AGE
docs-55bf874485-vjnlf        1/1     Running   1          13h
website-75c7446d4c-z5p6g     0/1     Running   0          6s
website-bb6fc6b6-nkzqh       1/1     Running   0          18m
website-jp-964f9cc57-h97gz   1/1     Running   0          11h
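In the listing above, the new pod is the one whose READY column still shows 0/1. As a small illustrative sketch (the `not_ready_pods` helper is a hypothetical name, not an existing tool), such pods can be filtered out of the `kubectl get pods` table like this:

```shell
# Sketch: print the names of pods whose READY column (e.g. "0/1") shows
# fewer ready containers than expected.
# Reads the default `kubectl get pods` table, header included, on stdin.
not_ready_pods() {
  awk 'NR > 1 { split($2, ready, "/"); if (ready[1] != ready[2]) print $1 }'
}

# Usage (requires kubectl access to the cluster):
#   kubectl get pods -ngethue | not_ready_pods
```

Run against the listing above, this would print only the freshly started website pod.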

Note that the latest tag is being used here, and a new image gets built daily when the repository mirror gets synced. Building the image of the static websites is very simple and has a very low chance of failing or shipping an incorrect image. By leveraging proper tagging, all the states would be versioned and failing upgrades would automatically roll back to a previously valid state.
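One way the tagged upgrade with rollback could be scripted is sketched below. This is not the current Hue setup: the `deploy_tag_or_rollback` function, the `KUBECTL` and `NAMESPACE` variables, the container name and the image tag in the usage line are all hypothetical. It relies only on the standard `kubectl set image`, `kubectl rollout status` and `kubectl rollout undo` commands.

```shell
# Sketch: roll out a specific image tag and undo the rollout if it does
# not become ready in time. All names below are illustrative assumptions.
KUBECTL="${KUBECTL:-kubectl}"
NAMESPACE="${NAMESPACE:-gethue}"

deploy_tag_or_rollback() {
  local deployment="$1" container="$2" image="$3"
  # Point the container at the new, explicitly tagged image.
  "$KUBECTL" set image -n "$NAMESPACE" "deployment/$deployment" "$container=$image" || return 1
  # Wait for the new pods; on failure or timeout, go back to the previous revision.
  if ! "$KUBECTL" rollout status -n "$NAMESPACE" "deployment/$deployment" --timeout=120s; then
    "$KUBECTL" rollout undo -n "$NAMESPACE" "deployment/$deployment"
    return 1
  fi
}

# Usage (hypothetical container name and image tag):
#   deploy_tag_or_rollback website website example/website:2021-03-06
```

Because every image carries its own tag, the previous revision that `rollout undo` restores is a known, versioned state rather than whatever latest pointed to at the time.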

Current requirements are “100% automated, as simple as possible, with daily frequency”. But what if we would like a more “real time” rollout (e.g. after each commit or pull request, or hourly)? This is in the plan and will be detailed in a follow-up blog post.

Any feedback or questions? Feel free to comment here or on the Discussions, and quick start SQL querying!


Romain from the Hue Team
