Introducing Hue requests tracing with Opentracing and Jaeger in Kubernetes

Published on 24 September 2019 in Version 4 / Development - 3 minutes read - Last modified on 19 April 2021

Hue is getting easy to run with its Docker container and Kubernetes Helm package. Recent blog posts describes how to get access to logs and metrics. Even in a non distributed world it can get noisy to know how much time is being spent where in each user request.

Consequently, in the context of a Data Analyst, knowing why a certain query is slow can become problematic. On top of that, adding multiple tenants and users, and more than 20 external APIs and the fog about fine grain performances appears and its becomes extremely manual and time consuming to troubleshoot.

In order to help get clarity on where exactly each request time is being spent, Hue started to implement the Opentracing API. Jaeger was selected as the implementation for its ease of use and close support with Kubernetes. Here we will also leverage the Microk8s distribution that bundles it.


Hue now ships with the open tracing integration, and details about the current state of this feature are in the Tracing design document. To turn it on, in the hue.ini:

## If tracing is enabled.

## Trace all the requests instead of a few specific ones like the SQL Editor. Much noisier but currently required.

On the Jaerger side, as explained in the quick start, it is simple to run it on the same host as Hue with this container:

docker run -d --name jaeger \
  -p 5775:5775/udp \
  -p 6831:6831/udp \
  -p 6832:6832/udp \
  -p 5778:5778 \
  -p 16686:16686 \
  -p 14268:14268 \
  -p 9411:9411 \

And that’s it! Jaeger should show up at this page http://localhost:16686.

Tracing queries

In the SQL Editor of Hue, execute a series of queries. In the Jaeger UI, if you then select the hue-api service, each external call to the queried datawarehouse (e.g. execute_statement, fetch_status, fetch_result… to MySql, Apache Impala…) are being traced. Below we can see 5 query executions that went pretty fast.

Fine grain filtering at the user or query level operation is possible. For example, to lookup all the submit query calls of the user ‘romain’, select ‘notebook-execute’ as the Operation, and tag filter via user-id=”romain”:

In the next iteration, more calls and tags (e.g. filter all traces by SQL session XXX) will be supported and a closer integration with the database engine would even propagate the trace id across all the system.


Any feedback or question? Feel free to comment here or on the Forum or @gethue and quick start SQL querying!


Romain from the Hue Team

comments powered by Disqus

More recent stories

21 September 2021
Access your data in ABFS without any credential keys!
Read More
14 September 2021
SSO for REST APIs with your custom JWT authentication
Read More
17 August 2021
Create Phoenix tables in Just 2 steps
Read More