Introducing Hue request tracing with OpenTracing and Jaeger in Kubernetes

Published on 24 September 2019 in Administration / Querying / Version 4.6 - 3 minutes read - Last modified on 04 February 2020

Hue is getting easy to run with its Docker container and Kubernetes Helm package. Recent blog posts describe how to get access to logs and metrics. Even in a non-distributed world, it can be hard to know how much time is being spent where in each user request.

Consequently, for a Data Analyst, understanding why a certain query is slow can be difficult. Add multiple tenants and users and more than 20 external APIs, and fine-grained performance becomes foggy: troubleshooting turns extremely manual and time-consuming.

To bring clarity on where exactly the time of each request is being spent, Hue started to implement the OpenTracing API. Jaeger was selected as the implementation for its ease of use and its close integration with Kubernetes. Here we will also leverage the MicroK8s distribution, which bundles it.
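To make the idea of a trace concrete, here is a minimal sketch in plain Python. It is not the OpenTracing or Jaeger API and not Hue's code: `SketchTracer` and its `span` method are hypothetical names used only for illustration. Each span records an operation name, optional tags, its parent and its duration, which is the kind of information a tracing backend like Jaeger collects and displays.

```python
import time
from contextlib import contextmanager


class SketchTracer:
    """Hypothetical, simplified tracer (NOT the OpenTracing or Jaeger API)."""

    def __init__(self):
        self.finished = []  # finished spans, in completion order
        self._stack = []    # currently open spans, innermost last

    @contextmanager
    def span(self, operation, tags=None):
        record = {
            "operation": operation,
            "tags": dict(tags or {}),
            # The innermost open span, if any, becomes the parent.
            "parent": self._stack[-1]["operation"] if self._stack else None,
        }
        self._stack.append(record)
        start = time.perf_counter()
        try:
            yield record
        finally:
            record["duration_s"] = time.perf_counter() - start
            self._stack.pop()
            self.finished.append(record)


tracer = SketchTracer()
with tracer.span("notebook-execute", tags={"user-id": "romain"}):
    with tracer.span("execute_statement"):
        time.sleep(0.01)  # stand-in for the call to the database
    with tracer.span("fetch_result"):
        time.sleep(0.01)  # stand-in for fetching the rows

for s in tracer.finished:
    print(s["operation"], "parent:", s["parent"], "seconds:", round(s["duration_s"], 3))
```

The nested `with` blocks mirror how one user request fans out into several external calls: the outer span covers the whole request, while the inner spans show where the time went.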


Hue now ships with the OpenTracing integration, and details about the current state of this feature are in the Tracing design document. To turn it on, in the hue.ini:

[desktop]
[[tracing]]
## If tracing is enabled.
enabled=true

## Trace all the requests instead of a few specific ones like the SQL Editor. Much noisier but currently required.
trace_all=true

On the Jaeger side, as explained in its quick start, it is simple to run it on the same host as Hue with this container:

docker run -d --name jaeger \
  -p 5775:5775/udp \
  -p 6831:6831/udp \
  -p 6832:6832/udp \
  -p 5778:5778 \
  -p 16686:16686 \
  -p 14268:14268 \
  -p 9411:9411 \
  jaegertracing/all-in-one

And that’s it! The Jaeger UI should now be available at http://localhost:16686.
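To check from a script that the container is answering, a small helper like the one below can be used. It is a sketch using only the Python standard library; the function name `jaeger_ui_is_up` and the `opener` parameter are hypothetical, and the only thing taken from this post is the UI address.

```python
import urllib.error
import urllib.request


def jaeger_ui_is_up(url="http://localhost:16686", timeout=2,
                    opener=urllib.request.urlopen):
    """Return True if the Jaeger UI answers with HTTP 200, False otherwise.

    `opener` is injectable so the helper can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, HTTP error status...
        return False


if __name__ == "__main__":
    print("Jaeger UI reachable:", jaeger_ui_is_up())
```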

Tracing queries

In the SQL Editor of Hue, execute a series of queries. In the Jaeger UI, if you then select the hue-api service, each external call to the queried data warehouse (e.g. execute_statement, fetch_status, fetch_result… to MySQL, Apache Impala…) is traced. Below we can see 5 query executions that went pretty fast.

Fine-grained filtering at the user or query operation level is possible. For example, to look up all the submit query calls of the user ‘romain’, select ‘notebook-execute’ as the Operation and filter by the tag user-id="romain".
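The same filter can be turned into a shareable link. The sketch below builds one with the standard library. It assumes the Jaeger UI search page accepts `service`, `operation` and a JSON-encoded `tags` query parameter, which may differ between Jaeger versions; `jaeger_search_url` is a hypothetical helper, not part of Hue or Jaeger.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

JAEGER_UI = "http://localhost:16686"  # address used earlier in this post


def jaeger_search_url(service, operation=None, tags=None, base=JAEGER_UI):
    """Build a Jaeger UI search link (query parameter names are an assumption)."""
    params = {"service": service}
    if operation:
        params["operation"] = operation
    if tags:
        # Tags are passed as a JSON object, e.g. {"user-id": "romain"}.
        params["tags"] = json.dumps(tags, sort_keys=True)
    return f"{base}/search?{urlencode(params)}"


url = jaeger_search_url("hue-api", "notebook-execute", {"user-id": "romain"})
print(url)
```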

In the next iteration, more calls and tags (e.g. filter all traces by SQL session XXX) will be supported, and a closer integration with the database engine would even propagate the trace id across the whole system.
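As an illustration of what propagating a trace id means, the sketch below serializes a span context into a request header and reads it back on the receiving side. It follows the `uber-trace-id` header layout used by Jaeger clients, `{trace-id}:{span-id}:{parent-span-id}:{flags}`. The `inject` and `extract` helpers are hypothetical stand-ins written for this post, not the OpenTracing library calls, and this is not a description of what Hue currently does.

```python
import secrets

HEADER = "uber-trace-id"  # header name used by Jaeger clients


def inject(trace_id, span_id, sampled=True):
    """Serialize a span context as {trace-id}:{span-id}:{parent-span-id}:{flags}."""
    flags = 1 if sampled else 0
    # The parent span id field is kept for compatibility and set to 0 here.
    return {HEADER: f"{trace_id:x}:{span_id:x}:0:{flags:x}"}


def extract(headers):
    """Parse the header back; returns None when no trace context was sent."""
    raw = headers.get(HEADER)
    if raw is None:
        return None
    trace_id, span_id, _parent, flags = raw.split(":")
    return {
        "trace_id": int(trace_id, 16),
        "span_id": int(span_id, 16),
        "sampled": bool(int(flags, 16) & 1),
    }


# Caller side: attach the context to an outgoing request.
trace_id = secrets.randbits(64)
span_id = secrets.randbits(64)
headers = inject(trace_id, span_id)

# Callee side: recover the same trace id so its spans join the same trace.
context = extract(headers)
print(context["trace_id"] == trace_id)
```

Because both sides agree on the trace id, spans emitted by the database engine would show up under the same trace as the Hue request that triggered them.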


Any feedback or questions? Feel free to comment here, on the Forum or @gethue, and quick start SQL querying!


Romain from the Hue Team
