Quick Task: How to query Apache Druid analytic database

Published on 22 March 2019 in Version 4 - 2 minutes read - Last modified on 05 June 2023

Self-service exploratory analytics is one of the most common use cases of the Hue users. While deeply integrated with Apache Impala and Apache Hive, Hue also lets you take advantage of its smart editor and assistants with any databases. In this tutorial, let's see how to query Apache Druid.

Apache Druid is an “OLAP style” database.

If not already running, it is easy to get Druid downloaded and started. In our case we will just query the provided Wikipedia data sample.

Administrator

First, let's make sure that Hue can talk to Druid via the pydruid SqlAlchemy connector. Either make sure it is in the global Python environment via a usual ‘pip install’ or install it in the Hue virtual environment.

./build/env/bin/pip install pydruid

Note: Make sure the version is equal or more to 0.4.1 if not you will get a “Can't load plugin: sqlalchemy.dialects:druid”.

In the hue.ini configuration file, now let's add the interpreter. Here ‘druid-host.com’ would be the machine where Druid is running.

[notebook]
[[interpreters]]
[[[druid]]]
name = Druid
interface=sqlalchemy
options='{"url": "druid://druid-host.com:8082/druid/v2/sql/"}'

And now restart Hue.

User

And that's it, now open-up http://127.0.0.1:8000/hue/editor/?type=pydruid (replace host or port of your actual Hue) and you can start querying!

SELECT countryName, count(*) t
FROM druid.wikipedia
GROUP BY countryName
ORDER BY t DESC
LIMIT 100

As usual feel free to comment here or to send feedback to the hue-user list or @gethue!

 


comments powered by Disqus

More recent stories

03 May 2023
Discover the power of Apache Ozone using the Hue File Browser
Read More
23 January 2023
Hue 4.11 and its new dialects and features are out!
Read More