SQL Editor for Apache Impala

Published on 30 April 2020 in Querying / Version 4.8 - 2 minutes read

Impala SQL

Apache Impala is a fast SQL engine for your data warehouse. Want to give it a quick try in 3 minutes? Here is how!

Starting Impala

First, make sure you have Docker installed on your system. Then, based on the great Apache Kudu tutorial (which we will cover next; in the meantime, the Kudu Quickstart is worth a look), just execute:

docker run -d --name kudu-impala -p 21000:21000 -p 21050:21050 -p 25000:25000 -p 25010:25010 -p 25020:25020 --memory=4096m apache/kudu:impala-latest impala

Afterwards, docker ps should show:

> docker ps
CONTAINER ID        IMAGE                       COMMAND                  CREATED             STATUS              PORTS                                                                                                                              NAMES
fe7b68d167b3        apache/kudu:impala-latest   "/impala-entrypoint.…"   4 seconds ago       Up 3 seconds        0.0.0.0:21000->21000/tcp, 0.0.0.0:21050->21050/tcp, 0.0.0.0:25000->25000/tcp, 0.0.0.0:25010->25010/tcp, 0.0.0.0:25020->25020/tcp   kudu-impala

Then just enter the running container and start the SQL shell:

> docker exec -it kudu-impala impala-shell

Starting Impala Shell without Kerberos authentication
Opened TCP connection to fe7b68d167b3:21000
Connected to fe7b68d167b3:21000
Server version: impalad version 3.3.0-RELEASE RELEASE (build 0f840c5a0f5e673c67cbd482e62065fd47b98e1a)
Welcome to the Impala shell.
(Impala Shell v3.4.0-SNAPSHOT (b0c6740) built on Thu Oct 17 10:56:02 PDT 2019)

When you set a query option it lasts for the duration of the Impala shell session.

Then run some SQL statements:

[fe7b68d167b3:21000] default> show tables;
Query: show tables
Fetched 0 row(s) in 0.36s
[fe7b68d167b3:21000] default> create table a (a int);
Query: create table a (a int)
+-------------------------+
| summary                 |
+-------------------------+
| Table has been created. |
+-------------------------+
Fetched 1 row(s) in 1.31s

[fe7b68d167b3:21000] default> insert into a values (1);
Query: insert into a values (1)
Query submitted at: 2020-04-30 17:42:59 (Coordinator: http://fe7b68d167b3:25000)
Query progress can be monitored at: http://fe7b68d167b3:25000/query_plan?query_id=cb410a4f8b0b0d6a:1a8a909e00000000
Modified 1 row(s) in 1.60s

[fe7b68d167b3:21000] default> select * from a;
Query: select * from a
Query submitted at: 2020-04-30 17:43:08 (Coordinator: http://fe7b68d167b3:25000)
Query progress can be monitored at: http://fe7b68d167b3:25000/query_plan?query_id=7242c5151534b8db:bef9c91000000000
+---+
| a |
+---+
| 1 |
+---+
Fetched 1 row(s) in 0.33s

[fe7b68d167b3:21000] default> exit

SQL Editor

Typing SQL with a Query Assistant is even more productive.

Using the container ID shown by docker ps (it will differ on your machine), retrieve the container's IP address with:

> docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' 638574b31cd6

Because Impala is deeply integrated with Hue, the only configuration needed is to set the Impala server hostname in hue.ini to the container's IP address.
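
As a minimal sketch, assuming the docker inspect command above returned 172.17.0.2 (a placeholder; use the address printed for your own container) and that Hue connects on port 21050, one of the ports published by the docker run command, the [impala] section of hue.ini could look like this:

```ini
[impala]
# Assumption: placeholder IP, replace with the output of docker inspect
server_host=172.17.0.2
# Assumption: 21050 is Hue's default Impala port, so this line can be left out
server_port=21050
```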


Then restart Hue and that's it, the editor will appear:

Hue Impala SQL Editor

To read more about the SQL experience in depth, follow this blog post.

Any feedback or questions? Feel free to comment here or on the Forum, and quick start querying!

Romain from the Hue Team
