SQL querying Apache HBase with Apache Phoenix

Published on 07 August 2019 in Querying / Browsing / Version 4 / Version 4.8 - 2 minutes read - Last modified on 03 June 2020

Apache Phoenix is a great addon to extent SQL on top of Apache HBase, the non relational distributed data store. On top of the HBase Browser, now the Editor provides a more common syntax for querying the data. Note that being a key/value store, the SQL can have different idioms, and the Editor interface still requires some polishing to fully support all the SQL UX capabilities of Hue.

In this post about Phoenix, let's follow Phoenix's 15-minute tutorial then query the US_POPULATION table via the Editor:

Hue supports Phoenix the best via the SqlAlchemy interfaces as described in the [SQL Connector documentation][https://docs.gethue.com/administrator/configuration/connectors/#apache-phoenix]. Since HUE-9367 the connector is shipped with Hue so it all works out of the box.

In the desktop/conf/hue.ini config file section just add the Phoenix interpreter:

[notebook]
[[interpreters]]
[[[phoenix]]]
name=phoenix
interface=sqlalchemy
options='{"url": "phoenix://sql-phoenix.gethue.com:8765/"}'

Then start the Phoenix queryserver:

phoenix-queryserver
...
19/07/24 20:55:13 INFO util.log: Logging initialized @1563ms
19/07/24 20:55:13 INFO server.Server: jetty-9.2.z-SNAPSHOT
19/07/24 20:55:14 INFO server.ServerConnector: Started [email protected]{HTTP/1.1}{0.0.0.0:8765}
19/07/24 20:55:14 INFO server.Server: Started @1793ms
19/07/24 20:55:14 INFO server.HttpServer: Service listening on port 8765.

And we are ready to query HBase!

SELECT * FROM us_population LIMIT 10

Notes

1 Existing HBase tables need to be mapped to views

0: jdbc:phoenix:> CREATE VIEW if not exists "analytics_demo_view" ( pk VARCHAR PRIMARY KEY, "hours"."01-Total" VARCHAR );
Error: ERROR 505 (42000): Table is read only. (state=42000,code=505)
-->
0: jdbc:phoenix:> CREATE Table if not exists "analytics_demo" ( pk VARCHAR PRIMARY KEY, "hours"."01-Total" VARCHAR );

2 Tables are seeing as uppercase by Phoenix. When getting started, it is simpler to just create the table via Phoenix.

Error: ERROR 1012 (42M03): Table undefined. tableName=ANALYTICS_DEMO (state=42M03,code=1012)
-->
0: jdbc:phoenix:> select * from "analytics_demo" where pk = "domain.0" limit 5;

3 Phoenix follows Apache Calcite. Feel free to help improve the SQL autocomplete support for it.

4 The UI (and the underlying SQLAlchemy API) cannot distinguish between ‘ANY namespace’ and ‘empty/Default’ namespace

 

Feel free to read more about Apache Phoenix capabilities in this Cloudera blog post announcing [Phoenix in CDH][11].

Any feedback or question? Feel free to comment here or on @gethue!

comments powered by Disqus

More recent stories

23 June 2020
Monitoring Hue activity with Grafana Dashboards
Read More
22 June 2020
Automated checks for JavaScript modules compatible licenses and non absolute paths with Continuous Integration
Read More
19 May 2020
How to grant Ranger permissions for a new user on a Secure Cluster
Read More