10 years of Data Querying Experience Evolution with Hue

Published on 28 January 2020 in Version 4 - 3 minutes read - Last modified on 06 March 2021

Hue has just blown out its 10th candle. Hue was created when Apache Hadoop was still in its infancy, before it became mainstream (read more about the Hadoop story in Hadoop is Dead. Long live Hadoop).

Hue was originally part of Cloudera Manager, which was proprietary and focused more on administrators, but was moved out into its own open source project in version 0.3. Hue then gradually evolved from a desktop-like application into a modern single page SQL Editor (and is at version 4.6 as of today).

Through continuous iterations, Hue kept improving on its main goal: making the data platform easier to use. The primary user base consists of anybody looking to query data, e.g.:

  • Data Analysts answering some ad-hoc questions
  • Program Managers looking at usage stats
  • IT/SQL Developers building some Data apps
  • Data Architects poking at the whole usability of the system
  • Data Engineers nurturing the creation of Data Warehouse tables

The second category consists of more technical users wanting to see job logs, upload data to distributed file systems like HDFS or AWS S3, build workflows, create search dashboards, optimize queries…

Hue 1 screenshot

Hue 1 (2009) - A desktop-feel application with an Apache Hive Editor, Hadoop File and Job browsers.

Hue 2 screenshot

Hue 2 (2012) - Flat design, an advanced SQL Editor and more than 15 new apps/connectors to the data platform with proper security (e.g. for browsing tables, building workflows and search dashboards).

Hue 3 screenshot

Hue 3 (2013) - Aggregating and inter-linking the apps together into a single experience and providing a single page Editor and a much more powerful SQL intellisense.

Hue 4 screenshot

Hue 4 (2017) - Major revamp of the interface turning Hue into a modern and simpler single page app. Next steps of SQL intellisense with smart recommendations, risk alerts and data catalog integration.

More users and More SQL

With the merging of the Cloudera (CDH) and Hortonworks (HDP) distributions into CDP (Cloudera Data Platform, available in Data Center or Cloud), Hue is becoming ubiquitous and available to even more users, with:

  • 1000+ combined customers (including an important part of the Fortune 500)
  • Hundreds of thousands of SQL queries executed manually via Hue every day

Upstream Hue is also shipped in several other distributions, such as AWS EMR and IBM Open Data Hub, and has an active community.

Hue 4.6 screenshot

Hue 4.6 (2019) - Componentization continues, with stronger Data Warehouse integration for SQL querying and browsing files in the Cloud.

In 2020, the upcoming Hue 5 is specializing even further in Data Warehousing and focuses on providing the best SQL Cloud Editor:

  • First, with ever deeper support of the Apache Hive and Apache Impala SQL engines. The SQL interfaces are also being revamped into stable components that make it easy to welcome other engines of the Apache Calcite family and more, like Apache Phoenix, Apache Druid and Apache Flink SQL. More collaboration is coming with richer query sharing, as well as an even smarter intellisense and assistant to optimize queries.
  • Secondly, by being “Cloud Ready” and fitting well in the world of containers scaling up and down and automated infrastructure. The first version of Hue on Kubernetes has already been shipped, and more scale and simpler operation management are coming.

We will dive deeper into the querying capabilities of the SQL Cloud Editor in part two of this series on 10 years of Hue evolution. Until then, feel free to comment here or on the Forum and quick start SQL querying!

Romain, from the Hue Team

