Articles & News

10 January 2018

Self Service BI: doing a Customer 360 by querying and joining Salesforce, Marketing and log datasets

In this demo we use the Editor to query credit card transaction data that is saved in an object store in the cloud (here S3) and in a Kudu table. The demo leverages the Data Catalog search and tagging as well as the Query Assistant. Note: Do it yourself! The queries and data are freely available. Scenario: Digital Services International. You recently launched a new streaming service. The VP wants to understand the support impact of this launch; Marketing wants to use this to better target campaigns. Goal: Build a 360-degree view of your customers to understand the support costs, product usage, time-to-resolution, and current activity in marketing channels…

1 minute read - Querying / Version 4 / Tutorial

24 August 2017

Importing data from traditional databases into HDFS/Hive in just a few clicks

There are exciting new features coming in Hue 4.1 and later in CDH 6 next year. One of them is Hue’s brand new tool to import data from relational databases into an HDFS file or a Hive table using Apache Sqoop 1. It enables us to bring large amounts of data into the cluster in just a few clicks via an interactive UI. This Sqoop connector was added to the existing import data wizard of Hue.…

5 minutes read - Version 4 / Querying / Tutorial

22 August 2016

Easy indexing of data into Solr with ETL operations

Creating Solr Collections from data files in a few clicks: There are exciting new features coming in Hue 3.11 and later in CDH 5.9 this fall. One of them is Hue’s brand new tool to create Apache Solr Collections from file data. Hue’s Solr dashboards are great for visualizing and learning more about your data, so being able to easily load data into Solr collections can be really useful.…

7 minutes read - Querying / Tutorial

07 July 2015

Bay Area BikeShare Data Analysis with Search and Spark Notebook

In this tutorial, we use public data from Bay Area BikeShare and visualize bike trip patterns and their users to better understand how the platform is used. Hue provides a Dynamic Search dashboard as well as the new Spark Notebook for enriching the data. We recommend starting with the Trip dataset and indexing it into Solr. For impatient people, we provide a subset of trips ready to be indexed as well as the weather data to be processed later with Spark.…

1 minute read - Querying / Tutorial

21 May 2015

Build a Real Time Analytic dashboard with Solr Search and Spark Streaming

Search is a great way to interactively explore your data. The Search App is continuously improving and now comes with better support for real time! In this video, we are collecting tweets with Spark Streaming and directly indexing them into Solr with the Spark Solr app. Note that we are using a slightly modified version that adds more tweet information. You can see the tweets rolling in!…

2 minutes read - Querying / Tutorial

09 October 2014

Bay Area bike share analysis with the Hadoop Notebook and Spark & SQL

This post was initially published on the Hue project blog. Apache Spark is getting popular and Hue contributors are working on making it accessible to even more users. Specifically, by creating a Web interface that allows anyone with a browser to type some Spark code and execute it. A Spark submission REST API was built for this purpose and can also be leveraged by developers. In a previous post, we demonstrated how to use Hue's Search app to seamlessly index and visualize trip data from Bay Area Bike Share and leverage Spark to supplement that analysis by adding weather data to our dashboard.…

6 minutes read - Browsing / Querying / Tutorial

08 November 2013

Season II: 8. How to transfer data from Hadoop with Sqoop 2

Note: Sqoop2 has since been replaced. Apache Sqoop is a great tool for moving data (in files or databases) in or out of Hadoop. In Hue 3, a new app was added for making Sqoop2 easier to use. In this final episode of season 2 of the Hadoop Tutorial series (the previous one was about Search), let’s see how simple it becomes to export our Yelp results into a MySQL table!…

2 minutes read - Querying / Tutorial

04 November 2013

Season II: 7. How to index and search Yelp data with Solr

In the previous episode we saw how to use Pig and Hive with HBase. This time, let’s see how to make our Yelp data searchable by indexing it and building a customizable UI with the Hue Search app. Indexing data into Solr: This tutorial is based on SolrCloud. Here is a step-by-step guide to its installation and a list of required packages: solr-server, solr-mapreduce, search. The next step is about deploying and configuring SolrCloud.…

3 minutes read - Querying / Tutorial

21 October 2013

Season II: 6. Use Pig and Hive with HBase

The HBase app is an elegant way to visualize and search a lot of data. Apache HBase tables can be tricky to update as they require a lower-level API. A good alternative for simplifying data management or access is to use Apache Pig or Hive. In this post we are going to show how to load our Yelp data from the Oozie Bundles episode into HBase with Hive. Then we will use the HBase Browser to visualize it and Pig to compute some statistics.…

3 minutes read - Browsing / Querying / Tutorial

14 October 2013

Season II: 5. Bundle Oozie coordinators with Hue

Hue provides a great Oozie UI in order to use Oozie without typing any XML. In Tutorial 3, we demonstrated how to use an Oozie coordinator for scheduling a daily top 10 of restaurants. Now let's imagine that we also want to compute a top 10 and 100. How can we do this? One solution is to use Oozie bundles. Workflow and Coordinator updates: Bundles are a way to group coordinators together into a set.…

3 minutes read - Scheduling / Tutorial

More recent stories

23 September 2020
Hue 4.8 and its improvements are out!
15 September 2020
SQL Querying Improvements: Phoenix, Flink, SparkSql, ERD Table...
14 September 2020
REST API for sending SQL queries and Browsing files