Articles & News

03 February 2014

Making Hadoop Accessible to your Employees with LDAP

Last updated on July 9th 2015. Hue easily integrates with your corporation’s existing identity management systems and provides authentication mechanisms for SSO providers. By changing a few configuration parameters, your employees can start doing big data analysis in their browser by leveraging an existing security policy. This blog post details the various features and capabilities available in Hue for LDAP: Authentication, Search bind, Direct bind, Importing users…

9 minutes read

03 February 2014

How to manage the Hue database with the shell

Last updated on March 9 2016. First, back up the database. By default this is the SQLite file: cp /var/lib/hue/desktop.db ~/ Then, if using CM, export this variable in order to point to the correct database: HUE_CONF_DIR=/var/run/cloudera-scm-agent/process/-hue-HUE_SERVER-id echo $HUE_CONF_DIR export HUE_CONF_DIR Here, id is the most recent ID in that process directory for hue-HUE_SERVER. A quick way to get the correct directory is to use this script: export HUE_CONF_DIR="/var/run/cloudera-scm-agent/process/`ls -alrt /var/run/cloudera-scm-agent/process | grep HUE | tail -1 | awk '{print $9}'`"…

3 minutes read - Development

03 February 2014

Solving the Hue 2.X hanging problem

In Hue versions before 3, Hue sometimes becomes slow and “stuck”. To fix this problem, it is recommended to switch Hue to the CherryPy server instead of Spawning. In hue.ini or the Hue Safety Valve in CM, enter: [desktop] use_cherrypy_server = true Cause: Most of the time, timeout/Thrift errors can be seen in the Hue logs (/logs page). These errors are due to Beeswax crashing or being very slow and blocking all requests, as the Spawning server is not perfectly greenified in Hue 2 (the single thread is blocked in the RPC IO call).…

1 minute read

13 January 2014

Using Hadoop MR2 and YARN with an alternative Job Browser interface

Hue has defaulted to YARN since version 3. First, it is a bit simpler to configure Hue with MR2 than with MR1, as Hue does not need to use the Job Tracker plugin since YARN provides a REST API. YARN is also going to provide an equivalent of Job Tracker HA with YARN-149. Here is how to configure the clusters in hue.ini. Essentially, if you are using a pseudo-distributed cluster, it will work by default.…

1 minute read

02 January 2014

A better PyGreSql support for Django

With the release of django-pygresql, the Hue team has taken a first stab at PyGreSQL support in Django! The ‘Why’: The open source world has many different kinds of licenses, and it can be confusing to know which one makes sense for you. PyGreSQL is a PostgreSQL client with a permissive enough license that it can be packaged and shipped. The ‘How’: PyGreSQL has some minor differences from the provided postgresql backend.…

1 minute read - Development

02 January 2014

A new Spark Web UI: Spark App

Note: This post is deprecated as of Hue 3.8 / April 24th 2015. Hue now has a new Spark Notebook application. Hi Spark Makers! A Hue Spark application was recently created. It lets users execute and monitor Spark jobs interactively, directly from their browser, on any machine. The new application uses the Spark Job Server contributed by Ooyala at the last Spark Summit. We hope to work with the community and have support for Python, Java, direct script submission without compiling/uploading and other improvements in the future!…

2 minutes read

30 December 2013

JobTracker High Availability (HA) in MR1

When the Job Tracker goes down, Hue cannot display the jobs in Job Browser or submit to the correct cluster. In MR1, Hadoop can support two Job Trackers: a master Job Tracker that can fail over to a standby Job Tracker, which provides Job Tracker HA. Let’s see how Hue 3.5 and CDH5beta1 (and probably CDH4.6) can take advantage of this. Note: in MR1, Hue uses a plugin to communicate with the Job Tracker.…

1 minute read

16 December 2013

Use the Impala App with Sentry for real security

Apache Sentry is the new way to provide security (e.g. privileges on SQL statements SELECT, CREATE…) when querying data in Hadoop. Impala offers fast SQL for Apache Hadoop and can leverage Sentry. Here is how to configure it: First, enable impersonation in hue.ini so that permissions are checked against the current user and not ‘hue’, which acts as a proxy: [impala] impersonation_enabled=True Then you might hit this error:…

2 minutes read

13 December 2013

Hue goes to Los Angeles: HBase Meetup

HBase + Hue - LA HBase User Group from gethue LA HBase Meetup

1 minute read

12 December 2013

Recent Security Enhancements

Hue has seen a slew of security improvements recently (from Hue 3.5). The most important ones have been enabling encryption when communicating with other services: secure database connection (HUE-1638) and HiveServer2 over SSL (HUE-1749). In addition, several other security options have been added: session timeout is now configurable (HUE-1528); cookies can be secure (HUE-1529); HTTP only in session cookie if supported (HUE-1639); allowed HTTP methods can be defined in the hue.…

2 minutes read

More recent stories

26 June 2024
Integrating Trino Editor in Hue: Supporting Data Mesh and SQL Federation
03 May 2023
Discover the power of Apache Ozone using the Hue File Browser