How to deploy Hue on HDP

Guest post from Andrew that we regularly update (Dec 19th 2014)

 

I recently deployed Hue 3.7 from tarballs on HDP 2.2 (other sources, such as the packages from the ‘Install’ menu above, would work too) and wanted to document some notes for anyone else looking to do the same.

Deployment Background:

  • Node Operating System:  CentOS 6.6 – 64bit
  • Cluster Manager:  Ambari 1.7
  • Distribution:  HDP 2.2
  • Install Path (default):  /usr/local/hue
  • HUE User:  hue

After compiling (some hints there), you may run into startup issues out of the box.

  • Be sure to set the appropriate Hue proxy user/groups properties in your Hadoop service configurations (e.g. WebHDFS/WebHCat/Oozie/etc)
  • Don’t forget to configure your Hue configuration file (‘/usr/local/hue/desktop/conf/hue.ini’) to use FQDN hostnames in the appropriate places

 

 

Startup

Hue uses an SQLite database by default, and you may see the following error when attempting to connect to Hue at its default port (e.g. fqdn:8888):

  • File "/usr/local/hue/build/env/lib/python2.6/site-packages/Django-1.4.5-py2.6.egg/django/db/backends/sqlite3/base.py", line 344, in execute
    return Database.Cursor.execute(self, query, params)
    DatabaseError: unable to open database file
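
A common cause of SQLite's "unable to open database file" is that the process cannot read or write the database file or its parent directory (SQLite creates journal files next to the database). The following is a minimal sketch of such a check, not a confirmed fix for this deployment. It assumes the SQLite file sits at /usr/local/hue/desktop/desktop.db (an assumed location under the install path above; adjust to your setup) and that it is run as the hue user. The function name is hypothetical.

```python
import os


def sqlite_access_problems(db_path):
    """Return a list of access problems for a SQLite database file."""
    problems = []
    parent = os.path.dirname(os.path.abspath(db_path))
    if not os.path.isdir(parent):
        problems.append("missing directory: %s" % parent)
        return problems
    # SQLite writes journal files beside the database, so the directory
    # itself must be writable and searchable by the current user.
    if not os.access(parent, os.W_OK | os.X_OK):
        problems.append("directory not writable: %s" % parent)
    if os.path.exists(db_path) and not os.access(db_path, os.R_OK | os.W_OK):
        problems.append("file not readable/writable: %s" % db_path)
    return problems


if __name__ == "__main__":
    # Assumed database location for this install; run as the 'hue' user.
    for problem in sqlite_access_problems("/usr/local/hue/desktop/desktop.db"):
        print(problem)
```

An empty result only means the current user can reach the file; it does not rule out other causes.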

 

Removing apps

The easiest way to remove Impala (or any other app) is to blacklist it in hue.ini. The next best alternative is to remove the app's Hue permissions from the groups of the relevant users.
[desktop]
app_blacklist=impala
For Sentry, the app name to blacklist is ‘security’, but for now this also hides the HDFS ACLs editor.
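
Since app_blacklist takes a comma-separated list, several apps can be hidden at once. As a small sketch (the helper name is hypothetical and it only builds the value; it does not edit hue.ini), this is one way to add an app to an existing value without duplicating it:

```python
def add_to_blacklist(current, app):
    """Return a comma-separated app_blacklist value that includes `app` once."""
    apps = [a.strip() for a in current.split(",") if a.strip()]
    if app not in apps:
        apps.append(app)
    return ",".join(apps)


if __name__ == "__main__":
    # Print the fragment to paste under [desktop] in hue.ini.
    print("[desktop]")
    print("app_blacklist=" + add_to_blacklist("impala", "security"))
```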

HDFS

Check your HDFS configuration settings and ensure that the service is active and healthy.
Did you remember to configure proxy user hosts and groups for your HDFS service configuration?

With Ambari, you can review your cluster’s HDFS configuration, specifically under the “Custom core-site.xml” subsection:

There should be two (2) new/custom properties added to support the HUE File Browser:

<property>
  <name>hadoop.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hue.groups</name>
  <value>*</value>
</property>
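
To confirm that the deployed configuration actually carries both properties, a short script can parse core-site.xml and report which ones are missing. This is a sketch only: the path /etc/hadoop/conf/core-site.xml is an assumption about where the client configuration lives on your nodes, and the function name is hypothetical.

```python
import os
import xml.etree.ElementTree as ET

REQUIRED = ("hadoop.proxyuser.hue.hosts", "hadoop.proxyuser.hue.groups")


def missing_proxyuser_properties(xml_text, required=REQUIRED):
    """Return the required property names that are absent or empty."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            found[name.strip()] = (prop.findtext("value") or "").strip()
    return [name for name in required if not found.get(name)]


if __name__ == "__main__":
    path = "/etc/hadoop/conf/core-site.xml"  # assumed location
    if os.path.exists(path):
        with open(path) as handle:
            print(missing_proxyuser_properties(handle.read()))
```

An empty list means both proxy user properties are present with a non-empty value.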

WebHDFS must also be enabled. With Ambari, you can go to the HDFS service settings and find this under “General”:

– The property name is dfs.webhdfs.enabled (“WebHDFS enabled”), and should be set to “true” by default.

– If a change is required, save the change and start/restart the service with the updated configuration.

Ensure the HDFS service is started and operating normally.

– You can quickly check that HDFS and WebHDFS are up by loading the NameNode web page:

– http://<NAMENODE-FQDN>:50070/ in a web browser, or ‘curl <NAMENODE-FQDN>:50070’
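
Beyond loading the NameNode page, you can exercise the WebHDFS REST API directly, which is closer to what the File Browser does. The sketch below (Python 3) builds a LISTSTATUS request; the user name hue and the helper names are assumptions for illustration, and it presumes an unsecured (non-Kerberos) cluster.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def webhdfs_url(namenode, path="/", op="LISTSTATUS", user="hue", port=50070):
    """Build a WebHDFS REST URL for the given operation and path."""
    query = urlencode({"op": op, "user.name": user})
    return "http://%s:%d/webhdfs/v1%s?%s" % (namenode, port, path, query)


def list_status(namenode, path="/", user="hue", timeout=10):
    """Return the entry names under `path`, as reported by WebHDFS."""
    with urlopen(webhdfs_url(namenode, path, user=user), timeout=timeout) as resp:
        payload = json.loads(resp.read().decode("utf-8"))
    return [entry["pathSuffix"] for entry in payload["FileStatuses"]["FileStatus"]]
```

Calling list_status with your NameNode's FQDN should return the top-level HDFS directories; an HTTP error or a timeout points at WebHDFS or the NameNode rather than at Hue.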

Check if the processes are running using a shell command on your NameNode:

– ‘ps -ef | grep "NameNode"’

By default your HDFS service(s) may not be configured to start automatically (e.g. upon boot/reboot).
Check the HDFS logs to see if the namenode service had trouble starting or started successfully:
– These are typically found at ‘/var/log/hadoop/hdfs/’

 

Hive Editor

By default, Hue appears to connect to the HiveServer2 service using NOSASL authentication; Hive 0.14 ships with HDP 2.2 but is not configured by default to use that authentication mode.
  • We’ll need to change the properties of our Hive configuration to work with the Hue Hive Editor (‘hive.server2.authentication’=’NOSASL’).
HDP 2.1 (Hive 0.13) continues to carry forward the GetLog() issue with Hue’s Hive Editor, e.g.:
"Server does not support GetLog()"
In HDP 2.2, which includes Hive 0.14 and HIVE-4629, you will need this commit from Hue 3.8 (coming up at the end of Q1 2015), or use master, and enable it in hue.ini:
[beeswax]
# Choose whether Hue uses the GetLog() thrift call to retrieve Hive logs.
# If false, Hue will use the FetchResults() thrift call instead.
use_get_log_api=false

Security – HDFS ACLs Editor

By default, Hadoop 2.4.0 does not enable HDFS file access control lists (FACLs).

  • We’ll need to change the properties of our HDFS namenode service to enable FACLs (‘dfs.namenode.acls.enabled’=’true’)

Spark

We are improving the Spark Editor and might change the Job Server; the setup is still pretty manual and not recommended for now.

HBase

Currently not tested (should work with Thrift Server 1)

Job Browser

Progress has never been entirely accurate for Map/Reduce completions: it always shows the percentage for Mappers vs. Reducers as a job progresses. The “Kill” feature works correctly.

Oozie Editor/Dashboard

Note: when Oozie is deployed via Ambari 1.7, for HDP 2.2, the sharelib files typically found at /usr/lib/oozie/ are missing, and in turn are not staged at hdfs:/user/oozie/share/lib/ …

I’ll check this against an HDP 2.1 deployment and write the guys at Hortonworks an email to see if this is something they’ve seen as well.

Pig Editor

Make sure you have at least 2 nodes, or tweak YARN so that it can launch two apps at the same time (gotcha #5), and that Oozie is configured correctly.
The Pig/Oozie log looks like this:
2014-12-15 23:32:17,626  INFO ActionStartXCommand:543 - SERVER[hdptest.construct.dev] USER[amo] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000001-141215230246520-oozie-oozi-W] ACTION[0000001-[email protected]:start:] Start action [0000001-141215230246520-[email protected]:start:] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]

2014-12-15 23:32:17,627  INFO ActionStartXCommand:543 - SERVER[hdptest.construct.dev] USER[amo] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000001-141215230246520-oozie-oozi-W] ACTION[0000001-[email protected]:start:] [***0000001-141215230246520-[email protected]:start:***]Action status=DONE

2014-12-15 23:32:17,627  INFO ActionStartXCommand:543 - SERVER[hdptest.construct.dev] USER[amo] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000001-141215230246520-oozie-oozi-W] ACTION[0000001-[email protected]:start:] [***0000001-141215230246520-[email protected]:start:***]Action updated in DB!

2014-12-15 23:32:17,873  INFO ActionStartXCommand:543 - SERVER[hdptest.construct.dev] USER[amo] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000001-141215230246520-oozie-oozi-W] ACTION[0000001-[email protected]pig] Start action [0000001-141215230246520-[email protected]] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
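
When scanning a long Oozie log for a given job or user, it can help to pull the NAME[value] fields out of each line. The sketch below is an illustration based only on the line layout shown above, not on any Oozie API; the function name is hypothetical.

```python
import re

# Matches fields such as SERVER[host], USER[amo], JOB[...]: an upper-case
# name immediately followed by a bracketed value.
FIELD = re.compile(r"([A-Z]+)\[([^\]]*)\]")


def parse_oozie_line(line):
    """Return a dict of the NAME[value] fields plus the log LEVEL."""
    fields = dict(FIELD.findall(line))
    parts = line.split()
    # Layout: <date> <time> <LEVEL> <source>:<line> - <fields...>
    if len(parts) >= 3:
        fields["LEVEL"] = parts[2]
    return fields
```

Filtering lines on the JOB or APP value is then a plain dictionary lookup.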