How to deploy Hue on HDP

Guest post from Andrew that we regularly update (Dec 19th 2014)

 

I recently deployed Hue 3.7 from tarballs on HDP 2.2 (other sources, such as the packages from the ‘Install’ menu above, would work too) and wanted to document some notes for anyone else looking to do the same.

Deployment Background:

  • Node Operating System:  CentOS 6.6 – 64bit
  • Cluster Manager:  Ambari 1.7
  • Distribution:  HDP 2.2
  • Install Path (default):  /usr/local/hue
  • HUE User:  hue

After compiling (some hints there), you may run into startup issues out of the box.

  • Be sure to set the appropriate Hue proxy user/groups properties in your Hadoop service configurations (e.g. WebHDFS/WebHCat/Oozie/etc)
  • Don’t forget to configure your Hue configuration file (‘/usr/local/hue/desktop/conf/hue.ini’) to use FQDN hostnames in the appropriate places

 

Startup

Hue uses an SQLite database by default, and you may see the following error when connecting to Hue at its default port (e.g. fqdn:8888):

  • File "/usr/local/hue/build/env/lib/python2.6/site-packages/Django-1.4.5-py2.6.egg/django/db/backends/sqlite3/base.py", line 344, in execute
      return Database.Cursor.execute(self, query, params)
    DatabaseError: unable to open database file
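
An "unable to open database file" error from SQLite is often a file permission problem, since SQLite needs write access to both the database file and the directory that contains it. The sketch below checks this for whichever user runs it. The function names are hypothetical, and the database path in the usage comment is an assumption based on the install path above, not something the post confirms.

```shell
#!/bin/sh
# Sketch: check whether the current user can use an SQLite database file.
# SQLite needs write access to the file AND its directory (for journal files).

can_use_sqlite_db() {
  db="$1"
  dir=$(dirname "$db")
  [ -f "$db" ] && [ -r "$db" ] && [ -w "$db" ] && [ -w "$dir" ]
}

report_sqlite_db() {
  if can_use_sqlite_db "$1"; then
    echo "OK: $(id -un) can read and write $1"
  else
    echo "PROBLEM: $(id -un) cannot read/write $1 or its directory"
    # Show current ownership and mode to help spot the mismatch.
    ls -ld "$(dirname "$1")" "$1" 2>/dev/null
    return 1
  fi
}

# Usage (run as the hue user; the path is an assumed default location):
#   report_sqlite_db /usr/local/hue/desktop/desktop.db
```

If the check fails, one likely fix is to give the Hue user ownership of the install tree, for example with ‘chown -R hue:hue /usr/local/hue’, and then restart Hue.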

 

Removing apps

The easiest way to remove Impala (or any other app) is to blacklist it in hue.ini. The next best alternative is to remove the Hue permissions for that app from the groups of the relevant users.
[desktop]
app_blacklist=impala
For Sentry, you will need to blacklist ‘security’, but this will also hide the HDFS ACLs editor for now.
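
To double-check which apps are actually blacklisted, you can read the setting back from hue.ini. This is only a sketch: the function name is hypothetical, and it assumes the setting sits on a single line with app names separated by commas.

```shell
#!/bin/sh
# Sketch: print the apps listed in app_blacklist, one per line.
# Commented-out settings (lines starting with '#') are ignored.

list_blacklisted_apps() {
  sed -n 's/^[[:space:]]*app_blacklist[[:space:]]*=[[:space:]]*//p' "$1" |
    tr ',' '\n' |
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Usage:
#   list_blacklisted_apps /usr/local/hue/desktop/conf/hue.ini
```

With ‘app_blacklist=impala,security’ in the file, this prints ‘impala’ and ‘security’ on separate lines.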

HDFS

Check your HDFS configuration settings and ensure that the service is active and healthy.
Did you remember to configure proxy user hosts and groups for your HDFS service configuration?

With Ambari, you can review your cluster’s HDFS configuration, specifically under the “Custom core-site.xml” subsection:

There should be two (2) new/custom properties added to support the HUE File Browser:

<property>
  <name>hadoop.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hue.groups</name>
  <value>*</value>
</property>

Also check that WebHDFS is enabled. With Ambari, you can go to the HDFS service settings and find this under “General”:

– The property name is dfs.webhdfs.enabled (“WebHDFS enabled”), and it should be set to “true” by default.

– If a change is required, save the change and start/restart the service with the updated configuration.

Ensure the HDFS service is started and operating normally.

– You can quickly check that HDFS and WebHDFS are up by opening the NameNode web page, which also serves WebHDFS:

– http://<NAMENODE-FQDN>:50070/ in a web browser, or ‘curl <NAMENODE-FQDN>:50070’

Check if the processes are running using a shell command on your NameNode:

– ‘ps -ef | grep "NameNode"’

By default your HDFS service(s) may not be configured to start automatically (e.g. upon boot/reboot).
Check the HDFS logs to see if the namenode service had trouble starting or started successfully:
– These are typically found at ‘/var/log/hadoop/hdfs/’
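
For a check that goes one step further than loading the NameNode page, you can call the WebHDFS REST API directly, optionally impersonating another user to exercise the proxy user settings above. This is a sketch under a few assumptions: the function names are hypothetical, the cluster is not Kerberized (so the ‘user.name’ query parameter is accepted), and the NameNode listens on port 50070 as in the checks above.

```shell
#!/bin/sh
# Sketch: build and call a WebHDFS LISTSTATUS request.
# Arguments: NameNode FQDN, HDFS path (default /), user (default hue),
# and an optional user to impersonate via the 'doas' parameter.

webhdfs_url() {
  url="http://$1:50070/webhdfs/v1${2:-/}?op=LISTSTATUS&user.name=${3:-hue}"
  if [ -n "${4:-}" ]; then
    url="$url&doas=$4"
  fi
  printf '%s\n' "$url"
}

check_webhdfs() {
  # -f makes curl exit non-zero on HTTP errors; -sS hides progress but keeps errors.
  curl -fsS "$(webhdfs_url "$@")"
}

# Usage (hostname and user 'amo' are placeholders):
#   check_webhdfs namenode.example.com /
#   check_webhdfs namenode.example.com /user/amo hue amo
```

A working setup should return a JSON document containing a "FileStatuses" entry. If the second form fails while the first succeeds, the proxy user properties for hue are a likely place to look.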

 

Hive Editor

By default, Hue appears to connect to the HiveServer2 service using NOSASL authentication; Hive 0.14 ships with HDP 2.2 but is not configured by default to use authentication.
  • We’ll need to change the properties of our Hive configuration to work with the HUE Hive Editor (‘hive.server2.authentication’=’NOSASL’).
HDP 2.1 (Hive 0.13) continues to carry forward the GetLog() issue with Hue’s Hive Editor, e.g.:
"Server does not support GetLog()"
HDP 2.2 includes Hive 0.14 and HIVE-4629, so you will need this commit from Hue 3.8 (coming at the end of Q1 2015), or use master, and then enable it in hue.ini:
[beeswax]
# Choose whether Hue uses the GetLog() thrift call to retrieve Hive logs.
# If false, Hue will use the FetchResults() thrift call instead.
use_get_log_api=false

Security – HDFS ACLs Editor

By default, Hadoop 2.4.0 does not enable HDFS file access control lists (FACLs).

  • We’ll need to change the properties of our HDFS namenode service to enable FACLs (‘dfs.namenode.acls.enabled’=’true’)
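
After changing the property and restarting HDFS, you may want to confirm the setting from the command line. The sketch below uses the standard ‘hdfs getconf’ and ‘hdfs dfs -getfacl’ commands; the function names are hypothetical, and note that ‘getconf’ reads the local client configuration, which is assumed here to match what the NameNode is running with.

```shell
#!/bin/sh
# Sketch: confirm that HDFS ACLs are enabled in the local Hadoop configuration.

acls_enabled() {
  [ "$(hdfs getconf -confKey dfs.namenode.acls.enabled 2>/dev/null)" = "true" ]
}

show_acls() {
  # Prints the ACL entries for an HDFS path; fails if ACLs are disabled.
  hdfs dfs -getfacl "$1"
}

# Usage:
#   acls_enabled && echo "ACLs enabled" || echo "ACLs disabled or hdfs not found"
#   show_acls /user/hue
```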

Spark

We are improving the Spark Editor and might change the Job Server; the setup is still fairly manual and not recommended for now.

HBase

Currently not tested (should work with Thrift Server 1)

Job Browser

Progress has never been entirely accurate for MapReduce completion: it always shows the percentage for mappers vs. reducers as a job progresses. The “Kill” feature works correctly.

Oozie Editor/Dashboard

Note: when Oozie is deployed via Ambari 1.7, for HDP 2.2, the sharelib files typically found at /usr/lib/oozie/ are missing, and in turn are not staged at hdfs:/user/oozie/share/lib/ …

I’ll check this against an HDP 2.1 deployment and write the guys at Hortonworks an email to see if this is something they’ve seen as well.

Pig Editor

Make sure you have at least two nodes, or tweak YARN so that it can launch two apps at the same time (gotcha #5), and that Oozie is configured correctly.
The Pig/Oozie log looks like this:
2014-12-15 23:32:17,626  INFO ActionStartXCommand:543 - SERVER[hdptest.construct.dev] USER[amo] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000001-141215230246520-oozie-oozi-W] ACTION[0000001-141215230246520-oozie-oozi-W@:start:] Start action [0000001-141215230246520-oozie-oozi-W@:start:] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]

2014-12-15 23:32:17,627  INFO ActionStartXCommand:543 - SERVER[hdptest.construct.dev] USER[amo] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000001-141215230246520-oozie-oozi-W] ACTION[0000001-141215230246520-oozie-oozi-W@:start:] [***0000001-141215230246520-oozie-oozi-W@:start:***]Action status=DONE

2014-12-15 23:32:17,627  INFO ActionStartXCommand:543 - SERVER[hdptest.construct.dev] USER[amo] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000001-141215230246520-oozie-oozi-W] ACTION[0000001-141215230246520-oozie-oozi-W@:start:] [***0000001-141215230246520-oozie-oozi-W@:start:***]Action updated in DB!

2014-12-15 23:32:17,873  INFO ActionStartXCommand:543 - SERVER[hdptest.construct.dev] USER[amo] GROUP[-] TOKEN[] APP[pig-app-hue-script] JOB[0000001-141215230246520-oozie-oozi-W] ACTION[0000001-141215230246520-oozie-oozi-W@pig] Start action [0000001-141215230246520-oozie-oozi-W@pig] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]

21 Comments

  1. max 3 years ago

    great article, i was able to follow and get this going but i am stuck at a point that when i get to the hostname:8000 port i get website not avaiable…i did set all the appropriate *.xml file settings….running HDP 2.2 with ambari 1.0.7 on centos…any help would be appreciated…

    [[email protected] conf.dist]# ../../build/env/bin/hue runserver
    /root/hue/build/env/lib/python2.6/site-packages/Django-1.4.5-py2.6.egg/django/conf/__init__.py:110: DeprecationWarning: The SECRET_KEY setting must not be empty.
    warnings.warn("The SECRET_KEY setting must not be empty.", DeprecationWarning)
    /root/hue/build/env/lib/python2.6/site-packages/Django-1.4.5-py2.6.egg/django/conf/__init__.py:110: DeprecationWarning: The SECRET_KEY setting must not be empty.
    warnings.warn("The SECRET_KEY setting must not be empty.", DeprecationWarning)
    Validating models...

    0 errors found
    Django version 1.4.5, using settings 'desktop.settings'
    Development server is running at http://127.0.0.1:8000/
    Quit the server with CONTROL-C.

    ^C[[email protected] conf.dist]#

    • Hue Team 3 years ago

      runserver is the development server, to make it accessible from anywhere run it like this:
      ../../build/env/bin/hue runserver 0.0.0.0:8000

      But this is not recommended at all for production, you should start it with
      ../../build/env/bin/hue runcpserver
      instead

      or even
      build/env/bin/hue supervisor
      https://github.com/cloudera/hue#getting-started

      note: the production server runs on port 8888 and not 8000

  2. Bolke 2 years ago

    I have built Hue 3.7.1 plus the GetLog patches. When running, Hue keeps complaining about Oozie not running, while it definitely is. Is this due to the libraries, as indeed /usr/lib/oozie doesn’t exist? What is the workaround?

  3. Surendra Mummini 2 years ago

    We have deployed HDP 2.1 on CentOS, with Hue 2.5.1-695. Is it possible to move to Hue 3.7.1?

    • Hue Team 2 years ago

      Yes, we have some users who are running 3.7 on HDP (like the author of this blog post)

  4. Jason Waterfall 2 years ago

    I keep getting a “NotImplementedError at /accounts/login/” whenever I access Hue. I tried restarting but the error still remains.

    Exception Location: /usr/lib/hue/build/env/lib/python2.6/site-packages/Django-1.2.3-py2.6.egg/django/contrib/auth/models.py in save, line 427

    Hue Version: 2.6.1-2041
    HDP Version: 2.2.0

    • Hue Team 2 years ago

      Please use a more recent Hue version, 3+

  5. tim fei 2 years ago

    This post is very helpful. Thank you!

    I have successfully installed hue 3.7.1 against my HDP2.2. It’s way better than the 2.5 version bundled with HDP itself.

    I have two questions though
    1. How do I get the GetLog patch applied ? Is there a guide on how to get the patch from git and apply it to my hue installation ?

    2. How do I change the default “ctrl”+”space” hotkey for hue auto-completion ? On my system, this key combination has been taken for the input method switch. I don’t find a relevant setting in hue.ini for this.

    Thanks again.

    • tim fei 2 years ago

      I just found a way to solve my #2 question.

      I found the key map was hard-coded in

      beeswax/templates/execute.mako, line : 1585 as below

      extraKeys: {
        "Ctrl-Space": function () {
          CodeMirror.fromDot = false;
          codeMirror.execCommand("autocomplete");
          console.log("in Alt-/ event");
        },
        Tab: function (cm) {
          $("#executeQuery").focus();
        }
      },

      So I just change the key combination to “Alt-/” which works for me perfectly.

      • Hue Team 2 years ago

        Awesome! Glad it works for you! Regarding the GetLog issue, you could try to download the patch from GitHub by adding .patch to the end of the commit URL (so https://github.com/cloudera/hue/commit/6a0246710f7deeb0fd2e1f2b3b209ad119c30b72.patch) and apply it to your Hue with ‘patch -p1’ or ‘git apply’

        • tim fei 2 years ago

          Thank you for the prompt reply.

          I tried to download the patch file and apply it. However, it failed on one file which cannot be found in my installation folder.
          The version I installed is the Hue 3.7.1 tarball, and there is no beeswax/thrift folder in my installation folder. Can you help further? Thanks a lot!
          Following is my command line output

          cd /usr/local/hue (where is my hue installation )
          [[email protected] hue]# patch -p1<../6a0246710f7deeb0fd2e1f2b3b209ad119c30b72.patch
          patching file apps/beeswax/gen-py/TCLIService/ttypes.py
          patching file apps/beeswax/src/beeswax/api.py
          patching file apps/beeswax/src/beeswax/conf.py
          patching file apps/beeswax/src/beeswax/server/dbms.py
          patching file apps/beeswax/src/beeswax/server/hive_server2_lib.py
          patching file apps/beeswax/src/beeswax/templates/execute.mako
          Hunk #1 succeeded at 2628 (offset -9 lines).
          patching file apps/beeswax/static/js/beeswax.vm.js
          can't find file to patch at input line 238
          Perhaps you used the wrong -p or --strip option?
          The text leading up to this was:
          --------------------------
          |diff --git a/apps/beeswax/thrift/TCLIService.thrift b/apps/beeswax/thrift/TCLIService.thrift
          |index 53ea3cf..63dad7e 100644
          |--- a/apps/beeswax/thrift/TCLIService.thrift
          |+++ b/apps/beeswax/thrift/TCLIService.thrift

          • Hue Team 2 years ago

            Since you are not compiling Hue from scratch, you can remove the part related to the TCLIService.thrift from the patch file and try again! 🙂

          • tim fei 2 years ago

            It worked !
            Basically, all the steps I used are :

            1. Download the patch from github
            wget https://github.com/cloudera/hue/commit/6a0246710f7deeb0fd2e1f2b3b209ad119c30b72.patch
            2. Apply that patch using:
            cd /usr/local/hue
            patch -p1<../6a0246710f7deeb0fd2e1f2b3b209ad119c30b72.patch

            3. Ignore the error about not finding "apps/beeswax/thrift/TCLIService.thrift"

            4. Manually add the section to desktop/conf/hue.ini to tell hue not to use the GetLog ()
            # Choose whether Hue uses the GetLog() thrift call to retrieve Hive logs.
            # If false, Hue will use the FetchResults() thrift call instead.
            use_get_log_api=false

            5. Restart the Hue process by killing it. Supervisor will restart it automatically
            ps ax|grep hue (use this command to find the process /build/env/bin/hue runcherrypyserver)

            Thank you so much.

          • Hue Team 2 years ago

            And thank YOU for writing down this summary!

  6. jhlee1979 2 years ago

    need to add oozie-site config for oozie

    oozie.service.ProxyUserService.proxyuser.hue.groups : *
    oozie.service.ProxyUserService.proxyuser.hue.hosts : *

    • Aleks 2 years ago

      Thank you! It solves my problem! Uhuuu!:)

  7. Manish 2 years ago

    Hi,

    I have installed cloudera quickstart 5.4 and it comes with HUE 3.7 . In this version of HUE , I am unable to find a way to upload any file (apart from .csv or tab separated) to create index directly without writing any morphline code. Please help!

  8. kishore 1 year ago

    how to uninstall hue packages ?
