How to install Hue 3 on IBM BigInsights 4.0 to explore Big Data

This post was originally published on the IBM blog as HUE on IBM BigInsights 4.0 to explore Big Data by @vinayak_agr.

For Hue 3.9 and BigInsights 4.1, have a look at https://developer.ibm.com/hadoop/blog/2015/10/27/how-to-install-hue-3-9-on-top-of-biginsights-4-1/

Task

This article walks you through the steps required to deploy and set up HUE on IBM BigInsights version 4.0 and above.

Introduction

HUE (Hadoop User Experience) is a web interface for analyzing data with Apache Hadoop. With Big Data, you need a tool to navigate through your data, query it and even search it. HUE ties all of this together in one place.

Pre-requisites

To deploy HUE on BigInsights, you need an up-and-running BigInsights version 4.x cluster. For the purposes of this article, we can use the BigInsights V4 Quick Start Edition, which is available for free on the IBM website. You can download the Quick Start Edition here. It is assumed that your OS is Red Hat 6.x; if not, adjust the package installation commands for your Linux distribution.

Install Dependencies

HUE requires a number of dependencies to run, so let's start by installing the required packages. Launch a terminal and run:

[root@host /]# yum install ant
[root@host /]# yum install python-devel.x86_64
[root@host /]# yum install krb5-devel.x86_64
[root@host /]# yum install krb5-libs.x86_64
[root@host /]# yum install libxml2.x86_64
[root@host /]# yum install python-lxml.x86_64
[root@host /]# yum install libxslt-devel.x86_64
[root@host /]# yum install mysql-devel.x86_64
[root@host /]# yum install openssl-devel.x86_64
[root@host /]# yum install libgsasl-devel.x86_64
[root@host /]# yum install sqlite-devel.x86_64
[root@host /]# yum install openldap-devel.x86_64
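
If you prefer, the same packages can be installed in a single command:

[root@host /]# yum install -y ant python-devel.x86_64 krb5-devel.x86_64 krb5-libs.x86_64 libxml2.x86_64 python-lxml.x86_64 libxslt-devel.x86_64 mysql-devel.x86_64 openssl-devel.x86_64 libgsasl-devel.x86_64 sqlite-devel.x86_64 openldap-devel.x86_64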

Download HUE

We will download the latest version of HUE at the time of writing, 3.7.1, and extract it. In the terminal, run the following commands:

[root@host /]# wget https://dl.dropboxusercontent.com/u/730827/hue/releases/3.7.1/hue-3.7.1.tgz
[root@host /]# echo "JAVA_HOME=\"/usr/lib/jvm/java-7-openjdk-1.7.0.75.x86_64/jre\"" >> /etc/environment
[root@host /]# tar zxvf hue-3.7.1.tgz

Add User And Group for HUE

[root@host /]# groupadd hue
[root@host /]# useradd hue -g hue
[root@host /]# passwd hue

Now give ownership of the extracted hue folder to user hue by executing the following command.

[root@host /]# chown -R hue:hue hue-3.7.1

You will also need to add user hue to the sudoers file.
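
A minimal sketch of such an entry, added as root via visudo (adjust the policy to your environment):

hue     ALL=(ALL)       ALL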

Install HUE

1. As user hue, change into the extracted hue-3.7.1 directory and start the installation as shown below.
[hue@host hue-3.7.1]$ sudo make install
[screenshot: make install output]
2. By default, HUE installs to /usr/local/hue on your Management node's local filesystem. Make user hue the owner of the /usr/local/hue folder by executing:
[hue@host hue-3.7.1]$ sudo chown -R hue:hue /usr/local/hue

Setting up Hadoop properties for HUE

1. Configure properties in core-site.xml

i. Enable WebHDFS
Go to Ambari, select HDFS on the left side and then select Configs as shown.
[screenshot: HDFS Configs in Ambari]
Then scroll down and make sure WebHDFS is check-marked as shown below:
[screenshot: WebHDFS checkbox]
ii. Add the following two properties under custom core-site.xml with the value "*" as shown below:
[screenshot: core-site.xml proxy-user properties]
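
For reference, the two proxy-user properties HUE typically needs in core-site.xml, each with the value *, are:

hadoop.proxyuser.hue.hosts=*
hadoop.proxyuser.hue.groups=*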

2. Configure properties in oozie-site.xml

Just like above, now select Oozie on the left side in Ambari and then select Configs.
i. Add two properties to the Oozie configuration for HUE as shown below.
[screenshot: oozie-site.xml proxy-user properties]
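
Likewise, the two Oozie proxy-user properties HUE typically needs, each with the value *, are:

oozie.service.ProxyUserService.proxyuser.hue.hosts=*
oozie.service.ProxyUserService.proxyuser.hue.groups=*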

3. Configure properties in webhcat-site.xml

Now navigate to Hive on the left side in Ambari and then select Configs.
i. Keep scrolling down until you see webhcat-site and add two properties to the WebHCat configuration for HUE as shown below:
[screenshot: webhcat-site.xml proxy-user properties]
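
And the two standard WebHCat proxy-user properties, each with the value *:

webhcat.proxyuser.hue.hosts=*
webhcat.proxyuser.hue.groups=*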

Configure the hue.ini file to point to your Hadoop cluster

– Go to /usr/local/hue/desktop/conf
– Start editing hue.ini using any editor (like vim) after making a backup copy.
[screenshot: hue.ini]

Note: In this article the cluster is a small, single-node one, so services like Hive Server, Hive Metastore, HBase Master, ZooKeeper etc. are all deployed on the same node. On a bigger cluster, put the correct node information for the respective services that we are editing next. The screenshots below are just examples to help you configure.

i. Edit the HDFS and WebHDFS parameters to point to your cluster. Make the changes as shown. Don't forget to uncomment these parameters after adding values.

[screenshot: hue.ini HDFS section]
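
As a sketch of these settings, assuming a single-node cluster whose hostname is host and the default NameNode/WebHDFS ports, the [hadoop] section of hue.ini would look like:

[hadoop]
  [[hdfs_clusters]]
    [[[default]]]
      fs_defaultfs=hdfs://host:8020
      webhdfs_url=http://host:50070/webhdfs/v1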

ii. Configure the YARN parameters, and don't forget to uncomment them as shown:

[screenshot: hue.ini YARN section]
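
A similar sketch for the YARN section, under the same single-node assumption (the ResourceManager RPC port is 8032 upstream but 8050 on some Ambari-based stacks, so check your yarn-site.xml):

  [[yarn_clusters]]
    [[[default]]]
      resourcemanager_host=host
      resourcemanager_port=8032
      submit_to=True
      resourcemanager_api_url=http://host:8088
      proxy_api_url=http://host:8088
      history_server_api_url=http://host:19888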

iii. Configure Oozie, Hive and HBase as shown below. Don't forget to uncomment the parameters.
[screenshot: hue.ini Oozie section]
[screenshots: hue.ini Hive and HBase sections]
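
And a sketch for the Oozie, Hive and HBase sections under the same single-node assumptions (the HBase entry points at the HBase Thrift Server, which must be running, usually on port 9090):

[liboozie]
  oozie_url=http://host:11000/oozie

[beeswax]
  hive_server_host=host
  hive_server_port=10000

[hbase]
  hbase_clusters=(Cluster|host:9090)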

– Save all the changes.

Start HUE

– As the hue user, go to the /usr/local/hue/build/env/bin folder and start HUE by executing ./supervisor as shown below.
[screenshot: starting HUE with supervisor]
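
As noted in the comments below, closing the terminal kills HUE. To keep it running in the background, a minimal sketch is to start the supervisor with nohup:

[hue@host /]$ cd /usr/local/hue
[hue@host /]$ nohup build/env/bin/supervisor &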

Testing HUE

In your browser, go to:
yourserver:8888/filebrowser
When prompted for a user id and password, log in as user hue with the password you created earlier.
You should see the following screen, confirming that HUE is working properly.
[screenshot: HUE file browser]
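
Independently of the UI, you can sanity-check that WebHDFS accepts requests from the hue proxy user with a quick curl call (hostname assumed; op=LISTSTATUS is a standard WebHDFS REST operation):

[hue@host /]$ curl -i "http://host:50070/webhdfs/v1/?op=LISTSTATUS&user.name=hue"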

Conclusion

In this article we successfully deployed HUE 3.7.1 on top of BigInsights V4.0 using the Quick Start Edition. This setup allows an end user to browse/copy/delete HDFS files, fire queries at Hive/HBase and even create a dashboard for data analysis. The interface can also be used as a front end for your enterprise search application powered by Solr.

20 Comments

  1. Vikas 2 years ago

    Thanks for sharing it.

    I tried these steps and could successfully install and configure Hue 3.8.1 on my local machine, but when I try to configure it with my cloud instance, it's not working. I realized that my cloud instance is protected by a Knox LDAP server. Here are the details –

    role: authentication
    name: ShiroProvider
    enabled: true
    sessionTimeout: 30
    main.ldapRealm: org.apache.shiro.realm.ldap.JndiLdapRealm
    main.ldapRealm.userDnTemplate: uid={0},ou=people,dc=a4h,dc=com
    main.ldapRealm.contextFactory.url: ldap://{{knox_host_name}}:389
    main.ldapRealm.contextFactory.authenticationMechanism: simple
    urls./**: authcBasic

    I am struggling big time in configuring these settings with hue.ini. How to generate certificate and use them?

  2. azahari 2 years ago

    does hue 3.9.0 work with biginsights 4.0 qse_docker?

  3. Ashish 2 years ago

    I am unable to install python and ANT through yum; it is giving me:
    Loaded plugins: product-id, refresh-packagekit, rhnplugin, security, subscription-manager

    • Hue Team 2 years ago

      sounds like a Linux problem more than a Hue problem 😉

  4. Ashish 2 years ago

    Hi Hue team, can we go for the hue setup in the Quick Start VM edition for biginsights 4.0?
    If yes, then right now I am getting an error like:
    Loaded plugins: product-id, refresh-packagekit, rhnplugin, security, subscription-manager
    This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
    This system is not registered with RHN Classic or RHN Satellite.
    You can use rhn_register to register.
    RHN Satellite or RHN Classic support will be disabled.

    • Hue Team 2 years ago

      We don't know these errors; they are probably BigInsights-specific, so we would recommend asking there!

  5. Krishna 2 years ago

    Will these commands work on MAC?

  6. Krishna 2 years ago

    I mean installation commands

  7. Riya 2 years ago

    If we close the terminal where we started Hue, it kills the process and we are no longer able to access the Hue Web UI. How can we run this in the background so that we can have it running all the time?

    • Hue Team 2 years ago

      You need to start Hue with the nohup option, or supervisor, or use the packaging script of a distribution.

  8. HonzaS 1 year ago

    Hi, here is my experience from installing Hue 3.9 on the BigInsights 4.1 QSE VMware image:
    – Prerequisites: apart from those mentioned, you have to install gmp-devel.x86_64 and maven. Don't forget to add maven to PATH and then push it to the installer (sudo env "PATH=$PATH" make install), otherwise the Hue install will fail.
    – Provide a valid Java path; in my case it was sudo echo "JAVA_HOME=\"/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.75.x86_64/jre\"" >> /etc/environment
    – Supervisor can be found in /usr/local/hue/build/env/bin

  9. ramesh 1 year ago

    I have deployed Hue on BigInsights 4.1. I am getting the following error. Initially I created hueadmin as the ID on hue. Then I added an hdfs id on hue and granted it superuser within hue. I get this error when I log in as hdfs. The hueadmin id did not work either.

    Any help is much appreciated. Thanks in advance

    Cannot access: /user/hdfs.

    SecurityException: Failed to obtain user group information: org.apache.hadoop.security.authorize.AuthorizationException: Unauthorized connection for super-user: hue from IP 10.xxx.xxx.xxx (error 403)

    • Hue Team 1 year ago

      It really seems it’s a configuration error outside Hue. What are the permissions on HDFS?

      • ramesh 1 year ago

        Hello Hue Team,

        Appreciate your quick response. I hope you can help me through this…

        What are the required permissions for hue to work? Please see the hdfs permissions below. I used the default configuration when setting up the hadoop cluster.
        drwxr-xr-x - hdfs hdfs 0 2016-03-08 14:50 /user/hdfs

        Will the hueadmin (the admin id which was the initial one set up on hue) be able to browse the hdfs filesystem? Do I need to set up the ID on hdfs, create a home directory or add it to the HDFS group for it to work? The error seems to say that, but I do not see any instructions in the Hue docs to do that. The proxyuser properties as stated in this blog have been set up.

        [root@host log]# hadoop fs -ls /user
        Found 12 items
        drwxrwx--- - ambari-qa hdfs 0 2016-03-08 10:52 /user/ambari-qa
        drwxrwxrwx - bigsql hdfs 0 2016-03-10 10:28 /user/bigsql
        drwxr-xr-x - hbase hdfs 0 2016-03-08 11:14 /user/hbase
        drwxr-xr-x - hcat hdfs 0 2016-03-08 11:18 /user/hcat
        drwxr-xr-x - hdfs hdfs 0 2016-03-08 14:50 /user/hdfs
        drwx------ - hive hdfs 0 2016-03-08 11:18 /user/hive
        drwxr-xr-x - hueadmin hdfs 0 2016-03-16 12:02 /user/hueadmin
        drwxrwxr-x - oozie hdfs 0 2016-03-08 11:15 /user/oozie
        drwxr-xr-x - spark hadoop 0 2016-03-08 11:17 /user/spark
        drwxrwxr-x - tauser hadoop 0 2016-03-09 12:22 /user/tauser

        Just to be sure… I also created the hueadmin ID and home dir as seen below.
        sudo adduser --ingroup hdfs hueadmin
        sudo -u hdfs hadoop fs -mkdir /user/hueadmin
        sudo -u hdfs hadoop fs -chown -R hueadmin:hdfs /user/hueadmin

        • ramesh 1 year ago

          Hi,

          Wanted to provide an update. I was able to identify the issue. The default_user property is what I changed from hue to hdfs, and the data browser is now working. It does help to look at the logs and backtrack from the REST API queries :). Thanks for your attention.

          # This should be the Hue admin and proxy user
          default_user=hdfs

          # This should be the hadoop cluster admin
          default_hdfs_superuser=hdfs

  10. ramesh 1 year ago

    More information on my previous question. The Quick Start wizard gave a few errors. I have followed all the instructions on this site to deploy Hue on BI 4.x. This is a cluster with 15 nodes: 4 management and 11 data nodes. Hue is deployed on the name node.

    Quick Start Wizard – Hue™ 3.9.0 – The Hadoop UI

    Checking current configuration
    Configuration files located in /usr/local/hue/desktop/conf

    Potential misconfiguration detected. Fix and restart Hue.

    hadoop.hdfs_clusters.default.webhdfs_url Current value: http://10.xxx.xxx.xxx:50070/webhdfs/v1
    Failed to access filesystem root

    desktop.secret_key Current value:
    Secret key should be configured as a random string. All sessions will be lost on restart
    Hive Editor Failed to access Hive warehouse: /apps/hive/warehouse
    Impala Editor No available Impalad to send queries to.
    Spark The app won’t work without a running Livy Spark Server

    • Hue Team 1 year ago

      See previous reply, it looks like HDFS is not configured properly and everything else fails subsequently.
