Install Hue 3 on Pivotal HD 3.0

Published on 15 June 2015 in Administration - 7 minutes read - Last modified on 04 February 2020

This post was originally published as “Install Hue 3 with Pivotal HD 3.0” by Christian Tzolov of @Pivotal.

The latest Hadoop distributions from Pivotal (PHD3.0) and Hortonworks (HDP2.2) come with support for Hue 2.6. Unfortunately, Hue 2.6 is quite old and does not provide an RDBMS UI. The RDBMS view is useful for Pivotal because it offers a friendly web interface for running ad hoc HAWQ SQL queries. This feature is demoed here:

 

 

Below I will show how to install the latest Hue 3.7.1 on PHD3.0 and how to use it with HAWQ.

<p>
  <strong>Disclaimer</strong>: This is experimental work. It has not been thoroughly tested and will not be supported in the future. The article expresses the author’s own opinion.
</p>


<p>
  <strong>Installing Hue 3.7.1 on Pivotal HD 3.0</strong>
</p>
The following Hue 3.7.1 RPMs are built with Apache Bigtop. You can download a compressed bundle with all Hue RPMs: hue-all-3.7.1-1.el6.x86_64.zip. It contains the following packages:

<table>
  <tr>
    <td>
      <div dir="ltr">
        hue-common
      </div>
    </td>

    <td>
      <div dir="ltr">
        hue-spark - requires additional services
      </div>
    </td>
  </tr>

  <tr>
    <td>
      <div dir="ltr">
        hue-server
      </div>
    </td>

    <td>
      <div dir="ltr">
        hue-sqoop
      </div>
    </td>
  </tr>

  <tr>
    <td>
      <div dir="ltr">
        hue-rdbms
      </div>
    </td>

    <td>
      <div dir="ltr">
        hue-doc
      </div>
    </td>
  </tr>

  <tr>
    <td>
      <div dir="ltr">
        hue-pig
      </div>
    </td>

    <td>
      <div dir="ltr">
        hue-search
      </div>
    </td>
  </tr>

  <tr>
    <td>
      <div dir="ltr">
        hue-zookeeper
      </div>
    </td>

    <td>
    </td>
  </tr>

  <tr>
    <td>
      <div dir="ltr">
        hue-beeswax
      </div>
    </td>

    <td>
    </td>
  </tr>

  <tr>
    <td>
      <div dir="ltr">
        hue-hbase
      </div>
    </td>

    <td>
    </td>
  </tr>

  <tr>
    <td>
      <div dir="ltr">
        hue-impala - required due to issue <a href="https://issues.cloudera.org/browse/HUE-2492">HUE-2492 </a>
      </div>
    </td>

    <td>
    </td>
  </tr>
</table>


Uncompress the hue-all.zip bundle on the Ambari node and install the packages as shown below. For some packages the dependency check needs to be disabled (e.g. use rpm -i with the --nodeps option).


# External dependencies
sudo yum -y install cyrus-sasl-gssapi cyrus-sasl-plain libxml2 libxslt zlib python sqlite python-psycopg2
# Hue packages
sudo yum -y install ./hue-common-3.7.1-1.el6.x86_64.rpm
sudo yum -y install ./hue-server-3.7.1-1.el6.x86_64.rpm
sudo yum -y install ./hue-rdbms-3.7.1-1.el6.x86_64.rpm
sudo yum -y install ./hue-zookeeper-3.7.1-1.el6.x86_64.rpm
sudo yum -y install ./hue-pig-3.7.1-1.el6.x86_64.rpm
sudo yum -y install ./hue-hbase-3.7.1-1.el6.x86_64.rpm
sudo yum -y install ./hue-beeswax-3.7.1-1.el6.x86_64.rpm
# If the dependency check blocks a package, install it with rpm instead, e.g.:
# sudo rpm -i --nodeps hue-beeswax-3.7.0+cdh5.3.3+180-1.cdh5.3.3.p0.8.el6.x86_64.rpm
sudo yum -y install ./hue-sqoop-3.7.1-1.el6.x86_64.rpm
sudo yum -y install ./hue-impala-3.7.1-1.el6.x86_64.rpm
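
The install commands above follow one naming pattern, so they can be generated instead of typed. The sketch below is illustrative only: the helper names `rpm_filename` and `install_commands` are hypothetical, the version string and package order are copied from the listing above, and the code only builds the command strings without running them.

```python
# Hypothetical helpers: build the yum commands shown in the listing above.
HUE_VERSION = "3.7.1-1.el6.x86_64"  # version suffix of the RPMs in the bundle

# Same order as the listing in this post.
PACKAGES = ["common", "server", "rdbms", "zookeeper", "pig",
            "hbase", "beeswax", "sqoop", "impala"]


def rpm_filename(app, version=HUE_VERSION):
    """Return the RPM file name for a Hue app such as 'rdbms'."""
    return "hue-%s-%s.rpm" % (app, version)


def install_commands(packages=PACKAGES):
    """Return one 'yum install' command string per package, in order."""
    return ["sudo yum -y install ./" + rpm_filename(p) for p in packages]


if __name__ == "__main__":
    for command in install_commands():
        print(command)
```

Printing the commands first makes it easy to review them before pasting them into a shell on the Ambari node.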


Start Hue:

sudo /etc/init.d/hue start

Open the Hue UI on port 8888 of the Ambari node (e.g. https://ambari.localadmin:8888).

The first login will ask for a username and a password. Make sure to set the username to hue! Pick a password of your choice.


Next, configure Hue. Edit /etc/hue/conf/hue.ini as explained in Appendix A and apply the RDBMS table view workaround as explained in Appendix B.

Restart Hue:

sudo /etc/init.d/hue restart


Enable HAWQ remote access


Enable remote access from the Ambari node to HAWQ. On the HAWQ master (phd3), open the master’s pg_hba.conf file:

sudo vi /data/hawq/master/gpseg-1/pg_hba.conf


and add the following line, putting the IP address of your Ambari node in front of /32.

host    all     gpadmin /32        trust
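
As a small illustration of that entry's layout, the sketch below builds the line from an address. The function name `pg_hba_trust_line` is hypothetical, and 192.0.2.10 is a documentation-only example address, not a node from this cluster.

```python
import ipaddress


def pg_hba_trust_line(client_ip, user="gpadmin", database="all"):
    """Build a pg_hba.conf 'host' entry that trusts one client address."""
    address = ipaddress.ip_address(client_ip)  # raises ValueError if malformed
    prefix = 32 if address.version == 4 else 128  # mask matching a single host
    return "host    %s     %s %s/%d        trust" % (database, user, address, prefix)


if __name__ == "__main__":
    # Example address only; substitute the IP of your Ambari node.
    print(pg_hba_trust_line("192.0.2.10"))
```

Validating the address first catches typos before they end up in pg_hba.conf. Note that `trust` lets the listed host connect as gpadmin without a password, so the mask should stay as narrow as possible.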


Restart the HAWQ service using Ambari.


Start HBase Thrift Server


Hue communicates with the HBase service via Thrift. To start the server on the HBase master node (phd2.localdomain) run:

sudo nohup /usr/bin/hbase thrift start &


Hadoop Proxy configuration


To allow Hue to impersonate users when it talks to the various Hadoop services, you have to enable the following Hadoop proxy users. In Ambari, from the Dashboard view, select the HDFS service and then the ‘Config’ tab. Type ‘proxy’ in the search field and press Enter. Change or add the properties to match these values:



<table>
  <tr>
    <td>
      <div dir="ltr">
        Property name
      </div>
    </td>

    <td>
      <div dir="ltr">
        value
      </div>
    </td>
  </tr>

  <tr>
    <td>
      <div dir="ltr">
        hadoop.proxyuser.hcat.groups
      </div>
    </td>

    <td>
      <div dir="ltr">
        *
      </div>
    </td>
  </tr>

  <tr>
    <td>
      <div dir="ltr">
        hadoop.proxyuser.hcat.hosts
      </div>
    </td>

    <td>
      <div dir="ltr">
        *
      </div>
    </td>
  </tr>

  <tr>
    <td>
      <div dir="ltr">
        hadoop.proxyuser.hive.groups
      </div>
    </td>

    <td>
      <div dir="ltr">
        *
      </div>
    </td>
  </tr>

  <tr>
    <td>
      <div dir="ltr">
        hadoop.proxyuser.hive.hosts
      </div>
    </td>

    <td>
      <div dir="ltr">
        *
      </div>
    </td>
  </tr>

  <tr>
    <td>
      <div dir="ltr">
        hadoop.proxyuser.hue.groups
      </div>
    </td>

    <td>
      <div dir="ltr">
        *
      </div>
    </td>
  </tr>

  <tr>
    <td>
      <div dir="ltr">
        hadoop.proxyuser.hue.hosts
      </div>
    </td>

    <td>
      <div dir="ltr">
        *
      </div>
    </td>
  </tr>

  <tr>
    <td>
      <div dir="ltr">
        hadoop.proxyuser.oozie.groups
      </div>
    </td>

    <td>
      <div dir="ltr">
        *
      </div>
    </td>
  </tr>

  <tr>
    <td>
      <div dir="ltr">
        hadoop.proxyuser.oozie.hosts
      </div>
    </td>

    <td>
      <div dir="ltr">
        *
      </div>
    </td>
  </tr>
</table>
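
The proxy properties follow one pattern, `hadoop.proxyuser.<user>.groups` and `hadoop.proxyuser.<user>.hosts`, for the users hcat, hive, hue and oozie. A minimal sketch that generates them for checking against what Ambari shows (the function name `proxyuser_properties` is hypothetical):

```python
def proxyuser_properties(users=("hcat", "hive", "hue", "oozie"), value="*"):
    """Return the hadoop.proxyuser.<user>.{groups,hosts} settings as a dict."""
    properties = {}
    for user in users:
        for scope in ("groups", "hosts"):
            properties["hadoop.proxyuser.%s.%s" % (user, scope)] = value
    return properties


if __name__ == "__main__":
    for name, value in sorted(proxyuser_properties().items()):
        print("%s=%s" % (name, value))
```

The wildcard `*` is what this post uses; a production cluster would likely restrict hosts and groups to specific values.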


Save the modified configuration and restart the affected services (HDFS, YARN and MapReduce). Tip: follow the restart suggestions.

 

Appendix A: HUE Configuration
The Hue configuration is in the /etc/hue/conf/hue.ini file. Run ‘/etc/init.d/hue restart’ after modifying the configuration.

The following service deployment topology is being used for this particular hue.ini configuration:


<table>
  <tr>
    <td>
      <div dir="ltr">
        ambari.localadmin
      </div>
    </td>

    <td>
      <div dir="ltr">
        Ambari, Hue, Nagios, Ganglia
      </div>
    </td>
  </tr>

  <tr>
    <td>
      <div dir="ltr">
        phd1.localadmin
      </div>
    </td>

    <td>
      <div dir="ltr">
        HAWQ SMaster, NameNode, HiveServer2, Hive Metastore, ResourceManager, WebHCat Server, DataNode, HAWQ Segment, RegionServer, NodeManager, PXF
      </div>
    </td>
  </tr>

  <tr>
    <td>
      <div dir="ltr">
        phd2.localadmin
      </div>
    </td>

    <td>
      <div dir="ltr">
        App Timeline Server, History Server, HBase Master, Oozie Server, SNameNode, Zookeeper Server, DataNode, HAWQ Segment, RegionServer, NodeManager, PXF
      </div>
    </td>
  </tr>

  <tr>
    <td>
      <div dir="ltr">
        phd3.localadmin
      </div>
    </td>

    <td>
      <div dir="ltr">
        HAWQ Master, DataNode, HAWQ Segment, RegionServer, NodeManager, PXF
      </div>
    </td>
  </tr>
</table>


Only the modifications to the default Hue configuration properties are listed below. The configuration is aligned with the PHD3.0 cluster topology shown above.

 


###########################################################################
# General configuration for core Desktop features (authentication, etc)
###########################################################################
[desktop]
 # Set this to a random string, the longer the better.
 # This is used for secure hashing in the session store.
 secret_key=bozanovakozagoza
 # Time zone name
 time_zone=Europe/Amsterdam
 # Comma separated list of apps to not load at server startup.
 # e.g.: pig,zookeeper
 app_blacklist=impala,indexer
###########################################################################
# Settings for the RDBMS application
###########################################################################
[librdbms]
 # The RDBMS app can have any number of databases configured in the databases
 # section. A database is known by its section name
 # (IE sqlite, mysql, psql, and oracle in the list below).
 [[databases]]
   # mysql, oracle, or postgresql configuration.
   [[[postgresql]]]
     # Name to show in the UI.
     nice_name="HAWQ"
     name=postgres
     engine=postgresql
     host=phd3.localdomain
     port=5432
     user=gpadmin
     password=
###########################################################################
# Settings to configure your Hadoop cluster.
###########################################################################
[hadoop]
 # Configuration for HDFS NameNode
 # ------------------------
 [[hdfs_clusters]]
   [[[default]]]
     # Enter the filesystem uri
     fs_defaultfs=hdfs://phd1.localdomain:8020
     # Use WebHdfs/HttpFs as the communication mechanism.
     # Domain should be the NameNode or HttpFs host.
     # Default port is 14000 for HttpFs.
     webhdfs_url=http://phd1.localdomain:50070/webhdfs/v1
 # Configuration for YARN (MR2)
 # ------------------------
 [[yarn_clusters]]
   [[[default]]]
     # Enter the host on which you are running the ResourceManager
     resourcemanager_host=phd1.localdomain
     # The port where the ResourceManager IPC listens on
     resourcemanager_port=8030
     # URL of the ResourceManager API
     resourcemanager_api_url=http://phd1.localdomain:8088
     # URL of the HistoryServer API
     history_server_api_url=http://phd2.localdomain:19888
###########################################################################
# Settings to configure liboozie
###########################################################################
[liboozie]
 # The URL where the Oozie service runs on. This is required in order for
 # users to submit jobs. Empty value disables the config check.
 oozie_url=http://phd2.localdomain:11000/oozie
###########################################################################
# Settings to configure Beeswax with Hive
###########################################################################
[beeswax]
 # Host where HiveServer2 is running.
 # If Kerberos security is enabled, use fully-qualified domain name (FQDN).
 hive_server_host=phd1.localdomain
 # Port where HiveServer2 Thrift server runs on.
 hive_server_port=10000
 # Choose whether Hue uses the GetLog() thrift call to retrieve Hive logs.
 # If false, Hue will use the FetchResults() thrift call instead.
 use_get_log_api=false
 # Set a LIMIT clause when browsing a partitioned table.
 # A positive value will be set as the LIMIT. If 0 or negative, do not set any limit.
 browse_partitioned_table_limit=250
 # A limit to the number of rows that can be downloaded from a query.
 # A value of -1 means there will be no limit.
 # A maximum of 65,000 is applied to XLS downloads.
 download_row_limit=10000
 # Thrift version to use when communicating with HiveServer2
 thrift_version=5
###########################################################################
# Settings to configure the Zookeeper application.
###########################################################################
[zookeeper]
 [[clusters]]
   [[[default]]]
     # Zookeeper ensemble. Comma separated list of Host/Port.
     # e.g. localhost:2181,localhost:2182,localhost:2183
     host_ports=phd2.localdomain:2181
     # The URL of the REST contrib service (required for znode browsing)
     rest_url=http://phd2.localdomain:9998
###########################################################################
# Settings to configure HBase Browser
###########################################################################
[hbase]
 # Comma-separated list of HBase Thrift servers for clusters in the format of '(name|host:port)'.
 # Use full hostname with security.
 hbase_clusters=(Cluster|phd2.localdomain:9090)
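
Before restarting Hue, it can help to confirm that the hosts and ports named in this configuration are reachable from the Hue node. The sketch below is one possible check, not part of Hue: `port_open` and `check_endpoints` are hypothetical helper names, and the endpoint list is copied from the hue.ini values above.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or host name did not resolve
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Map 'host:port' to True/False for every (host, port) pair given."""
    return {"%s:%d" % (h, p): port_open(h, p, timeout) for h, p in endpoints}


# Endpoints taken from the hue.ini listing above.
ENDPOINTS = [
    ("phd3.localdomain", 5432),   # HAWQ master
    ("phd1.localdomain", 8020),   # HDFS NameNode
    ("phd1.localdomain", 50070),  # WebHDFS
    ("phd1.localdomain", 8088),   # ResourceManager API
    ("phd2.localdomain", 19888),  # HistoryServer API
    ("phd2.localdomain", 11000),  # Oozie
    ("phd1.localdomain", 10000),  # HiveServer2
    ("phd2.localdomain", 2181),   # ZooKeeper
    ("phd2.localdomain", 9090),   # HBase Thrift server
]
```

Calling `check_endpoints(ENDPOINTS)` on the Hue node returns a dict that shows which services answer. An unreachable HBase Thrift port (9090), for instance, usually means the Thrift server from the earlier step is not running.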

Appendix B: Workaround for the RDBMS view not showing tables

<div dir="ltr">
  Credits to Scott Kahler for this workaround!
</div>

<div dir="ltr">
  Edit /usr/lib/hue/desktop/libs/librdbms/src/librdbms/server/postgresql_lib.py. Replace the cursor.execute() statements in the get_tables() and get_columns() methods as shown below:
</div>

def get_tables(self, database, table_names=[]):
  # Doesn't use database and only retrieves tables for database currently in use.
  cursor = self.connection.cursor()
  # Original statement:
  # cursor.execute("SELECT table_name FROM information_schema.tables WHERE table_schema='%s'" % database)
  cursor.execute("SELECT table_name FROM information_schema.tables WHERE table_schema NOT IN ('hawq_toolkit','information_schema','madlib','pg_aoseg','pg_bitmapindex','pg_catalog','pg_toast')")
  self.connection.commit()
  return [row[0] for row in cursor.fetchall()]

def get_columns(self, database, table):
  cursor = self.connection.cursor()
  # Original statement:
  # cursor.execute("SELECT column_name FROM information_schema.columns WHERE table_schema='%s' and table_name='%s'" % (database, table))
  cursor.execute("SELECT column_name FROM information_schema.columns WHERE table_name='%s' AND table_schema NOT IN ('hawq_toolkit','information_schema','madlib','pg_aoseg','pg_bitmapindex','pg_catalog','pg_toast')" % table)
  self.connection.commit()
  return [row[0] for row in cursor.fetchall()]
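
Both statements repeat the same schema list and splice the table name into the SQL with `%`. A possible variation, which is not part of the original workaround, keeps the list in one constant and hands the table name to the driver as a query parameter. The names `EXCLUDED_SCHEMAS`, `tables_query` and `columns_query` are hypothetical.

```python
# Schemas hidden from the RDBMS view (same list as in the workaround above).
EXCLUDED_SCHEMAS = ("hawq_toolkit", "information_schema", "madlib",
                    "pg_aoseg", "pg_bitmapindex", "pg_catalog", "pg_toast")


def _schema_list():
    """Render the excluded schemas as a quoted SQL list: 'a','b',..."""
    return ",".join("'%s'" % name for name in EXCLUDED_SCHEMAS)


def tables_query():
    """Return the SQL text that lists tables outside the excluded schemas."""
    return ("SELECT table_name FROM information_schema.tables "
            "WHERE table_schema NOT IN (%s)" % _schema_list())


def columns_query(table):
    """Return (sql, params); the table name is left for the driver to bind."""
    sql = ("SELECT column_name FROM information_schema.columns "
           "WHERE table_name=%s AND table_schema NOT IN (" + _schema_list() + ")")
    return sql, (table,)
```

With a DB-API cursor such as psycopg2's, this would be used as `cursor.execute(tables_query())` and `cursor.execute(*columns_query(table))`, so table names containing quotes are escaped by the driver rather than by string formatting.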

You can automate the update like this:


sudo sed -i "s/='%s'\" % database/ NOT IN ('hawq_toolkit','information_schema','madlib','pg_aoseg','pg_bitmapindex','pg_catalog','pg_toast')\"/g" /usr/lib/hue/desktop/libs/librdbms/src/librdbms/server/postgresql_lib.py
sudo sed -i "s/table_schema='%s' and table_name='%s'\" % (database, table)/table_name='%s' AND table_schema NOT IN ('hawq_toolkit','information_schema','madlib','pg_aoseg','pg_bitmapindex','pg_catalog','pg_toast')\" % table/g" /usr/lib/hue/desktop/libs/librdbms/src/librdbms/server/postgresql_lib.py
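
The same substitution can be sketched in Python, which makes it easy to try on a copy of the file first. This is an illustration rather than a tested replacement for the commands above: `patch_source`, `SCHEMAS` and `REPLACEMENTS` are hypothetical names, and the search strings assume the original statements look exactly as quoted in this appendix.

```python
SCHEMAS = ("'hawq_toolkit','information_schema','madlib','pg_aoseg',"
           "'pg_bitmapindex','pg_catalog','pg_toast'")

# (old, new) pairs that mirror the two substitutions described above.
REPLACEMENTS = [
    ("table_schema='%s'\" % database",
     "table_schema NOT IN (" + SCHEMAS + ")\""),
    ("table_schema='%s' and table_name='%s'\" % (database, table)",
     "table_name='%s' AND table_schema NOT IN (" + SCHEMAS + ")\" % table"),
]


def patch_source(text):
    """Apply both replacements to the module source and return the result."""
    for old, new in REPLACEMENTS:
        text = text.replace(old, new)
    return text
```

To apply it, read postgresql_lib.py into a string, pass it through `patch_source`, and write the result back (as root, ideally after keeping a backup copy). Running it twice is harmless because the patched text no longer contains the search strings.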

Restart Hue:
<div dir="ltr">
  <p>
    <pre><code class="bash">sudo /etc/init.d/hue restart</code></pre>
  </p>
</div>

<h1 dir="ltr">
  Related links:
</h1>

<div dir="ltr">
  <a href="https://gethue.com/hadoop-hue-3-on-hdp-installation-tutorial/">https://gethue.com/hadoop-hue-3-on-hdp-installation-tutorial/</a>
</div>
