Oozie workflow credentials with a Hive action with Kerberos

When using Hadoop security and scheduling jobs that use Hive (or Pig, or HBase), you might have received this error:

Caused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: GSS initiate failed

Indeed, in order to use an Oozie Hive action with the Hive metastore server when Kerberos is enabled, you need to use HCatalog credentials in your workflow.

Here is a demo, with a Kerberized cluster and a MySQL Hive metastore, showing how it works. We create a Hive script that lists the tables and performs an operation requiring the HCat credential. Please find all the used and generated configurations here.

Hue fills in the parameters for you automatically. Just check the credentials required on your workflow action and Hue will:

  • Dynamically pull the available credential details from the cluster
  • Configure the credentials in the workflow for you

Then don’t forget to check the HCat credential in the Hive action advanced properties. You can check multiple credentials if you ever need to.
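
For reference, here is a rough sketch of what a workflow with an HCat credential attached to a Hive action can look like. It is not the exact file generated by Hue in the demo: the workflow name, the script name `script.sql`, the Kerberos principal `hive/hue.com@EXAMPLE.COM` and the schema versions are assumptions made for illustration, while the metastore URI reuses the one from this post.

```xml
<workflow-app name="hive-credentials-demo" xmlns="uri:oozie:workflow:0.4">
  <!-- Credentials the workflow actions can refer to by name -->
  <credentials>
    <credential name="hcat" type="hcat">
      <property>
        <name>hcat.metastore.uri</name>
        <value>thrift://hue.com:9083</value>
      </property>
      <property>
        <!-- Assumed principal: replace with the one of your metastore -->
        <name>hcat.metastore.principal</name>
        <value>hive/hue.com@EXAMPLE.COM</value>
      </property>
    </credential>
  </credentials>

  <start to="hive-action"/>

  <!-- cred="hcat" is what "checking the HCat credential" translates to -->
  <action name="hive-action" cred="hcat">
    <hive xmlns="uri:oozie:hive-action:0.2">
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <job-xml>hive-config.xml</job-xml>
      <!-- Hypothetical script name -->
      <script>script.sql</script>
    </hive>
    <ok to="end"/>
    <error to="kill"/>
  </action>

  <kill name="kill">
    <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

Several credentials can be attached to one action by listing their names separated by commas in the `cred` attribute, for example `cred="hcat,hbase"`.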

And that’s it! Submit the workflow and check its output: you will see the list of tables and the result of the second query.

As usual feel free to comment on the hue-user list or @gethue!

Note:
Hive should not access the metastore database directly via JDBC, or it will bypass the protection.

Do not include in the Job XML property of the Hive action a hive-config.xml with this type of direct JDBC configuration:

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://hue.com:3306/hive1?useUnicode=true&amp;characterEncoding=UTF-8</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive1</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive1</value>
</property>

Use this one, which goes through the metastore Thrift server:

<property>
  <name>hive.metastore.local</name>
  <value>false</value>
</property>
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://hue.com:9083</value>
</property>
<property>
  <name>hive.metastore.sasl.enabled</name>
  <value>true</value>
</property>
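
With SASL enabled, Hive clients generally also need to know the Kerberos principal of the metastore server. If your hive-config.xml does not already provide it, a property along these lines may be required. The realm `EXAMPLE.COM` is an assumption; use the principal your metastore actually runs as.

```xml
<property>
  <name>hive.metastore.kerberos.principal</name>
  <!-- Assumed value: _HOST is replaced by the host name at runtime -->
  <value>hive/_HOST@EXAMPLE.COM</value>
</property>
```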

Note:
When the job tries to connect to MySQL, you might hit this missing jar problem:


Caused by: org.datanucleus.store.rdbms.datasource.DatastoreDriverNotFoundException: The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.

To solve it, download the MySQL connector jar from http://dev.mysql.com/downloads/connector/j/ and have HiveServer2 point to it with:

<property>
  <name>hive.aux.jars.path</name>
  <value>file:///usr/share/java/mysql-connector-java.jar</value>
</property>

Note:
To activate the credentials in Oozie itself, update this property in oozie-site.xml:

<property>
  <name>oozie.credentials.credentialclasses</name>
  <value>
    hcat=org.apache.oozie.action.hadoop.HCatCredentials,
    hbase=org.apache.oozie.action.hadoop.HbaseCredentials
  </value>
</property>

3 Comments

  1. oozie-user 3 years ago

    What version of Hue will support this feature?

  2. Alex 2 years ago

    I am using Hue 2.6.1-2041 and it does not have this feature. Is there a work around?

    • Hue Team 2 years ago

      Unfortunately now, you would need to upgrade.

      BTW Hue 2 is very old, we are at 3.8 now: demo.gethue.com
