Hive UDF in 1 minute!

Published on 19 August 2013 in Querying - 2 minutes read - Last modified on 04 February 2020

Apache Hive comes with a lot of built-in UDFs, but what happens when you need a “special one”? This post shows how to get started with a custom Hive UDF, from compilation to execution, in no time.

Let’s go!

Our goal is to create a UDF that transforms its input to upper case. All the code is available in our public repository of Hadoop examples and tutorials.
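
For orientation, the core of such a UDF is small. The sketch below is not the repository's MyUpper class: it is a dependency-free stand-in (the name MyUpperSketch is hypothetical) so that it compiles without Hive on the classpath. A simple Hive UDF of this kind would normally extend org.apache.hadoop.hive.ql.exec.UDF and work with org.apache.hadoop.io.Text rather than String.

```java
import java.util.Locale;

/** Dependency-free stand-in for an upper-casing Hive UDF (hypothetical name). */
public final class MyUpperSketch {

    // Hive looks up a method named evaluate() on simple UDFs and calls it per row.
    // Assumption: the real class takes and returns Hadoop's Text type; String is
    // used here only so the sketch compiles without Hive or Hadoop jars.
    public String evaluate(final String s) {
        if (s == null) {
            return null; // NULL in, NULL out
        }
        return s.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(new MyUpperSketch().evaluate("hive udf"));
    }
}
```

The null check matters because Hive passes NULL column values straight to the function, and an unguarded call would throw a NullPointerException.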

If you want to go even faster, the UDF is already precompiled here.

If not, check out the code:

git clone https://github.com/romainr/hadoop-tutorials-examples.git
cd hadoop-tutorials-examples/hive-udf

And compile the UDF (Java and Hive need to be installed):

javac -cp $(ls /usr/lib/hive/lib/hive-exec*.jar):/usr/lib/hadoop/hadoop-common.jar org/hue/udf/MyUpper.java

jar -cf myudfs.jar -C . .

Or use Maven with our pom.xml, which automatically pulls in the dependent jars:

mvn install

Register the UDF in the Hive Editor

Then open Beeswax in Hue, the Hadoop UI, and click on the ‘Settings’ tab.

In File Resources, upload myudfs.jar, then pick the jar file and point to it, e.g.:

/user/hue/myudfs.jar

Make the UDF available by registering it as a UDF (User Defined Function):

Name: myUpper
Class: org.hue.udf.MyUpper

That’s it! Just test it on one of the Hue example tables:

SELECT myUpper(description) FROM sample_07 LIMIT 10;
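
Outside the Hue Settings form, the same registration is usually written as two HiveQL statements, ADD JAR and CREATE TEMPORARY FUNCTION. As a rough sketch, the hypothetical helper below only assembles those statements as strings. It does not connect to Hive, and the jar location is a placeholder: whether a plain path or an hdfs:// URI is needed depends on your setup.

```java
import java.util.Arrays;
import java.util.List;

/** Builds the HiveQL that registers a UDF for a session (hypothetical helper). */
public final class RegisterUdfStatements {

    // Returns the statements in the order they must run:
    // first load the jar, then bind the function name to the class inside it.
    public static List<String> build(String jarPath, String functionName, String className) {
        return Arrays.asList(
            "ADD JAR " + jarPath,
            "CREATE TEMPORARY FUNCTION " + functionName + " AS '" + className + "'");
    }

    public static void main(String[] args) {
        // Assumed jar location; the name and class are the ones used in this post.
        for (String stmt : build("/user/hue/myudfs.jar", "myUpper", "org.hue.udf.MyUpper")) {
            System.out.println(stmt + ";");
        }
    }
}
```

Both statements are scoped to the session, so they have to be issued again in a new session before the function can be called.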

Summary

We are using the most common type of UDF. If you want to learn about the other types in more depth, great resources such as Hadoop: The Definitive Guide are available. Note that adding a jar loads it for the entire session, so you don’t need to load it again. Next time we will demo how to create a Python UDF for Hive!

Have any questions? Feel free to contact us on hue-user or @gethue!

 

Note:

If you did not register the UDF as explained above, you will get this error:

error while compiling statement: failed: parseexception line 1:0 cannot recognize input near 'myupper' " "
