Note: This post is deprecated as of Hue 3.8 / April 24th 2015. Hue now has a new Spark Notebook application.
Hi Spark Makers!
A Hue Spark application was recently created. It lets users execute and monitor Spark jobs directly from their browser, from any machine, interactively.
The new application uses the Spark Job Server contributed by Ooyala at the last Spark Summit.
We hope to work with the community and add support for Python and Java, direct script submission without compiling/uploading, and other improvements in the future!
As usual, feel free to comment on the hue-user list or @gethue!
Get Started!
Currently only Scala jobs are supported, and programs need to implement this trait and be packaged into a jar. Here is a WordCount example. To learn more about Spark Job Server, check its README.
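For reference, here is a minimal sketch of what such a job looks like, modeled on the WordCountExample shipped with Spark Job Server. The package and helper names (spark.jobserver.SparkJob, SparkJobValid, SparkJobInvalid) follow that project and may differ between versions:

package spark.jobserver

import com.typesafe.config.Config
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import scala.util.Try

// A job runnable by the Spark Job Server: it implements the SparkJob trait,
// which requires a validate() and a runJob() method.
object WordCountExample extends SparkJob {

  // Reject the job early if the expected "input.string" parameter is missing.
  override def validate(sc: SparkContext, config: Config): SparkJobValidation = {
    Try(config.getString("input.string"))
      .map(_ => SparkJobValid)
      .getOrElse(SparkJobInvalid("No input.string config param"))
  }

  // Count the words of the input string; the returned map is the job result.
  override def runJob(sc: SparkContext, config: Config): Any = {
    val words = sc.parallelize(config.getString("input.string").split(" ").toSeq)
    words.map((_, 1)).reduceByKey(_ + _).collect().toMap
  }
}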
If you are using Cloudera Manager, enable the Spark App by removing it from the blacklist. To do so, add this to the Hue Safety Valve:
[desktop]
app_blacklist=
Requirements
We assume you have Spark 0.9.0 and Scala 2.10 installed on your system. Make sure you have the correct Scala and sbt versions, e.g. for Ubuntu: https://gist.github.com/visenger/5496675
Get Spark Job Server
It is currently hosted on GitHub on this branch:
git clone https://github.com/ooyala/spark-jobserver.git
cd spark-jobserver
Then start sbt:
sbt
And from the sbt shell, launch the server (it listens on port 8090 by default):
re-start
Get Hue
If Hue and Spark Job Server are not on the same machine, update this hue.ini property in desktop/conf/pseudo-distributed.ini:
[spark]
# URL of the Spark Job Server.
server_url=http://localhost:8090/
To point to your Spark cluster, edit the Job Server configuration:
vim ./job-server/src/main/resources/application.conf
Replace:
master = "local[4]"
with the Spark Master URL (you can get it from the Spark Master UI at http://SPARK-HOST:18080/):
master = "spark://localhost:7077"
Get a Spark example to run
Then follow this walk-through and create the example jar that is used in the video demo.
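As a rough guide, a build.sbt along these lines is enough to package such a job into a jar. The Spark version matches the requirements above, while the job server artifact coordinates are illustrative (at the time of writing the project typically has to be published locally first, e.g. with sbt publish-local):

name := "spark-wordcount-example"

version := "0.1.0"

scalaVersion := "2.10.3"

// Spark itself, marked "provided" because the cluster supplies it at runtime.
libraryDependencies += "org.apache.spark" %% "spark-core" % "0.9.0-incubating" % "provided"

// The Spark Job Server API that defines the SparkJob trait.
// Illustrative coordinates: adjust them to whatever the walk-through publishes locally.
libraryDependencies += "ooyala.cnd" % "job-server" % "0.3.1" % "provided"

Running sbt package then produces the jar under target/scala-2.10/, ready to be uploaded to the Spark Job Server.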