A new Spark Web UI: Spark App

Note: This post is deprecated as of Hue 3.8 / April 24th 2015. Hue now has a new Spark Notebook application.

Hi Spark Makers!

A Hue Spark application was recently created. It lets users execute and monitor Spark jobs interactively, directly from their browser, on any machine.

The new application uses the Spark Job Server contributed by Ooyala at the last Spark Summit.

We hope to work with the community to add support for Python, Java, direct script submission without compiling/uploading, and other improvements in the future!

As usual, feel free to comment on the hue-user list or @gethue!

Get Started!

Currently only Scala jobs are supported, and programs need to implement this trait and be packaged into a jar. Here is a WordCount example. To learn more about Spark Job Server, check its README.

If you are using Cloudera Manager, enable the Spark App by removing it from the blacklist. To do so, add this to the Hue Safety Valve:

[desktop]
app_blacklist=
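
If the app still does not show up, it can help to check what the blacklist actually contains. Here is a minimal sketch, assuming the snippet is a flat ini fragment like the one above (the real hue.ini also uses nested [[...]] sections, which this sketch does not try to handle). The function name blacklisted_apps is ours, not part of Hue:

```python
import configparser


def blacklisted_apps(ini_text):
    """Return the set of app names listed in [desktop] app_blacklist."""
    # strict=False tolerates repeated sections, which safety valves often contain.
    parser = configparser.ConfigParser(strict=False)
    parser.read_string(ini_text)
    # A missing section or option is treated as an empty blacklist.
    raw = parser.get("desktop", "app_blacklist", fallback="")
    return {name.strip() for name in raw.split(",") if name.strip()}


# The safety valve snippet from this post: an empty blacklist.
safety_valve = """
[desktop]
app_blacklist=
"""

# True means the Spark app is still blacklisted.
print("spark" in blacklisted_apps(safety_valve))
```

Note that a misspelled section name (for example [dektop]) yields an empty set here as well, so also check the section header itself.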

Requirements

We assume you have Spark 0.9.0 and Scala 2.10 installed on your system. Make sure you have the right Scala and sbt versions, e.g. for Ubuntu: https://gist.github.com/visenger/5496675

Get Spark Job Server

It is currently on GitHub, on this branch:

git clone https://github.com/ooyala/spark-jobserver.git
cd spark-jobserver

Then type:

sbt
re-start

Get Hue

If Hue and Spark Job Server are not on the same machine, update the hue.ini property in desktop/conf/pseudo-distributed.ini:

[spark]
  # URL of the Spark Job Server.
  server_url=http://localhost:8090/
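
A "Connection refused" error in the app usually means nothing is listening at server_url. Here is a minimal sketch for checking this from the Hue machine, assuming the Job Server answers GET requests on /jobs (the path Hue itself requests). The helper names jobs_url and job_server_is_up are ours:

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin


def jobs_url(server_url):
    """Build the /jobs URL from the server_url value in hue.ini."""
    base = server_url if server_url.endswith("/") else server_url + "/"
    return urljoin(base, "jobs")


def job_server_is_up(server_url, timeout=3.0):
    """Return True if the Job Server answers on /jobs, False otherwise."""
    try:
        with urllib.request.urlopen(jobs_url(server_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, DNS failures and HTTP errors.
        return False


if __name__ == "__main__":
    print(job_server_is_up("http://localhost:8090/"))
```

If this prints False, check that the sbt process started with re-start is still running and that the host and port match.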

To point to your Spark cluster, edit the Job Server configuration:

vim ./job-server/src/main/resources/application.conf

Replace:

master = "local[4]"

With the Spark Master URL (you can get it from the Spark Master UI: http://SPARK-HOST:18080/):

master = "spark://localhost:7077"
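
The same replacement can be scripted. This is a minimal sketch, assuming the file contains a single quoted master = "..." setting as shown above; set_master is a hypothetical helper, not part of Spark Job Server:

```python
import re

# Matches a line such as:   master = "local[4]"
MASTER_LINE = re.compile(r'^(\s*master\s*=\s*)"[^"]*"', re.MULTILINE)


def set_master(conf_text, master_url):
    """Return conf_text with the quoted master value replaced by master_url."""
    new_text, count = MASTER_LINE.subn(
        lambda m: m.group(1) + '"' + master_url + '"', conf_text
    )
    if count == 0:
        raise ValueError("no master setting found in the configuration text")
    return new_text


example = 'spark {\n  master = "local[4]"\n}\n'
print(set_master(example, "spark://localhost:7077"))
```

To apply it, read application.conf, pass its content through set_master, write the result back, and restart the Job Server.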

Get a Spark example to run

Then follow this walk-through and create the example jar that is used in the video demo.
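
The jar can also be submitted without the UI. The sketch below only builds the HTTP requests and assumes the REST routes described in the Spark Job Server README: POST /jars/<appName> to upload a jar, and POST /jobs?appName=...&classPath=... to start a job. Please check the README for your version before relying on them. The app name "test" and the input.string setting are illustrative assumptions:

```python
import urllib.request
from urllib.parse import urlencode


def upload_jar_request(server_url, app_name, jar_bytes):
    """Build the request that uploads a jar under the given app name."""
    url = f"{server_url.rstrip('/')}/jars/{app_name}"
    return urllib.request.Request(url, data=jar_bytes, method="POST")


def run_job_request(server_url, app_name, class_path, job_config=""):
    """Build the request that starts a job from an uploaded jar."""
    query = urlencode({"appName": app_name, "classPath": class_path})
    url = f"{server_url.rstrip('/')}/jobs?{query}"
    return urllib.request.Request(url, data=job_config.encode("utf-8"), method="POST")


# Assumed values, for illustration only.
req = run_job_request(
    "http://localhost:8090/",
    "test",
    "spark.jobserver.WordCountExample",
    "input.string = a b c a b",
)
print(req.get_method(), req.full_url)
# To send it for real: urllib.request.urlopen(req)
```

A 404 "classPath ... not found" answer generally means the class named in classPath is not in the jar uploaded under that app name.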

31 Comments

  1. sohi 4 years ago

    Hi,

    I have installed the Cloudera QuickStart VM on my laptop. I started the Spark server, but I am not getting a Spark option under Query Editors.

    Please help

    Thanks and regards
    Sohi

  2. Hue Team 4 years ago

    Look for ‘app_blacklist’ on this page!

  3. Hassan 4 years ago

    I have a cluster set up with Cloudera Express. I have followed all the instructions and I still don’t see any option under Query Editors for Spark. What did I miss? I added
    [dektop]
    app_blacklist=

    to the safety valve as well.

  4. Hue Team 4 years ago

    Sorry, there was a typo, it is [desktop] and not [dektop]!

  5. Hassan 4 years ago

    Thanks, it’s amazing how one misses a simple typo 🙁

  6. umanga 3 years ago

    I have a cluster setup with cloudera express. I have followed the
    , and can see spark in the Query Editor menu.

    however, when i go there, i’m getting this :

    An error happened with the Spark Server

    HTTPConnectionPool(host=’localhost’, port=8090): Max retries exceeded with url: /jobs (Caused by : [Errno 111] Connection refused)

    On other note, i was not able to access Spark master URL @ http://SPARK-HOST:18080, i could only access history server @ http://SPARK_HOST:18088/.
    So, i just edited application.conf to master = “spark://localhost:7077”, i doubt if this port is correct.

  7. umanga 3 years ago

    *correct in my case. How can i verify this?

  8. David Magaha 3 years ago

    Is there a particular version of ClouderaManager that you need for this to work? I am using 5.1 and I have all of the Spark stuff installed and I can hit the Spark Master on 18080 but under Query Editors in Hue I don’t see the Spark stuff. I have nothing in the blacklist and have restarted everything.

    • Hue Team 3 years ago

      If you have 5.1 or later, you need to provide an empty blacklist in the Hue configuration. But this app is not very usable right now and we have a new version coming up in the next release!

  9. sudheer 3 years ago

    The Spark editor stops working when the sbt process is killed or terminated. Is there any way to run it in the background? Sorry, I’m new to Hue and Spark.

    • Hue Team 3 years ago

      No, the app is actively talking to the server so it needs to be running. We have something simpler coming out soon

  10. Kyle 3 years ago

    I’m trying to get the Hue Spark Application running on an Amazon EMR cluster running Hue 3.6. I’ve got the Job Server running on the master but don’t see the Spark Ignitor app in the menu bar. I’ve been doing some trial and error (mostly error :-)) trying to figure out the correct configuration setting for master given EMR clusters have internal and external url addresses for the master. I currently have the Spark job server running on the master node of my cluster. I would love some specific instructions for setting up Hue and the jobs server on an Amazon EMR cluster. Any suggestions would be greatly appreciated.

    • Hue Team 3 years ago

      If you are using CDH 5.2 the app is blacklisted by default and you can enable the app by adding the following to the “Hue Service Advanced Configuration Snippet” (safety valve in CM configuration for Hue):

      [desktop]
      app_blacklist=

  11. Kyle 3 years ago

    I’ve uncommented “app_blacklist=” line from the hue.ini file and restarted the hue service. I still don’t see the Spark Ignitor app. This isn’t a Cloudera version of Hadoop. I’m running on the 2.4 version of hadoop and 1.2.1 version of Spark on an Amazon EMR cluster. It comes with Hue 3.6.

  12. Jonas 3 years ago

    When I tried the word count example I got the error classPath spark.jobserver.WordCountExample not found error 404

  13. chliao 3 years ago

    I have already implemented Spark example SimpleApp.scala(https://spark.apache.org/docs/latest/quick-start.html)
    and read README.md on HDFS
    the result is …Your application has the following error(s):

    { “status”: “ERROR”, “result”: { “message”: “Ask timed out on [Actor[akka://JobServer/user/context-supervisor/097f185e-SimpleApp#-1094988048]] after [15000 ms]”, “errorClass”: “akka.pattern.AskTimeoutException”, “stack”: [“akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:333)”, “akka.actor.Scheduler$$anon$7.run(Scheduler.scala:117)”, “scala.concurrent.Future$InternalCallbackExecutor$.scala$concurrent$Future$InternalCallbackExecutor$$unbatchedExecute(Future.scala:694)”, “scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:691)”, “akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(Scheduler.scala:467)”, “akka.actor.LightArrayRevolverScheduler$$anon$8.executeBucket$1(Scheduler.scala:419)”, “akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:423)”, “akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375)”, “java.lang.Thread.run(Thread.java:701)”] } } (error 500)

    Env. CDH5.3.3 +Spark1.2 + spark-jobserver-master(version in ThisBuild := “0.5.2-SNAPSHOT”)

  14. chliao 3 years ago

    Hi Sir
    Thank you for your response ,whether Spark job Server can read the data on HDFS?
    because always “”message”: “Ask timed out on…akka.pattern.AskTimeoutException…(error 500)”
    Are there any other set for this message ?
    my env. CDH 5.3.3 + Spark1.2 + spark-jobserver-master(version in ThisBuild := “0.5.2-SNAPSHOT”)
    Hue Web UI , spark-jobserver and History Server Web UI are on the same server
    thanks.

  15. Chliao 3 years ago

    Hi Sir
    Thank you for your response. I wrote this simple code to read data from HDFS, built it with sbt package, and uploaded the jar to Spark Igniter (in Hue):
    import org.apache.spark.SparkContext
    import org.apache.spark.SparkContext._
    import org.apache.spark.SparkConf

    object readHDFS {
      def main(args: Array[String]) {
        val conf = new SparkConf().setAppName("readHDFS")
        val sc = new SparkContext(conf)
        val logFile = sc.textFile("README.md")
        logFile.foreach(println)
      }
    }

    This code runs in spark-shell (Spark Standalone) without error, and I could read README.md with the File Browser in the Hue UI beforehand.
    But when I press the Execute button in Spark Igniter, it always shows this message:
    Your application has the following error(s):
    { “status”: “ERROR”, “result”: { “message”: “Ask timed out on [Actor[akka://JobServer/user/context-supervisor/81db940c-readHDFS#1269440544]] after [10000 ms]”, “errorClass”: “akka.pattern.AskTimeoutException”, “stack”: [“akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:333)”, “akka.actor.Scheduler$$anon$7.run(Scheduler.scala:117)”, “scala.concurrent.Future$InternalCallbackExecutor$.scala$concurrent$Future$InternalCallbackExecutor$$unbatchedExecute(Future.scala:694)”, “scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:691)”, “akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(Scheduler.scala:467)”, “akka.actor.LightArrayRevolverScheduler$$anon$8.executeBucket$1(Scheduler.scala:419)”, “akka.actor.LightArrayRevolverScheduler$$anon$8.nextTick(Scheduler.scala:423)”, “akka.actor.LightArrayRevolverScheduler$$anon$8.run(Scheduler.scala:375)”, “java.lang.Thread.run(Thread.java:701)”] } } (error 500)

    I have no idea what’s wrong with this condition .

    • Hue Team 3 years ago

      It won’t work, just run your jar in the new app or do this in the Notebook

      val logFile = sc.textFile("README.md")
      logFile.foreach(println)

  16. Andy 2 years ago

    In the “Hue Service Advanced Configuration Snippet” (safety valve in CM configuration for Hue), I added
    [desktop]
    app_blacklist=
    but it reports
    “Detects potentially misconfigured. Repair and restart Hue.”
    When I remove
    [desktop]
    app_blacklist=
    it goes back to normal. What is wrong? Who can help me? This is CDH 5.7.2.

    • Hue Team 2 years ago

      This is normal, the apps that you unblacklisted are not talking to any live HBase, Livy… server
