Build a Real Time Analytic dashboard with Solr Search and Spark Streaming

Search is a great way to interactively explore your data. The Search App is continuously improving and now comes with better support for real-time data!

In this video, we are collecting tweets with Spark Streaming and directly indexing them into Solr with the Spark Solr app. Note that we are using a slightly modified version that adds more tweet information.

 

You can see the tweets rolling in! Compared to the previous version:

  • the dashboard updates its widgets only when the data changes without any page jumping
  • the dashboard can refresh itself automatically every N seconds
  • a main date filter lets you quickly select a rolling date range for the whole dashboard
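To illustrate what a rolling date range means on the Solr side, here is a minimal Python sketch that builds a filter query using Solr date math. It is only an assumption about how such a filter can be expressed, not necessarily the exact query Hue sends; the function name `rolling_range_filter` and the field name `created_at` are hypothetical.

```python
# Sketch: express "the last N units of time" as a Solr filter query (fq).
# The field name passed in must be a date field in your own collection.

# Solr date math accepts these unit names (plural forms are valid too).
UNITS = {
    "second": "SECONDS",
    "minute": "MINUTES",
    "hour": "HOURS",
    "day": "DAYS",
    "month": "MONTHS",
    "year": "YEARS",
}


def rolling_range_filter(field, amount, unit):
    """Return a Solr fq string matching documents from the last `amount` units."""
    if amount <= 0:
        raise ValueError("amount must be a positive integer")
    try:
        solr_unit = UNITS[unit.lower()]
    except KeyError:
        raise ValueError("unsupported unit: {0}".format(unit))
    # NOW is evaluated by Solr at query time, so the window "rolls" forward
    # every time the dashboard re-runs the query.
    return "{0}:[NOW-{1}{2} TO NOW]".format(field, amount, solr_unit)
```

For example, `rolling_range_filter("created_at", 7, "day")` produces a filter covering the last seven days. Because `NOW` is resolved on each request, combining such a filter with a periodic refresh keeps the window moving with the incoming data.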

 

Tweets coming in

 

Instructions
Download a nightly Solr 5.x, uncompress it and start it:

bin/solr start -cloud
bin/solr create -c tweets

Then compile the Spark Solr app.
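If you want to check that the tweets collection accepts documents before wiring up Spark Streaming, the following Python sketch builds a request for Solr's JSON update handler. It is not the Spark Solr app's code: the helper names (`tweet_to_doc`, `build_update_request`), the tweet field names, and the default URL `http://localhost:8983/solr` are assumptions for illustration.

```python
import json
import urllib.request


def tweet_to_doc(tweet):
    """Map a tweet-like dict to a flat Solr document (field names are hypothetical)."""
    return {
        "id": str(tweet["id"]),
        "text": tweet["text"],
        "user": tweet["user"],
        "created_at": tweet["created_at"],  # ISO 8601 UTC, e.g. 2015-04-05T14:49:01Z
    }


def build_update_request(docs, collection="tweets",
                         base_url="http://localhost:8983/solr"):
    """Build (but do not send) a POST to the collection's /update handler."""
    # commit=true makes the documents searchable right away; convenient for
    # a quick test, but too expensive to do on every request in production.
    url = "{0}/{1}/update?commit=true".format(base_url.rstrip("/"), collection)
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To actually send a document, pass the result to `urllib.request.urlopen(...)` while Solr is running, for example `urllib.request.urlopen(build_update_request([tweet_to_doc(sample)]))` where `sample` is a dict with the four keys above. Whether the fields are accepted depends on the schema of your collection.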

Enable the analytic widgets in hue.ini:

[search]
latest=true

Sum-up

There are other ways to index data in near real time, but we took this approach because the scenario worked out of the box with just Spark Streaming and the Solr app. Next time, we will preview the new Analytics Features of Solr 5.2 and show how we can use Python Spark to index some data!

As usual, feel free to comment on the hue-user list or @gethue!

35 Comments

  1. Blair Krotenko 2 years ago

    Hello Hue Team,

    I’m trying to work through this tutorial but I’m running into a few issues.

    When I click the drop-down to select a search index, I don’t see tweets_shard1_replica1. However, if I check the box for Show cores, I do see one called twitter_demo_shard1_replica1. But, that dataset only has 16 fields compared to the 42 in the demo, and the field ‘hashtags’ is not included. Also, the global time filter does not appear when I select this dataset. Do I need to install this dataset? I did install all application examples from step 2 of the Hue Quick Start Wizard.

    I’m using Hue 3.7.0 from the quickstart v. 5.4.0 VMware vm.

    Thanks,
    Blair

    • Hue Team 2 years ago

      The global time filter is in Hue 3.9 which is not released yet, but you should not need it.

      If you are seeing only 16 fields, this is probably because not all the dynamic fields were fetched when you opened the dashboard. Could you just reload the page?

      • Blair Krotenko 2 years ago

        I tried a few things to get the dynamic fields to appear, but was not successful. I tried reloading the page, creating a new dashboard, re-installing the application examples, and deleting and re-importing the VM from the downloaded zip file. Also, the counter button does not exist in my version.

        Thanks,
        Blair

        • Hue Team 2 years ago

          Yes, counters are coming in Hue 3.9, which is not released yet, but they are in master.

          When you click on a result row, do you see any dynamic field values there?

  2. Victor 1 year ago

    Hello there,

    We’re trying to create a dynamic dashboard with Solr Search, using a file to load some information and having Hue read it and create the charts. But the panel that we are using does not refresh, even with the auto-refresh checkbox checked. The values from the index remain static…

    Is there something that we didn’t notice?

    Thanks

    • Author
      Hue Team 1 year ago

      I just checked on http://demo.gethue.com/search/?collection=14 and it works if you check and uncheck the box after picking the time.

      • Victor 1 year ago

        Sorry. When I wrote that message I forgot to mention that we are trying to add some values to an existing index. Does that demo do something like that?

        • Author
          Hue Team 1 year ago

          Yes, don’t you see in the video the new tweets coming in?

          • Victor 1 year ago

            Yes, but when we run our Spark application, which updates a file used by the index, the workbook doesn’t update the charts and the tables. Could you release the Spark code to us? Thanks

          • Author
            Hue Team 1 year ago

            The Spark code is already listed: https://github.com/romainr/spark-solr

            What you might be missing is the step at 3:50 in the video that triggers the automatic refresh of the dashboard.

  3. Khaled Idriss 12 months ago

    Hello, how can I use metrics in Hue dashboards? I want to create a bar chart whose Y-axis uses Sum instead of the usual Count.

  4. Poshita Singh 8 months ago

    Will Ubuntu 16.04 support Hue?

    • Author
      Hue Team 8 months ago

      I just tried and a freshly installed 16.04 supports Hue. The steps I did:

      sudo add-apt-repository ppa:webupd8team/java
      sudo apt-get update
      sudo apt-get install oracle-java8-installer

      sudo apt-get install ant gcc g++ libffi-dev libkrb5-dev libmysqlclient-dev libsasl2-dev libsasl2-modules-gssapi-mit libsqlite3-dev libssl-dev libxml2-dev libxslt-dev make maven libldap2-dev python-dev python-setuptools libgmp3-dev libz-dev

      git clone https://github.com/cloudera/hue.git
      cd hue
      make apps
      build/env/bin/hue runserver

  5. Fred 7 months ago

    Hi All,
    I use Hue 3.11. Everything seems to work fine when building a dashboard, but I can’t see the global time filter to auto-refresh indexes. I looked for a parameter in hue.ini but didn’t find one. Could you help me please?

    • Author
      Hue Team 7 months ago

      Hi, you will need to have at least one field defined as date/time in your Solr collection to enable that feature

  6. Fred 7 months ago

    Thanks for your very fast answer. I already have a date/time field in my Solr collection:
    ex :
    But I cannot see the option to schedule the collection’s auto-refresh (every 5 min, 10 min, …)

  7. Fred 7 months ago

    Sorry, my sample disappeared when I submitted my post:
    field name="PRE_DATE_EXPED" type="tdates"

  8. Fred 7 months ago

    Thanks, but I tried your solution, creating a new collection with “tdate” type fields. No change. Maybe I’m not being clear: I’d just like to automatically refresh the dashboard after new data is injected. But I can’t see the “time panel” at the top of the Hue Search screen (near the collection name). I’ve seen some demos and videos where this is done automatically and where the user can change the refresh schedule. For completeness, I use Hue 3.11 and Solr 6.4.1. Thanks again.

  9. Fred 7 months ago

    Please find a sample of my data below. But once again, I just want to access the “time panel” to automatically refresh my collection, not to enable date filtering.
    File Header
    PRE_ID_DO,PRE_ID_ORDER,PRE_ID_PREPA,PRE_ID_ORDER_CLIENT,PRE_ID_DEST,PRE_ADR_LIV1,PRE_ADR_LIV2,PRE_ADR_LIV3,PRE_ADRESSE_IP,PRE_INST_EXPED,PRE_LIV_ZIP,PRE_LIV_VILLE,PLI_ID_PROD,PRO_LITERAL,PLI_QTY_ORDER,PRE_DATE_EXPED,PRE_ID_STATUS,LON_coordinate,LAT_coordinate
    Data sample :
    Line 1 :
    TFI,10680785,177983731,0000088/DEMO CH DAUMEZON FLEURY LES AUBRAIS 385S,447950,1 ROUTE DE CHANTEAU,BP 62016, ,,LIVRAISON URGENTE,45402,FLEURY LES AUBRAIS CEDEX,6ZTUGAP1002,SCAN INSTALLATION – PRISE EN MAIN ADMINISTRATEUR S,1,2017-04-05T14:49:01Z,CC,1.9066718,47.9267019
    Line 2 :
    TFI,10680785,177983731,0000088/DEMO CH DAUMEZON FLEURY LES AUBRAIS 385S,447950,1 ROUTE DE CHANTEAU,BP 62016, ,,LIVRAISON URGENTE,45402,FLEURY LES AUBRAIS CEDEX,6B000000738,E-STUDIO385S (DP-3850S-MJD),1,2017-04-05T14:49:01Z,CC,1.9066718,47.9267019
    Line 3 :
    TFI,10680785,177983731,0000088/DEMO CH DAUMEZON FLEURY LES AUBRAIS 385S,447950,1 ROUTE DE CHANTEAU,BP 62016, ,,LIVRAISON URGENTE,45402,FLEURY LES AUBRAIS CEDEX,6BTFMCSTECH,CONNEXION PAR LE SAV LOCAL,1,2017-04-05T14:49:01Z,CC,1.9066718,47.9267019
    Line 4 :
    TFI,10680785,177983731,0000088/DEMO CH DAUMEZON FLEURY LES AUBRAIS 385S,447950,1 ROUTE DE CHANTEAU,BP 62016, ,,LIVRAISON URGENTE,45402,FLEURY LES AUBRAIS CEDEX,6ZTUGAP1001,PRINT INSTALLATION – PRISE EN MAIN ADMINISTRATEUR,1,2017-04-05T14:49:01Z,CC,1.9066718,47.9267019
    Line 5 :
    TFI,10680788,177983733,0000090/DEMO CH DAUMEZON FLEURY LES AUBRAIS 2000AC,447950,1 ROUTE DE CHANTEAU,BP 62016, ,,LIVRAISON URGENTE,45402,FLEURY LES AUBRAIS CEDEX,6ZTUGAP1002,SCAN INSTALLATION – PRISE EN MAIN ADMINISTRATEUR S,1,2017-04-05T14:49:01Z,CC,1.9066718,47.9267019
    Line 6 :
    TFI,10680788,177983733,0000090/DEMO CH DAUMEZON FLEURY LES AUBRAIS 2000AC,447950,1 ROUTE DE CHANTEAU,BP 62016, ,,LIVRAISON URGENTE,45402,FLEURY LES AUBRAIS CEDEX,6ZTUGAP1001,PRINT INSTALLATION – PRISE EN MAIN ADMINISTRATEUR,1,2017-04-05T14:49:01Z,CC,1.9066718,47.9267019
    Line 7 :
    TFI,10680788,177983733,0000090/DEMO CH DAUMEZON FLEURY LES AUBRAIS 2000AC,447950,1 ROUTE DE CHANTEAU,BP 62016, ,,LIVRAISON URGENTE,45402,FLEURY LES AUBRAIS CEDEX,6BTMARIANNE,INSTALLATION FONCTIONNALITE MARIANNE,1,2017-04-05T14:49:01Z,CC,1.9066718,47.9267019
    Line 8 :
    TFI,10680788,177983733,0000090/DEMO CH DAUMEZON FLEURY LES AUBRAIS 2000AC,447950,1 ROUTE DE CHANTEAU,BP 62016, ,,LIVRAISON URGENTE,45402,FLEURY LES AUBRAIS CEDEX,6BTFMCSTECH,CONNEXION PAR LE SAV LOCAL,1,2017-04-05T14:49:01Z,CC,1.9066718,47.9267019

    • Author
      Hue Team 6 months ago

      So I am using master Hue and I see the time panel. If you leave the filter set to Rolling/'All', it should not apply any time filtering (and you can check the live update).

  10. barış 6 months ago

    Hi,

    We have a similar app. The three download buttons (Excel, CSV, …) don’t appear in our dashboard’s HTML or grid views. How can we enable these buttons, as on the page below?
    https://www.cloudera.com/documentation/enterprise/5-5-x/topics/search_use_hue_search.html

    Thanks.

    • Author
      Hue Team 6 months ago

      Which version of Hue are you running?

      • barış 6 months ago

        3.10
        thanks
