Build a Real-Time Analytics Dashboard with Solr Search and Spark Streaming

Published on 21 May 2015 in Tutorial - 2 minutes read - Last modified on 06 March 2021

Search is a great way to interactively explore your data. The Search App is continuously improving and now comes with better support for real-time data!

In this video, we are collecting tweets with Spark Streaming and directly indexing them into Solr with the Spark Solr app. Note that we are using a slightly modified version that adds more tweet information.

You can see the tweets rolling in! Compared to the previous version:

  • the dashboard updates its widgets only when the data changes, without any page jumping
  • the dashboard can refresh itself automatically every N seconds
  • a main date filter lets you quickly select a rolling date range for the whole dashboard
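
To make the rolling date range concrete, here is a minimal Python sketch of how such a filter could be expressed as a Solr filter query using Solr date math. It is not how the Search App builds its queries internally; the field name `created_at`, the helper names, and the default local URL are assumptions for illustration.

```python
from urllib.parse import urlencode

# Assumption: Solr is running locally on its default port.
SOLR_URL = "http://localhost:8983/solr"


def rolling_window_params(date_field, window, rows=10):
    """Return select parameters limited to the last `window`, e.g. '1HOUR' or '7DAYS'."""
    return {
        "q": "*:*",
        # Solr date math: NOW is evaluated by Solr when the query runs.
        "fq": f"{date_field}:[NOW-{window} TO NOW]",
        "sort": f"{date_field} desc",
        "rows": rows,
        "wt": "json",
    }


def select_url(collection, params):
    """Build the URL of a select request against the given collection."""
    return f"{SOLR_URL}/{collection}/select?{urlencode(params)}"


# Hypothetical field name; use whichever date field your tweets are indexed with.
params = rolling_window_params("created_at", "1HOUR")
url = select_url("tweets", params)
print(url)
```

Because `NOW` is resolved at query time, re-issuing the same request every N seconds gives a window that slides forward on its own, which is roughly what an auto-refreshing dashboard with a rolling filter needs.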

Tweets coming in

Instructions

Download a nightly Solr 5.x, uncompress it and start it:



bin/solr start -cloud

bin/solr create -c tweets
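
Before wiring up Spark Streaming, it can help to check that the new collection accepts documents. The sketch below builds a JSON update request for the `tweets` collection using only Python's standard library. The sample fields, the helper name, and the default local URL are assumptions, and nothing is sent unless you uncomment the last line.

```python
import json
from urllib.request import Request, urlopen


def build_update_request(collection, docs, base_url="http://localhost:8983/solr"):
    """Build a POST request that sends `docs` to the collection's update handler."""
    body = json.dumps(docs).encode("utf-8")
    return Request(
        # commit=true makes the documents searchable right away (fine for a smoke test).
        f"{base_url}/{collection}/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
    )


# Hypothetical sample document; this assumes the collection accepts these fields.
sample_docs = [{"id": "test-1", "text": "hello solr"}]
req = build_update_request("tweets", sample_docs)
print(req.get_method(), req.full_url)

# Requires the Solr instance started above:
# print(urlopen(req).read().decode("utf-8"))
```

If the request succeeds, the sample document should show up in the dashboard and can be deleted afterwards.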

Then compile the Spark Solr app.

Enable the analytic widgets in hue.ini:

[search]

latest=true

Sum-up

There are other ways to index data in near real time, but we took this approach because the scenario worked out of the box with just Spark Streaming and the Spark Solr app. Next time, we will preview the new analytics features of Solr 5.2 and show how to use Python Spark to index some data!

As usual, feel free to comment on the hue-user list or @gethue!

