Build a Real Time Analytic dashboard with Solr Search and Spark Streaming

Published on 21 May 2015 in Querying / Tutorial - 2 minutes read - Last modified on 04 February 2020

Search is a great way to interactively explore your data. The Search App is continuously improving and now comes with better support for real-time data!

In this video, we are collecting tweets with Spark Streaming and directly indexing them into Solr with the Spark Solr app. Note that we are using a slightly modified version that adds more tweet information.

 

You can see the tweets rolling in! Compared to the previous version:

  • the dashboard updates its widgets only when the data changes, without any page jumping
  • the dashboard can refresh itself automatically every N seconds
  • a main date filter lets you quickly select a rolling date range for the whole dashboard
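
To make the rolling date range concrete, here is a minimal sketch of how such a window can be expressed as a Solr filter query using Solr date math. This is only an illustration, not necessarily how the dashboard builds its queries: the field name `created_at` and the helper names are assumptions.

```python
from urllib.parse import urlencode

# Date-math units that Solr accepts in expressions such as NOW-15MINUTES.
VALID_UNITS = {"SECONDS", "MINUTES", "HOURS", "DAYS", "MONTHS", "YEARS"}


def rolling_range_fq(field, amount, unit):
    """Return a Solr filter query matching the last `amount` units up to now.

    `field` must be a date field in the collection; `created_at` below is
    a hypothetical name, not one confirmed by the tutorial.
    """
    if unit not in VALID_UNITS:
        raise ValueError(f"unsupported date-math unit: {unit}")
    if amount <= 0:
        raise ValueError("amount must be positive")
    return f"{field}:[NOW-{amount}{unit} TO NOW]"


def select_params(field, amount, unit, rows=10):
    """Build the URL-encoded query string for a /select request."""
    params = {"q": "*:*", "fq": rolling_range_fq(field, amount, unit), "rows": rows}
    return urlencode(params)


fq = rolling_range_fq("created_at", 15, "MINUTES")
query_string = select_params("created_at", 15, "MINUTES")
```

Because `NOW` is resolved by Solr on every request, re-running the same query on each automatic refresh slides the window forward without the client having to compute timestamps.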

 

Tweets coming in

 

Instructions

Download a nightly Solr 5.x, uncompress it and start it:



bin/solr start -cloud

bin/solr create -c tweets

Then compile the Spark Solr app.
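
Before starting the streaming job, it can help to check that the tweets collection accepts documents. The sketch below builds (but does not send) a request to Solr's JSON update handler. It assumes Solr listens on the default localhost:8983, and the document fields are hypothetical placeholders rather than the fields the Spark Solr app actually writes.

```python
import json
import urllib.request

# Assumption: Solr runs locally on its default port and the collection
# is named "tweets", as created above.
SOLR_UPDATE_URL = "http://localhost:8983/solr/tweets/update?commit=true"


def build_update_request(docs, url=SOLR_UPDATE_URL):
    """Build a POST request carrying `docs` as a JSON array of documents."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical test document; field names are for illustration only.
sample_docs = [{"id": "test-1", "text": "hello from a test tweet"}]
request = build_update_request(sample_docs)

# To actually send it (requires a running Solr):
#     with urllib.request.urlopen(request) as response:
#         print(response.status)
```

Committing on every request is fine for a quick test like this; a streaming job would typically batch documents and commit less often.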

Enable the analytic widgets in hue.ini:

[search]

latest=true

Sum-up

There are other ways to index data in near real time, but we took this approach because the scenario works out of the box with just Spark Streaming and the Solr app. Next time, we will preview the new Analytics Features of Solr 5.2 and show how we can use Python Spark to index some data!

As usual, feel free to comment on the hue-user list or @gethue!

