Build a Real Time Analytic dashboard with Solr Search and Spark Streaming

Search is a great way to interactively explore your data. The Search App is continuously improving and now comes with better support for real time data!

In this video, we are collecting tweets with Spark Streaming and directly indexing them into Solr with the Spark Solr app. Note that we are using a slightly modified version that adds more tweet information.

 

You can see the tweets rolling in! Compared to the previous version:

  • the dashboard updates its widgets only when the data changes, without any page jumping
  • the dashboard can refresh itself automatically every N seconds
  • a main date filter lets you quickly select a rolling date range for the whole dashboard
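
To give an idea of what a rolling date range looks like on the Solr side, here is a minimal sketch that builds the query parameters for "the last 10 minutes" using Solr date math. This is only an illustration, not how the dashboard is implemented: the field name created_at and the helper rolling_range_query are assumptions made for the example.

```python
from urllib.parse import urlencode


def rolling_range_query(field, amount, unit, rows=10):
    """Build Solr select parameters restricted to a rolling window ending now.

    `field` is a hypothetical date field name; adapt it to your schema.
    `unit` is a Solr date math unit such as SECONDS, MINUTES, HOURS or DAYS.
    """
    # Solr date math: NOW-10MINUTES means "ten minutes before the request time",
    # so the window moves forward every time the query is re-sent.
    fq = f"{field}:[NOW-{amount}{unit} TO NOW]"
    return urlencode({"q": "*:*", "fq": fq, "rows": rows, "wt": "json"})


# Query string for "all tweets of the last 10 minutes" (field name assumed).
params = rolling_range_query("created_at", 10, "MINUTES")
print(params)
```

Because the range is expressed relative to NOW, re-sending the same parameters every N seconds is enough to keep the window rolling.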

 

Tweets coming in

 

Instructions
Download a nightly Solr 5.x, uncompress it and start it:

bin/solr start -cloud
bin/solr create -c tweets
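
Before wiring in Spark Streaming, it can be handy to check that the new collection accepts documents. The sketch below prepares a JSON update request for the tweets collection with the Python standard library only. It assumes Solr listens on its default port 8983 on localhost; the sample fields (id, text, user) and the helper names are made up for the example and are not the schema used by the Spark Solr app.

```python
import json
from urllib.request import Request, urlopen

# Assumption: Solr runs locally on its default port, with the "tweets"
# collection created by the commands above.
SOLR_UPDATE_URL = "http://localhost:8983/solr/tweets/update?commit=true"


def build_index_request(docs, url=SOLR_UPDATE_URL):
    """Prepare (but do not send) a POST request that indexes a list of documents."""
    body = json.dumps(docs).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request; requires the Solr instance to be up."""
    with urlopen(request) as response:
        return response.status


# Hypothetical sample document; the field names are illustrative only.
sample_docs = [{"id": "1", "text": "Hello from the dashboard", "user": "gethue"}]
request = build_index_request(sample_docs)
print(request.get_method(), request.full_url)
```

Calling send(request) while Solr is running should index the sample document, which can then be looked up from the dashboard.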

Then compile the Spark Solr app.

Enable the analytic widgets in hue.ini:

[search]
latest=true

Sum-up

There are other ways to index data in near real time, but we took this approach because the scenario works out of the box with just Spark Streaming and the Solr app. Next time, we will preview the new Analytics Features of Solr 5.2 and show how we can use Python Spark to index some data!

As usual, feel free to comment on the hue-user list or @gethue!