Search is a great way to interactively explore your data. The Search App is continuously improving and now comes with better support for real-time data!
In this video, we are collecting tweets with Spark Streaming and directly indexing them into Solr with the Spark Solr app. Note that we are using a slightly modified version that adds more tweet information.
You can see the tweets rolling in! Compared to the previous version:
- the dashboard updates its widgets only when the data changes, without any page jumping
- the dashboard can refresh itself automatically every N seconds
- a main date filter lets you quickly select a rolling date range for the whole dashboard
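To picture what a rolling date range means on the Solr side, here is a minimal sketch that builds a select URL using Solr date math. It is only an illustration, not necessarily the query Hue sends: the `created_at` field name, the `rolling_range_query` helper and the default local URL are assumptions.

```python
from urllib.parse import urlencode


def rolling_range_query(field, window,
                        base_url="http://localhost:8983/solr/tweets/select"):
    """Build a select URL limited to the last `window` of data.

    `window` is a Solr date math span such as "1HOUR" or "7DAYS".
    `field` must be a date field in the collection (assumed here).
    """
    params = {
        "q": "*:*",
        # NOW is evaluated by Solr at query time, so the range keeps rolling
        "fq": "%s:[NOW-%s TO NOW]" % (field, window),
        "wt": "json",
    }
    return base_url + "?" + urlencode(params)


url = rolling_range_query("created_at", "1HOUR")
```

Because `NOW` is resolved on every request, re-running the same query on each automatic refresh always covers the most recent hour.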
Instructions
Download a nightly Solr 5.x build, uncompress it, start it in cloud mode and create a tweets collection:
bin/solr start -cloud
bin/solr create -c tweets
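Before wiring in Spark Streaming, it can help to check that the new collection accepts documents. The sketch below posts JSON to the collection's update handler with Python's standard library. It assumes Solr listens on its default port 8983 on localhost and that the collection's schema accepts the example fields; the helper names and the sample document are hypothetical.

```python
import json
from urllib.request import Request, urlopen

# Assumed location of the collection created above (default Solr port)
SOLR_UPDATE_URL = "http://localhost:8983/solr/tweets/update"


def build_update_request(docs, commit_within_ms=1000):
    """Prepare a JSON update request for a list of documents.

    commitWithin asks Solr to make the documents searchable within the
    given number of milliseconds, which suits near real time indexing.
    """
    body = json.dumps(docs).encode("utf-8")
    url = "%s?commitWithin=%d&wt=json" % (SOLR_UPDATE_URL, commit_within_ms)
    return Request(url, data=body,
                   headers={"Content-Type": "application/json"})


def index_docs(docs):
    """Send the documents to Solr and return the parsed JSON response."""
    with urlopen(build_update_request(docs)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With Solr running, calling `index_docs([{"id": "1", "text": "hello from Hue"}])` should make the test document show up in the dashboard shortly after.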
Then compile the Spark Solr app.
Enable the analytic widgets in hue.ini:
[search]
latest=true
Sum-up
There are other ways to index data in near real time, but we took this approach because the scenario worked out of the box with just Spark Streaming and the Solr app. Next time, we will preview the new Analytics Features of Solr 5.2 and show how to use Python Spark to index some data!
As usual, feel free to comment on the hue-user list or @gethue!