Bay Area BikeShare Data Analysis with Search and Spark Notebook

Published on 07 July 2015 in Tutorial - 1 minute read - Last modified on 06 March 2021

In this tutorial, we use public data from Bay Area BikeShare to visualize bike trip patterns and their riders, and to better understand how the platform is used. Hue provides a Dynamic Search dashboard as well as the new Spark Notebook for enriching the data.

We recommend starting with the Trip dataset from http://www.bayareabikeshare.com/datachallenge and indexing it into Solr. For the impatient, we provide a subset of trips ready to be indexed, as well as the weather data to be processed later with Spark. The Search Dashboard can be downloaded here; the Notebook can be downloaded and imported with Hue 3.9, or simply copied and pasted.

This demo, combined with real-time Spark Streaming, has been presented at conferences such as Hadoop Summit and Big Data Day LA.

Happy Biking!

Example of an interactive dashboard created by drag and drop

As usual, feel free to comment on the hue-user list or @gethue!

Tip

A quick way to index the data with Solr:



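# Create the "bikes" collection
bin/solr create_collection -c bikes

# Base URL of the local Solr instance
URL=http://localhost:8983/solr

# Update endpoint; Solr commits the new documents within 5 seconds
u="$URL/bikes/update?commitWithin=5000"

# Post the CSV file to the update endpoint
curl "$u" --data-binary @/home/test/index_data.csv -H 'Content-type:text/csv'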
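To check that the trips were actually indexed, one option is to ask Solr how many documents the collection holds. The sketch below assumes the same local Solr URL and "bikes" collection as the tip above; the count_docs helper name is hypothetical and only used for illustration.

```shell
# Assumed values, matching the indexing tip above
URL=http://localhost:8983/solr
COLLECTION=bikes

# Match every document (q=*:*) but return no rows (rows=0);
# the total is reported in the "numFound" field of the JSON response
count_url="$URL/$COLLECTION/select?q=*:*&rows=0&wt=json"

# Hypothetical helper: fetch the response from a running Solr instance
count_docs() {
  curl -s "$count_url"
}
```

Calling count_docs against a running Solr prints the JSON response. Because the update request uses commitWithin=5000, newly posted documents may take up to about five seconds to appear in the count.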

