Real time data querying is becoming a modern standard. Who wants to wait until the next day or week when needing to take decision now?
The stream of data comes from an Apache Kafka topic which can be queried via ksqlDB.
Components
To keep things simple, all the pieces have been put together in a “one-click” Docker Compose project which contains:
- ksqlDB from the ksqlDB quickstart
- A Hue Editor already configured with the ksqlDB Editor
One-line setup
For fetching the configurations and starting everything:
mkdir stream-sql-demo
cd stream-sql-demo
wget https://raw.githubusercontent.com/romainr/query-demo/master/stream-sql-demo/docker-compose.yml
docker-compose up -d
>
Creating network "stream-sql-demo_default" with the default driver
Creating hue-database ... done
Creating stream-sql-demo_jobmanager_1 ... done
Creating stream-sql-demo_mysql_1 ... done
Creating ksqldb-server ... done
Creating stream-sql-demo_zookeeper_1 ... done
Creating flink-sql-api ... done
Creating stream-sql-demo_taskmanager_1 ... done
Creating hue ... done
Creating ksqldb-cli ... done
Creating stream-sql-demo_kafka_1 ... done
Creating stream-sql-demo_datagen_1 ... done
Then the Hue Editor will be up at http://localhost:8888/.
As well as the ksqlDB API:
curl http://localhost:8088/info
> {"KsqlServerInfo":{"version":"0.12.0","kafkaClusterId":"DJzUX-zaTDCC5lqfVwf8kw","ksqlServiceId":"default_","serverStatus":"RUNNING"}}
For stopping everything:
docker-compose down
Query Experience
Notice that the Live SQL requires the New Editor which is in beta. In addition to soon offer multiple statements running at the same time on the same editor page and more robustness, it also bring the live result grid.
More improvements are on the way, in particular in the SQL autocomplete and Editor 2. In the future, the Task Server with Web Sockets will allow long running queries to run as separate tasks and prevent them from timing-out in the API server.
Note
In case you have an existing Hue Editor and want to point to the ksqlDB, just activate it via this config change:
[desktop]
enable_hue_5=true
[[interpreters]]
[[[ksqlDB]]]
name=ksqlDB
interface=ksql
options='{"url": "http://localhost:8088"}'
ksql
One nicety of ksqDB is its close integration with Kafka, for example we can list the topics:
SHOW TOPICS
The SQL syntax is a bit different but here is one way to create a similar table as above:
CREATE STREAM user_behavior (
user_id BIGINT,
item_id BIGINT,
category_id BIGINT,
behavior STRING,
ts STRING
) WITH (kafka_topic='user_behavior', value_format='json', partitions=1)
And peek at it:
SELECT *
FROM user_behavior
EMIT CHANGES
LIMIT 30
In another statement within Hue's Editor or by booting the SQL shell:
docker exec -it ksqldb-cli ksql http://ksqldb-server:8088
You can also insert your own records and notice the live updates of the results:
INSERT INTO user_behavior (
user_id ,
item_id ,
category_id ,
behavior ,
ts
)
VALUES
(1, 10, 20, 'buy', '1602998392')
In the next episodes, we will demo how to easily create tables directly from raw streams of data via the importer.
Any feedback or question? Feel free to comment here!
All these projects are also open source and welcome feedback and contributions. In the case of the Hue editor the Forum or Github issues are good places for that.
Onwards!
Romain