scala – Loïc Bélec

Making kdb+ work with Apache Spark

kdb+, spark

In his blog, Hugh Hyndman points out how well the kdb+ tabular format fits with Apache Spark, and he has created a Spark data source for kdb+ to take advantage of that fit. In this post, we will test his work in a simple way before opening the door to a distributed system.

This data source makes Spark a powerful complement to kdb+. kdb+ can be limiting because it does not scale horizontally, so the bottleneck is often the hardware of a single machine. Using Apache Spark alongside kdb+ can take some of the workload off your host machine and makes distributed computing possible.