Record Linkage, a real use case with Spark ML
I participated in a project for a leading insurance company where I implemented a Record…
Apache Spark: MapReduce and RDD manipulations with keys
In a previous article [https://test-ippon.ghost.io/intro-mapreduce-spark/], we saw that Apache Spark allows…
Intro to MapReduce operations with Spark
In the previous post, we used the Map operation, which allows us to transform values…
Spark [http://spark.apache.org/] is a tool intended to process large volumes of data…
From development to production with Vagrant and Packer
Have you heard of Vagrant? Vagrant [http:…