Record Linkage, a real use case with Spark ML

I participated in a project for a leading insurance company where I implemented a Record…


Feb 22, 2016 10 min read

Alexis Seigneurin

Apache Spark: MapReduce and RDD manipulations with keys

In a previous article [https://test-ippon.ghost.io/intro-mapreduce-spark/], we saw that Apache Spark allows…


Dec 30, 2014 5 min read

Alexis Seigneurin

Intro to MapReduce operations with Spark

In the previous post, we used the Map operation, which allows us to transform values…


Nov 22, 2014 3 min read

Alexis Seigneurin

Introduction to Apache Spark

Spark [http://spark.apache.org/] is a tool intended to process large volumes of data…


Nov 11, 2014 5 min read

Alexis Seigneurin

From development to production with Vagrant and Packer

Have you heard of Vagrant? Vagrant [http:…


Apr 14, 2014 12 min read

Alexis Seigneurin