My journey with Kotlin - Part 2: Introduction to Kotlin
This is the second post of a 2-part introduction to Kotlin, a programming language for…
My journey with Kotlin - Part 1: How I came to dislike Java
This is the first post of a 2-part introduction to Kotlin, a programming language for the JVM. If…
Why Kafka Streams didn't work for us? - Part 3
This is the third and final post in this series, in which I…
Why Kafka Streams didn't work for us? - Part 2
This is the second post in this series, in which I explain why,…
Why Kafka Streams didn't work for us? - Part 1
This is a post in 3 parts in which I explain how we started a…
Incrementally loaded Parquet files
In this post, I explore how you can leverage Parquet [https://parquet.apache.org/] when…
Spark Summit East 2017 - A summary
I attended Spark Summit East 2017 last week. This 2-day conference - February 8th…
Kafka Streams - Scaling up or down
Kafka Streams is a new component of the Kafka platform. It is a lightweight library…
Strata+Hadoop World 2016 in New York
Strata+Hadoop World is probably the most important conference about Data Science and Data Engineering.…
A team of five Ipponites hungry for knowledge descended upon a bustling Capital One [https:…
Spark - Calling Scala code from PySpark
In a previous post [https://test-ippon.ghost.io/spark-kafka-achieving-zero-data-loss/], I demonstrated how to consume a…
Spark & Kafka - Achieving zero data-loss
Kafka and Spark Streaming are two technologies that fit well together. Both are distributed systems…
Kafka, Spark and Avro - Part 3 of 3, Producing and consuming Avro messages
This post is the third and last post in a series in which we learn…
Kafka, Spark and Avro - Part 2 of 3, Consuming Kafka messages with Spark
This post is the second post in a series in which we will learn how…
Kafka, Spark and Avro - Part 1 of 3, Kafka 101
This post is the first in a series of posts in which we will learn…