Process CSVs from Amazon S3 using Apache Flink, JHipster, and Kubernetes
Apache Flink [https://flink.apache.org/] is one of the latest distributed Big Data frameworks…
Confluent & Twitter4j Tutorial
Reading a Real-Time stream of Tweets into Kafka
Kafka is an amazing tool for processing…
This post demonstrates a cost-effective, automated solution for running Spark jobs on an EMR cluster daily using CloudWatch, Lambda, EMR, S3, and SNS.…
In our previous video on the basics of Nifi [https://test-ippon.ghost.io/basics-of-apache-nifi], we…
Performance Tweaking Apache Spark
Apache Spark Streaming applications need to be monitored frequently to be certain that they are…
In our previous article on Nifi [https://test-ippon.ghost.io/why-nifi-2], we discussed the history,…