Total 64 Posts

Data

Coalesce 2022 - The Analytics Engineering Conference hosted by dbt Labs (Recap)

Coalesce 2022 is dbt Labs' analytics engineering conference. For its third year, from October…


Jan 18, 2023 7 min read

Pooja Krishnan

Data

Data+AI Summit 2022 - Top Announcements and Recap

Data+AI Summit 2022 [https://databricks.com/dataaisummit/] is the world’s largest gathering among…


Jul 07, 2022 3 min read

Theo LEBRUN

Data

A Primer on Snowflake Stored Procedures

Snowflake is a data warehouse-as-a-service hosted completely in the cloud. For a Snowflake Primer, take…


Jul 05, 2022 14 min read

Pooja Krishnan

Cloud

Data Basics for Life-Long Software Engineers

Having recently made the switch from software to data engineering, I learned there are many…


Jan 18, 2022 3 min read

Hector Sanchez

Azure

Event-Driven Architecture: Getting Started with Kafka (Part 2)

Event-driven architecture is a paradigm that is increasingly common in modern microservices-based systems. It promises an architecture that is more flexible and more responsive to business events, with better technical decoupling. Let's see how to build one with Kafka.…


Nov 02, 2021 8 min read

Jean-François SIMON

Event-Driven

Event-Driven Architecture: Getting Started with Kafka (Part 1)

Event-driven architecture is a paradigm that is increasingly common in modern microservices-based systems. It promises an architecture that is more flexible and more responsive to business events, with better technical decoupling. Let's see how to build one with Kafka.…


Oct 26, 2021 7 min read

Jean-François SIMON

Event-Driven

A Beginner’s Guide to InfluxDB: A Time-Series Database

A time series database (TSDB) is specifically made for data that can be evaluated as…


Jun 29, 2021 4 min read

Ketki V Deshpande

Data

Data Hackathon Recap

Is the Holiday Spirit Contagious? During Ippon's first Data Hackathon in December 2020,…


Feb 12, 2021 3 min read

Ramya Shetty

Data

Process CSVs from Amazon S3 using Apache Flink, JHipster, and Kubernetes

Apache Flink [https://flink.apache.org/] is one of the latest distributed Big Data frameworks…


Feb 04, 2021 6 min read

Theo LEBRUN

Data Streaming

Use Stargate by DataStax to effortlessly store and query your data

Stargate [https://stargate.io/] is one of the latest shiny tools from DataStax [https://www.…


Jan 15, 2021 5 min read

Theo LEBRUN

Cassandra

Tips and Tricks for Manually Scaling a Global DynamoDB Table from an AWS Lambda

Objective: Write an AWS Lambda that manually scales a global DynamoDB table. Why? DynamoDB tables…


Dec 01, 2020 3 min read

Dennis Sharpe

AWS

ABC's of DQM: Control

This is the finale of a 3-part series introducing a Data Quality Management (DQM) framework…


Aug 31, 2020 8 min read

Dan Ferguson

Data

The ABCs of DQM: Balance

This blog is a part of a series of posts on Data Quality Management. The…


Aug 18, 2020 5 min read

Pooja Krishnan

Data

Saving and Analyzing Trending Topics on Twitter using AWS Athena, Lambda, and CDK

With more than 300 million active users, Twitter is still one of the more optimal…


Aug 11, 2020 5 min read

Theo LEBRUN

Twitter

Starting with AWS Glue and Querying S3 from Athena

Part one of a three-part deep dive into ETL in AWS Glue. Learn how to create powerful low-code/no-code ETL processes from S3 to many data sources in AWS.…


Jul 28, 2020 8 min read

Sam Portillo

AWS