A music streaming company, Sparkify, decided to introduce more automation and monitoring to its data warehouse ETL pipelines and concluded that the best tool to achieve this is Apache Airflow. The goal is to create high-grade data pipelines that are dynamic, built from reusable tasks, can be monitored, and allow easy backfills.
View it on GitHub