47 min listen
Automating Your Production Dataflows On Spark
Length:
49 minutes
Released:
Nov 4, 2019
Format:
Podcast episode
Description
As data engineers, the health of our pipelines is our highest priority. Unfortunately, there are countless ways that our dataflows can break or degrade that have nothing to do with the business logic or data transformations that we write and maintain. Sean Knapp founded Ascend to address the operational challenges of running a production-grade, scalable Spark infrastructure, allowing data engineers to focus on the problems that power their business. In this episode he explains the technical implementation of the Ascend platform, the challenges that he has faced in the process, and how you can use it to simplify your dataflow automation. This is a great conversation for understanding all of the incidental engineering that is necessary to make your data reliable.
Titles in the series (100)
Citus Data: Distributed PostGreSQL for Big Data with Ozgun Erdogan and Craig Kerstiens - Episode 13: Scaling PostGreSQL for Big Data and Parallel Execution with Citus Data (Interview) by Data Engineering Podcast