The Data Flowcast: Mastering Apache Airflow ® for Data Engineering and AI

Astronomer

Available Episodes

5 of 80
  • How Redica Transformed Their Data With Airflow and Snowflake with Shankar Mahindar
    The life sciences industry relies on data accuracy, regulatory insight and quality intelligence. Building a unified system that keeps these elements aligned is no small feat.
    In this episode, we welcome Shankar Mahindar, Senior Data Engineer II at Redica Systems. We discuss how the team restructures its data platform with Airflow to strengthen governance, reduce compliance risk and improve customer experience.
    Key Takeaways:
      • 00:00 Introduction.
      • 01:53 A focused analytics platform reduces compliance risk in life sciences.
      • 07:31 A centralized warehouse orchestrated by Airflow strengthens governance.
      • 09:12 Managed orchestration keeps attention on analytics and outcomes.
      • 10:32 A modern transformation stack enables scalable modeling and operations.
      • 11:51 Event-driven pipelines improve data freshness and responsiveness.
      • 14:13 Asset-oriented scheduling and versioning enhance reliability and change control.
      • 16:53 Observability and SLAs build confidence in data quality and freshness.
      • 21:04 Priorities include partitioned assets and streamlined developer tooling.
    Resources Mentioned:
      • Shankar Mahindar: https://www.linkedin.com/in/shankar-mahindar-83a61b137/
      • Redica Systems | LinkedIn: https://www.linkedin.com/company/redicasystems/
      • Redica Systems | Website: https://redica.com
      • Apache Airflow: https://airflow.apache.org/
      • Astronomer: https://www.astronomer.io/
      • Snowflake: https://www.snowflake.com/
      • AWS: https://aws.amazon.com/
    Thanks for listening to “The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations.
    #AI #Automation #Airflow #MachineLearning
    --------  
    23:48
  • How Airflow and AI Power Investigative Journalism at the Financial Times with Zdravko Hvarlingov
    The Financial Times leverages Airflow and AI to uncover powerful stories hidden within vast, unstructured data.
    In this episode, Zdravko Hvarlingov, Senior Software Engineer at the Financial Times, discusses building multi-tenant Airflow systems and AI-driven pipelines that surface stories that might otherwise be missed. Zdravko walks through entity extraction and fuzzy matching, linking the UK Register of Members’ Financial Interests with Companies House, and how this work cuts weeks of manual analysis to minutes.
    Key Takeaways:
      • 00:00 Introduction.
      • 02:12 What computational journalism means for day-to-day newsroom work.
      • 05:22 Why a shared orchestration platform supports consistent, scalable workflows.
      • 08:30 Tradeoffs of one centralized platform versus many separate instances.
      • 11:52 Using pipelines to structure messy sources for faster analysis.
      • 14:14 Turning recurring disclosures into usable data for investigations.
      • 16:03 Applying lightweight ML and matching to reveal entities and links.
      • 18:46 How automation reduces manual effort and shortens time to insight.
      • 20:41 Practical improvements that make backfilling and reliability easier.
    Resources Mentioned:
      • Zdravko Hvarlingov: https://www.linkedin.com/in/zdravko-hvarlingov-3aa36016b/
      • Financial Times | LinkedIn: https://www.linkedin.com/company/financial-times/
      • Financial Times | Website: https://www.ft.com/
      • Apache Airflow: https://airflow.apache.org/
      • UK Register of Members’ Financial Interests: https://www.parliament.uk/mps-lords-and-offices/standards-and-financial-interests/parliamentary-commissioner-for-standards/registers-of-interests/register-of-members-financial-interests/
      • UK Companies House: https://www.gov.uk/government/organisations/companies-house
      • Doppler: https://www.doppler.com/
      • Kubernetes: https://kubernetes.io/
      • Airflow Kubernetes Executor: https://airflow.apache.org/docs/apache-airflow/stable/executor/kubernetes.html
      • GitHub: https://github.com/
    --------  
    24:28
  • Inside Vinted’s Code-Generated Airflow Pipelines with Oscar Ligthart and Rodrigo Loredo
    The shift from monolithic to decentralized data workflows changes how teams build, connect and scale pipelines.
    In this episode, we feature Oscar Ligthart, Lead Data Engineer, and Rodrigo Loredo, Lead Analytics Engineer, both at Vinted, as we unpack their YAML-driven abstraction that generates Airflow DAGs and standardizes cross-team orchestration.
    Key Takeaways:
      • 00:00 Introduction.
      • 05:28 Challenges of decentralization.
      • 06:45 YAML-based generator standardizes pipelines and dependencies.
      • 12:28 Declarative assets and sensors align cross-DAG dependencies.
      • 17:29 Task-level callbacks enable auto-recovery and clear ownership.
      • 21:39 Standardized building blocks simplify upgrades and maintenance.
      • 24:52 Platform focus frees domain work.
      • 26:49 Container-only standardization prevents sprawl.
    Resources Mentioned:
      • Oscar Ligthart: https://www.linkedin.com/in/oscar-ligthart/
      • Rodrigo Loredo: https://www.linkedin.com/in/rodrigo-loredo-410a16134/
      • Vinted | LinkedIn: https://www.linkedin.com/company/vinted/
      • Vinted | Website: https://www.vinted.com/?srsltid=AfmBOor87MGR_eLOauCO93V9A-aLDaAhGYx9cnu_oN8s1SAXMlCRuhW7
      • Apache Airflow: https://airflow.apache.org/
      • Kubernetes: https://kubernetes.io/
      • dbt: https://www.getdbt.com/
      • Google Cloud Vertex AI: https://cloud.google.com/vertex-ai
      • Airflow Datasets & Assets (concepts): https://www.astronomer.io/docs/learn/airflow-datasets
      • Airflow Summit: https://airflowsummit.org/
    --------  
    29:36
  • Transforming Data Pipelines at XENA Intelligence with Naseem Shah
    The shift from simple cron jobs to orchestrated AI-powered workflows is reshaping how startups scale. For a small team, these transitions come with unique challenges and big opportunities.
    In this episode, Naseem Shah, Head of Engineering at Xena Intelligence, shares how he built data pipelines from scratch, adopted Apache Airflow and transformed Amazon review analysis with LLMs.
    Key Takeaways:
      • 00:00 Introduction.
      • 03:28 The importance of building initial products that support growth and investment.
      • 06:16 The process of adopting new tools to improve reliability and efficiency.
      • 09:29 Approaches to learning complex technologies through practice and fundamentals.
      • 13:57 Trade-offs small teams face when balancing performance and costs.
      • 18:40 Using AI-driven approaches to generate insights from large datasets.
      • 22:38 How unstructured data can be transformed into actionable information.
      • 25:55 Moving from manual tasks to fully automated workflows.
      • 28:05 Orchestration as a foundation for scaling advanced use cases.
    Resources Mentioned:
      • Naseem Shah: https://www.linkedin.com/in/naseemshah/
      • Xena Intelligence | LinkedIn: https://www.linkedin.com/company/xena-intelligence/
      • Xena Intelligence | Website: https://xenaintelligence.com/
      • Apache Airflow: https://airflow.apache.org/
      • Google Cloud Composer: https://cloud.google.com/composer
      • Techstars: https://www.techstars.com/
      • Docker: https://www.docker.com/
      • AWS SQS: https://aws.amazon.com/sqs/
      • PostgreSQL: https://www.postgresql.org/
    --------  
    28:32
  • Scaling Geospatial Workflows With Airflow at Overture Maps Foundation and Wherobots with Alex Iannicelli and Daniel Smith
    Using Airflow to orchestrate geospatial data pipelines unlocks powerful efficiencies for data teams. The combination of scalable processing and visual observability streamlines workflows, reduces costs and improves iteration speed.
    In this episode, Alex Iannicelli, Staff Software Engineer at Overture Maps Foundation, and Daniel Smith, Senior Solutions Architect at Wherobots, join us to discuss leveraging Apache Airflow and Apache Sedona to process massive geospatial datasets, build reproducible pipelines and orchestrate complex workflows across platforms.
    Key Takeaways:
      • 00:00 Introduction.
      • 03:22 How merging multiple data sources supports comprehensive datasets.
      • 04:20 The value of flexible configurations for running pipelines on different platforms.
      • 06:35 Why orchestration tools are essential for handling continuous data streams.
      • 09:45 The importance of observability for monitoring progress and troubleshooting issues.
      • 11:30 Strategies for processing large, complex datasets efficiently.
      • 13:27 Expanding orchestration beyond core pipelines to automate frequent tasks.
      • 17:02 Advantages of using open-source operators to simplify integration and deployment.
      • 20:32 Desired improvements in orchestration tools for usability and workflow management.
    Resources Mentioned:
      • Alex Iannicelli: https://www.linkedin.com/in/atiannicelli/
      • Overture Maps Foundation | LinkedIn: https://www.linkedin.com/company/overture-maps-foundation/
      • Overture Maps Foundation | Website: https://overturemaps.org
      • Daniel Smith: https://www.linkedin.com/in/daniel-smith-analyst/
      • Wherobots | LinkedIn: https://www.linkedin.com/company/wherobots
      • Wherobots | Website: https://www.wherobots.com
      • Apache Airflow: https://airflow.apache.org/
      • Apache Sedona: https://sedona.apache.org/
      • GitHub repo: https://github.com/wherobots/airflow-providers-wherobots
    --------  
    24:03


About The Data Flowcast: Mastering Apache Airflow ® for Data Engineering and AI

Welcome to The Data Flowcast: Mastering Apache Airflow ® for Data Engineering and AI, the podcast where we keep you up to date with the insights and ideas propelling the Airflow community forward. Join us each week as we explore the current state, future and potential of Airflow with leading thinkers in the community, and discover how best to leverage this workflow management system to meet the ever-evolving needs of data engineering and AI ecosystems. Podcast Webpage: https://www.astronomer.io/podcast/

