Open Source Airflow Contributions and Performance Improvements at G-Research with Christos Bisias

EPISODE · Mar 19, 2026 · 17 MIN

from The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI · host Astronomer

Modern Airflow isn’t just orchestration. It’s a contribution.

In this episode, we explore how open source investment drives real performance gains and deeper observability.

We’re joined by Christos Bisias, Open Source Software Engineer, Apache Airflow at G-Research, to discuss how his team uses Airflow for large-scale data transformations, contributes upstream and improves scheduler throughput and OpenTelemetry support. From trace-level observability to CI-enforced metrics governance and a major scheduler optimization, this conversation spans strategy, engineering and community impact.

Key Takeaways:

00:00 Introduction.
01:20 How G-Research applies machine learning and big data to predict financial market movements.
02:15 Contributing to open source is a business decision.
03:10 Maintaining a fork is costly.
04:30 OpenTelemetry collects metrics, logs and traces to provide deep system visibility.
06:10 Custom spans help identify bottlenecks inside tasks and enable performance optimization.
08:05 OpenTelemetry integration works properly in Airflow 3.0 and above.
10:00 A YAML-based metrics registry with CI enforcement ensures consistency between docs and exported metrics.
12:10 Scheduler throughput improved significantly by applying concurrency limits earlier in the database query.
15:20 Future Task SDK changes may enable language-agnostic DAG authoring beyond Python.

Resources Mentioned:

Christos Bisias: https://www.linkedin.com/in/xbis/
G-Research: https://www.linkedin.com/company/g-research/
Apache Airflow: https://airflow.apache.org/
OpenTelemetry: https://opentelemetry.io/
Prometheus: https://prometheus.io/
Grafana: https://grafana.com/
Jaeger: https://www.jaegertracing.io/

Thanks for listening to “The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations.

#AI #Automation #Airflow

Frequently Asked Questions

How long is this episode of The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI?

This episode is 17 minutes long.

When was this episode of The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI published?

This episode was published on March 19, 2026.

What is this episode about?

Modern Airflow isn’t just orchestration. It’s a contribution. In this episode, we explore how open source investment drives real performance gains and deeper observability. We’re joined by Christos Bisias, Open Source Software Engineer, Apache...

Can I download this episode of The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI?

Yes, you can download this episode by clicking the download button on the episode player, or subscribe to the podcast in your preferred podcast app for automatic downloads.