Spec-Driven Development for Airflow DAGs
AI coding assistants have transformed software development, shifting practice from ad hoc "vibe coding" to rigorous spec-driven development (SDD). The Airflow ecosystem has fully embraced these advancements, but different use cases demand different SDD approaches.
This talk compares ETL and ML pipeline patterns, showing how each leverages Airflow's unique capabilities differently. I then present SDD strategies along a Spec Stability Spectrum. ETL specs, such as schemas and dbt models, are stable and external, making deterministic, template-driven approaches like DAG Factory and the cosmos-dbt-core skill the right fit. ML specs are volatile and internal, as experiments evolve, so LLM-driven hybrid approaches like the Airflow AI SDK and the airflow-hitl skill are better suited. Both approaches are demonstrated live with Claude Code.
Examples draw from my work at TXI Digital generating ETL and ML pipelines for heavy industry clients, with a focus on Rail and anecdotes from Renewable Energy.