Masterclasses
GOTO AI Days Chicago 2023

Thursday Oct 26
09:00 – 15:00
Breakout room 1

Fundamentals of Data Engineering Workshop

The world runs on data: every web search, personalized playlist, GPS-guided trip, interactive AI chat and Lyft ride sits on a foundation of data pipelines that ingest, store, transform and serve data to support machine learning and AI experiences. Data engineering is the discipline that builds, monitors and maintains these systems and processes.

This workshop brings you Joe Reis and Matt Housley, authors of the best-selling book Fundamentals of Data Engineering, for a live, interactive session that complements the book. You’ll get a guided tour of practical data engineering with time for Q&A. The workshop will focus on real-world fundamentals and hands-on experience with cloud-based tools.

By the end of this course, you’ll understand:

  • The stages of the data engineering lifecycle (Generation, Storage, Ingestion, Transformation, Serving)
  • The undercurrents of the data engineering lifecycle (Security, Data management, DataOps, Data architecture, Orchestration, Software engineering, etc.)
  • Basic principles of data architecture
  • Some of the main tradeoffs in data engineering

And you’ll be able to:

  • Map out a simple data stack
  • Understand how stages of the data engineering lifecycle correspond to architecture components
  • Spin up data engineering tools in the cloud and run experiments

This training course is for you because…

  • You’re a software engineer, product manager, data scientist or ML engineer who relies on data and needs to collaborate with data engineers
  • You’re curious about data engineering and want to learn more about the discipline
  • You’ve decided to embark on a career in data engineering and are ready to take the first steps

Course Requirements

Participants should be familiar with foundational data concepts, such as analytics, data generation, tabular data and databases. We also recommend some experience with programming and with a cloud platform such as GCP, AWS or Azure. We will run our exercises on Google Cloud Platform, so participants should set up a GCP user account in advance; instructions will be provided after enrollment.