Scaling Python for Machine Learning: Beyond Data Parallelism

Wednesday May 24

10:50 AM – 11:40 AM

Kinzie Forum



Data parallelism can be amazing, and it frees us from so many fiddly, complicated tasks (like dealing with locks). On the other hand, as training large machine learning models becomes increasingly popular, we're seeing the need to move beyond purely data-parallel techniques. Relying exclusively on recomputation to recover from failures is no longer sufficient when our operations are not idempotent.
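To make the idempotence point concrete, here is a minimal sketch in plain Python. It is not taken from the talk: the function names `square_partition` and `apply_gradient`, the learning rate, and the toy values are all hypothetical. It contrasts a pure task, which is safe to recompute, with a stateful weight update, where a retry applies the same gradient twice.

```python
def square_partition(partition):
    """Pure, data-parallel style task: recomputing it gives the same result."""
    return [x * x for x in partition]


def apply_gradient(weights, gradient, lr=0.5):
    """Stateful update (hypothetical): mutates the shared weights in place."""
    for i, g in enumerate(gradient):
        weights[i] -= lr * g


# Recomputing a pure task after a failure is harmless.
partition = [1, 2, 3]
first = square_partition(partition)
retry = square_partition(partition)

# Re-running a stateful update is not: the gradient lands twice.
weights = [1.0, 1.0]
gradient = [0.5, -0.5]
apply_gradient(weights, gradient)
after_one = list(weights)
apply_gradient(weights, gradient)  # simulated retry after a presumed failure
after_retry = list(weights)

print(first == retry)          # the pure task is unaffected by the retry
print(after_one, after_retry)  # the weights drift further on the retry
```

The retried update moves the weights a second step away from where a single successful update would have left them, which is why recompute alone is not a safe recovery strategy for this kind of operation.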

In this talk, we will look at Spark, Dask, and Ray in the context of scaling machine learning models, and at how you can take advantage of other types of distributed parallelism (including the actor model for managing model weights during training).
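As a rough illustration of the actor idea mentioned above, the sketch below uses only the Python standard library (`threading` and `queue`). It is not the Spark, Dask, or Ray API, and it is not taken from the talk. The class name `WeightActor`, its methods, and the message names are hypothetical. The point is that one actor owns the weights and handles messages one at a time, so workers never need locks of their own.

```python
import queue
import threading


class WeightActor:
    """Hypothetical actor: a single thread owns the weights and serializes updates."""

    def __init__(self, initial_weights):
        self._weights = list(initial_weights)
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Process one message at a time; only this thread touches the weights.
        while True:
            message, payload, reply = self._mailbox.get()
            if message == "stop":
                reply.put(None)
                return
            if message == "apply_gradient":
                gradient, lr = payload
                self._weights = [w - lr * g for w, g in zip(self._weights, gradient)]
                reply.put(None)
            elif message == "get_weights":
                reply.put(list(self._weights))

    def _ask(self, message, payload=None):
        # Send a message and block until the actor replies.
        reply = queue.Queue(maxsize=1)
        self._mailbox.put((message, payload, reply))
        return reply.get()

    def apply_gradient(self, gradient, lr=0.5):
        self._ask("apply_gradient", (gradient, lr))

    def get_weights(self):
        return self._ask("get_weights")

    def stop(self):
        self._ask("stop")
        self._thread.join()


# Four workers push gradients concurrently; the actor applies them in mailbox order.
actor = WeightActor([1.0, 1.0])
workers = [
    threading.Thread(target=actor.apply_gradient, args=([0.25, -0.25],))
    for _ in range(4)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

final_weights = actor.get_weights()
actor.stop()
print(final_weights)
```

In a distributed framework the mailbox would sit behind the network and the actor would live in its own process, but the ownership model is the same: state stays with the actor, and everything else sends it messages.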

machine learning (ML)

Holden Karau

Open Source Engineer at Netflix

Keynotes

Monday May 22 @ 5:10 PM

It's a Noisy World Out There

Linda Rising

Tuesday May 23 @ 9:30 AM

One Rule to Rule Them All

Dave Thomas

Tuesday May 23 @ 1:50 PM

The Psychology of UX

Fabio Nudge Pereira

Monday May 22 @ 1:50 PM

The Universe, Unfolded: NASA Webb Space Telescope

Kenneth Harris II

Wednesday May 24 @ 9:30 AM

Practical Magic: The Resilience Potion and Security Chaos Engineering

Kelly Shortridge

Wednesday May 24 @ 1:50 PM

What We Talk About When We Talk About Resilience

Courtney Nash

Monday May 22 @ 9:30 AM

Large Language Models: Friend, Foe, or Otherwise

Alex Castrounis

Tuesday May 23 @ 5:10 PM

Sailing Solo: One Man's Journey Through the World's Loneliest Race

Ian Herbert-Jones