Cloud Native Data Pipelines
Big Data companies (e.g. LinkedIn, Facebook, Google, and Twitter) have historically built custom data pipelines over bare metal in custom-designed data centers. To meet strict requirements around data security, fault tolerance, cost control, job scalability, network topology, and compute and storage placement, they have needed to manage their core technology closely. In recent years, many companies with Big Data needs have started migrating to one of the public cloud vendors. How does the public cloud change the game? Specifically, how can companies effectively marry cloud best practices with Big Data technology in order to leverage the benefits of both?

Agari, a leading email security company, applies Big Data best practices to both the security industry and the cloud in order to secure the world against email-borne threats. We do this by building both batch and stream processing predictive data pipelines in the AWS cloud. Come to this talk to learn about our architectural best practices and technologies.
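The abstract does not name specific AWS services or pipeline internals, so the following is only an illustrative sketch of what a cloud-native stream-processing worker might look like: it long-polls a message queue and scores each event. The SQS queue name "inbound-events" and the score_event function are hypothetical placeholders, not part of the talk.

```python
# Minimal sketch of a stream-processing worker on AWS, under assumed names.
import boto3


def score_event(body: str) -> float:
    # Placeholder for a predictive model scoring one event.
    return 0.0


sqs = boto3.client("sqs", region_name="us-east-1")
queue_url = sqs.get_queue_url(QueueName="inbound-events")["QueueUrl"]

while True:
    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,  # long polling keeps the worker cheap when idle
    )
    for msg in resp.get("Messages", []):
        score = score_event(msg["Body"])
        # Deleting only after successful processing gives at-least-once
        # delivery, a common fault-tolerance pattern in queue-based pipelines.
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```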
- Processing Data of Any Size with Apache Beam (Jesse Anderson, Tuesday May 2 @ 11:40 AM)
- Apache Spark Beyond Shuffling - Why it isn't Magic - but also where there is some really cool Magic (Holden Karau, Tuesday May 2 @ 3:40 PM)
- Apache Flink - The State of the Art in Streaming Computation (Jamie Grier, Tuesday May 2 @ 1:30 PM)
- Fast Data Architectures for Streaming Applications (Dean Wampler, Tuesday May 2 @ 10:35 AM)
- Cloud Native Data Pipelines (Sid Anand, Tuesday May 2 @ 2:35 PM)
- Stream All Things - Patterns of Modern Data Integration (Gwen Shapira, Tuesday May 2 @ 4:45 PM)