Seattle, WA
December 10–13, 2018
Tuesday, December 11 • 1:45pm - 2:20pm PST
Scaling AI Inference Workloads with GPUs and Kubernetes - Renaud Gaubert & Ryan Olson, NVIDIA


Deep Learning (DL) is a computationally intensive form of machine learning that has revolutionized many fields, including computer vision, automated speech recognition, natural language processing, and artificial intelligence (AI).

DL impacts every vertical market, from automotive to healthcare to cloud. As a result, the training and deployment of Deep Neural Networks (DNNs) has shifted datacenter workloads from traditional CPUs to AI-specific accelerators such as NVIDIA GPUs.

Leveraging several popular CNCF projects such as Prometheus, Envoy, and gRPC, we will demonstrate an implementation of NVIDIA’s reference scale-out inference architecture, capable of delivering petaops per second of performance.
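To give a flavor of how such GPU-backed inference services are scheduled on Kubernetes (this is a generic sketch, not the speakers' reference architecture), a pod requests GPUs through the `nvidia.com/gpu` extended resource exposed by the NVIDIA device plugin; the pod name, image, and port below are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inference-server            # hypothetical name for illustration
spec:
  containers:
  - name: server
    image: example.com/inference-server:latest   # placeholder image
    ports:
    - containerPort: 8001           # gRPC inference endpoint (illustrative)
    resources:
      limits:
        nvidia.com/gpu: 1           # GPU advertised by the NVIDIA device plugin
```

The scheduler will only place this pod on a node with an unallocated GPU, which is the basic building block for scaling inference replicas horizontally.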

Serving inference at scale is a new and challenging datacenter problem; we will discuss these challenges and ways to optimize for service-delivery metrics (latency and throughput), cost, and redundancy.
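The latency/throughput trade-off mentioned above often comes down to request batching: larger batches amortize fixed per-launch GPU overhead and raise throughput, but each request waits longer. The toy model below is our own illustration (the overhead and per-item costs are assumed numbers, not measurements from the talk):

```python
def batched_metrics(batch_size, overhead_ms=5.0, per_item_ms=1.0):
    """Toy latency/throughput model for a batched inference server.

    overhead_ms: assumed fixed cost per batch (e.g., kernel launch, transfer).
    per_item_ms: assumed marginal cost per request within the batch.
    Returns (latency_ms, throughput_req_per_s) for one batch.
    """
    latency = overhead_ms + per_item_ms * batch_size
    throughput = batch_size / (latency / 1000.0)
    return latency, throughput

for b in (1, 8, 32):
    lat, thr = batched_metrics(b)
    print(f"batch={b:2d}  latency={lat:5.1f} ms  throughput={thr:7.1f} req/s")
```

Even with these made-up constants, the shape of the trade-off is visible: batch size 32 delivers far higher throughput than batch size 1, at the price of higher per-request latency, which is exactly the tension a service-delivery SLO forces you to balance.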


Renaud Gaubert

Software Engineer, NVIDIA
Renaud Gaubert has been working at NVIDIA since 2017 on making GPU applications easier to deploy and manage in data centers. He focuses on supporting GPU-accelerated machine learning frameworks in container orchestration systems such as Kubernetes, Docker Swarm, and Nomad.

Ryan Olson

Solutions Architect, NVIDIA
Ryan Olson is a Solutions Architect in the Worldwide Field Organization at NVIDIA. His primary responsibilities involve supporting deep learning and high-performance computing applications. Ryan is particularly interested in scalable software design that leverages the unique capabilities…
