Seattle, WA
December 10–13, 2018

Performance
Thursday, December 13


gRPC Performance; Tuning Applications and Libraries - Noah Eisen, Google
The gRPC C++ team has been working on performance for over a year now. In this presentation we will share the story of our journey, with insights on tuning applications that use gRPC as well as optimizing the library itself.

All concrete examples will be from gRPC, but the high-level concepts will be interesting to anyone who has worked on performance.

We will focus on:
- benchmarking
  + microbenchmarks
  + synthetic benchmarks
  + application benchmarks
  + cutting down noise on benchmarks
- tooling
  + flamegraphs
  + latency breakdowns
- concrete optimizations
  + tuning the threading model of gRPC apps
  + high performance network polling systems

Noah Eisen

Software Engineer, Google
Noah Eisen, who hails from the University of Michigan, has worked at Google for the past two years. For most of that time he has been on the gRPC C++ team, and within that, the performance sub-team. He now leads the team's benchmarking and tooling efforts, focusing on measuring the impact...

Thursday December 13, 2018 10:50am - 11:25am
4C 3/4


Got a Need for Speed? Accelerate Your Prometheus Dashboard Using Trickster - Shilla Saebi & James Ranson, Comcast
We live in a world where high performance and speed are essential. A few extra seconds of response time on a dashboard can be a deal breaker. Many dashboards request the entire time range of data from the time series database every time a dashboard loads or reloads. This can result in slower rendering times and different results depending on when the request is made. We are proud to announce that Trickster, a new open source project, was developed to address this very issue.

Written in Go, Trickster is a reverse proxy cache for the Prometheus HTTP APIv1 that considerably accelerates dashboard rendering times for any series queried from Prometheus. This is possible because of the delta proxy, step boundary normalization, and fast forward features. In the presentation, we will discuss how Trickster was developed at Comcast, and we will show you a live demo of the software.

James Ranson

Principal Software Architect, Comcast
James Ranson is a Platform Software Architect currently living in Denver, Colorado. He has been with Comcast for over 8 years, specializing in creating software and platforms that operate efficiently and scale horizontally. He is an expert on software development and release management...

Shilla Saebi

Open Source Program Manager, Comcast
Shilla is an Open Source Program Manager at Comcast and recently became a CNCF ambassador. She has worked in diverse roles within the industry, in positions ranging from operations engineering and systems administration to customer service and network ops. She's an open source contributor and...

Thursday December 13, 2018 11:40am - 12:15pm
4C 3/4


eBPF Powered Distributed Kubernetes Performance Analysis - Lorenzo Fontana, InfluxData
Since the Linux kernel 4.x series, a lot of enhancements to the eBPF ecosystem have reached mainline, giving us the capability to do much more than just networking.
The purpose of this talk is to give an initial understanding of what eBPF programs are and how to hook them to programs running inside Kubernetes clusters. The goal is to answer targeted questions at cluster level about very specific, fine-grained situations happening in our programs and systems, such as:
- Has that function in my program been called?
- For a given function, which arguments were passed to it, and what did it return?
- Which TCP packets are being retransmitted?
- Which queries are running slowly?
- What insights are available into programming language events and GC?
- Has that file been opened?
Imagine a programmable Kubernetes performance analysis tool that runs at cluster level without performance implications: how would you like it to be?

Lorenzo Fontana

SRE, Sysdig
Lorenzo Fontana is an SRE at InfluxData where he works on tooling, scaling and performance on InfluxCloud. He’s passionate about distributed systems, software-defined networking, Linux and performance analysis and divides himself between his daily job and open source contributions...

Thursday December 13, 2018 1:45pm - 2:20pm
4C 3/4


Encoding 250,000 Songs a Day with batch/v1 Jobs - Leigh Capili & John Slivka, Beatport
Tasked with rebuilding the way we deliver music to DJs, the Beatport Infrastructure team set out to use Kubernetes to construct scalable compute for executing batch and on-demand encoding workloads in order to level up our customers' capabilities for playing and mixing dance music.

What followed was a 5-month journey of building clusters, thrashing with software dependencies, and trudging through erratic performance and scalability issues with the Kubernetes API.

How did we decide to use Kubernetes?
Was it easy to prototype?

Is etcd capable of sustainably servicing 10,000 Jobs an hour?
How many Pods can the Kubernetes API store?
How do you monitor and manage Job failures?

We'll walk you through our lessons learned and talk about our most exciting moments and deflating realizations.
Join us as we retell the story of delivering a correct system to production :)

Leigh Capili

Infrastructure Engineer, Beatport
Leigh is a young Cloud Engineer local to Denver who is passionate about Distributed Systems. Leigh contributes to Kubernetes and likes functional JavaScript. At AT&T, Leigh helped design and implement a consistent, hierarchical, and reactive datastore inspired by Facebook's Flux...

John Slivka

Infrastructure Engineer, Beatport
John Slivka works on the Infrastructure team at Beatport. Previously, he's worked for IBM Cloud and Oracle Cloud Object Storage as a Software Engineer.

Thursday December 13, 2018 2:35pm - 3:10pm
4C 3/4


Performance Testing Ingress for Internet-Scale Workloads - Alexander Brand, Heptio
Have you ever wondered how much ingress traffic a Kubernetes cluster could handle? How many nodes would it take to handle the traffic of an Alexa top-40 website? Understanding these numbers and how your ingress infrastructure scales is critical when it comes to deploying internet-accessible applications in production.
At Heptio, we needed to prove that our Envoy-based ingress projects, Contour and Gimbal, would scale to support millions of concurrent connections, thousands of backend services, and thousands of virtual hosts.
In this talk, we will explore the strategies and tools we used, the challenges we faced and the lessons we learned while running these tests. We will dive into kernel tuning, HTTP benchmarking, Envoy metrics, and more. We hope that talking about our experience will help when it comes to performance testing your cloud-native applications and infrastructure.

Alexander Brand

Systems Software Engineer, Heptio
Alex works at Heptio, helping customers realize all the benefits of Kubernetes and Cloud Native technologies. He is also a maintainer of the Heptio Gimbal project, a software load balancing platform that can route traffic to multiple Kubernetes and OpenStack clusters. He has been...

Thursday December 13, 2018 3:40pm - 4:15pm
4C 3/4


Automated Kubernetes Scalability Testing - Sebastian Jug & Naga Ravi Chaitanya Elluri, Red Hat
Kubernetes supports large clusters according to the docs, but how does it actually scale? Who came up with those limits? What are the actual numbers? To find out, we built a CI/CD environment geared towards deploying and testing Kubernetes at scale.

Our stack consists of Kubernetes, OpenStack for IaaS, Jenkins Pipeline, Ansible for automation, pbench (a performance benchmarking and visualization tool), Prometheus, and other open source projects. The stack has pushed the performance and scale limits of Kubernetes with kubelet, control plane and cluster density focused tests. In this presentation we will explore the story of how we built and tested this stack and the challenges we met along the way. We will demo the test harness and share the latest performance and scale results. Attendees will learn what the real scalability limits of Kubernetes are, as well as how to scale-test their own infrastructure.

Naga Ravi Chaitanya Elluri

Software Engineer, Red Hat
Ravi Elluri is a Software Engineer at Red Hat working on OpenShift Performance and Scalability. He is currently working on the automation and tooling needed for scale testing OpenShift on OpenStack and is involved in pushing towards higher cluster limits. His interest lies in the cloud...

Sebastian Jug

Software Engineer, Red Hat
Sebastian Jug is a software engineer at Red Hat where he works on OpenShift and container runtime performance. A member of SIG-Scale, he creates tools (cluster loader) and enables large scale performance testing. As keeper of ‘the internet,’ resident of the master branch and FOSS...

Thursday December 13, 2018 4:30pm - 5:05pm
4C 3/4