Conferences can be lots of things: a place to share your work, a place to network, and sometimes a place to learn something completely new. Through the kind support of a Mozilla Science Mini-Grant, I was recently able to have the "learn something completely new" experience by traveling to KubeCon/CloudNativeCon 2019 in San Diego.
Just a bit of backstory. One of the objectives of the Mozilla Mini-Grant was to implement some cloud functionality in Minus80, a tool for managing curated biological datasets. Until recently, my mental model for web-based services was the basic client-server model, where web requests are routed to a server which handles the request. A snapshot from the wiki page gives you the idea.
The idea is that you run your app on a server and attach it to a web address. Anyone who visits your page gets served the content, whether it's HTML or data.
However, industry giants supporting a bazillion queries for who-knows-how-many users per day weren't running a vanilla Apache web server and a SQL database. And while I don't necessarily expect to support a bazillion users anytime in the near future, I really wanted to know how it all worked and have a go at designing something that ran natively on the cloud.
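To make that mental model concrete, here is a minimal sketch of the client-server setup using only Python's standard library. The handler name `HelloHandler`, the helper `make_server`, and the port are my own placeholders for illustration, not anything from Minus80.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Answers every GET request with the same small HTML page."""

    def do_GET(self):
        body = b"<h1>Hello from a single server</h1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging to keep output clean.
        pass


def make_server(host="127.0.0.1", port=8000):
    """Build (but do not start) a server bound to the given address.

    The host and port are illustrative defaults, not a recommendation.
    """
    return HTTPServer((host, port), HelloHandler)
```

Calling `make_server().serve_forever()` would start it, and every visitor would then be handled by that one process on that one machine.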
Designing software for the cloud
There are several reasons to design an app to be cloud native. Most cloud apps run in containers, which I was already using heavily in the lab for their reproducibility (the environment in a container is static, which is great for science). Running containers in the cloud allows processes to be deployed in isolation and lets tasks be abstracted away from the environment they run in.
Cloud native apps are also elastically scalable, which means that machines can be added under heavy load and removed again when demand drops. This is a side effect of the containers being deployed independently.
The last reason I wanted to attend KubeCon was that I had only ever read the word "Kubernetes" – and didn't know how to pronounce it. I couldn't properly evangelize to my labmates without knowing how to say the word. I figured chances were high that it would be said out loud at KubeCon 😁.
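As a toy illustration of elastic scaling, the sketch below works out how many identical container copies a given load would call for. The function name, the capacity figure of 100 requests per second per copy, and the min/max limits are all made-up assumptions. Real autoscalers use richer metrics, so treat this as the arithmetic behind the idea rather than how any particular platform does it.

```python
import math


def replicas_needed(requests_per_second, capacity_per_replica,
                    min_replicas=1, max_replicas=20):
    """Return how many identical container copies the current load calls for.

    All numbers here are hypothetical; the result is clamped so the
    service never drops below `min_replicas` or exceeds `max_replicas`.
    """
    if capacity_per_replica <= 0:
        raise ValueError("capacity_per_replica must be positive")
    wanted = math.ceil(requests_per_second / capacity_per_replica)
    return max(min_replicas, min(max_replicas, wanted))


# Demand rises and falls; the replica count follows it.
for load in (5, 250, 1200, 40):
    count = replicas_needed(load, capacity_per_replica=100)
    print(f"{load} req/s -> {count} replica(s)")
```

Because each copy is independent, going from 1 to 12 copies and back again needs no change to the app itself.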
What I learned at KubeCon
I'm not going to bore you with details on what talks I saw or specifics on what tips were helpful to me. But I do think there are some things worthy of note.
First, this conference was a new situation because I had never used Kubernetes before – it was something that I wanted to know more about. Our HPC doesn't really support Kubernetes, so it isn't a resource that is available to us. So why even learn it? Kubernetes is clearly something that is hot in cloud computing right now, so there must be a good reason, no? Either way, it was a great opportunity to see what all the fuss was about (a similar reason for starting to learn Rust).
Second, and related to the first point, I had never used Kubernetes and knew nothing about it, so I was going in with a completely fresh slate. I used to play a game in college where, when I was signed up for a class I knew nothing about (e.g. operating systems), I'd write down how I thought it worked before I went to my first class. At the end of the class, I'd review how I thought it worked vs. what I learned. It was a fun way to reflect and appreciate how abstract things impact our lives (we use operating systems every day!) and how oftentimes we know nothing about how they actually work. Typically, things that seem like magic are much less mesmerizing once the curtain has been pulled back.
Which leads to the final point: in small ways, I had already been using Kubernetes-type ideas without even knowing it. Or, to put it a better way, many of the Google Cloud Platform services I am already using are very likely using Kubernetes-like ideas behind the scenes (however, I have no way to be sure what Google is running behind the scenes). Minus80 uses cloud functions to implement most of the back-end logic for pushing/pulling datasets to and from the cloud. This is nice because it is horizontally scalable: you upload small snippets of code, and they can be scaled up or scaled down depending on how many people are accessing your web function. This is similar to the idea of micro-services in Kubernetes, and I would not be surprised if the code snippets were being containerized along the way (in fact I'd be surprised if they weren't). Also, Google Cloud Run, the newest serverless tech from GCP, runs on top of Knative, which is built on top of Kubernetes.
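Here is a rough sketch of why small code snippets like these scale horizontally. It is not Minus80's actual code and not the Google Cloud Functions API. The handler name, the request fields, and the plain dict standing in for shared cloud storage are all hypothetical. The point is only that the function keeps no state of its own, so any number of copies could serve requests interchangeably.

```python
def handle_request(request, storage):
    """Hypothetical cloud-function-style handler for push/pull requests.

    The handler keeps no state between calls: everything it needs arrives
    in `request`, and everything it saves goes to the shared `storage`.
    """
    action = request.get("action")
    name = request.get("name")
    if action == "push":
        storage[name] = request["data"]
        return {"status": 200, "body": f"stored {name}"}
    if action == "pull":
        if name not in storage:
            return {"status": 404, "body": f"{name} not found"}
        return {"status": 200, "body": storage[name]}
    return {"status": 400, "body": "unknown action"}


# A plain dict stands in for a shared cloud storage bucket (an assumption).
bucket = {}

# Any copy of the function could serve either of these requests.
pushed = handle_request({"action": "push", "name": "cohort_a", "data": "ACGT"}, bucket)
pulled = handle_request({"action": "pull", "name": "cohort_a"}, bucket)
print(pushed["status"], pulled["body"])
```

Since nothing lives inside the function between calls, the platform is free to run one copy or a hundred, which is the same property that micro-services rely on.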
All in on Kubernetes?
It was such a great opportunity to learn about how people are designing their apps for web scale. And having built many client-server apps in the past, it was rewarding to be able to leverage easy-to-use tools to create a scalable design for Minus80. Despite not knowing the nitty-gritty details of the inner workings of Kubernetes, it was encouraging to find out that many of the same design decisions used in building our tools are also used in more battle-tested and widely distributed apps.
As it stands, for tools such as Minus80, general-purpose and highly tunable frameworks like Kubernetes are overkill at this stage. We can get Kubernetes-like performance and scalability using more user-friendly tools like cloud functions (which are likely using Kubernetes in the background anyways). And when the day comes that we need to leverage some of the features that haven't yet made it into cloud functions, we will already have a great head start on deploying the next iteration of our app using Kubernetes.