Introduction to Containers on Linux

_images/container.jpg

Note

Hi all. I’m Ben Kero. I’m a systems administrator slash devops slash SRE. That basically means that I’m a nerdy ops guy who fiddles around with containers. I work at an open source cloud provider. We use containers in production.

What is a Container?

Note

So what is a container after all? Well, they’re reusable vessels that can be made of plastic or glass. They’re typically used to store food leftovers after a meal. Most of them stack nicely in the refrigerator, and also nest within each other while not in use to save space.

What is a Software Container?

Note

Similarly, software containers are also often reusable. Since cooking food for every meal is time consuming and expensive, leftovers are a lightweight way of having a meal. They also allow us to have microservices, which are kind of like being able to pull many small leftover containers out to have a smorgasbord of choice. They promote cleanliness by keeping each food separate so they can’t contaminate each other. They’re also reproducible, which doesn’t fit this analogy very well.

How They Work - Linux Namespaces

Note

Let’s dig into how containers actually work. If this part doesn’t make sense, don’t worry. You can still use containers without knowing exactly how they work underneath. There are some container analogues on Windows and OS X now as well, but this talk focuses on Linux. This is the most common type you’ll find in the wild, and the most mature implementation. You can think of containers in 2 parts: Namespaces and CGroups. We’ll be covering both. A Linux system can be said to have a split in it. On one side is kernel space, the core part that touches the hardware and provides the base operating system resources. On the other side is the userland, which is all the things that rely on those resources to run: the files on your disks, your processes, your memory, etc. Namespaces are like separate userlands. They allow you to run an entirely different set of software on top of the same kernel. If you’re curious and like reading man pages, you can read the man page named namespaces.
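To make the "separate userlands" idea a bit more concrete, here is a small sketch that lists which namespaces a process belongs to. It assumes a Linux system with /proc mounted. The helper names ns_id and same_ns are made up for this example and are not standard tools.

```shell
#!/bin/sh
# Print the namespace identifier of one type (pid, net, mnt, uts, ipc, user)
# for a process. The kernel exposes it as a symlink such as "pid:[4026531836]".
# ns_id and same_ns are illustrative helper names, not standard tools.
ns_id() {
    readlink "/proc/${2:-self}/ns/$1"
}

# Succeed when two processes share the namespace of the given type.
same_ns() {
    [ "$(ns_id "$1" "$2")" = "$(ns_id "$1" "$3")" ]
}

# List every namespace the current shell lives in.
for ns in /proc/$$/ns/*; do
    printf '%s -> %s\n' "${ns##*/}" "$(readlink "$ns")"
done
```

Running the loop on the host and again inside a container should show different identifiers for most of the entries, since two processes are in the same namespace only when the identifiers match.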

Namespaces rely on only 3 syscalls. A system call is a special type of function that allows userland to talk to the kernel. These are typically things like open to open a file, connect to establish a network connection, et cetera.

Briefly, clone creates a new process, and the caller can request that the new process be put in new namespaces. Unshare moves the calling process itself into new namespaces. Setns lets an already-running program join a namespace that already exists. For example, it lets a program switch to a different user namespace after it’s already started.
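You don’t need to write C to poke at these syscalls. The util-linux package ships unshare and nsenter, which are command-line wrappers around the unshare and setns syscalls. The sketch below assumes unprivileged user namespaces are enabled on your kernel, and run_with_hostname is a name invented for the example.

```shell
#!/bin/sh
# Run a command in fresh user and UTS namespaces via unshare(1), which wraps
# the unshare(2) syscall. --map-root-user maps the caller to root inside the
# new user namespace so that changing the hostname is permitted.
# run_with_hostname is an illustrative name, not a standard tool.
run_with_hostname() {
    new_name=$1
    shift
    unshare --user --map-root-user --uts \
        sh -c 'hostname "$0" && exec "$@"' "$new_name" "$@"
}

# The hostname change is only visible inside the new UTS namespace.
if run_with_hostname demo hostname 2>/dev/null; then
    echo "host still sees: $(hostname)"
else
    echo "unprivileged user namespaces are not available here" >&2
fi
```

The setns counterpart is nsenter: as root, something like `nsenter --target <pid> --uts hostname` runs hostname inside the UTS namespace of an existing process.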

How They Work - CGroups

Note

CGroups (control groups) are the way for Linux to limit how many resources a group of processes can consume. For example, they can limit the amount of memory, CPU time, or disk I/O that a container is allowed to use. The isolation side is handled by the namespaces we just covered. A network namespace can restrict networks, or provide an entirely separate network. Similar to a chroot, a mount namespace can restrict which parts of the filesystem the container can use. A PID namespace hides process IDs outside of it. That’s why, as I’ll show later, when you’re in a container you can’t see outside processes.
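As a rough illustration of the resource-limiting side, this sketch finds the cgroup the current shell belongs to and prints its memory limit. It assumes the unified cgroup v2 hierarchy mounted at /sys/fs/cgroup; on a cgroup v1 system the paths are different. cgroup_v2_path is a made-up helper name.

```shell
#!/bin/sh
# Extract the cgroup v2 path from /proc/<pid>/cgroup-style input on stdin.
# On the unified hierarchy the relevant line looks like "0::/user.slice/...".
# cgroup_v2_path is an illustrative helper name.
cgroup_v2_path() {
    sed -n 's/^0:://p'
}

# Assumption: cgroup v2 is mounted at /sys/fs/cgroup.
cg=$(cgroup_v2_path < /proc/self/cgroup)
limit_file="/sys/fs/cgroup${cg}/memory.max"

if [ -r "$limit_file" ]; then
    # "max" means unlimited; otherwise the value is a byte count.
    echo "memory limit for ${cg}: $(cat "$limit_file")"
else
    echo "no readable memory.max for ${cg:-this process}"
fi
```

Container engines write these files for you. With Docker, for instance, the `--memory` flag of `docker run` is what sets this kind of limit.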

Flavors

_images/flavors.jpg

Note

We’re going to be talking about 3 container engines today. I’ll go into depth with what each is, how to use it, and some scenarios in which you might decide to choose that tool. If you’re familiar with VMs, one of these will be very familiar. The others not as much.

Flavors - All the Old Stuff

Note

It’s important to note that these are not the first container solutions. Not by a long shot. These are just the ones that were (and still are) on Linux. They don’t receive much love these days, but if you’re really curious about these, check them out, or at least the Wikipedia page.

LXD - Linux Container Daemon

_images/lxd.png

Note

The first container engine I’d like to talk about is LXD. It is based on LXC, mentioned earlier. LXD stands for Linux Container Daemon. It is an Ubuntu project, although it works with many other Linux distributions. Personally I’ve used it on Ubuntu, Fedora, and Arch Linux. You can think of LXD containers just like you would think of a VM. They have many processes inside of them, and they can be unique, like pets, or non-unique, like cattle, depending on how you use them. The documentation treats them like pets. What’s important to note is that while its predecessor LXC used a bunch of custom scripts for building container images, LXD uses a much simpler image model. With LXD you just download images and run them. You can also make your own images.

LXD - Use Cases

Note

When would you want to use LXD? They’re useful for throw-away environments, in which you want to run a simple command, check to see if a package is available, or try to replicate a bug you’ve seen on another system. For instance my laptop does not run Ubuntu Xenial, but at work we run Xenial on our servers. If we experience a crash on it, I can set up a similar Xenial system using LXD and attempt to replicate the bug, then fix it there.

You could use them as general lightweight VMs, assuming that in your VM you want to play with Linux. LXD, and LXC before it, are the only container engines that allow you to run an entirely unaltered modern Linux system in a container. Docker and others will choke hard on systemd and dbus.

Lastly, and we already kind of covered this, LXD can be used for sandbox testing. You can set one up, install some software, play around with it, and when you’re done blow it away. No sense in keeping all that stuff on your laptop long-term just for a short experiment. Unlike VMs you can do it without chewing through a lot of battery life.

LXD - Basic Usage

# apt-get install lxd
# lxd init
# lxc launch ubuntu:16.04 c1
Creating c1
Starting c1
# lxc list
+---------------+---------+---------------+------------+-----------+
|     NAME      |  STATE  | IPV4          |    TYPE    | SNAPSHOTS |
+---------------+---------+---------------+------------+-----------+
| hp            | RUNNING | 192.168.0.200 | PERSISTENT | 0         |
+---------------+---------+---------------+------------+-----------+
| ubnt          | STOPPED |               | PERSISTENT | 0         |
+---------------+---------+---------------+------------+-----------+

# lxc exec c1 bash
root@bash# exit
# lxc stop c1
# lxc delete c1

LXD - Network-aware

_images/ethernet.jpg
host1$ lxc config set core.https_address "[::]"
host1$ lxc config set core.trust_password mypassword1

host2$ lxc remote add host1 192.168.1.100
host2$ lxc exec host1:work bash
work$

LXD - Images

LXD - Building Images the Easy Way

$ lxc launch ubuntu:16.04 MyAwesomeApacheContainer
$ lxc exec MyAwesomeApacheContainer bash

MAAC# apt-get install apache2
MAAC# exit

$ lxc stop MyAwesomeApacheContainer
$ lxc publish MyAwesomeApacheContainer remote:

LXD - Building Images the Powerful Way

$ lxc image import md.tgz rootfs.tgz --alias mynewthing

Note

You can also use a separate tool to build the tarball. There are projects like debootstrap, which allows you to install Debian to a target directory, or diskimage-builder, which is a powerful tool that allows you to build Linux systems out of composable elements. The latter happens to be my personal favorite. There is another independent tool called Packer, put out by the famous HashiCorp people, that allows you to build images too. The end result of all of these is a tarball of the root filesystem used to run your new container.

After that all you need to do is write a simple metadata.yaml file, and you’ve created a usable container image.
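Assuming the root filesystem tarball already exists, the metadata half could be put together roughly like this. The field values are placeholders, and make_metadata_tarball is a name invented for the sketch.

```shell
#!/bin/sh
# Write a minimal metadata.yaml and pack it into a metadata tarball.
# make_metadata_tarball is an illustrative helper; the description and
# architecture values are placeholders to adapt.
make_metadata_tarball() {
    out=$1
    workdir=$(mktemp -d) || return 1
    cat > "$workdir/metadata.yaml" <<EOF
architecture: "i686"
creation_date: $(date +%s)
properties:
  description: "My Awesome Container"
  os: "ubuntu"
  release: "xenial"
EOF
    # -C keeps the temporary directory's path out of the tarball.
    tar -czf "$out" -C "$workdir" metadata.yaml
    status=$?
    rm -rf "$workdir"
    return $status
}

# Build md.tgz in the current directory and show what is inside.
make_metadata_tarball md.tgz && tar -tzf md.tgz
```

The resulting md.tgz is the first argument to `lxc image import`, next to the root filesystem tarball.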

LXD - Metadata.yaml

metadata.yaml
architecture: "i686"
creation_date: 1458040200

properties:
  description: My Awesome Container
  os: Ubuntu
  release: xenial
  version: "16.04"

templates:
  /etc/hosts:
    when:
      - create
      - rename
    template: hosts.tpl
    properties:
      foo: bar

_images/unifi-ap.jpg

Note

Ok, it’s demo time. Here’s an example of something I had to do recently. I decided to upgrade to 802.11ac, the newest and fastest WiFi standard. After evaluating a lot of hardware, I decided to pick up some Ubiquiti Unifi access points. They look like this. Kind of glowing upside-down UFO hanging from the ceiling. Very cool.

_images/unifi-site.png

Note

After unpacking them and plugging them in, I noticed that these weren’t like other access points I’ve had. There was no web interface to be found. I started reading the manual on their site, and found out there is a piece of controller software that’s run on a separate computer and THAT talks to the access points. Fine. I navigated their site and downloaded this Debian package. Sure, I could simply install it on my system, be done with it, and delete it after my access points were set up. That wasn’t ideal, though, because I would need to reinstall the software if I needed to reconfigure the access points or do any kind of monitoring.

_images/unifi-deb.png

Note

And even worse. This package depends on not only Java but MongoDB as well! These are pieces of software that I do not want living permanently on my laptop. A solution I thought of for this problem was to stick the package and all its dependencies in a nice little container that I can shut off and forget about unless I need it again.

Demo Time

Docker

_images/docker.png

Docker - Overview 1/2

Docker - Overview 2/2

Docker - Useful Applications

Note

  • Microservices – oooooh, autoscaling – aaaahhh
  • Ephemeral means they last for a very short time and stateless means they don’t keep any state to themselves. When you’re designing a system, it’s important to keep as many pieces stateless as possible. This reduces side-effects from software and makes a system much easier to reason about.
  • It’s useful as a tool to hand to all the developers of a software project to give them the ability to run a local copy of all the software that your group currently has in production. That way they can do local testing and development without breaking anybody else.
  • Similarly, when it comes time to land the code that the developers have been working on, containers can be used as a clean slate upon which to test the new code. The clean slate is important for ensuring that previous tests did not contaminate the system and cause the results to be incorrect.

Docker - Grabbing images

$ docker pull ubuntu:16.04
16.04: Pulling from library/ubuntu
Status: Downloaded newer image for ubuntu:16.04

$ docker search haproxy
NAME    DESCRIPTION               STARS  OFFICIAL  AUTOMATED
haproxy HAProxy - The Reliable... 813    [OK]

$ docker pull haproxy
Using default tag: latest
Status: Downloaded newer image for haproxy:latest

$ docker images
REPOSITORY  TAG     IMAGE ID      CREATED     SIZE
ubuntu      16.04   2d696327ab2e  2 weeks ago 122MB
haproxy     latest  9a3ab33b1fee  2 weeks ago 136MB

$ docker rmi haproxy:latest

Docker - Container Management

$ docker run -ti --rm ubuntu:16.04 /bin/bash
root@fbd3a7b2795a:/# head -2 /etc/os-release
NAME="Ubuntu"
VERSION="16.04.3 LTS (Xenial Xerus)"
root@fbd3a7b2795a:/# exit

$ vim /home/bkero/configs/haproxy.cfg
$ docker run \
         --rm \
         --name haproxy \
         -v /home/bkero/configs:/usr/local/etc/haproxy \
         haproxy

haproxy-systemd-wrapper: executing /usr/local/sbin/haproxy -p
  /run/haproxy.pid -f /usr/local/etc/haproxy/haproxy.cfg -Ds

$ curl http://172.17.0.2:8080/
<html><body><h1>503 Service Unavailable</h1>
$

Harder, Better, Faster, Stronger

$ docker images
REPOSITORY  TAG     IMAGE ID      CREATED     SIZE
ubuntu      16.04   2d696327ab2e  2 weeks ago 122MB
haproxy     latest  9a3ab33b1fee  2 weeks ago 136MB

$ docker ps
CONTAINER ID  IMAGE    COMMAND       CREATED      STATUS      NAMES
60115b5c4074  haproxy  "/docker..."  2 hours ago  Up 2 hours  haproxy

$ docker stop haproxy
$ docker rm haproxy
$ docker rmi haproxy

Docker Building

$ cd MyAwesomeImage
$ vim Dockerfile
$ docker build --tag myawesomeimage .
Sending build context to Docker daemon  2.048kB
Step 1/3 : FROM ubuntu:16.04
 ---> 2d696327ab2e
Step 2/3 : RUN touch /myfile
 ---> Running in b1bce2147089
 ---> 793b0666eeea
Removing intermediate container b1bce2147089
Step 3/3 : CMD true
 ---> Running in f975f5af30ea
 ---> d3cb7b89667b
Removing intermediate container f975f5af30ea
Successfully built d3cb7b89667b
Successfully tagged myawesomeimage:latest

Dockerfiles

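FROM ubuntu:xenial
MAINTAINER Ben Kero <bkero@bke.ro>

# Default connection limit, overridable at run time with -e MAXCONNS=...
ENV MAXCONNS 5

# Update and install in one layer so a cached package index never goes stale
RUN apt-get update && apt-get install -y haproxy
COPY haproxy.cfg /etc/haproxy/haproxy.cfg

EXPOSE 80
# Shell form so that $MAXCONNS is expanded when the container starts
CMD /usr/sbin/haproxy -n $MAXCONNS -f /etc/haproxy/haproxy.cfg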

_images/rkt.jpg

Rkt - Overview

Rkt - Useful Applications

Rkt - Usage

$ rkt run --interactive quay.io/coreos/alpine-sh
rkt: using image from local store for image name coreos.com/rkt/stage1-coreos:1.0.0
rkt: searching for app image quay.io/coreos/alpine-sh
rkt: remote fetching from URL "https://quay.io/c1/aci/quay.io/coreos/alpine-sh/latest/aci/linux/amd64/"
prefix: "quay.io/coreos/alpine-sh"
key: "https://quay.io/aci-signing-key"
gpg key fingerprint is: BFF3 13CD AA56 0B16 A898 7B8F 72AB F5F6 799D 33BC

/ #

$ rkt run --interactive docker://ubuntu \
      --insecure-options=image

Rkt - Build an Example Program

hello.go
package main
import "fmt"

func main() {
    fmt.Println("hello world")
}

$ go build hello.go

$ ./hello
hello world

Rkt - Building Images

$ acbuild begin
$ acbuild set-name example.com/helloworld
$ acbuild copy hello /bin/hello
$ acbuild set-exec /bin/hello
$ acbuild write hello-world.aci
$ acbuild end

$ rkt run --insecure-options=image hello-world.aci

_images/orchestra.jpg

Docker Swarm - Overview

Docker Swarm - Setup

$ docker swarm init --advertise-addr 192.168.0.100
Swarm initialized: current node (n2xt5ovgdrd9ljqzoc88nhm6b)
  is now a manager.

To add a worker to this swarm, run the following command:

   docker swarm join --token SWMTKN-1-37mvzp...n2w 192.168.0.100:2377

To add a manager to this swarm, run 'docker swarm join-token manager'
  and follow the instructions.

$ docker info | grep -A5 Swarm
Swarm: active
 NodeID: n2xt5ovgdrd9ljqzoc88nhm6b
 Is Manager: true
 ClusterID: eeb6rd5jnby1e8y6ajimz71lq
 Managers: 1
 Nodes: 1

Docker Swarm - Joining Nodes

$ docker node list
ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
n2xt5ovgdrd9ljqzoc88nhm6b *  Pioneer   Ready   Active        Leader

$ docker swarm join --token SWMTKN-1-37mvzp...n2w 192.168.0.100:2377

$ docker node list
ID                           HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
n2xt5ovgdrd9ljqzoc88nhm6b *  Pioneer   Ready   Active        Leader
6bmhn88cozqjl9drdgvo5tx2n    Daedalus  Ready   Active

Docker Swarm - Deploying Services

$ docker service create --name haproxy --replicas 5 haproxy
$ docker service list
ID            NAME     MODE        REPLICAS  IMAGE               PORTS
omp28uf5l7jg  haproxy  replicated  5/5       haproxy:latest

$ docker service update --replicas 10 haproxy
$ docker service list
ID            NAME     MODE        REPLICAS  IMAGE               PORTS
omp28uf5l7jg  haproxy  replicated  10/10     haproxy:latest

Note

  • It’s not just replica count. You can also specify mounts, limit RAM and CPU, add and remove ports, roll back to a previous tag, add or adjust environment variables, etc.
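A few of those knobs, sketched as commands against the haproxy service from this slide. The port numbers, the limits, and the DEBUG variable are arbitrary examples. Because the commands only make sense on a swarm manager, the sketch prints them by default and only executes them when DRY_RUN=0 is set. The svc wrapper is a made-up helper.

```shell
#!/bin/sh
# svc is an illustrative wrapper: it prints the docker command by default
# and runs it for real only when DRY_RUN=0.
svc() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        docker service "$@"
    else
        echo "docker service $*"
    fi
}

# Cap memory and CPU for every replica.
svc update --limit-memory 256M --limit-cpu 0.5 haproxy
# Publish an extra port and set an environment variable.
svc update --publish-add 8080:80 --env-add DEBUG=1 haproxy
# Roll the service back to its previous definition.
svc update --rollback haproxy
```

Run it once as-is to review what would be executed, then again with DRY_RUN=0 on a manager node to apply the changes.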

Docker Compose

version: '3'

services:
  web:
    image: 192.168.0.100:5000/myawesomeapache
    ports:
      - "8000:8000"
    environment:
      - "DEBUG=1"
  db:
    image: 192.168.0.100:5000/mysql:6.5

$ docker-compose up -d
Creating network "stackdemo_default" with the default driver
Building web
...(build output)...
Creating web_1
Creating demo_db_1
$ docker-compose stop web
$ docker-compose up web

In Summary

Resources

Note

The Open Container Initiative is a Linux Foundation project that’s trying to make standards in this space. Standards for things like disk image formats, metadata bundles, and so on.

That’s All