
How to structure a Node.js Express project


"How do I structure my Node.js/Express project?". I see this question come up quite regularly, especially on the Node.js and Javascript subreddits. I also remember thinking the exact same thing years ago when I first started using Node. Today, Im going to outline how I typically start structuring projects so that they're modular and easy to extend as the project grows.

The Majestic Monolith

At the end of February 2016, David Heinemeier Hansson (you may know him simply as DHH: creator of Rails, founder of Basecamp, etc.) wrote an article on Medium called "The Majestic Monolith". For the longest time, web apps existed as gigantic, monolithic codebases, where a single codebase contained nearly everything the app needed to run. In fact, only within the last five or so years have the terms "microservice" and "service oriented architecture" become really mainstream; so mainstream that I see people on discussion forums trying to pre-optimize the ever loving crap out of their platforms before they even exist!

Stop. Hold on. Back up. Let's first talk about the reasons microservices and service oriented architectures exist. These modular patterns exist generally to solve a problem of scale; this could be one of a number of things. Maybe you have a very large team and you want to break things up into smaller pieces so that smaller teams can own specific things. Take Google for example. They have hundreds of public facing services and god knows how many internal services. Splitting things into a SOA makes sense for them. At the same time though, all of their code is in a single monolithic repository.

What about scaling the actual application? Once you hit a certain size, maybe you need to split things up and have an authorization server, a billing service, a logging service; this way you can scale each service independently without (hopefully) bringing down the entire platform.

Embrace the Majestic Monolith - at least to start

As a single developer, or even a team of 5, starting on a project for the first time, you don't have any of the problems above. Rather than worrying about dependency management of Node modules and spending time trying to write, deploy, and monitor tons of services, my suggestion is to start with a monolith.

Just because you are starting with a monolith doesn't mean it can't be modular.

Starting with a monolith gives you a few advantages:

All of your code is in one place

This makes managing things easy. Rather than writing Node modules that are installed via npm, you can require them out of a directory of your project. As a quick sketch (the lib/ paths follow the layout shown later in this post):
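
// no npm publish/install cycle; just require modules out of the project tree
const user = require('./lib/user');
const company = require('./lib/company');

Because of this...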

Everyone on your team can find things easily

There's only one repository to look at, which means you don't have to go digging through tons of repos on Github to find what you're looking for. Git exists for a reason, so "there are too many people doing too many things at once" is really a poor excuse. Instead, learn how to use branches properly and merge features cleanly. Feature flags are also your friend here.

No npm dependency management hell

From personal experience, prematurely creating npm modules is just shooting yourself in the foot. If you end up with three runnable services that depend on the same module, that is now three things that can easily break, especially if the shared module does something important like interact with your database. If you make a schema change, you now need to go through the tedious process of updating the version of your DB module in each service, re-testing each service, deploying it, and so on. This gets incredibly annoying, especially while your schema is still being hashed out and prone to change.

Building your monolith majestic

Let's say for instance that we're writing a RESTful API service built on top of PostgreSQL. I tend to have three different layers to provide the best combination of separation of concerns and modularity. The example I'm going to walk you through is fairly simple: we're going to have the notion of a "company" in our database, and each "company" can have n many "users" associated with it. Think of this as the start of a multi-seat SaaS app where users are grouped/scoped by the company they work for, but can only belong to one company.

Here's the directory structure we'll be working with:

.
├── index.js
├── lib
│   ├── company
│   │   └── index.js
│   └── user
│       └── index.js
├── models
│   ├── company.js
│   └── user.js
├── routes
│   └── account
│       └── index.js
└── services
    └── account
        └── index.js

The schema of our models is going to look something like this:

user
----------
- id
- name
- email
- password
- company_id


company
----------
- id
- name


user (1) --> (1) company
company (1) --> (n) user

Let's start with the foundation of our platform:

Models

These are the core of everything. I really like sequelize, as it's a very featureful and powerful ORM that can also get out of your way if you need to write raw SQL queries.

Your models (and thus the data in your database) are the very foundation of everything in your application. Your models can express relationships and are used to build sets of data to eventually send to the end user.
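
We won't walk through the model files in detail, but as a hedged sketch, here's what models/user.js might look like with the sequelize 3.x API of the era (the field names follow the schema above; the options and association are illustrative assumptions, not requirements):

models/user.js

'use strict';

module.exports = function(sequelize, DataTypes) {
    return sequelize.define('User', {
        name: DataTypes.STRING,
        email: { type: DataTypes.STRING, unique: true },
        password: DataTypes.STRING
    }, {
        // snake_case columns, so the generated foreign key becomes company_id
        underscored: true,
        classMethods: {
            associate: function(models) {
                // company (1) --> (n) user
                models.User.belongsTo(models.Company);
            }
        }
    });
};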

Core library/model business logic/CRUD layer

This is a small step up from the model level, but still pretty low level. This is where we start to actually interact with our models. Typically I'll create a corresponding file for each model that wraps its basic CRUD operations so that we're not repeating the same operations all over the place. The reason I do this here and not in the model is so we can start to handle some higher-level features.

Given our example use-case, if you wanted to list all users in a company, your model shouldn't be concerned with interpreting query data; it is only concerned with actually querying the database. For example:

lib/user/index.js

let models = require('../../models');

const listUsersForCompany = exports.listUsersForCompany = (companyId, options = { limit: 10, offset: 0 }) => {
    let { limit, offset } = options;

    return models.User.findAll({
        where: {
            company_id: companyId
        },
        limit: limit,
        offset: offset
    })
    .then((users) => {
        // if we got a full page back, hand the caller a cursor for the next page
        let cursor = null;

        if(users.length === limit){
            cursor = {
                limit: limit,
                offset: offset + limit
            };
        }

        return [users, cursor];
    });
};

In this example, we've created a very basic function to list users given a companyId and some limit/offset parameters.

Each of these modules should correspond to a particular model. At this level, we don't want to introduce dependencies on other model modules, to allow for the greatest level of composability. That's where the next level up comes in:

Services

I refer to these modules as services because they take different model-level modules and perform some combination of actions. Say we want to write a registration system for our application. When a user registers, you take their name, email, and password, but you also need to create a company profile which could potentially have more users down the road.

One company per user, many users per company. Because a user depends on the existence of a company, we're going to create the two together in a transaction to illustrate how a service works.

We have our user module:

lib/user/index.js

let models = require('../../models');

exports.createUser = (userData = {}, transaction) => {
    // do some stuff here like hash their password, etc

    // only pass the transaction option through if we were actually given one
    let txn = (!!transaction ? { transaction: transaction } : {});
    return models.User.create(userData, txn);
};

And our company module:

lib/company/index.js


let models = require('../../models');

exports.createCompany = (companyData = {}, transaction) => {
    // do some other prep stuff here

    let txn = (!!transaction ? { transaction: transaction } : {});
    return models.Company.create(companyData, txn);
};

And now we have our service that combines the two:

services/account/index.js


const User = require('../../lib/user');
const Company = require('../../lib/company');

let models = require('../../models');

exports.registerUserAndCompany = (data = { user: {}, company: {} }) => {

    return models.sequelize.transaction()
    .then((t) => {
        return Company.createCompany(data.company, t)
        .then((company) => {
            let user = data.user;
            user.company_id = company.get('id');

            return User.createUser(user, t);
        })
        .then((user) => {
            return t.commit()
            .then(() => user);
        })
        .catch((err) => {
            t.rollback();
            throw err;
        });
    });
};

By doing things this way, it doesn't matter if the user and company are created in the same transaction, or even the same request, or even if a company is created at all. For example, what if we wanted to create a user and add them to an existing company? We can either add another function to our account service, or our route handler could call our user module directly, since it would already have the company_id in the request payload.
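
To round out the routes/ directory from our tree, here's a hedged sketch of what routes/account/index.js could look like; the URL path, status codes, and error handling are illustrative assumptions:

routes/account/index.js

'use strict';

const express = require('express');

const AccountService = require('../../services/account');

let router = express.Router();

// assumes a body parser is mounted upstream so req.body is populated
router.post('/register', (req, res) => {
    AccountService.registerUserAndCompany(req.body)
    .then((user) => res.status(201).json(user))
    .catch((err) => res.status(500).json({ error: err.message }));
});

module.exports = router;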

My app has grown; I NEED microservices!

That's great! However, you can still build microservices without breaking apart your monolithic repository (at least until you absolutely need to due to team size, iteration speed, etc). Our goal from the beginning was to structure our application in a way that is modular and composable. This means there is nothing wrong with creating new executables that simply use your monolith as a central library. This way, everything remains in the same repository and all services share the same identical modules. You've essentially created a core library that you can build things on top of.

The only overhead when deploying things is the potential duplication of your repository across services. If you're using something like Docker, which has filesystem layering, or rkt containers, which also do some filesystem magic to cache things, you can actually share the single repository and simply execute whichever service you need, so that overhead potentially decreases.
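
For instance, here's a hedged sketch of what one of those thin executables might look like, reusing the account routes from earlier (the bin/ path and port are made up for illustration):

bin/account-service.js

'use strict';

const express = require('express');

// the monolith is just a library here; mount only the routes this service owns
const accountRoutes = require('../routes/account');

const app = express();

app.use('/account', accountRoutes);

app.listen(3000, () => {
    console.log('account service listening on port 3000');
});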

Getting started with React and Redux - Part 1


React is pretty awesome, but getting started can be tough. Do you use Flux? Redux? Do you use the new ES6 features and compile with Babel? How do you compile everything with Webpack?

Back in December, 2015 was dubbed "The Year of Javascript Fatigue", and rightfully so. You have all of these new technologies and libraries being developed, and before people can decide on a best practice, the next hot library has hit. Maybe you found yourself wanting to try out these cool new things, but quickly felt turned off by how hard it was to get started, because literally everyone had an opinion on how you should do it.

Now that the dust has settled a little bit, we're going to walk through how to set up a React project using Redux for our data layer, Babel for transpiling ES6 features, and Webpack for bundling it all together.

For those out of the loop, React is a component-based Javascript view library built by Facebook. "How does this compare to Angular?" you ask. React is just the "view" portion of MV-whatever, allowing you to choose how you architect your data.

Many of the popular state-management libraries follow the action-reducer pattern set out by Flux. At a high level, Flux dictates that data flows in only one direction (unlike Angular's two-way data binding) and thus state is managed by a centralized data store. Over time, the Flux pattern has been refined and simplified, so for the purposes of this how-to, we're going to look at Redux, which is a bit easier to grasp.
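
To make that one-way flow concrete before we dive into setup, here's a tiny hedged sketch using plain Redux (this assumes the redux package itself is installed; we won't actually wire Redux into our app until part 2):

import { createStore } from 'redux';

// a reducer is a pure function: (currentState, action) -> newState
const counter = (state = 0, action) => {
    switch (action.type) {
        case 'INCREMENT':
            return state + 1;
        default:
            return state;
    }
};

const store = createStore(counter);

// data flows one way: dispatch(action) -> reducer -> store -> subscribers
store.subscribe(() => console.log('state is now:', store.getState()));
store.dispatch({ type: 'INCREMENT' }); // logs "state is now: 1"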

Getting started

There are two ways you can go about all of this:

  1. Simply use React and all of the libraries standalone without any build/compilation tools
  2. Set up a build/compilation environment with things like Babel and Webpack.

Option 2 is more complicated, but probably what you'll run into in a production setting, so that's what we're going to walk through. This means we're going to use the new ES2015/ES7 features provided by Babel, and we're going to use Webpack to bundle everything into a single Javascript file for distribution.

Installing Webpack and Babel

First let's initialize our project:

mkdir react-intro && cd react-intro

npm init -y
mkdir src
mkdir src/components
mkdir src/store
mkdir -p dist/js
mkdir server
touch src/main.js
touch server/index.js

This should give us a directory structure that looks like this:

.
├── dist
│   └── js
├── package.json
├── server
│   └── index.js
└── src
    ├── components
    ├── main.js
    └── store

In your project's directory, we want to run the following to install Webpack and the necessary Babel plugins:

npm install --save \
    webpack \
    babel-loader \
    babel-core \
    babel-plugin-syntax-jsx \
    babel-preset-react \
    babel-preset-es2015 \
    babel-preset-stage-0

Our package.json file now looks like this:

{
  "name": "react-intro",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "babel-core": "^6.7.2",
    "babel-loader": "^6.2.4",
    "babel-plugin-syntax-jsx": "^6.5.0",
    "babel-preset-es2015": "^6.6.0",
    "babel-preset-react": "^6.5.0",
    "babel-preset-stage-0": "^6.5.0",
    "webpack": "^1.12.14"
  }
}

Setting up our build process

To compile everything we need to create two files:

  • A webpack.config.js to tell webpack how to compile everything
  • A .babelrc to tell Babel which presets to load and use

./.babelrc


{
    "presets": [
        "react",
        "es2015",
        "stage-0"
    ]
}

./webpack.config.js

'use strict';

let path = require('path');

module.exports = {
    entry: path.resolve(__dirname + '/src/main.js'),
    output: {
        path: path.resolve(__dirname + '/dist/js'),
        filename: 'main.js',
        devtoolLineToLine: true
    },
    module: {
        loaders: [
            {
                test: /src\/.+.jsx?$/,
                exclude: /node_modules/,
                loader: 'babel'
            }
        ]
    }
}

A lot of the webpack config can look scary and complicated, so let's take a few of these sections and break them down:

{
    entry: path.resolve(__dirname + '/src/main.js'),
}

This tells webpack that ./src/main.js is the entry point to our application and where it should start compiling.

{
    output: {
        path: path.resolve(__dirname + '/dist/js'),
        filename: 'main.js',
        devtoolLineToLine: true
    }
}

Now we tell it to place the compiled source into a file called main.js in the ./dist/js directory.

{
    loaders: [
        {
            test: /src\/.+.jsx?$/,
            exclude: /node_modules/,
            loader: 'babel'
        }
    ]
}

This loader is specific to Babel. It says that every file under the ./src directory with a .js or .jsx extension (excluding node_modules) should be compiled using the presets specified. The presets are Babel-specific and give us not only ES2015/ES6 features, but also some ES7 features (that's what stage-0 gives us).

Install react, react-redux, react-router, and friends

Now we want to install react, react-dom, react-redux, react-router, immutable, redux-actions, and redux-logger:

  • react: will allow us to define and structure our views using JSX
  • react-redux: react-specific bindings for redux to manage the state of our app
  • react-router: gives us the ability to load specific components for a given route
  • immutable: to make change detection easier; with Immutable data structures, a simple reference comparison tells us whether two objects are equal.
  • redux-actions: eliminates a lot of boilerplate when creating redux actions
  • redux-logger: a logger middleware that makes it easy to visualize what actions are firing and what data is changing.

npm install --save \
    react \
    react-dom \
    react-redux \
    react-router \
    immutable \
    redux-actions \
    redux-logger

Creating a test server

One of the nice things about react-router is that it supports the HTML5 history API. In order to properly illustrate this and support a hard reload when you navigate to a new page, we're going to run a small Node.js server with Express to serve up our client-side app and handle server-side routing.

Install the dependencies

We'll need two modules - express and serve-static.

npm install --save \
    express \
    serve-static

The server

Our server is super simple: just an HTML template that includes our client-side app and provides a mount point (<div id="app-root">) for our application.

'use strict';

const express = require('express');
const serveStatic = require('serve-static');
const path = require('path');

const template = `
<html>
    <head></head>
    <body>
        <div id="app-root"></div>
        <script type="text/javascript" src="/js/main.js"></script>
    </body>
</html>
`;

const app = express();

app.use(serveStatic(path.resolve(__dirname + '/../dist')));

app.get('*', (req, res) => {
    res.type('text/html');
    res.send(template);
});

app.listen(8080, () => {
    console.log('server listening on port 8080');
});

Starting the server is as simple as running:

node server

Writing and compiling our first component

As a super basic example to make sure that we have everything set up correctly, we're going to create a single react component and render it to the DOM. Your src/main.js file should look like this:

'use strict';

import React, { Component } from 'react';
import { render } from 'react-dom';

class TestComponent extends Component {
    constructor(props){
        super(props);
    }

    render(){
        return (
            <h1>Hello World!</h1>
        );
    }
}

render(<TestComponent />, document.getElementById('app-root'));

To compile, run webpack:

./node_modules/.bin/webpack --config ./webpack.config.js
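
If you'd rather not type that path every time, you can optionally add a build script to your package.json; npm resolves binaries out of node_modules/.bin for scripts automatically:

{
  "scripts": {
    "build": "webpack --config ./webpack.config.js"
  }
}

Then building is just:

npm run build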

Start up your server and you should see a nice big "Hello World" on the page.

Ready to start building a more functional application? Stay tuned for part 2!

How to run Node.js in a rocket container


Last time we talked about "Building a fedora-based rocket container", so today we're going to use that as a base to build a container for running NodeJS applications.

If you are just joining us, please go back and read "Building a fedora-based rocket container" as the post includes instructions on how to get set up with rkt, acbuild, and actool and introduces the basics of building a container.

Building a NodeJS rkt container

While it is possible to statically compile node, native npm modules will sometimes need to link against libraries included in the system. So for this particular demonstration, we're going to use the Fedora container we created in the previous post as the base for our node container.

Back to acbuild

Our acbuild procedure is going to look something like this:

sudo acbuild begin
sudo acbuild dependency add <your domain>/fedora:latest
sudo acbuild set-name <your domain>/nodejs
sudo acbuild label add version "4.2.3"
sudo acbuild run -- /bin/bash -c "curl https://nodejs.org/dist/v4.2.3/node-v4.2.3-linux-x64.tar.gz | tar xvz --strip-components=1 -C /usr/local"
sudo acbuild write nodejs-4.2.3-linux-amd64.aci
sudo acbuild end

Let's go through this step by step:

sudo acbuild dependency add <your domain>/fedora:latest

This tells acbuild to use the Fedora container we built in the previous post as a dependency. As you can see, we're also specifying a version of latest. acbuild will first check the local container cache to see if the container exists; otherwise it will use HTTP-based discovery to locate it (more on discovery and how to set it up in a later post).

sudo acbuild label add version "4.2.3"

Since we're pulling in node v4.2.3, we'll tag our container's version to match.

sudo acbuild run -- /bin/bash -c "curl https://nodejs.org/dist/v4.2.3/node-v4.2.3-linux-x64.tar.gz | tar xvz --strip-components=1 -C /usr/local"

acbuild run is analogous to the RUN instruction you would find in a Dockerfile; it can be used to execute a command within the container. In the case of acbuild (and rkt), acbuild actually starts systemd-nspawn to run the command against the rootfs defined by the included dependencies.

sudo acbuild write nodejs-4.2.3-linux-amd64.aci

Now we're getting a little fancier with our file naming. In this case, we have named our aci in a way that allows us to make it discoverable later on, following the format:

{name}-{version}-{os}-{arch}.{ext}

So if I named my container seanmcgary.com/nodejs, the discovery mechanism would look at:

https://seanmcgary.com/nodejs-4.2.3-linux-amd64.aci

Packaging an application

Now that we have our nodejs base container, we can create another container to house our test application. A while back I wrote a little app called stupid-server, which can be found on GitHub at seanmcgary/stupid-server. Let's create our container:

# first clone the repo
git clone https://github.com/seanmcgary/stupid-server.git

sudo acbuild begin
sudo acbuild dependency add <your domain>/nodejs:4.2.3
sudo acbuild set-name <your domain>/stupid-server
sudo acbuild label add version 1.0.0
sudo acbuild copy ./stupid-server /stupid-server
sudo acbuild set-exec -- /bin/bash -c "node /stupid-server"
sudo acbuild write stupidServer-1.0.0-linux-amd64.aci
sudo acbuild end

We have some new commands in our process:

sudo acbuild copy ./stupid-server /stupid-server

This one is pretty straightforward: it takes a local file/directory and a destination path telling acbuild where to put it in your container.

sudo acbuild set-exec -- /bin/bash -c "node /stupid-server"

Here, we are specifying what to run when rkt executes our container. set-exec is analogous to the CMD <command> instruction found in a Dockerfile.
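
For those coming from Docker, a rough (and hedged) Dockerfile equivalent of this build would look something like:

FROM <your domain>/nodejs:4.2.3
COPY ./stupid-server /stupid-server
CMD ["/bin/bash", "-c", "node /stupid-server"]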

Running our application

As a quick recap, we have an application that inherits a chain of containers that looks like this:

fedora --> nodejs --> stupid-server

Now we can actually run our container with rkt:

sudo rkt run --insecure-options=all --net=host ./stupidServer-1.0.0-linux-amd64.aci

rkt: using image from local store for image name coreos.com/rkt/stage1-coreos:0.13.0
rkt: using image from file /home/core/node/stupidServer-1.0.0-linux-amd64.aci
rkt: using image from local store for image name seanmcgary.com/nodejs,version=latest
rkt: using image from local store for image name seanmcgary.com/fedora,version=latest

If you want to push it to the background, you can also run it with systemd-run:

sudo systemd-run rkt run --insecure-options=all --net=host /home/core/stupidServer-1.0.0-linux-amd64.aci

Now with your container running, you should be able to hit your server:

curl http://localhost:8000/
2015-12-16T00:36:46.694Z

Wrap up

That's it! Now that you know how to build containers based on other containers, you should be able to figure out how to deploy your own app in a containerized fashion.

Next time, we'll talk about how to set up discovery so that you can host your containers in a central location.

Building a fedora-based rocket container


Containers are basically taking over the world: Docker, rkt, systemd-nspawn, LXC, etc. Today, we're going to talk about rkt (pronounced "rocket") and how to get started building rkt-runnable containers.

What is rkt?

rkt (pronounced "rock-it") is a CLI for running app containers on Linux. rkt is designed to be composable, secure, and fast.

Alright, so what's an App Container (appc)?

An application container is a way of packaging and executing processes on a computer system that isolates the application from the underlying host operating system.

In other words, rkt is a runtime implementation that uses the appc container spec. It leverages systemd to manage processes within the container, making it compatible with orchestration tools such as fleet and Kubernetes.

Containers can include basically anything, from a single static binary to an entire root file system. Today, we're going to look at building a Fedora-based container that can then be used as a foundation for building other containers on top of it. This will effectively give us the equivalent of using the Fedora docker image.

Boot up CoreOS

To follow along, you'll need to boot an instance of CoreOS of some kind (AWS, GCE, Azure, DigitalOcean, Vagrant, etc). I would use either the latest Beta or Alpha channel release to be sure you have the latest versions of rkt and actool.

Fetching Fedora

We're striving for a super minimal image to use as our base layer, and it just so happens that the folks over at Fedora build a Docker base image that is nothing more than a stripped-down Fedora file system. So we're going to use that, but we're not going to use Docker at all to get it.

On the Fedora Koji build system you will find all of the Fedora build images. Builds in green have passed and are probably safe to use. We're going to be using Fedora 23, so look for a build that says f23-candidate, Fedora-Docker-Base-23....

Once you've SSH'd into your machine as the core user, fetch the Fedora image:

mkdir fedoraLayer

# fetch and unpack fedora build
curl https://kojipkgs.fedoraproject.org/work/tasks/7696/12107696/Fedora-Docker-Base-23-20151208.x86_64.tar.xz | tar -xJ -C fedoraLayer
cd fedoraLayer

HASH=$(cat repositories | awk -F '"latest": "' '{ print $2 }' | awk '{ sub(/[^a-zA-Z0-9]+/, ""); print }')

mv $HASH/layer.tar .
rm -rf $HASH repositories
sudo tar -xf layer.tar --same-owner --preserve-permissions

sudo rm layer.tar
cd ../

The HASH variable represents the directory inside the tarball that contains the rootfs; we take the contents of said directory and move it up one level so that /home/core/fedoraLayer contains the rootfs.

Installing acbuild

acbuild is a nifty little interactive CLI for building your container manifest. If you want the most flexibility, you're free to write out the manifest by hand instead.

When this post was written, acbuild was still in early development, so we're going to build it from source. For those unfamiliar with CoreOS, CoreOS comes with basically nothing; no package manager and only a very small set of tools. It does however come with a tool called toolbox which is a container that we can use to actually do some work. We're going to use toolbox to fetch and build acbuild from source.

# Clone acbuild to a known directory inside /home/core.
# We're specifically going to clone it to /home/core/builds/acbuild.
mkdir $HOME/builds && cd $HOME/builds
git clone https://github.com/appc/acbuild.git

toolbox

# now inside toolbox
yum install -y golang git

# the host filesystem is mounted to /media/root
cd /media/root/home/core/builds/acbuild
./build

# exit toolbox by pressing ctrl+c

# now back on the host system, outside of toolbox

sudo mkdir -p /opt/bin || true
# /usr is readonly, but /opt/bin is in the PATH, so symlink our 
# acbuild binary to that directory
sudo ln -s /home/core/builds/acbuild/bin/acbuild /opt/bin

acbuild --help

If all goes well, you should see the acbuild help menu at the very end.

Building our container

acbuild works by declaring that you are beginning the build of a container (this creates a hidden directory in your CWD that will be used to hold the state of everything as we go), running subcommands, writing the aci (app container image), and then telling acbuild that we're done. Here's our build process:

sudo acbuild begin /home/core/fedoraLayer
sudo acbuild set-name <your domain>/fedora
sudo acbuild label add version "latest"
sudo acbuild write fedora.aci
sudo acbuild end
actool validate fedora.aci

What we're doing here is:

  • Telling acbuild to use our fedora rootfs that we extracted as the rootfs for the container
  • Setting the name to <your domain>/fedora. For example, I would use seanmcgary.com/fedora. This is very similar to the naming convention you see in docker when hosting containers on something like Quay.io and acts as a name that you will reference your container by.
  • We set the label "version" to "latest"
  • We write out everything to fedora.aci. This is what we will actually run with rkt.
  • Tell acbuild we're done
  • Validate our container with actool.

That's it, we're done! Well, almost. We have a container, but it's pretty useless because we didn't tell it what to execute when we run it with rkt. Let's create it again, but this time we'll tell it to run /bin/date.

sudo acbuild begin /home/core/fedoraLayer
sudo acbuild set-name <your domain>/fedora
sudo acbuild label add version "latest"
sudo acbuild set-exec -- /bin/date
sudo acbuild write fedora.aci
sudo acbuild end
actool validate fedora.aci

Now we can actually run it:

sudo rkt run --insecure-options=all ./fedora.aci
rkt: using image from local store for image name coreos.com/rkt/stage1-coreos:0.11.0
rkt: using image from file /home/core/fedora.aci
[123872.191605] date[4]: Tue Dec 15 00:14:06 UTC 2015

Advanced rkt containers

Stay tuned for more posts about building more advanced rkt containers, building your own container repository with appc discovery, and more!

Deploying Node.JS applications with systemd


You've built a node application and now it's time to deploy and run it. But how do you make sure that if your app crashes, it restarts automatically? Or if the host machine goes down, how do you ensure that your app comes back up? I've seen a number of people across the internet suggest things like node-supervisor, forever, nodemon, and hell, even GNU screen. These might be fine for running a server locally or in a testing environment, but they have a (pretty large) drawback: the underlying process is (by default) managed by the user that ran it. This means that the process managing your node app (supervisor, forever, nodemon, screen, etc) isn't itself managed by anything, and if it goes down, then what? I thought we were going for uptime here...

For whatever reason, people seem to forget that the underlying operating system (we're assuming Linux here) has an init system designed to do exactly what we want. Now that the majority of the major Linux distros ship with systemd, it's easier than ever to make sure your node app is properly managed, not to mention systemd can handle logging for you as well.

Setting up your machine

We're going to be using Fedora 23 for the purpose of this article and installing node directly on it.

curl https://nodejs.org/download/release/v4.2.1/node-v4.2.1-linux-x64.tar.gz | sudo tar xvz --strip-components=1 -C /usr/local

node -v
# v4.2.1

What is systemd?

systemd is a suite of basic building blocks for a Linux system. It provides a system and service manager that runs as PID 1 and starts the rest of the system. systemd provides aggressive parallelization capabilities, uses socket and D-Bus activation for starting services, offers on-demand starting of daemons, keeps track of processes using Linux control groups, supports snapshotting and restoring of the system state, maintains mount and automount points and implements an elaborate transactional dependency-based service control logic.

tl;dr - at a high level, systemd gives you:

  • process management
  • logging management
  • process/service dependency management via socket activation

Our test node application

We're going to use this stupidly simple HTTP server as our test application. It's so simple, that it doesn't even depend on any external packages.

server.js

var http = require('http');

var server = http.createServer(function(req, res){
    res.end(new Date().toISOString());
});

server.listen(8000);

We'll build on this as we go to demonstrate the features of systemd.

Running your application

To run our application, we need to write out a unit file that describes what to run and how to run it. For this, we'll want to consult the systemd.service and systemd.unit documentation pages.

Here's a super simple unit file to get us started:

node-server.service

[Unit]
Description=stupid simple nodejs HTTP server

[Service]
WorkingDirectory=/path/to/your/app
ExecStart=/usr/local/bin/node server.js
Type=simple

Place this file in /etc/systemd/system and run:

sudo systemctl start node-server.service

systemctl is the utility to manage systemd-based services. When given a unit name that isn't a direct path, it looks in /etc/systemd/system, attempting to match the provided name to unit file names. Now, let's check the status of our service:

systemctl status node-server.service

node-server.service - stupid simple nodejs HTTP server
   Loaded: loaded (/etc/systemd/system/node-server.service; static; vendor preset: disabled)
   Active: active (running) since Mon 2015-11-30 11:40:18 PST; 3s ago
 Main PID: 17018 (node)
   CGroup: /system.slice/node-server.service
           └─17018 /usr/local/bin/node server.js

Nov 30 11:40:18 localhost.localdomain systemd[1]: Started stupid simple nodejs HTTP server.
Nov 30 11:40:18 localhost.localdomain systemd[1]: Starting stupid simple nodejs HTTP server...

Awesome, it looks to be running! Let's curl our HTTP server and see if it actually is:

curl http://localhost:8000/
2015-11-30T19:43:17.102Z

Managing logs with systemd-journald

Now that we have everything running, let's modify our script to print something to stdout when a request comes in.

var http = require('http');

var server = http.createServer(function(req, res){
    var date = new Date().toISOString();
    console.log('sending date: ', date);
    res.end(date);
});

server.listen(8000);

Edit your server to look like the code above. For logging, all we need to do is log directly to stdout and stderr; systemd-journald will handle everything else from here. Now, let's restart our server and tail the log:

sudo systemctl restart node-server.service
journalctl -f -u node-server.service

-- Logs begin at Mon 2015-10-19 17:41:06 PDT. --
Nov 30 11:40:18 localhost.localdomain systemd[1]: Started stupid simple nodejs HTTP server.
Nov 30 11:40:18 localhost.localdomain systemd[1]: Starting stupid simple nodejs HTTP server...
Nov 30 11:46:30 localhost.localdomain systemd[1]: Stopping stupid simple nodejs HTTP server...
Nov 30 11:46:30 localhost.localdomain systemd[1]: Started stupid simple nodejs HTTP server.
Nov 30 11:46:30 localhost.localdomain systemd[1]: Starting stupid simple nodejs HTTP server...

Close out journalctl (ctrl-c) and curl your HTTP server again. You should now see a new line added to the log:

-- Logs begin at Mon 2015-10-19 17:41:06 PDT. --
Nov 30 11:40:18 localhost.localdomain systemd[1]: Started stupid simple nodejs HTTP server.
Nov 30 11:40:18 localhost.localdomain systemd[1]: Starting stupid simple nodejs HTTP server...
Nov 30 11:46:30 localhost.localdomain systemd[1]: Stopping stupid simple nodejs HTTP server...
Nov 30 11:46:30 localhost.localdomain systemd[1]: Started stupid simple nodejs HTTP server.
Nov 30 11:46:30 localhost.localdomain systemd[1]: Starting stupid simple nodejs HTTP server...
Nov 30 11:47:40 localhost.localdomain node[17076]: sending date:  2015-11-30T19:47:40.319Z

Handling crashes and restarting

What if your application crashes? You probably want it to restart; otherwise you wouldn't need an init system in the first place. systemd provides the Restart= property to specify when your application should restart, if at all. We're going to use Restart=always for simplicity's sake, but all of the options can be found in a table on the systemd.service docs page.

Our updated unit file:

[Unit]
Description=stupid simple nodejs HTTP server

[Service]
WorkingDirectory=/path/to/your/app
ExecStart=/usr/local/bin/node server.js
Type=simple
Restart=always
RestartSec=10

Note that we also added RestartSec=10, just so that we can easily see the restart happen in the logs. Now that our unit file is updated, we need to tell systemd to reload it:

sudo systemctl daemon-reload

Before we restart everything, let's modify our server so that it crashes:

var http = require('http');

var server = http.createServer(function(req, res){
    var date = new Date().toISOString();
    console.log('sending date: ', date);
    throw new Error('crashing');
    res.end(date);
});

server.listen(8000);

Now we can restart everything:

sudo systemctl restart node-server.service

Now when you curl your server, it will crash and restart itself. We can verify this by checking the logs as we did above:

journalctl -f -u node-server.service

Nov 30 12:01:38 localhost.localdomain systemd[1]: Started stupid simple nodejs HTTP server.
Nov 30 12:01:38 localhost.localdomain systemd[1]: Starting stupid simple nodejs HTTP server...
Nov 30 12:02:20 localhost.localdomain node[17255]: sending date:  2015-11-30T20:02:20.807Z
Nov 30 12:02:20 localhost.localdomain systemd[1]: node-server.service: Main process exited, code=exited, status=1/FAILURE
Nov 30 12:02:20 localhost.localdomain systemd[1]: node-server.service: Unit entered failed state.
Nov 30 12:02:20 localhost.localdomain systemd[1]: node-server.service: Failed with result 'exit-code'.
Nov 30 12:02:30 localhost.localdomain systemd[1]: node-server.service: Service hold-off time over, scheduling restart.
Nov 30 12:02:30 localhost.localdomain systemd[1]: Started stupid simple nodejs HTTP server.
Nov 30 12:02:30 localhost.localdomain systemd[1]: Starting stupid simple nodejs HTTP server...

Starting your app on boot

Oftentimes you may want your application or service to start when the machine boots (or reboots, for that matter). To do this, we need to add an [Install] section to our unit file:

[Unit]
Description=stupid simple nodejs HTTP server

[Service]
WorkingDirectory=/path/to/your/app
ExecStart=/usr/local/bin/node server.js
Type=simple
Restart=always
RestartSec=10

[Install]
WantedBy=basic.target

Now, we can enable it:

sudo systemctl enable node-server.service
Created symlink from /etc/systemd/system/basic.target.wants/node-server.service to /etc/systemd/system/node-server.service.

When control is handed off to systemd on boot, it goes through a number of stages:

local-fs-pre.target
         |
         v
(various mounts and   (various swap   (various cryptsetup
 fsck services...)     devices...)        devices...)       (various low-level   (various low-level
         |                  |                  |             services: udevd,     API VFS mounts:
         v                  v                  v             tmpfiles, random     mqueue, configfs,
  local-fs.target      swap.target     cryptsetup.target    seed, sysctl, ...)      debugfs, ...)
         |                  |                  |                    |                    |
         \__________________|_________________ | ___________________|____________________/
                                              \|/
                                               v
                                        sysinit.target
                                               |
          ____________________________________/|\________________________________________
         /                  |                  |                    |                    \
         |                  |                  |                    |                    |
         v                  v                  |                    v                    v
     (various           (various               |                (various          rescue.service
    timers...)          paths...)              |               sockets...)               |
         |                  |                  |                    |                    v
         v                  v                  |                    v              rescue.target
   timers.target      paths.target             |             sockets.target
         |                  |                  |                    |
         v                  \_________________ | ___________________/
                                              \|/
                                               v
                                         basic.target
                                               |
          ____________________________________/|                                 emergency.service
         /                  |                  |                                         |
         |                  |                  |                                         v
         v                  v                  v                                 emergency.target
     display-        (various system    (various system
 manager.service         services           services)
         |             required for            |
         |            graphical UIs)           v
         |                  |           multi-user.target
         |                  |                  |
         \_________________ | _________________/
                           \|/
                            v
                  graphical.target

As you can see in the chart, basic.target is the first target hit after the core components of the system come online. This flow chart is how systemd orders services and even resolves service dependencies. When we ran systemctl enable, it created a symlink in /etc/systemd/system/basic.target.wants. Everything symlinked there will be run as part of the basic.target step in the boot process.
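
You can sanity-check the result with systemctl (a hedged example; the exact output will vary by system):

systemctl is-enabled node-server.service
# enabled

systemctl list-dependencies basic.target | grep node-server
# node-server.service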

Wrap up

As you can see, it's pretty simple to get everything up and running with systemd, and you don't have to mess around with userspace process managers. With a small config file and about 10 minutes of time, you no longer have to worry about your app crashing and not being able to restart itself.

Now that you have a basic understanding of systemd and what it offers, you can start really digging into it and exploring all of the other features that it offers to make your application infrastructure even better.

Building a fault-tolerant redis cluster with sentinel

Today, I'm going to show you how to set up a fault-tolerant master/slave redis cluster, using sentinel to fail over lost nodes.

Redis is a very versatile database, but what if you want to run it on a cluster? A lot of times, people will run redis as a standalone server with no backup. But what happens when that machine goes down? Or what if we want to migrate our redis instance to a new machine without downtime?

All of this is possible by creating a replica set (a master node and n many slave nodes) and letting sentinel watch and manage them. If sentinel discovers that a node has disappeared, it will attempt to elect a new master, provided that a majority of sentinels in the cluster agree (i.e. quorum).

The quorum is the number of Sentinels that need to agree about the fact that the master is not reachable, in order to really mark it as failing, and eventually start a failover procedure if possible.

However, the quorum is only used to detect the failure. In order to actually perform a failover, one of the Sentinels needs to be elected leader for the failover and be authorized to proceed. This only happens with the vote of the majority of the Sentinel processes.

In this particular example, we're going to set up our nodes in a master/slave configuration, with 1 master and 2 slave nodes. This way, if we lose one node, the cluster will still retain quorum and be able to elect a new master. In this setup, writes have to go through the master, as slaves are read-only. The upside is that if the master disappears, its entire state has already been replicated to the slave nodes, meaning whichever one is elected master can begin to accept writes immediately. This is different from setting up a redis cluster, where data is sharded across master nodes rather than replicated entirely.

Since sentinel handles electing a master node and sentinel nodes communicate with each other, we can use it as a discovery mechanism to determine which node is the master and thus where we should send our writes.
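
In Node.js land, for example, the ioredis client can do this discovery for you. Here's a minimal hedged sketch (it assumes you've installed the ioredis package, which isn't otherwise part of this post):

'use strict';

const Redis = require('ioredis');

// ioredis asks the listed sentinels for the current master of "redis-cluster"
// and transparently reconnects if a failover promotes a new master
let client = new Redis({
    sentinels: [
        { host: '127.0.0.1', port: 16380 },
        { host: '127.0.0.1', port: 16381 },
        { host: '127.0.0.1', port: 16382 }
    ],
    name: 'redis-cluster'
});

client.set('foo', 'bar')
.then(() => client.get('foo'))
.then((value) => console.log(value)); // "bar"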

Setup

To set up a cluster, we're going to run 3 redis instances:

  • 1 master
  • 2 slaves

Each of the three instances will also have a redis sentinel server running alongside it for monitoring/service discovery. The config files I have for this example can be run on your localhost, or you can change the IP addresses to fit your own use-case. All of this is done using version 3.0.2 of redis.

Configs

If you don't feel like writing configs by hand, you can clone the example repository I have at github.com/seanmcgary/redis-cluster-example. In there, you'll find a directory structure that looks like this:

redis-cluster
├── node1
│   ├── redis.conf
│   └── sentinel.conf
├── node2
│   ├── redis.conf
│   └── sentinel.conf
└── node3
    ├── redis.conf
    └── sentinel.conf

3 directories, 6 files

For the purpose of this demo, node1 will be our starting master node and nodes 2 and 3 will be added as slaves.

Master node config

redis.conf

bind 127.0.0.1
port 6380

dir .

sentinel.conf

# Host and port we will listen for requests on
bind 127.0.0.1
port 16380

#
# "redis-cluster" is the name of our cluster
#
# each sentinel process is paired with a redis-server process
#
sentinel monitor redis-cluster 127.0.0.1 6380 2
sentinel down-after-milliseconds redis-cluster 5000
sentinel parallel-syncs redis-cluster 1
sentinel failover-timeout redis-cluster 10000

Our redis config should be pretty self-explanatory. For the sentinel config, we've chosen the redis-server port + 10000 to keep things somewhat consistent and make it easy to see which sentinel config goes with which server.

sentinel monitor redis-cluster 127.0.0.1 6380 2

The third "argument" here is the name of our cluster. Each sentinel server needs to have the same name and will point at the master node (rather than the redis-server it shares a host with). The final argument (2 here) is how many sentinel nodes are required for quorum when it comes time to vote on a new master. Since we have 3 nodes, we're requiring a quorum of 2 sentinels, allowing us to lose up to one machine. If we had a cluster of 5 machines, which would allow us to lose 2 machines while still maintaining a majority of nodes participating in quorum.

sentinel down-after-milliseconds redis-cluster 5000

For this example, a machine will have to be unresponsive for 5 seconds before being classified as down, thus triggering a vote to elect a new master node.
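
You can also ask a sentinel what it currently believes about the master at any time; for example (output abbreviated here, and the exact fields vary by redis version):

$ redis-cli -p 16380 sentinel master redis-cluster

 1) "name"
 2) "redis-cluster"
 3) "ip"
 4) "127.0.0.1"
 5) "port"
 6) "6380"
 ...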

Slave node config

Our slave node configs don't look much different. This one happens to be for node2:

redis.conf

bind 127.0.0.1
port 6381

dir .

slaveof 127.0.0.1 6380

sentinel.conf

# Host and port we will listen for requests on
bind 127.0.0.1
port 16381

#
# "redis-cluster" is the name of our cluster
#
# each sentinel process is paired with a redis-server process
#
sentinel monitor redis-cluster 127.0.0.1 6380 2
sentinel down-after-milliseconds redis-cluster 5000
sentinel parallel-syncs redis-cluster 1
sentinel failover-timeout redis-cluster 10000

The only difference is this line in our redis.conf:

slaveof 127.0.0.1 6380

In order to bootstrap the cluster, we need to tell the slaves where to look for a master node. After the initial bootstrapping process, redis will actually take care of rewriting configs as we add/remove nodes. Since we're not really worrying about deploying this to a production environment where addresses might be dynamic, we're just going to hardcode our master node's IP address and port.

We're going to do the same for the slave sentinels, as we want them to monitor our master node (node1).

Starting the cluster

You'll probably want to run each of these in something like screen or tmux so that you can see the output from each node all at once.

Starting the master node

redis-server, node1

$ redis-server node1/redis.conf

57411:M 07 Jul 16:32:09.876 * Increased maximum number of open files to 10032 (it was originally set to 256).
                _._                                                  
           _.-``__ ''-._                                             
      _.-``    `.  `_.  ''-._           Redis 3.0.2 (01888d1e/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._                                   
 (    '      ,       .-`  | `,    )     Running in standalone mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6380
 |    `-._   `._    /     _.-'    |     PID: 57411
  `-._    `-._  `-./  _.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |           http://redis.io        
  `-._    `-._`-.__.-'_.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |                                  
  `-._    `-._`-.__.-'_.-'    _.-'                                   
      `-._    `-.__.-'    _.-'                                       
          `-._        _.-'                                           
              `-.__.-'                                               

57411:M 07 Jul 16:32:09.878 # Server started, Redis version 3.0.2
57411:M 07 Jul 16:32:09.878 * DB loaded from disk: 0.000 seconds
57411:M 07 Jul 16:32:09.878 * The server is now ready to accept connections on port 6380

sentinel, node1

$ redis-server node1/sentinel.conf --sentinel

57425:X 07 Jul 16:32:33.794 * Increased maximum number of open files to 10032 (it was originally set to 256).
                _._                                                  
           _.-``__ ''-._                                             
      _.-``    `.  `_.  ''-._           Redis 3.0.2 (01888d1e/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._                                   
 (    '      ,       .-`  | `,    )     Running in sentinel mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 16380
 |    `-._   `._    /     _.-'    |     PID: 57425
  `-._    `-._  `-./  _.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |           http://redis.io        
  `-._    `-._`-.__.-'_.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |                                  
  `-._    `-._`-.__.-'_.-'    _.-'                                   
      `-._    `-.__.-'    _.-'                                       
          `-._        _.-'                                           
              `-.__.-'                                               

57425:X 07 Jul 16:32:33.795 # Sentinel runid is dde8956ca13c6b6d396d33e3a47ab5b489fa3292
57425:X 07 Jul 16:32:33.795 # +monitor master redis-cluster 127.0.0.1 6380 quorum 2

Starting the slave nodes

Now we can go ahead and start our slave nodes. As you start them, you'll see the master node report as they come online and join.

redis-server, node2

$ redis-server node2/redis.conf

57450:S 07 Jul 16:32:57.969 * Increased maximum number of open files to 10032 (it was originally set to 256).
                _._                                                  
           _.-``__ ''-._                                             
      _.-``    `.  `_.  ''-._           Redis 3.0.2 (01888d1e/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._                                   
 (    '      ,       .-`  | `,    )     Running in standalone mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6381
 |    `-._   `._    /     _.-'    |     PID: 57450
  `-._    `-._  `-./  _.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |           http://redis.io        
  `-._    `-._`-.__.-'_.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |                                  
  `-._    `-._`-.__.-'_.-'    _.-'                                   
      `-._    `-.__.-'    _.-'                                       
          `-._        _.-'                                           
              `-.__.-'                                               

57450:S 07 Jul 16:32:57.971 # Server started, Redis version 3.0.2
57450:S 07 Jul 16:32:57.971 * DB loaded from disk: 0.000 seconds
57450:S 07 Jul 16:32:57.971 * The server is now ready to accept connections on port 6381
57450:S 07 Jul 16:32:57.971 * Connecting to MASTER 127.0.0.1:6380
57450:S 07 Jul 16:32:57.971 * MASTER <-> SLAVE sync started
57450:S 07 Jul 16:32:57.971 * Non blocking connect for SYNC fired the event.
57450:S 07 Jul 16:32:57.971 * Master replied to PING, replication can continue...
57450:S 07 Jul 16:32:57.971 * Partial resynchronization not possible (no cached master)
57450:S 07 Jul 16:32:57.971 * Full resync from master: d75bba9a2f3c5a6e2e4e9dfd70ddb0c2d4e647fd:1
57450:S 07 Jul 16:32:58.038 * MASTER <-> SLAVE sync: receiving 18 bytes from master
57450:S 07 Jul 16:32:58.038 * MASTER <-> SLAVE sync: Flushing old data
57450:S 07 Jul 16:32:58.038 * MASTER <-> SLAVE sync: Loading DB in memory
57450:S 07 Jul 16:32:58.038 * MASTER <-> SLAVE sync: Finished with success

sentinel, node2

$ redis-server node2/sentinel.conf --sentinel

                _._                                                  
           _.-``__ ''-._                                             
      _.-``    `.  `_.  ''-._           Redis 3.0.2 (01888d1e/0) 64 bit
  .-`` .-```.  ```\/    _.,_ ''-._                                   
 (    '      ,       .-`  | `,    )     Running in sentinel mode
 |`-._`-...-` __...-.``-._|'` _.-'|     Port: 16381
 |    `-._   `._    /     _.-'    |     PID: 57464
  `-._    `-._  `-./  _.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |           http://redis.io        
  `-._    `-._`-.__.-'_.-'    _.-'                                   
 |`-._`-._    `-.__.-'    _.-'_.-'|                                  
 |    `-._`-._        _.-'_.-'    |                                  
  `-._    `-._`-.__.-'_.-'    _.-'                                   
      `-._    `-.__.-'    _.-'                                       
          `-._        _.-'                                           
              `-.__.-'                                               

57464:X 07 Jul 16:33:18.109 # Sentinel runid is 978afe015b4554fdd131957ef688ca4ec3651ea1
57464:X 07 Jul 16:33:18.109 # +monitor master redis-cluster 127.0.0.1 6380 quorum 2
57464:X 07 Jul 16:33:18.111 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ redis-cluster 127.0.0.1 6380
57464:X 07 Jul 16:33:18.205 * +sentinel sentinel 127.0.0.1:16380 127.0.0.1 16380 @ redis-cluster 127.0.0.1 6380

Go ahead and do the same for node3.

If we look at the log output for node1's sentinel, we can see that the slaves have been added:

57425:X 07 Jul 16:33:03.895 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ redis-cluster 127.0.0.1 6380
57425:X 07 Jul 16:33:20.171 * +sentinel sentinel 127.0.0.1:16381 127.0.0.1 16381 @ redis-cluster 127.0.0.1 6380
57425:X 07 Jul 16:33:44.107 * +slave slave 127.0.0.1:6382 127.0.0.1 6382 @ redis-cluster 127.0.0.1 6380
57425:X 07 Jul 16:33:44.303 * +sentinel sentinel 127.0.0.1:16382 127.0.0.1 16382 @ redis-cluster 127.0.0.1 6380

Find the master node

Now that our cluster is in place, we can ask sentinel which node is currently set as the master. To illustrate this, we'll ask sentinel on node3:

$ redis-cli -p 16382 sentinel get-master-addr-by-name redis-cluster

 1) "127.0.0.1"
 2) "6380"

As we can see here, the IP and port values match node1, the master node we started with.

Electing a new master

Now let's kill off our original master node:

$ redis-cli -p 6380 debug segfault

Looking at the logs from node2's sentinel we can watch the new master election happen:

57464:X 07 Jul 16:35:30.270 # +sdown master redis-cluster 127.0.0.1 6380
57464:X 07 Jul 16:35:30.301 # +new-epoch 1
57464:X 07 Jul 16:35:30.301 # +vote-for-leader 2a4d7647d2e995bd7315d8358efbd336d7fc79ad 1
57464:X 07 Jul 16:35:30.330 # +odown master redis-cluster 127.0.0.1 6380 #quorum 3/2
57464:X 07 Jul 16:35:30.330 # Next failover delay: I will not start a failover before Tue Jul  7 16:35:50 2015
57464:X 07 Jul 16:35:31.432 # +config-update-from sentinel 127.0.0.1:16382 127.0.0.1 16382 @ redis-cluster 127.0.0.1 6380
57464:X 07 Jul 16:35:31.432 # +switch-master redis-cluster 127.0.0.1 6380 127.0.0.1 6381
57464:X 07 Jul 16:35:31.432 * +slave slave 127.0.0.1:6382 127.0.0.1 6382 @ redis-cluster 127.0.0.1 6381
57464:X 07 Jul 16:35:31.432 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ redis-cluster 127.0.0.1 6381
57464:X 07 Jul 16:35:36.519 # +sdown slave 127.0.0.1:6380 127.0.0.1 6380 @ redis-cluster 127.0.0.1 6381

Now, let's see which machine got elected:

$ redis-cli -p 16382 sentinel get-master-addr-by-name redis-cluster

 1) "127.0.0.1"
 2) "6381"

Here we can see that node2 has been elected the new master of the cluster. Now we can restart node1, and you'll see it come back up as a slave of the new master:

$ redis-server node1/redis.conf

57531:M 07 Jul 16:37:24.176 # Server started, Redis version 3.0.2
57531:M 07 Jul 16:37:24.176 * DB loaded from disk: 0.000 seconds
57531:M 07 Jul 16:37:24.176 * The server is now ready to accept connections on port 6380
57531:S 07 Jul 16:37:34.215 * SLAVE OF 127.0.0.1:6381 enabled (user request)
57531:S 07 Jul 16:37:34.215 # CONFIG REWRITE executed with success.
57531:S 07 Jul 16:37:34.264 * Connecting to MASTER 127.0.0.1:6381
57531:S 07 Jul 16:37:34.264 * MASTER <-> SLAVE sync started
57531:S 07 Jul 16:37:34.265 * Non blocking connect for SYNC fired the event.
57531:S 07 Jul 16:37:34.265 * Master replied to PING, replication can continue...
57531:S 07 Jul 16:37:34.265 * Partial resynchronization not possible (no cached master)
57531:S 07 Jul 16:37:34.265 * Full resync from master: 135e2c6ec93d33dceb30b7efb7da171b0fb93b9d:24756
57531:S 07 Jul 16:37:34.276 * MASTER <-> SLAVE sync: receiving 18 bytes from master
57531:S 07 Jul 16:37:34.276 * MASTER <-> SLAVE sync: Flushing old data
57531:S 07 Jul 16:37:34.276 * MASTER <-> SLAVE sync: Loading DB in memory
57531:S 07 Jul 16:37:34.276 * MASTER <-> SLAVE sync: Finished with success

That's it! This was a pretty simple example, meant to introduce how you can set up a Redis replica cluster with failover. In a followup post, I'll show how you can implement this on an actual cluster with CoreOS, containers, and HAProxy for load balancing.

nsenter a systemd-nspawn container

by on

If you run applications in containers, you've probably needed a way to enter the container to debug something. Sure, you could run sshd in your container, but it's really not necessary. The same thing can be accomplished using a little program called nsenter.

nsenter can be used to enter both Docker containers and systemd-nspawn containers. In this situation, we're going to be looking at a container running with systemd-nspawn.

Start a container

To make things easier, we're going to pull the "vanilla" Fedora 21 Docker container and export its filesystem so we can run it with systemd-nspawn.

> docker pull fedora:21

# create a directory to dump everything into
> mkdir fedora21

> docker export "$(docker create --name fedora21 fedora:21 true)" | tar -x -C fedora21

# clean up Docker's mess
> docker rm fedora21

Now we can actually boot the machine. One thing to note for those who are unfamiliar: booting the machine is pretty much the same as turning on a physical machine; you'll see systemd start up and it'll present a login prompt. You'll be using nsenter from a different shell/terminal/screen/whatever than the one running the machine.

> sudo systemd-nspawn --directory fedora21 --machine fedora-container --boot

> machinectl list

MACHINE                          CONTAINER SERVICE         
fedora-container                 container nspawn          

1 machines listed.

Now that we have a machine running, we need to find the PID of systemd running in the container. We can do that using machinectl status:

> machinectl status fedora-container
fedora-container
           Since: Thu 2015-04-09 23:44:35 UTC; 5min ago
          Leader: 7943 (systemd)
         Service: nspawn; class container
            Root: /home/core/fedora21
         Address: 10.0.0.0
              OS: Fedora 21 (Twenty One)
            Unit: machine-fedora\x2dcontainer.scope
                  ├─7943 /usr/lib/systemd/systemd
                  └─system.slice
                    ├─dbus.service
                    │ └─7988 /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
                    ├─systemd-journald.service
                    │ └─7964 /usr/lib/systemd/systemd-journald
                    ├─systemd-logind.service
                    │ └─7987 /usr/lib/systemd/systemd-logind
                    └─console-getty.service
                      └─7992 /sbin/agetty --noclear --keep-baud console 115200 38400 9600 vt102

The PID we want is the one specified under "Leader", so 7943.
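If you'd rather script this than read it out of the status output, machinectl can print just that property; this is a minimal sketch, where the cut call simply strips the Leader= prefix:

> machinectl show fedora-container --property=Leader
Leader=7943

> PID="$(machinectl show fedora-container --property=Leader | cut -d= -f2)"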

nsenter into the container

Based on the man page, nsenter:

Enters the namespaces of one or more other processes and then executes the specified program

In this case, that is systemd inside of the container we are running. The goal here is to nsenter into the container and get a simple bash shell running so that we can run commands as if we logged into it.

> sudo nsenter --target 7943 --mount --uts --ipc --net

That's it! If you run something like whoami you'll see that you are in the container as the root user. You can now do everything you normally could as if you had logged in from the main login prompt or ssh'd into the machine.
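A hypothetical session inside the container might look like this (exact output will vary):

whoami
root

cat /etc/os-release | grep PRETTY_NAME
PRETTY_NAME="Fedora 21 (Twenty One)"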

When you're done, simply press Ctrl+D to log out. To terminate the container, you can use machinectl terminate:

sudo machinectl terminate fedora-container

Cortex - express style routing for Backbone

by on

Ive found Backbone to be one of the most useful client-side frameworks available due to its lightweight nature. I know of a number of people that dislike it because it doesn't provide everything including the kitchen sink, but that's one of the reasons why I love it; it gives me the foundation to build exactly what I need, and only what I need.

Routing

One of the things that I find myself wanting when building a fairly large single page app is the ability to add middlewares to routes. Out of the box, Backbone's routing is extremely simple and looks something like this:

var app = Backbone.Router.extend({
    routes: {
        'users': function(){

        },
        'users/:id': function(id){
            // handle users/:id route
            console.log(id);
        }
    },
    initialize: function(){
        // initialize your app
    }
});

new app();
Backbone.history.start({ pushState: true });

For a lot of apps this is more than sufficient. But what if you want to add handlers that run before each route to do things like fetch data, or check whether a user is authenticated and allowed to access that route?

Introducing Cortex

Cortex is a small library that allows you to set up chains of middlewares for each of your routes in the same way that Express does for your NodeJS server.

Let's take a look at a simple example that does the same as our vanilla Backbone router above.

<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/lodash.js/3.3.1/lodash.min.js"></script>
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/jquery/2.1.3/jquery.min.js"></script>
<script type="text/javascript" src="https://cdnjs.cloudflare.com/ajax/libs/backbone.js/1.1.2/backbone-min.js"></script>
<script type="text/javascript" src="<path>/cortex-x.x.x.min.js"></script>
<script type="text/javascript">
    $(function(){
        var cortex = new Cortex();

        cortex.route('users', function(route){
            // handle users route
        });

        cortex.route('users/:id', function(route){
            // handle users/:id route

            console.log(route.params.id);
        });

        var app = Backbone.Router.extend({
            routes: cortex.getRoutes(),
            initialize: function(){
                // initialize your app
            }
        });

        new app();
        Backbone.history.start({ pushState: true });
    });
</script>

This example should be pretty straightforward. Cortex.prototype.route takes at least two parameters:

  • A pattern to define the route. This is the exact same string we used in the vanilla Backbone example
  • A function to handle the route. This is the function that will be called when your route is matched. It takes two parameters:
    • route - This is an object that will contain things like url parameter tokens, query parameters, etc
    • next - This is a callback that can be called to move on to the next handler in the chain. In our example we don't call it because there is nothing after the handler we defined.

Let's add a middleware that will run before all routes:

$(function(){
    var cortex = new Cortex();

    cortex.use(function(route, next){
        // do something before all routes

        next();
    });

    cortex.route('users', function(route){
        // handle users route
    });

    cortex.route('users/:id', function(route){
        // handle users/:id route

        console.log(route.params.id);
    });

    var app = Backbone.Router.extend({
        routes: cortex.getRoutes(),
        initialize: function(){
            // initialize your app
        }
    });

    new app();
    Backbone.history.start({ pushState: true });
});

Middlewares function almost identically to those in Express save for the parameters that are passed (since we're not working with an HTTP server here). Middlewares will be called in the order they are defined. If you don't invoke the next callback, execution of the middleware/handler chain will stop at that point.

Now, what if we want a chain of middlewares for a particular route?

$(function(){
    var cortex = new Cortex();

    cortex.route('users', function(route){
        // handle users route
    });

    var authUser = function(route, next){
        // check if the user is authenticated
        if(user.isAuthenticated){
            next();
        } else {
            throw new Error('User is not authenticated');
        }
    };

    cortex.route('users/:id', authUser, function(route){
        // handle users/:id route

        console.log(route.params.id);
    });

    var app = Backbone.Router.extend({
        routes: cortex.getRoutes(),
        initialize: function(){
            // initialize your app
        }
    });

    new app();
    Backbone.history.start({ pushState: true });
});

In this example, if the user is determined to be unauthenticated, we'll throw an exception. Cortex actually has a mechanism built in to handle exceptions that arise in middlewares/handlers. You can listen to the error event on your Cortex instance to handle errors:

$(function(){
    var cortex = new Cortex();

    cortex.on('error', function(err, route){
        // err - the error object/exception thrown
        // route - the route payload in the context the error was thrown
    });

    cortex.route('users', function(route){
        // handle users route
    });

    var authUser = function(route, next){
        // check if the user is authenticated
        if(!user.isAuthenticated){
            throw new Error('User is not authenticated');
        }
        next();
    };

    cortex.route('users/:id', authUser, function(route){
        // handle users/:id route

        console.log(route.params.id);
    });

    var app = Backbone.Router.extend({
        routes: cortex.getRoutes(),
        initialize: function(){
            // initialize your app
        }
    });

    new app();
    Backbone.history.start({ pushState: true });
});

In this error handler you can use the err object and route object to determine where the error happened and how to handle it.

The future

This is the very first iteration of this library, so expect that things will improve as time goes on. Future updates will include support for various module systems and possibly an Express middleware to make serving the file super easy.

Improvements and pull requests are more than welcome and can be created over at seanmcgary/backbone-cortex.

Run Docker containers with systemd-nspawn

by on

For a while now, Ive been using Docker to deploy containers to a number of CoreOS clusters and while it's very convenient (kind of a boot the machine and you're ready to deploy type situation) there are some kinks in the system, particularly with how Docker and systemd play (or fight) with each other.

For the unfamiliar, "CoreOS is an open source lightweight operating system based on the Linux kernel and designed for providing infrastructure to clustered deployments, while focusing on automation, ease of applications deployment, security, reliability and scalability." One of the important things that comes packaged with it is systemd.

systemd is a suite of basic building blocks for a Linux system. It provides a system and service manager that runs as PID 1 and starts the rest of the system. systemd provides aggressive parallelization capabilities, uses socket and D-Bus activation for starting services, offers on-demand starting of daemons, keeps track of processes using Linux control groups, supports snapshotting and restoring of the system state, maintains mount and automount points and implements an elaborate transactional dependency-based service control logic

Basically you get a linux kernel, an init system (systemd), the tools the CoreOS folks provide, and Docker (among some other basic utilities like vim) with the assumption that anything else you need will be installed and deployed via containers.

This is all pretty awesome and convenient, until you start trying to deploy your Docker containers with something like fleet. At that point systemd and Docker don't exactly play nice with each other.

systemd vs. the Docker daemon

Fleet is basically an interface for communicating with systemd on all of the nodes in your cluster. When you schedule a unit, that unit file is dropped onto a machine of fleet's choosing and then executed and managed through systemd. Systemd, being an init system, already knows how to manage processes and restart/stop them when necessary. Docker containers, however, rely on the Docker daemon, which is itself a kind of pseudo init system as it manages all of the processes run through it.

This means when you go to start a unit, you have to also write a bunch of scripts to make sure Docker manages its processes properly and cleans up after itself (Docker is very messy and likes leaving things all over the place).

So how do we fix this?

One init system to rule them all

Systemd has a lot of goodies that are baked in from the beginning. One of those is a utility called systemd-nspawn. Well what the hell is it?

systemd-nspawn may be used to run a command or OS in a light-weight namespace container. It is more powerful than chroot since it fully virtualizes the file system hierarchy, as well as the process tree, the various IPC subsystems and the host and domain name.

Cool, sounds like exactly what we want. If you look at a lot of Docker containers, I would say a good majority of them build off some kind of base system, be it Ubuntu, Debian, Fedora, etc. In the most basic sense, this is just a filesystem that you build up using the Dockerfile and docker build process. We're going to walk through how to build a container, extract the filesystem, and run it using systemd-nspawn.

Building the container

We're going to build a really simple container based off Fedora 21. The script we include is just a bash script that will print the date every 5 seconds.

Dockerfile

FROM fedora:21

ADD run.sh /

RUN chmod +x /run.sh

run.sh

#! /bin/bash

while true; do
    # print the date, then wait 5 seconds
    date
    sleep 5
done

Notice how in the Dockerfile we didn't include a CMD instruction at the bottom. This is because we're just using Docker to build the filesystem we will extract; systemd-nspawn doesn't know about all of the bells and whistles built into Docker. It just knows how to run what you tell it.

Im currently using Quay.io for all my hosting, and you can actually pull and use the container Im building in this post. If you're not using Quay, or are using the Docker registry, just substitute the URL with the one that points to your container.

Now that we have our Dockerfile and run script, we can build the container:

docker build -t quay.io/seanmcgary/nspawn-test .

At this point, we could run our container using the Docker daemon if we wanted to like so:

docker run -i -t --name=test quay.io/seanmcgary/nspawn-test /run.sh

Extracting the filesystem

Now that we have a container, we can export/extract the filesystem from it. There are a few steps bundled into one here:

  • Running docker create <container> <command> will initialize the container for the first time and thus create the filesystem. The command on the end can literally be anything, and as far as I can tell it doesn't even have to be valid
  • docker export takes the ID returned from docker create and spits out a tar archive of the container's filesystem
  • We then pipe this archive to tar, which unpacks it into a directory called nspawntest
mkdir nspawntest
docker export "$(docker create --name nspawntest quay.io/seanmcgary/nspawn-test true)" | tar -x -C nspawntest
docker rm nspawntest
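If you find yourself doing this for more than one image, the three steps wrap up nicely into a small shell function. This is just a sketch (extract_image is a name I made up; the last line is equivalent to the commands above):

# extract_image <image> <target-dir>: dump an image's filesystem into a directory
extract_image() {
    local image="$1" dir="$2" cid
    cid="$(docker create "$image" true)"    # the trailing command can be anything
    mkdir -p "$dir"
    docker export "$cid" | tar -x -C "$dir"
    docker rm "$cid" > /dev/null            # clean up Docker's mess
}

extract_image quay.io/seanmcgary/nspawn-test nspawntest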

We now have ourselves a filesystem:

tree -L 2
.
`-- nspawntest
    |-- bin -> usr/bin
    |-- boot
    |-- dev
    |-- etc
    |-- home
    |-- lib -> usr/lib
    |-- lib64 -> usr/lib64
    |-- lost+found
    |-- media
    |-- mnt
    |-- opt
    |-- proc
    |-- root
    |-- run
    |-- run.sh
    |-- sbin -> usr/sbin
    |-- srv
    |-- sys
    |-- tmp
    |-- usr
    `-- var

Running the machine

Now that we have a Fedora filesystem just sitting here, we can point systemd-nspawn at it and tell it to run our run.sh script.

core@coreoshost ~ $ sudo systemd-nspawn --machine nspawntest --directory nspawntest /run.sh
Spawning container nspawntest on /home/core/nspawntest.
Press ^] three times within 1s to kill container.
Thu Feb 26 18:19:58 UTC 2015
Thu Feb 26 18:20:03 UTC 2015
Thu Feb 26 18:20:08 UTC 2015

Whenever you create a machine with systemd-nspawn it will show up when you run machinectl:

core@coreoshost ~ $ machinectl
MACHINE                          CONTAINER SERVICE         
nspawntest                       container nspawn          

1 machines listed.

Now, if we want to stop our script from running, we can do so by using the machinectl terminate command:

sudo machinectl terminate nspawntest

Making it deployable

Now that we know how to run this on its own, we can easily write out a unit file that can then be started via systemd directly or passed to fleet to be scheduled on your cluster:

[Unit]
Description=nspawntest
After=docker.service
Requires=docker.service

[Service]
User=core
ExecStartPre=/bin/bash -c 'docker pull quay.io/seanmcgary/nspawn-test:latest || true'
ExecStartPre=/bin/bash -c 'mkdir /home/core/containers/nspawntest_new || true'
ExecStartPre=/bin/bash -c 'docker export "$(docker create --name nspawntest quay.io/seanmcgary/nspawn-test true)" | tar -x -C /home/core/containers/nspawntest_new'
ExecStartPre=/bin/bash -c 'docker rm nspawntest || true'
ExecStartPre=/bin/bash -c 'mv /home/core/containers/nspawntest_new /home/core/containers/nspawntest_running'

ExecStart=/bin/bash -c 'sudo systemd-nspawn --machine nspawntest --directory /home/core/containers/nspawntest_running /run.sh'

ExecStop=/bin/bash -c 'sudo machinectl terminate nspawntest'

TimeoutStartSec=0
Restart=always
RestartSec=10s

In this unit file, we are basically doing everything that we did above by hand:

  • Pull in the latest version of the container
  • Create a directory to extract the container to
  • Create and export the container via docker, piping the contents through tar to unpack them
  • Do a little bit of Docker cleanup, removing the now un-needed container
  • Run the container using systemd-nspawn
  • If systemd is told to stop the container, make a call to machinectl to terminate the container by the name that we gave it.

If all goes to plan, you should see the following output when you tail the journal for your unit:

Feb 26 19:24:12 coreoshost systemd[1]: Starting nspawntest...
Feb 26 19:24:12 coreoshost bash[4864]: Pulling repository quay.io/seanmcgary/nspawn-test
Feb 26 19:24:13 coreoshost bash[4864]: a22582cd26be: Pulling image (latest) from quay.io/seanmcgary/nspawn-test
Feb 26 19:24:13 coreoshost bash[4864]: a22582cd26be: Pulling image (latest) from quay.io/seanmcgary/nspawn-test, endpoint: https://quay.io/v1/
Feb 26 19:24:13 coreoshost bash[4864]: a22582cd26be: Pulling dependent layers
Feb 26 19:24:13 coreoshost bash[4864]: 511136ea3c5a: Download complete
Feb 26 19:24:13 coreoshost bash[4864]: 00a0c78eeb6d: Download complete
Feb 26 19:24:13 coreoshost bash[4864]: 834629358fe2: Download complete
Feb 26 19:24:13 coreoshost bash[4864]: 478c125478c6: Download complete
Feb 26 19:24:13 coreoshost bash[4864]: a22582cd26be: Download complete
Feb 26 19:24:13 coreoshost bash[4864]: a22582cd26be: Download complete
Feb 26 19:24:13 coreoshost bash[4864]: Status: Image is up to date for quay.io/seanmcgary/nspawn-test:latest
Feb 26 19:24:46 coreoshost bash[4916]: nspawntest
Feb 26 19:24:46 coreoshost systemd[1]: Started nspawntest.
Feb 26 19:24:46 coreoshost sudo[4932]: core : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/bin/systemd-nspawn --machine nspawntest --directory /home/core/containers/nspawntest_running /run.sh
Feb 26 19:24:46 coreoshost echo[4926]: Running systemd-nspawn
Feb 26 19:24:46 coreoshost sudo[4932]: Spawning container nspawntest on /home/core/containers/nspawntest_running.
Feb 26 19:24:46 coreoshost sudo[4932]: Press ^] three times within 1s to kill container.
Feb 26 19:24:46 coreoshost sudo[4932]: Thu Feb 26 19:24:46 UTC 2015
Feb 26 19:24:51 coreoshost sudo[4932]: Thu Feb 26 19:24:51 UTC 2015
Feb 26 19:24:56 coreoshost sudo[4932]: Thu Feb 26 19:24:56 UTC 2015
Feb 26 19:25:01 coreoshost sudo[4932]: Thu Feb 26 19:25:01 UTC 2015

Wrap up

This is just the very tip of the iceberg when it comes to running things with systemd-nspawn. There are a lot of other options you can configure when running your container, like permissions, network configurations, journal configurations, etc. I highly suggest taking a look at the docs to see what's available.

Now that we know how to run a container via systemd-nspawn, next time we'll look at running systemd within a container using systemd-nspawn so that we can manage multiple processes.



Automatically scale HAProxy with confd and etcd

by on

Load balancing with HAProxy is pretty easy; today we're going to use etcd and confd to automatically configure cluster nodes to make things more elastic.

For the unfamiliar, etcd is "a highly-available key value store for shared configuration and service discovery" built by the guys over at CoreOS. Each node in our cluster (which will be a CoreOS machine) will run etcd by default, allowing units deployed to the cluster to register themselves when they start up and remove themselves when they shut down.

Confd is a configuration management tool that pulls data from etcd at set intervals and is responsible for generating updated configs when it detects a change.

Cluster configuration

The example cluster we're going to use looks a little like this:

1 Machine running Fedora

This is going to be our loadbalancer. Im choosing Fedora for this one machine because it comes with systemd by default, which is going to make it super easy to set up HAProxy and confd. We also don't necessarily want this machine updating all the time like our CoreOS machines will; we want it to remain relatively static and we need it to keep a static IP address. This of course could be remedied by having multiple loadbalancers.

3 CoreOS nodes

For this test, we're going to run a cluster of CoreOS machines that will run our etcd cluster. When running etcd, it's a good idea to run at least 3 machines in order to maintain quorum across the cluster (with 3 nodes, a majority of 2 survives a single node failure). We're also going to be using fleet (which also uses etcd) to schedule our test webservice to the cluster.

Note: to make configuring things easier, I will be using AWS and providing a cloud-config file when creating these machines.

Creating a CoreOS cluster

For the CoreOS cluster, we're going to provide some initialization data via a cloud-config file. This will tell CoreOS to start things like fleet, etcd, and docker and will also provide etcd with the discovery endpoint to use (note, this is etcd 0.4.6, not the new and improved 2.0 [yet]).

Note: you'll need to generate a discovery token by going to https://discovery.etcd.io/new

#cloud-config

coreos:
  etcd:
    discovery: https://discovery.etcd.io/<put your token here>
    addr: $public_ipv4:4001
    peer-addr: $public_ipv4:7001
  units:
    - name: etcd.service
      command: start
    - name: fleet.service
      command: start
      metadata: type=webserver

When running this on AWS, make sure to open up the necessary ports for etcd (4001 and 7001) as well as the ports for your application.
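If you're doing this from the AWS CLI rather than the console, it boils down to a couple of security group rules. This is illustrative only; substitute your own group ID and CIDR:

aws ec2 authorize-security-group-ingress --group-id sg-XXXXXXXX --protocol tcp --port 4001 --cidr 10.0.0.0/16
aws ec2 authorize-security-group-ingress --group-id sg-XXXXXXXX --protocol tcp --port 7001 --cidr 10.0.0.0/16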

Setting up HAProxy and confd

Now that our CoreOS cluster is running, we're going to start up a Fedora based machine to run HAProxy and confd. In this case, I picked Fedora 21 as that was the most up to date version I could find on AWS.

The latest version of HAProxy (1.5.x) is available as an RPM and can be installed using yum:

yum install haproxy.x86_64

The latest version in this case is 1.5.10

The config for HAProxy is located at /etc/haproxy/haproxy.cfg. What we're going to do now is install confd which will overwrite the config, so you may want to save the default config to reference later.
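Something as simple as this does the trick (the backup path is arbitrary):

cp /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.orig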

confd - installation and configuration

We're going to be installing version 0.7.1 which can be fetched from the releases page on the confd Github page. The release is a pre-built confd binary, so we don't need to worry about building it ourselves.

curl -OL https://github.com/kelseyhightower/confd/releases/download/v0.7.1/confd-0.7.1-linux-amd64

mv confd-0.7.1-linux-amd64 confd

cp confd /usr/bin && chmod +x /usr/bin/confd
cp confd /usr/sbin && chmod +x /usr/sbin/confd

Running the above commands will download the binary from Github, copy it to /usr/bin and /usr/sbin and make it executable. If you were to just run confd you'd get some errors that look like this:

2015-01-30T18:51:54Z confd[840]: WARNING Skipping confd config file.
2015-01-30T18:51:54Z confd[840]: ERROR cannot connect to etcd cluster: http://127.0.0.1:4001

By default, confd will look for a config file in /etc/confd. The structure for /etc/confd will look something like this:

├── confd
│   ├── conf.d
│   │   └── haproxy.toml
│   ├── confd.toml
│   └── templates
│       └── haproxy.cfg.tmpl

confd.toml is the overall config for confd which will describe the backend we want to use (etcd), the interval to poll it at, the config directory, etc.

confd.toml

confdir = "/etc/confd"
interval = 20
backend = "etcd"
nodes = [
        "http://<address that points to one of your CoreOS nodes>:4001"
]
prefix = "/"
scheme = "http"
verbose = true

The "nodes" property needs at least one node specified and should point to one of your CoreOS nodes. You could also list each of your three nodes here so that if confd isn't able to reach one, it will try one of the others.

Also a thing to note is the "interval" value. Here we're telling confd to poll etcd for changes every 20 seconds.

Now let's look at the HAProxy specific config located at /etc/confd/conf.d/haproxy.toml

[template]
src = "haproxy.cfg.tmpl"
dest = "/etc/haproxy/haproxy.cfg"
keys = [
        "/app/your_awesome_app"
]
reload_cmd = "echo restarting && /usr/bin/systemctl reload haproxy"

The "keys" attribute lists the various keys within etcd we want confd to monitor. When we launch our app on our CoreOS cluster, each unit file will register itself with etcd by creating a key in the /app/your_awesome_app directory that contains information to insert into the HAProxy config (it's IP address and port to forward traffic to).

The "reload_cmd" attribute is an optional command that confd can run whenever it writes a change to your config. Here we're

Now let's take a look at what the HAProxy template looks like (/etc/confd/templates/haproxy.cfg.tmpl):

global
    log 127.0.0.1    local0
    log 127.0.0.1    local1 notice
    maxconn 4096
    user haproxy
    group haproxy
    daemon
    stats socket /var/run/haproxy.sock mode 600 level admin    

defaults
    log    global
    mode    http
    option    httplog
    option    dontlognull
    retries    3
    option redispatch
    maxconn    2000
    contimeout    5000
    clitimeout    50000
    srvtimeout    50000
    option forwardfor
    option http-server-close

frontend stats
    bind *:8888
    stats enable
    stats uri /

frontend http-in
    bind *:80
    default_backend application-backend

backend application-backend
    balance leastconn
    option httpclose
    option forwardfor
    cookie JSESSIONID prefix

    {{range getvs "/app/your_awesome_app*"}}
    server {{.}} cookie A check
    {{end}}

Most of this is boilerplate from the default HAProxy config, so the sections we want to look at are the frontend and backend at the bottom.

frontend http-in
    bind *:80
    default_backend application-backend

backend application-backend
    balance leastconn
    option httpclose
    option forwardfor
    cookie JSESSIONID prefix

    {{range getvs "/app/your_awesome_app*"}}
    server {{.}} cookie A check
    {{end}}

With our frontend, we're accepting all traffic on port 80 and sending it to the "application-backend". Here we have some Go templates (confd is written in Go); this template will loop over the keys in the etcd directory we defined and print out their values. (You can find more template examples in the confd docs.) One thing to watch: the directory you give getvs has to match wherever your units actually register themselves; in the stupid-server example below, that's /apps/stupid_server.
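Assuming a single registered instance with the key/value we write later in this post, the rendered backend stanza would come out looking something like:

    server stupid-server-1 10.10.10.10:9000 cookie A check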

Running confd using systemd

Since we need confd to constantly be monitoring etcd, we're going to use systemd to manage it. This way, if confd crashes or if the machine restarts, confd will always come back up.

Let's create the file /etc/systemd/system/confd.service

[Unit]
Description=Confd
After=haproxy.service

[Service]
ExecStart=/usr/bin/confd
Restart=always

[Install]
WantedBy=basic.target

If you're unfamiliar with systemd's unit files, I would highly suggest reading the docs as there are a lot of available options and configurations. This one is pretty simple though. We're telling systemd where to find the confd binary and to always restart if the process dies. The line WantedBy=basic.target tells systemd to start the process on boot as well.

Now we can install and activate the service:

sudo systemctl enable /etc/systemd/system/confd.service
sudo systemctl start /etc/systemd/system/confd.service

Enabling our unit will symlink the file to /etc/systemd/system/basic.target.wants so that it starts on boot. Calling systemctl start actually starts it for the first time.

If you want to see the log output, you can do so by running:

journalctl -f -u confd.service
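While testing your templates, it can also be handy to skip the polling interval entirely and do a single synchronous run (assuming your build supports the -onetime flag):

confd -onetime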

Registering your app with etcd

As an example service, we're going to look at a project I have called "stupid-server". It's a simple webserver written in NodeJS. There's a Docker container over on quay.io that we'll be using and scheduling on our cluster using fleet.

stupid-server@.service

Here's what our unit file will look like:

[Unit]
Description=Stupid Server
After=docker.service
Requires=docker.service

[Service]
ExecStartPre=/usr/bin/docker pull quay.io/seanmcgary/stupid-server:latest
ExecStart=/usr/bin/docker run --name stupidservice -p 9000:8000 quay.io/seanmcgary/stupid-server
# note: systemd has no ExecStopPre; multiple ExecStop commands run in order
ExecStop=/usr/bin/docker kill stupidservice
ExecStop=/usr/bin/docker rm stupidservice
TimeoutStartSec=0
Restart=always
RestartSec=10s

[X-Fleet]
X-Conflicts=stupid-server@*.service

Each time we start the unit, we'll try to pull the latest container from quay, then proceed with actually starting the server. Now we're going to modify it to register itself with etcd when it starts and de-register when it stops.

[Unit]
Description=Stupid Server
After=docker.service
Requires=docker.service

[Service]
ExecStartPre=/usr/bin/docker pull quay.io/seanmcgary/stupid-server
ExecStart=/usr/bin/docker run --name stupidservice -p 9000:8000 quay.io/seanmcgary/stupid-server
ExecStartPost=/bin/bash -c 'etcdctl set /apps/stupid_server/%n "%p-%i $(curl http://169.254.169.254/latest/meta-data/public-ipv4/):9000"'
ExecStop=/usr/bin/docker kill stupidservice
ExecStop=/usr/bin/docker rm stupidservice
ExecStopPost=/bin/bash -c 'etcdctl rm /apps/stupid_server/%n'
TimeoutStartSec=0
Restart=always
RestartSec=10s

[X-Fleet]
X-Conflicts=stupid-server@*.service

These are the two lines of note:

ExecStartPost=/bin/bash -c 'etcdctl set /apps/stupid_server/%n "%p-%i $(curl http://169.254.169.254/latest/meta-data/public-ipv4/):9000"'
ExecStopPost=/bin/bash -c 'etcdctl rm /apps/stupid_server/%n'

After our service starts, we make a curl request to the AWS metadata service to get the public IP of the machine that we're on (you can also get the private IP if you want) to build the name/IP of the server that will be written to the HAProxy config. The key/value that gets written to etcd looks like this:

Key: /apps/stupid_server/stupid-server@1.service
Value: stupid-server-1 10.10.10.10:9000

Note that the actual IP will be whatever the IP of the machine is.

On the ExecStopPost line, we delete the key from etcd which in turn will cause confd to recompile the config and reload HAProxy.
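You can verify the whole registration dance from any CoreOS node with etcdctl (hypothetical output based on the example values above):

etcdctl ls /apps/stupid_server
/apps/stupid_server/stupid-server@1.service

etcdctl get /apps/stupid_server/stupid-server@1.service
stupid-server-1 10.10.10.10:9000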

Start your server

Now, we can actually start our server by submitting it to fleet:

fleetctl start stupid-server@1.service

That's it! Now we can start as many stupid-servers as we want and they'll automatically show up in HAProxy when they start.
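For example, bringing up two more instances is just a matter of submitting more units; within one polling interval (20 seconds in our confd.toml) each one registers itself in etcd, confd rewrites the config, and HAProxy gets reloaded:

fleetctl start stupid-server@2.service stupid-server@3.service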