Run Docker containers with systemd-nspawn


For a while now, I've been using Docker to deploy containers to a number of CoreOS clusters, and while it's very convenient (kind of a "boot the machine and you're ready to deploy" type situation), there are some kinks in the system, particularly with how Docker and systemd play (or fight) with each other.

For the unfamiliar, "CoreOS is an open source lightweight operating system based on the Linux kernel and designed for providing infrastructure to clustered deployments, while focusing on automation, ease of applications deployment, security, reliability and scalability." One of the important things that comes packaged with it is systemd.

systemd is a suite of basic building blocks for a Linux system. It provides a system and service manager that runs as PID 1 and starts the rest of the system. systemd provides aggressive parallelization capabilities, uses socket and D-Bus activation for starting services, offers on-demand starting of daemons, keeps track of processes using Linux control groups, supports snapshotting and restoring of the system state, maintains mount and automount points and implements an elaborate transactional dependency-based service control logic.

Basically you get a Linux kernel, an init system (systemd), the tools the CoreOS folks provide, and Docker (among other basic utilities like vim), with the assumption that anything else you need will be installed and deployed via containers.

This is all pretty awesome and convenient, until you start trying to deploy your Docker containers with something like fleet. At that point, systemd and Docker don't exactly play nice with each other.

systemd vs. the Docker daemon

Fleet is basically an interface for communicating with systemd on all of the nodes in your cluster. When you schedule a unit, that unit file is dropped onto a machine of fleet's choosing and then executed and managed through systemd. Systemd, being an init system, already knows how to manage processes and restart/stop them when necessary. Docker containers, however, rely on the Docker daemon, which is itself a kind of pseudo init system as it manages all of the processes run through it.

This means that when you go to start a unit, you also have to write a bunch of scripts to make sure Docker manages its processes properly and cleans up after itself (Docker is very messy and likes leaving things all over the place).
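To make that concrete, the Docker-under-systemd pattern usually ends up front-loaded with cleanup commands like this sketch (the container and image names here are placeholders, not from a real unit):

# the leading "-" tells systemd to ignore failures (e.g. there being no old container to kill)
ExecStartPre=-/usr/bin/docker kill myapp
ExecStartPre=-/usr/bin/docker rm myapp
ExecStartPre=/usr/bin/docker pull example.com/myapp:latest
ExecStart=/usr/bin/docker run --name myapp example.com/myapp:latest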

So how do we fix this?

One init system to rule them all

Systemd has a lot of goodies baked in from the beginning. One of those is a utility called systemd-nspawn. Well, what the hell is it?

systemd-nspawn may be used to run a command or OS in a light-weight namespace container. It is more powerful than chroot since it fully virtualizes the file system hierarchy, as well as the process tree, the various IPC subsystems and the host and domain name.

Cool, that sounds like exactly what we want. If you look at a lot of Docker containers, I would say a good majority of them build off some kind of base system, be it Ubuntu, Debian, Fedora, etc. In the most basic sense, this is just a filesystem that you build up using the Dockerfile and the docker build process. We're going to walk through how to build a container, extract the filesystem, and run it using systemd-nspawn.

Building the container

We're going to build a really simple container based off Fedora 21. The script we include is just a bash script that will print the date every 5 seconds.

Dockerfile

FROM fedora:21

ADD run.sh /

RUN chmod +x /run.sh

run.sh

#! /bin/bash

while true; do
    $(which date)
    $(which sleep) 5
done

Notice how in the Dockerfile we didn't include a CMD instruction at the bottom. This is because we're just using Docker to build the filesystem we will extract; systemd-nspawn doesn't know about all of the bells and whistles built into Docker. It just knows how to run what you tell it.

I'm currently using Quay.io for all my hosting, and you can actually pull and use the container I'm building in this post. If you're not using Quay, or are using the Docker registry, just substitute the URL with the one that points to your container.

Now that we have our Dockerfile and run script, we can build the container:

docker build -t quay.io/seanmcgary/nspawn-test .

At this point, we could run our container using the Docker daemon if we wanted to, like so:

docker run -i -t --name=test quay.io/seanmcgary/nspawn-test /run.sh

Extracting the filesystem

Now that we have a container, we can export/extract the filesystem from it. There are a few steps bundled into one here:

  • Running docker create <container> <command> will initialize the container for the first time and thus create the filesystem. The command on the end can literally be anything; as far as I can tell, it doesn't even have to be valid
  • docker export takes the ID returned by the create command and spits out a tarball of the container's filesystem
  • We then pipe this tarball to tar, which we tell to unpack into a directory called nspawntest
mkdir nspawntest
docker export "$(docker create --name nspawntest quay.io/seanmcgary/nspawn-test true)" | tar -x -C nspawntest
docker rm nspawntest

We now have ourselves a filesystem:

tree -L 2
.
`-- nspawntest
    |-- bin -> usr/bin
    |-- boot
    |-- dev
    |-- etc
    |-- home
    |-- lib -> usr/lib
    |-- lib64 -> usr/lib64
    |-- lost+found
    |-- media
    |-- mnt
    |-- nspawntest_new
    |-- opt
    |-- proc
    |-- root
    |-- run
    |-- run.sh
    |-- sbin -> usr/sbin
    |-- srv
    |-- sys
    |-- tmp
    |-- usr
    `-- var

Running the machine

Now that we have a Fedora filesystem just sitting here, we can point systemd-nspawn at it and tell it to run our run.sh script.

core@coreoshost ~ $ sudo systemd-nspawn --machine nspawntest --directory nspawntest /run.sh
Spawning container nspawntest on /home/core/nspawntest.
Press ^] three times within 1s to kill container.
Thu Feb 26 18:19:58 UTC 2015
Thu Feb 26 18:20:03 UTC 2015
Thu Feb 26 18:20:08 UTC 2015

Whenever you create a machine with systemd-nspawn, it will show up when you run machinectl:

core@coreoshost ~ $ machinectl
MACHINE                          CONTAINER SERVICE         
nspawntest                       container nspawn          

1 machines listed.

Now, if we want to stop our script from running, we can do so by using the machinectl terminate command:

sudo machinectl terminate nspawntest

Making it deployable

Now that we know how to run this on its own, we can easily write out a unit file that can then be started via systemd directly or passed to fleet to be scheduled on your cluster:

[Unit]
Description=nspawntest
After=docker.service
Requires=docker.service

[Service]
User=core
ExecStartPre=/bin/bash -c 'docker pull quay.io/seanmcgary/nspawn-test:latest || true'
ExecStartPre=/bin/bash -c 'mkdir /home/core/containers/nspawntest_new || true'
ExecStartPre=/bin/bash -c 'docker export "$(docker create --name nspawntest quay.io/seanmcgary/nspawn-test true)" | tar -x -C /home/core/containers/nspawntest_new'
ExecStartPre=/bin/bash -c 'docker rm nspawntest || true'
ExecStartPre=/bin/bash -c 'mv /home/core/containers/nspawntest_new /home/core/containers/nspawntest_running'

ExecStart=/bin/bash -c 'sudo systemd-nspawn --machine nspawntest --directory /home/core/containers/nspawntest_running /run.sh'

ExecStop=/bin/bash -c 'sudo machinectl terminate nspawntest'

TimeoutStartSec=0
Restart=always
RestartSec=10s

In this unit file, we are basically doing everything that we did above by hand:

  • Pull in the latest version of the container
  • Create a directory to extract the container to
  • Create and export the container via docker, piping the contents through tar to unpack them
  • Do a little bit of Docker cleanup, removing the now un-needed container
  • Run the container using systemd-nspawn
  • If systemd is told to stop the container, make a call to machinectl to terminate the container by the name that we gave it.

If all goes to plan, you should see the following output when you tail the journal for your unit:

Feb 26 19:24:12 coreoshost systemd[1]: Starting nspawntest...
Feb 26 19:24:12 coreoshost bash[4864]: Pulling repository quay.io/seanmcgary/nspawn-test
Feb 26 19:24:13 coreoshost bash[4864]: a22582cd26be: Pulling image (latest) from quay.io/seanmcgary/nspawn-test
Feb 26 19:24:13 coreoshost bash[4864]: a22582cd26be: Pulling image (latest) from quay.io/seanmcgary/nspawn-test, endpoint: https://quay.io/v1/
Feb 26 19:24:13 coreoshost bash[4864]: a22582cd26be: Pulling dependent layers
Feb 26 19:24:13 coreoshost bash[4864]: 511136ea3c5a: Download complete
Feb 26 19:24:13 coreoshost bash[4864]: 00a0c78eeb6d: Download complete
Feb 26 19:24:13 coreoshost bash[4864]: 834629358fe2: Download complete
Feb 26 19:24:13 coreoshost bash[4864]: 478c125478c6: Download complete
Feb 26 19:24:13 coreoshost bash[4864]: a22582cd26be: Download complete
Feb 26 19:24:13 coreoshost bash[4864]: a22582cd26be: Download complete
Feb 26 19:24:13 coreoshost bash[4864]: Status: Image is up to date for quay.io/seanmcgary/nspawn-test:latest
Feb 26 19:24:46 coreoshost bash[4916]: nspawntest
Feb 26 19:24:46 coreoshost systemd[1]: Started nspawntest.
Feb 26 19:24:46 coreoshost sudo[4932]: core : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/bin/systemd-nspawn --machine nspawntest --directory /home/core/containers/nspawntest_running /run.sh
Feb 26 19:24:46 coreoshost echo[4926]: Running systemd-nspawn
Feb 26 19:24:46 coreoshost sudo[4932]: Spawning container nspawntest on /home/core/containers/nspawntest_running.
Feb 26 19:24:46 coreoshost sudo[4932]: Press ^] three times within 1s to kill container.
Feb 26 19:24:46 coreoshost sudo[4932]: Thu Feb 26 19:24:46 UTC 2015
Feb 26 19:24:51 coreoshost sudo[4932]: Thu Feb 26 19:24:51 UTC 2015
Feb 26 19:24:56 coreoshost sudo[4932]: Thu Feb 26 19:24:56 UTC 2015
Feb 26 19:25:01 coreoshost sudo[4932]: Thu Feb 26 19:25:01 UTC 2015

Wrap up

This is just the very tip of the iceberg when it comes to running things with systemd-nspawn. There are a lot of other options you can configure when running your container, like permissions, network configuration, journal handling, etc. I highly suggest taking a look at the docs to see what's available.
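To give a taste, here are a couple of the options I find myself reaching for; treat this as a sketch and check the systemd-nspawn man page for the exact semantics on your version:

# give the container its own network namespace, disconnected from the host
sudo systemd-nspawn --machine nspawntest --directory nspawntest --private-network /run.sh

# bind-mount a host directory into the container
sudo systemd-nspawn --machine nspawntest --directory nspawntest --bind=/home/core/shared:/shared /run.sh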

Now that we know how to run a container via systemd-nspawn, next time we'll look at running systemd within a container using systemd-nspawn so that we can manage multiple processes.



Automatically scale HAProxy with confd and etcd


Load balancing with HAProxy is pretty easy; today we're going to use etcd and confd to automatically configure cluster nodes to make things more elastic.

For the unfamiliar, etcd is "a highly-available key value store for shared configuration and service discovery" built by the guys over at CoreOS. Each node in our cluster (which will be a CoreOS machine) will run etcd by default, allowing units deployed to the cluster to register themselves when they start up and remove themselves when they shut down.

Confd is a configuration management tool that pulls data from etcd at set intervals and is responsible for generating updated configs when it detects a change.
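The flow we're building toward looks like this: when a service starts, it writes a key describing itself into etcd, and confd notices the change and rewrites the HAProxy config. Stripped down to etcdctl commands (the key and value here are placeholders), registration is just:

# register a backend when it starts
etcdctl set /app/your_awesome_app/server-1 "server-1 10.0.0.5:9000"

# and remove it when it stops
etcdctl rm /app/your_awesome_app/server-1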

Cluster configuration

The example cluster we're going to use looks a little like this:

1 Machine running Fedora

This is going to be our load balancer. I'm choosing Fedora for this one machine because it comes with systemd by default, which is going to make it super easy to set up HAProxy and confd. We also don't necessarily want this machine updating all the time like our CoreOS machines will; we want it to remain relatively static, and we need it to keep a static IP address. This could of course be remedied by having multiple load balancers.

3 CoreOS nodes

For this test, we're going to run a cluster of CoreOS machines that will run our etcd cluster. When running etcd, it's a good idea to run at least 3 machines in order to maintain quorum across the cluster. We're also going to be using fleet (which also uses etcd) to schedule our test webservice to the cluster.

Note: to make configuring things easier, I will be using AWS and providing a cloud-config file when creating these machines.

Creating a CoreOS cluster

For the CoreOS cluster, we're going to provide some initialization data via a cloud-config file. This will tell CoreOS to start things like fleet, etcd, and docker and will also provide etcd with the discovery endpoint to use (note, this is etcd 0.4.6, not the new and improved 2.0 [yet]).

Note: you'll need to generate a discovery token by going to https://discovery.etcd.io/new

#cloud-config

coreos:
  etcd:
    discovery: https://discovery.etcd.io/<put your token here>
    addr: $public_ipv4:4001
    peer-addr: $public_ipv4:7001
  units:
    - name: etcd.service
      command: start
    - name: fleet.service
      command: start
      metadata: type=webserver

When running this on AWS, make sure to open up the necessary ports for etcd (4001 and 7001) as well as the ports for your application.
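If you prefer the AWS CLI to the console, opening those ports looks roughly like this (the security group ID and CIDR are placeholders for your own values):

aws ec2 authorize-security-group-ingress --group-id sg-12345678 --protocol tcp --port 4001 --cidr 10.0.0.0/16
aws ec2 authorize-security-group-ingress --group-id sg-12345678 --protocol tcp --port 7001 --cidr 10.0.0.0/16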

Setting up HAProxy and confd

Now that our CoreOS cluster is running, we're going to start up a Fedora-based machine to run HAProxy and confd. In this case, I picked Fedora 21, as that was the most up-to-date version I could find on AWS.

The latest version of HAProxy (1.5.x) is available as an RPM and can be installed using yum:

yum install haproxy.x86_64

The latest version in this case is 1.5.10.
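You can double-check which version actually got installed:

haproxy -v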

The config for HAProxy is located at /etc/haproxy/haproxy.cfg. What we're going to do now is install confd which will overwrite the config, so you may want to save the default config to reference later.

confd - installation and configuration

We're going to be installing version 0.7.1, which can be fetched from the releases page on the confd GitHub repo. The release is a pre-built confd binary, so we don't need to worry about building it ourselves.

curl -OL https://github.com/kelseyhightower/confd/releases/download/v0.7.1/confd-0.7.1-linux-amd64

mv confd-0.7.1-linux-amd64 confd

cp confd /usr/bin && chmod +x /usr/bin/confd
cp confd /usr/sbin && chmod +x /usr/sbin/confd

Running the above commands will download the binary from GitHub, copy it to /usr/bin and /usr/sbin, and make it executable. If you were to just run confd at this point, you'd get some errors that look like this:

2015-01-30T18:51:54Z confd[840]: WARNING Skipping confd config file.
2015-01-30T18:51:54Z confd[840]: ERROR cannot connect to etcd cluster: http://127.0.0.1:4001

By default, confd will look for a config file in /etc/confd. The structure for /etc/confd will look something like this:

├── confd
│   ├── conf.d
│   │   └── haproxy.toml
│   ├── confd.toml
│   └── templates
│       └── haproxy.cfg.tmpl

confd.toml is the overall config for confd which will describe the backend we want to use (etcd), the interval to poll it at, the config directory, etc.

confd.toml

confdir = "/etc/confd"
interval = 20
backend = "etcd"
nodes = [
        "http://<address that points to one of your CoreOS nodes>:4001"
]
prefix = "/"
scheme = "http"
verbose = true

The "nodes" property needs at least one node specified and should point to one of your CoreOS nodes. You could also list each of your three nodes here so that if confd isn't able to reach one, it will try one of the others.

Also worth noting is the "interval" value: here we're telling confd to poll etcd for changes every 20 seconds.

Now let's look at the HAProxy-specific config located at /etc/confd/conf.d/haproxy.toml:

[template]
src = "haproxy.cfg.tmpl"
dest = "/etc/haproxy/haproxy.cfg"
keys = [
        "/app/your_awesome_app"
]
reload_cmd = "echo restarting && /usr/bin/systemctl reload haproxy"

The "keys" attribute lists the various keys within etcd we want confd to monitor. When we launch our app on our CoreOS cluster, each unit file will register itself with etcd by creating a key in the /app/your_awesome_app directory that contains information to insert into the HAProxy config (it's IP address and port to forward traffic to).

The "reload_cmd" attribute is an optional command that confd can run whenever it writes a change to your config. Here we're

Now let's take a look at what the HAProxy template will look like (/etc/confd/templates/haproxy.cfg.tmpl):

global
    log 127.0.0.1    local0
    log 127.0.0.1    local1 notice
    maxconn 4096
    user haproxy
    group haproxy
    daemon
    stats socket /var/run/haproxy.sock mode 600 level admin    

defaults
    log    global
    mode    http
    option    httplog
    option    dontlognull
    retries    3
    option redispatch
    maxconn    2000
    contimeout    5000
    clitimeout    50000
    srvtimeout    50000
    option forwardfor
    option http-server-close

frontend stats
    bind *:8888
    stats enable
    stats uri /

frontend http-in
    bind *:80
    default_backend application-backend

backend application-backend
    balance leastconn
    option httpclose
    option forwardfor
    cookie JSESSIONID prefix

    {{range getvs "/app/your_awesome_app*"}}
    server {{.}} cookie A check
    {{end}}

Most of this is boilerplate from the default HAProxy config, so the sections we want to look at are the frontend and backend at the bottom.

frontend http-in
    bind *:80
    default_backend application-backend

backend application-backend
    balance leastconn
    option httpclose
    option forwardfor
    cookie JSESSIONID prefix

    {{range getvs "/app/your_awesome_app*"}}
    server {{.}} cookie A check
    {{end}}

With our frontend, we're accepting all traffic on port 80 and sending it to the "application-backend". Here we have some Go templates (confd is written in Go); this template will loop over the keys in the etcd directory we defined and print out their values. (You can find more template examples in the confd docs.)
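To make that concrete: given values shaped like the ones we'll register later (e.g. "stupid-server-1 10.10.10.10:9000"), the rendered backend section would come out looking something like this (addresses are placeholders):

backend application-backend
    balance leastconn
    option httpclose
    option forwardfor
    cookie JSESSIONID prefix

    server stupid-server-1 10.10.10.10:9000 cookie A check
    server stupid-server-2 10.10.10.11:9000 cookie A check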

Running confd using systemd

Since we need confd to constantly be monitoring etcd, we're going to use systemd to manage it. This way, if confd crashes or if the machine restarts, confd will always come back up.

Let's create the file /etc/systemd/system/confd.service:

[Unit]
Description=Confd
After=haproxy.service

[Service]
ExecStart=/usr/bin/confd
Restart=always

[Install]
WantedBy=basic.target

If you're unfamiliar with systemd's unit files, I would highly suggest reading the docs, as there are a lot of available options and configurations. This one is pretty simple though: we're telling systemd where to find the confd binary and to always restart the process if it dies. The line WantedBy=basic.target tells systemd to start the process on boot as well.

Now we can install and activate the service:

sudo systemctl enable /etc/systemd/system/confd.service
sudo systemctl start /etc/systemd/system/confd.service

Enabling our unit will symlink the file to /etc/systemd/system/basic.target.wants so that it starts on boot. Calling systemctl start actually starts it for the first time.
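It's worth a quick sanity check that confd actually came up before moving on:

systemctl status confd.service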

If you want to see the log output, you can do so by running:

journalctl -f -u confd.service

Registering your app with etcd

As an example service, we're going to look at a project I have called "stupid-server". It's a simple webserver written in NodeJS. There's a Docker container over on quay.io that we'll be using and scheduling on our cluster using fleet.

stupid-server@.service

Here's what our unit file will look like:

[Unit]
Description=Stupid Server
After=docker.service
Requires=docker.service

[Service]
ExecStartPre=/usr/bin/docker pull quay.io/seanmcgary/stupid-server:latest
ExecStart=/usr/bin/docker run --name stupidservice -p 9000:8000 quay.io/seanmcgary/stupid-server
ExecStop=-/usr/bin/docker kill stupidservice
ExecStop=/usr/bin/docker rm stupidservice
TimeoutStartSec=0
Restart=always
RestartSec=10s

[X-Fleet]
X-Conflicts=stupid-server@*.service

Each time we start the unit, we'll try to pull the latest container from Quay, then proceed with actually starting the server. Now we're going to modify it to register itself with etcd when it starts and de-register itself when it stops.

[Unit]
Description=Stupid Server
After=docker.service
Requires=docker.service

[Service]
ExecStartPre=/usr/bin/docker pull quay.io/seanmcgary/stupid-server
ExecStart=/usr/bin/docker run --name stupidservice -p 9000:8000 quay.io/seanmcgary/stupid-server
ExecStartPost=/bin/bash -c 'etcdctl set /apps/stupid_server/%n "%p-%i $(curl http://169.254.169.254/latest/meta-data/public-ipv4/):9000"'
ExecStop=-/usr/bin/docker kill stupidservice
ExecStop=/usr/bin/docker rm stupidservice
ExecStopPost=/bin/bash -c 'etcdctl rm /apps/stupid_server/%n'
TimeoutStartSec=0
Restart=always
RestartSec=10s

[X-Fleet]
X-Conflicts=stupid-server@*.service

These are the two lines of note:

ExecStartPost=/bin/bash -c 'etcdctl set /apps/stupid_server/%n "%p-%i $(curl http://169.254.169.254/latest/meta-data/public-ipv4/):9000"'
ExecStopPost=/bin/bash -c 'etcdctl rm /apps/stupid_server/%n'

After our service starts, we make a curl request to the AWS metadata service to get the public IP of the machine that we're on (you can also get the private IP if you want) to build the name/IP of the server that will be written to the HAProxy config. The key/value that gets written to etcd looks like this:

Key: /apps/stupid_server/stupid-server@1.service
Value: stupid-server-1 10.10.10.10:9000

Note that the actual IP will be whatever the IP of the machine is.

On the ExecStopPost line, we delete the key from etcd, which in turn will cause confd to recompile the config and reload HAProxy.
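If you want to watch this happen, you can poke at etcd directly as units come and go (assuming etcdctl is on your path):

# list everything registered under the app
etcdctl ls --recursive /apps/stupid_server

# inspect a single entry
etcdctl get /apps/stupid_server/stupid-server@1.service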

Start your server

Now we can actually start our server by submitting it to fleet:

fleetctl start stupid-server@1.service

That's it! Now we can start as many stupid-servers as we want, and they'll automatically show up in HAProxy when they start.
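And since the unit is a template, scaling out is just a matter of starting more instances (the brace expansion here is a bash-ism):

fleetctl start stupid-server@{2,3}.service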

Deploying Docker Containers on CoreOS Using Fleet


Docker containers are the hot tech du jour and today we're going to look at how to deploy your containers to CoreOS using Fleet.

In my previous post, I talked about how to deploy a NodeJS application using a pretty vanilla Docker container. Now I'm going to walk you through what CoreOS is and how to deploy your containers using fleet.

What is CoreOS?

The masthead on the CoreOS website puts it perfectly:

Linux for Massive Server Deployments. CoreOS enables warehouse-scale computing on top of a minimal, modern operating system.

Without all the buzzwords, CoreOS is a stripped-down Linux distro that gives you a bare-bones environment designed to run containers. That means you effectively get systemd, Docker, fleet, and etcd (as well as other low-level things), all of which play a role in deploying our containers.

CoreOS is available on a bunch of different cloud platforms, including EC2, Google Compute Engine, and Rackspace. You can even run a cluster locally using Vagrant. For today, we're going to be using EC2.

Fleet and Etcd

Bundled with CoreOS you'll find fleet and etcd. Etcd is a distributed key-value store, built on top of the Raft protocol, that acts as the backbone for your CoreOS cluster. Fleet is a low-level init system that uses etcd and interacts with systemd on each CoreOS machine. It handles everything from scheduling services, to migrating services should you lose a node, to restarting them should they go down. Think of it as systemd for a distributed cluster.

Setting up a container

For this tutorial, I created a stupidly simple NodeJS webserver as well as a container. You can find it on GitHub at seanmcgary/stupid-server. All it does is print out the current time for every request you make to it on port 8000.

var http = require('http');

var server = http.createServer(function(req, res){
    res.end(new Date().toISOString());
});

server.listen(8000);

The Dockerfile for it is pretty simple too. It's built off another container that has NodeJS already built and installed. Like in my previous tutorial, it includes a start.sh script that pulls the latest git repo and runs the application each time the container is run. This way, updating your application only requires restarting your container.

Dockerfile

# DOCKER-VERSION 0.11.0

FROM quay.io/seanmcgary/nodejs-raw-base
MAINTAINER Sean McGary <sean@seanmcgary.com>


EXPOSE 8000

ADD start.sh start.sh

RUN chmod +x start.sh

CMD ./start.sh

start.sh

#! /bin/bash

git clone https://github.com/seanmcgary/stupid-server.git stupid-server

node stupid-server

Creating a Systemd unit

Remember how I said fleet is like a distributed systemd? That means all we need to do is create a systemd unit file (in this case, a template) that we will submit to fleet for scheduling. Fleet will be responsible for finding a machine to run it on, but once it does, the unit file is copied directly to that machine to be run. This is what our unit file will look like:

stupidServerVanilla@.service

[Unit]
Description=Stupid Server
After=docker.service
Requires=docker.service

[Service]
ExecStartPre=/usr/bin/docker pull quay.io/seanmcgary/stupid-server:latest
ExecStart=/usr/bin/docker run --name stupidservice -p 9000:8000 quay.io/seanmcgary/stupid-server
ExecStop=-/usr/bin/docker kill stupidservice
ExecStop=/usr/bin/docker rm stupidservice
TimeoutStartSec=0
Restart=always
RestartSec=10s

[X-Fleet]
X-Conflicts=stupidServerVanilla@*.service
  • ExecStartPre: before we start our service, we want to make sure that not only do we have the container downloaded, but we have the latest container version
  • ExecStart: Here we run our container, give it a name and map port 9000 on our host to port 8000 in the container (the one our server is listening on).
  • ExecStop: When the service is stopped, we first kill the container and then remove it; systemd runs multiple ExecStop commands in sequence, and the leading "-" keeps a failed kill (e.g. the container already exited) from aborting the cleanup
  • TimeoutStartSec: This is set to 0 telling Systemd to not timeout during the startup process. We do this because containers can be large and depending on your bandwidth, can take a while to download initially.
  • Restart: This tells Systemd to restart the unit if it dies while it is running.
  • X-Conflicts: This line (and the X-Fleet block) is specific to fleet. It tells fleet not to schedule this unit on a machine that is already running a unit matching the given name. In this case, we want just one instance per machine.

Spinning up some CoreOS nodes

We're going to spin up 3 instances of CoreOS on the beta channel (the current version in beta is 367.1.0). Simply search for "coreos-beta-367" if you're using the web console. You're looking for an AMI with an ID of "33e5e776".

Once you have found it, select which size you want (I picked the micro instance, but you can pick whichever you want). On the configuration details screen, we'll want to enter "3" for the number of instances. We're also going to provide a cloud-config so that CoreOS starts Docker and etcd on startup, including a discovery token for etcd so that the machines can all find each other.

NOTE: make sure to get your own discovery token and replace the one that is in the example. To get a new one, go to https://discovery.etcd.io/new.

#cloud-config

coreos:
  etcd:
    discovery: https://discovery.etcd.io/78c03094374cc2140d261d116c6d31f3
    addr: $public_ipv4:4001
    peer-addr: $public_ipv4:7001
  units:
    - name: etcd.service
      command: start
    - name: fleet.service
      command: start

That's pretty much it. Hit the "launch and review" button and in a few moments you'll have three CoreOS instances up and running.

Scheduling Services with Fleet

Now that our cluster is running, we can start to schedule services on it using fleet. This can be done one of a few ways: you can log directly into one of the machines in your cluster and run fleetctl that way, or you can download the latest binary and run it locally. I'm going to run it locally to make things easier.

If you do decide to run it locally, I would suggest creating an alias, as you'll need to specify some additional flags to tell fleetctl where to find your cluster. I have the following in my .zshrc:

alias fleetcluster="fleetctl --tunnel=path.to.a.node.com"

This way I can just run fleetcluster <command> each time.
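fleetctl can also pick the tunnel endpoint up from the FLEETCTL_TUNNEL environment variable, if you'd rather not maintain an alias (worth double-checking against the fleet docs for your version):

export FLEETCTL_TUNNEL=path.to.a.node.com
fleetctl list-units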

To schedule a service on fleet, we need our unit file, so cd into the directory of your project (I'll be doing this based on the stupid-server from above). Scheduling a service is as easy as fleetcluster start <service>. To schedule the stupid-server, I would run:

$ fleetcluster start stupidServerVanilla@1.service
Job stupidServerVanilla@1.service launched on a33809a9.../10.10.10.10

If you look closely, you'll realize that there is no stupidServerVanilla@1.service file. This is because stupidServerVanilla@.service is a systemd template. Rather than creating a uniquely named file for each service, we have one that is used as a template. Below the command, you'll see that fleet responds with where it scheduled your service. Now, if we run fleetcluster list-units, we should see it:

$ fleetcluster list-units

UNIT                                 STATE       LOAD      ACTIVE        SUB          DESC                     MACHINE
stupidServerVanilla@1.service        launched    loaded    activating    start-pre    Stupid Server            a33809a9.../10.10.10.10

Fleet also takes care of letting you view logs as well. If we want to view the logs of our server, just run:

$ fleetcluster journal -f stupidServerVanilla@1.service

-- Logs begin at Sun 2014-08-24 14:57:19 UTC. --
Aug 25 02:13:49 10.10.10.10 systemd[1]: Starting Stupid Server...
Aug 25 02:13:50 10.10.10.10 docker[3401]: Pulling repository quay.io/seanmcgary/stupid-server
Aug 25 02:16:41 10.10.10.10 systemd[1]: Started Stupid Server.
Aug 25 02:16:41 10.10.10.10 docker[3426]: Cloning into 'stupid-server'...

Fleet communicates with systemd and journald and then pipes the log over ssh to your local terminal session.

Launching a Fleet of Services

Since we created a Systemd template for our unit file, we can use fleet to launch as many as we want at once. If we wanted to launch three more services we would just run:

$ fleetcluster start stupidServerVanilla@{2,3,4}.service

Now if we look at our units:

stupidServerVanilla@1.service        launched    loaded   deactivating    stop-sigterm        Stupid Server        a33809a9.../10.10.10.10
stupidServerVanilla@2.service        launched    loaded   activating      start-pre           Stupid Server        b4809b8d.../10.10.10.11
stupidServerVanilla@3.service        inactive    -        -               -                   Stupid Server        -
stupidServerVanilla@4.service        launched    loaded   activating      start-pre           Stupid Server        27b315e2.../10.10.10.12

You'll see that three of them have been deployed and we have one that's left as inactive. This is because we told fleet to only schedule one per machine.

Stopping and Destroying Your Service

When you need to take down your service or upload a new version of your service file, stopping and destroying are very easy:

$ fleetcluster stop stupidServerVanilla@1.service

$ fleetcluster destroy stupidServerVanilla@1.service

Getting started with DeepLinkr


Being able to provide your customers with the best experience possible is very important, especially in the ecommerce world. Doing so across platforms poses a challenge. Getting customers to the right place based on the platform they are currently on is an even bigger challenge, but very beneficial if you can pull it off. DeepLinkr is a service that aims to accomplish that for companies that have both a web and native app presence. DeepLinkr uses deep links (also recently called app links) to accomplish this.

DeepLinkr is a platform that allows you to get your customers where they need to be so that they have the best experience possible. A common case, especially among e-commerce companies, is to have a web-based store and a native application. Say you send out a marketing email to your customers. According to statistics, around 65% of the emails you just sent will be opened by people on mobile devices. Knowing that, would you rather drop your customers directly into your optimized native application, or put them on your website, where they might fumble around a little because it's not streamlined for mobile use? The same goes if you're posting on social media, which includes a large number of mobile users.

It all starts with a single link. This single link can take your customers in a number of directions including simply redirecting them onto your website, or dropping them into your iOS or Android app if you have one. If the customer doesn't have your native app installed, they will fall back to the website URL that you provide.

Let's pick a link to use. It so happens Etsy has deeplinking support built into their website. We'll use this link to a cool BMW print that I purchased a little while back:
https://www.etsy.com/listing/94416200/classic-car-print-bmw-2002-turbo-m?ref=related-4

All we need to do is take the URL to this page and drop it into the URL field in the DeepLinkr link creator.

When we paste the link into the form, DeepLinkr crawls the linked page to see if it can find any meta information regarding deeplinks. As it turns out, Twitter and Facebook have done some heavy lifting for us already by establishing some standards for defining deeplinks for a given page. Applinks.org has all the documentation you need should you want to add these meta tags to your pages to make things a little easier.

If you don't have the meta tags implemented on your site, or you need some custom functionality, you can provide your own app handler URIs.

Now that we have our link, we can send it out over email, social media, etc., and we can track and see where people are coming from and what platforms they are on (browser, operating system, mobile device platform, etc).

You'll be able to see links as they come in over time and even break down visits to see which time of day people are clicking your links.

App attribution

Since we've just launched, we're still working on adding some features. One of those is the ability to flag a link click as having been opened in the app. Everything up until this point can be done with very little to no development at all. This, however, will require some integration within your app to communicate back to the (future) DeepLinkr API and say "hey, this click actually opened the application".

With this type of integration, you'll be able to accurately determine whether or not customers are more apt to convert when dropped directly into your app. You'll also be able to accurately track clicks from Facebook, Twitter, or Google ads. There is a lot of potential in this space, so jump on board with us and help us build a system that can help you and other businesses out there!

How to use systemd timers


Having started to switch over to Fedora recently (as well as CoreOS), I needed to figure out how to run jobs at certain times and/or intervals. On an OS like Ubuntu, this is accomplished using cron. Cron is the world's largest pain in the ass, at least in comparison to how easy it is to create timers under systemd.

For this example, we're going to set up a timer that runs every minute and then create a service that attaches and uses that timer.

Creating the timer

A timer is just like any other unit file except it has a [Timer] section. It looks a little bit like this:

/etc/systemd/system/minute-timer.timer

[Unit]
Description=Minute Timer

[Timer]
OnBootSec=5min
OnCalendar=*:0/1
Unit=minute-timer.target

[Install]
WantedBy=basic.target

Systemd is pretty powerful with its OnCalendar option. In this case we're telling it to run every minute, but we could get REALLY specific if we wanted. Have a look at the docs to learn more about what's possible.
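A few examples of the kinds of OnCalendar expressions you can write (a sketch; see systemd.time(7) for the full grammar):

# every minute (what we're using here)
OnCalendar=*:0/1
# 09:00 on weekdays
OnCalendar=Mon..Fri 09:00
# 03:00 on the first day of every month
OnCalendar=*-*-1 03:00:00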

Now that we have a timer, we need to create a target that will be used by our actual services.

/etc/systemd/system/minute-timer.target

[Unit]
Description=Minute Timer Target
StopWhenUnneeded=yes

Let's create a test service now that will simply print the current date each time it is run.

/etc/systemd/system/testservice.service

[Unit]
Description=Prints the date every minute
Wants=minute-timer.timer

[Service]
ExecStart=/bin/date

[Install]
WantedBy=minute-timer.target

Start your timers

Now that we have our timer and test service created, we need to enable and start everything using systemctl:

systemctl enable /etc/systemd/system/minute-timer.timer
systemctl start  /etc/systemd/system/minute-timer.timer

systemctl enable /etc/systemd/system/testservice.service

Now, if everything goes as planned, we can watch the logs of our service print the date every minute:

> journalctl -f -u testservice.service

Jul 07 21:24:00 ip-10-10-10-10 systemd[1]: Starting Prints the date every minute...
Jul 07 21:24:00 ip-10-10-10-10 systemd[1]: Started Prints the date every minute.
Jul 07 21:24:00 ip-10-10-10-10 date[20887]: Mon Jul  7 21:24:00 UTC 2014
Jul 07 21:25:00 ip-10-10-10-10 systemd[1]: Starting Prints the date every minute...
Jul 07 21:25:00 ip-10-10-10-10 systemd[1]: Started Prints the date every minute.
Jul 07 21:25:00 ip-10-10-10-10 date[20889]: Mon Jul  7 21:25:00 UTC 2014
Jul 07 21:26:00 ip-10-10-10-10 systemd[1]: Starting Prints the date every minute...
Jul 07 21:26:00 ip-10-10-10-10 systemd[1]: Started Prints the date every minute.
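You can also list the active timers on the machine, along with when each last fired and when it fires next:

systemctl list-timers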

When it comes to segmenting data to be visualized, Elasticsearch has become my go-to database, as it will basically do all the work for me. Need to find how many times a specific search term shows up in a data field? It can do that for you. Need to sum the totals of a collection of placed orders over a time period? It can do that too. Today, though, I'm going to be talking about generating a date histogram, but this one is a little special because it uses Elasticsearch's new aggregations feature (basically facets on steroids), which will allow us to fill in some empty holes.

First came facets

Back before v1.0, Elasticsearch started with this cool feature called facets. A facet was a built-in way to query and aggregate your data in a statistical fashion. Like I said in my introduction, you could analyze the number of times a term showed up in a field, you could sum together fields to get a total, mean, median, etc. You could even have Elasticsearch generate a histogram, or even a date histogram (a histogram over time), for you. The date histogram was particularly interesting, as you could give it an interval to bucket the data into. This could be anything from a second to a minute to two weeks, etc. That was about as far as you could go with it, though.

Aggregations - facets on steroids

With the release of Elasticsearch v1.0 came aggregations. If you look at the aggregation syntax, they look pretty similar to facets. A lot of the facet types are also available as aggregations. The general structure for aggregations looks something like this:

"aggregations" : {
    "<aggregation_name>" : {
        "<aggregation_type>" : {
            <aggregation_body>
        }
        [,"aggregations" : { [<sub_aggregation>]+ } ]?
    }
    [,"<aggregation_name_2>" : { ... } ]*
}

Let's take a quick look at a basic date histogram facet and aggregation:

// this is a facet
{
    query: {
        match_all: {}
    },
    facet: {
        some_date_facet: {
            date_histogram: {
                key_field: "timestamp",
                value_field: "widgets_sold",
                interval: "day"
            }
        }
    }
}

// this is an aggregation
{
    query: {
        match_all: {}
    },
    aggregations: {
        some_date_facet: {
            date_histogram: {
                field: "timestamp",
                interval: "day"
            }
        }
    }
}

They look pretty much the same, though they return fairly different data. The facet date histogram will return stats for each date bucket, whereas the aggregation will return a bucket with the number of matching documents for each. The reason for this is that aggregations can be combined and nested together. So if you wanted data similar to the facet, you could then run a stats aggregation on each bucket:

{
    query: {
        match_all: {}
    },
    aggregations: {
        some_date_facet: {
            date_histogram: {
                field: "timestamp",
                interval: "day"
            },
            aggregations: {
                bucket_stats: {
                    stats: {
                        field: "widgets_sold"
                    }
                }
            }
        }
    }
}

Filling in the missing holes

One of the issues that I've run into with the date histogram facet is that it will only return buckets based on the applicable data. This means that if you are trying to get stats over a date range and nothing matches, it will return nothing. If I'm trying to draw a graph, this isn't very helpful. One of the new features in the date histogram aggregation is the ability to fill in those holes in the data. I'll walk you through an example of how it works.

Let's first get some data into our Elasticsearch database. We're going to create an index called dates and a type called entry:

curl -XPOST http://elasticsearch.local:9200/dates/entry -d '{ "date": "2014-05-21T00:00:00.000Z", "value": 10 }'
curl -XPOST http://elasticsearch.local:9200/dates/entry -d '{ "date": "2014-05-22T00:00:00.000Z", "value": 10 }'
curl -XPOST http://elasticsearch.local:9200/dates/entry -d '{ "date": "2014-05-23T00:00:00.000Z", "value": 10 }'
curl -XPOST http://elasticsearch.local:9200/dates/entry -d '{ "date": "2014-05-26T00:00:00.000Z", "value": 10 }'
curl -XPOST http://elasticsearch.local:9200/dates/entry -d '{ "date": "2014-05-30T00:00:00.000Z", "value": 10 }'
curl -XPOST http://elasticsearch.local:9200/dates/entry -d '{ "date": "2014-06-10T00:00:00.000Z", "value": 10 }'
curl -XPOST http://elasticsearch.local:9200/dates/entry -d '{ "date": "2014-05-11T00:00:00.000Z", "value": 10 }'
curl -XPOST http://elasticsearch.local:9200/dates/entry -d '{ "date": "2014-05-12T00:00:00.000Z", "value": 10 }'

Run that and it'll insert some dates that have some gaps in between. Let's now create an aggregation that calculates the number of documents per day:

curl -XGET http://elasticsearch.local:9200/dates/entry/_search -d '
{
  "query": {
    "match_all": {}
  },
  "aggregations": {
    "dates_with_holes": {
      "date_histogram": {
        "field": "date",
        "interval": "day"
      }
    }
  }
}
'

If we run that, we'll get a result with an aggregations object that looks like this:

"aggregations":{
  "dates_with_holes":{
     "buckets":[
        {
           "key_as_string":"2014-05-11T00:00:00.000Z",
           "key":1399766400000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-12T00:00:00.000Z",
           "key":1399852800000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-21T00:00:00.000Z",
           "key":1400630400000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-22T00:00:00.000Z",
           "key":1400716800000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-23T00:00:00.000Z",
           "key":1400803200000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-26T00:00:00.000Z",
           "key":1401062400000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-30T00:00:00.000Z",
           "key":1401408000000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-06-10T00:00:00.000Z",
           "key":1402358400000,
           "doc_count":1
        }
     ]
  }
}

As you can see, it returned a bucket for each date that was matched. In this case, since each date we inserted was unique, it returned one for each. That's cool, but what if we want the gaps between dates filled in with a zero value? Turns out there is an option you can provide to do this: min_doc_count. In this case we'll specify min_doc_count: 0. Our new query will then look like:

curl -XGET http://elasticsearch.local:9200/dates/entry/_search -d '
{
  "query": {
    "match_all": {}
  },
  "aggregations": {
    "dates_with_holes": {
      "date_histogram": {
        "field": "date",
        "interval": "day",
        "min_doc_count": 0
      }
    }
  }
}
'

Alright, now we have some zeros:

"aggregations":{
  "dates_with_holes":{
     "buckets":[
        {
           "key_as_string":"2014-05-11T00:00:00.000Z",
           "key":1399766400000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-12T00:00:00.000Z",
           "key":1399852800000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-13T00:00:00.000Z",
           "key":1399939200000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-14T00:00:00.000Z",
           "key":1400025600000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-15T00:00:00.000Z",
           "key":1400112000000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-16T00:00:00.000Z",
           "key":1400198400000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-17T00:00:00.000Z",
           "key":1400284800000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-18T00:00:00.000Z",
           "key":1400371200000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-19T00:00:00.000Z",
           "key":1400457600000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-20T00:00:00.000Z",
           "key":1400544000000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-21T00:00:00.000Z",
           "key":1400630400000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-22T00:00:00.000Z",
           "key":1400716800000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-23T00:00:00.000Z",
           "key":1400803200000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-24T00:00:00.000Z",
           "key":1400889600000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-25T00:00:00.000Z",
           "key":1400976000000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-26T00:00:00.000Z",
           "key":1401062400000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-27T00:00:00.000Z",
           "key":1401148800000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-28T00:00:00.000Z",
           "key":1401235200000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-29T00:00:00.000Z",
           "key":1401321600000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-30T00:00:00.000Z",
           "key":1401408000000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-31T00:00:00.000Z",
           "key":1401494400000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-01T00:00:00.000Z",
           "key":1401580800000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-02T00:00:00.000Z",
           "key":1401667200000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-03T00:00:00.000Z",
           "key":1401753600000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-04T00:00:00.000Z",
           "key":1401840000000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-05T00:00:00.000Z",
           "key":1401926400000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-06T00:00:00.000Z",
           "key":1402012800000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-07T00:00:00.000Z",
           "key":1402099200000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-08T00:00:00.000Z",
           "key":1402185600000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-09T00:00:00.000Z",
           "key":1402272000000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-10T00:00:00.000Z",
           "key":1402358400000,
           "doc_count":1
        }
     ]
  }
}

All of the gaps are now filled in with zeroes. Here comes our next use case: say I want to aggregate documents by day for dates between 5/1/2014 and 5/30/2014. min_doc_count only fills holes between existing datapoints, so nothing before our earliest document would show up. Turns out, we can actually tell Elasticsearch to populate that range as well by passing an extended_bounds object, which takes min and max values. This way we can generate buckets for missing data that isn't between existing datapoints. Our query now becomes:

curl -XGET http://elasticsearch.local:9200/dates/entry/_search -d '
{
  "query": {
    "match_all": {}
  },
  "aggregations": {
    "dates_with_holes": {
      "date_histogram": {
        "field": "date",
        "interval": "day",
        "min_doc_count": 0,
        "extended_bounds": {
            "min": 1398927600000,
            "max": 1401433200000
        }
      }
    }
  }
}
'

The weird caveat to this is that the min and max values have to be numerical timestamps (epoch milliseconds), not date strings.
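If you need to generate those by hand, GNU date will get you most of the way there (a quick sketch; the appended zeros convert seconds to milliseconds):

# epoch milliseconds for midnight UTC on 2014-05-01
echo "$(date -d '2014-05-01T00:00:00Z' +%s)000"
# => 1398902400000

Now our resultset looks like this: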

"aggregations":{
  "dates_with_holes":{
     "buckets":[
        {
           "key_as_string":"2014-05-01T00:00:00.000Z",
           "key":1398902400000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-02T00:00:00.000Z",
           "key":1398988800000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-03T00:00:00.000Z",
           "key":1399075200000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-04T00:00:00.000Z",
           "key":1399161600000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-05T00:00:00.000Z",
           "key":1399248000000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-06T00:00:00.000Z",
           "key":1399334400000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-07T00:00:00.000Z",
           "key":1399420800000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-08T00:00:00.000Z",
           "key":1399507200000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-09T00:00:00.000Z",
           "key":1399593600000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-10T00:00:00.000Z",
           "key":1399680000000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-11T00:00:00.000Z",
           "key":1399766400000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-12T00:00:00.000Z",
           "key":1399852800000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-13T00:00:00.000Z",
           "key":1399939200000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-14T00:00:00.000Z",
           "key":1400025600000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-15T00:00:00.000Z",
           "key":1400112000000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-16T00:00:00.000Z",
           "key":1400198400000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-17T00:00:00.000Z",
           "key":1400284800000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-18T00:00:00.000Z",
           "key":1400371200000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-19T00:00:00.000Z",
           "key":1400457600000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-20T00:00:00.000Z",
           "key":1400544000000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-21T00:00:00.000Z",
           "key":1400630400000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-22T00:00:00.000Z",
           "key":1400716800000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-23T00:00:00.000Z",
           "key":1400803200000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-24T00:00:00.000Z",
           "key":1400889600000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-25T00:00:00.000Z",
           "key":1400976000000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-26T00:00:00.000Z",
           "key":1401062400000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-27T00:00:00.000Z",
           "key":1401148800000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-28T00:00:00.000Z",
           "key":1401235200000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-29T00:00:00.000Z",
           "key":1401321600000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-05-30T00:00:00.000Z",
           "key":1401408000000,
           "doc_count":1
        },
        {
           "key_as_string":"2014-05-31T00:00:00.000Z",
           "key":1401494400000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-01T00:00:00.000Z",
           "key":1401580800000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-02T00:00:00.000Z",
           "key":1401667200000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-03T00:00:00.000Z",
           "key":1401753600000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-04T00:00:00.000Z",
           "key":1401840000000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-05T00:00:00.000Z",
           "key":1401926400000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-06T00:00:00.000Z",
           "key":1402012800000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-07T00:00:00.000Z",
           "key":1402099200000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-08T00:00:00.000Z",
           "key":1402185600000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-09T00:00:00.000Z",
           "key":1402272000000,
           "doc_count":0
        },
        {
           "key_as_string":"2014-06-10T00:00:00.000Z",
           "key":1402358400000,
           "doc_count":1
        }
     ]
  }
}

Elasticsearch returned to us points for every day in our min/max value range.

That about does it for this particular feature. Now if we wanted to, we could take the returned data and drop it into a graph pretty easily, or we could go on to run a nested aggregation on the data in each bucket.

Extending Underscore Templates with Partials


Back in March, I made a post on how to share Underscore templates between your client and server. Now it's time to take what I talked about there to the next level, making it even easier and adding some structure to your templates.

Organization and Structure

Out of the few template systems that I've looked at, hardly any of them come with a nice way of structuring template files, and that's if they use files at all in the first place. A lot of templating engines rely on embedding your templates in <script> tags in your HTML files. To me this seems really messy and disorganized, and it isn't portable if you want to render things server side.

Last time, we were able to take a template that occupied a single file, dump it into Underscore (or Lodash, if that's your preference), and use that compiled template in the same way on the server and client. The trouble here is that each template, no matter how small or large, must occupy a single file. That can potentially get out of hand really quickly. It also doesn't give you an easy way of referencing templates from within other templates.

Introducing node-partials

Node-partials is an npm module that wraps Underscore/Lo-Dash templates and allows you to define multiple partials within one template file. In doing so, it also lets us easily reference partials from another template/partial file altogether.

To install it, simply run:

npm install node-partials

Now, let's take a look at how our template files are structured. Previously, our template files looked something like this:

<div class="some class">
    Like some HTML for example
</div>
<div class="some-val">
    This is some content of the partial
</div>

With node-partials, we would structure it like this:

## some-page-component
<div class="some class">
    Like some HTML for example
</div>
<div class="some-val">
    <%= foo %>
</div>

## text-content-partial
This is some content of the partial

Here, we have two partials in our template file: some-page-component and text-content-partial. Each partial name is identified by a ## followed by a space, followed by the name of your partial. Now, how do we use this?

Initialization and setup

We have a template with some partials, and we have the node-partials module installed. Let's set it up:

var Partials = require('node-partials');
var templatePath = '/some/path/to/your/templates';

var partials = new Partials({
    delimiter: '## ',           // defaults to '## ' but can be pretty much whatever you want
    validFileTypes: ['html']    // defaults to 'html' only
});

var templates = partials.compile(templatePath);
var serializedTemplates = partials.serializeTemplates(templates);

For node-partials to work, we create an instance, passing options that tell it which types of files to look for and which delimiter to use when parsing the partials out of each template file. If you pass nothing at all, it will default to looking for .html files and use the double hash (##) as the partial name delimiter. Once it's initialized, we can call the .compile() function, passing it a path. The module will traverse the directory looking for the file types you specified, parse out the partials, run their contents through the Underscore/Lo-Dash template function, and store each compiled template in an object that is then returned.

When parsing the files, the file's name/path plus the partial name will be used as the key to access each partial. For example, if a file named my-template.html contains the two partials from the example above, we would get two entries in the returned object that look like this:

{
    'my-template/some-page-component': <compiled template>,
    'my-template/text-content-partial': <compiled template>
}

Since the path/name of the file is used to construct the key, we can even have nested directories of templates. If we had some-widget/some-template.html, we would address the partials in it like this:

{
    'some-widget/some-template/<partial name>': <compiled template>
}

Rendering templates

Rendering the partials is nearly the same as it was in my previous post. Rather than having a single variable, we have an object containing a bunch of templates. Putting it all together, it would look something like this:

var Partials = require('node-partials');
var templatePath = '/some/path/to/your/templates';

var partials = new Partials({
    delimiter: '## ',           // defaults to '## ' but can be pretty much whatever you want
    validFileTypes: ['html']    // defaults to 'html' only
});

var templates = partials.compile(templatePath);

console.log(templates['my-template/some-page-component']({ someData: 'some value' }));

Sharing the partials and templates with the client

Since each compiled template is a function, we can serialize everything and dump the source into a file that we can serve to the client, much like we did in the previous post. Once your templates are compiled, you can pass that object to the serializeTemplates() function of your instantiated node-partials object, which returns a stringified representation of the templates object that can then be shipped to the client as a plain old Javascript file.

var Partials = require('node-partials');
var templatePath = '/some/path/to/your/templates';

var partials = new Partials({
    delimiter: '## ',           // defaults to '## ' but can be pretty much whatever you want
    validFileTypes: ['html']    // defaults to 'html' only
});

var templates = partials.compile(templatePath);
var serializedTemplates = partials.serializeTemplates(templates);

When your templates are loaded on the client, the object can be found in the window.__views global variable and works exactly the same as the templates object on the server.
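
How you deliver that string is up to you. One minimal sketch is to serve it straight from memory with Express; the route path here is arbitrary, and serializedTemplates is assumed to hold the string from the example above:

var express = require('express');
var app = express();

// assume serializedTemplates is the string returned by
// partials.serializeTemplates(templates) from the example above
app.get('/js/templates.js', function(req, res){
    // serve the serialized templates as a plain Javascript file
    res.set('Content-Type', 'application/javascript');
    res.send(serializedTemplates);
});

app.listen(8080);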

Deploying a NodeJS Application Using Docker

by on

Docker is an open-source project to easily create lightweight, portable, self-sufficient containers from any application. The same container that a developer builds and tests on a laptop can run at scale, in production, on VMs, bare metal, OpenStack clusters, public clouds and more.

A little over a year ago, Docker was released, built on top of Linux Containers, or LXC for short. Linux Containers have been around for a while and are really interesting in that they provide operating-system-level virtualization. Rather than a hypervisor running full operating systems on a piece of hardware (like Xen, if you're familiar), Linux Containers rely on the kernel of the host operating system. Think of LXC as a fancy kind of chrooted environment. Docker builds on top of that, essentially allowing you to run an operating system on an operating system. This makes containers a really attractive method for distributing applications: you can build one container and it will run on any host operating system that supports Docker.

Building a base container

Our ultimate goal here is to end up with a container that can run a simple NodeJS web server.

_Note: I am going to assume that you have done some reading on Docker and have probably done their introduction/walkthrough. I'm also going to assume that your machine has Docker installed and running._

To start with, we need a base operating system for our container. I personally like Ubuntu, so we're gonna use 13.10 as our base image. Let's create a Dockerfile and populate it with the following:

# DOCKER-VERSION 0.10.0

FROM ubuntu:13.10

This tells Docker to go fetch Ubuntu from the registry and use version 13.10. In case you are wondering or have forgotten, containers can be used to build other containers, and they can be publicly stored in the Docker registry. If you were to publish your own container, you'd end up with a repository name that looks like <username>/<container name>:<revision>. In this case, ubuntu happens to be a special repository that doesn't belong to a particular user.
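
For example, pulling the special base repository versus a user-owned one looks like this (the username and image name in the second command are hypothetical):

docker pull ubuntu:13.10
docker pull someuser/some-container:latest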

Now we have a very basic container. If we wanted to build it, we could run:

docker build -t <username>/ubuntu-base .

*This assumes that you are in the same directory as your Dockerfile.

Installing node and npm

Remember how I said containers are effectively operating systems? This means that we can use the container exactly as we would our local machine. Just to show you that our container is basically a base Ubuntu image, try running:

docker run -i -t <username>/ubuntu-base /bin/bash

This will fire up our built container and execute /bin/bash. The -i flag tells Docker to keep stdin open so we can interact with the shell, and the -t flag tells it to allocate a tty, giving us an interactive session as if we had logged in or ssh'd into a machine.

Now let's get NodeJS and NPM installed. We're going to install git while we're at it so that we can clone our repository into the container later. We can do all of this through apt-get.

# DOCKER-VERSION 0.10.0

FROM ubuntu:13.10

# make sure apt is up to date
RUN apt-get update

# install nodejs and npm
RUN apt-get install -y nodejs npm git git-core

If it isn't obvious already, the RUN instruction takes a command and runs it. The interesting thing to note about Docker here is that it caches the state after each command. This is how we can incrementally build a system in a container and not have to rebuild the entire thing each time we deploy that container.

Another thing to note here is that Docker runs these commands without the use of stdin when building, so we need to pass the -y flag to apt-get install to tell it "yes, install these packages and their dependencies".

Let's build our container again.

docker build -t <username>/ubuntu-base .

Now, if we were to run our container and open an interactive session, NodeJS, NPM, and git would be available to us on the command line.
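
One caveat worth checking while you're in that shell: Ubuntu's apt packages install the Node binary as nodejs rather than node, so depending on your package versions you may want to add a symlink (verify this in your own container before relying on it):

# inside the container; the Debian/Ubuntu package names the binary "nodejs"
nodejs --version
npm --version

# optionally alias it so plain "node" works as well
ln -s /usr/bin/nodejs /usr/local/bin/node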

A simple Node webserver

Let's create a simple web server in Node using express.

index.js

var express = require('express');

var app = express();

app.get('/', function(req, res){
    res.send('Hello from inside a container!');
});

app.listen(8080);

package.json

{
  "name": "my-cool-webserver",
  "version": "0.0.1",
  "description": "A NodeJS webserver to run inside a docker container",
  "main": "index.js",
  "author": "sean@seanmcgary.com",
  "license": "MIT",
  "dependencies": {
      "express": "*"
  }
}
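
Before baking the app into the container, it's worth a quick sanity check on your host (assuming you have Node and npm installed locally):

npm install
node index.js &
curl http://localhost:8080
# => Hello from inside a container!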

To make this container easy to deploy and update, every time it runs it will pull the latest version of our app from a remote git repository. So go ahead and commit your app and push it to a remote git repository.

Running the application

To run our application, we need to again modify our Dockerfile with a few things:

  • We need to expose/map port 8080 between the container and host. Remember, a container is basically a fancy chroot, so unless we tell the host operating system to map a port to it, nothing can reach the container from the outside.
  • We need to pull the app from the remote repository
  • Run npm install to make sure express is installed
  • Finally run our application

Let's modify our Dockerfile:

# DOCKER-VERSION 0.10.0

FROM ubuntu:13.10

# make sure apt is up to date
RUN apt-get update

# install nodejs and npm
RUN apt-get install -y nodejs npm git git-core

# let Docker know the container listens on port 8080
EXPOSE 8080

ADD start.sh /tmp/

RUN chmod +x /tmp/start.sh

CMD ./tmp/start.sh

So here we use the ADD instruction to copy a file called start.sh into /tmp/ in our container, make it executable, and then run it. You're probably wondering what the hell is in start.sh. Aren't we supposed to be running a node app?

Here's what start.sh looks like:

#!/bin/bash

cd /tmp

# try to remove the repo if it already exists
rm -rf <git repo name>; true

git clone <remote git repo>

cd <git repo name>

npm install

node .

The reason we put these commands into a script file is so that Docker won't cache their result. Unlike RUN, the CMD instruction is used to start and run whatever it is you want running in your container. It is always the last thing in your Dockerfile and is executed every time your container is started or restarted. This way, we clone the repository fresh every time, which makes deploying an update really easy: just restart your container!

Let's build this thing and name it something more descriptive:

docker build -t <username>/my-nodejs-webserver .

Now, to run it, we're going to do something like this:

docker run -p 8080:8080 <username>/my-nodejs-webserver

You'll notice that we have a -p flag in there. This says "take port 8080 on the host operating system and map it to port 8080 in the container". Now we can send and receive web traffic from our container. The other thing you'll notice is that once you run that command, there isn't any output. To see what's going on, run:

docker ps -a

This will give you something that looks like this:

$ docker ps -a
CONTAINER ID        IMAGE               COMMAND                 CREATED             STATUS                   PORTS                    NAMES
4acbdf4c6695        91f00a99f058        /bin/sh -c ./start.s    2 days ago          Exited (0) 2 days ago    0.0.0.0:8080->8080/tcp   hopeful_hawking

That first column is the container ID, which we can use to view the container's logs.

docker logs -f 4acbdf4c6695

This will tail the log for you.

When you're ready to stop your container, simply run:

docker stop 4acbdf4c6695

You can also start and restart it in the same way:

docker start 4acbdf4c6695

docker restart 4acbdf4c6695

Now that we're all done, we can push our container to the public registry:

docker push <username>/my-nodejs-webserver
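
If you haven't authenticated against the registry yet, the push will prompt you to log in; you can also do so ahead of time:

docker login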

To be continued...

This is the first post in a series I plan on writing about my experiences with Docker and implementing some technologies from CoreOS, including etcd, fleet, and CoreOS itself, to create an automated, distributed application environment.

A NodeJS Module for Delighted

by on

Recently at work, we decided to integrate with a cool little service called Delighted to start getting feedback from our users after they have made a purchase from us.

Delighted is the easiest and most beautiful way to measure customer happiness. Are your customers delighted?

Delighted does one thing and does it really well: it provides a service for automating the sending of NPS emails to customers. For those that don't know, NPS, or "Net Promoter Score", is a system designed to gauge the loyalty of a company's customers. Ever gotten one of those emails or popups on a site that asks "how likely are you to recommend our product to someone?" That's exactly what Delighted does, just in a slightly more elegant way.

They send a simple and straightforward email asking you to provide a rating from 0 to 10. After picking a number, they will also give you the chance to provide a comment if you so choose.

Automation and Implementation

At ThirdLove, we needed a way to automate this. Lucky for us, Delighted provides a RESTful API, with endpoints for sending emails, fetching responses, and even metrics. The thing we have found most useful so far is the ability to schedule emails to be sent to customers in the future. This way, when a customer makes a purchase, our backend makes a call to Delighted, telling it to send an email to the customer n days in the future.

When I started looking at the API docs, I noticed that they didn't have a Node module - just a Ruby gem and raw curl examples. With that being the case, and our backend being written in Node, I decided to create a module that wraps their API to make it easy to interact with.

The source can be found here on Github.

The module is very simple and basically wraps the API calls, forwarding on the JSON payloads that Delighted's API returns. It uses the Q promise library rather than traditional callbacks. Included in the repository are a few examples and a pretty detailed README that explains all of the methods and the parameters they take (nearly identical to the Delighted API docs).
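
To give a feel for the shape of the module, here's a hypothetical sketch of scheduling a survey. The require name and method names below are assumptions for illustration only, so check the README for the actual API; the delay value follows Delighted's convention of expressing the delay in seconds:

// hypothetical usage sketch -- see the repository README for the real API
var Delighted = require('delighted');
var delighted = new Delighted('YOUR_API_KEY');

// schedule a survey email for three days from now
delighted.createPerson({
    email: 'customer@example.com',
    delay: 3 * 24 * 60 * 60     // Delighted's API takes the delay in seconds
}).then(function(person){
    console.log('survey scheduled for', person.email);
}, function(err){
    console.error(err);
});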

Sharing Underscore Templates Between the Client and Server

by on

So one day you decided to write a web application where the rendering of templates is shared between the server and client, and you wonder to yourself: "how in the world can I share templates between my server and client without needlessly duplicating code and/or templates?" Today we're going to look at how to address this problem using Underscore's (or Lo-Dash's, if you prefer) templating engine.

Templates in Underscore

For the uninformed, Underscore (and Lo-Dash, a fork of Underscore with more functionality that is allegedly faster) is a Javascript utility library that provides a crap-load of useful (and cross-browser) helper functions, one of which is a templating system similar to both EJS and ERB, for those of you who may have used Ruby. Even better, Underscore and Lo-Dash work not only in the browser but in NodeJS as well, making their templating system perfect for this use case.

Templates look like this:

<div class="my-super-awesome-div">
    <%= mySuperAwesomeVariable %>
</div>
<ul class="things-that-are-awesome">
    <% _.each(thingsThatAreAwesome, function(thing){ %>
        <li><%= thing %></li>        
    <% }); %>
</ul>

Unlike templating languages such as Mustache/Handlebars, you can use all of the features of Javascript in your templates. Whether you should is entirely up to you; it depends on your general idea of the purpose of templates, and whether logic, let alone ALL of Javascript's features, should be accessible in them.

Generating templates

To begin with, we're going to start on the server side. We'll assume that our template above lives in a file on the filesystem exactly as you see it in the block above, and we're going to read it and feed it as a string into the template engine.

var _ = require('lodash'); // or 'underscore' if you so choose
var fs = require('fs');

fs.readFile('/path/to/your/template', function(err, data){
    if (err) throw err;

    data = data.toString();

    var template = _.template(data);
});

The result of _.template() is a function that you can then pass a block of data to. To use our template we would do something like this:

var templateString = template({
    mySuperAwesomeVariable: 'I\'m super awesome',
    thingsThatAreAwesome: ['This is awesome', 'So is this', 'This is too!']
});

The interesting thing to note here is that we passed an object into our template, but we're referencing variables in the template itself. It turns out that when you evaluate your template, it takes the keys of the object you passed in and creates variables from them in the scope of your template. If we do a console.log(template); we can kind of see what is going on (I've formatted it to be a little more readable):

{ [Function]
  source: 'function(obj) {
      obj || (obj = {});
     var __t, __p = \'\', __e = _.escape, __j = Array.prototype.join;
     function print() { __p += __j.call(arguments, \'\') }\n
     with (obj) {
         __p += \'<div class="my-super-awesome-div">\\n\\t\' +\n
        ((__t = ( mySuperAwesomeVariable )) == null ? \'\' : __t) +\n\
        '\\n</div>\\n<ul class="things-that-are-awesome">\\n\\t\';\n 
        _.each(thingsThatAreAwesome, function(thing){ ;\n
                __p += \'\\n\\t\\t<li>\' 
                +\n((__t = ( thing )) == null ? \'\' : __t) +\n\
                '</li>    \\t\\n\\t\';\n 
         }); ;\n
        __p += \'\\n
        </ul>\\n\';\n\n}\n
        return __p\n
    }' 
}

In short, it creates a function that takes a single argument (the object that we pass in) and builds a string within a with block. The with block is the magic that takes our arguments object and creates variables in the template's scope from the keys and values of the object.
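
You can see the same trick in isolation with a couple of lines of plain Javascript (a minimal illustration, independent of Underscore):

var data = { foo: 'bar' };

// inside the with block, the object's keys behave like local variables
with (data) {
    console.log(foo); // prints 'bar'
}

This is also why these compiled templates can't be evaluated in strict mode, which forbids with.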

Using your template in the client

Now that we have a compiled template, how in the hell do we get it to the client? As we just saw, our template is just a function that returns an evaluated string. All we really need to do is serve up the "source" function to the client. Let's take a look at how we can do that:

var viewString = 'var __views = {};';

viewString += '__views["ourCoolView"] = ' + template.source;

What we're doing here is programmatically building the source of a Javascript file that we're going to serve to the client. If we viewed the whole thing as if we had written it by hand, it would look something like this:

var __views = {};

__views["ourCoolView"] = function(obj) {\nobj || (obj = {});\nvar __t, __p = \'\', __e = _.escape, __j = Array.prototype.join;\nfunction print() { __p += __j.call(arguments, \'\') }\nwith (obj) {\n__p += \'<div class="my-super-awesome-div">\\n\\t\' +\n((__t = ( mySuperAwesomeVariable )) == null ? \'\' : __t) +\n\'\\n</div>\\n<ul class="things-that-are-awesome">\\n\\t\';\n _.each(thingsThatAreAwesome, function(thing){ ;\n__p += \'\\n\\t\\t<li>\' +\n((__t = ( thing )) == null ? \'\' : __t) +\n\'</li>    \\t\\n\\t\';\n }); ;\n__p += \'\\n</ul>\\n\';\n\n}\nreturn __p\n};

When sent to the client, the variable __views will be placed in the global scope (window.__views). To evaluate our template and get the string output like we did before, we would do:

var renderedTemplate = window.__views['ourCoolView']({
    mySuperAwesomeVariable: 'I\'m super awesome',
    thingsThatAreAwesome: ['This is awesome', 'So is this', 'This is too!']
});

$('.someDomElement').html(renderedTemplate);

That's pretty much it! In the next week or two, I'll be following up this post with one on how to extend this system even further by introducing a small library I built called node-partials, which adds inter-file partials as well as compiling multiple files and partials together.