Building your infrastructure with Terraform - Review and Walkthrough

By Martin Rusev

Terraform has been on my "Cool devops projects to check out" list for a while. I am a long-time Vagrant user and I have some experience with Packer and Consul. As a HashiCorp project, I knew what to expect from Terraform beforehand - great branding, a steep learning curve and a lack of documentation for anything beyond the basics. Once past these hurdles, it is a valuable addition to the devops arsenal.

What is Terraform

Terraform is an infrastructure builder. You describe your infrastructure in a template and create it with terraform apply.

Comparable tools are Amazon CloudFormation and OpenStack Heat, but both are locked to their respective platforms. Terraform, on the other hand, supports all the major cloud providers - from DigitalOcean and Linode to the big ones like Azure, Google Compute Engine and, of course, Amazon.

If you are already invested in a CAPS tool (Chef, Ansible, Puppet or Salt), then you already know that all of them have at least decent support for building servers in the cloud. What makes Terraform different? Why should you invest your precious time in learning yet another tool offering similar functionality?

When I was reading through the Terraform docs, I honestly could not see the benefits until after I had taken the time to learn and use it.


# Hello World in Terraform
provider "digitalocean" {
    token = "${var.digitalocean_token}"
}

resource "digitalocean_droplet" "terra" {
    image = "ubuntu-14-04-x64"
    name = "terra-one"
    region = "fra1"
    size = "512mb"
    ssh_keys = ["${var.digitalocean_ssh_fingerprint}"]
}

Benefits of using Terraform

State

In a way, Terraform acts as simple version control for your cloud infrastructure. Once you run terraform apply, it creates a .tfstate file with details about the newly created resources. The next time you run the command, it reads the ids from the tfstate file and compares them to what already exists in the real world.

The tfstate file is plain JSON - you can keep it in Git, share it with your team and build on it. If something bad happens, you can easily rebuild.
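
To give you an idea, a trimmed terraform.tfstate for the droplet above looks roughly like this (the id and attribute values here are made up for illustration, and the exact fields vary between Terraform versions):

{
    "version": 1,
    "serial": 1,
    "modules": [
        {
            "path": ["root"],
            "resources": {
                "digitalocean_droplet.terra": {
                    "type": "digitalocean_droplet",
                    "primary": {
                        "id": "1234567",
                        "attributes": {
                            "ipv4_address": "178.23.22.100",
                            "name": "terra-one",
                            "region": "fra1",
                            "size": "512mb",
                            "status": "active"
                        }
                    }
                }
            }
        }
    ]
}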

Planning

For me this was the main reason to try Terraform. terraform plan matches your definitions against the real-world infrastructure and shows you exactly what is going to happen if you apply them at that moment.
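
For the droplet above, terraform plan prints something roughly like this (trimmed - attributes Terraform only learns after creation show up as <computed>):

+ digitalocean_droplet.terra
    image:        "" => "ubuntu-14-04-x64"
    ipv4_address: "" => "<computed>"
    name:         "" => "terra-one"
    region:       "" => "fra1"
    size:         "" => "512mb"

Plan: 1 to add, 0 to change, 0 to destroy.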

In the real world

One minor issue I had here: any error you encounter while building with Terraform can leave you with a created, but untracked server. This could be an API error on creation or a failure at a later stage, when you do some provisioning.

Piping data between resources

With Terraform you can create a server, get its dynamically generated IP and use that value in a config file on another server that is created afterwards. For example, you can install nginx and then pass the server IP to a DNS service, or install salt-master and then pass the IP to the salt-minions.
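
Here is a rough sketch of the DNS case using the DigitalOcean provider - it assumes the amon.cx domain is already managed in your DigitalOcean account, and the exact argument names may differ slightly between provider versions:

# Create an A record for terra-one.amon.cx pointing to the droplet's IP
resource "digitalocean_record" "terra" {
    domain = "amon.cx"
    type = "A"
    name = "terra-one"
    value = "${digitalocean_droplet.terra.ipv4_address}"
}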

In the real world

I am not aware of any other provisioning tool that does something like that. Once you understand the Terraform variable syntax, it becomes really useful.

Parallelism

Spinning up servers is slow regardless of your cloud provider. Terraform was designed from the ground up to run things in parallel. It is written in Go, while the CAPS tools are all written in either Ruby or Python, which are not great at running things in parallel efficiently.

In the real world

My personal experience is with Ansible, which spawns a new process from the main app and at scale results in excessive memory and CPU usage (going up to 4GB for 50 hosts). I didn't run any extensive benchmarks, but while creating 5 servers Terraform sat at 0% CPU and 8MB of memory.

One issue I had here: Terraform runs things in parallel, but it outputs a massive stream of interleaved lines in the terminal, which becomes hard to follow for more than 5 servers.
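
If the output gets too noisy, one knob worth knowing about (depending on your Terraform version) is the -parallelism flag, which caps how many resources Terraform works on at once, trading some speed for readable output:

terraform apply -parallelism=2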

Potential Terraform issues

One other issue I found rather annoying is that you can write pretty much anything in your Terraform config and it will be accepted as a valid plan. This is a problem, especially if you pipe variables from one machine to another. You don't get instant feedback - I intentionally created one valid configuration and one made-up one, and Terraform showed me the error 2-3 minutes later.

Walkthrough and Tips

As I am writing this, Terraform is at version 0.6.3 and stands at 6329 commits on GitHub. Plenty of time to get the docs in order, but this is not the case. I personally found the docs severely lacking and spent most of my time digging through Google Groups, tutorials and GitHub issues to find the proper syntax.

In this section I want to cover my first 2 days with Terraform, the things I learned and the initial hurdles I had to go through:

Installation

Terraform has precompiled binaries for all the major platforms. You download the zip, extract it somewhere on your PATH and run terraform plan/apply. Nothing complicated here.
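
On Linux that boils down to something like the following - the filename assumes the 0.6.3 Linux/amd64 build, so adjust it for your platform:

# Download the zip for your platform from the Terraform downloads page, then:
unzip terraform_0.6.3_linux_amd64.zip -d ~/terraform
export PATH=$PATH:$HOME/terraform
terraform --version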

Your very first Terraform config

Open your favorite text editor, paste the snippet below and save it as do.tf. The filename doesn't matter - Terraform automatically picks up everything that ends with *.tf. Run terraform apply to create your droplet.


provider "digitalocean" {
    token = "your-token"
}
resource "digitalocean_droplet" "terra-one" {
    image = "ubuntu-14-04-x64"
    name = "terra-one"
    region = "fra1"
    size = "512mb"

}

Terraform supports most cloud providers, but in this article I am going to use DigitalOcean as a reference.

You need two blocks to make it work - the provider block, which holds your DigitalOcean token, and the resource block, which describes the droplet itself.

In addition to the native *.tf format, Terraform also supports JSON. I found it more useful than the default syntax, because you can parse the file and validate it against the API. For example:

"provider": {
    "digitalocean": {
        "token": "yourtoken"
    }
}
"resource": {
        "digitalocean_droplet": {
            "terra-one": {
                "image": "ubuntu-14-04-x64",
                "region": "fra1"
            }
        }
    }
}
# terraform_validator.py
import json

# apache-libcloud
from libcloud.compute.types import Provider
from libcloud.compute.providers import get_driver

# Parse the JSON-formatted Terraform config
with open("do.tf") as config_file:
    terraform_config = json.load(config_file)

token = terraform_config['provider']['digitalocean']['token']
droplet = terraform_config['resource']['digitalocean_droplet']['terra-one']

cls = get_driver(Provider.DIGITAL_OCEAN)
driver = cls(token, api_version='v2')

# Collect the slugs the DigitalOcean API actually offers. Depending on your
# libcloud version the slug could live in a different attribute - adjust if needed.
valid_images = [image.extra.get('slug') for image in driver.list_images()]
valid_sizes = [size.name for size in driver.list_sizes()]

if droplet.get('image') not in valid_images:
    print "Invalid Image"
if droplet.get('size') not in valid_sizes:
    print "Invalid Droplet Size"

Variables

The next thing you will want to do is move sensitive information like tokens and access keys out of the Terraform config. This part is not well explained and it took me a while to understand.

To extract the variables you will need a .tfvars file:

# To store the values, create a file variables.tfvars
# and pass it to Terraform with -var-file=variables.tfvars
digitalocean_token = "yourdigitaloceantokengoeshere"
digitalocean_ssh_fingerprint = "51:4a:72:22:a3:8b:08:87:5c:fc:42:35:71:e6:91:ae"

# Define them in do.tf or in a separate file
variable "digitalocean_token" {}
variable "digitalocean_ssh_fingerprint" {}

# Use them with ${var.}
provider "digitalocean" {
    token = "${var.digitalocean_token}"
}

In addition to the variables you define yourself, Terraform gives you access to dynamic variables - like the IP address, server ID or DNS entry of a resource. You can use these in Terraform itself or in an external tool. To see the full list of available dynamic variables, run terraform plan and check the attributes marked as computed.

resource "digitalocean_droplet" "terra-one" {
     # In terraform ->
      provisioner "local-exec" {
          command = "echo  ${digitalocean_droplet.terra-one.ipv4_address} 
          ansible_connection=ssh ansible_ssh_user=root > terra-one_host"
      }

}
# Outside, could be used in a CAPS tool.
# To get it run `terraform output terra-one-ip`
output "terra-one-ip" {
    value = "${digitalocean_droplet.terra-one.ipv4_address}"
}

Provisioning

Terraform has very basic provisioning capabilities. It can copy a file and execute a shell command. It does not in any way replace a CAPS (Chef, Ansible, ...) tool.

It is possible to build and provision at the same time, but my advice is to avoid doing so. Any error you might encounter in your Ansible playbook/Salt state will leave you with a failed Terraform run and an unmanaged server that you have to delete manually afterwards.

# In Terraform
provisioner "local-exec" {
    command = "echo  ${digitalocean_droplet.terra-one.ipv4_address} 
    ansible_connection=ssh ansible_ssh_user=root >> db-hosts"
}

# After Terraform is done building:
ansible-playbook ~/playbooks/apps/mongodb/main.yml -i db-hosts

Structure

By default, when you execute terraform plan or terraform apply, Terraform reads all the .tf files in the current folder. That is fine for a handful of servers.

But what happens if you want to separate them by function, like this:

# Execute with terraform apply db
variables.tf
/db
- postgres.tf
# Execute with terraform apply cache
/cache
- redis.tf
/web
- django.tf
/server
- nginx.tf

Terraform works well with folders and you can separate your definitions pretty much any way you want. The best part is that you get a separate state file for each part of your infrastructure.

One small issue I had here was the variables - there is nothing in Terraform itself that can include a .tf file from a parent directory, so you have two options: copy the same file with the sensitive data into each directory, or create a simple Make wrapper around Terraform which does that automatically.

# Makefile
build_db:
    cp sensitive_variables.tf db/sensitive_variables.tf
    # Bonus step, validate the template
    terraform_validator.py db/postgres.tf
    terraform apply db
    rm db/sensitive_variables.tf

How I am using Terraform

I spent almost a week playing around with Terraform. Once past my initial frustrations with the docs, I was quite happy with the results.

Currently I am using it to automate the deployment of the hosted/trial Amon instances. Each one of them is a separate DigitalOcean droplet. Terraform spins up the droplet, installs Amon on it and then passes the IP address to nginx, which gets a new config entry with a subdomain pointing to that IP:


upstream subdomain.amon.cx {
    # DigitalOcean IP
    server 178.23.22.100 fail_timeout=10;
}

Before Terraform, I was using the DigitalOcean control panel to spin up the droplet, then Ansible to install everything and finally a form in the Django admin to create the nginx config. The whole process, although fairly automated, took 15 minutes of my time. Now I am down to 1 minute of total engagement and I can easily stop or remove an instance with a single command.

Conclusion

I hope you enjoyed this post. For the next one I will be looking at one more interesting HashiCorp project - Vault. Feel free to leave your email in the form below if you want to get notified when the blog post is available.