Getting Started with HashiCorp Terraform

A brief overview of what Terraform is and how we can use it to define infrastructure as code.

-

Hello again, my name is Rich Minchuk, and I'd like to describe some infrastructure as code. What exactly does that mean?

Well, it means a few things.

I would like a single place to look that describes the infrastructure I'm working with (say I'm deploying an application to this infrastructure), so I know exactly how it was created. In cases where I'm unsure whether my infrastructure matches the code that describes it, I should be able to easily destroy the infrastructure and know exactly how to recreate it.

Terraform is an open source tool for defining infrastructure as code. You can use it to define on-prem and cloud infrastructure (e.g., VMware, AWS, Azure, GCP, networking, etc.), as well as provision software using well-known infrastructure automation frameworks (Helm, Chef, etc.).

When we "terraform infrastructure," we use the infrastructure defined in code to "plan" its deployment with the plan command, then apply the diff of changes with apply. To define our infrastructure as code we use HashiCorp Configuration Language or JSON, and here's what that looks like.

// HashiCorp Configuration Language
variable "ami" {
   type = string
   description = "the ami to use"
}

// Plain old JSON
{
   "variable": {
      "ami": {
         "type": "string",
         "description": "the ami to use"
      }
   }
}

In the example above, I'm creating a variable I can use later in my project to define which Amazon Machine Image I want to use, and you can see the syntactical differences between HashiCorp Configuration Language and JSON as well.
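To make "use later in my project" concrete, here's a minimal sketch of how a declared variable gets referenced. The default value and the resource name "example" are placeholders I've assumed for illustration (resources themselves are covered in the next section):

```hcl
# Declare the input. The default is a placeholder value, not a real image.
variable "ami" {
  type        = string
  description = "the ami to use"
  default     = "ami-a1b2c3d4"
}

# Anywhere else in the project, the value is available as var.ami.
resource "aws_instance" "example" {
  ami           = var.ami
  instance_type = "t2.micro"
}
```

If you leave the default out, Terraform will ask for a value when you plan or apply.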

Resources & Other Components

Resources are the primary component you'll work with after setting up a Terraform repo. Each one defines a piece of infrastructure we wish to create, using the arguments that its resource type expects.

resource "aws_instance" "web" {
   ami = "ami-a1b2c3d4"
   instance_type = "t2.micro"
}

In this example, we've specified that the "aws_instance" resource named "web" is based on the Amazon Machine Image "ami-a1b2c3d4" and uses the instance type "t2.micro". For now, that's all we really need to define this machine in code.

A provider is a configuration block that exposes specific Resources.

provider "aws" {
   region = "us-east-1"
   version = "~> 2.00"
}

For example, if I wanted to create a virtual machine in AWS, I would use the AWS Provider to expose the "AWS Instance" resource (above).

module "my-module-that-does-things" {
   source = "./my-module-dir"
   my-input-variable = 5
}

Modules are an abstraction layer for resources that are typically used together. They help you tuck complexity out of the way. From the root module (your Terraform git repository's top-level folder) you can call any number of modules. So, instead of configuring a lengthy resource definition in your top-level main.tf file, abstract away its inputs in a submodule.
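As a rough sketch of what could live inside ./my-module-dir, the module declares the inputs it accepts and then uses them in its own resources. The file layout, the resource name "worker", and the idea that the input is an instance count are all assumptions of mine, not something the module above dictates:

```hcl
# ./my-module-dir/variables.tf (hypothetical file layout)
variable "my-input-variable" {
  type        = number
  description = "How many instances the module should create (assumed meaning)"
}

# ./my-module-dir/main.tf
resource "aws_instance" "worker" {
  count         = var.my-input-variable
  ami           = "ami-a1b2c3d4"
  instance_type = "t2.micro"
}
```

The caller only ever sees the input variable; the resource details stay inside the module.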

Even better, there are hundreds of prebuilt modules to easily get started with, and they're accessible through the Terraform Registry. Each module in the Terraform Registry comes with examples of how to build these pieces of infrastructure inside your own terraform project.

You can check out my AKS Preview/Application Gateway Front-End Module in the Terraform Registry. Your main.tf file then becomes just a few lines:

module "aks-appgw-fe" {
  source  = "richminchukio/aks-appgw-fe/azurerm"
  version = "0.1.2"

  ssh_public_key = file("~/.ssh/id_rsa.pub")
}

As you saw above, I included a custom variable called my-input-variable in the call to my-module-that-does-things. Variables help you define the inputs that your project or module requires. When building infrastructure you may require environment-specific configuration. Being able to substitute this information at planning and execution time is a crucial component of multi-environment infrastructure management.

terraform plan -var-file="my-variable-file.tfvars"
# or
terraform plan -var-file="my-variable-file.tfvars.json"

We'd define this environment-specific configuration in variable files, which are included when we terraform plan or terraform apply the infrastructure.
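A variable file is just a list of name = value assignments, one per declared variable. Here's a minimal sketch of one for a single environment; the file name staging.tfvars and both values are hypothetical, and each name has to match a variable block declared in the project:

```hcl
# staging.tfvars (hypothetical file name and values)
ami           = "ami-a1b2c3d4"
instance_type = "t2.micro"
```

Keeping one such file per environment lets the same code describe staging and production with different inputs.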

Data sources are an interesting way to dynamically modify your terraform plan. For example, I might want to create an EC2 instance based on the latest Amazon Machine Image.

data "aws_ami" "web" {
   // assumption: only consider images published by Amazon
   owners = ["amazon"]
   filter {
      name = "state"
      values = ["available"]
   }
   most_recent = true
}

resource "aws_instance" "web" {
   ami = data.aws_ami.web.id
   instance_type = "t2.micro"
   // ... continued below

I can define an "aws_ami" data source, then reference the ID of the machine image in my "aws_instance" resource definition. When I terraform plan my project, the data source is refreshed, and the plan shows whether there's a newer Amazon Machine Image for me to use.

Provisioners are tools that allow us to perform custom actions on our infrastructure.

   // ... continued from above
   // copy version controlled configs to aws instance 
   provisioner "file" {
      source = "./conf/configs.d"
      destination = "/etc"
   }
   // join to consul for service discovery 
   provisioner "remote-exec" {
      inline = [ "consul join ${self.private_ip}" ]
   }
}

We can copy files to remote servers or execute scripts there too. There's also support for common infrastructure automation frameworks like Puppet and Chef.
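Provisioners that copy files or run remote commands also need to know how to reach the machine. Here's a minimal sketch of a connection block, assuming SSH access; the "ubuntu" user, the key path, and the echo command are placeholders I've picked for illustration:

```hcl
resource "aws_instance" "web" {
  ami           = "ami-a1b2c3d4"
  instance_type = "t2.micro"

  # Assumed connection details; adjust the user and key for your image.
  connection {
    type        = "ssh"
    host        = self.public_ip
    user        = "ubuntu"
    private_key = file("~/.ssh/id_rsa")
  }

  # Runs on the new instance over the connection defined above.
  provisioner "remote-exec" {
    inline = ["echo connected"]
  }
}
```

Inside its own provisioner and connection blocks, a resource refers to itself as self rather than by name.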

Terraform uses state files to record the last known configuration of your infrastructure. This helps Terraform work out what actions need to be taken when planning our infrastructure changes. We use the plan command to preview whether the changes we made in code will have the effect we desire on our infrastructure.

In a multi-user terraform-ing team (like a DevOps team), remote backends can be used to share the known state of our infrastructure. Some remote backends even support state locking, because we wouldn't want two people trying to apply changes at the same time.

terraform {
  backend "artifactory" {
    username = "MyServiceAccount"
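    // password is left out on purpose: backend blocks cannot reference variables.
    // Supply it when initializing instead, e.g. terraform init -backend-config="password=..."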
    url      = "https://custom.artifactoryonline.com/artifactory"
    repo     = "foo"
    subpath  = "terraform-bar"
  }
}

There's a slew of supported options for Terraform backends, including Artifactory, AzureRM, GCS, S3, and many more.
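For a backend that does support state locking, here's a rough sketch using S3 with a DynamoDB table for the lock. The bucket, key, and table names are hypothetical and would need to exist already, and this reflects the S3 backend arguments as I understand them for Terraform versions from around the time of writing:

```hcl
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"        # hypothetical bucket name
    key            = "project/terraform.tfstate" # hypothetical path within the bucket
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"           # hypothetical table used for state locking
    encrypt        = true
  }
}
```

With the lock table in place, a second apply started while the first is still running should fail fast instead of overwriting the shared state.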

Getting Started

I'm sure you're wondering how to get started. Well, that's a pretty difficult question to answer due to the wide breadth of infrastructure you can create with Terraform. My suggestion is to start by defining your existing infrastructure in code. When you've done that, you can rest assured you'll always be able to recreate it if there is a problem!

For a more literal set of getting-started steps, first create a new git repo, change directories into it, and run terraform init. This is the first command you'll run when setting up a new Terraform project. You'll probably want to create a .gitignore as well:

echo "# Local .terraform directories
**/.terraform/*

# .tfstate files
*.tfstate
*.tfstate.*

# .tfvars files
*.tfvars" >.gitignore

Good luck, and be sure to click subscribe to see how I terraform my Azure Kubernetes Service next!

Rich Minchuk

Technology Enthusiast and Wannabe Growth Hacker