Bestest Practices for Terraform!!!

Configuration bugs, not code bugs, are the most common cause I’ve seen of really bad outages

Motivation

Based on my personal experience, I can definitely say that configuration changes can be a major source of headaches in production environments. One can argue whether they are any bigger than code bugs, but I have seen production database clusters and ECS clusters get deleted because of human error. I am sure many of you have faced similar issues. The internet is full of stories of how small human errors have brought down complete production systems.

When it comes to configuring your infrastructure, how you deploy changes matters a great deal. In this regard, infrastructure as code (IaC) has been gaining popularity, and developers are now coding their complete infrastructure. One of the popular IaC tools is Terraform. Writing good Terraform code can therefore significantly reduce disasters caused by human error. In this article, I discuss some practices to follow while using Terraform in your project. I did not want to start another “Best Practices” battle here, so I took it a bit higher ;-).

Practices

Keeping Terraform projects small is one of the first lessons I learnt while working with Terraform. The biggest advantage of small Terraform projects is a reduced blast radius. For example, in a microservice-based environment, if each microservice has its own Terraform project, a bug in one project will not affect another. Similarly, it is good practice to create a dedicated Terraform project for the network layer: VPC, subnets, routes, etc. Firstly, these network resources rarely change during their lifetime. Secondly, if they are mixed into the application project, an accidental change by a developer can bring down the whole network.

Additionally, small projects improve performance. Terraform makes calls to the provider APIs to fetch the current state of resources while running terraform apply and terraform plan. A smaller code base reduces the time these commands take, because there are fewer resources to refresh. Hence, it is important to spend some time structuring your Terraform code. For example, VPC-related resources can be kept in one Terraform project and application-specific Terraform code in another. One can reference outputs from one Terraform project in another by using the terraform_remote_state data source.
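As a minimal sketch of that last point, an application project could read the network project's outputs roughly like this. The bucket, key, region, AMI ID and the output name private_subnet_ids are all hypothetical placeholders, and the sketch assumes the network project stores its state in S3 and declares that output.

```hcl
# Application project: read outputs published by a separate network project.
# Bucket, key and region are hypothetical and must match the network
# project's own backend configuration.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"
    key    = "network/terraform.tfstate"
    region = "eu-west-1"
  }
}

# Assumes the network project declares an output named "private_subnet_ids"
# containing a list of subnet IDs.
resource "aws_instance" "app" {
  ami           = "ami-00000000000000000" # placeholder AMI ID
  instance_type = "t3.micro"
  subnet_id     = data.terraform_remote_state.network.outputs.private_subnet_ids[0]
}
```

Only values that the network project explicitly exposes as outputs are visible here, which keeps the coupling between the two projects narrow.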

Both core Terraform and most of the popular providers are open source projects under active development. This means a new release can contain breaking changes or bugs that put your infrastructure at risk. Hence, it is important to pin versions in your Terraform project. Note that Terraform itself, Terraform providers, and even modules can be versioned. The official Terraform documentation covers version constraints in more detail.
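A sketch of what pinning at all three levels might look like is shown below. The version numbers are illustrative assumptions, not recommendations; pick constraints that match what you have actually tested.

```hcl
terraform {
  # Pin the Terraform CLI itself: any 1.x release from 1.5 upwards.
  required_version = "~> 1.5"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # any 5.x provider release, but not 6.0
    }
  }
}

# Modules pulled from a registry accept a version constraint as well.
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "example-vpc" # hypothetical name and CIDR range
  cidr = "10.0.0.0/16"
}
```

The ~> operator allows only the rightmost version component to increase, so minor updates still arrive while a new major version has to be adopted deliberately.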

Use Modules

In software development, it is very important that, wherever possible, we do not reinvent the wheel and instead reuse code as much as possible. The same principle applies to IaC. In Terraform this is achieved through modules, which allow developers to package infrastructure code and reuse it in multiple projects. There are many popular open source Terraform modules that can easily be incorporated into your project. The Terraform Registry is a good source for those; it includes modules created and maintained by HashiCorp themselves. The Cloud Posse team has also open sourced many of their Terraform modules, which are of very good quality. Additionally, you can create your own modules and distribute them throughout your organisation or open source them. Modules also improve readability, reusability, reliability and consistency. For example, you can create a Terraform module for an RDS cluster based on company policies, and all application teams can use that module without having to reinvent it. Moreover, modules speed up development and remove a lot of boilerplate code.
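To make the RDS example concrete, here is a rough sketch of how such a module could be consumed. The module path, its variables and its endpoint output are all hypothetical; a real company module would define its own interface.

```hcl
# modules/rds-cluster/variables.tf
# Hypothetical module maintained by a platform team. Company policies
# (encryption, backups, naming) would live inside the module itself.
variable "name" {
  type        = string
  description = "Identifier of the cluster"
}

variable "instance_count" {
  type        = number
  default     = 2
  description = "Number of instances in the cluster"
}

# main.tf of an application project: reuse the module instead of
# redefining every RDS resource.
module "orders_db" {
  source = "../modules/rds-cluster"

  name           = "orders"
  instance_count = 3
}

# Assumes the module declares an output named "endpoint".
output "orders_db_endpoint" {
  value = module.orders_db.endpoint
}
```

The application team only sets the few inputs it cares about, while everything the module hard-codes stays consistent across teams.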

Never, never, never commit your state files to Git. State files might contain sensitive information, such as an RDS database master password, in plain text.

Always use remote state for saving your state files. S3 is a good choice of remote state location: it provides encryption at rest, supports versioning, and offers granular access control through IAM and bucket policies. Moreover, because state files are small, the cost of storing them in S3 is negligible for most use cases. Additionally, in a multi-user environment, it is recommended to use locking, so that while one developer is applying changes, no other developer can modify the state file. The official Terraform documentation has more information about state storage and locking.
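A minimal sketch of an S3 backend with encryption and locking could look like the following. The bucket and table names are hypothetical, and the example assumes a DynamoDB table with a string hash key named LockID already exists. Depending on your Terraform version, other locking options may be available, so check the S3 backend documentation for the release you use.

```hcl
terraform {
  backend "s3" {
    bucket = "example-terraform-state" # hypothetical bucket, ideally with versioning enabled
    key    = "app/terraform.tfstate"
    region = "eu-west-1"

    # Encrypt the state object at rest.
    encrypt = true

    # Hypothetical DynamoDB table used for state locking.
    dynamodb_table = "example-terraform-locks"
  }
}
```

With a lock table configured, a second terraform apply started while another is still running fails to acquire the lock instead of silently overwriting the state.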

I have seen a few projects where the same Terraform files are copied into a folder per environment. The idea is to keep all the environments in one project, but any accidental change can mess up your prod environment. Moreover, this approach produces a single Terraform state file, and if that file gets corrupted because of some innocent mistake, it can put your production environment at risk; recovering from that situation can be difficult, if not impossible. Hence, use separate state files for different environments, either by passing the appropriate parameter during terraform init (terraform init -backend-config="bucket=xxxxxxxxx-test") or by using Terraform workspaces.
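Both options can be sketched roughly as follows. The key, region and instance counts are hypothetical, and the two parts are alternatives: the first leaves the bucket out of the backend block so it can be supplied per environment, and the second derives environment-specific values from the selected workspace.

```hcl
# Option 1: partial backend configuration.
# The bucket is intentionally omitted and supplied per environment, e.g.
#   terraform init -backend-config="bucket=xxxxxxxxx-test"
terraform {
  backend "s3" {
    key    = "app/terraform.tfstate"
    region = "eu-west-1"
  }
}

# Option 2: workspaces.
# Each workspace (created with "terraform workspace new <name>") gets its
# own state, and the configuration can branch on the workspace name.
locals {
  environment    = terraform.workspace
  instance_count = terraform.workspace == "prod" ? 3 : 1
}
```

With the first option each environment's state lives in its own bucket, which also lets you apply stricter access policies to the production bucket than to the test one.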

Always run plan before apply. This step needs a lot of discipline and is strongly recommended when you use a CI/CD pipeline to run your Terraform code. Using terraform plan -out=xxxxx.plan, you can produce a saved plan that can be reviewed and then applied with terraform apply xxxxx.plan. This way you have complete control over what changes go into your system, and there are no surprises.

Terraform state files are human-readable JSON files, and I have seen many developers edit them manually in a text editor. This practice has resulted in permanently corrupted state files. State files are meant to be consumed by Terraform itself and should not be edited by hand. There might be many legitimate reasons to change your state (e.g. moving certain resources into a module). In that case, use the Terraform commands below to modify the state safely.

  • terraform import
  • terraform state mv
  • terraform state rm

Finally

This article has been written as part of the Spice Program @ Futurice (an open source and social impact program)

Solutions Architect @ Fortum