Terraform Testing

Terraform lets you describe the infrastructure you want and automatically creates, deletes, and modifies your existing infrastructure to match. OPA makes it possible to write policies that test the changes Terraform is about to make before it makes them. Such tests help in different ways:

  • tests help individual developers sanity check their Terraform changes
  • tests can auto-approve run-of-the-mill infrastructure changes and reduce the burden of peer-review
  • tests can help catch problems that arise when applying Terraform to production after applying it to staging

Goals

In this tutorial, you’ll learn how to use OPA to implement unit tests for Terraform plans that create and delete auto-scaling groups and servers.

Prerequisites

This tutorial requires:

  • Terraform 0.8
  • OPA
  • tfjson (go get github.com/palantir/tfjson): a Go utility that converts Terraform plans into JSON

(This tutorial should also work with the latest version of Terraform and the latest version of tfjson, but it is untested. Contributions welcome!)

Steps

1. Create and save a Terraform plan

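Create a Terraform file that includes an auto-scaling group and a server on AWS. (The provider block below does not configure credentials. Supply your AWS credentials yourself, for example by adding a shared_credentials_file setting that points to your credentials file.)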

    cat >main.tf <<EOF
    provider "aws" {
      region = "us-west-1"
    }
    resource "aws_instance" "web" {
      instance_type = "t2.micro"
      ami = "ami-09b4b74c"
    }
    resource "aws_autoscaling_group" "my_asg" {
      availability_zones = ["us-west-1a"]
      name = "my_asg"
      max_size = 5
      min_size = 1
      health_check_grace_period = 300
      health_check_type = "ELB"
      desired_capacity = 4
      force_delete = true
      launch_configuration = "my_web_config"
    }
    resource "aws_launch_configuration" "my_web_config" {
      name = "my_web_config"
      image_id = "ami-09b4b74c"
      instance_type = "t2.micro"
    }
    EOF

Then ask Terraform to calculate what changes it will make and store the output in tfplan.binary.

    terraform plan --out tfplan.binary

2. Convert the Terraform plan into JSON

Use the tfjson tool to convert the Terraform plan into JSON so that OPA can read the plan.

    tfjson tfplan.binary > tfplan.json

Here are the expected contents of tfplan.json.

    {
      "aws_autoscaling_group.my_asg": {
        "arn": "",
        "availability_zones.#": "1",
        "availability_zones.3205754986": "us-west-1a",
        "default_cooldown": "",
        "desired_capacity": "4",
        "destroy": false,
        "destroy_tainted": false,
        "force_delete": "true",
        "health_check_grace_period": "300",
        "health_check_type": "ELB",
        "id": "",
        "launch_configuration": "my_web_config",
        "load_balancers.#": "",
        "max_size": "5",
        "metrics_granularity": "1Minute",
        "min_size": "1",
        "name": "my_asg",
        "protect_from_scale_in": "false",
        "vpc_zone_identifier.#": "",
        "wait_for_capacity_timeout": "10m"
      },
      "aws_instance.web": {
        "ami": "ami-09b4b74c",
        "associate_public_ip_address": "",
        "availability_zone": "",
        "destroy": false,
        "destroy_tainted": false,
        "ebs_block_device.#": "",
        "ephemeral_block_device.#": "",
        "id": "",
        "instance_state": "",
        "instance_type": "t2.micro",
        "ipv6_addresses.#": "",
        "key_name": "",
        "network_interface_id": "",
        "placement_group": "",
        "private_dns": "",
        "private_ip": "",
        "public_dns": "",
        "public_ip": "",
        "root_block_device.#": "",
        "security_groups.#": "",
        "source_dest_check": "true",
        "subnet_id": "",
        "tenancy": "",
        "vpc_security_group_ids.#": ""
      },
      "aws_launch_configuration.my_web_config": {
        "associate_public_ip_address": "false",
        "destroy": false,
        "destroy_tainted": false,
        "ebs_block_device.#": "",
        "ebs_optimized": "",
        "enable_monitoring": "true",
        "id": "",
        "image_id": "ami-09b4b74c",
        "instance_type": "t2.micro",
        "key_name": "",
        "name": "my_web_config",
        "root_block_device.#": ""
      },
      "destroy": false
    }
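
The plan is a flat JSON object: each resource sits under a key of the form <type>.<name>, next to a plan-wide destroy flag. To make that shape concrete, here is a minimal Python sketch that groups the keys by resource type. It is not part of the tutorial's tooling; the function name resources_by_type is hypothetical, and splitting on the first dot is an assumption that fits the sample above.

```python
from collections import defaultdict


def resources_by_type(plan):
    """Group '<type>.<name>' keys of a tfjson-style plan by resource type."""
    grouped = defaultdict(list)
    for key, value in plan.items():
        # Skip top-level scalars such as the plan-wide "destroy" flag.
        if not isinstance(value, dict):
            continue
        # Assumption: the resource type is everything before the first dot.
        resource_type, _, name = key.partition(".")
        grouped[resource_type].append(name)
    return {rtype: sorted(names) for rtype, names in grouped.items()}


# Trimmed-down stand-in for tfplan.json (most attributes omitted).
sample_plan = {
    "aws_autoscaling_group.my_asg": {"id": "", "destroy": False},
    "aws_instance.web": {"id": "", "destroy": False},
    "aws_launch_configuration.my_web_config": {"id": "", "destroy": False},
    "destroy": False,
}
grouped = resources_by_type(sample_plan)
```

To try it on the real file, load tfplan.json with json.load and pass the result to resources_by_type.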

3. Write the OPA policy to check the plan

The policy computes a score for a Terraform plan that combines:

  • the number of deletions of each resource type
  • the number of creations of each resource type
  • the number of modifications of each resource type

The policy authorizes the plan when the score for the plan is below a threshold and there are no changes made to any IAM resources. (For simplicity, the threshold in this tutorial is the same for everyone, but in practice you would vary the threshold depending on the user.)

terraform.rego:

    package terraform.analysis

    import input as tfplan

    ########################
    # Parameters for Policy
    ########################

    # acceptable score for automated authorization
    blast_radius = 30

    # weights assigned for each operation on each resource-type
    weights = {
        "aws_autoscaling_group": {"delete": 100, "create": 10, "modify": 1},
        "aws_instance": {"delete": 10, "create": 1, "modify": 1}
    }

    # Consider exactly these resource types in calculations
    resource_types = {"aws_autoscaling_group", "aws_instance", "aws_iam", "aws_launch_configuration"}

    #########
    # Policy
    #########

    # Authorization holds if score for the plan is acceptable and no changes are made to IAM
    default authz = false
    authz {
        score < blast_radius
        not touches_iam
    }

    # Compute the score for a Terraform plan as the weighted sum of deletions, creations, modifications
    score = s {
        all := [ x |
            some resource_type
            crud := weights[resource_type];
            del := crud["delete"] * num_deletes[resource_type];
            new := crud["create"] * num_creates[resource_type];
            mod := crud["modify"] * num_modifies[resource_type];
            x := del + new + mod
        ]
        s := sum(all)
    }

    # Whether there is any change to IAM
    touches_iam {
        all := instance_names["aws_iam"]
        count(all) > 0
    }

    ####################
    # Terraform Library
    ####################

    # list of all resources of a given type
    instance_names[resource_type] = all {
        some resource_type
        resource_types[resource_type]
        all := [name |
            tfplan[name] = _
            startswith(name, resource_type)
        ]
    }

    # number of deletions of resources of a given type
    num_deletes[resource_type] = num {
        some resource_type
        resource_types[resource_type]
        all := instance_names[resource_type]
        deletions := [name | name := all[_]; tfplan[name]["destroy"] == true]
        num := count(deletions)
    }

    # number of creations of resources of a given type
    num_creates[resource_type] = num {
        some resource_type
        resource_types[resource_type]
        all := instance_names[resource_type]
        creates := [name | all[_] = name; tfplan[name]["id"] == ""]
        num := count(creates)
    }

    # number of modifications to resources of a given type
    num_modifies[resource_type] = num {
        some resource_type
        resource_types[resource_type]
        all := instance_names[resource_type]
        modifies := [name | name := all[_]; obj := tfplan[name]; obj["destroy"] == false; not obj["id"]]
        num := count(modifies)
    }
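
If the weighted sum is easier to follow outside Rego, the same arithmetic can be sketched in Python. This is an illustrative mirror of the policy above, not something OPA runs: the helper names are hypothetical, and the modification check reads `not obj["id"]` as "the entry has no id key", which is an approximation of the Rego semantics.

```python
# Weights and threshold copied from terraform.rego.
WEIGHTS = {
    "aws_autoscaling_group": {"delete": 100, "create": 10, "modify": 1},
    "aws_instance": {"delete": 10, "create": 1, "modify": 1},
}
BLAST_RADIUS = 30


def instances(plan, resource_type):
    """Plan entries whose key starts with the resource type (like startswith)."""
    return [obj for name, obj in plan.items()
            if name.startswith(resource_type) and isinstance(obj, dict)]


def score(plan):
    """Weighted sum of deletions, creations, and modifications."""
    total = 0
    for resource_type, w in WEIGHTS.items():
        objs = instances(plan, resource_type)
        deletes = sum(1 for o in objs if o.get("destroy") is True)
        creates = sum(1 for o in objs if o.get("id") == "")
        # Assumption: a modified resource is not destroyed and has no "id" key.
        modifies = sum(1 for o in objs
                       if o.get("destroy") is False and "id" not in o)
        total += (w["delete"] * deletes
                  + w["create"] * creates
                  + w["modify"] * modifies)
    return total


def authz(plan):
    """Authorize when the score is below the threshold and IAM is untouched."""
    touches_iam = len(instances(plan, "aws_iam")) > 0
    return score(plan) < BLAST_RADIUS and not touches_iam


# Trimmed-down stand-in for tfplan.json (most attributes omitted).
sample_plan = {
    "aws_autoscaling_group.my_asg": {"id": "", "destroy": False},
    "aws_instance.web": {"id": "", "destroy": False},
    "aws_launch_configuration.my_web_config": {"id": "", "destroy": False},
    "destroy": False,
}
sample_score = score(sample_plan)
sample_authz = authz(sample_plan)
```

Resource types without an entry in the weights table, such as aws_launch_configuration, add nothing to the score in this sketch, just as in the policy.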

4. Evaluate the OPA policy on the Terraform plan

To evaluate the policy against that plan, you hand OPA the policy, the Terraform plan as input, and ask it to evaluate data.terraform.analysis.authz.

    opa eval --data terraform.rego --input tfplan.json "data.terraform.analysis.authz"

If you’re curious, you can ask for the score that the policy used to make the authorization decision. In our example, it is 11 (10 for the creation of the auto-scaling group and 1 for the creation of the server; the launch configuration has no entry in weights and contributes nothing).

    opa eval --data terraform.rego --input tfplan.json "data.terraform.analysis.score"

If, as suggested in the previous step, you want to modify your policy to make an authorization decision based on both the user and the Terraform plan, the input you give to OPA would take the form {"user": <user>, "plan": <plan>}, and your policy would reference the user with input.user and the plan with input.plan. You could even go so far as to provide the Terraform state file and the AWS EC2 data to OPA and write policy using all of that context.
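
As a small illustration of that combined input, the Python sketch below wraps a plan and a user name into one document. The helper build_input and the user "alice" are hypothetical, and the plan is a trimmed-down stand-in for tfplan.json.

```python
import json


def build_input(user, plan):
    """Wrap the requesting user and the plan in a single input document."""
    return {"user": user, "plan": plan}


# Trimmed-down stand-in for the contents of tfplan.json.
plan = {"aws_instance.web": {"id": "", "destroy": False}, "destroy": False}

document = build_input("alice", plan)
encoded = json.dumps(document, indent=2)
```

You could write encoded to a file and pass that file to opa eval with --input. The policy in this tutorial reads the plan from input directly, so it would need to read input.plan instead before it could evaluate such a document.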

5. Create a large Terraform plan and evaluate it

Create a Terraform plan that creates enough resources to exceed the blast-radius permitted by policy.

    cat >main.tf <<EOF
    provider "aws" {
      region = "us-west-1"
    }
    resource "aws_instance" "web" {
      instance_type = "t2.micro"
      ami = "ami-09b4b74c"
    }
    resource "aws_autoscaling_group" "my_asg" {
      availability_zones = ["us-west-1a"]
      name = "my_asg"
      max_size = 5
      min_size = 1
      health_check_grace_period = 300
      health_check_type = "ELB"
      desired_capacity = 4
      force_delete = true
      launch_configuration = "my_web_config"
    }
    resource "aws_launch_configuration" "my_web_config" {
      name = "my_web_config"
      image_id = "ami-09b4b74c"
      instance_type = "t2.micro"
    }
    resource "aws_autoscaling_group" "my_asg2" {
      availability_zones = ["us-west-2a"]
      name = "my_asg2"
      max_size = 6
      min_size = 1
      health_check_grace_period = 300
      health_check_type = "ELB"
      desired_capacity = 4
      force_delete = true
      launch_configuration = "my_web_config"
    }
    resource "aws_autoscaling_group" "my_asg3" {
      availability_zones = ["us-west-2b"]
      name = "my_asg3"
      max_size = 7
      min_size = 1
      health_check_grace_period = 300
      health_check_type = "ELB"
      desired_capacity = 4
      force_delete = true
      launch_configuration = "my_web_config"
    }
    EOF

Generate the Terraform plan and convert it to JSON.

    terraform plan --out tfplan_large.binary
    tfjson tfplan_large.binary > tfplan_large.json

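Evaluate the policy to see that the plan fails the policy tests, and check the score. With three auto-scaling groups and one server being created, the score should be 31 (3 × 10 + 1), which is not below the blast_radius of 30, so authz is false.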

    opa eval --data terraform.rego --input tfplan_large.json "data.terraform.analysis.authz"
    opa eval --data terraform.rego --input tfplan_large.json "data.terraform.analysis.score"
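
The arithmetic behind that result can be checked by hand. The Python sketch below assumes every resource in the large main.tf is new, so only the create weights matter; the counts are read off the file rather than computed from tfplan_large.json.

```python
# Create weights copied from terraform.rego.
weights = {"aws_autoscaling_group": {"create": 10}, "aws_instance": {"create": 1}}

# Resources the large main.tf creates (the launch configuration has no weight).
creates = {"aws_autoscaling_group": 3, "aws_instance": 1}

blast_radius = 30

score = sum(weights[t]["create"] * n for t, n in creates.items())
authorized = score < blast_radius
print(score, authorized)  # 31 False
```

Because the policy requires the score to be strictly below the threshold, a plan that scores exactly 30 would also be rejected.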

6. (Optional) Run OPA as a daemon and evaluate policy

In addition to running OPA from the command-line, you can run it as a daemon loaded with the Terraform policy and then interact with it using its HTTP API. First, start the daemon:

    opa run -s terraform.rego

Then in a separate terminal, use OPA’s HTTP API to evaluate the policy against the two Terraform plans.

    curl localhost:8181/v0/data/terraform/analysis/authz -d @tfplan.json
    curl localhost:8181/v0/data/terraform/analysis/authz -d @tfplan_large.json
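
If you would rather call the daemon from a script than from curl, the same request can be sketched with Python's standard library. The function names are hypothetical, the base URL assumes the daemon's default address used above, and the sketch assumes the v0 endpoint replies with the bare decision as JSON.

```python
import json
import urllib.request


def build_request(plan, base_url="http://localhost:8181"):
    """Build a POST request that sends the plan as the policy input."""
    body = json.dumps(plan).encode("utf-8")
    return urllib.request.Request(
        base_url + "/v0/data/terraform/analysis/authz",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_authz(plan, base_url="http://localhost:8181"):
    """Send the plan to a running OPA daemon and return the decoded reply."""
    with urllib.request.urlopen(build_request(plan, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the daemon running, loading tfplan.json with json.load and passing the result to query_authz should return the same decision as the first curl command.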

Wrap Up

Congratulations on finishing the tutorial!

You learned a number of things about Terraform Testing with OPA:

  • OPA gives you fine-grained policy control over Terraform plans.
  • You can use data other than the plan itself (e.g. the user) when writing authorization policies.

Keep in mind that it’s up to you to decide how to use OPA’s Terraform tests and authorization decision. Here are some ideas:

  • Add it as part of your Terraform wrapper to implement unit tests on Terraform plans
  • Use it to automatically approve run-of-the-mill Terraform changes to reduce the burden of peer-review
  • Embed it into your deployment system to catch problems that arise when applying Terraform to production after applying it to staging
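
As one way to picture the wrapper idea, here is a minimal Python sketch that asks opa eval for the decision and treats anything unexpected as a denial. The function names are hypothetical, and the JSON shape it parses is an assumption about what opa eval prints with --format json, so check it against your OPA version.

```python
import json
import subprocess


def decision_from_eval_output(stdout):
    """Extract the boolean decision from `opa eval --format json` output.

    Assumes the shape {"result": [{"expressions": [{"value": ...}]}]};
    anything else is treated as a denial.
    """
    try:
        doc = json.loads(stdout)
        return doc["result"][0]["expressions"][0]["value"] is True
    except (ValueError, KeyError, IndexError, TypeError):
        return False


def plan_is_authorized(plan_json_path, policy_path="terraform.rego"):
    """Run opa eval on a JSON plan and return True only if authz is true."""
    proc = subprocess.run(
        ["opa", "eval", "--format", "json",
         "--data", policy_path,
         "--input", plan_json_path,
         "data.terraform.analysis.authz"],
        capture_output=True, text=True, check=True,
    )
    return decision_from_eval_output(proc.stdout)
```

A wrapper could call plan_is_authorized("tfplan.json") and run terraform apply on the saved plan only when it returns True, falling back to peer review otherwise.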