Working Around Missing Terraform Data Sources

Jan 10, 2018 21:00 lambda parameterstore serverless sqs terraform

Terraform All Day

I recently found myself working on a project that involved building out a serverless app running on AWS Lambda. The specifics of the app aren’t all that important. Basically, several Lambda functions needed to pass messages to one another via an SQS queue for the app to function properly.

Helpfully, Lambda provides a mechanism to expose environment variables to your functions, which makes for a nice configuration interface. It would allow me to expose the SQS queue URL to each of the functions via an environment variable. As with many of the projects I work on, Terraform is the prevailing tool for provisioning the underlying infrastructure. Terraform has the concept of data sources, which, according to the documentation, "…allow data to be fetched or computed for use elsewhere in Terraform configuration". Their usage also "…allows a Terraform configuration to build on information defined outside of Terraform, or defined by another separate Terraform configuration". This is particularly helpful given that all of the Lambda functions that comprise the app live in separate repos with separate Terraform configurations. I’m also not keen on statically defining shared configuration data inside of Terraform configurations.

Terraform data sources work by looking up an existing resource of a given type, filtering on some key such as a name. Once a match is found, Terraform returns data about that resource. Here’s a snippet demonstrating their use with IAM users:

data "aws_iam_user" "js" {
  user_name = "john_smith"
}

The user john_smith could have been created via the AWS console, the AWS CLI or even a completely separate Terraform config. With this code, Terraform will look in IAM for a user in your account named john_smith. If it finds a match, you can reference the resource in your Terraform config as data.aws_iam_user.js.arn - assuming you’re after the user’s ARN. Armed with this knowledge, I figured I’d do something similar for my SQS queue URL and other configuration data. Problem solved!
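As a minimal sketch of what that reference looks like in practice, the config below repeats the lookup and exposes the ARN as an output. The output name js_arn is just an illustrative choice, and newer Terraform releases also accept the bare reference without the "${...}" wrapper.

```hcl
# Look up an IAM user that was created outside of this configuration.
data "aws_iam_user" "js" {
  user_name = "john_smith"
}

# Expose the user's ARN; "js_arn" is a hypothetical output name.
output "js_arn" {
  value = "${data.aws_iam_user.js.arn}"
}
```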

Problem Not Solved

It turns out that, at the time of writing, Terraform (oddly) doesn’t have a data source for aws_sqs_queue. Bummer.

I had all sorts of wild ideas about how to sort this problem out - including writing a data source for SQS. But then, I thought to myself how rare it is, in my experience, to be the first person to have to deal with a problem. The answer is probably buried out there somewhere on the internet - you just need to dig for it.

Enter Parameter Store

After some quick DDG searches, I came across the AWS Parameter Store (formally, Systems Manager Parameter Store) docs. Parameter Store’s purpose is pretty much what the name suggests: it stores (configuration) parameters. After a quick scan of the docs, I was ready to cook with gas.

In a base Terraform config I did something like the following:

resource "aws_sqs_queue" "q" {
...
}

resource "aws_ssm_parameter" "queue_url" {
  name  = "/MyApp/Dev/queue_url"
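  # Valid types are case-sensitive: "String", "StringList" or "SecureString".
  type  = "String"
  # For aws_sqs_queue, the id attribute is the queue URL.
  value = "${aws_sqs_queue.q.id}"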
}

Then, in the Terraform configs for each Lambda function, I referenced the parameters using the aws_ssm_parameter data source:

data "aws_ssm_parameter" "param" {
  name = "/MyApp/Dev/queue_url"
}

From there, simply reference the parameter like any other data source object:

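output "queue_url" {
  # The provider marks this value as sensitive, so newer Terraform
  # releases may also require `sensitive = true` on this output.
  value = "${data.aws_ssm_parameter.param.value}"
}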
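To tie this back to the original goal, here is a rough sketch of how the looked-up value might be handed to a Lambda function as an environment variable. The function name, deployment package, handler, runtime, role variable and the QUEUE_URL variable name are all assumptions made for illustration, not details from the actual project.

```hcl
# Queue URL published to Parameter Store by the base configuration.
data "aws_ssm_parameter" "param" {
  name = "/MyApp/Dev/queue_url"
}

# Hypothetical input: ARN of an existing IAM role for the function.
variable "lambda_role_arn" {}

resource "aws_lambda_function" "worker" {
  function_name = "myapp-worker"   # hypothetical name
  filename      = "worker.zip"     # hypothetical deployment package
  handler       = "worker.handler" # hypothetical handler
  runtime       = "python3.12"     # assumption; use whatever runtime the function targets
  role          = "${var.lambda_role_arn}"

  # Expose the queue URL to the function code as an environment variable.
  environment {
    variables = {
      QUEUE_URL = "${data.aws_ssm_parameter.param.value}"
    }
  }
}
```

The function code can then read QUEUE_URL from its environment without knowing anything about Parameter Store or Terraform.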

You may notice that I used an odd-looking name for the parameter. It’s really just an organizational thing. Using a hierarchy like this makes life much simpler as you start throwing more parameters into Parameter Store. It’s analogous to object namespacing in S3.
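To make the hierarchy idea a bit more concrete, here is a small sketch that builds parameter names from an app and environment prefix. The variables and the log_level and batch_size parameters are hypothetical examples rather than part of the real app.

```hcl
# Hypothetical variables used to build the /<app>/<env>/<key> hierarchy.
variable "app" {
  default = "MyApp"
}

variable "env" {
  default = "Dev"
}

# Resolves to /MyApp/Dev/log_level with the defaults above.
resource "aws_ssm_parameter" "log_level" {
  name  = "/${var.app}/${var.env}/log_level"
  type  = "String"
  value = "info"
}

# Resolves to /MyApp/Dev/batch_size with the defaults above.
resource "aws_ssm_parameter" "batch_size" {
  name  = "/${var.app}/${var.env}/batch_size"
  type  = "String"
  value = "10"
}
```

Switching env to another value such as "Prod" would give the same keys under a separate prefix, so environments don’t collide.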

The data I’ve put into Parameter Store isn’t particularly sensitive. Therefore, I haven’t gone to the trouble of encrypting the params with a KMS key. However, if you’re putting sensitive data in Parameter Store, you should absolutely encrypt it by storing it as a SecureString parameter.
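For the sensitive case, a minimal sketch might look like the following. The db_password parameter and variable are hypothetical, and the example assumes the account’s default Parameter Store key is acceptable; pass key_id to use a specific KMS key instead.

```hcl
# Hypothetical input; supply the secret from outside version control.
variable "db_password" {}

resource "aws_ssm_parameter" "db_password" {
  name  = "/MyApp/Dev/db_password" # hypothetical parameter name
  type  = "SecureString"           # encrypted at rest with KMS
  value = "${var.db_password}"

  # key_id is omitted here, so the default AWS-managed key for
  # Parameter Store (alias/aws/ssm) is used.
}
```

Consumers can read it with the same aws_ssm_parameter data source, which decrypts SecureString values by default. Keep in mind that the value still ends up in the Terraform state in plain text, so the state itself needs protecting too.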