Terraform Data Source using AWS


The problem:

An EC2 instance must be created from a specific AMI ID, but in real projects, you rarely want to hard‑code that AMI ID, because it changes over time when AWS publishes new images (for example, the latest Amazon Linux 2).

An AMI is identified by an ID like ami-0abcd1234.... Terraform’s aws_instance (EC2) resource takes an argument ami = "<some-id>". Suppose the intent is “create from latest”: always launch the instance from the latest Amazon Linux 2 image, not from a fixed AMI.

If you paste an AMI ID by hand, it goes stale when AWS releases a new Amazon Linux 2 image, and it stops working as soon as you switch regions, because AMI IDs are region-specific. Your “create from latest” intent is then broken, and you have to keep updating the code. So the question is: where do we get the ID of the latest AMI when it changes over time and we don’t want to edit the value by hand every time?

The problem statement is:

“How can Terraform dynamically look up the correct AMI ID (for example, the latest Amazon Linux 2 in a given region) and feed that into the EC2 resource, instead of us hard‑coding the AMI ID in the configuration?”

Data Source

Context: We have a VPC that is shared with other teams. It has two subnets, subnet-1 and subnet-2, and we need to create an EC2 instance in each of them. The VPC and subnets already exist, so instead of creating them we will look up their IDs and reference them in our configuration. The configuration below shows the pattern for one instance, in a subnet tagged subnet-a.

So, we will use:

  • Data Source for AMI

  • Data Source for VPC and subnet.

Configuration:

data "aws_vpc" "vpc_name" {
  filter {
    name   = "tag:Name"
    values = ["default"]
  }
}

data "aws_subnet" "shared" {
  filter {
    name   = "tag:Name"
    values = ["subnet-a"]
  }
  vpc_id = data.aws_vpc.vpc_name.id
}

data "aws_ami" "linux2" {
  owners      = ["amazon"]
  most_recent = true

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

resource "aws_instance" "example_instance" {
  ami           = data.aws_ami.linux2.id
  instance_type = "t2.micro"
  subnet_id     = data.aws_subnet.shared.id
  tags          = var.tags
}
💡
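In your AWS account, open the VPC section and set the Name of the VPC you want to use (it may currently be empty) to “default”. Likewise, give the subnet you want to use the Name “subnet-a”, or change the filter values in the configuration to match the names of your own VPC and subnet.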
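The configuration above assumes the AWS provider is already set up. A minimal sketch of that setup could look like this; the region and the version constraint are assumptions, so substitute your own. AMI IDs differ per region, so the aws_ami lookup returns the image for whichever region the provider points at.

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0" # assumed constraint; pin to what your project uses
    }
  }
}

provider "aws" {
  region = "us-east-1" # assumed region; the data sources query this region
}
```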

Explanation:

VPC data source

data "aws_vpc" "vpc_name" {
  filter {
    name   = "tag:Name"
    values = ["default"]
  }
}
  • data "aws_vpc" "vpc_name" does not create a VPC; it queries AWS for an existing VPC.

  • The filter says: “Find the VPC whose Name tag equals default”, and Terraform exposes its attributes (like .id) as data.aws_vpc.vpc_name.id.
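As a sketch, the same lookup can also be written with the data source's tags argument instead of a filter block. The name by_tag and the output are hypothetical additions for illustration. Either way, the lookup has to match exactly one VPC; if several VPCs carry the same Name tag, Terraform reports an error instead of picking one.

```hcl
# Hypothetical alternative to the filter block: match on the Name tag directly.
data "aws_vpc" "by_tag" {
  tags = {
    Name = "default"
  }
}

# Show the resolved VPC ID so you can confirm the right VPC was found.
output "vpc_id" {
  value = data.aws_vpc.by_tag.id
}
```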

Subnet data source

data "aws_subnet" "shared" {
  filter {
    name   = "tag:Name"
    values = ["subnet-a"]
  }
  vpc_id = data.aws_vpc.vpc_name.id
}
  • This also only reads an existing subnet.

  • It restricts the search to subnets:

    • In the VPC found above (vpc_id = data.aws_vpc.vpc_name.id).

    • With tag Name = "subnet-a".

  • The resulting subnet ID is available as data.aws_subnet.shared.id.
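The context mentions two subnets, subnet-1 and subnet-2. One possible way to cover both is for_each on the data source, sketched below. It assumes the subnets carry those names in their Name tag, and it reuses data.aws_vpc.vpc_name, data.aws_ami.linux2 and var.tags from this article; the names by_name and per_subnet are hypothetical.

```hcl
# Look up both subnets by Name tag; the tag values are assumed.
data "aws_subnet" "by_name" {
  for_each = toset(["subnet-1", "subnet-2"])

  vpc_id = data.aws_vpc.vpc_name.id

  filter {
    name   = "tag:Name"
    values = [each.key]
  }
}

# One instance per subnet found above.
resource "aws_instance" "per_subnet" {
  for_each = data.aws_subnet.by_name

  ami           = data.aws_ami.linux2.id
  instance_type = "t2.micro"
  subnet_id     = each.value.id
  tags          = var.tags
}
```

Each instance is then addressed by its subnet name, for example aws_instance.per_subnet["subnet-1"].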

AMI data source (Amazon Linux 2)

data "aws_ami" "linux2" {
  owners      = ["amazon"]
  most_recent = true

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}
  • This looks up an existing AMI, again without creating anything.

  • owners = ["amazon"] limits results to AMIs published by the Amazon account.

  • The name filter matches AMIs whose name looks like amzn2-ami-hvm-<something>-x86_64-gp2, which is the pattern for Amazon Linux 2 HVM, x86_64, gp2-backed images.

  • virtualization-type = "hvm" ensures only HVM virtualisation images.

  • most_recent = true picks the newest AMI matching those filters, so every plan or apply resolves to the latest Amazon Linux 2 image available in the provider’s region. Without it, the lookup fails when more than one AMI matches.
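To check which image the lookup actually resolved to, a few outputs can help. This is a small sketch; the output names are hypothetical, while id, name and creation_date are attributes exported by the aws_ami data source.

```hcl
output "linux2_ami_id" {
  description = "ID of the AMI selected by the data source"
  value       = data.aws_ami.linux2.id
}

output "linux2_ami_name" {
  description = "Full name of the selected AMI, including its version and date"
  value       = data.aws_ami.linux2.name
}

output "linux2_ami_created" {
  description = "Creation timestamp of the selected AMI"
  value       = data.aws_ami.linux2.creation_date
}
```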

EC2 instance using those data sources

resource "aws_instance" "example_instance" {
  ami           = data.aws_ami.linux2.id
  instance_type = "t2.micro"
  subnet_id     = data.aws_subnet.shared.id
  tags          = var.tags
}
  • This creates an EC2 instance.

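  • ami is wired to data.aws_ami.linux2.id, so the instance is launched from the latest Amazon Linux 2 AMI found by the data source. The lookup runs again on every plan, so once AWS publishes a newer image, the next plan will propose replacing the instance.

  • subnet_id is taken from the existing subnet looked up by tag (subnet-a, in the VPC whose Name tag is default).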

  • tags = var.tags applies whatever tags you pass via the var.tags variable.
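The configuration refers to var.tags but does not show its declaration. A minimal sketch of one, assuming a simple map of strings, could be the following; the description and the default value are assumptions, not taken from the original project.

```hcl
variable "tags" {
  description = "Tags applied to the EC2 instance"
  type        = map(string)
  default = {
    Name = "example-instance" # assumed default; override per environment
  }
}
```

You can override the default in a .tfvars file or with -var on the command line.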

So overall, the data sources make the configuration dynamic by discovering the right VPC, subnet, and AMI at plan/apply time, and the EC2 resource consumes those discovered IDs instead of hard-coding them.
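One consequence of resolving the AMI at plan time: when a newer image is published, the value of ami changes, and a changed ami forces Terraform to replace the instance. If you want new instances to start from the latest image but existing ones to be left alone, one option is a lifecycle block. The sketch below is a variant of the resource from this article, not an additional resource.

```hcl
resource "aws_instance" "example_instance" {
  ami           = data.aws_ami.linux2.id
  instance_type = "t2.micro"
  subnet_id     = data.aws_subnet.shared.id
  tags          = var.tags

  lifecycle {
    # Keep the running instance when the data source later resolves to a newer AMI.
    ignore_changes = [ami]
  }
}
```

The trade-off is that the instance stays on its original image until you replace it deliberately, for example with terraform apply -replace=aws_instance.example_instance.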

That’s it.

Arigato!
