Things to review after taking over an existing Terraform project

ducnm
4 min read · Oct 1, 2022


Well, whenever someone hands an existing project over to you, what do you do? What should you check again, and how do you add a new module to an existing project without breaking anything? OK, so come over here and I’ll tell you my story. It may be helpful and save you some time.

Where are the documents?

You’re in charge now and, well, they don’t have any. So here is what to check:

  • AWS account: you need one. Set up a profile with the necessary permissions, then select it with export AWS_PROFILE="project-tf-prod"
  • Workspaces: how many environments are there? For example development, staging, production.
  • State files: what is the output when you run terraform plan?
  • Variable files: each environment has its own variable file. Try
    export AWS_PROFILE="project-tf-prod"; terraform plan --var-file="./envs/production.tfvars"
  • Talk to your colleagues about the project’s history. You may learn why things are there and why they are the way they are. Or maybe why you got hired.
  • Any document is gold, whatever it is. :|
Keep calm and it will all be fine!
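
The checks in the list above can be scripted as a read-only first look. This is only a minimal sketch: the helper names run and first_look and the DRY_RUN switch are hypothetical, and it assumes the aws and terraform CLIs are installed and terraform init has already been run in the project directory.

```shell
# Print each command; execute it unless DRY_RUN=1 (hypothetical switch).
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

# Read-only first look at an inherited project.
# $1 = AWS profile, $2 = environment name (expects ./envs/<env>.tfvars)
first_look() (
  AWS_PROFILE="$1"
  export AWS_PROFILE
  run aws sts get-caller-identity &&   # which account and role am I using?
    run terraform workspace list &&    # which workspaces exist?
    run terraform state list &&        # which resources does the state track?
    run terraform plan -var-file="./envs/$2.tfvars"
)
```

With DRY_RUN=1 set, first_look project-tf-prod production only prints the four commands, which is a safe way to review them before anything talks to AWS.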

Is it in the latest state?

Expected: Yes, it should be.

Reality: I saw a lot of yellow/red lines when I ran terraform plan

First, pull the latest code, then run terraform init

The goal is to make sure the current infrastructure is in sync with the Terraform state file. Some cases you may face:

  • Unused variables
  • Variable errors
  • Components have been removed from or added to the current infrastructure
Ohhhh! Ohhhhhhh!
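
One way to tell "in sync" from "not in sync" without reading every coloured line is the -detailed-exitcode flag of terraform plan, which exits with 0 when there are no changes, 1 on error and 2 when changes are present. The sketch below is an illustration under that assumption; the function names describe_plan_exit and check_sync are hypothetical.

```shell
# Translate the exit code of `terraform plan -detailed-exitcode` into words.
describe_plan_exit() {
  case "$1" in
    0) echo "in sync: no changes" ;;
    1) echo "error: fix variables or configuration first" ;;
    2) echo "changes pending: configuration and infrastructure differ" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

# $1 = path to the environment's variable file, e.g. ./envs/production.tfvars
check_sync() (
  terraform init -input=false >/dev/null || exit 1
  terraform validate || exit 1   # catches syntax and reference errors early
  terraform plan -detailed-exitcode -input=false -var-file="$1" >/dev/null
  describe_plan_exit "$?"
)
```

Running check_sync ./envs/production.tfvars after pulling the latest code gives a one-line verdict; the full terraform plan output is still the place to look for the details.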

How do you add new modules?

Before planning a new module, check the list below:

  • Terraform project structure
  • Terraform workspaces (dev/stag/prod)
  • Terraform’s existing modules
  • The variables.tf
  • The current CI/CD pipeline
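
To get a quick inventory of the existing modules before adding one, something like the following can help. It is a rough sketch that assumes modules are declared with ordinary module "name" { blocks in *.tf files; the function name list_modules is hypothetical.

```shell
# List the names of module blocks declared in *.tf files under a directory.
# $1 = project directory (defaults to the current one)
list_modules() {
  find "${1:-.}" -name '*.tf' ! -path '*/.terraform/*' \
    -exec grep -hE '^[[:space:]]*module[[:space:]]+"[^"]+"' {} + |
    sed -E 's/^[[:space:]]*module[[:space:]]+"([^"]+)".*/\1/' |
    sort -u
}
```

Comparing the output of list_modules . with the name planned for the new module avoids an accidental clash, and terraform workspace list shows which environments the module will eventually reach.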

Things to be careful about

The state file should be backed up every time it changes.
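
A simple way to do that is terraform state pull, which writes the current state to standard output for both local and remote backends. The sketch below saves it under a timestamped name; the function name backup_state and the default directory ./state-backups are hypothetical choices.

```shell
# Save a timestamped copy of the current state.
# $1 = backup directory (defaults to ./state-backups, a hypothetical location)
backup_state() (
  dir="${1:-./state-backups}"
  mkdir -p "$dir" || exit 1
  file="$dir/terraform-$(date +%Y%m%d-%H%M%S).tfstate"
  if terraform state pull > "$file"; then
    echo "$file"      # print where the backup was written
  else
    rm -f "$file"     # do not keep an empty or partial backup
    exit 1
  fi
)
```

Call backup_state before every apply or destroy. State files can contain secrets, so keep the backup directory out of version control.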

What if this project is not fully complete?

Recently I found out that the project still has some parts that are manually configured and not so well designed (the good thing is: “it is running”). For example, an S3 bucket was created manually and is only referenced through a variable, RDS uses a password instead of a role, and there isn’t even a CI/CD pipeline, so developers have to deploy manually.

For these, I’d like to open some tickets and then finish them later.

Remove unused resources

One day, in a team meeting, you realize that one environment has been abandoned. So delete it, carefully.

  • Make sure the AWS profile region is the same as the environment’s.
  • export AWS_PROFILE="project-tf-alpha"; terraform destroy --var-file="./envs/alpha.tfvars"
  • Check the summary Plan: 0 to add, 0 to change, XX to destroy carefully before entering yes.
  • Pick some resource names from the state file and compare them with what is currently running.
  • If a deletion fails, fix the first error, then run terraform destroy again.
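
The summary check in the list above can be automated before anyone types yes. The sketch below assumes the summary line has the form quoted there (Plan: A to add, B to change, C to destroy) and that the plan was produced with -no-color; the function name destroy_count is hypothetical.

```shell
# Print the number of resources to destroy, but only if the plan adds and
# changes nothing. $1 = full text output of `terraform plan -destroy -no-color`
destroy_count() (
  nums=$(printf '%s\n' "$1" |
    sed -nE 's/^Plan: ([0-9]+) to add, ([0-9]+) to change, ([0-9]+) to destroy.*/\1 \2 \3/p' |
    tail -n 1)
  [ -n "$nums" ] || exit 1             # no summary line found
  set -- $nums                         # $1=add, $2=change, $3=destroy
  [ "$1" -eq 0 ] && [ "$2" -eq 0 ] || exit 1
  echo "$3"
)

# Usage (not run here):
#   plan_out=$(terraform plan -destroy -no-color -var-file="./envs/alpha.tfvars")
#   destroy_count "$plan_out" && terraform destroy -var-file="./envs/alpha.tfvars"
```

terraform destroy still asks for confirmation afterwards, so this is an extra guard rather than a replacement for reading the plan.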

Issues you may run into when removing resources:

  • Resource doesn’t exist

Terraform handles this for us during terraform destroy

  • S3 bucket deletion fails with BucketNotEmpty

You could empty the bucket yourself, or keep the bucket and just remove the resource from the Terraform state:

# List the resources tracked in the state
terraform state list
# Example: module.vpc.aws_s3_bucket.bucket_name
# Remove the resource from the state (the bucket itself is kept)
terraform state rm module.vpc.aws_s3_bucket.bucket_name
# Removed module.vpc.aws_s3_bucket.bucket_name
# Successfully removed 1 resource instance(s)
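
For the other option, emptying the bucket yourself, the AWS CLI can delete every object with aws s3 rm --recursive. The following is a cautious sketch; the function name empty_bucket is hypothetical, the operation cannot be undone, and on a versioned bucket it does not remove old object versions.

```shell
# Empty a bucket so that `terraform destroy` can delete it. Irreversible.
# $1 = bucket name (without the s3:// prefix)
empty_bucket() (
  [ -n "$1" ] || { echo "usage: empty_bucket <bucket-name>" >&2; exit 2; }
  # Show the object count and total size before deleting anything.
  aws s3 ls "s3://$1" --recursive --summarize | tail -n 2
  aws s3 rm "s3://$1" --recursive
)
```

After the bucket is empty, run terraform destroy again.
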
  • Dependency error when deleting a resource

Error deleting security group: DependencyViolation: resource sg-** has a dependent object
status code: 400, request id: xxxx

It could be that someone already “soft removed” a resource from the state file, while the resources that depend on it stayed behind. To remove the untracked resources, go to the AWS console, check which resources depend on it, and delete them. Or we could remove the resource from the state file.

Try removing it from the AWS web console.
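
If you prefer the command line to the console, the AWS CLI can list what still uses a security group. This is a sketch only: the function names sg_dependents and sg_referenced_by are hypothetical, and other kinds of dependencies may exist that these two queries do not show.

```shell
# List network interfaces that are still attached to a security group.
# $1 = security group ID, e.g. the sg-... value from the error message
sg_dependents() {
  aws ec2 describe-network-interfaces \
    --filters "Name=group-id,Values=$1" \
    --query 'NetworkInterfaces[].[NetworkInterfaceId,Description]' \
    --output text
}

# List security groups whose inbound rules reference the given group.
sg_referenced_by() {
  aws ec2 describe-security-groups \
    --filters "Name=ip-permission.group-id,Values=$1" \
    --query 'SecurityGroups[].GroupId' \
    --output text
}
```

The Description column of the first query usually hints at the owning service, which tells you what to delete or detach first.
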
  • Dependency error when a resource is in use by another one

Error: error deleting ElastiCache Subnet Group (rg**subnet): CacheSubnetGroupInUse: Cache subnet group **-subnet is currently in use by a cache cluster.
status code: 400, request id: xxxx

Same action as for the previous dependency error: find what still uses the resource and remove that first.
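
To find the cache cluster that keeps the subnet group busy, a query along these lines may help. It is a sketch; the function name cache_clusters_using is hypothetical.

```shell
# List ElastiCache clusters that use the given cache subnet group.
# $1 = cache subnet group name from the error message
cache_clusters_using() {
  aws elasticache describe-cache-clusters \
    --query "CacheClusters[?CacheSubnetGroupName=='$1'].CacheClusterId" \
    --output text
}
```

Delete or move the listed clusters, then run terraform destroy again.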

Epilogue

Always stay calm. In the spirit of continuous improvement, take small steps, one by one, to improve the project. Get rid of manual work. I’d like to recommend the article “A calm sysadmin”. I’ll update this post while I’m on this project if I find something good to share.
