Duplicate Code Metric for Terraform Code missing

We are a DevOps team doing a lot of IaC with Terraform, with a heavy engineering background in Java/Python.

We are currently evaluating SonarQube (still the Community Edition, on EKS with Helm) and are quite impressed.

We're used to seeing duplication metrics for our code, but apart from Scala I don't see any for our Terraform code.

I played with sonar.cpd.terraform.minimumtokens and … …Lines, but in the logs I don't see any ‘DEBUG: Detection of duplications for …’ entries for Terraform files.

Is it missing from the implementation?

Thx in advance!
Lars

From what I read in the SonarSource/sonar-iac repository on GitHub (Static Code Analyser for Infrastructure-as-Code languages such as CloudFormation and Terraform, as well as DevOps platforms like Docker and Kubernetes), there is no implementation of CPD, right?

I compared it to PythonCpdAnalyzer.java (sonar-python-plugin/src/main/java/org/sonar/plugins/python/cpd/PythonCpdAnalyzer.java in the SonarSource/sonar-python repository on GitHub). Would implementing a TerraformCpdAnalyzer the same way be a good start?

Hi @Lars2

Welcome to the community, thanks for evaluating SonarQube and for your question!

So far we haven’t considered implementing Copy Paste Detection for Terraform. It can cause a lot of false positives that we would like to avoid.

Please provide us with some examples where you expect copy-paste detection, so we can better understand your case and think about it again.

Best Regards
Marcin Stachniuk


Hi @Marcin_Stachniuk,

Thx for replying.

As an example, let me describe the context we work in, to help understand the problem.

We implement “datalakes” in AWS consisting of multiple data pipelines.
Each data pipeline

  • gets data from somewhere (S3/…)
  • mangles the data (Glue/Lambda/…)
  • places the results somewhere (S3/RDS/…).

One can imagine that, when it comes to routine implementations, one way to proceed is to copy and adapt (adjust the Terraform state, change a base name and implement the logic in a Lambda). In an oversimplified example you might copy 10 files, change 3 lines, and end up with another 300 lines of duplicated code.

Example file hierarchy (a sketch of the refactoring we would aim for follows below):

datapipelines:

  • dp1
    – /lambda
      — lambda.py
    – input_s3.tf
    – process_lambda.tf
    – output_s3.tf
    – provider.tf
    – versions.tf
  • dp2
    – /lambda
      — lambda.py
    – input_s3.tf
    – process_lambda.tf
    – output_s3.tf
    – provider.tf
    – versions.tf
  • dp3
    – /lambda
      — lambda.py
    – input_s3.tf
    – process_lambda.tf
    – output_s3.tf
    – provider.tf
    – versions.tf
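
To make the refactoring concrete: a minimal sketch, assuming a hypothetical shared data_pipeline module (the module path and the base_name/lambda_src variables are invented for illustration):

module "dp1" {
  # Hypothetical shared module capturing the common input -> process -> output
  # wiring; only the pipeline-specific parts vary per instantiation.
  source     = "../modules/data_pipeline"
  base_name  = "dp1"
  lambda_src = "${path.module}/dp1/lambda"
}

module "dp2" {
  source     = "../modules/data_pipeline"
  base_name  = "dp2"
  lambda_src = "${path.module}/dp2/lambda"
}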

Duplication detection could give us a clue to consider such refactoring, to keep DRY and to improve in the field of clean code.

Without it, we only wake up one day and say ‘Oh nooo’.

I hope I made it clear.

Thx.


Hi @Lars2,

This seems like a nice improvement to our offering. I recorded it in our system and we will evaluate it for inclusion in our roadmap.

Denis


Hi,
I have just started evaluating the use of SonarQube for detecting duplicate Terraform code and am disappointed to discover that this is currently not supported.

Quoting the request above:

“Please provide us with some examples where you expect copy-paste detection, so we can better understand your case and think about it again.”

I would like to add an example of where this would be useful in my organisation.

It's generally good practice to write Terraform code that is agnostic of the environment it will be applied against. One way this can be accomplished is by having a variable:

variable "environment" {
  type = string
  condition     = contains(["live", "staging"], var.environment)
}

and that variable can then be referenced in Terraform resource blocks.
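
A minimal sketch (the bucket name is invented for illustration):

resource "aws_s3_bucket" "data" {
  # One definition serves every environment; only the name varies.
  bucket = "dq-data-${var.environment}"
}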

Another way is by using for_each meta-arguments.
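
Again as a minimal sketch (names invented), a single block can be stamped out once per environment:

resource "aws_s3_bucket" "per_env" {
  # for_each instantiates this block once for each element of the set.
  for_each = toset(["live", "staging"])
  bucket   = "dq-data-${each.key}"
}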

Sadly, in our organisation we have lots of Terraform code and much of it doesn't exhibit these simple good practices. Hence we have code like the following:

resource "aws_s3_bucket" "dq-args-staging" {
  bucket        = "dq-args-staging"
  force_destroy = false
  versioning {
    enabled    = false
    mfa_delete = false
  }
  server_side_encryption_configuration {
    rule {
      apply_server_side_encryption_by_default {
        sse_algorithm = "AES256"
      }
      bucket_key_enabled = true
    }
  }
  lifecycle_rule {
    id      = "default_lifecycle_rules"
    enabled = true
    transition {
      days          = 0
      storage_class = "INTELLIGENT_TIERING"
    }
    abort_incomplete_multipart_upload_days = 7
    expiration {
      expired_object_delete_marker = true
    }
    noncurrent_version_expiration {
      days = 14
    }
  }
  # tflint-ignore: aws_resource_missing_tags
  tags = merge(
    module.tags.squad_data_infrastructure_staging,
    {
      Squad = "core-data-platform"
    }
  )
}
# tflint-ignore: terraform_naming_convention
resource "aws_s3_bucket" "dq-development" {
  bucket        = "dq-development"
  force_destroy = false
  versioning {
    enabled    = false
    mfa_delete = false
  }
  server_side_encryption_configuration {
    rule {
      apply_server_side_encryption_by_default {
        sse_algorithm = "AES256"
      }
      bucket_key_enabled = true
    }
  }
  lifecycle_rule {
    id      = "default_lifecycle_rules"
    enabled = true
    transition {
      days          = 0
      storage_class = "INTELLIGENT_TIERING"
    }
    abort_incomplete_multipart_upload_days = 7
    expiration {
      expired_object_delete_marker = true
    }
    noncurrent_version_expiration {
      days = 14
    }
  }
  # tflint-ignore: aws_resource_missing_tags
  tags = merge(
    module.tags.squad_data_infrastructure_staging,
    {
      Squad = "core-data-platform"
    }
  )
}

(This is a real example lifted from our code)

As you can see, the two resources are virtually identical. I would have really liked SonarQube to detect this duplication.
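
To sketch what the de-duplicated version could look like (the resource address is invented, and the shared blocks are elided into a comment), the two resources collapse into one for_each block:

resource "aws_s3_bucket" "dq" {
  # Instantiated once per bucket name; the name is the only difference
  # between the two original resources.
  for_each      = toset(["dq-args-staging", "dq-development"])
  bucket        = each.key
  force_destroy = false

  # ... the shared versioning, server_side_encryption_configuration and
  # lifecycle_rule blocks from the originals would move here unchanged ...

  tags = merge(
    module.tags.squad_data_infrastructure_staging,
    {
      Squad = "core-data-platform"
    }
  )
}

(Adopting this for existing buckets would also need terraform state mv to the new indexed addresses, to avoid recreating them.)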

Hope that helps.
