SonarQube duplicate code detection

Community Build v25.5.0.107428

My development group is evaluating SonarQube for our company, and we have concerns about duplicate code detection. Our initial analysis of our projects is reporting massive amounts of code duplication. Likewise, from a “Clean as you code” perspective, a lot of duplication is reported whenever a developer touches code.

When we looked into it, our understanding was that SonarQube does not consider string literals when determining which code is duplicated. Can that behavior be toggled off? If not, I’m afraid that the duplicate code detection in SonarQube may be completely useless for us. It generates so much noise from falsely reported duplicates that we are unable to identify actual duplicated code.

For example, we have JavaScript for many pages where we might modify the input fields depending on which mode a user is in (query vs. edit). So in many thousands of places we might disable input fields like this (pseudocode for simplicity):

disableField("#firstName");
disableField("#lastName");

and in another page we have:

disableField("#partNumber");
disableField("#quantity");

These are all being reported as duplicates, which fails our quality gates and shows a huge amount of duplicate code. What’s worse, no one has time to dig through all these false positives to find the possibly legitimate cases of duplicate code in new commits.
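
To illustrate why the two snippets above end up looking identical, here is a rough sketch that assumes the detector swaps every string literal for a placeholder before comparing code. The `normalizeStringLiterals` helper and the `"STR"` placeholder are hypothetical and only mimic the described behavior; this is not SonarQube's actual implementation.

```javascript
// Hypothetical normalizer: replaces each single- or double-quoted string
// literal with the same placeholder, so only the code's shape remains.
function normalizeStringLiterals(source) {
  return source.replace(/"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'/g, '"STR"');
}

const pageA = 'disableField("#firstName");\ndisableField("#lastName");';
const pageB = 'disableField("#partNumber");\ndisableField("#quantity");';

// As written, the two snippets differ in their selectors.
const sameAsWritten = pageA === pageB;

// After normalization, both become two identical disableField("STR") calls.
const sameAfterNormalizing =
  normalizeStringLiterals(pageA) === normalizeStringLiterals(pageB);

console.log(sameAsWritten, sameAfterNormalizing); // false true
```

Under that assumption, any run of calls that differs only in its selectors collapses to the same text, which matches the kind of reports we are seeing.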

If we cannot make duplicate detection stop ignoring string literals, we feel we will need to turn off duplicate code detection completely. So I’m hoping I am misunderstanding something and it can be configured to take string literals into account.

Thanks!
Todd

Hi Todd,

You’re not misunderstanding.

You may be interested in duplication exclusions.
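
For reference, duplication exclusions are set with the `sonar.cpd.exclusions` analysis property, which takes comma-separated file path patterns. A minimal sketch, assuming the affected page scripts live under a hypothetical `src/pages/` directory:

```properties
# sonar-project.properties
# Hypothetical path pattern; adjust it to wherever the page scripts actually live.
sonar.cpd.exclusions=src/pages/**/*.js
```

Files matching the pattern are left out of duplication detection only; they are still analyzed for other issues.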

 
HTH,
Ann

Thank you for confirming, Ann. Unfortunately, we will have to turn it off for all of our code. We are generating dozens, if not hundreds, of false positives for every legitimate code duplication. We simply don’t have the capacity to have someone review all these false duplications. Even if we did, there is no way to mark them as “accepted”, so each new commit would require us to review them again.

What is the rationale for having it work this way? It seems to me many development teams would struggle with this same issue. Is there any desire within the community to change it or make it configurable?

Hi,

The duplication detection algorithm was initially developed in the context of Java, not JavaScript. In that context, doing all the same things and only varying the string values is a lot less “normal”, so calling it a duplication makes more sense.

 
FWIW,
Ann