C++ analysis - Old code is treated as new code

which versions are you using (SonarQube, Scanner, Plugin, and any relevant extension)
SonarQube Enterprise Edition, Version 8.9.5 (build 50698)
Scanner - SonarScanner for MSBuild 5.4.1
what are you trying to achieve
We are trying to analyze only the new code added since the last released version.
what have we tried so far to achieve this
What we observe is that some issues and security hotspots in old code are reported as being on new code.
We do a full C++ build with the cache enabled. We have several Jenkins agents that build the project and publish its analysis to the SQ server. Each Jenkins agent maintains its own SonarQube cache for the same project.
For example,

  • If the project was built on Agent1, it has its own local SonarQube cache, which is used the next time the project is built on Agent1.
  • Similarly, if the project was built on Agent2, it has its own local SonarQube cache, which is used the next time the project is built on Agent2.

The Jenkins master decides which agent to run the build on. Could this problem be caused by each agent keeping its own version of the SonarQube cache?

If that is the case, how can I get rid of these issues and security hotspots that are on old code but treated as being on new code?

Do I need to run the analysis again on the baseline (let's say v5.1) and then again on the new code (let's say v5.2)? Is there any other solution?

Hi,

New Code detection is based on SCM data, not on your analysis cache.

Can you double-check the blame data SonarQube shows for one of these new/old lines? E.g.
[screenshot: Selection_385]

Also, can you share what kinds of issues you’re seeing, i.e. what rules are raising these issues? There are times when it’s perfectly valid for a new issue to be raised on old code. For instance, if you delete a null-check, a new issue can be raised on the subsequent dereference in old code.
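For anyone who wants to cross-check the blame data locally, here is a minimal sketch that parses `git blame --line-porcelain` output and lists the lines committed on or after a given baseline date. The function names and the choice of the committer timestamp are assumptions made for illustration; this is not how SonarQube itself computes new code, only a way to compare against what the UI shows.

```python
import re
import subprocess
from datetime import datetime, timezone
from typing import List

# Each blame entry starts with: <sha> <original line> <final line> [<group size>]
HEADER = re.compile(r"^[0-9a-f]{40,64} \d+ (\d+)")


def lines_committed_since(porcelain: str, baseline: datetime) -> List[int]:
    """Return final line numbers whose commit time is at or after `baseline`."""
    recent = []
    line_no = None
    commit_time = None
    for raw in porcelain.split("\n"):
        header = HEADER.match(raw)
        if header:
            line_no = int(header.group(1))
        elif raw.startswith("committer-time "):
            # Assumption: the committer timestamp is the date of interest.
            seconds = int(raw.split()[1])
            commit_time = datetime.fromtimestamp(seconds, tz=timezone.utc)
        elif raw.startswith("\t"):
            # A tab-prefixed line is the source text and closes the entry.
            if (line_no is not None and commit_time is not None
                    and commit_time >= baseline):
                recent.append(line_no)
            line_no = commit_time = None
    return recent


def blame(path: str) -> str:
    """Run git blame inside a checked-out repository (git must be on PATH)."""
    result = subprocess.run(
        ["git", "blame", "--line-porcelain", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `lines_committed_since(blame("some/file.cpp"), baseline)` with the start of your new code period as a timezone-aware `baseline` gives the lines git considers recent. If a line flagged as new in SonarQube is not in that list, the SCM data on the server is worth a closer look.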

 
Ann

Thanks @ganncamp,
I understand the scenarios you describe, where it is perfectly valid for a new issue to be raised on old code.
In our case, some issues have been raised as being on new code in files that were not modified at all within the new code period. For example, our new code starts from Jan 22 and the file was last modified on June 21.

For example, the security hotspot “Make sure use of "strlen" is safe here.” was raised on a source file that has not changed recently.

In one of the files that was changed, we see the following issue on old code reported as being on new code.

This is not a new issue; it was already reported on overall code, but it suddenly started showing as being on new code.

So we have two sets of problems.

  1. Issues on old code are reported as being on new code when there is some change in the file.
  2. Issues on old code (particularly security hotspots) are reported as being on new code when there is no change at all in the file.

When you say that new code detection is based on SCM data, is there any guideline or way to diagnose this problem from the logs? Would we need verbose logs?

Appreciate your help!!

Regards,
Sumedh

Hi Sumedh,

In the issue you’ve shown, can you click on '10 days ago' and provide a screenshot of the activity log that’s displayed?

Not really, but that’s why I asked for the SCM data SonarQube has (see my screenshot above). If you click in the grey margin, you should get that data.

 
Ann

Thanks @ganncamp ,
That particular source file was changed, and I can see some code added before the code on which we see the issue.

Below is the image that you requested.

Regards,
Sumedh

Hi @Sumedh.Bhogle,

You can find here the algorithm used to detect whether an issue is new and needs a new date.

In your screenshot, if the issue was already there and you didn’t change the content of the line on which the issue is raised, the issue should not be considered new and should have a date before your last version (Jan 22) because of:

  • on the same rule, with the same message and with the same line hash (but not necessarily with the same line) > MATCH
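As a rough illustration of that matching criterion (not the actual SonarQube implementation), the sketch below carries the creation date over from a previous analysis whenever rule, message, and line hash all match, regardless of line number. The `Issue` class, the function names, the rule key, and the whitespace-insensitive SHA-256 hash are all assumptions chosen for the example.

```python
import hashlib
from dataclasses import dataclass, replace
from typing import Dict, List, Tuple


def line_hash(text: str) -> str:
    # Assumption: whitespace is ignored, so re-indenting a line keeps its hash.
    compact = "".join(text.split())
    return hashlib.sha256(compact.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Issue:
    rule: str
    message: str
    line: int
    line_hash: str
    created: str  # ISO date, e.g. "2021-06-21"


def carry_over_dates(previous: List[Issue], current: List[Issue]) -> List[Issue]:
    """Give matched issues the creation date of their earlier counterpart."""
    by_key: Dict[Tuple[str, str, str], Issue] = {}
    for old in previous:
        # Keep the first issue seen for each (rule, message, line hash) key.
        by_key.setdefault((old.rule, old.message, old.line_hash), old)

    tracked = []
    for new in current:
        old = by_key.get((new.rule, new.message, new.line_hash))
        # MATCH: same rule, message and line hash, even if the line moved.
        tracked.append(replace(new, created=old.created) if old else new)
    return tracked
```

Under this model, an issue on a line that only moved (for example from line 10 to line 14 because code was added above it) keeps its old date, while an issue whose line content changed is treated as new.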

There can be scenarios where newly introduced issues are reported as new on old code even if the file didn’t change. For example, if you modify code in a header file that is included in an unmodified file, this can change the meaning of the code in the unmodified file and so trigger new issues. Those scenarios are only about newly introduced issues on old code that weren’t detected in your previous version.

Thanks,

Thanks @Abbas_Sabra ,
I understand what you are saying. But the issues marked as being on new code (which are actually on old code) had already been raised, so I am not sure why they show a new creation date.

To confirm,

  • We created a test SonarQube project key for the same project.
  • We ran the first analysis after moving the Git HEAD to the commit where our new code starts.
  • We ran the second analysis on the current commit.
  • We observed that the analysis is accurate and more consistent, i.e. it does not mark any issue on old code as being on new code.
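The two-pass check above can be sketched as a small helper that builds the command sequence for each analysis. This is only an outline under assumptions: the function names and placeholder values are hypothetical, and the server URL, token, build wrapper invocation, and any other properties your C++ setup needs are deliberately left out.

```python
from typing import List, Tuple


def analysis_commands(project_key: str, commit: str, version: str) -> List[List[str]]:
    """Commands for one analysis of `commit`, reported as `version`."""
    return [
        ["git", "checkout", commit],
        # /k: is the project key, /v: the project version.
        ["SonarScanner.MSBuild.exe", "begin", f"/k:{project_key}", f"/v:{version}"],
        ["MSBuild.exe", "/t:Rebuild"],
        ["SonarScanner.MSBuild.exe", "end"],
    ]


def reproduction_plan(
    project_key: str,
    baseline: Tuple[str, str],
    current: Tuple[str, str],
) -> List[List[str]]:
    """Analyze the baseline (commit, version) first, then the current one."""
    plan = analysis_commands(project_key, *baseline)
    plan += analysis_commands(project_key, *current)
    return plan
```

For example, `reproduction_plan("test-key", ("<baseline-commit>", "1.1.0"), ("<current-commit>", "1.2.0"))` returns eight commands that could be run in order, for instance with `subprocess.run(cmd, check=True)`.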

As of now, we have no idea what happened with the original project key and no way to make progress in diagnosing the problem.

Hi,

Are we looking at a branch? And if so, what’s the branch’s New Code Period? Does your analysis log include any messages about SCM data detection?

 
Ann

Hello @ganncamp ,
This is on the main trunk. The new code definition is the default one, based on the previous version, i.e. for version 1.2.0, new code is everything since 1.1.0.

I tried digging into the logs to see if I could find something, but no luck. As we are using the Enterprise Edition, we have reported this issue to official support.

Note: As I said earlier, if we run the analysis for 1.1.0 again and then for 1.2.0, everything is accurate, and no issues that are actually on old code are reported as being on new code.