Unrelated files scanned in SonarCloud PR check

We are currently using SonarCloud for our PHP application, which uses the development branch as its main branch and master as the stable/release branch. Our current workflow is to create feature branches from development and merge them back into development.

The setup is to build and test in our CI on every PR change and push the results to SonarCloud, plus a nightly build on the development branch.

The problem is that we are seeing inconsistent PR scan results: issues are reported on a PR for unrelated files that are not part of the PR and don’t fit the definition of “New Code” (currently the last 30 days). The report includes bugs and duplicated lines, and the test coverage metric looks completely random.

What can cause this inconsistency?

Hi @Ayan and welcome to our Community!

Could you please confirm that you configured your branches and CI as described in our documentation? This one could also give an overview of how it works on SonarCloud.

Once that is confirmed, please provide your configuration (as much as possible; use screenshots if necessary), plus the CI details and definition (which CI tool, and its configuration files). If there is sensitive information, please sanitize it, or we can arrange a private message for such data if you prefer (just let me know!). If your project is public, simply provide both the SonarCloud URL and the ALM URL.

Hey,

We use a custom CI integration (our CI tool is GoCD).
This is our sonar-project.properties config (code coverage is prepared beforehand via PHPUnit):

sonar.organization=XXX
sonar.projectKey=XXX_YYY
sonar.host.url=https://sonarcloud.io

sonar.sources=XXX
sonar.tests=./tests
sonar.language=php
sonar.sourceEncoding=UTF-8
sonar.php.coverage.reportPaths=/var/tmp/coverage_report.xml
sonar.php.tests.reportPath=/var/tmp/tests_report.xml
sonar.coverage.exclusions=XXX
sonar.exclusions=XXX

And this is how we invoke the branch analysis (run from the project root folder in the specified branch):

/sonar-scanner-4.4.0.2170-linux/bin/sonar-scanner  -Dsonar.login=${SONAR_LOGIN_CREDS}  -Dsonar.pullrequest.branch=${PROJECT_GIT_BRANCH} -Dsonar.pullrequest.key=${PROJECT_PR_NUMBER} -Dsonar.pullrequest.base=${PROJECT_GIT_DEST_BRANCH}

The bash variables are being populated depending on the running pipeline/branch/pr.

Finally, this is how we do the main branch analysis (run from the project root folder):

/sonar-scanner-4.4.0.2170-linux/bin/sonar-scanner  -Dsonar.login=${SONAR_LOGIN_CREDS}

@Ayan and @duronrulez, I still need some information about this:

Could you please confirm that you configured your branches and CI as described in our documentation? This one could also give an overview of how it works on SonarCloud.

Can you tell me your branches configuration (long and short branches pattern)?

And this is how we invoke the branch analysis (run from the project root folder in the specified branch):

Could you confirm that this command is executed only for pull requests? Can you check that PROJECT_GIT_BRANCH points to the pull request branch and that PROJECT_GIT_DEST_BRANCH points to the long-lived branch targeted by the pull request? It sometimes happens that these values are assumed to be fine when they are actually incorrect, which leads to problems like yours.
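
The checks asked for above could be automated with a small pre-flight function in the pipeline, before the scanner is invoked. This is only a sketch: the function name check_pr_params is hypothetical, and the long-lived pattern below is an assumption that should be replaced with whatever is configured for the project in SonarCloud.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check for the pull request parameters.
# Usage: check_pr_params <pr-branch> <pr-key> <base-branch>
check_pr_params() {
  local branch="$1" key="$2" base="$3"
  # Assumed long-lived branch pattern; adjust to the project's real setting.
  local long_lived='^(master|development|release/.*)$'

  [ -n "$branch" ] || { echo "pull request branch is empty" >&2; return 1; }
  [ -n "$key" ]    || { echo "pull request key is empty" >&2; return 1; }
  [[ "$base" =~ $long_lived ]] \
    || { echo "base '$base' is not a long-lived branch" >&2; return 1; }
  [ "$branch" != "$base" ] \
    || { echo "pull request branch and base are identical" >&2; return 1; }
}
```

In the PR pipeline it could be called as check_pr_params "${PROJECT_GIT_BRANCH:-}" "${PROJECT_PR_NUMBER:-}" "${PROJECT_GIT_DEST_BRANCH:-}" and the scanner started only if it succeeds.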

Just to chip in with an idea, another common cause for seeing unrelated issues in PR analysis is when the base branch has some commits that have not been analyzed. For example when the base branch has commits:

A --- B --- C

And the pull request has commits:

A --- B --- C --- P1 --- P2

And let’s say the base branch was last analyzed after commit B, and it has not been analyzed after commit C. In this situation issues introduced by commit C will get included in the pull request. Basically, the pull request shows all the issues in the code minus the issues known in its base branch.

In short, make sure the base branch is always analyzed when a commit is pushed. That would prevent the scenario I described here. If yours is a different scenario, please follow the guidance of @Alexandre_Holzhey.
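
The scenario above can be reproduced with plain git in a throwaway repository. This is a simplified model for illustration, not the scanner's actual implementation: it only shows that comparing the pull request against the last analyzed state of the base branch (commit B) also picks up the file touched by commit C. The file names and the commit helper are made up for the demo.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository so the sketch does not touch a real project.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/development
git config user.email "demo@example.com"
git config user.name "Demo"
git config commit.gpgsign false

# commit <file> <message>: create one file and commit it.
commit() {
  echo "$2" > "$1"
  git add "$1"
  git commit -q -m "$2"
}

commit a.php A
commit b.php B
last_analyzed="$(git rev-parse HEAD)"   # base branch was analyzed here
commit c.php C                          # pushed, but not analyzed yet

git checkout -q -b feature
commit p1.php P1
commit p2.php P2

# Files the pull request really changes (diff from the fork point).
pr_files="$(git diff --name-only development...feature)"

# Files that differ from the last *analyzed* state of the base branch.
seen_files="$(git diff --name-only "$last_analyzed" feature)"

echo "PR diff:"
echo "$pr_files"
echo "Compared with the last analyzed base:"
echo "$seen_files"
```

The second list contains c.php in addition to the two pull request files, which matches the description above of issues from commit C leaking into the pull request.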

@janos We are running a scan on the main branch (development) daily and merging separately scanned PRs several times a day. If it were the case you described, the unrelated files we see would be limited to files edited on the same day. But in our case we see reports on files changed several weeks, and sometimes months, ago.

@Alexandre_Holzhey

Long-lived branches pattern: (master|development|release\/).*

I’ll let @duronrulez explain the rest of the custom CI config.

@Alexandre_Holzhey any further ideas on this one?

@duronrulez provided the configuration in the opening.

Scanning more often is certainly something we can do, but our codebase is quite big and the scan takes hours to complete. That makes it impractical: a development scan on every push that takes 1+ hours to finish is not very useful, I think.

Please verify that the ${PROJECT_GIT_DEST_BRANCH} variable used by your commands points to the correct base branch. The files that show up in the pull request analysis should correspond to the files changed since the common fork point of the working tree that you are analyzing, and the base branch. If the value of this variable is incorrect, the results will be incorrect.

As an additional check, you could add the following command in your CI tool, to verify the changed files:

git diff --name-only "origin/${PROJECT_GIT_DEST_BRANCH}...HEAD"

The list of files printed by this command should include all the files you see on the Code tab of the pull request on SonarCloud. If the list of files printed by this command looks strange, then ${PROJECT_GIT_DEST_BRANCH} probably doesn’t have the intended value.
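
If the command fails or prints an unexpected list, it can help to make the fork point explicit. The following is a minimal sketch, assuming it runs inside the checked-out repository; the function name changed_since_fork is hypothetical. It first verifies that the base ref exists locally, then prints the common ancestor and the files changed since it, which is the same set of files the three-dot diff above reports.

```shell
#!/usr/bin/env bash
# changed_since_fork <base-ref> [<head-ref>]
# Prints the fork point, then the files changed between it and the head.
changed_since_fork() {
  local base_ref="$1" head_ref="${2:-HEAD}" fork

  # Fail loudly if the base ref is not known locally (e.g. never fetched).
  git rev-parse --verify --quiet "${base_ref}^{commit}" >/dev/null \
    || { echo "unknown base ref: ${base_ref}" >&2; return 1; }

  # The common ancestor of the base branch and the analyzed commit.
  fork="$(git merge-base "$base_ref" "$head_ref")" \
    || { echo "no common ancestor for ${base_ref} and ${head_ref}" >&2; return 1; }

  echo "fork point: ${fork}"
  git diff --name-only "$fork" "$head_ref"
}
```

In the pipeline this could be called as changed_since_fork "origin/${PROJECT_GIT_DEST_BRANCH}". An "unknown base ref" error would suggest that the variable is wrong or that the base branch is not available in the CI checkout.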

Lastly, are there any warnings reported on the pull request page in SonarCloud? (Warnings show up in the top-right corner, in a small box with yellow background.) They may contain clues relevant to this issue.

This is what we get as output from the sonar-scanner CLI (with debug on):
11:56:59.664 DEBUG: SCM reported changed lines for 32 files in the branch

But this is what we see as a diff in both the Bitbucket UI and the git CLI (git diff --name-only "origin/development...XXXX-COMMIT-XXXX"):

Both Bitbucket and our CLI commands return only 10 files (which is the correct result).

Can you please explain (a git command would be perfect) how Sonar works out those 32 files between the branch and the destination?

There are also warnings about file resolution in the coverage report that we provide.

Let us know if you need anything else.