How PR file scanning scope is computed

ntriquenaux · July 12, 2019, 3:04pm

Dear Community

System information:

SonarQube version: 6.7.6.38781
SonarScanner version: 3.3.0.1492
Java version: 1.8.0_121

I would like to know whether SonarScanner does anything special to the scope of files it scans when analyzing a pull request and, if so, under which conditions. Here is the context that motivates my question.

I have a Java project. When I launch a scan on a branch, the number of files to be analyzed differs from the number I get when scanning a pull request that targets that same branch. Example:

Seen in the branch scan log:

INFO: 6705 files indexed
INFO: 24361 files ignored because of inclusion/exclusion patterns
INFO: [....]
INFO: Java Main Files AST scan
INFO: 3755 source files to be analyzed

Seen in the PR scan log:

INFO: 6695 files indexed
INFO: 24585 files ignored because of inclusion/exclusion patterns
INFO: [....]
INFO: Java Main Files AST scan
INFO: 468 source files to be analyzed
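
To put the two logs side by side, the counters can be pulled out with a small script. This is only a sketch: the function name `extract_counts` and the regular expressions are my own, written against the log lines quoted above, and are not something SonarScanner provides. Other scanner versions may word these lines differently.

```python
import re

# Patterns written against the INFO lines quoted in this post (assumption:
# the wording is stable for this scanner version).
PATTERNS = {
    "indexed": re.compile(r"INFO: (\d+) files indexed"),
    "ignored": re.compile(r"INFO: (\d+) files ignored because of inclusion/exclusion patterns"),
    "to_analyze": re.compile(r"INFO: (\d+) source files to be analyzed"),
}


def extract_counts(log_text):
    """Return the first value found for each counter in a scanner log."""
    counts = {}
    for line in log_text.splitlines():
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match and name not in counts:
                counts[name] = int(match.group(1))
    return counts


BRANCH_LOG = """\
INFO: 6705 files indexed
INFO: 24361 files ignored because of inclusion/exclusion patterns
INFO: Java Main Files AST scan
INFO: 3755 source files to be analyzed
"""

PR_LOG = """\
INFO: 6695 files indexed
INFO: 24585 files ignored because of inclusion/exclusion patterns
INFO: Java Main Files AST scan
INFO: 468 source files to be analyzed
"""

branch = extract_counts(BRANCH_LOG)
pr = extract_counts(PR_LOG)
for name in PATTERNS:
    print(f"{name}: branch={branch[name]} pr={pr[name]} diff={branch[name] - pr[name]}")
```

The indexed counts differ by only 10 files, while the Java "source files to be analyzed" counts differ by 3287. That second gap is the one I am asking about.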

I tried to understand what would cause such a difference:

I first guessed that SonarScanner was relying on the list of changed files, but the number of source files to be analyzed does not match the number of modified files.
I then assumed that the Abstract Syntax Tree representation of the modified files pulls some additional files into the scope of analysis, but I could not verify that assumption.

So why is there a difference between the PR and branch analyses? And in the case of the PR, what decides the scope of files to analyze?

Secondly, I have a Python project that relies only on pylint rules. In that case there is no difference in scanning scope: the full source folder is scanned whether it is a branch or a PR. I find that behavior more logical than the one seen with the Java project. So why is there a difference between the two projects? Is it because Java has a better scanner?

ganncamp · July 15, 2019, 12:59pm

Hi,

I’m a bit confused. You seem to say that you’re analyzing both short-lived branches and PRs in SonarQube 6.7.6, but PR analysis isn’t available in 6.7.*

Also, are you absolutely sure both analyses represent the same project state and that one doesn’t have added/removed, re-located files?

Ann

ntriquenaux · July 15, 2019, 2:14pm

Dear Ann,

I may not have used the proper vocabulary. By PR vs. branch analysis, I mean that I ran the following commands:

PR analysis:
sonar-scanner -Dsonar.sources=<SRC_PATH> -Dsonar.ce.javaOpts=-Xmx2G -Dsonar.analysis.mode=preview -Dsonar.github.oauth=<CREDS> -Dsonar.github.repository=<THE_REPOSITORY> -Dsonar.github.pullRequest=<PR_ID>
Branch analysis:
sonar-scanner -Dsonar.sources=<SRC_PATH> -Dsonar.ce.javaOpts="-Xmx2G" -Dsonar.branch=<BRANCH_NAME>

There might be some added/removed/relocated files between the two logs I have shown, which would explain the difference in the number of indexed files. What I cannot explain is the difference in the number of files to analyze. The “PR analysis” always has fewer files than the “Branch analysis”. In each case, the PR is to be merged into the branch I refer to as “branch analysis”.

ganncamp · July 15, 2019, 4:26pm

Hi,

I wonder if you’d provide screenshots, redacted as necessary, of what you get from your “PR analysis”?

From the analysis command you’ve provided, I suspect you’re getting a main-branch analysis:

-Dsonar.ce.javaOpts=-Xmx2G - This is only relevant in server settings
-Dsonar.analysis.mode=preview - This stopped working a while ago
-Dsonar.github.oauth=<CREDS> - I don’t recognize this param
-Dsonar.github.repository=<THE_REPOSITORY> - Must be configured via the UI
-Dsonar.github.pullRequest=<PR_ID> - This should be sonar.pullrequest.key
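
For illustration only, here is a rough Python sketch of how the scanner arguments might be assembled on a version and edition that supports PR analysis. Only `sonar.pullrequest.key` comes from the list above. The `sonar.pullrequest.branch` and `sonar.pullrequest.base` properties, the helper name `build_pr_scan_command`, and the sample values are assumptions to check against your server's documentation.

```python
def build_pr_scan_command(sources, pr_key, pr_branch, base_branch):
    """Assemble a sonar-scanner argument list for a pull request analysis.

    Assumption: sonar.pullrequest.branch and sonar.pullrequest.base are not
    mentioned in this thread; verify them for your SonarQube version.
    """
    properties = {
        "sonar.sources": sources,
        "sonar.pullrequest.key": pr_key,
        "sonar.pullrequest.branch": pr_branch,
        "sonar.pullrequest.base": base_branch,
    }
    # Each property becomes one -Dkey=value argument, in insertion order.
    return ["sonar-scanner"] + [f"-D{key}={value}" for key, value in properties.items()]


# Hypothetical sample values, for illustration only.
command = build_pr_scan_command("src", "123", "feature/my-change", "master")
print(" ".join(command))
```

Note that none of the server-side or deprecated parameters (`sonar.ce.javaOpts`, `sonar.analysis.mode`, `sonar.github.*`) appear in the resulting command.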

Ann