Race condition in SonarCloud PR commit status; ScannerReport.scmRevisionId discarded, not in webhooks

sonarcloud
github
pull-request

(Jesse Glick) #1

To reproduce, set up a CI process for GitHub pull requests (I used Jenkins X serverless), and a repository tied to a project in SonarCloud using the default wizard settings. I used a Jenkinsfile which runs

mvn -Dsonar.login=[TOKEN] -Dsonar.pullrequest.branch=PR-[NUMBER] -Dsonar.pullrequest.key=[NUMBER] -Dsonar.pullrequest.base=master -Dsonar.pullrequest.provider=github -Dsonar.pullrequest.github.repository=[ORG]/[REPO] install

on a Maven project defining

<sonar.host.url>https://sonarcloud.io</sonar.host.url>
<sonar.organization>[ORG]</sonar.organization>
<sonar.projectKey>[ORG]_[REPO]</sonar.projectKey>

and

<plugin>
    <groupId>org.sonarsource.scanner.maven</groupId>
    <artifactId>sonar-maven-plugin</artifactId>
    <version>3.5.0.1254</version>
    <executions>
        <execution>
            <phase>install</phase>
            <goals>
                <goal>sonar</goal>
            </goals>
        </execution>
    </executions>
</plugin>
<plugin>
    <groupId>org.jacoco</groupId>
    <artifactId>jacoco-maven-plugin</artifactId>
    <version>0.8.2</version>
    <executions>
        <execution>
            <goals>
                <goal>prepare-agent</goal>
            </goals>
        </execution>
    </executions>
</plugin>

File a (forked) pull request with some initially good code passing the default quality gate. Now:

  • Push a commit to the PR that takes several minutes to build (say, inserting sleep 300 in the Jenkinsfile prior to any real build steps) but otherwise does not touch source code. This is to simulate a random infrastructure delay.
  • Push another commit to the PR that removes the build delay, and introduces an obvious problem in the source code—say, adding an unused private method in a Java class.
  • Wait for the second commit to finish building. SonarCloud comments on the PR displaying the code smell, and sets a failing status on the second commit (and thus the PR as a whole, as this is the head commit). The SonarCloud UI displays the PR as having failed the quality gate, and displays the problem. So far so good.
  • Now wait for the first commit to finish building, which corresponds to a successful analysis. The status on the second commit is overridden to be passing (no status is added to the first commit); the PR as a whole is likewise marked with a green check mark; and the SonarCloud UI displays no issues on the PR. All this despite the fact that there is a known violation in the head commit.

Digging into the code, I see on the upload side that GitScmProvider does implement revisionId, and MetadataPublisher does call setScmRevisionId on a field defined in scanner_report.proto as scm_revision_id. But I can find no corresponding code on the server side that would interpret this metadata. There is no call to getScmRevisionId in sonarqube; none of AnalysisMetadataHolderImpl, BranchLoader, BranchImpl, PostProjectAnalysisTasksExecutor, or DefaultBranchImpl have any mention of this.

A related issue is that third-party services integrating with SonarCloud (or I suppose SonarQube Developer Edition) via webhooks do not receive information in the JSON payload about the Git commit being analyzed. (In code: ProjectAnalysis, WebhookPayloadFactoryImpl.writeBranch, Branch, WebhookPostTask.) I do see some documentation indicating that you could pass a sonar.analysis.scmRevision to the scan, which would be made available in the payload, but it seems silly that you would need to do this when the scan report evidently already includes that information; and a service using webhooks could not rely on such a parameter being present or having that exact name if this is up to every CI service to define individually.

I see some resolved issues SONAR-10153 + MMF-1129 + SONAR-10152 + BRANCH-13 + SONARSCGIT-21 (incl. a reference to an apparently private sonar-branch repository, used for D.E.?) all suggesting that the SCM revision field was added intentionally, but with no further details on its use. Since whatever code creates GitHub commit statuses appears to be closed-source, I cannot verify its behavior, though I suspect it is simply picking the current head commit of the PR.

I also see SONAR-10794 which sounds similar, though it lacks much context. Indeed if your CI setup premerges the base branch with the head branch before beginning a build of a PR then any code which creates a commit status must take care to select the appropriate merge parent commit.


(Janos Gyerik) #4

Great reproducer, explanation, and spot-on research into our implementation.

We’re aware of this issue, and SONAR-10794 was meant to fix it. As you pointed out, indeed it lacked context, and it did not explain the issue or its proposed solution very well. I fixed that now, with the help of this thread here.

Soon we will change the way we decorate pull requests on SonarCloud. The new mechanism won’t have this issue. This will continue to affect SonarQube though. I invite readers to vote on SONAR-10794 to increase its visibility.


(Janos Gyerik) #5

And another ticket for adding the commit ID to webhook payloads:

https://jira.sonarsource.com/browse/SONAR-11623