SonarScanner analysis report compression takes a long time

  • Version: 9.2.4.50792
  • What are you trying to achieve?
    I’m using sonarqube-scan-action to run analysis on a somewhat big, multi-app TypeScript project using GitHub Actions.

The problem I’m facing is that the job takes an unusually long time to complete; in particular, the compression step takes far longer than I would have expected: ~1 hour 16 minutes.

Why does the analysis report compression take so long, and is this the expected behaviour?

...
16:01:41.808 INFO: CPD Executor Calculating CPD for 0 files
16:01:41.809 INFO: CPD Executor CPD calculation finished (done) | time=0ms
16:01:41.852 DEBUG: SCM revision ID '9890897f3b396d9843508183799560a36f0aeaf5'
16:03:34.788 INFO: SCM writing changed lines
16:03:34.819 DEBUG: loading config FileBasedConfig[/tmp/jgit/config]
16:03:34.915 INFO: Merge base sha1: f7b4ad78190dab72cd8f1dc318b0a7c8297e3bb1
16:03:34.920 DEBUG: SCM reported changed lines for 0 files in the branch
16:03:34.920 INFO: SCM writing changed lines (done) | time=132ms
16:03:34.998 INFO: Analysis report generated in 113110ms, dir size=903.9 kB
17:19:40.655 INFO: Analysis report compressed in 4565654ms, zip size=1.5 MB
17:19:40.661 INFO: Analysis report generated in /github/workspace/.scannerwork/scanner-report
17:19:40.661 DEBUG: Upload report
17:19:40.876 DEBUG: POST 200 ***api/ce/submit?projectKey=my_project&projectName=my_project&characteristic=pullRequest%3D27132 | time=213ms
17:19:40.881 INFO: Analysis report uploaded in 217ms
17:19:40.913 DEBUG: Report metadata written to /github/workspace/.scannerwork/report-task.txt
17:19:40.913 INFO: ANALYSIS SUCCESSFUL, you can browse ***
17:19:40.913 INFO: Note that you will be able to access the updated dashboard once the server has processed the submitted analysis report
17:19:40.913 INFO: More about the report processing at ***
17:19:40.916 DEBUG: Post-jobs : 
17:19:40.935 DEBUG: eslint-bridge server will shutdown
17:21:17.403 INFO: Analysis total time: 1:31:33.287 s
17:21:17.407 INFO: ------------------------------------------------------------------------
17:21:17.407 INFO: EXECUTION SUCCESS
17:21:17.407 INFO: ------------------------------------------------------------------------
17:21:17.407 INFO: Total time: 1:31:38.068s
17:21:17.483 INFO: Final Memory: 39M/134M
17:21:17.483 INFO: ------------------------------------------------------------------------

My setup:

# .github/workflows/my_workflow.yml
code-analysis-and-code-coverage:
    name: Run Sonar Scanner
    if: always()
    runs-on: [ self-hosted, node ]
    needs: test-unit
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 0

      - name: Download Dependencies
        uses: actions/download-artifact@v2
        with:
          name: unit-test-coverage

      - name: Un-tar Coverage Files
        run: tar -xvf coverage.tar

      - name: Run Sonar Scan
        uses: sonarsource/sonarqube-scan-action@master
        env:
          SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
          SONAR_HOST_URL: ${{ secrets.SONAR_HOST_URL }}
        with:
          args: -X

# sonar-project.properties
sonar.projectName=my_project
sonar.projectKey=my_project
sonar.testExecutionReportPaths=./coverage/test-report.xml
sonar.javascript.lcov.reportPaths=./coverage/lcov.info
sonar.javascript.node.maxspace=8192
sonar.cpd.exclusions=**
sonar.sourceEncoding=UTF-8

Hi,

Does this happen for every (nearly every?) analysis?

Also, FYI, this does nothing:

 
Ann

The compression is really just adding the report files into a zip archive.
Is the rest of the analysis also very slow, possibly indicating a problem with I/O?

Yep, it happens every single time, and the duration is fairly consistent.

Thanks, removed it.

That is what I was thinking as well; there is no way the compression duration is warranted. I’m also not sure why the report gets bigger after compression (903.9 kB directory vs 1.5 MB zip). The total duration of the analysis of this (relatively large) project is ~90 minutes, and ~76 of those are spent on compression.

Regarding the potential I/O issue, I’ll try running the same analysis on a different runner and see if I can reproduce the behaviour.

Would the full --verbose logs help the investigation?

Hi,

We are having the same problem.

11:27:13 INFO: Analysis report generated in 1169ms, dir size=7.8 MB
11:27:52 INFO: Analysis report compressed in 37157ms, zip size=3.0 MB
11:27:52 INFO: Analysis report uploaded in 257ms

1 second to generate the report, 37 seconds to compress it, and only a few hundred milliseconds to upload it…

Is it possible to disable compression?

No. The compression is just a regular zip file, so I’m having a hard time guessing what it could be. Ideally, we would be able to reproduce and/or profile it.

Hi, I’m also having this strange issue. I run the scanner in Docker locally, and the Docker resources are:
CPUs: 6
RAM: 8GB
SWAP: 2.5GB
Disk: 100GB

sonar-scanner-cli | INFO: CPD Executor CPD calculation finished (done) | time=2525ms
sonar-scanner-cli | INFO: Analysis report generated in 19904ms, dir size=22.5 MB
sonar-scanner-cli | INFO: Analysis report compressed in 552075ms, zip size=5.8 MB
sonar-scanner-cli | INFO: Analysis report uploaded in 1158ms

Everything goes smoothly (only 900 files are analyzed) until the report compression.
Any ideas?
Thank you!

Hi,
Heard about this issue from a friend, so I did some digging. I was able to replicate it on my Mac (local analysis), and a profile showed that the majority of the slowdown was in ZipOutputStream.putNextEntry, closeEntry and close. This got me thinking a little and looking at how the analysis works…

From my understanding, SonarScanner works in a .scannerwork folder inside the project working folder, which is usually mounted to /usr/src (although the GitHub Action mounts it to /github/workspace). This folder therefore sits behind a Docker volume (bind-mount), which on Mac specifically is known to have performance issues, and any I/O-heavy work across a Docker bind-mount is likely to incur a penalty.

When the scanner creates a decent number of report files (5,200 in a random example project) and then iterates over each one to add it to a zip (sonarqube/ZipUtils.java at master · SonarSource/sonarqube · GitHub), opening and closing a zip entry for every file, any per-file penalty of working in a bind-mount starts to become very noticeable.
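
To make the per-file cost concrete, here is a minimal sketch of what that zipping loop boils down to; this is my own illustration, not the real ZipUtils code, and the class and method names are made up. The point is one read plus a putNextEntry/closeEntry pair per report file, so any per-file latency on a bind-mounted working directory gets multiplied by thousands of entries.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Simplified sketch of zipping a report directory (names are mine, not the real ZipUtils).
public class ReportZipSketch {
  static void zipDir(Path reportDir, Path zipFile) throws IOException {
    try (ZipOutputStream zip = new ZipOutputStream(Files.newOutputStream(zipFile));
         Stream<Path> walk = Files.walk(reportDir)) {
      List<Path> reportFiles = walk.filter(Files::isRegularFile).collect(Collectors.toList());
      for (Path file : reportFiles) {
        zip.putNextEntry(new ZipEntry(reportDir.relativize(file).toString()));
        Files.copy(file, zip);   // one read of the report file from the (possibly bind-mounted) work dir
        zip.closeEntry();        // one entry finalized per file
      }
    }
  }
}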

So I’m wondering if something similar is happening here. GitHub-hosted runners may not show the issue, as their mount performance may be optimised; however, the [ self-hosted, node ] runners defined in the first post might well be affected.

So, a question to the SonarScanner team: is there a way to customise the working directory to be inside the container file system, something like /tmp/.sonar-scanner instead of the project directory, so as to avoid working in the bind-mounted path? If so, at a minimum this would help confirm that the bind-mount is the issue rather than something else.

Thanks for digging into the problem.
The work directory can be changed with the property sonar.working.directory. It can be absolute or relative to the base directory of the project being analyzed.
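
For example, with the workflow from the first post it should be possible to pass it through the action’s args input (the /tmp/.scannerwork value below is only an illustration; any path on the container’s local file system should do):

      - name: Run Sonar Scan
        uses: sonarsource/sonarqube-scan-action@master
        env:
          SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
          SONAR_HOST_URL: ${{ secrets.SONAR_HOST_URL }}
        with:
          args: -Dsonar.working.directory=/tmp/.scannerwork

or, equivalently, as a line in sonar-project.properties (keeping in mind that a relative value resolves against the project base directory, i.e. back onto the mount):

sonar.working.directory=/tmp/.scannerwork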

Awesome, setting this to /tmp solved the slowness on my Mac; compression now completes in the expected time (1 s for 6 MB instead of ~5 minutes). I suspect bind-mounts, especially across different file systems, are the cause here, but I’ll wait for Ciprian to confirm.

Amazing! Thank you for digging into this. By setting -Dsonar.working.directory="/tmp", not only does the report compression finish a lot faster, but most of the other steps finish quicker too! Problem solved.

Great find. I’m surprised this isn’t an obvious issue with any other task that works with the source directory, such as compilers.

The slight difference here is that the SonarScanner GitHub Action runs inside a Docker container, whilst most compiler-related GitHub Actions I’ve used download the build dependencies onto the runner, check out the source code, and then work on the “local” file system.

The main difference is that performance on the local filesystem (be it a physical/virtual server or inside a Docker container) is quick in any environment, but when going through a Docker bind-mount (docker run -v) there can be penalties.
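
If anyone wants to sanity-check that penalty on their own runner, a small micro-benchmark along the following lines should make the difference visible; this is a hypothetical snippet of mine, not scanner code. Run it once against the bind-mounted workspace and once against a path inside the container, such as /tmp, and compare the timings.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical micro-benchmark: create and re-read many small files under the
// directory given as the first argument, then print the elapsed time. Comparing a
// bind-mounted path with an in-container path like /tmp shows the per-file penalty.
public class MountPenaltyCheck {
  public static void main(String[] args) throws IOException {
    Path base = Paths.get(args.length > 0 ? args[0] : "/tmp/mount-check");
    Files.createDirectories(base);
    byte[] payload = new byte[4 * 1024]; // ~4 kB per file, similar to small report chunks
    long start = System.nanoTime();
    for (int i = 0; i < 5_000; i++) {
      Path file = base.resolve("file-" + i + ".bin");
      Files.write(file, payload);   // one write per file...
      Files.readAllBytes(file);     // ...and one read back, like the zipping step does
    }
    long elapsedMs = (System.nanoTime() - start) / 1_000_000;
    System.out.println("5000 files under " + base + ": " + elapsedMs + " ms");
  }
}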

Some actions may be getting away with it, as it mostly requires all of the following to be true for the issue to really show itself:

  • A self-hosted runner (GitHub-hosted runners don’t seem to have this penalty, at least for Alpine-based Docker containers)
  • A GitHub Action implemented as a Docker container action
  • The action performing heavy read/write file handling inside the mounted directory (instead of using a temp directory inside the Docker container)

Which, I imagine, is a rare combination for the majority of GitHub Actions.

We are working with a virtual Windows Server 2022 machine, with no Docker involved, and we are using the SonarScanner for MSBuild, which does not support sonar.working.directory. Therefore, the solution above does not apply to us.

After comparing different build systems and doing further investigation, we were able to find the cause and a solution: it is the integrated real-time protection (Windows Security Settings > Virus & threat protection > Virus & threat protection settings > Manage settings > Real-time protection). After deactivating it temporarily, we got the expected performance. Our long-term solution was to add the Jenkins workspace to the exclusion list.

Hi @miseas
Could you please let us know what your run environment is? Are you also using a self-hosted runner? Does it run on macOS?