What are you trying to achieve?
I’m using sonarqube-scan-action to run analysis on a fairly big, multi-app TypeScript project using GitHub Actions.
The problem I’m facing is that the job takes an unusually long time to complete; in particular, the compression step takes far longer than I would have expected: ~1 hour 16 minutes.
Why does the analysis report compression take so long, and is this the expected behaviour?
...
16:01:41.808 INFO: CPD Executor Calculating CPD for 0 files
16:01:41.809 INFO: CPD Executor CPD calculation finished (done) | time=0ms
16:01:41.852 DEBUG: SCM revision ID '9890897f3b396d9843508183799560a36f0aeaf5'
16:03:34.788 INFO: SCM writing changed lines
16:03:34.819 DEBUG: loading config FileBasedConfig[/tmp/jgit/config]
16:03:34.915 INFO: Merge base sha1: f7b4ad78190dab72cd8f1dc318b0a7c8297e3bb1
16:03:34.920 DEBUG: SCM reported changed lines for 0 files in the branch
16:03:34.920 INFO: SCM writing changed lines (done) | time=132ms
16:03:34.998 INFO: Analysis report generated in 113110ms, dir size=903.9 kB
17:19:40.655 INFO: Analysis report compressed in 4565654ms, zip size=1.5 MB
17:19:40.661 INFO: Analysis report generated in /github/workspace/.scannerwork/scanner-report
17:19:40.661 DEBUG: Upload report
17:19:40.876 DEBUG: POST 200 ***api/ce/submit?projectKey=my_project&projectName=my_project&characteristic=pullRequest%3D27132 | time=213ms
17:19:40.881 INFO: Analysis report uploaded in 217ms
17:19:40.913 DEBUG: Report metadata written to /github/workspace/.scannerwork/report-task.txt
17:19:40.913 INFO: ANALYSIS SUCCESSFUL, you can browse ***
17:19:40.913 INFO: Note that you will be able to access the updated dashboard once the server has processed the submitted analysis report
17:19:40.913 INFO: More about the report processing at ***
17:19:40.916 DEBUG: Post-jobs :
17:19:40.935 DEBUG: eslint-bridge server will shutdown
17:21:17.403 INFO: Analysis total time: 1:31:33.287 s
17:21:17.407 INFO: ------------------------------------------------------------------------
17:21:17.407 INFO: EXECUTION SUCCESS
17:21:17.407 INFO: ------------------------------------------------------------------------
17:21:17.407 INFO: Total time: 1:31:38.068s
17:21:17.483 INFO: Final Memory: 39M/134M
17:21:17.483 INFO: ------------------------------------------------------------------------
The compression is really just adding the report files into a zip archive.
Is the rest of the analysis also very slow, possibly indicating a problem with I/O?
Yep, it happens every single time, and the duration is fairly consistent across runs.
Thanks, removed it.
That is what I was thinking as well; there is no way the compression duration is warranted. I’m not even sure why the report size goes up after compression (903.9 kB dir, 1.5 MB zip). The total analysis of this (relatively large) project takes ~90 minutes, of which ~76 are spent on compression.
Regarding the potential I/O issue - I’ll attempt running the same analysis on a different runner, and see if I can reproduce the behaviour.
Would the full --verbose logs help the investigation?
No. The compression is just a regular zip file so I’m having a hard time guessing what it could be. Ideally we would be able to reproduce and/or profile it.
Hi,
Heard about this issue from a friend, so I did some digging. I was able to replicate the issue on my Mac (local analysis), and a profile showed the majority of the slowdown was in ZipOutputStream.putNextEntry, closeEntry and close. This got me thinking and looking at how the analysis works…
From my understanding, SonarScanner works in a scanner work folder inside the project working folder, usually mounted to /usr/src (although the GitHub action mounts to /github/workspace). This folder is mounted via Docker’s volume mechanism, which on macOS specifically is known to have performance issues, and any intensive work across a Docker bind mount is likely to incur a penalty.
When SonarScanner creates a decent number of files (5,200 in a random example project) and then iterates over each file to add it to a zip (sonarqube/ZipUtils.java at master · SonarSource/sonarqube · GitHub), creating and closing a stream entry per file, any penalty from working in a bind mount starts to become very noticeable.
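For anyone curious, the shape of that loop can be sketched with plain java.util.zip (a minimal illustration, not the scanner’s actual code; the class name, entry names and file count are made up):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Sketch of a ZipUtils-style loop: one putNextEntry/write/closeEntry
// cycle per report file. Each cycle is a small, separate I/O operation,
// so with thousands of files any per-operation latency (such as a
// Docker bind mount adds) is paid thousands of times over.
public class ZipLoopSketch {
    public static byte[] zipMany(int fileCount, byte[] content) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bos)) {
            for (int i = 0; i < fileCount; i++) {
                zip.putNextEntry(new ZipEntry("report-" + i + ".pb"));
                zip.write(content);   // one small write per file
                zip.closeEntry();     // plus per-entry metadata I/O
            }
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] zipped = zipMany(5200, "payload".getBytes());
        System.out.println("entries: 5200, zip bytes: " + zipped.length);
    }
}
```

The zipping itself is cheap; the point is that the cost scales with the number of entries, so it is the per-operation filesystem overhead, not the compression, that blows up across a slow mount.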
So I’m wondering if something similar is happening here. GitHub-owned runners may not exhibit the issue, as their mount performance may be optimised, but the self-hosted node runners described in the first post might well be affected.
So, a question for the SonarScanner team: is there a way to move the working directory inside the container file system, something like /tmp/.sonar-scanner instead of the project directory, to avoid working in the bind-mounted path? At a minimum this would help confirm that the bind mount is the issue, rather than something else.
Thanks for digging into the problem.
The work directory can be changed with the property sonar.working.directory. It can be absolute or relative to the base directory of the project being analyzed.
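For example, either in the project configuration or on the command line (the /tmp path is illustrative; any path inside the container file system works):

```properties
# sonar-project.properties — keep the scanner's work directory off the
# bind-mounted project path
sonar.working.directory=/tmp/.scannerwork
```

The same property can be passed as `sonar-scanner -Dsonar.working.directory=/tmp/.scannerwork`.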
Awesome, setting this to /tmp solved the slowness on my Mac; compression now completes in the expected time (1 s for 6 MB instead of ~5 minutes). I suspect bind mounts, especially across different file systems, are the cause here, but I’ll wait for Ciprian to confirm.
Amazing! Thank you for digging into this. By setting -Dsonar.working.directory="/tmp", not only does the report compression finish a lot faster, but most of the other steps complete quicker too! Problem solved.
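In case it helps others on GitHub Actions: the property can be forwarded to sonarqube-scan-action through its args input (a sketch; the step name, action ref and secret names are the usual conventions, adjust to your setup):

```yaml
- name: SonarQube scan
  uses: sonarsource/sonarqube-scan-action@master
  env:
    SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
    SONAR_HOST_URL: ${{ secrets.SONAR_HOST_URL }}
  with:
    args: >
      -Dsonar.working.directory=/tmp/.scannerwork
```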
The slight difference here is that the SonarQube scan GitHub action performs a docker container run to do its work, whilst most GitHub Actions I’ve used download the build dependencies onto the runner, download the source code, and then work from the “local” file system.
The key point is that performance on the local filesystem (be it a physical/virtual server or inside a Docker container) is quick in any environment, but going through a Docker bind mount (docker -v) can carry real penalties.
Some actions may be getting away with it, as it mainly requires all of the following to be true for the issue to really present itself:
Self-hosted runner (GitHub-owned runners don’t seem to have this penalty, at least for Alpine-based Docker containers)
We are working with a virtual Windows Server 2022, with no Docker involved, and are using the SonarScanner for MSBuild, which does not support sonar.working.directory. Therefore, the current solution does not apply to us.
After comparing different build systems and further investigation, we were able to find the cause, and a solution. It is the integrated real-time protection (Windows Security Settings > Virus & threat protection > Virus & threat protection settings > Manage settings > Real-time protection). After deactivating it temporarily, we got the expected performance. Our long-term solution was to add the Jenkins workspace to the exclusion list.
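For reference, the exclusion can also be scripted with Defender’s built-in PowerShell cmdlets instead of clicking through the Security settings UI (run from an elevated prompt; the workspace path below is an example and will differ per setup):

```powershell
# Exclude the Jenkins workspace from Defender real-time scanning
Add-MpPreference -ExclusionPath "C:\Jenkins\workspace"

# Verify the exclusion was recorded
Get-MpPreference | Select-Object -ExpandProperty ExclusionPath
```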