Even though we run an Enterprise instance, we still need some custom plugins.
We run one central SonarQube instance that is accessed from several network zones, which means proxies and web gateways (virus scanning, …) are involved, so network traffic is slowed down.
The Sonar analysis fails with a socket timeout after 5 minutes because it cannot finish downloading one plugin, of type 'external', that isn't even needed for the analysis.
I've tried increasing sonar.ws.timeout and http.socket.timeout, but neither worked.
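For reference, this is roughly how I passed the increased timeouts, assuming both can be set as -D analysis parameters (the values are illustrative):

```
# Illustrative values for the two properties mentioned above,
# passed on the scanner command line.
sonar-scanner \
  -Dsonar.ws.timeout=600 \
  -Dhttp.socket.timeout=600
```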
I searched and found a workaround using curl:
`curl -vvv -u squ_xxxxxxxx: --proxy http://xxx:xxx -o xxx-plugin.jar https://xxx/api/plugins/download?plugin=xxx`
The plugin is downloaded in 5 min 30 s, and curl uses exactly the same network settings as the Sonar scanner.
It seems the Sonar scanner doesn't respect an increased sonar.ws.timeout value!?
We run our workloads on ephemeral containers, so to reduce the number of downloads over the Internet we use a cache archive (local to our network) to pre-load the plugins onto our containers before calling the Sonar runner. I recommend you do something similar so that you are less likely to hit timeouts or similar issues when fetching binaries over the Internet. If the cache is not up to date, the runner will download the correct binaries as needed, and the next cache refresh should pick up the updated ones.
The actual implementation will depend on your CI/CD infrastructure.
The gist of it is (a sketch follows the list):
- have a pipeline that runs on a regular basis (once a week is fine) and can be re-run on demand.
- run the Sonar runner against some code in the languages you want to support, or against your main repository.
- create an archive from the cache directory.
- push the archive to a shared location.
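A minimal sketch of such a build job, assuming a POSIX shell; the archive name and the S3 bucket are placeholders for whatever shared storage you use:

```
#!/bin/sh
set -eu

# Warm the plugin cache: run the scanner once so it downloads every plugin.
# With no project configured this exits non-zero, hence the ignored failure
# (more on that further down in the thread).
sonar-scanner || true

# Archive the populated cache with zstandard, skipping the unpacked
# *.jar_unzip folders to keep the archive small.
tar --exclude='*.jar_unzip' --zstd -cf sonar_cache.tar.zst -C "${HOME}/.sonar" cache

# Push the archive to the shared location (bucket name is a placeholder).
aws s3 cp sonar_cache.tar.zst s3://my-ci-artifacts/sonar_cache.tar.zst
```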
To re-use the cache, before calling the Sonar runner (again sketched below):
- pull the most recent cache from the shared location.
- extract the cache into the home directory.
- run the Sonar runner.
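And the matching restore step, again a sketch with the same placeholder names:

```
#!/bin/sh
set -eu

# Pull the most recent cache archive from the shared location.
aws s3 cp s3://my-ci-artifacts/sonar_cache.tar.zst .

# Extract it into the scanner's default cache location, "${HOME}/.sonar/".
mkdir -p "${HOME}/.sonar"
tar --zstd -xf sonar_cache.tar.zst -C "${HOME}/.sonar"

# Run the analysis; the scanner now finds the plugins in its local cache.
sonar-scanner
```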
On the pipelines that call Sonar, we fetch the archive and extract it into the Sonar cache directory before invoking the scanner. When the Sonar runner starts it detects the cached plugins and in most cases does not need to download them. By default the cache is under "${HOME}/.sonar/".
Since we run Sonar for most of our repositories, and we run pipelines through Jenkins on EKS in AWS, we ended up creating a custom Jenkins library so that our Sonar stages only need to declare installSonarPluginsCache() before calling our custom sh 'sonar-scanner.sh'. The cache is generated by a custom sonar_cache pipeline that runs once a week and takes under 2 minutes to complete.
For compressing we use zstandard, which ends up being much smaller than a regular tgz. We use `tar --exclude='*.jar_unzip' --zstd -cvf sonar_cache.tar.zst -C "${HOME}/.sonar" cache` to create the file and `tar --zstd -C "${HOME}/.sonar" -xf "${cacheArchive}"` to decompress it.
You can see if the cache is working by looking at the output of the runner.
Without the cache, downloading from the Internet took 20s:
INFO: User cache: /home/ci/.sonar/cache
INFO: Load/download plugins
INFO: Load plugins index
INFO: Load plugins index (done) | time=660ms
INFO: Load/download plugins (done) | time=20093ms
With the cache:
INFO: User cache: /home/ci/.sonar/cache
INFO: Load/download plugins
INFO: Load plugins index
INFO: Load plugins index (done) | time=236ms
INFO: Load/download plugins (done) | time=512ms
In our case pulling the cache and extracting it takes 5-6 s, so we save about 15 s per pipeline on average. The cache folder is 291 MB and the compressed archive is 251 MB; the plugins are already gzip-compressed, but zstandard manages to compress them a bit further.
Thank you for the very detailed description.
When you run sonar-scanner on some random or real-life code, this will also create a new project entry on the server. Did you find a way to circumvent this?
You are correct; we actually run it with no project configured. The runner fetches all the plugins before checking whether the project exists:
INFO: ------------------------------------------------------------------------
INFO: EXECUTION FAILURE
INFO: ------------------------------------------------------------------------
INFO: Total time: 30.257s
INFO: Final Memory: 20M/53M
INFO: ------------------------------------------------------------------------
ERROR: Error during SonarScanner execution
ERROR: Project not found. Please check the 'sonar.projectKey' and 'sonar.organization' properties, the 'SONAR_TOKEN' environment variable, or contact the project administrator
ERROR:
ERROR: Re-run SonarScanner using the -X switch to enable full debug logging.
We pretty much run it with no project configured, ignore the non-zero exit code, and then proceed with archiving the freshly populated cache.
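In shell terms the warm-up run boils down to something like this (sketch):

```
# No project is configured, so the scanner fails with "Project not found",
# but only after it has populated the plugin cache; ignore the exit code.
sonar-scanner || true
```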
As a follow-up, I learned yesterday that we're planning to start work “soon” on limiting downloads to only those plugins that are needed for the current analysis. The work will be done first on the SonarCloud side and then, at some point, ported over to SonarQube.
I have no dates or versions for you, so don’t bother asking.
@ganncamp limiting the download to only the needed plugins would be good; on the other hand, it means our current caching trick will break (we would analyze nothing) and we will need to change how we generate the cache. Would there be a way to tell the Sonar runner to only download the plugins for a specific set of languages?
Something like --download-plugins=python,go,terraform,xml,nodejs. This would allow us to cache the plugins more easily, but also to keep different caches for our microrepos that use fewer plugins.
I’ve just talked to the guy who’s writing the spec. We’re not going to make special provision for this, but he believes that once the changes we’re contemplating are in place, this will be scriptable.