Server-side analysis slow after upgrade from v9.9.1 LTS to v10.5.1 (Enterprise)

Hi there,

  • Enterprise Edition v10.5.1 (90531) with the following additional plugins:
    • Checkstyle 10.17.0
    • Dependency-Check 5.0.0
    • Groovy 1.8
    • OpenID Connect Authentication for SonarQube 2.1.1
    • YAML Analyzer 1.9.1
  • Custom Docker image (JDK 17)
  • Database PostgreSQL 16
  • We monitor the JRE metrics in Kubernetes using jmx_exporter; the heap max size of the CE is 2.5 GB and usage stays between 10% and 60%. The pod requests 12 GB of memory and 4 CPUs.

I looked into the background_tasks:

Android project (236k LOC):

  • v9.9.1 LTS analysis times between 1m and 2m15s.
  • v10.5.1 analysis times between 5m15s and 6m45s.

Spring-Boot Service I (1.8k LOC):

  • v9.9.1 LTS analysis times between 2s and 4s.
  • v10.5.1 analysis times between 4s and 6s.

Spring-Boot Service II (31k LOC):

  • v9.9.1 LTS analysis times between 6s and 16s.
  • v10.5.1 analysis times between 17s and 1m28s.

Any hints are very much appreciated.
Best Regards
Mirko

Have you run a VACUUM FULL ANALYZE against the Postgres database?

Thanks @Colin for the suggestion (and the link). Funnily enough, that came to my mind shortly before you replied. I will check whether it gets better and come back with results. The DB looks more or less idle, though.

1 Like

Yesterday I only executed VACUUM ANALYZE; I now see that you suggested a FULL vacuum. I will try that today after office hours.
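
For reference, a minimal sketch of what I have in mind (connection details are placeholders; a plain psql session would of course do the same):

# Run a full vacuum against the SonarQube database from Python.
# VACUUM cannot run inside a transaction block, so autocommit must be enabled,
# and VACUUM FULL takes exclusive table locks, so it is best done in a maintenance window.
import psycopg2

conn = psycopg2.connect(
    host="sonarqube-db.example.internal",  # placeholder host
    dbname="sonarqube",
    user="sonarqube",
    password="***",
)
conn.autocommit = True  # required, otherwise VACUUM refuses to run
with conn.cursor() as cur:
    cur.execute("VACUUM (FULL, ANALYZE);")
conn.close()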

1 Like

I think we have another issue. Before, we ran SonarQube in a single-node Kubernetes K3S cluster on a dedicated KVM under our control, and the DB was already the same one.

Now we have moved to a managed company K8S cluster, so the data directory used e.g. for ES8 storage is no longer a host directory but CEPH-based :frowning:

So we saved ourselves the machine operation and OS updates, but we lost performance.

We still had the VMs we used before, so we set up our QA instance there again instead of running it on K8S. There was no real difference, so it is definitely the major upgrade from SQ 9 LTS to 10.6 that makes the difference.
As said, the background task for a bigger project took one minute before and now takes between 6 and 16 minutes.

1 Like

I have now created a small Python script which extracts the times and sorts them (times < 1 second are not shown); a minimal sketch of it follows the results below:

  • Result for a Java multi-module project with 140k LOC:
❯ ./log-analyzer.py xxxx-master-sq9.log 
time=1.0s, message=Index analysis
time=1.1s, message=Extract report
time=1.9s, message=Persist sources
time=3.2s, message=Compute size measures on new code
time=5.7s, message=Compute new coverage
time=10.8s, message=Execute component visitors
time=27.2s, message=Persist live measures | insertsOrUpdates=128746
time=53.2s, message=Executed task | project=xxxx | type=REPORT | branch=master | branchType=BRANCH | id=AZAoyXYcQMlBGv_WzhQi | submitter=posartifactory
time=53.2s, message=TOTAL

❯ ./log-analyzer.py xxxx-master.log 
time=1.0s, message=Extract report
time=2.1s, message=Persist sources
time=2.2s, message=Load quality profiles
time=5.2s, message=Compute size measures on new code
time=9.0s, message=Build tree of components | components=3490
time=13.7s, message=Compute new coverage
time=45.1s, message=Execute component visitors
time=52.4s, message=Persist live measures | insertsOrUpdates=155669
time=133.9s, message=Executed task | project=XXX | type=REPORT | branch=master | branchType=BRANCH | id=b501c760-0671-437e-bb0a-e39b71336b83 | submitter=mam-local
time=133.9s, message=TOTAL
  • Result for an iOS project with 300k LOC:
❯ ./log-analyzer.py yyy-develop-sq9.log 
time=1.6s, message=Extract report
time=2.7s, message=Persist sources
time=3.3s, message=Compute size measures on new code
time=5.1s, message=Compute new coverage
time=16.7s, message=Execute component visitors
time=52.9s, message=Persist live measures | insertsOrUpdates=189951
time=86.1s, message=Executed task | project=YYYY | type=REPORT | branch=develop | branchType=BRANCH | id=AZAqVkBWQMlBGv_WzkEx | submitter=piosgitlab
time=86.1s, message=TOTAL

❯ ./log-analyzer.py yyy-develop.log 
time=1.4s, message=Extract report
time=2.0s, message=Load quality profiles
time=3.8s, message=Persist sources
time=14.1s, message=Compute size measures on new code
time=22.1s, message=Compute new coverage
time=46.0s, message=Build tree of components | components=6520
time=92.3s, message=Execute component visitors
time=117.3s, message=Persist live measures | insertsOrUpdates=238881
time=305.6s, message=Executed task | project=YYYY| type=REPORT | branch=develop | branchType=BRANCH | id=122111cb-3d51-46d6-8c7c-e393dbbc0afd | submitter=piosgitlab
time=305.6s, message=TOTAL
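
For reference, a minimal sketch of such a script (simplified; the regex assumes the usual ce.log line layout and may need adjusting):

#!/usr/bin/env python3
# Extract the per-step timings from a SonarQube ce.log excerpt, sort them by
# duration, hide everything under one second and print a TOTAL at the end.
import re
import sys

STEP = re.compile(r"\]\s+(?P<message>[^\[].*?)\s*\|\s*(?:status=\w+\s*\|\s*)?time=(?P<ms>\d+)ms")

def main(path: str) -> None:
    entries = []
    total_ms = 0
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            match = STEP.search(line)
            if not match:
                continue
            ms = int(match.group("ms"))
            message = match.group("message").strip()
            if message.startswith("Executed task"):
                total_ms = ms
            entries.append((ms, message))
    for ms, message in sorted(entries):
        if ms < 1000:  # steps faster than one second are not shown
            continue
        print(f"time={ms / 1000:.1f}s, message={message}")
    print(f"time={total_ms / 1000:.1f}s, message=TOTAL")

if __name__ == "__main__":
    main(sys.argv[1])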

Hi @Colin, sorry for addressing you directly:

  • It looks like all the steps are taking up to 4 times longer with SonarQube 10.
  • The DB is the same as before.

Any hints are appreciated.

Hey @mfriedenhagen

Sorry for the late response. A few points:

  • We have had a few reports of background task slowness after upgrading from 9.9 (possibly related to the introduction of new metrics), but nothing we could reproduce ourselves: SONAR-21949.
  • We intend to specifically improve the performance of Persist Live Measures with SONAR-22870 for SonarQube v10.8.
  • This user recently faced a slow environment and traced it back to network/disk latency.

I emphasize this last post as they faced similar issues to yours (lag on Execute component visitors and Persist live measures).
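
If you want to quickly sanity-check the storage side yourselves (for example the CEPH-backed data directory), a crude probe like the one below gives a rough idea of write/fsync latency. It is only a sketch, not a proper benchmark, and the path argument is simply wherever your SonarQube data directory is mounted:

#!/usr/bin/env python3
# Crude write/fsync latency probe for the volume backing a directory.
# Sequential small synchronous writes only; use a real tool (e.g. fio) for serious numbers.
import os
import sys
import time

def probe(directory: str, iterations: int = 200, size: int = 4096) -> None:
    path = os.path.join(directory, ".latency_probe")
    payload = os.urandom(size)
    samples = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
    try:
        for _ in range(iterations):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the write down to the device
            samples.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    samples.sort()
    print(f"fsync latency over {iterations} writes of {size} bytes:")
    print(f"  median = {samples[len(samples) // 2] * 1000:.2f} ms")
    print(f"  p99    = {samples[int(len(samples) * 0.99)] * 1000:.2f} ms")

if __name__ == "__main__":
    probe(sys.argv[1] if len(sys.argv) > 1 else ".")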

2 Likes

Hello @Colin,

I’m a colleague of @mfriedenhagen. I think we can rule out network and disk latency: after the problems appeared in Kubernetes, we installed SonarQube 10 for testing on the same machine where we had run SonarQube 9 without problems, and even there it was as slow as in the Kubernetes cluster.

Hey @fuvo and @mfriedenhagen

I don’t have any more advice right now. However, I am collecting data points from a few different sources (the community and our enterprise support channels) to provide a consolidated summary of these reports to our PMs/Devs. Stay tuned.

Hey @fuvo and @mfriedenhagen

While I haven’t been able to reproduce your exact performance degradation, I did some investigation.

I decided to take a big project (~4,000 files, 850k LoC; just a modified version of what’s in the SonarSource/sonar-scanning-examples repository on GitHub), run my own tests, and measure the time for:

  • First analysis of the main branch
  • Second analysis of the main branch
  • First analysis of a non-main branch (after the first two analyses)

This was all done on brand new SQ servers (no upgrading) against a local Postgres server. Admittedly, a fairly optimal setup.

I chose these versions to compare the last LTA (9.9) against 10.3 (where we did a lot of refactoring of the DB model), 10.4 (new measures, as noted in SONAR-21949), and 10.7 (the latest).

These are the results:

Version   1st Scan   2nd Scan   1st Scan, New Non-Main Branch
9.9.7     48s        39s        28s
10.3      49s        35s        70s
10.4      67s        50s        66s
10.7      58s        56s        97s

It definitely looks like something we should investigate, and I’ve passed this on to the right folks.

I would be curious to know if your performance issues also start only at 10.3/10.4, but I understand that it’s quite some work to reproduce.

I’ll keep you posted if something comes out of it, on top of the performance improvements we already expect in 10.8 with SONAR-22870.
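
By the way, if you want to collect the same kind of numbers from your own instances without digging through ce.log, the background-task durations are also exposed by the Web API. A rough sketch (api/ce/activity needs a token with administer permission; the URL and token handling here are just placeholders):

#!/usr/bin/env python3
# List the durations of recent successful analysis (REPORT) background tasks
# via the SonarQube Web API, for easy comparison between server versions.
import os
import requests

BASE_URL = os.environ.get("SONAR_URL", "http://localhost:9000")
TOKEN = os.environ["SONAR_TOKEN"]  # token of a user with administer permission

resp = requests.get(
    f"{BASE_URL}/api/ce/activity",
    params={"ps": 100},  # last 100 background tasks
    auth=(TOKEN, ""),    # token as username, empty password
    timeout=30,
)
resp.raise_for_status()

for task in resp.json().get("tasks", []):
    if task.get("type") != "REPORT" or task.get("status") != "SUCCESS":
        continue
    print(
        f"{task.get('componentKey', '?'):40} "
        f"branch={task.get('branch', 'main'):15} "
        f"time={task.get('executionTimeMs', 0) / 1000:.1f}s"
    )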

2 Likes

Hello @Colin,

thank you for the update! We’re looking forward to hearing from you.

Hey @mfriedenhagen and @fuvo

With 10.8 released, I’d love to get an update from you if you see any performance improvements.

Another update:

Hi @Colin,

Happy New Year, and sorry for not coming back earlier. We already switched our QA system to 10.8 and indeed saw performance improve by a factor of 3. So while it is still not as fast as in the past, it is definitely much better than before. Production will follow this week (the DB migration from 10.7 to 10.8 takes around 1.5 hours, so we need to announce this properly).

Best regards
Mirko

Hi @Colin,

I did a first test with sonarqube-25.1.0.102122; times are much better now: still 13% slower, but no longer a factor of 3!

Günter