Performance issue with C++ analysis on SonarQube 8.5

Hi there,

Having noticed that the latest SonarQube 8.5 release dropped overnight, I was keen to upgrade our Enterprise edition test instance as we were eager to evaluate the latest version of the CFamily language plugin. In particular we were interested to see if the delivery of https://jira.sonarsource.com/browse/MMF-2099 had improved the analysis times for our Visual C++ (MS Build) projects. Unfortunately it seems to have made things worse.

Using the same analysis settings (same Quality Profile, cache disabled) and instance type (AWS EC2 r5d.2xlarge), the analysis time of one of the projects I evaluated has nearly tripled. Consider the following log extracts for the analysis of this project:

SonarQube 8.4.2

INFO: Sensor CFamily [cpp]
15:44:19
  INFO: CFamily plugin version: 6.11.0.19130
15:44:19
  INFO: Using build-wrapper output: Z:\buildAgent\work\ef3d2ccf22bae763\.sonar_output\build-wrapper-dump.json
15:44:19
  INFO: Available processors: 8
15:44:19
  INFO: Using 8 threads for analysis according to value of "sonar.cfamily.threads" property.
15:44:22
  INFO: Cache is explicitly disabled: Optional[false]
....
  INFO: Subprocess(es) done in 419500ms
15:51:22
  INFO: 261 compilation units analyzed
15:51:22
  INFO: Sensor CFamily [cpp] (done) | time=422584ms

SonarQube 8.5

  INFO: Sensor CFamily [cpp]
14:33:08
  INFO: CFamily plugin version: 6.13.0.22261
14:33:08
  INFO: Using build-wrapper output: Z:\buildAgent\work\ef3d2ccf22bae763\.sonar_output\build-wrapper-dump.json
14:33:08
  INFO: Available processors: 8
14:33:08
  INFO: Using 8 threads for analysis according to value of "sonar.cfamily.threads" property.
14:33:10
  INFO: Cache is explicitly disabled: Optional[false]
....
  INFO: PCH: unique=2 use=224 (forceInclude=0,throughHeader=224,firstInclude=0) out of 261 (forceInclude=0,throughHeader=260)
....
14:53:57
  INFO: Subprocess(es) done in 1246579ms
14:53:57
  INFO: 263 compilation units analyzed
14:53:57
  INFO: Sensor CFamily [cpp] (done) | time=1249206ms
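
To put a number on the slowdown, the two final sensor lines can be compared directly. Here is a minimal Python sketch; the helper names are my own and not part of any SonarQube tooling, and the two strings are copied from the extracts above.

```python
import re

# Final sensor lines copied from the two log extracts above.
LOG_8_4_2 = "INFO: Sensor CFamily [cpp] (done) | time=422584ms"
LOG_8_5 = "INFO: Sensor CFamily [cpp] (done) | time=1249206ms"


def sensor_time_ms(line: str) -> int:
    """Extract the duration in milliseconds from a 'time=<n>ms' field."""
    match = re.search(r"time=(\d+)ms", line)
    if match is None:
        raise ValueError(f"no 'time=<n>ms' field in: {line!r}")
    return int(match.group(1))


def format_minutes(ms: int) -> str:
    """Render milliseconds as '<m>m<ss>s', rounded to the nearest second."""
    minutes, seconds = divmod(round(ms / 1000), 60)
    return f"{minutes}m{seconds:02d}s"


before = sensor_time_ms(LOG_8_4_2)
after = sensor_time_ms(LOG_8_5)
slowdown = after / before

print(f"8.4.2: {format_minutes(before)}")
print(f"8.5:   {format_minutes(after)}")
print(f"ratio: {slowdown:.2f}x")
```

That works out to roughly 7 minutes versus roughly 21 minutes for the CFamily sensor alone.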

I’ve evaluated a couple of other projects and, while they aren’t as badly affected as the example above, the performance is unfortunately always worse under SonarQube 8.5.

Do you have any suggestions/ideas as to what may be the root cause of the issue? I’d happily provide logs, extra info etc privately if it helps. I’d also be keen to know if the expectation was for there to be a noticeable performance improvement with this release (or at least under what situations it could be expected). For reference the delivery of https://jira.sonarsource.com/browse/MMF-2078 in 8.4.2 did see an improvement of 25-35% for the analysis time of our projects.

I look forward to your feedback!

Cheers,

Sam

Hello @Sam_Anthonisz,

Thanks for the valuable feedback! Your numbers align with our expectations from MMF-2078.

We have a guess that you can confirm with a quick check.
You are jumping from 6.11 to 6.13. Version 6.12 implemented MMF-2091, an improvement that lets our symbolic execution run even in the presence of parsing errors. The time increase can be explained if the symbolic execution wasn’t running on your project before and started to run after the upgrade.

To confirm this guess, you should try to run the analysis with 6.12.

For MMF-2099, our expectation depends heavily on the type of project:

  • If the project doesn’t use PCH, we don’t expect any impact on performance.
  • If the project uses PCH, the improvement depends on how impactful this PCH is on the build time.

I’m looking forward to hearing from you.

Hi @Abbas_Sabra,

Thanks for your prompt and detailed response!

I have spent a bit of time comparing the 6.11, 6.12 and 6.13 versions of the CFamily plugin using the project in question and can report the following:

  • The increase in analysis time can definitely be attributed to the change in 6.12 - comparison against 6.11 shows the analysis time increasing from ~7m to ~23m
  • The output from running the analysis in debug mode is very verbose, with lots of references to external headers (e.g. Windows SDK, Boost) as well as (some of) the source files being analysed. It’s a bit hard to tell what counts as an “error”, though, so I would be happy to provide some logs privately if you could offer guidance on that.
  • (Re)Updating to 6.13 seems to result in a minor ~9% improvement in analysis times, which in the case of this project represents about 2 minutes.

So with the above in mind, I can accept that the increased analysis time is a result of MMF-2091 (and the improved accuracy that it brings). However, I still would have hoped (expected?) to see a more noticeable improvement from MMF-2099 in 6.13.

We currently have nearly 200 C++ projects being analysed in SonarQube, ranging in size from XS (< 1K lines of code) to XL (> 500K lines of code). Given the hierarchical nature of our projects, the vast majority use a PCH (stdafx.h) to aid build times. For example, in the case of the project I have been testing with, build times on my PC are 0m51s with the PCH active, or 6m48s when not using it. That’s a pretty massive difference in build times, but unfortunately we don’t seem to be seeing anywhere near that level of improvement in analysis times with MMF-2099.
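
For reference, here is the build-time speed-up from those two timings as a rough Python sketch. The helper name is purely illustrative, and the timings are the ones from my PC quoted above.

```python
import re


def to_seconds(duration: str) -> int:
    """Convert a '<m>m<s>s' string such as '6m48s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", duration)
    if match is None:
        raise ValueError(f"unexpected duration format: {duration!r}")
    minutes, seconds = (int(part) for part in match.groups())
    return minutes * 60 + seconds


with_pch = to_seconds("0m51s")      # MSVC build with the PCH active
without_pch = to_seconds("6m48s")   # MSVC build without the PCH
build_speedup = without_pch / with_pch

print(f"build with PCH:    {with_pch}s")
print(f"build without PCH: {without_pch}s")
print(f"speed-up from PCH: {build_speedup:.1f}x")
```

So the PCH makes the MSVC build about eight times faster, which is the scale of improvement I was hoping to see reflected in the analysis.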

I’d be happy to provide any further information where I can in order to help get a clearer understanding of whether this is expected, or if something is not quite right in the way things are operating. I can deploy any early access version of CFamily plugin to our test instance if that also helps.

Thanks again,

Sam

Hello @Sam_Anthonisz,

Thanks for spending time testing to confirm my hypothesis and for sending all the logs privately.

To be fair, the improvement that you got is what is expected. With this MMF we are trying to optimize the compilation part. Looking at your project, we can do some approximation. To keep it simple let’s say that the analysis = parse the code + run non-symbolic execution rules + run symbolic execution rules.

  • We can make a good guess that symbolic execution rules are taking 16 min (23 - 7), as they weren’t running on most translation units on 6.11.
  • We can also say that 7 min is spent on parsing + non-symbolic execution rules.
  • On your project, MMF-2099 is trying to optimize only part of those 7 min.

If you look at it from this perspective, getting 2 min improvement is a good/expected result. At the same time, I can understand why you hoped for more.
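
The approximation can be written out as a small Python sketch. It uses only the rounded figures from this thread, the variable names are illustrative, and the split itself is an estimate, not a measurement.

```python
# Approximate figures quoted in this thread, in minutes.
TOTAL_6_11 = 7.0   # parsing + non-symbolic rules (symbolic execution mostly not running)
TOTAL_6_12 = 23.0  # the same work plus symbolic execution
GAIN_6_13 = 2.0    # improvement reported when moving from 6.12 to 6.13

# Time attributed to symbolic execution rules: 23 - 7.
symbolic = TOTAL_6_12 - TOTAL_6_11

# MMF-2099 targets the compilation part, so it can only shrink this portion.
parse_and_rules = TOTAL_6_11

gain_vs_total = GAIN_6_13 / TOTAL_6_12
gain_vs_parse = GAIN_6_13 / parse_and_rules

print(f"symbolic execution: ~{symbolic:.0f} min")
print(f"gain relative to the whole analysis: {gain_vs_total:.0%}")
print(f"gain relative to parsing + non-symbolic rules: {gain_vs_parse:.0%}")
```

Measured against the whole analysis the gain is about 9%, but measured against the portion that MMF-2099 can affect it is closer to 29%.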

This is a good point. On our side, we rely on Clang to do the analysis, hence we use its implementation of PCH. You shouldn’t compare with MSVC here because you aren’t comparing the same thing. Currently, the MSVC implementation of PCH is superior.

You mentioned privately that you faced some trouble with the cache feature due to your project design and the usage of Conan. If you believe that there is something that can be improved with the cache feature, or even with other features, I encourage you to create a new post where you share your feedback!

Thanks,

Hi @Abbas_Sabra,

Thanks for the detailed response. At least the time increase can be attributed to something and it makes sense that the PCH doesn’t contribute to improving the symbolic execution portion of the analysis. In light of your response I just had a couple of related questions:

  • Is there a way in the SonarQube UI to determine which Rules are related to symbolic execution vs non-symbolic execution?
  • I did notice several occurrences of pchThroughHeader does not match first include in the debug output. Closer investigation shows this occurs on files that include stdafx.h with a different casing, e.g. StdAfx.h (which is allowed on Windows platforms and is also detected by S3806). I assume that the PCH detection is currently case-sensitive on the expected PCH filename?
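
To illustrate the second point, here is a small Python sketch of a case-sensitive versus case-insensitive check on the first include. This is only my assumption of how such a comparison might behave, not the plugin's actual implementation; the function names are hypothetical and the include detection is deliberately simplistic (it ignores comments and conditional compilation).

```python
import re
from pathlib import PureWindowsPath

# Matches lines such as: #include "StdAfx.h"  or  #include <vector>
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^">]+)[">]')


def first_include(source: str):
    """Return the target of the first #include line, or None if there is none."""
    for line in source.splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            return match.group(1)
    return None


def matches_through_header(source: str, through_header: str, case_sensitive: bool) -> bool:
    """Hypothetical check: does the first include name the PCH through-header?"""
    included = first_include(source)
    if included is None:
        return False
    # Compare file names only, so relative paths do not affect the result.
    included_name = PureWindowsPath(included).name
    expected_name = PureWindowsPath(through_header).name
    if case_sensitive:
        return included_name == expected_name
    return included_name.casefold() == expected_name.casefold()


source = '#include "StdAfx.h"\n#include <vector>\n'
print(matches_through_header(source, "stdafx.h", case_sensitive=True))
print(matches_through_header(source, "stdafx.h", case_sensitive=False))
```

With a case-sensitive comparison the file above would not be recognised as using the PCH, whereas a case-insensitive one would accept it.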

Thanks again for all your help.

Cheers,

Sam

Hello @Sam_Anthonisz,

No, not currently. We are considering tagging them. If you are interested, here is the list: Symbolic execution rules in CFamily 6.13.txt

Yes, this is true. It explains the message you see in the debug log. Here is the ticket; it is going to be fixed in the next version.
As discussed, this is not a good practice and is specific to MSVC/Windows. Fixing the include to match the filename will also fix the problem.

Thanks,

Hi @Abbas_Sabra,

Thank you for providing that list - it is much appreciated.

Thanks also for raising that ticket. I agree it is not a good practice and is specific to MSVC/Windows, but it is still likely to catch others out. We will update our sources in the meantime. It certainly looks like there are some good things coming in the 6.14 release.

Cheers,

Sam

Hi,

Just providing an update on this. I thought you’d like to hear that I can report some very good gains with one of our large projects. It has a large, complex PCH file, and the total analysis time has dropped by ~32 minutes when comparing 6.11 to 6.13. This represents a ~25% reduction in analysis time, which is a great result from the implementation of MMF-2099. Of course, using the cache feature will only improve this further 👍.

Thanks again,

Sam
