Analysis cache for large C++ projects

We are currently using SonarQube 10.6 (zip) and SonarScanner for MSBuild 6.2. We are running Azure DevOps pipelines.

We are trying to set up a PR analysis that is fast and focuses only on changed files. We regularly run a full analysis against the target branch (develop). We are currently having trouble getting the analysis cache to work as we would expect: most of the time we get no cache hits on our PR builds, and otherwise the hits are fairly random.

What logs can we generate or provide to get more information about why the cache does not work for us? We have already ensured that the full analysis (at night) and the PR analysis run with exactly the same build settings, macros and parameters (the command line for the build wrapper is identical). What else might influence a cache hit? Might multiple build agents on the same server influence each other if they run at the same time? Where is the cache physically stored on the build server?

Any ideas/help appreciated!

Thanks!

It seems we got one step further in our analysis. Maybe someone can confirm our findings.
It seems that the cache is using full path names and not relative names, i.e. it only seems to find a cache hit if the full path to the file matches. However, we are using multiple build agents on one server, which have different root paths for their jobs. This seems to make the cache unusable, as we typically do not get cache hits.

Like “Cache hit for: D:\Work\CI_AGENT_3\3\s\Base\TheCore\TheXMLWriter.cpp”, but the next build, which has its sources at “D:\Work\CI_AGENT_1\3\s\Base\TheCore\TheXMLWriter.cpp”, wouldn’t give a cache hit, although branch and file are equal.

Are these assumptions of ours correct? Is there any chance to give the cache a root folder, so that it does not use the full path but a relative path starting from “s/*”, for example?

Thanks!

Hello @Wolfgang_Gogg,

Thanks for the detailed explanation.

What logs can we generate or provide in order to get more information about why the cache does not work for us

If you enable verbose logging (-X), we print a line with a high-level description explaining the cache miss for every compilation unit.

The cache uses full paths for headers that are not part of your project, and paths relativized to the project base directory (the path where you are running the analysis from) for headers that are part of the project.

This way, if you check out your project in different directories, the cache should still work.

Like “Cache hit for: D:\Work\CI_AGENT_3\3\s\Base\TheCore\TheXMLWriter.cpp”, but the next build has its sources at “D:\Work\CI_AGENT_1\3\s\Base\TheCore\TheXMLWriter.cpp” wouldn’t give a cache hit, although branch and file are equal.
Hence this case should also work, assuming you are running the analysis from Base.

This also means that if your compilation unit's dependencies are outside your project and don’t have a stable full path, that will be considered a dependency change.
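
To make the path rule above concrete, here is a minimal Python sketch of the idea. It is only an illustration under stated assumptions, not the analyzer's actual implementation: the helper name cache_path_key and the external header path are hypothetical, and the base directory is assumed to be the Base folder of each agent checkout.

```python
from pathlib import PureWindowsPath


def cache_path_key(file_path: str, base_dir: str) -> str:
    """Hypothetical helper: return the path form that would be compared.

    Files under base_dir are relativized to it; files outside keep
    their full path (so an unstable location changes the key).
    """
    path = PureWindowsPath(file_path)
    base = PureWindowsPath(base_dir)
    try:
        return path.relative_to(base).as_posix()
    except ValueError:
        # Not located under the project base directory: keep the full path.
        return str(path)


# Same project file checked out by two different agents.
key_agent_3 = cache_path_key(
    r"D:\Work\CI_AGENT_3\3\s\Base\TheCore\TheXMLWriter.cpp",
    r"D:\Work\CI_AGENT_3\3\s\Base",
)
key_agent_1 = cache_path_key(
    r"D:\Work\CI_AGENT_1\3\s\Base\TheCore\TheXMLWriter.cpp",
    r"D:\Work\CI_AGENT_1\3\s\Base",
)

# Hypothetical header living outside the project base directory.
ext_agent_3 = cache_path_key(
    r"D:\Work\CI_AGENT_3\3\ext\lib.h", r"D:\Work\CI_AGENT_3\3\s\Base"
)
ext_agent_1 = cache_path_key(
    r"D:\Work\CI_AGENT_1\3\ext\lib.h", r"D:\Work\CI_AGENT_1\3\s\Base"
)

print(key_agent_3 == key_agent_1)  # project file: keys match across agents
print(ext_agent_3 == ext_agent_1)  # external header: keys differ across agents
```

In this sketch the project file yields the same key on both agents, while the header outside the base directory does not, which is the situation that would show up as a dependency change.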

Note that the PR analysis uses the cache of the base branch. The analysis of the base branch updates the cache; the PR analysis doesn’t update it.

Let me know if this explains your case and if you need help interpreting the scanner verbose logs.

Hello, thanks for coming back to us. Can you tell me in more detail how exactly I can enable verbose logging? I currently use sonar.log.level=DEBUG on the SonarQubePrepare@6 task, but the log contains only minimal information about the cache and no hints about the reason for a cache miss. How do I correctly set the logging to verbose (-X)?

Thanks!

@Wolfgang_Gogg, it is a scanner property.

For sonar-scanner-cli it is:

If you need more debug information, you can add one of the following to your command line: -X, --verbose, or -Dsonar.verbose=true.

For Scanner for .Net:

/d:sonar.verbose=true

For the SonarQubePrepare@6 task, did you try setting this in the extra properties:
sonar.verbose=true

Thanks,

Hello, I attached a log that I tried to create with verbose logging. However, I cannot see any hint in the log as to why some files were cache misses. Is this the correct log, or am I still missing a setting for more information?

Log_verbose.zip (35.9 KB)

Thanks!

@Wolfgang_Gogg,

You have 65 compilation units.
63 out of them have a cache hit:

Analysis cache: 63/65 hits, 595957 bytes

You have two cache misses, the reason being that the source file content has changed:

2024-09-05T16:34:39.1539661Z 09:34:39.146 DEBUG: Cache miss for D:\Work\CI_AGENT_3\2\s\Base\TheCore\StdioFileEx.cpp: file content changed

and

2024-09-05T16:34:39.3368891Z 09:34:39.334 DEBUG: Cache miss for D:\Work\CI_AGENT_3\2\s\Base\TheCore\TheNodeName.cpp: file content changed

Thanks to the cache, the C++ analysis took 12 seconds:

2024-09-05T16:34:50.9811749Z 09:34:50.974 INFO: Analysis done in 12484ms

Are these the logs of the PR in which you believe the cache wasn’t working?
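
For larger logs, counting the miss reasons by hand gets tedious. Below is a rough Python sketch that tallies them, assuming the debug lines keep the "Cache miss for <path>: <reason>" shape shown above; the function name summarize_cache_misses is made up for this example, and the sample lines are copied from the log excerpts in this thread.

```python
import re
from collections import Counter

# Greedy path match: the split happens at the last ": " on the line, so the
# drive-letter colon in "D:\..." stays part of the path.
MISS_RE = re.compile(r"DEBUG: Cache miss for (?P<path>.+): (?P<reason>.+)$")


def summarize_cache_misses(log_lines):
    """Count cache-miss reasons found in scanner debug log lines."""
    reasons = Counter()
    for line in log_lines:
        match = MISS_RE.search(line.rstrip())
        if match:
            reasons[match.group("reason")] += 1
    return reasons


sample_lines = [
    r"2024-09-05T16:34:39.1539661Z 09:34:39.146 DEBUG: Cache miss for D:\Work\CI_AGENT_3\2\s\Base\TheCore\StdioFileEx.cpp: file content changed",
    r"2024-09-05T16:34:39.3368891Z 09:34:39.334 DEBUG: Cache miss for D:\Work\CI_AGENT_3\2\s\Base\TheCore\TheNodeName.cpp: file content changed",
    "2024-09-05T16:34:50.9811749Z 09:34:50.974 INFO: Analysis done in 12484ms",
]

miss_summary = summarize_cache_misses(sample_lines)
for reason, count in miss_summary.most_common():
    print(f"{count:5d}  {reason}")
```

To run it on a real log, pass the open file instead of sample_lines, for example summarize_cache_misses(open(log_path, encoding="utf-8")), where log_path is whatever file your pipeline log was saved to.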

No, these are from another repository, where we are trying to reproduce the issues we have with caching on a smaller scale. However, after some hiccups at the beginning, the cache seems to work properly there. For now I wanted to make sure I am generating the correct log files.
The analysis in the project where we are actually having issues currently runs for about 1 hour per build and has around 2500 compilation units.
I will create the proper logs there and share them.
Thanks for the ongoing support!


Hi Abbas, we now have a log with the debug information for cache misses. In such a case we get this entry for each and every file (2000+):

13:44:58.704 DEBUG: Cache miss for D:\Work\SonarQube\1\s\Administration\TheCmd\CSVUserImporter.cpp: response file not found, request changed

The reason seems to be that here the build agent has created "D:\Work\SonarQube\1" as its working directory, with all source files beneath it. Previous builds typically used the path "D:\Work\SonarQube\2" and showed proper cache hits:

01:42:34.624 INFO: [1/2177] Cache hit for: D:\Work\SonarQube\2\s\Administration\TheCmd\CSVUserImporter.cpp

However, whether a build agent uses 1, 2 or any other number as its build root folder is difficult for us to stabilize. This is down to the Azure DevOps build system. We also cannot put the sources anywhere outside the agent root folder, as the agent refuses to build the code in such cases.

Can you confirm that the different root path is most probably the issue for us? Does this fit the reported cache miss reason “response file not found, request changed”? Is there any way to tell the cache to use relative paths starting at a common root, which would be “/s/” in our case?

I can share the full success and error log file in case it is needed.

Thanks in advance.

@Wolfgang_Gogg,

From which directory are you running the analysis?

However, whether a build agent uses 1, 2 or any other number as its build root folder is difficult for us to stabilize.

The cache supports an unstable base directory, as we saw in your previous example, so it must be something else that is changing. My guess is that either you have headers outside the project directories with unstable paths, or there is a configuration change (a macro, for example) between the two runs.

To know for sure, I will need a reproducer archive for CSVUserImporter.cpp, both for a cache hit and for a cache miss.

To generate the reproducer archive, you need to set the property sonar.cfamily.reproducer to the absolute path of the file as printed in the analysis log: “D:\Work\SonarQube\1\s\Administration\TheCmd\CSVUserImporter.cpp”
If this succeeds, you should find the archive in the root directory of the repo.

Once you have them, I can send you a private message to upload the archives privately.

Thanks,