Cache hit rate very low when PR directory is different from main branch

ALM used

GitHub

CI system used

Jenkins

Languages of the repository

Java 21, Maven 3.8.7

We cannot get the PR incremental cache to work reliably on the CI, whatever we do. We have a project of several thousand files, so I will illustrate the poor cache hit rate with the logs of only one module.

I managed to reproduce the issue locally, and also to make the cache actually work in one specific case.

Working scenario

  • Checkout of main in the directory /home/project/app
  • Run a full analysis (to update the remote cache in SonarQube Cloud) :
    mvn -Dsonar.projectKey=xxx -DskipTests=true -T 1C --batch-mode -Dsonar.organization=xxxx clean package sonar:sonar
  • Create a PR from the main branch. Change only one line of one file.
  • Run a PR analysis
    mvn -Dsonar.pullrequest.key=21691 -Dsonar.pullrequest.branch=feature/PrThatShouldWork -Dsonar.pullrequest.base=main -Dsonar.projectKey=xxx -DskipTests=true -T 1C --batch-mode -Dsonar.organization=xxxx clean package sonar:sonar -X

Here, and only here, do I get a good cache hit rate:

Server-side caching is enabled. The Java analyzer was able to leverage cached data from previous analyses for 3430 out of 3951 files

Failing scenario

This is done right after the previous scenario.

  • Checkout of main in the directory /home/project/app2, at the exact same commit
  • Do not run a full analysis. The cache is already up to date and should be used
  • Create a PR from the main branch. Change only one line of one file (the same change as previously)
  • Run a PR analysis
    mvn -Dsonar.pullrequest.key=21692 -Dsonar.pullrequest.branch=feature/PrThatDoesNotWork -Dsonar.pullrequest.base=main -Dsonar.projectKey=xxx -DskipTests=true -T 1C --batch-mode -Dsonar.organization=xxxx clean package sonar:sonar -X

Now the cache hit rate is terrible:

Server-side caching is enabled. The Java analyzer was able to leverage cached data from previous analyses for 1539 out of 3951 files. These files will not be parsed.

We get the exact same result on every PR on our CI. This means that the PR builds take about 10 minutes instead of 4 to 5 minutes.
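
For reference, the hit rates behind those two log lines can be computed directly from the analyzer's summary line. This is only a minimal sketch: it assumes the line keeps the "for N out of M files" wording shown in my logs, and `hit_rate` is a helper name I made up, not something the scanner provides.

```shell
#!/bin/sh
# hit_rate: read scanner output on stdin and print the cache hit percentage.
# Assumption: the summary line contains "for <hits> out of <total> files",
# as in the log excerpts quoted in this post. The function name is illustrative.
hit_rate() {
  # Pull "<hits> <total>" out of the summary line, then compute the ratio
  # for the first matching line only.
  sed -n 's/.*for \([0-9][0-9]*\) out of \([0-9][0-9]*\) files.*/\1 \2/p' |
    awk 'NR == 1 && $2 > 0 { printf "%.1f\n", 100 * $1 / $2 }'
}

# Usage (the log file name is hypothetical):
# hit_rate < pr-analysis-debug.log
```

Applied to the two lines above, this gives 86.8 for the working scenario (3430 out of 3951) and 39.0 for the failing one (1539 out of 3951).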

In both scenarios, we use the same:

  • Project
  • JDK
  • Base branch with same commit
  • OS
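
To rule out a difference in the base commit, the two checkouts can be compared directly. This is a minimal sketch, not part of any Sonar tooling: `same_commit` is a helper name I made up, and it assumes the base branch is called main as in the commands above unless another ref is passed.

```shell
#!/bin/sh
# same_commit: succeed only if a ref resolves to the same commit in two checkouts.
#   $1, $2: the two checkout directories to compare
#   $3:     ref to compare (optional; defaults to "main", an assumption
#           taken from -Dsonar.pullrequest.base=main in the commands above)
same_commit() {
  ref=${3:-main}
  a=$(git -C "$1" rev-parse --verify "$ref") || return 2
  b=$(git -C "$2" rev-parse --verify "$ref") || return 2
  [ "$a" = "$b" ]
}

# Usage with the two directories from the scenarios above:
# same_commit /home/project/app /home/project/app2 && echo "same base commit"
```

If this succeeds while the hit rate still differs, the base commit can be excluded as the cause.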

The only difference that I can think of is the absolute path of the files. Is this how the Sonar remote cache works? Are we supposed to always run our CI build in the same path? It is possible that I am on the wrong track. I have the full debug logs of the 3 executions described above, and I can share any part that could help identify the issue.


Edit: Additional tests that seem to confirm the cache depends on the base directory

First test:

  • Deleted /home/project/app, which moved it to /home/xxx/.local/share/Trash/files/app
  • Ran the Sonar PR analysis from there
  • Poor cache hit rate

Second test:

  • Created a new /home/project/app (same name as in the first step of the working scenario) from scratch and checked out my app
  • Ran the Sonar PR analysis from there
  • Good cache hit rate

Hi,

Could you go ahead and share those debug analysis logs?

 
Thx,
Ann

I made a password-protected zip archive of those logs. Where can I share them safely?
They are several thousand lines long, so I’d rather not paste them publicly.

Hi,

I understand. I’ve flagged this for the team. They’ll ping you privately for the logs.

 
Ann

Hello @Adam_Birem
We also have a very low cache hit.
The absolute path changes between builds, so your theory potentially explains the problem we’re facing.
Thanks for sharing!

I’m positive the absolute path is responsible for the cache hit issue. To prove it, I ran this experiment:

  • Waited for the Jenkins build of the main branch, which ran on our CI agent in the directory /home/jenkins/workspace/ETECH_eTech_main
  • On my local computer, created the directory /home/jenkins/workspace/ETECH_eTech_main, checked out a branch, and ran a PR analysis.

The cache worked! We got a 90% cache hit rate.

The issue is that our platform team tells us that building the PR in a fixed directory is hard to implement on our CI. Is this behaviour simply a bug that should be fixed on the Sonar side, or is there a configuration of the Sonar scanner that would change the observed behaviour?

Hi @ganncamp

You mentioned flagging this for the team to contact me privately about the debug logs. I haven’t received any contact yet.

Just to confirm where we are: I’ve proven that the cache depends on the absolute path with our current configuration. I created /home/jenkins/workspace/ETECH_eTech_main on my local machine and got a 90% cache hit rate on the PR analysis.

The issue is that our platform team says building PRs in a fixed directory is hard to implement on our CI.

Is this behaviour simply a bug that should be fixed on the Sonar side, or is there a configuration of the Sonar scanner that would change the observed behaviour?

We’re losing 5-6 minutes on every PR build because of this. I still have the debug logs ready to share whenever someone reaches out.

Thanks,
Adam

Hi,

The delay is not unexpected; we’re at our annual company off-site this week. Hopefully they’ll be along next week.

 
Ann
