Slow Java project analysis

Hi,

(using sonarqube 9.9)

Comparing the two projects I currently analyze, a Java project and a TypeScript project, the scan and analysis of the Java project is too slow.

The Java project is about 250,000 lines long and the JavaScript project is about 340,000 lines long.

The scan and analysis takes 8-12 minutes for the Java project and 3-4 minutes for the JavaScript project.
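
To put those numbers side by side, here is a rough back-of-the-envelope calculation of throughput in lines per minute. It only uses the line counts and durations quoted above; the function name `lines_per_minute` is made up for this sketch, and raw line counts are at best a crude proxy for analysis cost.

```python
def lines_per_minute(lines, fastest_min, slowest_min):
    """Return (low, high) throughput in lines per minute."""
    return lines / slowest_min, lines / fastest_min


# Figures quoted in this post: ~250,000 Java lines in 8-12 minutes,
# ~340,000 JavaScript lines in 3-4 minutes.
java_low, java_high = lines_per_minute(250_000, 8, 12)
js_low, js_high = lines_per_minute(340_000, 3, 4)

print(f"Java:       {java_low:,.0f} to {java_high:,.0f} lines/min")
print(f"JavaScript: {js_low:,.0f} to {js_high:,.0f} lines/min")
print(f"Ratio:      {js_low / java_high:.2f}x to {js_high / java_low:.2f}x")
```

By this measure the JavaScript project is processed somewhere between about 2.7 and 5.4 times faster per line.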

Server-side caching is enabled, and for the Java project I get the following log message:
INFO: Server-side caching is enabled. The Java analyzer was able to leverage cached data from previous analyses for 0 out of 4335 files. These files will not be parsed.

This log does not appear for the TypeScript project. Instead I get:
INFO: 2669/2669 source files have been analyzed
INFO: Hit the cache for 2585 out of 2669
INFO: Miss the cache for 84 out of 2669: FILE_CHANGED [84/2669]
INFO: Sensor TypeScript analysis [javascript] (done) | time=18951ms
Judging by these logs, the cache is being used there.
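
For comparison, the cache hit rate of each analyzer can be pulled out of the scanner output with a small script. This is only a sketch: the two regular expressions are written against the exact log wording quoted in this post, which may differ in other SonarQube versions, and the function name `cache_hit_rates` is made up.

```python
import re

# Patterns matching the two cache summaries quoted in this thread.
JAVA_CACHE = re.compile(
    r"leverage cached data from previous analyses for (\d+) out of (\d+) files"
)
JS_CACHE = re.compile(r"Hit the cache for (\d+) out of (\d+)")


def cache_hit_rates(log_text):
    """Return {analyzer: (hits, total, percent)} for each summary found."""
    rates = {}
    for name, pattern in (("java", JAVA_CACHE), ("javascript", JS_CACHE)):
        match = pattern.search(log_text)
        if match:
            hits, total = int(match.group(1)), int(match.group(2))
            percent = 100.0 * hits / total if total else 0.0
            rates[name] = (hits, total, round(percent, 1))
    return rates


# Log lines copied from this post.
sample = """\
INFO: Server-side caching is enabled. The Java analyzer was able to leverage cached data from previous analyses for 0 out of 4335 files. These files will not be parsed.
INFO: Hit the cache for 2585 out of 2669
INFO: Miss the cache for 84 out of 2669: FILE_CHANGED [84/2669]
"""

for analyzer, (hits, total, percent) in cache_hit_rates(sample).items():
    print(f"{analyzer}: {hits}/{total} files from cache ({percent}%)")
```

On the lines above this reports 0.0% for Java and 96.9% for JavaScript.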

In both projects, the main branch is scanned and analyzed daily, and the logs above are from the feature branch.

If you have any idea why the Java project is slow, or if you know how to make the Java analysis use the cache, please reply.

Thank you.

Hi,

Welcome to the community!

You’re making an apples-to-oranges comparison here. Java and JavaScript analyses do different things, and they do them differently. It’s possible you could speed up the Java analysis by giving the analysis more memory. Check the docs for your scanner for details on that.

Regarding the cache, it’s only used during PR analysis, so there’s nothing you can do in a branch analysis scenario to force its use.

I know these aren’t the answers you want, but they’re the only ones I’ve got.

 
Ann

Hi,

Thank you for your response.

I’m sorry my explanation was incomplete.
I posted the question because both the Java and the JavaScript analyses are PR analyses, and the Java one doesn’t seem to use the cache.

Java
16:56:23 INFO: Server-side caching is enabled. The Java analyzer was able to leverage cached data from previous analyses for 0 out of 4378 files. These files will not be parsed.

JavaScript
17:12:04 INFO: 2678/2678 source files have been analyzed
17:12:04 INFO: Hit the cache for 2644 out of 2678
17:12:04 INFO: Miss the cache for 34 out of 2678: FILE_CHANGED [34/2678]

As shown above, the Java analysis appears to be unable to use the cache.
How do I determine what went wrong?

Hi,

The underlying branch has to have been analyzed first for a cache to be available. We’re talking about two different projects here? One for Java and one for JavaScript? That could explain the difference.

 
Ann

Hi,

I apologize for the delay in responding.
You are correct.
They are two separate projects with the same setup, except that one is Java and one is JavaScript, and in both cases I ran the analysis on a PR from another branch to the main branch.
Do you happen to know what the difference between a Java project and a JavaScript project is?

Hi,

The two analysis engines are written entirely differently in different technologies, analyzing different languages.

 
:woman_shrugging:
Ann

Hi,

Is analyzing Java inherently slower than analyzing JavaScript?
I don’t understand why the JavaScript analysis is faster than the Java one even though the JavaScript project has more LOC.

Hi @younghoon,

As @ganncamp explained, it is normal to have different speeds of analysis for different languages.

However, if you feel the analysis is too long for PRs, let’s check to see if there is something strange in your logs or in your setup.

Some exploration questions:

  • Could you please share your full analysis logs at the DEBUG level so we can look into this?
  • Can you also share more info about your setup? Where is the analysis happening? What memory have you configured? What is the number of CPUs?
  • Do you have external plugins?
  • How much time does it take to do the full analysis in the same codebase?
  • Are these small PRs with a few code changes? Are the changes likely to impact other parts of the code? Any more details could help.
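
While collecting those logs, it can help to see where the time actually goes. The sketch below ranks sensors by duration, assuming the "Sensor ... (done) | time=...ms" line format shown earlier in this thread. The function name `slowest_sensors` is made up, and only the first sample line comes from the thread; the other two are invented placeholders.

```python
import re

# Matches lines such as:
# INFO: Sensor TypeScript analysis [javascript] (done) | time=18951ms
SENSOR_DONE = re.compile(r"Sensor (.+?) \(done\) \| time=(\d+)ms")


def slowest_sensors(log_text, top=5):
    """Return (sensor name, seconds) pairs, slowest first."""
    timings = [
        (match.group(1), int(match.group(2)) / 1000.0)
        for match in SENSOR_DONE.finditer(log_text)
    ]
    return sorted(timings, key=lambda item: item[1], reverse=True)[:top]


# The first line is quoted from this thread; the other two are hypothetical.
sample = """\
INFO: Sensor TypeScript analysis [javascript] (done) | time=18951ms
INFO: Sensor Hypothetical sensor A [example] (done) | time=420000ms
INFO: Sensor Hypothetical sensor B [example] (done) | time=1500ms
"""

for name, seconds in slowest_sensors(sample):
    print(f"{seconds:8.1f}s  {name}")
```

Running this over the full Java log should show whether a single sensor dominates the 8-12 minutes or the time is spread across several steps.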

Just so you know, for a Pull Request, we expect that the number of lines changed is what is mostly considered. However, technically, the number of lines impacted by the change also has an effect. For Java, we have multi-file analysis that looks for potential vulnerabilities and potential runtime issues in your code. So, changing a line could require checking many files.
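
To illustrate why one changed line can pull in many files, here is a toy model of impact propagation. It is not how the Java analyzer is implemented; it is just a breadth-first walk over a hypothetical "who uses this file" map, and every file name and the function `impacted_files` are invented for the example.

```python
from collections import deque


def impacted_files(changed, dependents):
    """Breadth-first walk over a reverse dependency map.

    `dependents` maps a file to the files that use it directly.
    Returns the changed files plus everything that depends on them,
    directly or transitively.
    """
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for user in dependents.get(current, ()):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return seen


# Hypothetical project layout: all names below are invented.
dependents = {
    "Util.java": ["OrderService.java", "UserService.java"],
    "OrderService.java": ["OrderController.java"],
    "UserService.java": ["UserController.java"],
}

print(sorted(impacted_files(["Util.java"], dependents)))
print(sorted(impacted_files(["OrderController.java"], dependents)))
```

In this toy layout, touching the shared utility file reaches all five files, while touching a controller reaches only itself. That is the kind of difference that can make two PRs of similar size take very different amounts of time.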