Troubleshooting lengthy Java analysis

(Oliver Goroll) #1

I’m not quite sure what to make of the log messages. There are some MaximumStepsReachedException entries; for one class, three of them follow each other directly, 150 ms and 60 ms apart, so they do not account for much time.
What we see a lot are messages about classes that were not found. They take the form
.class not found for java.lang.OurOwnClass, or
.class not found for, or
.class not found for java
.class not found for java.util (for packages nested more deeply this is logged for each package), or
.class not found for
In the 10 seconds between progress reports (INFO: N/M files analyzed...), sometimes two and sometimes four or six files are analyzed; the number of failed class lookups in between appears unrelated to that.
Do you have any further advice on how to check what takes up the time?
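
One way to check whether throughput and failed lookups are related is to summarize the log per progress interval. The following is a minimal sketch, not part of any Sonar tooling: the class name ProgressLogSummary is hypothetical, and it assumes that progress lines contain "N/M files analyzed" and that lookup failures contain ".class not found for", as quoted above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ProgressLogSummary {
    // Assumed shape of a progress line: "INFO: 120/30000 files analyzed..."
    private static final Pattern PROGRESS = Pattern.compile("(\\d+)/(\\d+) files analyzed");
    // Assumed marker of a failed class lookup, as quoted in the post.
    private static final String LOOKUP_FAILURE = ".class not found for";

    /** Files analyzed and failed lookups between two consecutive progress reports. */
    static final class Interval {
        final int filesAnalyzed;
        final int failedLookups;

        Interval(int filesAnalyzed, int failedLookups) {
            this.filesAnalyzed = filesAnalyzed;
            this.failedLookups = failedLookups;
        }
    }

    static List<Interval> summarize(List<String> lines) {
        List<Interval> intervals = new ArrayList<>();
        int previousCount = -1; // stays -1 until the first progress report is seen
        int failedLookups = 0;
        for (String line : lines) {
            if (line.contains(LOOKUP_FAILURE)) {
                failedLookups++;
                continue;
            }
            Matcher m = PROGRESS.matcher(line);
            if (m.find()) {
                int count = Integer.parseInt(m.group(1));
                if (previousCount >= 0) {
                    intervals.add(new Interval(count - previousCount, failedLookups));
                }
                // Lookups logged before the first report are dropped here as well.
                previousCount = count;
                failedLookups = 0;
            }
        }
        return intervals;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ProgressLogSummary.java <path-to-analysis-log>
        for (Interval i : summarize(Files.readAllLines(Paths.get(args[0])))) {
            System.out.println(i.filesAnalyzed + " files, " + i.failedLookups + " failed lookups");
        }
    }
}
```

If the two columns show no correlation over a long stretch of the log, the failed lookups are probably not what the time is spent on.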

(Nicolas Peru) #2

This seems rather slow, and it looks like semantic analysis is doing a lot of work. How is your project analyzed, and how do you provide bytecode to the analyzer?

(Nicolas Peru) #3

Another useful step in the investigation would be to deactivate the custom rules to see whether the culprit is among them: if a rule keeps a reference to a tree between files, you may end up with a memory leak that puts pressure on the GC.
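
To illustrate the kind of leak meant here, a minimal self-contained sketch follows. It deliberately does not use the SonarJava API: Node, LeakyRule and FileScopedRule are hypothetical stand-ins, and the only point is the difference between state that accumulates across files and state that is released after each file.

```java
import java.util.ArrayList;
import java.util.List;

final class TreeRetentionSketch {
    /** Hypothetical stand-in for a syntax tree node; not the SonarJava Tree interface. */
    static final class Node {
        final String file;

        Node(String file) {
            this.file = file;
        }
    }

    /** Keeps every visited node in a field, so the trees of all analyzed files stay reachable. */
    static final class LeakyRule {
        final List<Node> visited = new ArrayList<>();

        void scanFile(List<Node> nodes) {
            visited.addAll(nodes); // never cleared: grows with every file
        }
    }

    /** Same bookkeeping, but the references are dropped as soon as the file is done. */
    static final class FileScopedRule {
        final List<Node> visited = new ArrayList<>();

        void scanFile(List<Node> nodes) {
            try {
                visited.addAll(nodes);
                // ... per-file checks would run here ...
            } finally {
                visited.clear(); // lets the GC reclaim this file's tree
            }
        }
    }

    /** Builds a fake "tree" of nodeCount nodes for the given file name. */
    static List<Node> parse(String file, int nodeCount) {
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new Node(file));
        }
        return nodes;
    }
}
```

With around 30k files, a rule of the first kind would keep the trees of every file alive until the end of the analysis, which is the GC pressure described above.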

(Oliver Goroll) #4

Currently, we have about 750 individual modules that make up our full project. The subprojects are clustered into some generic categories, so that we get a folder structure like
root/{category}/plugins/{module}. Below each {category} are also the plugins containing the test files; below each {module} there are the src and bin folders for sources and class files, respectively. Initially the SonarScanner was configured as
I don’t know why it was done this way and I don’t know why the Scanner found any source file. However, analysis was performed with our own binaries missing. It was corrected as
Currently, I’m unsure if we even correctly configured our multi-module project.

We analyze our project using the SonarScanner for Jenkins which uses version of the SonarQube Scanner.

(Thanh Luong Giang) #5

To provide further information: we analyzed our project once with the custom rules and once without them. Unfortunately, the culprit is not among the custom rules.

(Dinesh Bolkensteyn) #6

Can you provide some overall numbers:

  1. How many lines of code does this project have?
  2. How many files does it consist of?
  3. How long does the analysis take?
  4. What is the hardware specification of the machine used for the analysis?
  5. Do you set any performance tuning related parameters (e.g. JVM memory) for the analysis?

Then to go a bit more into the details, can you have a look at the analysis logs and identify which stage of the analysis is taking up most of the time? If you are able to share the logs, I’d be happy to help with that (feel free to send me a direct message).

(Thanh Luong Giang) #7
  1. Around 3 million lines of code.
  2. It consists of ~30k+ files.
  3. The analysis takes around 62 hours and fails at the end.
    Between the 10-second progress reports we receive multiple "Warn: Too many duplication references on file for block at line x. Keep only the first 100 references." messages.

The longer the analysis runs, the more of these warnings appear. Could this be due to a wrong configuration?

We are using the following configuration:

  4. Processor: Intel Xeon CPU @ 3.60 GHz
    RAM: 64 GB

  5. We didn’t set any performance-tuning parameters for the analysis.

Sonar Plugin api 6.7
Sonar Java Plugin
Sonar Scanner

(Dinesh Bolkensteyn) #8

Can you try the following:

  1. Set the sonar.cpd.exclusions=**/* property to skip the duplication analysis entirely (this should remove all Warn: Too many duplication references on file messages).
  2. Update your SonarJava plugin to the latest version, 5.9.1.

If the problem persists after those 2 changes, it would be helpful if you could share the original logs of the analysis for further investigation.

(Dinesh Bolkensteyn) #10

Thank you for sharing the analysis logs privately. It indeed seems that around 6-7 seconds is being spent per file, which is unusually high.
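
As a rough cross-check against the totals reported earlier in the thread (about 62 hours for roughly 30k files), the average time per file can be computed directly. This is only a back-of-the-envelope sketch: the class name PerFileEstimate is hypothetical, and both inputs are the approximate figures quoted above, for a run that failed before completing.

```java
import java.time.Duration;
import java.util.Locale;

final class PerFileEstimate {
    /** Average wall-clock seconds per file for a run of the given total length. */
    static double secondsPerFile(Duration total, int files) {
        return total.getSeconds() / (double) files;
    }

    public static void main(String[] args) {
        // Approximate figures from the thread: about 62 hours, roughly 30k files.
        double avg = secondsPerFile(Duration.ofHours(62), 30_000);
        System.out.println(String.format(Locale.ROOT, "%.2f s per file", avg));
    }
}
```

That works out to about 7.4 seconds per file, which is in the same range as the 6-7 seconds visible in the logs.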

To help pinpoint the root cause, can you relaunch an analysis with all Java rules disabled? You can create a different quality profile and disable all rules there, and then set that project to use this new quality profile.

With this, we’ll be able to tell if the problem is coming from the rules or from the earlier stages of the analysis.

Thank you again