Performance guide for large project analysis

hosszubalazs · June 12, 2018, 7:26am

Hey Sonar Community!

Analyzing large projects can take a very long time. What are your experiences and tips for improving performance when running a full analysis on a very large Java project?

Is 20 minutes of analysis time for ~500k LoC expected? What actions can I take to reduce the analysis time?

I already exclude generated code and only analyse what is really needed. The heap size for Sonar has been increased.

A performance guide in the User Guide would be great, I couldn’t find such a thing: https://docs.sonarqube.org/display/SONAR/User+Guide
As an example, I really like the Gradle guide on performance improvement: https://guides.gradle.org/performance/

Thanks and cheers,
Balázs

Philippe_Cade · June 12, 2018, 7:51am

I can’t tell you how to make it faster, but I can share some experience.

We need a bit more than 3 hours to go through 5 million lines of code, so 20 minutes doesn’t sound so bad.

It all depends on how you want to integrate Sonar in your work. We told the programmers that we will only analyze their code at night so they know that they will have fresh results when they come in the morning. I think that most have gotten used to that.

We also use the SonarLint plugin in Eclipse to give them direct feedback on some of the more obvious issues.

hosszubalazs · June 12, 2018, 8:00am

Thanks for your input Philippe!
Good point regarding SonarLint, the continuous feedback during development is a game changer IMHO.

Nicolas_Peru · June 12, 2018, 11:24am

Hi,

This will be very hard to investigate without more details.
I had a look at some projects we analyze internally, and I can see a 3 MLoC project analyzed in ~2h and a 1 MLoC project in ~40min (with all rules activated), so your numbers are a bit high but not out of the ordinary.

simon.brandhof · June 13, 2018, 5:57am

Upgrading to the latest versions of SonarQube and the analyzer plugins may also help, as you would benefit from performance improvements.

oliver · June 15, 2018, 5:43am

Simply to provide another data point:

We have about three million lines of code. The SonarScanner (Version 3.0.3.778) takes

~40min for sources only
~1h 35 min with external libraries
~33h with external libraries and all binaries

with 382 active rules, some of them custom. Project analysis by the SonarQube (Version 6.72) server always takes approximately 40 minutes.
I cannot rule out that we configured the binaries incorrectly, though.
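
To put the numbers quoted so far on a common scale, here is a small sketch that converts each data point to lines of code per minute. The figures are the approximate ones from the posts above. The 180 minutes for the 5M LoC project is an assumption standing in for "a bit more than 3 hours", so that throughput is an upper bound. Rules, versions and hardware differ between these setups, so treat the result as a rough orientation only.

```python
# (label, lines of code, minutes), approximate values quoted in this thread.
DATA_POINTS = [
    ("~500k LoC in 20 min", 500_000, 20),
    # "a bit more than 3 hours": 180 min is an assumed lower bound on the time.
    ("5M LoC in a bit more than 3 h", 5_000_000, 180),
    ("3M LoC in ~2 h, all rules activated", 3_000_000, 120),
    ("1M LoC in ~40 min, all rules activated", 1_000_000, 40),
    ("~3M LoC in ~40 min, sources only", 3_000_000, 40),
]


def loc_per_minute(loc, minutes):
    """Average scanner throughput in lines of code per minute."""
    return loc / minutes


if __name__ == "__main__":
    for label, loc, minutes in DATA_POINTS:
        print(f"{loc_per_minute(loc, minutes):>10,.0f} LoC/min  {label}")
```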

hosszubalazs · June 18, 2018, 10:14am

Thanks a lot for all your inputs!

Is incremental analysis supported or planned?
Assume I have a Gradle project and want to analyse the code on every project in my CI. Sonar already knows my source code, so I imagine it could store a hash and easily check whether certain modules have changed. On big projects with small changes this would make a dramatic performance difference. Could this work?

Nicolas_Peru · June 19, 2018, 8:11am

Just to let you know that “incremental mode” can mean a lot of different things: we thought about this and ran some experiments, which brought their share of complexity.
As of today we have no clear plan for such a feature.

NicoB · June 19, 2018, 12:27pm

Nice thread. At this stage though I’d like to point out that any comparison that restricts itself to LOC versus analysis time is likely to be unreliable and will not give any interesting outcome. I’m pointing that out because a SonarQube analysis of a project involves so many components/factors:

obviously the versions of SonarQube and of the analyzers, as performance is continuously improved
the actual server-side configuration: which rules are activated in the Quality Profile? is duplication detection enabled? etc.
the number of extensions installed in the environment: custom plugins? coverage import? etc.

All of those factors contribute to the overall analysis and can make a difference in terms of timing. So whenever looking at performance aspects I would suggest taking a pragmatic approach.

Understand what is taking time

The analysis is made of the client-side scanner run and the server-side Background Task. How to investigate a long execution depends on which of the two takes the time:

client-side scanner run: enable debug logs (sonar.verbose=true) with timestamps, and nail down the piece which takes time. If it’s the actual code analyzer doing its job, then that part can indeed grow with the volume of the codebase
server-side background task: check the state of resources (CPU/RAM/IO), see if database interactions aren’t slow for some reason etc. Verbose logs can also help narrow down the lengthy part.
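
As a rough illustration of nailing down the slow piece on the client side, here is a minimal sketch that reads a timestamped scanner log and lists the largest gaps between consecutive lines. The line shape "HH:MM:SS.mmm LEVEL: message" is an assumption about the log format, and the function name and sample lines are made up for the example, so adjust the pattern to whatever your scanner actually prints.

```python
import re
from datetime import datetime, timedelta

# Assumed line shape: "HH:MM:SS.mmm LEVEL: message". Adjust to your own logs.
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3})\s+(\w+):\s+(.*)$")


def slowest_steps(lines, top=5):
    """Return the `top` largest gaps between consecutive timestamped lines.

    Each result is (seconds, message), where `message` is the line logged
    just before the gap, i.e. the step that was presumably running.
    """
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%H:%M:%S.%f")
            events.append((stamp, match.group(3)))

    gaps = []
    for (start, message), (end, _) in zip(events, events[1:]):
        delta = end - start
        if delta < timedelta(0):
            # The log carries no date, so assume a single midnight rollover.
            delta += timedelta(days=1)
        gaps.append((delta.total_seconds(), message))

    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]


if __name__ == "__main__":
    # Made-up sample lines, only to show the shape of the output.
    sample = [
        "10:00:00.000 INFO: Load project repositories",
        "10:00:02.500 INFO: Sensor A",
        "10:14:02.500 INFO: Sensor B",
        "10:14:03.000 INFO: Analysis report generated",
    ]
    for seconds, message in slowest_steps(sample):
        print(f"{seconds:8.1f}s  {message}")
```

Lines that do not match the pattern (stack traces, wrapped output) are skipped, so the gap is attributed to the last line that did carry a timestamp.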

Monitor

Whatever the context, the minute one starts to look into performance, monitoring goes hand in hand with it. There’s a good initial guide in the documentation. Ultimately those are pure monitoring/operational considerations for a Java application, i.e. first understand whether your performance feeling relates to system performance (CPU/RAM/IO) or to application performance (the product itself, but also its interaction with other components like the database).

NicoB · June 19, 2018, 12:33pm

Regarding incremental analysis: the risks are higher than the (potential) gains. The output of the analysis is not only a function of the code itself, it also depends on which rules are enabled in the Quality Profile, what the analysis settings are, etc. Even if the code doesn’t change, the analysis report can vary greatly if rules/settings/parameters were modified.

Also never forget that even if a file did not change, there could still be a new issue in that file: think of a function now deprecated in fileA. If fileB uses that function, it would raise a “do not use deprecated method” issue, even though fileB did not change. That’s the basic example, only to illustrate that (on top of the influence of server-side project settings) focusing on changed files is also not enough for any advanced analyzer that does some detailed/cross-file inspection (which many SonarAnalyzers do).
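
The fileA/fileB case can be sketched with a toy model. This is not how any SonarAnalyzer works: the "@deprecated" marker, the mini rule and all function names are invented for the illustration. It only shows that a content hash correctly reports fileB as unchanged while a cross-file rule still raises a new issue in it.

```python
import hashlib


def digest(text):
    """Content hash of one file."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_files(before, after):
    """Names of files that are new or whose content hash differs."""
    return {
        name
        for name, text in after.items()
        if name not in before or digest(before[name]) != digest(text)
    }


def deprecated_functions(files):
    """Toy convention: a 'def' line preceded by '@deprecated' is deprecated."""
    names = set()
    for text in files.values():
        lines = text.splitlines()
        for previous, current in zip(lines, lines[1:]):
            if previous.strip() == "@deprecated" and current.startswith("def "):
                names.add(current[4:].split("(")[0])
    return names


def issues(files):
    """Toy cross-file rule: flag every call to a deprecated function."""
    deprecated = deprecated_functions(files)
    found = set()
    for name, text in files.items():
        for line in text.splitlines():
            if line.startswith("def "):
                continue  # a definition is not a call
            for func in deprecated:
                if f"{func}(" in line:
                    found.add((name, func))
    return found


if __name__ == "__main__":
    before = {"fileA": "def helper():\n    pass\n", "fileB": "helper()\n"}
    after = {"fileA": "@deprecated\ndef helper():\n    pass\n", "fileB": "helper()\n"}

    print("changed files:", changed_files(before, after))
    print("new issues:", issues(after) - issues(before))
```

An analysis restricted to the changed files would only look at fileA and miss the new issue, which is located in fileB.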

NicoB · June 20, 2018, 8:53am

3 posts were split to a new topic: Troubleshooting lengthy Java analysis

hosszubalazs · July 2, 2018, 12:17pm

Thanks a lot for your answers, great points!

In general the server-side background task seems to be fast enough; the majority of the time is spent on the client side. I’ll check the logging options to look into that: https://docs.sonarqube.org/display/SONAR/Analysis+Parameters#AnalysisParameters-AnalysisLogging
