Analyzing a project of 1 million lines of code

FFlorian · April 3, 2019, 11:42am

Project description :

We currently have a large project, which forces us to split the analysis across several SonarQube “projects”.
Below are the details of the project, with volumes and languages,
followed by the way we analyze the code.

We use:

sonar version : 7.5.0.20543
bamboo : 5.10.3
maven : 3.3.9
sonar-maven-plugin : 3.6.0.1398
Java : 1.8
Oracle : 11g
System OS : RedHat

Maven modules:

Padua (Java / webApp)
- Lines of code: Java: 441k / JavaScript: 47k / XML: 22k / CSS: 1.8k
paduaEE6 (Java / webApp)
- Lines of code: Java: 288k / JavaScript: 2.3k / XML: 766k / CSS: 1.4k
interfaces (Java)
- Lines of code: Java: 22k / XML: 302k
padua-plsql (PL/SQL)
- Lines of code: PL/SQL: 270k
others / reports (Java)
- Lines of code: Java: 55k / XML: 219k
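
To put these volumes in perspective, the per-language totals can be added up. A minimal sketch in Python: the figures are copied from the list above (in thousands of lines), while the dictionary layout and the function name are only illustrative.

```python
from collections import defaultdict

# Lines of code per module, in thousands, copied from the list above.
MODULES = {
    "Padua": {"Java": 441, "JavaScript": 47, "XML": 22, "CSS": 1.8},
    "paduaEE6": {"Java": 288, "JavaScript": 2.3, "XML": 766, "CSS": 1.4},
    "interfaces": {"Java": 22, "XML": 302},
    "padua-plsql": {"PL/SQL": 270},
    "others / reports": {"Java": 55, "XML": 219},
}


def totals_by_language(modules):
    """Sum the per-module counts into one total per language (in k lines)."""
    totals = defaultdict(float)
    for counts in modules.values():
        for language, kloc in counts.items():
            totals[language] += kloc
    return dict(totals)


totals = totals_by_language(MODULES)
# Print the languages from largest to smallest, then the grand total.
for language, kloc in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{language:<11}{kloc:>8.1f}k")
print(f"{'All':<11}{sum(totals.values()):>8.1f}k")
```

With these figures, Java and PL/SQL together come to roughly 1.08 million lines, and XML accounts for about 1.3 million more.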

Analysis:

We run the project analyses every night with Bamboo, in two ways:

Code analysis module by module (from the folder of each module).
Analysis of the full project.

Module by module analysis
This first analysis gives us a code validation of each module. However, the tests are not run during this analysis.

The rules used are those of the “Sonar way” profile, except for Java, where we have had to disable the following rules for the moment:

S3546: Custom resources should be closed
S2083: I/O should not be vulnerable to path injection attacks
S2078: LDAP queries should not be vulnerable to injection attacks
S2631: Regular expressions should not be vulnerable to Denial of Service attacks
S3649: SQL queries should not be vulnerable to injection attacks

Activating these rules results in an OutOfMemory error with a 6 GB JVM (-Xmx6g -XX:MaxMetaspaceSize=1g), on a system with only 8 GB of memory.

Here are the Maven goals for each module (in the same order as in the module list above). Maven is launched from the folder of each module:

Padua:

clean compile sonar:sonar -P bamboo -Dmaven.compiler.source=1.8 -Dmaven.compiler.target=1.8 -DskipTests -Dmaven.test.skip=true -Dsonar.projectKey=geodis.sonar:padua

paduaEE6:

clean compile sonar:sonar -P bamboo -Dmaven.compiler.source=1.8 -Dmaven.compiler.target=1.8 -DskipTests -Dmaven.test.skip=true -Dsonar.projectKey=geodis.sonar:puaduaEE6

interfaces:

clean compile sonar:sonar -P bamboo -Dmaven.compiler.source=1.8 -Dmaven.compiler.target=1.8 -DskipTests -Dmaven.test.skip=true -Dsonar.projectKey=geodis.sonar:interfaces

padua-plsql:

clean compile sonar:sonar -Dsonar.projectKey=geodis.sonar:sql

others / reports:

clean compile sonar:sonar -P bamboo -Dmaven.compiler.source=1.8 -Dmaven.compiler.target=1.8 -DskipTests -Dmaven.test.skip=true -Dsonar.projectKey=geodis.sonar:reports
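
The five goal lines differ only in the project key and in whether the Java build options are passed, so they could be generated from a single table. A minimal sketch in Python: the helper name `maven_args` and the table layout are hypothetical, the keys and options are copied from the goals above, and the `mvn` prefix is an assumption about how the goals are launched.

```python
import shlex

# Options shared by the Java modules, as listed in the goals above.
JAVA_OPTIONS = [
    "-P", "bamboo",
    "-Dmaven.compiler.source=1.8",
    "-Dmaven.compiler.target=1.8",
    "-DskipTests",
    "-Dmaven.test.skip=true",
]

# Module name -> (project key, whether the Java build options are passed).
MODULES = {
    "Padua": ("geodis.sonar:padua", True),
    "paduaEE6": ("geodis.sonar:puaduaEE6", True),
    "interfaces": ("geodis.sonar:interfaces", True),
    "padua-plsql": ("geodis.sonar:sql", False),
    "others / reports": ("geodis.sonar:reports", True),
}


def maven_args(project_key, java_module=True):
    """Build the Maven argument list for one module (hypothetical helper)."""
    args = ["clean", "compile", "sonar:sonar"]
    if java_module:
        args += JAVA_OPTIONS
    args.append(f"-Dsonar.projectKey={project_key}")
    return args


# Print one command line per module; nothing is executed here.
for module, (key, is_java) in MODULES.items():
    print(f"{module}: mvn {shlex.join(maven_args(key, is_java))}")
```

Keeping the options in one place makes it harder for a single module to drift from the others, for example through a stray space inside a `-D` option.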

Full analysis
For this analysis we tried several configurations. The goal is to run all the tests (JUnit, Selenium) as part of the analysis.
Maven goals used:

-Dsonar -Pall-tests verify -Dmaven.test.failure.ignore=true -Dorg.xml.sax.driver=com.sun.org.apache.xerces.internal.parsers.SAXParser -Dfile.encoding=Cp1252 sonar:sonar

Here are the results for the different configurations we tried:

Full rules analysis (Sonar way): OutOfMemory with JVM: -Xmx6g -XX:MaxMetaspaceSize=1g
Report size with excluded rules: Analysis reports compressed in 22665ms, zip size = 68 MB
In this case, saving the report to MySQL fails with: MySQLSyntaxErrorException: The size of BLOB/TEXT data inserted in one transaction is greater than 10% of redo log size. Increase the log size using innodb_log_file_size.
Report size with a profile without Java and PL/SQL rules: Analysis reports compressed in 7910ms, zip size = 39 MB
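
The MySQL error above ties the largest BLOB written in one transaction to 10% of the redo log size, so a starting value for `innodb_log_file_size` can be estimated from the report size. A rough sketch in Python, under two assumptions that are not measured here: that the stored report is about the size of the zip, and that the 10% limit is applied to a single log file (the conservative reading; some MySQL versions apply it to the combined size of all redo log files, in which case less would be enough). The function name and the 25% headroom are illustrative choices.

```python
import math


def suggested_log_file_size_mb(blob_mb, limit_percent=10, headroom_percent=25):
    """Estimate an innodb_log_file_size (in MB) that keeps blob_mb under the limit.

    The BLOB must stay below limit_percent of the redo log size, so the log
    has to be larger than blob_mb * 100 / limit_percent. The headroom pushes
    the result above that strict minimum.
    """
    minimum_mb = blob_mb * 100 / limit_percent
    return math.ceil(minimum_mb * (100 + headroom_percent) / 100)


# Report sizes taken from the two measurements above.
for label, zip_mb in [("excluded rules", 68), ("no Java or PL/SQL rules", 39)]:
    size_mb = suggested_log_file_size_mb(zip_mb)
    print(f"{label}: {zip_mb} MB report -> innodb_log_file_size of about {size_mb}M")
```

For the 68 MB report this gives a strict minimum of 680 MB and a suggested value of about 850 MB. Treat it as a first guess to try, not a guaranteed fix.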

Ideal situation:

Currently we need 6 projects to get all the information into SonarQube.
Ideally, we would have a single project that gathers all the information.
Is there a recommended practice for this?

ganncamp · April 3, 2019, 5:45pm

Hi,

First, kudos on the well-fleshed background data. Based on the fact that you’re analyzing PL/SQL, I know you’re on at least Developer Edition. If you’re actually a step above in Enterprise Edition, then an Application would probably fit the bill; it’s a synthetic project composed of multiple “real” projects. It would give you a consolidated project homepage and measures, and allow you to drop the attempt to analyze the whole project in one go. This would have the knock-on effect of reducing your total Lines Of Code for billing purposes.

Failing that, if you really want to analyze the whole project as a single entity, then you’re just going to have to tune your analysis memory and database settings until it works.

And by the way, to circle back to the topic of lines of code, I understand your motivation for what you’re trying to do, but from a fiscal standpoint by analyzing the same project twice as separate entities, you’re paying double what you really ought to pay. Since you’re only analyzing the split-up version once a day(?) I’d drop those projects and pursue the mega-project if I were you.

One last note: before you spend a lot of time tuning MySQL, you should be aware that support for it is deprecated. And “deprecated” here means it will go away ~~eventually~~ soon. (Keep your eyes peeled for an announcement on this topic in the next several days.)

HTH,
Ann