Sonar takes around 1 hr to complete (CPD Calculation step mainly)

Hi Community,

  • SonarQube : 8.9.2, Scanner Plugin: 4.4.0.2170 used

I am running a Sonar analysis on a Python project with around 8,000 files, which takes around 1 hr, sometimes more, to complete.

Most of the time is spent in the CPD calculation. How can I reduce the analysis time? Any help or hints are appreciated.

INFO: CPD Executor Calculating CPD for 7987 files
WARN: Too many duplication references on file xxxx/xxxx/xxx.py for block at line 5. Keep only the first 100 references.

Regards,
Suraj Nahak

Hi Team,

Any hints or suggestions?

Hi @surajnahak ,
Welcome to our community!
Could you please attach the full log of the analysis?

Hi @Andrea_Guarino

Thanks for your response. Please find the analysis log attached.
I have trimmed the log since it is very large. The main line is: CPD Executor CPD calculation finished (done) | time=3610950ms

/opt/sonarscanner/sonar-scanner-4.4.0.2170-linux/bin/sonar-scanner -Dsonar.login=**** -Dsonar.projectName=demopy-project -Dsonar.projectKey=demopy-project -Dsonar.language=py -Dsonar.sources=. -Dsonar.inclusions=flow/**/*.py,xxxxxx/**/*.py
INFO: Scanner configuration file: /opt/sonarscanner/sonar-scanner-4.4.0.2170-linux/conf/sonar-scanner.properties
INFO: Project root configuration file: NONE
INFO: SonarScanner 4.4.0.2170
INFO: Java 11.0.3 AdoptOpenJDK (64-bit)
INFO: Linux 5.4.0-1061-aws amd64
INFO: User cache: /jenkins/.sonar/cache
INFO: Scanner configuration file: /opt/sonarscanner/sonar-scanner-4.4.0.2170-linux/conf/sonar-scanner.properties
INFO: Project root configuration file: NONE
INFO: Analyzing on SonarQube server 8.9.2
INFO: Default locale: "en", source code encoding: "UTF-8" (analysis is platform dependent)
INFO: Load global settings
INFO: Load global settings (done) | time=200ms
INFO: Server id: XXXXXXXXXXXXXXXXXX
INFO: User cache: /jenkins/.sonar/cache
INFO: Load/download plugins
INFO: Load plugins index
INFO: Load plugins index (done) | time=48ms
INFO: Load/download plugins (done) | time=111ms
INFO: Process project properties
INFO: Process project properties (done) | time=7ms
INFO: Execute project builders
INFO: Execute project builders (done) | time=2ms
INFO: Project key: demopy-project
INFO: Base dir: /jenkins/workspace/DevOps-Team-Testing/Sonar_py_test
INFO: Working dir: /jenkins/workspace/DevOps-Team-Testing/Sonar_py_test/.scannerwork
INFO: Load project settings for component key: 'demopy-project'
INFO: Load project settings for component key: 'demopy-project' (done) | time=21ms
INFO: Auto-configuring with CI 'Jenkins'
INFO: Load quality profiles
INFO: Load quality profiles (done) | time=46ms
INFO: Auto-configuring with CI 'Jenkins'
INFO: Load active rules
INFO: Load active rules (done) | time=955ms
INFO: Indexing files...
INFO: Project configuration:
INFO:   Included sources: flow/**/*.py, xxxxxx/**/*.py
INFO: 8232 files indexed
INFO: 2097 files ignored because of inclusion/exclusion patterns
INFO: 0 files ignored because of scm ignore settings
INFO: Quality profile for py: Sonar way
INFO: ------------- Run sensors on module demopy-project
INFO: Load metrics repository
INFO: Load metrics repository (done) | time=38ms
INFO: Sensor MuleSoft JSON Report Importer [mulesoft]
INFO: Sensor MuleSoft JSON Report Importer [mulesoft] (done) | time=22ms
INFO: Sensor Python Sensor [python]
INFO: Starting global symbols computation
INFO: 8232 source files to be analyzed
INFO: Load project repositories
INFO: Load project repositories (done) | time=139ms
INFO: 510/8232 files analyzed, current file: xxxxxx/xxxxxx/0059_xxxxxxxxx.py
INFO: 1601/8232 files analyzed, current file: xxxxxx/xxxxxx/m_xxxx.py
INFO: 2697/8232 files analyzed, current file: xxxxxx/xxxxxx/0054xxxx.py
INFO: 3834/8232 files analyzed, current file: xxxxxx/xxxxxx/xxxx.py
INFO: 4978/8232 files analyzed, current file: xxxxxx/xxxxxx/s_xxxxxxl.py
INFO: 6053/8232 files analyzed, current file: xxxxxx/xxxxxx/xxxxxx.py
INFO: 7185/8232 files analyzed, current file: xxxxxx/xxxxxx.py
INFO: 8232/8232 source files have been analyzed
INFO: Starting rules execution
INFO: 8232 source files to be analyzed
INFO: 534/8232 files analyzed, current file: xxxxxx/xxxxxx.py
INFO: 1200/8232 files analyzed, current file: xxxxxx/xxxxxxt.py
INFO: 1889/8232 files analyzed, current file: xxxxxx/xxxxxx.py
INFO: 2593/8232 files analyzed, current file: xxxxxx/xxxxxx.py
INFO: 3309/8232 files analyzed, current file: xxxxxx/xxxxxxd3.py
INFO: 4021/8232 files analyzed, current file: xxxxxx/xxxxxx.py
INFO: 4731/8232 files analyzed, current file: xxxxxx/xxxxxxh.py
INFO: 5452/8232 files analyzed, current file: xxxxxx/xxxxxx.py
INFO: 6162/8232 files analyzed, current file: xxxxxx/xxxxxxdate_h.py
INFO: 6848/8232 files analyzed, current file: xxxxxx/xxxxxxline.py
INFO: 7565/8232 files analyzed, current file: xxxxxxx/xxxxxx/team.py
INFO: 8232/8232 source files have been analyzed
INFO: Sensor Python Sensor [python] (done) | time=198686ms
INFO: Sensor Cobertura Sensor for Python coverage [python]
INFO: Sensor Cobertura Sensor for Python coverage [python] (done) | time=164ms
INFO: Sensor PythonXUnitSensor [python]
INFO: Sensor PythonXUnitSensor [python] (done) | time=151ms
INFO: Sensor CSS Rules [cssfamily]
INFO: No CSS, PHP, HTML or VueJS files are found in the project. CSS analysis is skipped.
INFO: Sensor CSS Rules [cssfamily] (done) | time=11ms
INFO: Sensor JaCoCo XML Report Importer [jacoco]
INFO: 'sonar.coverage.jacoco.xmlReportPaths' is not defined. Using default locations: target/site/jacoco/jacoco.xml,target/site/jacoco-it/jacoco.xml,build/reports/jacoco/test/jacocoTestReport.xml
INFO: No report imported, no coverage information will be imported by JaCoCo XML Report Importer
INFO: Sensor JaCoCo XML Report Importer [jacoco] (done) | time=11ms
INFO: Sensor C# Project Type Information [csharp]
INFO: Sensor C# Project Type Information [csharp] (done) | time=3ms
INFO: Sensor C# Properties [csharp]
INFO: Sensor C# Properties [csharp] (done) | time=0ms
INFO: Sensor JavaXmlSensor [java]
INFO: Sensor JavaXmlSensor [java] (done) | time=22ms
INFO: Sensor HTML [web]
INFO: Sensor HTML [web] (done) | time=20ms
INFO: Sensor VB.NET Project Type Information [vbnet]
INFO: Sensor VB.NET Project Type Information [vbnet] (done) | time=4ms
INFO: Sensor VB.NET Properties [vbnet]
INFO: Sensor VB.NET Properties [vbnet] (done) | time=1ms
INFO: ------------- Run sensors on project
INFO: Sensor Zero Coverage Sensor
INFO: Sensor Zero Coverage Sensor (done) | time=599ms
INFO: CPD Executor 245 files had no CPD blocks
INFO: CPD Executor Calculating CPD for 7987 files
WARN: Too many duplication references on file xxxxxx/xxxxxxx/prof.py for block at line 5. Keep only the first 100 references.
WARN: Too many duplication references on file xxxxxx/xxxxxxx_scripts/prof.py for block at line 5. Keep only the first 100 references.
WARN: Too many duplication references on file xxxxxx/xxxxxxx/prof.py for block at line 5. Keep only the first 100 references.
WARN: Too many duplication references on file xxxxxx/xxxxxxx/prof.py for block at line 5. Keep only the first 100 references.
WARN: Too many duplication references on file xxxxxx/xxxxxxx/prof.py for block at line 23. Keep only the first 100 references.
WARN: Too many duplication references on file xxxxxx/xxxxxxx/prof.py for block at line 33. Keep only the first 100 references.
WARN: Too many duplication references on file xxxxxx/xxxxxxx/prof.py for block at line 58. Keep only the first 100 references.
WARN: Too many duplication references on file xxxxxx/group.py for block at line 5. Keep only the first 100 references.
WARN: Too many duplication references on file xxxxxx/xxxxxx/d1.py for block at line 5. Keep only the first 100 references.
WARN: Too many duplication references on file xxxxxx/xxxx/d1.py for block at line 5. Keep only the first 100 references.
WARN: Too many duplication references on file xxxxxx/d1.py for block at line 5. Keep only the first 100 references.
WARN: Too many duplication references on file xxxxxx/d1.py for block at line 5. Keep only the first 100 references.
WARN: Too many duplication references on file xxxxxx/d1.py for block at line 23. Keep only the first 100 references.
WARN: Too many duplication references on file xxxxxx/d1.py for block at line 33. Keep only the first 100 references.



(This warning continues for all 7987 files.)




INFO: CPD Executor CPD calculation finished (done) | time=3610950ms
INFO: Analysis report generated in 1054ms, dir size=114 MB
INFO: Analysis report compressed in 21025ms, zip size=45 MB
INFO: Analysis report uploaded in 2325ms
INFO: ANALYSIS SUCCESSFUL, you can browse https://sonarurl/dashboard?id=demopy-project
INFO: Note that you will be able to access the updated dashboard once the server has processed the submitted analysis report
INFO: More about the report processing at https://sonarurl/api/ce/task?id=xxxxxxxx
INFO: Analysis total time: 1:04:02.572 s
INFO: ------------------------------------------------------------------------
INFO: EXECUTION SUCCESS
INFO: ------------------------------------------------------------------------
INFO: Total time: 1:04:03.949s
INFO: Final Memory: 6M/27M
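
To see at a glance where the time goes in a log like the one above, the "(done) | time=...ms" lines can be ranked by duration. The following is only a minimal sketch: the function name `slowest_steps` and the regular expression are assumptions made for illustration, not part of SonarScanner, and the sample lines are copied from the log above.

```python
import re

# Matches scanner lines such as
# "INFO: Sensor Python Sensor [python] (done) | time=198686ms"
STEP_RE = re.compile(
    r"^(?:INFO|WARN): (?P<step>.+?) \(done\) \| time=(?P<ms>\d+)ms$"
)


def slowest_steps(log_text, top=3):
    """Return the `top` slowest steps as (step name, milliseconds) pairs."""
    steps = []
    for line in log_text.splitlines():
        match = STEP_RE.match(line.strip())
        if match:
            steps.append((match.group("step"), int(match.group("ms"))))
    # Sort by duration, longest first
    return sorted(steps, key=lambda item: item[1], reverse=True)[:top]


# A few lines copied from the log in this thread
SAMPLE_LOG = """\
INFO: Load active rules (done) | time=955ms
INFO: Sensor Python Sensor [python] (done) | time=198686ms
INFO: Sensor Zero Coverage Sensor (done) | time=599ms
INFO: CPD Executor CPD calculation finished (done) | time=3610950ms
"""

for step, ms in slowest_steps(SAMPLE_LOG):
    print(f"{ms / 60000:6.1f} min  {step}")
```

On this log the CPD step accounts for about 60 of the 64 minutes, while the Python sensor itself needs only about 3 minutes.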

Hi Community/ @Andrea_Guarino ,

I had to exclude the files from duplication detection by using sonar.cpd.exclusions=**/*.py

Now the analysis takes only 3-4 minutes. But there should be a property or something similar that we can pass to speed up the analysis if we don't want to exclude files.
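
Instead of excluding every .py file from duplication detection, one option may be to find which directories produce most of the "Too many duplication references" warnings and pass only those to sonar.cpd.exclusions. The sketch below counts warnings per directory from a saved scanner log. The helper name `warnings_per_directory` and the sample paths (`generated/dags`, `flow`) are hypothetical, since the real paths in the log are masked.

```python
import re
from collections import Counter
from pathlib import PurePosixPath

# Matches the CPD warning format shown in the scanner log
WARN_RE = re.compile(
    r"^WARN: Too many duplication references on file (?P<path>\S+) "
    r"for block at line \d+\."
)


def warnings_per_directory(log_text):
    """Count CPD 'too many references' warnings per parent directory."""
    counts = Counter()
    for line in log_text.splitlines():
        match = WARN_RE.match(line.strip())
        if match:
            parent = PurePosixPath(match.group("path")).parent
            counts[str(parent)] += 1
    return counts


# Hypothetical paths; the real ones are masked in the log
SAMPLE_LOG = """\
WARN: Too many duplication references on file generated/dags/prof.py for block at line 5. Keep only the first 100 references.
WARN: Too many duplication references on file generated/dags/prof.py for block at line 23. Keep only the first 100 references.
WARN: Too many duplication references on file generated/dags/d1.py for block at line 5. Keep only the first 100 references.
WARN: Too many duplication references on file flow/group.py for block at line 5. Keep only the first 100 references.
"""

# Print candidate exclusion patterns, noisiest directory first
for directory, count in warnings_per_directory(SAMPLE_LOG).most_common():
    print(f"{count:4d}  {directory}/**/*.py")
```

The directories at the top of the list are candidates for a narrower pattern in sonar.cpd.exclusions, so that duplication detection can stay enabled for the rest of the code. Whether this brings the run time down far enough would have to be measured on the real project.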

Hello @surajnahak,

Is your project a real-life project, or is it a fake project you created manually to evaluate SonarQube?
I ask because I wonder how the same files (d1.py, prof.py) can appear multiple times in different directories.

Alex

Hi Alexandre,

All files are unique, but there was a directory with subdirectories containing .py files that were generated from an Airflow cluster, so we have excluded it from the analysis.

Regards,
Suraj Nahak