Tool versions:
- SonarQube Server Enterprise Edition v10.7 (96327)
- sonar-scanner-cli-6.2.1.4610-linux-x64 deployed using zip
What am I trying to achieve:
In my repository of about 5M LOC, I have 10+ products. These products use different toolchains and share a great deal of common code. I’m trying to find an optimal way to scan all of these products, with the following criteria:
- No duplicate issues (or as few as possible)
- No double-counted LOC (or as little as possible), so that we don’t exhaust our license
- Keep scan time as short as possible
Sample repository structure:
REPO_ROOT/
├── Code/
│   ├── LibraryOne/
│   │   ├── CommonCode
│   │   ├── ProductSpecificCode/
│   │   │   ├── ProductOne
│   │   │   │   └── config.cmake
│   │   │   └── ProductTwo
│   │   │       └── config.cmake
│   │   └── CMakeLists.txt (include $PRODUCT/config.cmake)
│   └── ... Many more libraries following the LibraryOne pattern
└── Projects/
    ├── ProductOne/
    │   └── CMakeLists.txt (PRODUCT=ProductOne)
    └── ProductTwo/
        └── CMakeLists.txt (PRODUCT=ProductTwo)
What I tried:
1. Scan individual products into individual projects in SonarQube
Upsides: Can parallelize the scans on multiple machines, taking only about 1.5 hours for all products.
Downsides: I get duplicate issues from common code. Common code is counted multiple times, exhausting the company’s license. The code base is about 5M LOC, but if I scanned all products individually, I’d get to 15M+ LOC.
2. Scan whole repository using Code variants for each product
Upsides: Issues are not duplicated, LOC is counted only once. Nice unified display of issues in one place.
Downsides: A fresh scan takes almost 6 hours.
3. Scan products into a single SonarQube project with multiple branches
Upsides: LOC is counted only for the largest branch (= product).
Downsides: Issues remain duplicated. The display is not very user-friendly. This is just bending the tool’s workflow to do what I want.
4. Other attempts
- Tried the Applications feature to aggregate results of all the projects - neither issues nor LOC are de-duplicated
- Tried the Monorepo feature, but it appears the main purpose of that feature is to better integrate projects and not de-duplicate issues and LOC
- The recommended use of the sonar.exclusions parameter is not very feasible due to the structure of my repository. There are hundreds of libraries, and some include product-specific code in addition to common code. Getting sonar.exclusions to work would take a lot of effort (and I mean a lot) and add a lot of scanner configuration clutter.
- Didn’t find any option to scan without sending the results to the server. The workflow would be: scan all products without sending results, then combine all results manually and send them at once. I know about the possibility of using third-party scanners, but I’d like to keep SonarQube’s scanner, as it works really well and I wouldn’t need to invest more time configuring a second tool.
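To illustrate the sonar.exclusions point, here is a minimal sketch of what generating the per-product value might look like. It assumes every library follows the LibraryOne pattern from the sample structure exactly, and that one product (the hypothetical `common_owner` argument) is chosen to "own" the common code while the others exclude it. The function name and that ownership scheme are illustrative assumptions, not something SonarQube prescribes.

```python
def build_exclusions(product, all_products, common_owner):
    """Return a sonar.exclusions value for one product's scan.

    Assumes the sample layout: Code/<Library>/ProductSpecificCode/<Product>/
    and Code/<Library>/CommonCode/. Real libraries that deviate from this
    pattern would each need their own extra patterns.
    """
    # Hide the product-specific code of every other product.
    patterns = [
        f"Code/*/ProductSpecificCode/{other}/**"
        for other in all_products
        if other != product
    ]
    # Only the designated owner scans the common code (hypothetical scheme).
    if product != common_owner:
        patterns.append("Code/*/CommonCode/**")
    return ",".join(patterns)


if __name__ == "__main__":
    products = ["ProductOne", "ProductTwo"]
    for name in products:
        value = build_exclusions(name, products, common_owner="ProductOne")
        print(f"{name}: sonar.exclusions={value}")
```

With only two products and a perfectly regular layout this stays short. In my repository the exceptions to the pattern are what would turn it into clutter.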
Final notes
The most usable solution appears to be number 2, Code variants, even with the long scan time. One addition that would solve my problem is an option or flag that prevents a finished scan from overwriting whatever is already on the server. Instead of overwriting, the tool would “add” the new findings, then cross-detect and remove duplicates.
I found a feature suggestion, and I hope I understood it correctly: it proposes allowing variants to be scanned on multiple hosts. That would also solve my case: https://portal.productboard.com/sonarsource/3-sonarqube-server/c/444-analyze-multiple-code-variants-built-on-distinct-hosts.
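For context, a rough sketch of the single-host setup that option 2 relies on today, as far as I understand the documentation: each variant gets a subfolder containing its compile_commands.json, and the scanner is pointed at them through sonar.cfamily.variants.names and sonar.cfamily.variants.dir. The helper name, the build directory mapping, and the assumption that the paths inside each compilation database resolve on the scanning host are mine.

```python
import shutil
from pathlib import Path


def collect_variants(build_dirs, variants_dir):
    """Copy each product's compile_commands.json into variants_dir/<product>/.

    build_dirs maps a product (variant) name to its build directory.
    Returns the two scanner properties describing the resulting layout.
    """
    variants_dir = Path(variants_dir)
    names = []
    for product, build_dir in sorted(build_dirs.items()):
        source = Path(build_dir) / "compile_commands.json"
        if not source.is_file():
            raise FileNotFoundError(f"missing compilation database: {source}")
        target = variants_dir / product
        target.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, target / "compile_commands.json")
        names.append(product)
    return {
        # Property names as I understand them from the code variants docs.
        "sonar.cfamily.variants.names": ",".join(names),
        "sonar.cfamily.variants.dir": variants_dir.as_posix(),
    }
```

The catch is that all variants still have to be analyzed in one scanner run on one machine, which is where the almost 6 hours come from.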
I’d like to get your opinion on which option is best, or a proposal for a completely different alternative that I have missed.
Thanks,
Michal