Question regarding phase during analysis (switching scanner for existing project)

I’m currently switching our CLI-based analysis (using sonar-scanner.bat) to a Maven/Tycho-driven analysis. I first verified that the Maven analysis works as expected on a test project and then pointed the scanner at the actual production project. I’ve noticed that, as one of the last steps, all source files are touched, see below:

20:03:10,997 [INFO] Sensor Zero Coverage Sensor
20:03:13,645 [INFO] Sensor Zero Coverage Sensor (done) | time=2648ms
20:03:13,645 [INFO] Sensor Java CPD Block Indexer
20:03:41,484 [INFO] Sensor Java CPD Block Indexer (done) | time=27839ms
20:03:41,549 [INFO] SCM provider for this project is: svn
20:03:41,572 [INFO] 55548 files to be analyzed // <--
20:03:51,573 [INFO] 73/55548 files analyzed
...
21:38:48,471 [INFO] 55548/55548 files analyzed
21:38:49,991 [INFO] 9586 files had no CPD blocks
21:38:49,991 [INFO] Calculating CPD for 28542 files
21:38:54,000 [INFO] CPD calculation finished
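
For reference, the Maven-driven analysis is started roughly like this (a sketch, not our exact invocation; the host URL and token are placeholders):

mvn clean verify org.sonarsource.scanner.maven:sonar-maven-plugin:sonar ^
    -Dsonar.host.url=https://sonarqube.example.com ^
    -Dsonar.login=<token>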

What does the analyzer do with those files? Is this all related to Source Code Management?

With the CLI scanner, only about 1000 files were regularly analyzed in this step; for the test project (same code as the production project) it was about 40-150 files (albeit with fewer commits than the CLI-analyzed project had seen). Is this a one-time analysis because the project layout on disk has changed (from multiple svn repositories to a single one, with mostly the same layout on disk)?

Thanks

Hi,

IIRC, on first analysis the scanner gets the blame data for all the lines of all the files (for correct issue dating, attribution, and recognition of “new” code). After that, it only needs to get updates on what’s changed.
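
Concretely, for an SVN project the scanner gathers roughly the same per-file, per-line authorship data you’d get from something like this (illustrative path, not from your project):

svn blame --xml src/main/java/Foo.java

With ~55k files that first full pass is expensive, which matches the ~95 minutes in your log.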

HTH,
Ann

Hi Ann,

Thanks for the info. Is the blame data stored in the Sonar workspace? Does the scanner ask the SQ server which files to check? Currently, the CE task for our scans is failing because of insufficient heap size; would this cause the scanner to compute the blame data again on subsequent analyses?

Hi,

Ehm… no. It’s stored in your SCM and in the SonarQube server.

I believe the files to check on a non-initial analysis are determined from local SCM data, but my knowledge in this area is pretty shallow.

No. Just increase the heap & try again.
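
For the background (Compute Engine) task that’s sonar.ce.javaOpts in the server’s conf/sonar.properties, e.g. (the 2G is just an example value; the server needs a restart afterwards):

# give the Compute Engine workers more heap
sonar.ce.javaOpts=-Xmx2G -Xms128m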

Ann

Hi Ann,

Thanks for your reply. As long as the background task kept failing, each scanner run “blamed” the full set of files. As soon as the first background task succeeded, only the new or changed files were “blamed”.
I don’t know how the scanner does that, but it seems to me it can’t rely solely on the SCM.

Thanks for your help,
Oliver