Cobol baselines

bjhop · October 15, 2020, 6:14pm

Must-share information (formatted with Markdown):

which versions are you using (SonarQube, Scanner, Plugin, and any relevant extension)
- Enterprise Edition - Version 7.9.1 (build 27448)
what are you trying to achieve
- We are scanning updated COBOL files from Endevor check-ins, which is working well. However, we are trying to figure out how to establish a baseline, since we only scan the files within the change-set. We cannot scan all 10k+ COBOL files to get a “baseline”, so we are wondering how to go about getting baselines on individual files, or something similar.

This is all automated via Topaz Workbench, CA Endevor, GitLab, Zowe, etc.

what have you tried so far to achieve this
- As I said, we currently scan only the files within the Endevor change-set, so each scan treats every file as new, with no scan history, even if the file has been scanned before. If we try to keep history, files that were scanned in one pipeline may or may not be part of the next change-set; when they are missing, Sonar treats them as removed. Consequently, we are not building up an update history for our COBOL components.

ganncamp · October 16, 2020, 6:28pm

Hi,

Welcome to the community!

Since our records show you’ve filed a parallel support case on this, I guess you’re hoping for non-SonarSource community input here.

I agree that it will be fascinating to see how other people have addressed this.

Ann

bjhop · October 19, 2020, 2:42pm

Ann
We did submit a support ticket as well, and yes, I was hoping we might be lucky enough to find other folks who have already solved this. Plus, until last week I did not have access to the SonarSource support site myself.

Brian

OlivierK · October 20, 2020, 7:06am

Hello @bjhop,

Your question is interesting for the community so let me post some suggestions here:

Your dilemma is shared by all COBOL users:

- If you scan every COBOL program individually, you get very fine granularity, but you end up with tens (if not hundreds) of thousands of projects in SonarQube, one per COBOL file/program. This is not easy to use, and it can affect SonarQube performance.
- If you scan everything at once, as you said, you get one single bulky project in SonarQube (coarse granularity), and you scan your entire code base every time a single COBOL file is modified. This is inefficient from a scanning perspective.

The right solution is something in between:

You should group your COBOL files, if possible into meaningful sets, each of which is analyzed together and forms one project in SonarQube.
The sets should be:
- Small enough to give reasonably fine granularity and to avoid scanning too many files (all the files of the set) when one file changes
- Large enough that your entire code base breaks down into a limited number of sets, so you do not produce too many projects in SonarQube

By “meaningful sets” I mean that a logical split of the COBOL files (some sort of “modularity”) is ideal. If there is no such logical split, then you should split somewhat arbitrarily, for instance based on the file name (e.g. a very basic, not-so-smart split: 26 sets based on the first letter of the file name).
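A minimal sketch of the very basic first-letter split mentioned above, assuming the extracted programs are available as a list of file paths. The file names and the `OTHER` bucket for names that do not start with a letter are hypothetical choices for illustration only.

```python
from collections import defaultdict
from pathlib import PurePath


def group_by_first_letter(paths):
    """Assign each file to a set keyed by the first letter of its file name."""
    groups = defaultdict(list)
    for path in paths:
        name = PurePath(path).name
        # Names that are empty or do not start with a letter share one bucket.
        key = name[0].upper() if name and name[0].isalpha() else "OTHER"
        groups[key].append(path)
    return {key: sorted(members) for key, members in groups.items()}


# Hypothetical file names, for illustration only.
files = ["src/PAYROLL1.cbl", "src/PAYCALC.cbl", "src/ACCT01.cbl", "src/9UTIL.cbl"]
sets = group_by_first_letter(files)
```

Each resulting set would then be analyzed as its own SonarQube project. A split along real application boundaries would replace the key computation with whatever mapping reflects that modularity.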

The split must also take the copybooks into account. For ideal results, make sure the copybooks that COBOL programs depend on are analyzed together with the programs themselves, so they must be extracted into the respective sets.
It’s desirable that a copybook is not extracted into too many different sets, because:

- It means the copybook is a dependency of too many files, which is never a good sign
- The copybook will be analyzed several times, once in each set
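To check how widely copybooks are shared for a candidate split, something like the following rough sketch could help. It assumes program sources are available as text and uses a deliberately simplified regular expression for `COPY` statements: it does not skip comment lines, does not handle quoted copybook names, and ignores `EXEC SQL INCLUDE`. The program and copybook names are hypothetical.

```python
import re
from collections import defaultdict

# Simplified assumption: a copybook reference looks like "COPY NAME".
COPY_RE = re.compile(r"\bCOPY\s+([A-Z0-9][A-Z0-9-]*)", re.IGNORECASE)


def copybooks_used(source_text):
    """Return the upper-cased copybook names referenced in one program."""
    return {name.upper() for name in COPY_RE.findall(source_text)}


def copybook_spread(program_sets, sources):
    """Map each copybook to the sorted list of sets whose programs use it.

    program_sets: {set name: [program names]}
    sources:      {program name: source text}
    """
    spread = defaultdict(set)
    for set_name, programs in program_sets.items():
        for program in programs:
            for copybook in copybooks_used(sources[program]):
                spread[copybook].add(set_name)
    return {copybook: sorted(names) for copybook, names in spread.items()}


# Hypothetical programs and copybooks, for illustration only.
sources = {
    "PAYCALC": "       COPY EMPREC.\n       COPY TAXTBL.\n",
    "ACCT01": "       copy emprec.\n",
}
program_sets = {"P": ["PAYCALC"], "A": ["ACCT01"]}
spread = copybook_spread(program_sets, sources)
```

Copybooks that show up in many sets are candidates for rethinking the split, since each of them would be extracted and analyzed once per set.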

The next question you may have is “how do I extract the files per sets as you suggest”?
I have no specific answer to that, but I know it is possible since some customers do it, and there are many ways to go about it.
I know a bit about Zowe, and I know you can add a bit of extra logic to control exactly what gets extracted.
Alternatively, you can extract everything (maybe extracting ALL the COBOL code does not take that long) and analyze only the subset you want, by:

- Either running a script after extraction that deletes all unwanted files (all files not part of the set you want to analyze). Your script controls the logic of the sets however you want
- Or setting proper exclusions on the SonarQube scanner so that only the set you want is analyzed
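As a rough sketch of the second option, a pipeline step could work out which sets a change-set touches and build the scanner arguments for each of them. This assumes the set membership is already known, and it uses the standard `sonar.projectKey` and `sonar.inclusions` analysis parameters. The set names, file paths, and the `cobol-` key prefix are hypothetical.

```python
from pathlib import PurePath


def sets_to_rescan(changed_files, file_sets):
    """Return the names of the sets that contain at least one changed file."""
    changed = {PurePath(path).name for path in changed_files}
    return sorted(
        name
        for name, members in file_sets.items()
        if changed & {PurePath(member).name for member in members}
    )


def scanner_args(set_name, members, key_prefix="cobol"):
    """Build scanner arguments that restrict the analysis to one set."""
    return [
        f"-Dsonar.projectKey={key_prefix}-{set_name}",
        "-Dsonar.inclusions=" + ",".join(sorted(members)),
    ]


# Hypothetical sets and change-set, for illustration only.
file_sets = {
    "PAYROLL": ["src/PAYCALC.cbl", "src/PAYROLL1.cbl"],
    "ACCOUNTS": ["src/ACCT01.cbl"],
}
touched = sets_to_rescan(["src/PAYCALC.cbl"], file_sets)
args = scanner_args(touched[0], file_sets[touched[0]])
```

Because every file of a touched set is analyzed on each run, the corresponding project always sees its full file list, so files outside the change-set no longer look removed and history accumulates per set. For large sets, wildcard patterns in `sonar.inclusions` may be more practical than listing every file.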

Let me know if that helps.

Olivier