Is there any point in detecting the language for CFamily header files?

The default configuration for language detection for the CFamily languages looks approximately like this:

I’m wondering if there is any point in having the header file extensions configured here. As far as I know, the sonar scanner will normally only analyze the source files that are part of the build, detected either by means of the build wrapper or a compile_commands.json. All the header files that get included from those source files will automatically be scanned as part of the same translation units.

However, when .h files are set as C files, it seems like the project will be identified as having C code, which appears in the view on the projects page, and allows us to configure which quality profile we want to use for the C code in the project. In one of the projects I manage, I have set .h to be identified as C++ code instead, which matches our code base.

But I’m wondering if there is any point in identifying headers at all. Maybe it would be useful for the purposes of SonarLint, which I think performs an analysis directly on the header file.

Bonus points for having a good solution in a repository where there are both C++ headers and C headers that both use the .h extension. Or maybe that is why it would have been a good idea to motivate everyone to convert to .hpp :thinking: As an aside, C++20 modules should save us from headers eventually :slightly_smiling_face:

Hi @torgeir.skogen,

You are right about SonarQube: we analyze source files (files that are compiled according to the build-wrapper output or the compilation database), and header files are analyzed in every context where they are included, using the language of the including source file.

These extensions are set and used by the sonar-scanner to identify the language of each file and the analyzers that should be invoked; if you have at least one C, Obj-C, or C++ file, the CFamily analyzer is invoked. This happens before the analysis starts. By the time we identify the actual language of a header file (based on the compilation options and where it is included), it is too late to update the sonar-scanner information.
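As a rough illustration of that pre-analysis step, here is a minimal Python sketch of suffix-based language detection. It is not the scanner's actual implementation: the suffix lists and the names `SUFFIXES`, `language_of`, and `cfamily_analyzer_needed` are assumptions made up for this example.

```python
from pathlib import PurePosixPath

# Illustrative suffix lists only (an assumption for this sketch,
# not necessarily the scanner's real defaults).
SUFFIXES = {
    "c": {".c", ".h"},
    "cpp": {".cc", ".cpp", ".cxx", ".hpp"},
    "objc": {".m"},
}
CFAMILY = {"c", "cpp", "objc"}


def language_of(path):
    """Return the language assigned to a file purely from its suffix, or None."""
    suffix = PurePosixPath(path).suffix
    for language, suffixes in SUFFIXES.items():
        if suffix in suffixes:
            return language
    return None


def cfamily_analyzer_needed(paths):
    """True if at least one file maps to C, C++, or Obj-C."""
    return any(language_of(p) in CFAMILY for p in paths)
```

With this mapping, a project containing only `include/widget.h` already counts as having C code, which matches the behaviour described in the question. The mapping never looks at how the header is compiled, which is why it cannot be corrected later.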

Distinguishing between the 3 languages only helps in things not related to the actual analysis, like the language of measures and the percentages of languages used in the project.

Distinguishing between the CFamily languages and all other languages is important and does impact the analysis: we don’t report CFamily issues on non-CFamily files.

For SonarLint, the story is the same for these options, even though we do analyze header files there. In SonarLint we follow different strategies to analyze header files on their own. For example, for Visual Studio, we use the project options defined in the vcxproj. For SonarLint with a compilation database, we detect the language by checking for source files with the same name and a different extension, or by using the language of the source files in the same directory, and so on. The algorithm here is always improving and hopefully should be transparent to our users.
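To make the compilation-database heuristic a bit more concrete, here is a minimal Python sketch of the two checks mentioned above. It is only an illustration, not SonarLint's real algorithm: the function name `guess_header_language`, the `SOURCE_LANGUAGE` table, the choice to look for the same-named source file in the header's own directory, and the majority vote among sibling sources are all assumptions.

```python
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical mapping from source suffix to language (assumption for this sketch).
SOURCE_LANGUAGE = {".c": "c", ".cc": "cpp", ".cpp": "cpp", ".cxx": "cpp", ".m": "objc"}


def guess_header_language(header, source_files):
    """Guess a header's language from the source files listed in a compilation database."""
    header_path = PurePosixPath(header)
    # Keep only known source files that live next to the header.
    siblings = [
        s for s in map(PurePosixPath, source_files)
        if s.suffix in SOURCE_LANGUAGE and s.parent == header_path.parent
    ]

    # Check 1: a source file with the same name and a different extension.
    for source in siblings:
        if source.stem == header_path.stem:
            return SOURCE_LANGUAGE[source.suffix]

    # Check 2: fall back to the most common language among sibling sources.
    counts = Counter(SOURCE_LANGUAGE[s.suffix] for s in siblings)
    if counts:
        return counts.most_common(1)[0][0]

    return None  # Nothing to go on; a real tool would need further fallbacks.
```

For example, given the sources `src/a.cpp`, `src/b.c`, and `src/c.c`, this sketch treats `src/a.h` as C++ (same name as `a.cpp`) and `src/z.h` as C (most sources in `src/` are C). That also shows how a repository mixing C and C++ headers under `.h` can still be handled per file.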

This is one of the things that make C and C++ a pain for tools. From my experience, it is hard to convince C++ users to change all of their file extensions for a tool :smiley:

Thanks for the interesting questions!
