Language misidentified: C being detected as C++

For some reason Sonar is misdetecting .c files as C++ and applying C++ rules to them.
The suffixes are correct in the project.

I have also added the following to the sonar-project.properties file:

sonar.language=c
sonar.c.file.suffixes=.c,.h

This is using compile_commands.json with entries like:

{
"directory": "my/dev/dir",
"command": "/usr/bin/cc -Imy/src/dir -O2 -g0 -O0 -g3 -fprofile-arcs -ftest-coverage -std=gnu11 -o CMakeFiles/foolib.dir/foo_ver.c.o -c /my/dev/dir/src/foobar.c",
"file": "/my/dev/dir/src/foobar.c"
},

I added sonar.verbose=true
and it reports:

13:57:23.338 DEBUG: 'src/foo.h' indexed with language 'c'

Any suggestions on how to fix or further debug this?


Hi,

Could you tell us what rules you’re talking about? Many rules apply to both languages. And this part of the log:

13:57:23.338 DEBUG: 'src/foo.h' indexed with language 'c'

makes me believe the language of the files is correctly detected.

BTW, this doesn’t do anything. Hasn’t for years:

sonar.language=c

And these are the defaults:

sonar.c.file.suffixes=.c,.h

 
Ann


The rule being (mis)applied is cpp:S5416 - “using” should be preferred for type aliasing

  • cppcoreguidelines
    since-c++11
  • Available Since Sep 25, 2019
  • SonarQube (C++)

As you can see, this is clearly a C++ rule.

The suffixes are correctly set to C and not C++ in the web interface as well.
I was clutching at straws in trying to ‘correct’ it via the properties.


Hi,

Thanks for the details. I’ve flagged this for the team.

 
Ann


@KantarBruceAdams
To help us with the investigation, it would be very helpful if you could share the following files:

  • output from sonar-scanner in verbose mode: with -X option added
  • your compile_commands.json

If you think this file contains private information, let us know, and we’ll send you a private message that will allow you to send it privately.

In addition, what binary does /usr/bin/cc resolve to, in your case? I am interested in the name and version, but also the name of the binary (like gcc, g++).


cc resolves to /usr/bin/x86_64-linux-gnu-gcc-9

cc --version
cc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0

The logs probably don’t contain any private information but I’d prefer to do it that way just in case.

I can’t see any additional information in the log with -X.

The project itself is C as are the majority of the source files being analysed.
There is unit test code in C++ which uses:

extern "C"
{
#include "foo.h"
}

My thought is that perhaps this is why C++ rules are triggered for foo.h.
However, foo.h does not itself have an extern "C" block or an #ifdef __cplusplus guard for being compiled as C++.

Reproducing the issue is harder now, as I have marked it as a false positive and completed the pull request. C++ smells appear on the master branch (which is created by an ‘inscrutable’ Azure pipeline), but they are no longer detected as new code.

Snippets from the sonar log and compile_commands.json are:

grep -A 2 -B 2 foo.h sonarX.log


11:12:40.364 DEBUG: 'src/main/c/foo/foo_arch.h' indexed with language 'c'
11:12:40.365 DEBUG: 'src/main/c/foo/foo_errn.h' indexed with language 'c'
11:12:40.366 DEBUG: 'src/main/c/foo/foo.h' indexed with language 'c'
11:12:40.366 DEBUG: 'src/main/c/foo/foo_ver.h' indexed with language 'c'
11:12:40.367 DEBUG: 'src/main/c/foo/foo_tick.c' indexed with language 'c'
--
11:12:43.322 DEBUG: 'src/main/c/foo/foo_call.c' generated metadata with charset 'UTF-8'
11:12:43.325 DEBUG: 'src/main/c/foo/foo_user.h' generated metadata with charset 'UTF-8'
11:12:43.327 DEBUG: 'src/main/c/foo/foo.h' generated metadata with charset 'UTF-8'
11:12:43.336 DEBUG: 'src/main/c/foo/foo_misc.c' generated metadata with charset 'UTF-8'
11:12:43.346 DEBUG: 'src/main/c/foo/foo_main.c' generated metadata with charset 'UTF-8'
--
11:12:56.107 DEBUG: /someProject/foo_sonar/src/main/c/foo/foo_stm.h not marked as unchanged: file content changed
11:12:56.107 DEBUG: /someProject/foo_sonar/src/test/c/foo/foo_user_test.cpp not marked as unchanged: file content changed
11:12:56.107 DEBUG: /someProject/foo_sonar/src/main/c/foo/foo.h not marked as unchanged: file content changed
11:12:56.107 DEBUG: /someProject/foo_sonar/src/main/c/foo/foo_main.c not marked as unchanged: file content changed
11:12:56.107 DEBUG: /someProject/foo_sonar/src/main/c/foo/foo_errn.h not marked as unchanged: file content changed
--
11:12:56.404 DEBUG: Blame file (native) src/main/c/foo/foo_call.c
11:12:56.415 DEBUG: Blame file (native) src/main/c/foo/foo_user.h
11:12:56.416 DEBUG: Blame file (native) src/main/c/foo/foo.h
11:12:56.420 DEBUG: Blame file (native) src/main/c/foo/foo_misc.c
11:12:56.429 DEBUG: Blame file (native) src/main/c/foo/foo_main.c
--
11:12:56.653 INFO: CPD Executor 12 files had no CPD blocks
11:12:56.653 INFO: CPD Executor Calculating CPD for 32 files
11:12:56.653 DEBUG: Detection of duplications for /someProject/foo_sonar/src/main/c/foo/foo.h
11:12:56.660 DEBUG: Detection of duplications for /someProject/foo_sonar/src/main/c/foo/foo_io.h
11:12:56.661 DEBUG: Detection of duplications for /someProject/foo_sonar/src/main/c/foo/foo_ip.c

For the test code we of course have both C and C++ in the compile commands:

{
  "directory": "/someProject/foo_sonar/target/debug/src/main/c/foo",
  "command": "/usr/bin/cc   -O2 -g0 -O0 -g3 -fprofile-arcs -ftest-coverage   -o CMakeFiles/foo.dir/main.c.o   -c /someProject/foo_sonar/src/main/c/foo/main.c",
  "file": "/someProject/foo_sonar/src/main/c/foo/main.c"
},

{
  "directory": "/someProject/foo_sonar/target/debug/src/test/c/foo",
  "command": "/usr/bin/c++   -I/someProject/foo_sonar/src/main/c/foo  -g   -std=gnu++14 -o CMakeFiles/foo_testsuite1.dir/foo_user_test.cpp.o -c /someProject/foo_sonar/src/test/c/foo/foo_user_test.cpp",
  "file": "/someProject/foo_sonar/src/test/c/foo/foo_user_test.cpp"
},
{
  "directory": "/someProject/foo_sonar/target/debug/src/test/c/foo",
  "command": "/usr/bin/c++   -I/someProject/foo_sonar/src/main/c/foo  -g   -std=gnu++14 -o CMakeFiles/foo_testsuite1.dir/main.cpp.o -c /someProject/foo_sonar/src/test/c/foo/main.cpp",
  "file": "/someProject/foo_sonar/src/test/c/foo/main.cpp"
}

The C code is compiled into a static library which is linked into the test code, though Sonar only has the #includes to go on.


I should also add that I had one instance of C++ rules being applied to a .c file rather than a .h.
I discovered that the file was created by configuring (in cmake) a “.c.in” into a “.c”. I resolved that by adding .c.in as a C suffix in sonar.c.file.suffixes=.c,.h,.c.in.
At least I thought I did but I have since been informed that sonar.c.file.suffixes is no longer used. So how I fixed it is a mystery.


Hi,

Thank you for the additional information; that helps to clarify.

Was the issue you are referring to raised in a header file that is included from a C++ source file?
That would explain the behavior you are seeing. We analyze code in header files as part of the analysis of the source file that includes them. So if the header is included in a C++ source file (i.e. a file that is compiled as C++), it will be analyzed as C++ code.
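
To illustrate the point above, here is a minimal sketch of how the same header text is seen as C or as C++ depending on which source file includes it. The function name foo_header_language is hypothetical and is not part of the project discussed in this thread; it only stands in for the contents of a shared header such as foo.h.

```c
/* Hypothetical stand-in for code living in a shared header such as foo.h.
   The preprocessor sees the same text in every translation unit, but the
   compiler (and an analyzer following the compile command) treats it as
   C or as C++ depending on the including source file. */
static const char *foo_header_language(void)
{
#ifdef __cplusplus
    return "C++";   /* reached when included from a file compiled as C++ */
#else
    return "C";     /* reached when included from a file compiled as C */
#endif
}
```

Built as part of a .c file this returns "C"; the identical text pulled into foo_user_test.cpp would take the other branch, which is why a header has no single language of its own.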

As you rightly point out, this should not apply to code that is placed inside an extern "C" block, and it is most likely a bug in the rule implementation. For which rule or rules have you seen the issue being reported?

We determine the language in which a file should be analyzed by looking at the compilation command for the source file (included headers are analyzed with it). This also allows us to determine the correct language standard to use. Is it possible that the file was compiled in C++ mode before?
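
As a rough illustration of the idea of reading the language off a compile command, here is a simplified sketch. It is NOT the analyzer's actual logic: the names lang_guess, has_plus_plus and guess_language are hypothetical, and the heuristic (a -std= flag first, then the compiler driver name) is an assumption chosen to match the compile_commands.json entries quoted earlier in this thread. Real compilers also take the file suffix and -x into account, which this sketch ignores.

```c
#include <stddef.h>
#include <string.h>

typedef enum { GUESS_UNKNOWN, GUESS_C, GUESS_CXX } lang_guess;

/* Return 1 if the first len bytes of s contain "++", else 0. */
static int has_plus_plus(const char *s, size_t len)
{
    for (size_t i = 0; i + 1 < len; i++) {
        if (s[i] == '+' && s[i + 1] == '+') {
            return 1;
        }
    }
    return 0;
}

/* Guess the language from one "command" string of compile_commands.json.
   Hypothetical heuristic, for illustration only. */
static lang_guess guess_language(const char *command)
{
    /* An explicit standard is the strongest hint: gnu++14 versus gnu11. */
    const char *std_flag = strstr(command, "-std=");
    if (std_flag != NULL) {
        const char *value = std_flag + strlen("-std=");
        size_t value_len = strcspn(value, " \t");
        return has_plus_plus(value, value_len) ? GUESS_CXX : GUESS_C;
    }

    /* Otherwise look at the driver, i.e. the first word of the command. */
    size_t driver_len = strcspn(command, " \t");
    if (driver_len == 0) {
        return GUESS_UNKNOWN;
    }

    /* Skip any directory part so only the binary name is inspected. */
    size_t base = 0;
    for (size_t i = 0; i < driver_len; i++) {
        if (command[i] == '/') {
            base = i + 1;
        }
    }
    return has_plus_plus(command + base, driver_len - base) ? GUESS_CXX
                                                            : GUESS_C;
}
```

With the entries shown in this thread, the /usr/bin/cc command for main.c would be guessed as C and the /usr/bin/c++ commands with -std=gnu++14 as C++. Note that headers never appear in these commands, so their language can only follow from the source file that includes them.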

For the .c.in cases it was definitely only compiled as C.

For the header case it’s only two rules:

cpp:S5028 - Macros should not be used to define constants
cpp:S5416 - “using” should be preferred for type aliasing
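
For readers unfamiliar with these two rules, here is a sketch of the kind of ordinary C header code they are aimed at. All names (FOO_NAME_LEN, foo_user_t, foo_callback, foo_user_id) are made up for illustration and are not taken from the real foo.h; which exact lines the rules would flag is an assumption based on the rule titles.

```c
/* Hypothetical excerpt in the style of a C header; names are invented. */

/* Idiomatic in C. Read as C++, a macro constant like this is the kind of
   construct S5028 ("Macros should not be used to define constants")
   is about. */
#define FOO_NAME_LEN 16

/* Idiomatic in C. Read as C++, typedef aliases like these are the kind of
   construct S5416 ("using" should be preferred for type aliasing)
   is about. */
typedef struct foo_user {
    int  id;
    char name[FOO_NAME_LEN];
} foo_user_t;

typedef int (*foo_callback)(const foo_user_t *user);

/* Small helper so the aliases above are exercised. */
static int foo_user_id(const foo_user_t *user)
{
    return user->id;
}
```

Neither suggestion can be followed in C: `using` aliases do not exist in the language, and a `const` variable is not a constant expression usable as an array size at file scope, so both findings are noise when the code is really C.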

I was thinking it would be a bug in the rule selection code (enabling all C++ rules) rather than in any specific rule, but then it could have been a lot worse, so perhaps not.

As I understand it, the issue no longer reproduces. But if you notice C++ issues being raised in a C source file again, could you please open a new thread?

I have created ticket CPP-3976 for the above false positives in C code placed inside an extern "C" block, including whole headers that are lexically included. You can use it to track the status of the issue.
