Build wrapper seems not to capture the full build on Linux when building C/C++

Hello,

I’m trying to add my build to a project on our SonarQube Enterprise server, but I’m having trouble getting through the analysis step. Our build uses Makefiles and compiles with gcc.

The build produces embedded Linux images, so it includes Linux kernel code along with all of our applications and sub-systems that go onto the image. Our code base is about 1.5 million lines of code.

After some time working with the build wrapper I was able to generate a build-wrapper-dump.json file and feed that into a run of sonar-scanner. It ran for four hours, and when it tried to upload the 1.6 GB compressed analysis report to our SonarQube portal, the server choked on it and returned a “500” error.
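For context, the wrapped build and scan steps I’m running look roughly like this (the output directory name and make targets are placeholders, not our exact invocation):

```shell
# Wrap the full, clean build so the wrapper can observe every compiler call
build-wrapper-linux-x86-64 --out-dir bw-output make clean all

# Then point the scanner at the wrapper's output directory
sonar-scanner -Dsonar.cfamily.build-wrapper-output=bw-output
```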

I talked to our admin for the portal and he asked that I try to reduce the size of the analysis report. I said “OK” and adjusted the invocation of sonar-scanner to exclude the Linux kernel code from analysis.

When I tried running sonar-scanner with the kernel code excluded, I received this error:

ERROR: Error during SonarScanner execution
java.lang.IllegalStateException: The “build-wrapper-dump.json” file was found but 0 C/C++/Objective-C files were analyzed. Please make sure that:

  • you are using the latest version of the build-wrapper and the CFamily analyzer
  • you are correctly invoking the scanner with correct configuration
  • your compiler is supported
  • you are wrapping your build correctly
  • you are wrapping a full/clean build
  • you are providing the path to the correct build-wrapper output directory
  • you are building and analyzing the same source checkout, absolute paths must be identical in build and analysis steps

I then looked in my “build-wrapper-dump.json” file. It appears that the build wrapper is only capturing part of my build, and the part it has captured is the kernel code. Using “grep” and “wc -l” I can see the build wrapper has made 141 captures, far fewer than I would expect for a build of this size.
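For illustration, here is roughly the kind of counting I did, shown against a tiny stand-in file (the real dump is huge, and I’m assuming each capture entry carries a "cwd" key, which is what I keyed on):

```shell
# Stand-in for a real build-wrapper-dump.json; the real one should have
# thousands of entries if all compiler invocations were captured. Each element
# of "captures" is assumed to describe one recognized compiler invocation.
cat > sample-dump.json <<'EOF'
{"version":"6.34","captures":[
  {"compiler":"gcc","cwd":"/src/linux","executable":"/usr/bin/gcc"},
  {"compiler":"gcc","cwd":"/src/app","executable":"/opt/toolchain/bin/aarch64-linux-gnu-gcc"}
]}
EOF

# Count recognized invocations: one "cwd" key per capture.
grep -o '"cwd"' sample-dump.json | wc -l   # prints 2, one per capture
```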

So, I think the issue here is that the build wrapper is not capturing most of my build. What can be done to address that?

Versions:
build-wrapper-linux-x86-64 – 6.34
sonar-scanner – 4.7.0.2747 - linux

Thanks,

Bruce

I’ve got some more information. I’m attaching a zip file containing my build-wrapper-dump.json and build-wrapper.log files.

Looking at my build-wrapper.log, I’m seeing that the build wrapper is ignoring most compiler commands. The only ones it seems to be capturing are the HOSTCC commands issued while building the Linux kernel code.

In this instance I’m cross-compiling for arm64. I had originally targeted x86_64 and wanted to see whether targeting a different architecture made a difference. It does not.

Thanks,

Bruce

build-wrapper_20230127-1453.zip

Hi,

Welcome to the community!

What’s your version of SonarQube?

I think the first thing to look at is why the server was choking on the full build. If it’s not possible to tune that, then as a fallback we can focus on getting a successful partial build/analysis.

 
Ann

Hi Ann,

Sorry for the delay. My work is often interrupt-driven, and I’ve been putting out fires for the past few days.

While I recognize that the problem uploading to the portal needs to be addressed, it feels a bit backwards to go after that before the problem of the build wrapper not recognizing the vast majority of compiler invocations in our build. For what it’s worth, the “full build” (the one that generated the analysis report the server choked on) is only a small fraction of a real full build, because the build wrapper provided just 141 compiler invocations for analysis.

That said, you asked to look at the portal issue first, so let’s do that.

Our SonarQube server version is 9.5.0.56709.

I’ll add a link to a zip file containing the log, from several days ago, of a scan that included the Linux kernel code and ended with the SQ server choking on the analysis report. The zip will also contain my sonar-scanner.properties file and my sonar-project.properties file with the “sonar.login” setting redacted.

I’m going to start a build and an analysis pass to run overnight so that I will have a full set of logs from end to end.

Bruce

sq_srv_500_err.zip

Hi,

I wanted to start with the upload error because that’s usually simple. :smiley:

In fact, I didn’t see the error in your log that I anticipated. Normally it’s a connection closed error because there’s something “helpful” on the network gating traffic for payloads over a certain size.

You, on the other hand, have a 500 error. It may still be something “helpful” on the network, but the first thing to do is check your server logs to see if the 500 error is reflected there & what they say the problem is. If the 500 error doesn’t show up on the server side, then it is time to talk to your network folks.

And in parallel, since you’re not on the latest version, can you upgrade to SonarQube 9.9 LTS (announced Tuesday) and see if you still have build-wrapper problems? If you do, please start a new thread for that. It will have to be routed differently.

 
Ann

Hi Ann,

After being buried under emergent tasks, I’m finally getting time to work on projects again.

In the time between my last message and now, the manager of our SonarQube infrastructure has updated our server to “Enterprise Edition - Version 9.8 (build 63668)”. I know you asked for 9.9, but our manager prefers to let newly released software versions “cook” for a while; the idea is to let eager adopters find any undiscovered issues.

I will initiate another build to attempt to reproduce this problem and capture it in much more recent logs.


Hi Ann,

I’ve reproduced the problem and was able to get a copy of the web log from the web server.

The sonar-project.properties and the sonar-scanner.properties files are unchanged from my previous zip file.

The analysis scan ends in the same error as before. The web server log shows a Java stack trace; I believe this snippet shows the cause of the error:

Caused by: java.io.IOException: Bind message length 1,251,769,932 too long.  This can be caused by very large or incorrect length specifications on InputStream parameters.
	at org.postgresql.core.v3.QueryExecutorImpl.sendBind(QueryExecutorImpl.java:1689)
	at org.postgresql.core.v3.QueryExecutorImpl.sendOneQuery(QueryExecutorImpl.java:1968)
	at org.postgresql.core.v3.QueryExecutorImpl.sendQuery(QueryExecutorImpl.java:1488)
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:348)
	... 153 common frames omitted

I googled around a bit, found this source for QueryExecutorImpl.java, and it shows this if statement:

if (encodedSize > 0x3fffffff) {
  throw new PGBindException(new IOException(GT.tr(
          "Bind message length {0} too long.  This can be caused by very large or incorrect length specifications on InputStream parameters.",
          encodedSize)));
}
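To sanity-check that guard against the number in my log (assuming the 1,251,769,932 figure really is the bind message size in bytes):

```shell
# 0x3fffffff is the pgjdbc bind-message ceiling: 2^30 - 1 bytes, just under 1 GiB
printf 'limit:  %d\n' $((0x3fffffff))   # limit:  1073741823

# The bind message from my failed upload was larger than that
printf 'report: %d\n' 1251769932        # report: 1251769932
```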

If I read that correctly, there is a hard-coded limit on the size of the data an analysis scan can upload to the web portal. If that is the case, what can be done to reduce the size of the data the analysis scan sends to the web portal?

Thanks,

Bruce

Scan and web server logs

Hi Bruce,

Maybe you can let your SonarQube admin know that 10.0 is set to drop on the 3rd? We do have a few small things that will be fixed in a 9.9.1 (I don’t have a date for that), but we generally only issue point releases for LTS versions, which 9.9 is.

I ask because 9.8 is officially unsupported. :neutral_face:

As for a way to reduce what the analysis sends, I’m not aware of one. How big is the project in terms of files?

Could you (briefly) turn on debug logging so we could see the SQL that precedes that error?

The only real option would be to add exclusions to reduce the source code set under analysis.

Even though you’re still (at the moment :smiley:) on 9.8, probably nothing directly related to this changed in 9.9, so we can probably move forward if we have the SQL and an approximate file count.

 
Ann

Hi Ann,

In terms of files, my count is 90,718. That is from a cleaned workspace, so there are no build artifacts, but it does include config files, build-system scripts, and other non-source files.

Of those files, 62,290 are from the Linux kernel. I have tried to exclude the kernel files from the analysis, but (I believe because of the other issue I’m dealing with, where the build wrapper only recognizes invocations of the natively installed compiler from the kernel source tree, not invocations of the cross toolchain’s compiler) the analysis never reaches the upload to the web portal. It fails with the error:

java.lang.IllegalStateException: The “build-wrapper-dump.json” file was found but 0 C/C++/Objective-C files were analyzed.

When I looked at this earlier, I counted only 141 recognized compiler invocations. I expect the total number of actual compiler invocations to be about two orders of magnitude larger, so I think it’s reasonable to expect the analysis to keep producing large data sets once that is fixed.

I will work with my SQ infrastructure admin in the first part of next week to coordinate bumping the SQL logging level to DEBUG and reproducing the issue.

Bruce


Hi Ann,

I have been able to get the logs showing SQL. You can find the related log entries in the web log by searching for the string “fbx-test_project” around the “2023.04.06 00:03:04” time stamp.

Bruce
2023.04.06 SQ server logs

Hi Bruce,

Thanks for the log. I’m going to flag this for more expert eyes.

 
Ann

Hi Ann,

It’s coming up on three weeks since I’ve heard anything on this issue. Could you update me on the status and any estimate on moving this forward?

Thanks,

Bruce

Hi Bruce,

I’ve flagged this for the team. I don’t have any other updates.

 
Ann

Hello Bruce,

There is a 1 GB limit on the bytea datatype in Postgres, which we use to store the analysis report in the database.
Unfortunately, at the moment there is no workaround besides reducing the scope of the analysis.

Eric


Hi Bruce,

You said it was backwards to look first at why the server was choking. I guess you were right.

So coming back to the build: the build wrapper is picking up the Linux kernel compile because that’s what you’re building; excluding the kernel from analysis would come later. You said you excluded the kernel code from the scan. What adjustment did you make for this?

 
Ann

Hi Ann,

I have another thread going with a sales engineer who says they are in contact with R&D about the build wrapper not recognizing invocations of the gcc located in the directory tree of the toolchain we are cross-compiling with. The only gcc invocations the build wrapper recognizes are those of a “host gcc”, i.e. a gcc installed on the host build system.

When we first started looking at the problem of the server choking, I tried to exclude the Linux kernel source from the scanning step. Unfortunately, since the kernel sources were the only place where an invocation of gcc was recognized by the build wrapper, excluding those sources caused an error complaining that “…0 C/C++/Objective-C files were analyzed.”

To answer your question: I excluded the kernel source code from the scan by adding the directory tree containing those sources to the “sonar.exclusions” setting in my “sonar-project.properties” file.
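Concretely, the entry looked something like this (the “linux/**” pattern here is a placeholder for our actual kernel directory, not our real layout):

```properties
# sonar-project.properties (excerpt)
# Keep the kernel tree out of the analysis; adjust the pattern to the
# real location of the kernel sources in the checkout.
sonar.exclusions=linux/**
```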

Looking at the captured output from one of the sets of logs I’ve shared, I’m seeing a great deal of text generated by “Sensors” for different types of files: Python scripts, JavaScript, CSS files, XML files, etc. Since our intended focus for these scans is C/C++ code, is there a way to prevent these apparently extraneous analyses of non-C/C++ code from running? My assumption is that these analyses are contributing to the size of the data sent to the SQ server, and that removing them will reduce the amount of data being sent.

Thoughts?

Bruce

Hi Bruce,

The cleanest way to narrow the analysis to only C and C++ is to set sonar.inclusions=**/*.c,**/*.cpp (or whatever extensions you’re using). Note that you may need to explicitly include header files too.
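For example, something along these lines (the extensions shown are just common ones; adjust to whatever your tree actually uses):

```properties
# sonar-project.properties (excerpt)
sonar.inclusions=**/*.c,**/*.cpp,**/*.h,**/*.hpp
```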

 
HTH,
Ann

Thanks Ann,

I will try those explicit inclusions.

Does making explicit inclusions turn off what appear to be implicit inclusions?

Bruce

Hi Bruce,

Yes. The hierarchy of settings is explained here, and the interactions among “sources”, inclusions & exclusions are discussed here.

 
HTH,
Ann