Publish Quality Gate Result to Azure DevOps Task started to fail with ETIMEDOUT

Setup

  • SonarQube Developer Edition Version 9.4 (build 54424) server
  • Azure DevOps Services
  • Azure DevOps build agents are all self-hosted.

Problem Summary
This setup had been working without any issues until the 1st of June.

Since that date we have been seeing a problem publishing quality gate results, but on just one of our builds. All our other project builds are working as expected.

The error occurs after the ‘Publish Quality Gate Result’ task (version 4.9.3) has been running for around 4 minutes (prior to this problem the task took about 8 minutes to run).

The log shows a series of successful API calls, then a failure:

##[debug][SQ] Waiting for task 'AYE98Bq2NkN4BFQ0w0HC' to complete.
##[debug][SQ] API GET: '/api/ce/task' with query "{"id":"AYE98Bq2NkN4BFQ0w0HC"}"
##[debug]Response: 200 Body: "{"task":{"id":"AYE98Bq2NkN4BFQ0w0HC","type":"REPORT","componentId":"AWjmZhRYLOB8UbECIods","componentKey":"proj-xamarin","componentName":"proj- Xamarin","componentQualifier":"TRK","status":"IN_PROGRESS","submittedAt":"2022-06-07T11:32:47+0000","submitterLogin":"analysis","startedAt":"2022-06-07T11:32:48+0000","executionTimeMs":116903,"warnings":}}"
##[debug][SQ] Task status:IN_PROGRESS
##[debug][SQ] Waiting for task 'AYE98Bq2NkN4BFQ0w0HC' to complete.
##[debug][SQ] API GET: '/api/ce/task' with query "{"id":"AYE98Bq2NkN4BFQ0w0HC"}"
##[debug][SQ] API GET '/api/ce/task' failed, error was: {"code":"ETIMEDOUT","errno":"ETIMEDOUT","syscall":"connect","address":"13.107.246.64","port":443}
##[error][SQ] API GET '/api/ce/task' failed, error was: {"code":"ETIMEDOUT","errno":"ETIMEDOUT","syscall":"connect","address":"13.107.246.64","port":443}
##[debug]Processed: ##vso[task.issue type=error;][SQ] API GET '/api/ce/task' failed, error was: {"code":"ETIMEDOUT","errno":"ETIMEDOUT","syscall":"connect","address":"13.107.246.64","port":443}
##[debug][SQ] Publish task error: [SQ] Could not fetch task for ID 'AYE98Bq2NkN4BFQ0w0HC'
##[debug]task result: Failed

So we can see that many successful calls to the SonarQube API are made while the code analysis is being performed (which, as mentioned, takes about 8 minutes), and then the failure occurs.
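For context, the task is simply polling the Compute Engine task endpoint until the background task leaves IN_PROGRESS. A minimal sketch of that loop, assuming Node 18+ (for the global fetch) and a user token sent as basic auth; pollCeTask, serverUrl and token are illustrative names, not the extension's actual code:

// Minimal sketch of polling the SonarQube CE task endpoint until it completes.
// serverUrl, token and taskId are placeholders; this is not the extension's real code.
async function pollCeTask(serverUrl: string, token: string, taskId: string): Promise<string> {
  const auth = Buffer.from(`${token}:`).toString("base64");
  while (true) {
    const res = await fetch(`${serverUrl}/api/ce/task?id=${taskId}`, {
      headers: { Authorization: `Basic ${auth}` },
    });
    if (!res.ok) {
      throw new Error(`[SQ] Could not fetch task for ID '${taskId}' (HTTP ${res.status})`);
    }
    const { task } = await res.json();
    if (task.status !== "IN_PROGRESS" && task.status !== "PENDING") {
      return task.status; // SUCCESS, FAILED or CANCELED
    }
    await new Promise((resolve) => setTimeout(resolve, 1000)); // wait before polling again
  }
}

Each iteration issues exactly the GET shown in the log above, so it is one of those repeated connect attempts that is timing out.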

Analysis
This is a repeatable issue, i.e. it only affects one pipeline and it always occurs at around 4 minutes (of an expected 8) into the task run.

This seems similar to this issue, but I tried adding the ‘Use Node 12’ task and it had no effect.

Just to confirm, between the dates when it worked and when it started to fail:

  • The ‘Publish Quality Gate Result’ Task has not changed (version 4.9.3)
  • We have not upgraded SonarQube
  • We have not upgraded our self-hosted agents
  • This only affects a single build, and it was working prior to the 1st of June
  • The SonarQube analysis is completing without error in the expected 8 minutes (looking at the background tasks)
  • We are experiencing no reported network issues

Any suggestions as to what to try?

I would suggest taking a look at the access.log of your SonarQube server (to see if the failing requests actually make it), and then the access log of whatever is sitting in front of your SonarQube server to serve it over HTTPS (a reverse proxy, typically). This might help you narrow down the issue.
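If it helps, something like the following could be used to scan access.log for the task ID from the failing calls, to confirm whether those requests ever reached SonarQube. This is a rough Node sketch; logPath and taskId are placeholders for your own values:

// Minimal sketch: scan the SonarQube access.log for requests to /api/ce/task
// mentioning a given task ID. logPath and taskId are placeholders.
import { createReadStream } from "node:fs";
import { createInterface } from "node:readline";

async function findCeTaskRequests(logPath: string, taskId: string): Promise<void> {
  const rl = createInterface({ input: createReadStream(logPath) });
  for await (const line of rl) {
    if (line.includes("/api/ce/task") && line.includes(taskId)) {
      console.log(line); // each matching line means the request reached SonarQube
    }
  }
}

If the failing call never shows up there, the timeout is happening somewhere on the path between the agent and the server.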

Since you’re self-hosting your agents, I would also suggest seeing if you can draw some correlation between the agents and the failure (for example: do all the builds for this project end up on one agent, and is that always the one that times out?).
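One more quick check, since the reported error is a connect timeout on port 443: try a raw TCP connect to the SonarQube host from one of the affected agents while a build is running. A small Node sketch of that test (host, port and the 20-second timeout are placeholders for your own values):

// Minimal sketch: test the raw TCP connect the task is failing on.
// host and port are placeholders for your own SonarQube endpoint.
import { Socket } from "node:net";

function testConnect(host: string, port = 443, timeoutMs = 20_000): void {
  const socket = new Socket();
  socket.setTimeout(timeoutMs);
  socket.once("connect", () => {
    console.log(`Connected to ${host}:${port}`);
    socket.end();
  });
  socket.once("timeout", () => {
    console.error(`Connect to ${host}:${port} timed out after ${timeoutMs} ms`); // mirrors the ETIMEDOUT in the task log
    socket.destroy();
  });
  socket.once("error", (err) => console.error("Connect failed:", err));
  socket.connect(port, host);
}

If that also times out intermittently from the agent, it points at the network path or whatever terminates TLS in front of SonarQube rather than the task itself.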

Strangely, just as this issue appeared out of the blue, it has now disappeared. Since I posted this question, publishing of the quality gate results has started working again, and as before we have made no changes to our network, certificates, agents, SonarQube, etc.

All very strange.

@Colin, as to your suggestions:

  • The access logs show no obvious errors
  • The issue was seen when the build was running on all the agents in the agent pool, so we cannot pin it down to a single bad agent. Also, all our agents are identical, built from the same base Hyper-V master disk, so we can rule out different agent configurations.

I guess all I can do is monitor it and see what happens. Thanks for the pointers.