How to deal with hanging tasks?

We are running 8.3.1 community edition. And we regularly see a task running for a really long time. Most of the time it runs in a matter of seconds. But occasionally one will get stuck running for many many minutes.

Is there a way to kill a long running task?
And how do we diagnose what’s going on?

I got to the logs, and it looks like at some point we start getting timeouts getting to the PostgreSQL server.

It does appear that this has happened a few times. Rebooting the machine always clears it.

Are there any settings we should be looking at to make our connection postgres more reliable?

The error we are seeing is always this:


2020.06.25 03:41:52 ERROR web[AXLpNnurS9Cr3slMAAKC][o.s.s.w.WebServiceEngine] Fail to process request http://sonarqube.bf-app-services-fra02.np.wce.ibm.com/api/settings/values.protobuf
org.apache.ibatis.exceptions.PersistenceException: 
### Error querying database.  Cause: org.postgresql.util.PSQLException: An I/O error occurred while sending to the backend.
### The error may exist in org.sonar.db.property.InternalPropertiesMapper
### The error may involve org.sonar.db.property.InternalPropertiesMapper.selectAsText-Inline
### The error occurred while setting parameters
### SQL: select       kee as "key",       is_empty as empty,       text_value as value,       created_at as createdAt     from       internal_properties     where       kee in  (  ? )
### Cause: org.postgresql.util.PSQLException: An I/O error occurred while sending to the backend.

Hi,

This SO answer might help.

 
Ann

We also see frequent hangs in background tasks - which is a newish problem. We use 8.3.1, it started in the 8.x series, I think.

I have not seen any ERROR output, it just hangs. Until we notice and restart the service.

We run it in a docker container with a companion PostgreSQL in its own container.
So I am not sure the suggested SO is relevant (for our problem, anyway).

I enabled trace output before the last hang, if it is something you can use for anything?
I will be happy to spend time testing workaround/fixes.

Hi @jskovjyskebankdk,

We’ve had other reports that you can hit this problem when you run out of file space. I’d be interested to see what errors if any you have in ce.log when this happens.

 
Ann

No errors in ce.log. And 63GB free disk space.

Below the fairly (to my eyes) harmless output at the last hung session.

Can I provide you with some thread/heap dumps the next time it happens?
It see it pretty much daily, or at least weekly. Random projects, it seems.

2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][sql] time=0ms | sql=select i.id, i.kee as kee, i.rule_id as ruleId, i.severity as severity, i.manual_severity as manualSeverity, i.message as message, i.line as line, i.locations as locations, i.gap as gap, i.effort as effort, i.status as status, i.resolution as resolution, i.checksum as checksum, i.assignee as assigneeUuid, i.author_login as authorLogin, i.tags as tagsString, i.issue_attributes as issueAttributes, i.issue_creation_date as issueCreationTime, i.issue_update_date as issueUpdateTime, i.issue_close_date as issueCloseTime, i.created_at as createdAt, i.updated_at as updatedAt, r.is_external as "isExternal", r.plugin_rule_key as ruleKey, r.plugin_name as ruleRepo, r.language as language, p.kee as componentKey, i.component_uuid as componentUuid, p.module_uuid as moduleUuid, p.module_uuid_path as moduleUuidPath, p.path as filePath, root.kee as projectKey, i.project_uuid as projectUuid, i.issue_type as type from issues i inner join rules r on r.id=i.rule_id inner join components p on p.uuid=i.component_uuid inner join components root on root.uuid=i.project_uuid where i.component_uuid = ? and i.status <> 'CLOSED' | params=AXDJDRYMfQL6chJP28Op
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.i.IntegrateIssuesVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/KundeStamKndJBDW.java (done) | time=2ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.MaintainabilityMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/KundeStamKndJBDW.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.MaintainabilityMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/KundeStamKndJBDW.java (done) | time=0ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.NewMaintainabilityMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/KundeStamKndJBDW.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.NewMaintainabilityMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/KundeStamKndJBDW.java (done) | time=0ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.ReliabilityAndSecurityRatingMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/KundeStamKndJBDW.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.ReliabilityAndSecurityRatingMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/KundeStamKndJBDW.java (done) | time=0ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.NewReliabilityAndSecurityRatingMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/KundeStamKndJBDW.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.NewReliabilityAndSecurityRatingMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/KundeStamKndJBDW.java (done) | time=0ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.SecurityReviewMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/KundeStamKndJBDW.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.SecurityReviewMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/KundeStamKndJBDW.java (done) | time=0ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.NewSecurityReviewMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/KundeStamKndJBDW.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.NewSecurityReviewMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/KundeStamKndJBDW.java (done) | time=0ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.s.LastCommitVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/KundeStamKndJBDW.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.s.LastCommitVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/KundeStamKndJBDW.java (done) | time=0ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.m.MeasureComputersVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/KundeStamKndJBDW.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.m.MeasureComputersVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/KundeStamKndJBDW.java (done) | time=0ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.i.IntegrateIssuesVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/AccountJess.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][sql] time=1ms | sql=select i.id, i.kee as kee, i.rule_id as ruleId, i.severity as severity, i.manual_severity as manualSeverity, i.message as message, i.line as line, i.locations as locations, i.gap as gap, i.effort as effort, i.status as status, i.resolution as resolution, i.checksum as checksum, i.assignee as assigneeUuid, i.author_login as authorLogin, i.tags as tagsString, i.issue_attributes as issueAttributes, i.issue_creation_date as issueCreationTime, i.issue_update_date as issueUpdateTime, i.issue_close_date as issueCloseTime, i.created_at as createdAt, i.updated_at as updatedAt, r.is_external as "isExternal", r.plugin_rule_key as ruleKey, r.plugin_name as ruleRepo, r.language as language, p.kee as componentKey, i.component_uuid as componentUuid, p.module_uuid as moduleUuid, p.module_uuid_path as moduleUuidPath, p.path as filePath, root.kee as projectKey, i.project_uuid as projectUuid, i.issue_type as type from issues i inner join rules r on r.id=i.rule_id inner join components p on p.uuid=i.component_uuid inner join components root on root.uuid=i.project_uuid where i.component_uuid = ? and i.status <> 'CLOSED' | params=AXDJDRYMfQL6chJP28Oq
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.i.IntegrateIssuesVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/AccountJess.java (done) | time=2ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.MaintainabilityMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/AccountJess.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.MaintainabilityMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/AccountJess.java (done) | time=0ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.NewMaintainabilityMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/AccountJess.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.NewMaintainabilityMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/AccountJess.java (done) | time=0ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.ReliabilityAndSecurityRatingMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/AccountJess.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.ReliabilityAndSecurityRatingMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/AccountJess.java (done) | time=0ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.NewReliabilityAndSecurityRatingMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/AccountJess.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.NewReliabilityAndSecurityRatingMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/AccountJess.java (done) | time=0ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.SecurityReviewMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/AccountJess.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.SecurityReviewMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/AccountJess.java (done) | time=0ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.NewSecurityReviewMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/AccountJess.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.NewSecurityReviewMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/AccountJess.java (done) | time=0ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.s.LastCommitVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/AccountJess.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.s.LastCommitVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/AccountJess.java (done) | time=0ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.m.MeasureComputersVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/AccountJess.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.m.MeasureComputersVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/AccountJess.java (done) | time=0ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.i.IntegrateIssuesVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/DWAccountId.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][sql] time=1ms | sql=select i.id, i.kee as kee, i.rule_id as ruleId, i.severity as severity, i.manual_severity as manualSeverity, i.message as message, i.line as line, i.locations as locations, i.gap as gap, i.effort as effort, i.status as status, i.resolution as resolution, i.checksum as checksum, i.assignee as assigneeUuid, i.author_login as authorLogin, i.tags as tagsString, i.issue_attributes as issueAttributes, i.issue_creation_date as issueCreationTime, i.issue_update_date as issueUpdateTime, i.issue_close_date as issueCloseTime, i.created_at as createdAt, i.updated_at as updatedAt, r.is_external as "isExternal", r.plugin_rule_key as ruleKey, r.plugin_name as ruleRepo, r.language as language, p.kee as componentKey, i.component_uuid as componentUuid, p.module_uuid as moduleUuid, p.module_uuid_path as moduleUuidPath, p.path as filePath, root.kee as projectKey, i.project_uuid as projectUuid, i.issue_type as type from issues i inner join rules r on r.id=i.rule_id inner join components p on p.uuid=i.component_uuid inner join components root on root.uuid=i.project_uuid where i.component_uuid = ? and i.status <> 'CLOSED' | params=AXDJDRYMfQL6chJP28Os
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.i.IntegrateIssuesVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/DWAccountId.java (done) | time=2ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.MaintainabilityMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/DWAccountId.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.MaintainabilityMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/DWAccountId.java (done) | time=0ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.NewMaintainabilityMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/DWAccountId.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.NewMaintainabilityMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/DWAccountId.java (done) | time=0ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.ReliabilityAndSecurityRatingMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/DWAccountId.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.ReliabilityAndSecurityRatingMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/DWAccountId.java (done) | time=0ms
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.NewReliabilityAndSecurityRatingMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/DWAccountId.java
2020.06.29 13:21:09 TRACE ce[AXL_LrSMoECOuylqj62b][o.s.c.t.p.q.NewReliabilityAndSecurityRatingMeasuresVisitor] Visiting component jb.it.jrm:jb.it.jrm.ejb/src/main/java/dk/jyskebank/jrm/domain/helper/DWAccountId.java (done) | time=0ms
2020.06.29 13:22:44 TRACE ce[][o.s.c.c.CeCleaningSchedulerImpl] Resetting state of tasks with unknown worker UUIDs
2020.06.29 13:31:11 TRACE ce[][o.p.c.QueryExecutorBase]  FE=> Terminate

A thread dump would really help.

@dmeneses OK, will do next time I see a hang. Cheers!

Hi @dmeneses ,

Here are thread dumps from the 4 java processes running in the docker container (#1 is the main process).

stack1.txt (37.5 KB)
stack2.txt (66.7 KB) stack3.txt (50.4 KB) stack4.txt (38.9 KB)

And the support info, if that is relevant.

sonarqube-support-info-B16BF065-AXDI2EJE8iwh_7V0_e4H-2020-7-6-8-22.json (21.0 KB)

Please let me know if you want me to do something else.

Cheers,
Jesper

PS: Around 200 jobs were pending when restarting, and it hung again. So I got another thread dump for comparison stack1-reset.txt (36.9 KB)

Thanks.
Based on the thread dumps, it looks like the Compute Engine had one active task that was busy writing logs at the time when the thread dump was performed.
This might indicate that the volume where logs are written is running out of space, or is blocking for another reason.
Could you please check the volume and maybe try creating files in it with the same user that is running SonarQube to see if it’s in a writable state?

Sure, will do next time there is a hang. Thanks!

Hi @dmeneses

I tried updating to the new 8.4, and also updated the JVM to 11.0.7.

I also updated the heap/file limits to the new recommendations for 8.4.

Sadly it hangs in the same way _hung-2020.07.13.txt (34.6 KB)

I tried connecting to the container, and can without problems write to the logs folder.
access.log in the same folder also gets updated without problems.

Maybe something in the logback configuration causing problems?

Seems like others experience the same problem with logback: https://jira.qos.ch/projects/LOGBACK/issues/LOGBACK-1406

Workarounds include reducing logging, and avoiding use of console appender.

The first suggestion is not worth much in practice.
But if the log output is going into a file anyway, maybe allow console log output to be suppressed?

The last logs you sent seem to be from the start process, not from the Compute Engine.

Anyway, good find about the Logback issue. Unfortunately, the bug report and the comments don’t really explain why the thread holding the lock is blocked writing to the file. I actually added a comment to it, since a lot of the comments are misleading.

Looking at the logs of the CE that you sent before (stack4.txt), it appears that you’re experiencing the same problem, except that there are no other threads waiting for the lock to be released.

"Worker 0 (UUID=AXMf7TQqAK9iwH23_ROA) on ce-worker-0" #33 prio=1 os_prio=0 cpu=92818.66ms elapsed=48441.33s tid=0x00007f2654fb0800 nid=0x138 runnable  [0x00007f26156f4000]
   java.lang.Thread.State: RUNNABLE
	at java.io.FileOutputStream.writeBytes(java.base@11.0.5/Native Method)
	at java.io.FileOutputStream.write(java.base@11.0.5/FileOutputStream.java:354)
	at java.io.BufferedOutputStream.write(java.base@11.0.5/BufferedOutputStream.java:123)
	- locked <0x00000000e07945e0> (a java.io.BufferedOutputStream)
	at java.io.PrintStream.write(java.base@11.0.5/PrintStream.java:559)
	- locked <0x00000000e07945b8> (a java.io.PrintStream)
	at java.io.FilterOutputStream.write(java.base@11.0.5/FilterOutputStream.java:108)
	at ch.qos.logback.core.joran.spi.ConsoleTarget$1.write(ConsoleTarget.java:37)
	at ch.qos.logback.core.OutputStreamAppender.writeBytes(OutputStreamAppender.java:199)
	at ch.qos.logback.core.OutputStreamAppender.subAppend(OutputStreamAppender.java:231)
	at ch.qos.logback.core.OutputStreamAppender.append(OutputStreamAppender.java:102)
	at ch.qos.logback.core.UnsynchronizedAppenderBase.doAppend(UnsynchronizedAppenderBase.java:84)
	at ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:51)
	at ch.qos.logback.classic.Logger.appendLoopOnAppenders(Logger.java:270)
	at ch.qos.logback.classic.Logger.callAppenders(Logger.java:257)
	at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:421)
	at ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus(Logger.java:383)
	at ch.qos.logback.classic.Logger.info(Logger.java:579)
	at org.sonar.api.utils.log.LogbackLogger.doInfo(LogbackLogger.java:90)
	at org.sonar.api.utils.log.BaseLogger.info(BaseLogger.java:76)
	at org.sonar.core.util.logs.DefaultProfiler.logInfo(DefaultProfiler.java:259)
	at org.sonar.core.util.logs.DefaultProfiler.log(DefaultProfiler.java:228)
	at org.sonar.core.util.logs.DefaultProfiler.doStop(DefaultProfiler.java:197)
	at org.sonar.core.util.logs.DefaultProfiler.stopInfo(DefaultProfiler.java:131)
	at org.sonar.ce.task.step.ComputationStepExecutor.executeStep(ComputationStepExecutor.java:85)
	at org.sonar.ce.task.step.ComputationStepExecutor.executeSteps(ComputationStepExecutor.java:72)
	at org.sonar.ce.task.step.ComputationStepExecutor.execute(ComputationStepExecutor.java:59)
	at org.sonar.ce.task.projectanalysis.taskprocessor.ReportTaskProcessor.process(ReportTaskProcessor.java:81)
	at org.sonar.ce.taskprocessor.CeWorkerImpl$ExecuteTask.executeTask(CeWorkerImpl.java:209)
	at org.sonar.ce.taskprocessor.CeWorkerImpl$ExecuteTask.run(CeWorkerImpl.java:191)

It’s writing to a file, so I don’t see it being related to console output.
If you’d like to try change the logging configuration, you can do so by modifying the commented out section LOGGING in the file conf/sonar.properties.

Ah, I misunderstood.

In the log I posted, I see:
“Gobbler[ce]” #46 prio=5 os_prio=0 cpu=580.18ms elapsed=1534.72s tid=0x00007f8970004800 nid=0x101 runnable [0x00007f89b84e5000]
java.lang.Thread.State: RUNNABLE
at java.io.FileOutputStream.writeBytes(java.base@11.0.7/Native Method)

I thought that was what you meant by the Compute Enging hanging.

But is this the root of the problem? Do the two processes write to the same file and cause some kind of lock perhaps?

I will try to tweak the config file to see if it makes a difference.
However, I am taking 3 weeks of vacation starting Friday, so it will take a while before I can look at it again.

yea, that wasn’t much help. We did notice the problem when our cluster got upgraded from ubuntu 16 -> 18. Does that help at all? Nothing about our firewall changed at all, that upgrade was the only thing.

That’s just a thread in the start process waiting for logs to be sent out to the stdout of the Compute Engine process. Nothing abnormal there.

If you see a task in the CE hanging again, get the thread dump of the Compute Engine process.
Please let us know if you have any new information. It’s hard to find the root cause without a reproducer.

We did end up changing the following:
SONAR_JDBC_MAXIDLE=0
and
SONAR_JDBC_MINIDLE=0

Basically to keep it from trying to keep connections idle. But there does appear to be some sensitivity to idle connections getting closed that changes when our nodes upgraded from Ubuntu 16 -> 18.