Many background tasks failed

SonarQube version: 8.9 LTS, Community Edition
Scanner version: 4.6
Hi, I checked my SonarQube instance and there are many projects whose background tasks failed with the error below:

java.lang.IllegalStateException: Unrecoverable indexation failures: 10 errors among 10 requests
	at org.sonar.server.es.IndexingListener$1.onFinish(IndexingListener.java:39)
	at org.sonar.server.es.BulkIndexer.stop(BulkIndexer.java:131)
	at org.sonar.server.component.index.ComponentIndexer.delete(ComponentIndexer.java:170)
	at org.sonar.ce.task.projectanalysis.purge.IndexPurgeListener.onComponentsDisabling(IndexPurgeListener.java:41)
	at org.sonar.db.purge.PurgeCommands.purgeDisabledComponents(PurgeCommands.java:180)
	at org.sonar.db.purge.PurgeDao.purgeDisabledComponents(PurgeDao.java:102)
	at org.sonar.db.purge.PurgeDao.purge(PurgeDao.java:64)
	at org.sonar.ce.task.projectanalysis.purge.ProjectCleaner.purge(ProjectCleaner.java:63)
	at org.sonar.ce.task.projectanalysis.purge.PurgeDatastoresStep.execute(PurgeDatastoresStep.java:54)
	at org.sonar.ce.task.step.ComputationStepExecutor.executeStep(ComputationStepExecutor.java:81)
	at org.sonar.ce.task.step.ComputationStepExecutor.executeSteps(ComputationStepExecutor.java:72)
	at org.sonar.ce.task.step.ComputationStepExecutor.execute(ComputationStepExecutor.java:59)
	at org.sonar.ce.task.projectanalysis.taskprocessor.ReportTaskProcessor.process(ReportTaskProcessor.java:81)
	at org.sonar.ce.taskprocessor.CeWorkerImpl$ExecuteTask.executeTask(CeWorkerImpl.java:212)
	at org.sonar.ce.taskprocessor.CeWorkerImpl$ExecuteTask.run(CeWorkerImpl.java:194)
	at org.sonar.ce.taskprocessor.CeWorkerImpl.findAndProcessTask(CeWorkerImpl.java:160)
	at org.sonar.ce.taskprocessor.CeWorkerImpl$TrackRunningState.get(CeWorkerImpl.java:135)
	at org.sonar.ce.taskprocessor.CeWorkerImpl.call(CeWorkerImpl.java:87)
	at org.sonar.ce.taskprocessor.CeWorkerImpl.call(CeWorkerImpl.java:53)
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)

I found this error in es.log, and my disk usage is fine. What should I do?

2022.09.18 03:25:48 WARN  es[][o.e.c.InternalClusterInfoService] failed to retrieve stats for node [DZRB_RhKQQSquvxu4vLf6w]: [sonarqube][127.0.0.1:36168][cluster:monitor/nodes/stats[n]]
2022.09.18 03:26:18 WARN  es[][o.e.c.InternalClusterInfoService] failed to retrieve stats for node [DZRB_RhKQQSquvxu4vLf6w]: [sonarqube][127.0.0.1:36168][cluster:monitor/nodes/stats[n]]
2022.09.18 03:26:48 WARN  es[][o.e.c.InternalClusterInfoService] failed to retrieve stats for node [DZRB_RhKQQSquvxu4vLf6w]: [sonarqube][127.0.0.1:36168][cluster:monitor/nodes/stats[n]]
2022.09.18 03:27:18 WARN  es[][o.e.c.InternalClusterInfoService] failed to retrieve stats for node [DZRB_RhKQQSquvxu4vLf6w]: [sonarqube][127.0.0.1:36168][cluster:monitor/nodes/stats[n]]
2022.09.18 03:27:23 ERROR es[][o.e.m.f.FsHealthService] health check failed
org.apache.lucene.store.AlreadyClosedException: Underlying file changed by an external force at 2021-06-11T15:19:21Z, (lock=NativeFSLock(path=/apps/sonarqube/data/es7/nodes/0/node.lock,impl=sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid],creationTime=2021-06-11T15:19:21.855132Z))
	at org.apache.lucene.store.NativeFSLockFactory$NativeFSLock.ensureValid(NativeFSLockFactory.java:191) ~[lucene-core-8.8.0.jar:8.8.0 b10659f0fc18b58b90929cfdadde94544d202c4a - noble - 2021-01-25 19:07:45]
	at org.elasticsearch.env.NodeEnvironment.assertEnvIsLocked(NodeEnvironment.java:1050) ~[elasticsearch-7.12.1.jar:7.12.1]
	at org.elasticsearch.env.NodeEnvironment.nodeDataPaths(NodeEnvironment.java:809) ~[elasticsearch-7.12.1.jar:7.12.1]
	at org.elasticsearch.monitor.fs.FsHealthService$FsHealthMonitor.monitorFSHealth(FsHealthService.java:147) [elasticsearch-7.12.1.jar:7.12.1]
	at org.elasticsearch.monitor.fs.FsHealthService$FsHealthMonitor.run(FsHealthService.java:135) [elasticsearch-7.12.1.jar:7.12.1]
	at org.elasticsearch.threadpool.Scheduler$ReschedulingRunnable.doRun(Scheduler.java:202) [elasticsearch-7.12.1.jar:7.12.1]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:732) [elasticsearch-7.12.1.jar:7.12.1]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) [elasticsearch-7.12.1.jar:7.12.1]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at java.lang.Thread.run(Thread.java:834) [?:?]

Hi,

Do you have “helpful” background processes that would be touching SonarQube’s (and by extension, Elasticsearch’s) files running on the same box? Because that’s what the “Underlying file changed by an external force” error looks like to me.

 
Ann

Hi,

We are still investigating which process is locking the file. We deleted the es data folder, restarted the SonarQube service, and it’s back to normal now. Thanks.
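
In case it helps anyone else, the reset boils down to the steps below (a minimal sketch; the install path matches our es.log, but the systemd service name is an assumption about our setup):

# Rebuild the embedded Elasticsearch indexes from scratch.
# /apps/sonarqube matches the path in our es.log; the "sonarqube"
# systemd unit name is an assumption.
import shutil
import subprocess

SONAR_HOME = "/apps/sonarqube"

# Stop SonarQube so Elasticsearch releases its node.lock.
subprocess.run(["systemctl", "stop", "sonarqube"], check=True)

# Remove the embedded Elasticsearch data; SonarQube rebuilds the
# indexes from the database on the next startup.
shutil.rmtree(f"{SONAR_HOME}/data/es7", ignore_errors=True)

# Start SonarQube again and wait for re-indexing to finish.
subprocess.run(["systemctl", "start", "sonarqube"], check=True)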

Hi,

After deleting the es folder and restarting the SonarQube service, monitoring showed the ES status as green. It ran fine for roughly 5 days, then ES went red again.
The only change on the server is that we run the SonarQube bulk project deletion API on a schedule, on the 1st of every month. Could that API call be what makes ES go red?
Checking the logs, ES started becoming unhealthy right after the bulk delete.
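
For reference, the scheduled cleanup is essentially the call below (a minimal sketch; the server URL, admin token, and cutoff date are placeholders, and selecting projects by analyzedBefore is just one way to do it):

# Monthly cleanup: bulk-delete projects via the SonarQube Web API.
# api/projects/bulk_delete requires the "Administer System" permission.
import requests

SONARQUBE_URL = "https://sonarqube.example.com"   # placeholder
ADMIN_TOKEN = "squ_xxxxxxxxxxxxxxxx"              # placeholder

resp = requests.post(
    f"{SONARQUBE_URL}/api/projects/bulk_delete",
    data={"analyzedBefore": "2022-08-01"},        # placeholder cutoff date
    auth=(ADMIN_TOKEN, ""),                       # token as username, blank password
)
resp.raise_for_status()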

Hi,

Your bulk deletion would likely trigger a re-index (without diving into the code…) but the file lock is likely still external.

Would you mind sharing why you’re running a regular bulk delete?

 
Ann