Cannot cancel ongoing background analysis

SonarQube LTS 7.9.1 Enterprise

I have a background task (analysis) that has been running for hours. It usually takes a few minutes to complete. It looks like the analysis is stuck and not doing anything at all.

The CE logs do not show anything for the project except this line:
2020.06.02 09:31:04 INFO ce[][o.s.c.t.CeWorkerImpl] Execute task | project=COBOLDev | type=REPORT | id=XX | submitter=XX

There is no option to cancel the analysis. The documentation also says the same.

But how do I cancel this analysis now? All further analyses (including PR analyses) are now stuck for this project.

Was this itself for a PR analysis? You’re certain this is the only single line that was logged in ce.log for the task?

Also, what database are you using?

Microsoft SQL Server 14.00.3294

No, this was not a PR analysis. This was a scan of the complete project.

I actually started searching based on task id and noticed that there are more entries, all of which came within 2 minutes of starting the analysis. Nothing came after that.

2020.06.02 09:31:04 INFO ce[][o.s.c.t.CeWorkerImpl] Execute task | project=COBOLDev | type=REPORT | id=AXJ13zRdntn9DDKd5zU4 | submitter=xxxxx
2020.06.02 09:31:37 INFO ce[AXJ13zRdntn9DDKd5zU4][o.s.c.t.s.ComputationStepExecutor] Extract report | status=SUCCESS | time=33127ms
2020.06.02 09:31:37 INFO ce[AXJ13zRdntn9DDKd5zU4][o.s.c.t.s.ComputationStepExecutor] Persist scanner context | status=SUCCESS | time=16ms
2020.06.02 09:31:37 INFO ce[AXJ13zRdntn9DDKd5zU4][o.s.c.t.s.ComputationStepExecutor] Propagate analysis warnings from scanner report | status=SUCCESS | time=16ms
2020.06.02 09:31:37 INFO ce[AXJ13zRdntn9DDKd5zU4][o.s.c.t.s.ComputationStepExecutor] Execute DB migrations for current project | status=SUCCESS | time=15ms
2020.06.02 09:31:37 INFO ce[AXJ13zRdntn9DDKd5zU4][o.s.c.t.s.ComputationStepExecutor] Generate analysis UUID | status=SUCCESS | time=0ms
2020.06.02 09:31:37 INFO ce[AXJ13zRdntn9DDKd5zU4][o.s.c.t.s.ComputationStepExecutor] Load analysis metadata | status=SUCCESS | time=47ms
2020.06.02 09:31:37 INFO ce[AXJ13zRdntn9DDKd5zU4][o.s.c.t.s.ComputationStepExecutor] Initialize | status=SUCCESS | time=16ms
2020.06.02 09:31:37 INFO ce[AXJ13zRdntn9DDKd5zU4][o.s.c.t.s.ComputationStepExecutor] Verify billing | status=SUCCESS | time=0ms
2020.06.02 09:32:06 INFO ce[AXJ13zRdntn9DDKd5zU4][o.s.c.t.s.ComputationStepExecutor] Build tree of components | components=5336 | status=SUCCESS | time=29344ms
2020.06.02 09:32:06 INFO ce[AXJ13zRdntn9DDKd5zU4][o.s.c.t.s.ComputationStepExecutor] Validate project | status=SUCCESS | time=15ms
2020.06.02 09:32:07 INFO ce[AXJ13zRdntn9DDKd5zU4][o.s.c.t.s.ComputationStepExecutor] Load quality profiles | status=SUCCESS | time=297ms
2020.06.02 09:32:07 INFO ce[AXJ13zRdntn9DDKd5zU4][o.s.c.t.s.ComputationStepExecutor] Load Quality gate | status=SUCCESS | time=16ms
2020.06.02 09:32:07 INFO ce[AXJ13zRdntn9DDKd5zU4][o.s.c.t.s.ComputationStepExecutor] Load new code period | status=SUCCESS | time=15ms
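The search by task ID described above can be scripted. This is a minimal sketch, assuming ce.log lines follow the layout of the entries quoted in this thread. The function names (`task_entries`, `last_step`, `largest_gap`) are made up for illustration and are not part of SonarQube.

```python
import re
from datetime import datetime

# Layout assumed from the log lines quoted in this thread:
# "<date> <time> <LEVEL> ce[<task id>][<logger>] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2}) (?P<level>\w+)\s+"
    r"ce\[(?P<task>[^\]]*)\]\[(?P<logger>[^\]]+)\] (?P<msg>.*)$"
)


def task_entries(lines, task_id):
    """Return (timestamp, step name) pairs for every line that belongs to task_id."""
    entries = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not match the assumed layout
        # The first "Execute task" line has an empty ce[] tag and carries
        # the task ID in the message as "id=<task id>".
        if m.group("task") == task_id or f"id={task_id}" in m.group("msg"):
            ts = datetime.strptime(m.group("ts"), "%Y.%m.%d %H:%M:%S")
            step = m.group("msg").split(" | ")[0]
            entries.append((ts, step))
    return entries


def last_step(lines, task_id):
    """Return the last (timestamp, step) logged for the task, or None."""
    entries = task_entries(lines, task_id)
    return entries[-1] if entries else None


def largest_gap(lines, task_id):
    """Return (gap, step logged before the gap) for the longest silence, or None."""
    entries = task_entries(lines, task_id)
    gaps = [(b[0] - a[0], a[1]) for a, b in zip(entries, entries[1:])]
    return max(gaps, default=None)


# Example usage (the log path is a placeholder):
# with open("logs/ce.log") as f:
#     print(last_step(f, "AXJ13zRdntn9DDKd5zU4"))
```

For a task that is still hung, `last_step` shows where the log went silent. For one that eventually finished, `largest_gap` points at the step that was logged just before the long delay.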

@Jeff_Zapotoczny Any update on this? I am facing this issue again.

Analysis gets stuck at the same point as was reported earlier:
2020.07.15 12:58:40 INFO ce[AXNUDi01pW3XBiMb0KdE][o.s.c.t.s.ComputationStepExecutor] Load quality profiles | status=SUCCESS | time=328ms
2020.07.15 12:58:40 INFO ce[AXNUDi01pW3XBiMb0KdE][o.s.c.t.s.ComputationStepExecutor] Load Quality gate | status=SUCCESS | time=33ms
2020.07.15 12:58:40 DEBUG ce[AXNUDi01pW3XBiMb0KdE][o.s.c.t.p.s.LoadPeriodsStep] Resolving first analysis as new code period as there is only one existing version
2020.07.15 12:58:40 INFO ce[AXNUDi01pW3XBiMb0KdE][o.s.c.t.s.ComputationStepExecutor] Load new code period | status=SUCCESS | time=14ms

FYI, this has happened multiple times now and is making SonarQube Enterprise unusable for us. Since there is no way to cancel an ongoing task, the only way out seems to be deleting the entire project, which deletes the entire scan history for all branches and all pull requests. Definitely not a clean way of doing things.

Would you be willing to share your entire SonarQube server logs covering a period where this ongoing task situation has occurred? And can you also share your System Info file?

That scan actually completed in 6 hours 40 minutes (it usually takes 12-13 minutes). Attaching the log file here.
Notice the long delay after:
Load new code period | status=SUCCESS | time=14ms
stuck scan log.txt (12.4 KB)

Do you still need more logs and system info? Is there an email address or upload link where I can send them privately?

Thanks for providing the logs & your system info file privately.

Without full logs covering the time of the long-running task, I can only guess as to the true cause. One cause for concern I noted: it appears you configured your SonarQube instance to use 2 Compute Engine workers while otherwise keeping the Compute Engine memory settings at defaults. This runs against the recommendations we provide in our Compute Engine Performance documentation. If you’re looking to improve performance here, you need to allocate additional memory to cover the needs of each worker; otherwise, resource contention between them can actually make performance worse.

Hope this helps!

I don’t think the problem is related to the 2 CE workers or to performance. A stuck/hung scan is a different problem from a low-performing scan. If the CE worker configuration were the real issue, it would have affected all scans and all projects. Even for the project that got stuck, all other scans completed within 10-12 minutes, while this particular scan (for which I shared the logs) took around 6 hours 40 minutes.

I have also noticed that in both stuck cases, it got stuck after this step:
[o.s.c.t.s.ComputationStepExecutor] Load new code period | status=SUCCESS | time=14ms

and the logs show this clearly.

Having said that, this thread was started to ask whether there is any way at all to cancel an ongoing analysis, and it doesn’t look like there is any way, or any plan, to provide this basic functionality.

I confirm there is no way to cancel a background task, primarily due to data integrity concerns. It would not be “basic functionality” to determine how to safely undo a task at any arbitrary stage of execution.

When I said “basic” I meant it from a user functionality point of view. I use other code scanning tools and they provide this feature of cancelling an ongoing scan. Technically, there could be multiple ways of handling data-integrity issues (one being the use of a temporary location/db/schema until the final stage), but of course it would not be wise for me to comment on that, as you know the product and its challenges better.

From a usability point of view, you have no idea how helpless the situation is when a scan is stuck, a CI/CD pipeline is held up waiting for QG status, and even an admin (who has full control of the SonarQube server, configuration, and database) can do nothing but “hope” for the scan to finish.

If there is no functionality to cancel a scan, I would like to know what my options are when a scan gets stuck. This situation is real, not hypothetical, so there should be some recommendation or guidance from SonarSource.

There is always an underlying cause that should be addressed, so our recommendation would be to capture this with comprehensive logging on the scanner and server side, at as low a log level as possible. If you can predict which project(s) are most likely to suffer from this, setting the Compute Engine log level down to TRACE and then re-analyzing the project could help us see the cause.
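One way to limit how long TRACE stays enabled is to switch the level at runtime around the re-analysis, then switch it back. The following is a minimal sketch, assuming the `api/system/change_log_level` web service is available on your version and is called with a user token that has system administration permission. The host name, the token, and the function name `build_log_level_request` are placeholders for illustration. As far as I know, the change applies server-wide rather than per project and is not kept across a restart.

```python
import base64
import urllib.parse
import urllib.request


def build_log_level_request(base_url, token, level):
    """Build (but do not send) a POST request that changes the server log level."""
    if level not in ("TRACE", "DEBUG", "INFO"):
        raise ValueError("level must be TRACE, DEBUG or INFO")
    url = base_url.rstrip("/") + "/api/system/change_log_level"
    data = urllib.parse.urlencode({"level": level}).encode("ascii")
    req = urllib.request.Request(url, data=data, method="POST")
    # A user token is sent as the Basic auth user name with an empty password.
    auth = base64.b64encode(f"{token}:".encode("utf-8")).decode("ascii")
    req.add_header("Authorization", "Basic " + auth)
    return req


# Example usage (placeholder host and token):
# req = build_log_level_request("https://sonar.example.com", "my-admin-token", "TRACE")
# urllib.request.urlopen(req)   # re-run the analysis, then repeat with "INFO"
```

Building the request separately from sending it makes it easy to inspect the URL and headers before touching a production server.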

It cannot be predicted which scan will get stuck. So far I have noticed this happening 3 times, all with COBOL scans, and in unpredictable ways, sometimes even when no code has changed.

The only common point is that the scan gets stuck after this step:
[o.s.c.t.s.ComputationStepExecutor] Load new code period | status=SUCCESS | time=14ms

I also cannot keep the log level at TRACE indefinitely while waiting for a stuck scan, as trace-level logging will have a performance impact. Is it possible to set the log level to TRACE for a particular project only?

Thanks for re-emphasizing that this was a COBOL project and which step was last before the big lag. The step after “Load new code period” is “Detect file moves”, and there was indeed a bug in this functionality involving COBOL projects that was fixed in the 7.9.3 release. I’d suggest planning an upgrade to this patch release, as it may prevent this problem from recurring.

Thanks for this information and for the bug link.