Search API problem after upgrading 8.9 LTS -> 9.9 LTS

Must-share information (formatted with Markdown):

  • sonarqube version: 9.9 LTS
  • sonarqube deployment method: zip file
  • what I am trying to achieve: use search API to look up a specific project by name

We have an issue with our recently upgraded server in which a specific project (only one that I’m aware of at this time) is causing the api/projects/search endpoint to time out / fail. It appears to hang for 60 seconds and then return nothing (not a result with an empty list, literally nothing is printed back to my console after the request appears to time out).

This could possibly be related to another issue I found while working with an engineer on this forum previously, where we have two quality gates that have the same ID, despite IDs obviously needing to be unique. One of the quality gates in question is named Sonar Way Updated, and the other is named after the project in question here. So far I’ve tried (on our testing server, not main) making a copy of the offending quality gate, ensuring it had a new ID, and using the API to update the offending project to use that new quality gate, but the search API still fails when queried for this project’s name. So I’m not sure if this duplicate ID issue is related to the query failing, but it would be a strange coincidence.

Our servers are set to INFO level logging, and I haven’t seen anything related to this pop up in the web.log or sonar.log.

Sorry if this report is lacking any relevant info; if it is, please let me know.

Hey there.

What database platform are you using that backs your SonarQube instance?

We’re using a PostgreSQL 14.4 database in Amazon RDS.

I would suggest making sure you’ve run some database maintenance after the upgrade, particularly the following:

https://community.sonarsource.com/t/sonarqube-becomes-slow-after-postgresql-upgrade/37925/2

Sorry for the slow replies. We ran some database maintenance last night: specifically, we reindexed the database and ran an ANALYZE, as described in the Upgrade guide.

We skipped running the vacuum command for the time being, as the database has been running for about a week with the new sonarqube version and the autovacuum daemon is enabled, and we weren’t sure if it would finish in time to avoid impacting our users (the last time we ran it after a postgres version update, it took ~5 hours to complete). If you feel we need to specifically run a manual VACUUM, we’ll need to do that over a weekend.

After the reindex and analysis finished, things were looking promising, and the impacted projects were searchable via the API again with ~10 second response times. However, after several hours of being online, the API has slowed down for those projects again enough that they no longer return in under 60 seconds and consequently time out.

Do you have any further advice on how to address this? Should we go ahead and schedule a manual VACUUM command despite the autovacuum daemon running? If it helps, our current database in RDS is a db.m5.large, which has 2 vCPUs and 8GB of memory, and runs on SSD storage, and our server instance is a c5.xlarge which has 4 vCPUs and 8GB of memory.

For some additional context, here’s a screenshot of our insights panel for this database:

[image: screenshot of the database insights panel]

It seems that the most intensive tasks by far are these tie_breaker queries:

select tie_breaker.projectUuid as projectUuid, s.build_date as lastAnalysis, ncloc as loc, pm.text_value as textValue from (select counter.projectUuid as projectUuid, counter.maxncloc ncloc, min(br.uuid) as branchUuid, counter.projectName, counter.projectKey from (select b.project_uuid as projectUuid, p.name as projectName, p.kee as projectKey, max(lm.value) as maxncl

When we performed our maintenance yesterday evening, we noticed that there were a few (3-5, I think) queries sitting on the database for 3+ days which I believe were these same tie_breaker queries. We terminated those at the time, but now we’re already seeing 3 or 4 new instances of that query which have been running for many hours. Maybe they’re being left behind by the timed out search API queries?
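For anyone who wants to check for lingering queries like these, here is a minimal sketch. It assumes the rows have already been fetched from PostgreSQL's standard pg_stat_activity view with whatever client you use. The helper name `long_running` and the one-hour threshold are purely illustrative and not something SonarQube or PostgreSQL provides.

```python
from datetime import datetime, timedelta, timezone

# Query against the standard pg_stat_activity view; run it with any PostgreSQL client.
ACTIVE_QUERIES_SQL = """
SELECT pid, query_start, query
FROM pg_stat_activity
WHERE state = 'active'
ORDER BY query_start
"""


def long_running(rows, now, threshold=timedelta(hours=1)):
    """Return (pid, age, query_prefix) for every row at least `threshold` old.

    rows: iterable of (pid, query_start, query) tuples, as returned by the
    query above. `now` must use the same timezone awareness as query_start.
    """
    hits = []
    for pid, query_start, query in rows:
        if query_start is None:
            # No query has started on this backend yet; nothing to measure.
            continue
        age = now - query_start
        if age >= threshold:
            # Keep only a short prefix so the report stays readable.
            hits.append((pid, age, query[:60]))
    return hits
```

The pids this returns are the ones that could then be passed to `SELECT pg_terminate_backend(pid)` if you decide to terminate them.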

Thanks for the update (and the investigation).

Can you confirm which exact version of SonarQube v9.9 you’re using? There was a patch release, v9.9.1, which addresses some database performance issues with SONAR-18472 and SONAR-18872 (https://sonarsource.atlassian.net/browse/SONAR-18872).

It may be a matter of simply upgrading to the patch version. If you are already running the patch version… yes I recommend a full vacuum.

There may also be some impact from SONAR-19324, which relates to api/projects/search performance, but as the related query also calculates the max LoC, I’m hoping one of the existing tickets has improved the situation.

Is there anything unique about this project? Changed housekeeping settings, many branches / pull requests, size of the project…

And can you mention the specific context in which you use GET api/projects/search? Is it API calls from an external script? Using the Projects Management section of the global Administration?

We’re currently on Version 9.9 (build 65466), so I guess the next action item on our end is to upgrade to the patch version and see if that helps at all. I see now that the LTS version has moved up to include the patch. I’m definitely more interested in 19324 though, as that describes our current situation exactly, so +1000000 on that issue from us.
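As a rough way to script the "are we on the patch yet?" check, here is a small sketch. It assumes the version is available as a dotted string such as "9.9.0.65466" (version followed by build number), which is my reading of how the version and build above combine. The function names are hypothetical.

```python
def parse_version(text):
    """Turn a dotted version string such as "9.9.0.65466" into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def has_patch(version_text, minimum=(9, 9, 1)):
    """True if the major.minor.patch part of the version is at least `minimum`."""
    parts = parse_version(version_text)
    # Pad short versions like "9.9" with zeros, and drop any build number.
    padded = (parts + (0, 0, 0))[:3]
    return padded >= minimum
```

With this, `has_patch("9.9.0.65466")` is false, so a server reporting that string would still need the patch release.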

The two primarily impacted projects are both quite large and include quite a lot of activity and branches. One of the teams is considering abandoning the current sonar project and creating a new one to get around these issues, but obviously that’s not a desirable outcome.

The search API is being called from our build system. We have some shared pipeline definitions that teams use, which look up the team’s project based on values from a yaml file and ensure that their project’s quality gate thresholds are in line with what they have defined in the yaml file. Part of that process is looking up their project based on its name, so each code build is hitting this endpoint once.

Perhaps I’m not completely understanding the nature of these calls — because I would have guessed that if one project was affected, all would be.

What query parameter(s) are you using that make this a project specific issue?

curl --user username:password -X GET -H "Content-Type: application/json" "https://$SONARQUBE_URL/api/projects/search?ps=500&q=$PROJECT_NAME"

Note that the issue is the same regardless of the page size parameter, whether it’s 500 or 5 (for example), and that successful hits of the API only return the one project being searched for (as intended).
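In case it helps anyone reproduce the lookup outside of curl, here is a minimal sketch of the same request shape. It only builds the URL and picks the matching project out of an already decoded JSON response, so authentication and the HTTP call itself are left out. It assumes the response body carries a "components" list whose entries have a "name" field; the function names and the example host are made up for illustration.

```python
from urllib.parse import urlencode


def search_url(base_url, project_name, page_size=500):
    """Build the api/projects/search URL with URL-encoded ps and q parameters."""
    query = urlencode({"ps": page_size, "q": project_name})
    return f"{base_url.rstrip('/')}/api/projects/search?{query}"


def exact_match(response, project_name):
    """Return the component whose name equals project_name, or None.

    response: the decoded JSON body, assumed to contain a "components" list.
    """
    for component in response.get("components", []):
        if component.get("name") == project_name:
            return component
    return None
```

Encoding the parameters this way also sidesteps shell quoting problems with `&` and with spaces in project names.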

When searching for all of our other projects, the API returns the expected result almost instantly. But when searching for these two specific projects, the API takes much longer than expected. In our troubleshooting and maintenance, I’ve observed that upon starting the server following database maintenance, the search API for these projects returns in a slow but usable timeframe of ~8-15 seconds, and that after several hours of the server being live, the response time slows to 60+ seconds, at which point it hits a timeout limit (I’m not sure at which link in the chain this timeout is configured; it could be on our load balancer, in the SonarQube server, in the DB query, etc.). It seems like this issue occurs regardless of user-generated load on the server, because I just tried the same queries on our test server, which isn’t in use by our userbase, and observed the same timeout problem.
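One way to narrow down where the 60-second limit lives is to time the request with an explicit client-side timeout, so a client timeout can be told apart from an upstream one. Below is a minimal sketch using only the Python standard library. Authentication is omitted, the `timed_fetch` name is hypothetical, and the `opener` parameter exists only so the function can be exercised without a network.

```python
import socket
import time
import urllib.error
import urllib.request


def timed_fetch(url, timeout_s=30.0, opener=urllib.request.urlopen):
    """Fetch url and return (outcome, elapsed_seconds, body).

    outcome is "ok", "timeout", or "error". A "timeout" here means the
    client-side limit (timeout_s) was hit, not a limit further up the chain.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout_s) as resp:
            body = resp.read()
        outcome = "ok"
    except (socket.timeout, TimeoutError):
        # Raised directly when reading the response takes too long.
        outcome, body = "timeout", b""
    except urllib.error.URLError as exc:
        # urlopen wraps connect-phase timeouts in URLError; unwrap to classify.
        is_timeout = isinstance(exc.reason, (socket.timeout, TimeoutError))
        outcome, body = ("timeout" if is_timeout else "error"), b""
    return outcome, time.monotonic() - start, body
```

If a call with a client timeout well above 60 seconds still comes back empty at around the 60-second mark, the limit is presumably being enforced somewhere between the client and the database rather than by the client.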

Searching for other unaffected projects seems to remain quick, but these two projects become unsearchable via the API. I’ve noticed that I still have no problem searching for them in the web UI, though loading some tabs of the project page (e.g. Measures or Activity) is also quite slow.

Just in case anyone else stumbles across this issue with similar symptoms, upgrading to the 9.9.1 LTS seemed to resolve this issue for us.
