Dear SonarSource Team,
We performed a SonarQube upgrade last Thursday, the 3rd.
We had a Developer edition server, version 18.104.22.168506, backed by Postgres 9.x.x, and we host it on AWS.
We also have around 2.5 million lines of code.
We initiated the upgrade around 20:00 local time (19:00 UTC), and the upgrade was completed, infrastructure-wise, in about one hour.
We moved to Enterprise edition, version 22.214.171.124762, backed by Postgres.
Here are the current server details:
SonarQube ID information
After the upgrade completed, background tasks for reloading project data started. There were about 1300 tasks in total.
The progress was insanely slow, and after some looking around, one of our team members suggested giving the database server more resources.
We upgraded it to a server with 4 CPUs and 16 GB of RAM, and suddenly things started moving. Database utilization immediately spiked to 100%.
After around six hours, the task progress got stuck at about 60%, and we noticed that two project analyses had been in progress for 5 hours and 3 hours, respectively.
We decided to restart from scratch and give it an even bigger database: we upgraded the server to 8 CPUs and 32 GB of RAM.
After the upgrade, something started hanging again and it didn’t move past the 0% point.
We restarted the application server again, and sure enough it started moving faster.
It managed to finish all background tasks within about 4 hours, but one important project still displays as unavailable even though all background tasks are complete.
One day after this, the database server utilization is still at 100% and the project still hasn’t completed.
During this whole process, all log files were perfectly clean, and I have attached them to this e-mail.
On the database side, we see a single query taking all the CPU time:
SELECT p.kee, p.uuid FROM components p INNER JOIN components root ON root.uuid=p.project_uuid AND root.kee=$1
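For reference, the kind of check that surfaces a query like this can be sketched as follows. This is only an illustrative sketch, not necessarily the exact statements we ran: it assumes PostgreSQL 9.2 or later, where pg_stat_activity exposes the pid, state and query columns (older 9.x releases name them procpid and current_query), and the project key 'my:project-key' is a hypothetical placeholder for the real value bound to $1.

```sql
-- List active sessions, longest-running first (PostgreSQL 9.2+ column names).
SELECT pid,
       now() - query_start AS running_for,
       state,
       query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY running_for DESC;

-- Show the plan the planner picks for the expensive query without executing it.
-- 'my:project-key' is a placeholder, not a real project key.
EXPLAIN
SELECT p.kee, p.uuid
FROM components p
INNER JOIN components root
        ON root.uuid = p.project_uuid
       AND root.kee = 'my:project-key';
```

If the plan shows sequential scans on components for either side of the join, that might explain the sustained CPU load, but we have not confirmed this.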
I don’t know whether the behavior we are observing is normal, but it looks like a bug: the application server is idle during the weekend, yet the database is at 100%.
Given the above information, and especially the project that didn’t load (one of our most important ones), we might need to roll back to the last known-good state.
That means a high chance of a new server ID, and if things don’t work out with an Enterprise edition of the same version we had, it could mean we have to roll back to Developer edition for now.
Any suggestions or solutions are welcome; this is really critical for us.