file_sources is currently the biggest table on our SQ’s SQL server by far with 12+ GB and growing.
a quick check shows that it seems most of the information in it
supposedly orphaned entries not covered by the built in SQ maintenance tasks.
SELECT TOP 1000 fs.project_uuid, fs.line_count, DATEADD(SECOND, fs.created_at / 1000, CAST('1970-01-01 00:00:00' AS DATETIME2)) AS created_at_human, DATEADD(SECOND, fs.updated_at / 1000, CAST('1970-01-01 00:00:00' AS DATETIME2)) AS updated_at_human FROM file_sources fs LEFT JOIN projects p ON fs.project_uuid = p.uuid WHERE p.uuid IS NULL ORDER BY fs.line_count DESC;
Deleting them could be a bit risky without proper background undestanding of the usage.
Both dates shown are actually mostly not very old, so I wonder why those project UUIDs are not existing anymore.
this is not the end of the world, but schema changes fast, a year ago we were fighting the infamous live_measures problems with dependency-check ( Long loading times at project dashboard since upgrade from 9.9.6 to 2025.1 ) so we have an eye on the table sizes, especially the ones rising fast and obviously with unused content.
the only FK I can see is the project_UUID and file_UUID. for projects I checked the orphan status with the SQL query, I could not find a table that fits the the 2nd FK
MS SQL, check if other tables have foreign keys pointing to your table’s primary key, use
EXEC sp_fkeys @pktable_name = ‘file_sources’;
→ 0 (no pointers)
EXEC sp_help ‘file_sources’;
no hints I could see in schema etc.
Since we fixed the measures storage, as you alluded, I suspect that your distribution is not unusual. Still the orphans are a bit curious. I’m going to flag this for more expert eyes.
Despite the confusing column name (our bad for that), file_sources.project_uuid actually points to the branch UUID, not the project UUID. You will need to join on the project_branches table instead of projects in your query to retrieve the actual orphaned entries.
If that returns few or no rows, your data is fine. The 12GB consists of legitimate source snapshots for active branches and PRs.
If you want to reduce growth, you can optimize the values under Administration → General Settings → Housekeeping → Branches and Pull Requests by lowering the sonar.dbcleaner.daysBeforeDeletingInactiveBranchesAndPRs setting and checking your configured exceptions for the sonar.dbcleaner.branchesToKeepWhenInactive setting.
If you have any other issues on this matter, feel free to reply to this topic
thanks a lot for your correction.
indeed, orphaned went down to zero.
Housekeeping is on default values and seem to work, but I wonder why:
SELECT COUNT(*) AS total_count FROM ( SELECT DATEADD(SECOND, fs.updated_at / 1000, CAST('1970-01-01 00:00:00' AS DATETIME2)) AS updated_at_human, fs.project_uuid, pb.kee FROM file_sources fs JOIN project_branches pb ON fs.project_uuid = pb.uuid ) subquery WHERE subquery.kee NOT IN ('main', 'master', 'develop', 'trunk') AND subquery.updated_at_human < DATEADD(DAY, -31, GETDATE());
is still at ~15% of the table. But must admit I’d not consider this as a big problem anymore.
thank you!
I’m glad that everything is working as expected for your SQS instance
Regarding your query, housekeeping does not rely on when files were last updated on a branch to decide whether a branch should be deleted.
Branch cleanup is based on branch inactivity, essentially if the branch’s last analysis is older than the configured threshold in daysBeforeDeletingInactiveBranchesAndPRs plus the branchesToKeepWhenInactive setting to protect specific branches (note that PRs are always cleaned up and main branches are always kept).
If you want to dig more into this you can you can check the snapshots table and look at the last analysis for the branch (islast = true). Also, to check if a branch is excluded from the housekeeping you can look into the project_branches table (exclude_from_purge, branch_type, is_main etc…)