Regarding the file indexing and source code storage

Hello,

We are using SonarQube 7.9.1 and have the questions below. Could you please clarify?

  1. SonarQube stores all the source code that is scanned, regardless of whether any violation is reported on the files. Is that correct?
  2. Is source code stored in the database as plain text or in an encrypted format (by a SonarQube proprietary standard)?

With Regards,
Vara Prasad.

Yes. Files that are indexed by the scanner and have not been excluded are uploaded to SonarQube, whether or not a violation is reported on them.

Each line is stored in plain text in the database.

Thanks for the response.

Could you please let us know which table to look at?
We checked a few tables and came across the file_sources table. The data in this table is in the form of hashes (as the column names indicate) and a BLOB (in the binary_data column).

With Regards,
Vara Prasad.

I guess it’s more precise to say it’s stored as binary data (my bad)… but from an encryption standpoint, that’s about as good as plain text. file_sources is indeed where that data is stored.

Thank you.
Are the statements below correct? Please clarify.

  1. From the database, we cannot view source code in plain text / human-readable format by running select queries on the file_sources table.
  2. The SonarQube server internally uses the hashes to decode the binary data from the file_sources table and show it on the dashboard in plain text / human-readable format.

Also, is there a way to disable source upload for files that don’t have any violations reported? Please suggest.

With regards,
Vara Prasad.

Hello,

Could you please clarify the questions above?
Thanks in advance.

With Regards,
Vara Prasad

It’s probably trivial to convert the blob data to plain text / human-readable format, but indeed that would take an additional step.

The hashes are important to SonarQube, but not really involved in decoding the binary data.
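
To illustrate why binary storage is not the same as encryption, here is a minimal Python sketch. It uses zlib from the standard library purely as a stand-in: the actual encoding SonarQube applies to binary_data is not described in this thread, and the helper names encode_for_storage and decode_from_storage are hypothetical.

```python
import zlib


def encode_for_storage(source_text: str) -> bytes:
    """Hypothetical stand-in for packing source text into a BLOB (not SonarQube's real format)."""
    return zlib.compress(source_text.encode("utf-8"))


def decode_from_storage(blob: bytes) -> str:
    """Reverse the encoding. Note that no key or secret is needed."""
    return zlib.decompress(blob).decode("utf-8")


source = "public class Hello {\n    // business logic\n}\n"
blob = encode_for_storage(source)

# Viewed as "text", the raw bytes are mostly special characters and symbols.
print(blob[:20])

# Anyone who can read the bytes and knows the encoding recovers the source.
print(decode_from_storage(blob))
```

The point of the sketch is only that an encoding step without a secret key offers no confidentiality: read access to the bytes plus knowledge of the format is enough to get the text back.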

I used the BLOB editor in SQL Developer and tried to view the data as “text”, but I see the data in binary format (mostly special characters and symbols).

When you said an additional step, does that mean the user needs additional knowledge of how SonarQube converts the column’s data to binary (to store it as a BLOB), and must use the same logic to convert the binary data back to text / human-readable format? Or is there a tool you know of that can do that?

The reason for asking is to know whether a user with “select” privileges on the file_sources table can see the source code or not.

With Regards,
Vara Prasad