Developer Edition Injection Vulnerability Scanning with the Python Psycopg2 Library

I am testing the Developer Edition (SonarQube 8.6.1, Scanner 4.5) for the injection vulnerability scanning feature, but my scan results appear to be the same as the Community Edition.

According to the documentation, the Python libraries Flask and Django are supported by the injection vulnerability scanner. I believe Psycopg2 is used by Django underneath and has essentially the same syntax. Does SonarQube support scanning for injection vulnerabilities when using the Psycopg2 Python library?

Thanks

Hello @sonaruser11 ,

Welcome to SonarSource community! :wave:

Can you point me to the documentation that you are referencing? I could not find that in our official documentation.

Anyway, SonarQube’s SAST capabilities do not differentiate by framework so much as by language. So yes, SonarQube should support scanning for injection vulnerabilities in Python, regardless of framework or library.

I highly suggest you browse our rule sets for Python in the “Vulnerability” and “Security Hotspot” sections: Python static code analysis

Hopefully this will cover your use case. If it does not, can you give an example that you were expecting?

Joe

Hello @sonaruser11,

Indeed, SonarPython supports Django and Flask quite comprehensively.
We are also able to detect usage of a couple of psycopg2 “sink” methods when they are called directly.
In your particular case, do you use Django, or psycopg2 directly? If you use psycopg2 directly and the Python analyzer misses something, can you send a code snippet using psycopg2 where you would expect SonarPython to detect a vulnerability?

Thanks, Olivier

I was looking at some of the vulnerability rules for Python. Here is one example.

Python: Database queries should not be vulnerable to injection attacks (sonarsource.com)

Specifically, this Django syntax in that example is very similar to Psycopg2:
cursor = connection.cursor()
cursor.execute("SELECT username FROM auth_user WHERE id=%s" % id) # Noncompliant; Query is constructed based on user inputs
row = cursor.fetchone()

We would be using psycopg2 directly. I was hoping it would detect the issues mentioned in the Python vulnerability ruleset, since Django is so similar in syntax. Is there any way to get complete Python vulnerability scanning with direct usage of psycopg2?

Hello,

Usage of psycopg2.Cursor.execute or psycopg2.Cursor.executemany should be detected as a potential sink for rule python:S3649 - Database queries should not be vulnerable to injection attacks.
Of course, this requires that there actually is an SQL injection vulnerability, i.e. that some of the arguments used to build the SQL query come from a source that can be manipulated by an external user and have not been sanitized before being used (otherwise no issue is raised).
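To make the difference between a query built from user input and a parameterized query concrete, here is a minimal sketch. It is an illustration only: the standard-library sqlite3 module is swapped in for psycopg2 so that it runs without a database server, and the auth_user table contents and the user_input value are made up. Both modules follow the Python DB-API, but note that psycopg2 uses %s as its placeholder where sqlite3 uses ?.

```python
import sqlite3

# Assumption: sqlite3 stands in for psycopg2 so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE auth_user (id INTEGER, username TEXT)")
cur.executemany(
    "INSERT INTO auth_user VALUES (?, ?)",
    [(1, "alice"), (2, "bob")],  # made-up rows for illustration
)

# Pretend this value arrived from an external user (hypothetical input).
user_input = "1 OR 1=1"

# Noncompliant pattern: the input is spliced into the SQL text itself,
# so the "OR 1=1" fragment becomes part of the WHERE clause.
unsafe_rows = cur.execute(
    "SELECT username FROM auth_user WHERE id=%s" % user_input
).fetchall()

# Compliant pattern: the input is passed as a bound parameter and is
# only ever treated as a value, never as SQL.
safe_rows = cur.execute(
    "SELECT username FROM auth_user WHERE id=?", (user_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

In this sketch the first query matches every row in the table, while the parameterized one matches none. With psycopg2 the compliant call would look like cursor.execute("SELECT username FROM auth_user WHERE id=%s", (user_input,)), with the value passed as a second argument rather than formatted into the string.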

If you have a case where you think an SQL injection vulnerability should be detected and it is not, please send us the code snippet and we’ll have a look.

Olivier

I am still having difficulty getting SonarQube to identify potential SQL injection. This is a concern because I was hoping it would help us identify things we might not know to look for. Perhaps that is related to Psycopg2 support vs Django? Is there a way to make vulnerability detection more aggressive, for example by expanding SonarQube’s definition of tainted sources?

I tried multiple things in an attempt to trigger the unsafe injection rule:

  • full / partial queries passed as function parameters
  • session.get("some url").content
  • requests.get("some url").content
  • input("Get input: ")

I was only able to get the Django code from the example to mark a query as tainted:

  • request.GET.get("id", "")

It correctly identified two instances of potential SQL injection in the following code example. However, neither was found without the Django tainted source. For example, setting the “text” variable to “some string” should only remove the finding in self.run_tainted_query(), not both findings.

Also, renaming self.execute() prevented the finding inside of self.execute(). The issue in self.run_tainted_query() was still reported correctly. I think it may be confusing self.execute() and cursor.execute() in that case? I am unsure what the issue is there.

from django.http import HttpResponse
from django.db import connection
import psycopg2

class SomeClass:
    def __init__(self, params):
        self._cursor = psycopg2.connect(params).cursor()

    def execute(self, sql):
        self._cursor.execute(sql) # Detected as SQL injection (1)

    def run_unsafe_query(self, unsafe_query):
        self.execute(unsafe_query)

    def run_tainted_query(self, request):
        text = request.GET.get("id", "") # Django tainted source
        self._cursor.execute("Unsafe query" + text) # Detected as SQL injection (2)

Hello @sonaruser11,

Sorry for the delay in following up.
Thanks for the code snippet, but having all the possible scenarios (those that work and those that don’t) in one place confuses me. In the given example, I suppose the false negative is in:

def run_unsafe_query(self, unsafe_query):
    self.execute(unsafe_query)

But I don’t see any invocation of run_unsafe_query() in your snippet, so in that case it’s normal that no vulnerability is raised.

So could you send a snippet where you would expect the injection vulnerability to be found and where it is not?

I didn’t get any findings on the following function. I thought perhaps I was supposed to get a finding even without it being called since it could be a potential issue in the future when called. Perhaps that is outside the scope of SonarQube.

def run_unsafe_query(self, unsafe_query):
    self.execute(unsafe_query)

The more significant issue is with the two findings I did get SonarQube to detect. I marked these with “# Detected as SQL injection (1) and (2)” respectively as comments. However, the detection of those two findings depended on whether a “Django tainted source” was present.

self._cursor.execute(sql) # Detected as SQL injection (1)
...
self._cursor.execute("Unsafe query" + text) # Detected as SQL injection (2)

When I substituted several other sources I considered “tainted” for the following line, those two previous findings disappeared:

text = request.GET.get("id", "") # Django tainted source

Since we are not using Django, I need to make sure SonarQube can find these issues using other tainted sources than just “request.GET.”

Thanks for the clarification. I get it.

So just to be clear, we only detect injection vulnerabilities if there is some tainted data (coming from a detected source) that reaches a sink without being sanitized along the path.
So if you don’t have a source somewhere, a sink is never reported as a vulnerability (a variable being called unsafe_query is not enough for us to consider it tainted :slight_smile: , and the reverse is also true).
So the standalone code below will never raise a vulnerability:

def run_unsafe_query(self, unsafe_query):
    self.execute(unsafe_query)

(There may be a Security Hotspot raised though, did you check that?)
In your case I think that the problem is that we don’t recognize your (non Django) source.
2 options:

  • This source is a well-known source in a popular framework, and if we don’t recognize it, our inventory of sources for that framework is incomplete. Let us know and we’ll have a look.
  • This source is specific to you. In that case we cannot detect it off the shelf, but we offer the ability to configure your own custom sources, sinks and sanitizers. Have a look at: Security Engine Custom Configuration | SonarQube Docs
    This custom configuration can also be a workaround to temporarily cover standard frameworks where we have holes…

Olivier

I thought “run_unsafe_query” would be flagged since no sanitization is performed on the parameter. However, from what you said, it appears SonarQube tracks taint across function calls, so it is not an automatic vulnerability finding. It may have triggered a Security Hotspot as you mentioned.

Do you have a link to a full list of tainted sources recognized? Custom sources looks interesting, but I think that requires the Enterprise Edition, and we would most likely be limited to the Developer Edition.

That’s correct. unsafe_query may have been sanitized BEFORE calling run_unsafe_query(), so no vulnerability is raised. We raise a vulnerability ONLY when we can connect a source to a sink without sanitization.
There should be a Security Hotspot though, since every construction & execution of an SQL query should raise one.
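The source, sanitizer and sink rule described above can be pictured with a small toy model. This is only a sketch of the concept, not how SonarQube’s static taint engine is implemented: it tracks a flag at runtime, and every name in it (Value, from_source, concat, sanitize_int, check_sink) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """A string plus a flag saying whether it came from a known source."""
    text: str
    tainted: bool = False


def from_source(raw: str) -> Value:
    # Stands in for a recognized source such as request.GET.get(...).
    return Value(raw, tainted=True)


def concat(left: Value, right: Value) -> Value:
    # Taint propagates: the result is tainted if either part is.
    return Value(left.text + right.text, left.tainted or right.tainted)


def sanitize_int(value: Value) -> Value:
    # Hypothetical sanitizer: only accepts text that parses as an integer.
    return Value(str(int(value.text)), tainted=False)


def check_sink(query: Value) -> str:
    # Stands in for a sink such as cursor.execute(...).
    return "vulnerability" if query.tainted else "no issue"


prefix = Value("SELECT username FROM auth_user WHERE id=")

# No source anywhere on the path: nothing to report.
no_source = check_sink(concat(prefix, Value("42")))
# A source reaches the sink unsanitized: reported.
source_to_sink = check_sink(concat(prefix, from_source("42")))
# The source is sanitized before reaching the sink: not reported.
sanitized = check_sink(concat(prefix, sanitize_int(from_source("42"))))

print(no_source, source_to_sink, sanitized)
```

The first case mirrors run_unsafe_query() in isolation: the sink is reached, but since nothing on the path is known to be a source, no vulnerability is reported.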

Yes, such a list exists, but we don’t disclose it because it’s a big part of our intellectual property.

Correct. I lost sight of the fact that you were evaluating the Developer Edition. You would need the Enterprise Edition (EE) to use that feature.

Can you provide an example of a tainted source that works with psycopg2 and does not require Django or any other imports? The following tainted source from the example does trigger the vulnerability finding, but still requires importing Django from what I can tell.

text = request.GET.get("id", "") # Django tainted source

I really just need a testcase to verify that we can get vulnerability findings from psycopg2 without importing django.

Hello,

I dug into this, and there are actually no inventoried sources other than Flask and Django endpoints. Your request generated an interesting internal discussion about whether other sources may be relevant for our taint engine.
Quick question: could you provide some information about the Python project that you’re trying to analyze with SonarQube and what kind of sources you have? Is it a CLI project, a library, a lambda, a desktop app? What are the sources?

Olivier

It is a desktop application. Unfortunately, I cannot provide any details on sources. Will Psycopg2 support likely be comparable to Django at some point in the future?

I cannot commit but I would think so. We’re already looking at your case to see if we can bring improvements in the coming releases.

Hi @sonaruser11,

We do support psycopg2. The problem is that we don’t support any of the frameworks that are commonly used to build desktop GUIs in Python, and therefore we don’t know which values are user-controlled.

Do you use a particular framework for the GUI of your desktop application? I can’t guarantee we will support it in the future but if we decide to add support for a python desktop application, I will consider it.