I’ve got a couple of questions regarding the compatibility of SonarCloud.
Is the service compatible with Azure Databricks notebooks? Are magic commands supported?
Is it possible to check code that uses multiple languages in one notebook? For example, a combination of PySpark, Python, and Spark SQL in sequential cells.
I’m looking forward to your response.
Welcome to the community!
I’m not familiar with Azure DataBricks notebooks, so I’m hoping you’ll help me out a little. Is a notebook a single file? Or are the multiple languages each in their own files? And is there anything unusual about the code itself (e.g. special notebook markup in the files)?
Yes, it is a single file with multiple languages, like Python and PySpark.
It also has magic commands (e.g. converting SQL code to Python).
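For example, a notebook exported as a single .py source file looks roughly like this (the table and column names are just placeholders):

```python
# Databricks notebook source
# Python cells are exported as plain code; cells written in another
# language survive only as "# MAGIC" comment lines.
regions = ["EU", "US", "EU"]
print(len(set(regions)))  # → 2

# COMMAND ----------

# MAGIC %sql
# MAGIC -- This cell executes as Spark SQL inside Databricks:
# MAGIC SELECT region, COUNT(*) FROM sales GROUP BY region
```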
Thanks for the lesson. Because you have multiple languages, plus ‘magic’, in a single file, analysis won’t “just work”. We would have to do extra work to:
- ignore the magic
- make each language find and analyze only its parts of the file
None of that work has been done yet, so I’m going to move this topic to the ‘New rules / language support’ category and flag it for the product managers.
Welcome to the community and thank you for your question.
While SonarCloud does not have a direct integration with Azure Databricks, we do support analyzing Jupyter notebooks (files with the .ipynb extension) via our SonarLint extension in VS Code. So if your data scientists are using VS Code, they would simply need to add the SonarLint extension and scan locally. Additionally, they would likely use an extension such as the one provided by Microsoft for VS Code to connect to the remote workspace in Databricks.
Since there are multiple ways an Azure Databricks integration could work with SonarCloud, it would be great to hear in more detail what the development workflows and tooling look like for your team, as they can vary greatly.

As for multiple-language support: since PySpark is a Python library (and therefore coded in Python), our analysis will run as normal within Python code, although we do not have rules specifically for the PySpark library. I’d like to hear more about what you’re looking for with regard to PySpark, Spark SQL, and Databricks magic commands. I’m happy to discuss this in more detail and have sent you a private message; you’re also more than welcome to leave your thoughts about a Databricks integration here.
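To illustrate the PySpark point: because PySpark code is just Python, our generic Python rules already apply to it even without library-specific rules. A minimal sketch (the function and the unused variable are purely illustrative, no Spark needed):

```python
# A PySpark-style helper; "sum_by_region" and "row_count" are made up
# for this example. The unused local variable below is the kind of
# generic Python finding the analysis reports in any Python file.
def sum_by_region(rows):
    row_count = len(rows)  # unused local: a generic Python finding
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0) + amount
    return totals

print(sum_by_region([("EU", 10), ("EU", 5), ("US", 20)]))
# → {'EU': 15, 'US': 20}
```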