How does SonarQube analyse source code?

Hello, I am new to Sonar, and I need to develop a plugin for C source code analysis.

I have some questions about the SonarQube source code on GitHub:

1. Is all of the source code analysis implemented in the plugin? For example, if you develop the SonarJava plugin, you can use SonarQube to analyse Java, and if you develop SonarC++, you can analyse C++. Is there no need to develop a new Sonar Scanner?

2. How do the language plugins get the input source code? For example, how does SonarJava get the input Java code? There are modules such as its, java-checks, java-frontend, sonar-java-plugin, etc. What is the function of each of those modules? And how do the language plugins send their reports to the SonarQube server so that they are shown in the web pages?

3. Does the syntax tree that SonarQube uses resemble ANTLR's? In other words, can I develop a C plugin based on ANTLR?

Please help and advise
Thank you very much

Hello @for-just-we,

I think this documentation will give you a good introduction to developing a custom plugin. To answer your questions specifically:

  1. Yes, everything related to analysis is done in the plugin itself. The scanner just provides access to the filesystem and the project configuration.
  2. Language plugins use the org.sonar.api.batch.fs.FileSystem#inputFiles API to get access to the filesystem.
  3. SonarQube doesn’t provide the AST. It is the plugin's responsibility to parse the source files, and it can use any mechanism to do so.

For your information, SonarSource already provides a plugin for C code. Maybe you want to try it before developing your own? If you have tried it and disagree with its results, please share them here so that we can improve it.

This plug-in is available for free if you use SonarCloud to analyze an open-source project.

Developing a plug-in for C is not such an easy task, especially because you have to run the preprocessor before having something you can base an AST on, which also means you need to know how the file was compiled…
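
To make the preprocessing point concrete, here is a toy Java sketch (Java because the plugins discussed in this thread are written in it). `ToyPreprocessor` and `expand` are hypothetical names. It handles only object-like `#define` lines and ignores includes, conditionals, and function-like macros, so it is an illustration rather than a usable preprocessor. A real analysis would rely on the actual compiler's preprocessor and the project's compile options.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy illustration only: expands object-like #define macros, nothing else. */
public class ToyPreprocessor {

    // Matches lines such as: #define BUF_SIZE 16
    // Function-like macros (#define F(x) ...) do not match and pass through untouched.
    private static final Pattern DEFINE =
            Pattern.compile("^\\s*#\\s*define\\s+([A-Za-z_]\\w*)\\s+(.+?)\\s*$");

    /** Returns the source with the matched #define lines dropped and their names substituted. */
    public static String expand(String source) {
        Map<String, String> macros = new LinkedHashMap<>();
        StringBuilder out = new StringBuilder();
        for (String line : source.split("\n", -1)) {
            Matcher m = DEFINE.matcher(line);
            if (m.matches()) {
                // Remember the macro and drop the directive line itself.
                macros.put(m.group(1), m.group(2));
                continue;
            }
            String expanded = line;
            for (Map.Entry<String, String> e : macros.entrySet()) {
                // Replace whole identifiers only, so BUF_SIZE2 is left alone.
                expanded = expanded.replaceAll(
                        "\\b" + Pattern.quote(e.getKey()) + "\\b",
                        Matcher.quoteReplacement(e.getValue()));
            }
            out.append(expanded).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String source = String.join("\n",
                "#define BUF_SIZE 16",
                "char buf[BUF_SIZE];",
                "buf[BUF_SIZE] = 0;");
        System.out.print(expand(source));
    }
}
```

For the three-line input in `main`, the array size and the index appear as the literal `16` only after expansion. A parser working on the raw text would see just the identifier `BUF_SIZE`, which is why an AST built without preprocessing is of limited use for checks such as buffer bounds.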

Thank you for your reply. I still have some questions:

  1. Since SonarQube doesn’t provide the AST, do plugins such as SonarJava provide their own AST?
  2. In SonarJava there are, for example, the interface Tree and the class SubscriptionVisitor. Are those AST-related classes and interfaces defined by SonarSource?
  3. If so, could I use the ANTLRv4 API for the AST?
  4. If I develop a new C plugin, do I just need to do what the documentation says? Are there no extra requirements?

Thank you again

Thank you for your reply.

  1. Here is my situation. Our group is working on code vulnerability detection, focusing on CWE-119 and CWE-399, not on code smells or general bugs. The method is based on the deep-learning model VulDeePecker, and there is still a lot to do in generating code gadgets. What I need to do is develop a Sonar plugin based on that.

  2. I know of two plugins for C. The first is CFamily, but it seems that it is not free. The second is sonar-cxx, but it must be used together with other tools such as Cppcheck, and I have already made some modifications based on it. So I don’t know which one you are talking about.

BTW, thank you again.

Hello @for-just-we,

I’m talking about the CFamily plugin. It is free in the situation I described (which allows you to test its capabilities), but you must pay for it in other situations (and I would weigh this cost against the cost of developing and maintaining your own plugin).
I don’t know much about the community plugin sonar-cxx, and can’t help you with it.

In CFamily, we have a rule that targets CWE-119. For CWE-399, I’m not sure: it seems to be a category, and a broad one, so I cannot tell for sure. We do have rules that cover some aspects of this category.

If your goal is not just to find issues in your code, but to explore the specific tool VulDeePecker, then you are on your own. Some plugins provide the AST in some form so that users can develop custom plugins on top of them. This is the case for Java, but not for CFamily. So you would have to create your own AST, using the tool of your choice (ANTLR or any other tool, but, as I said in my previous post, I don’t think it would be a good choice unless you run ANTLR on already preprocessed code…).

Hope this helps…

OK, I appreciate the information about CFamily.

My work is indeed about exploring the specific tool VulDeePecker. We could use it to detect CWE-119 and CWE-399, but we still need to improve the generation of code gadgets.

I still have a question.

In the documentation, there are 6 points about parsing a new language. What confuses me is how to feed my source code into the new plugin. I know that the org.sonar.api.batch.fs.FileSystem#inputFiles API gives access to the filesystem, but how do I use it to generate an AST? I saw that the classes Context and SensorContext are important in SonarJava, as is PythonVisitorContext in SonarPython. I just didn’t see how they relate to org.sonar.api.batch.fs.FileSystem#inputFiles.

I appreciate your help again, and please enlighten me :joy::joy:

Because these plugins internally build the AST for their own purposes, and allow external plugins to access these internal ASTs. For C, you are on your own, and you should create the AST directly from the files, using whatever technique is appropriate for you.

In your situation, SonarQube will not help you write your analyzer; it will only help you store, manage, and display the results of your analysis.

Your answer just led me to the real questions :joy:

  1. How do I get the input files into the plugin? In our project, we take .cpp files directly as input and produce XML as output. So if I intend to turn our project into a plugin, could we use org.sonar.api.batch.fs.FileSystem#inputFiles to get the files and create an InputStream from each one in order to build the AST, or should I find another way? There is no documentation about the basic requirements for developing a totally new plugin, only about how to extend existing plugins.

  2. How do we output our results to SonarQube? Are there any classes we should extend or interfaces we should implement?

BTW, it’s really enlightening to talk with you :joy:

Hello @for-just-we,

Have you looked at the example plug-in, and especially at this part on creating rules? It shows how to retrieve the list of files associated with a language (hasLanguage), and how to output violations to SonarQube (the newIssue part).

You “just” have to put real code analysis between those lines :slight_smile:
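
The shape of that work can be sketched in plain Java. This sketch deliberately does not use the Sonar API, so that it runs standalone: in a real plugin, step 1 would be the `hasLanguage` file query and step 3 the `newIssue` call mentioned above. `AnalysisPipelineSketch`, `Finding`, and the `strcpy` check are hypothetical placeholders, not anything from the example plug-in, and the check merely stands in for a real analyzer.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Plain-Java sketch of the three steps; it does not use the Sonar API. */
public class AnalysisPipelineSketch {

    /** Hypothetical stand-in for what a plugin would report as an issue. */
    public static final class Finding {
        public final Path file;
        public final int line;
        public final String message;

        public Finding(Path file, int line, String message) {
            this.file = file;
            this.line = line;
            this.message = message;
        }
    }

    /** Step 1: collect the files of the language (here: by the .c extension). */
    public static List<Path> findCFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".c"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    /** Step 2: the "real code analysis" goes here; this placeholder flags strcpy calls. */
    public static List<Finding> checkLines(Path file, List<String> lines) {
        List<Finding> findings = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (lines.get(i).contains("strcpy(")) {
                // Line numbers are reported 1-based.
                findings.add(new Finding(file, i + 1, "Unbounded copy with strcpy"));
            }
        }
        return findings;
    }

    /** Step 3: gather the findings of every file so that they can be reported. */
    public static List<Finding> run(Path root) throws IOException {
        List<Finding> all = new ArrayList<>();
        for (Path file : findCFiles(root)) {
            all.addAll(checkLines(file, Files.readAllLines(file, StandardCharsets.UTF_8)));
        }
        return all;
    }

    public static void main(String[] args) throws IOException {
        // Create a throwaway project containing one C file, then analyze it.
        Path root = Files.createTempDirectory("sketch");
        Files.write(root.resolve("demo.c"), Arrays.asList(
                "#include <string.h>",
                "void f(char *d, const char *s) {",
                "  strcpy(d, s);",
                "}"), StandardCharsets.UTF_8);
        for (Finding f : run(root)) {
            System.out.println(f.file.getFileName() + ":" + f.line + " " + f.message);
        }
    }
}
```

Keeping the analysis step free of any reporting code means an existing standalone analyzer can be reused largely as it is, with only the file discovery and the reporting replaced by the plugin API.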

Hello, I looked at this plugin about a month ago. At that time I was confused by many things, and I am still confused about how this plugin works. There are a lot of resource files, such as HTML files, as well as the rule definitions and the sensor definitions, which need to be loaded. Maybe I should learn how those are loaded before adding our code analysis to the plugin. I wish there were some detailed documentation about the structure of some open-source Sonar plugins. I guess it will take some days to figure out.

BTW, I have learned many things from this conversation, thank you.

The simplest option would be to follow "Help with developing a custom plugin for c and c++", as recommended in the other thread.