Custom rule to detect an international language (like French)

Hello,

As a study project, I have to write custom rules for SonarQube (the tool used by the company) that enforce the company's coding conventions.

After doing some research and following the tutorial on how to create a SonarQube plugin, I still do not know if everything I want to do is feasible.

The biggest rule I would like to write would check whether people write in French (one of the rules to respect), but I don’t know how to go about it. I admit I don’t fully understand how to write a rule correctly.

Our projects are mainly in Java JEE.

Would you have any advice or pointers? I had the idea of starting with a dictionary, but I have no idea if it can work. If not, I’ll have to find other third-party solutions…

Thanks for your help,
Elio.

Hi Elio,

The docs point to a tutorial on writing custom rules. That should get you started.

For what you want to do, I think you need to start by defining the target. Is it comments? Symbol names? Something else?

For identifying French, I suspect a dictionary isn’t practical - Google tells me there are ~130k words in French. So then you’ll need to identify characteristics to target. Perhaps “et”, “la” & “le”? Or maybe the use of accents? Words with an excessive vowel count? (Sorry, couldn’t resist. :joy: )
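To make the "characteristics" idea concrete, here is a minimal sketch in plain Java. It assumes a small hand-picked list of French function words and treats any accented letter as a hint. The class name `FrenchHeuristic`, the word list and the threshold are all hypothetical, and nothing here uses the SonarQube plugin API.

```java
import java.text.Normalizer;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper for illustration only; not part of any SonarQube API. */
final class FrenchHeuristic {

    // Assumption: a tiny, non-exhaustive list of common French function words.
    private static final Set<String> MARKERS = Set.of(
            "et", "la", "le", "les", "des", "une", "du", "pour", "dans", "avec", "est", "sur");

    private FrenchHeuristic() {}

    /** Fraction of words that are marker words or contain an accented letter. */
    static double score(String text) {
        // Split on anything that is not a letter, so punctuation and digits are ignored.
        String[] words = text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+");
        int total = 0;
        int hits = 0;
        for (String word : words) {
            if (word.isEmpty()) {
                continue; // split() yields a leading empty string if the text starts with a separator
            }
            total++;
            if (MARKERS.contains(word) || hasAccent(word)) {
                hits++;
            }
        }
        return total == 0 ? 0.0 : (double) hits / total;
    }

    /** True if the word contains a letter that decomposes into base letter + combining mark. */
    static boolean hasAccent(String word) {
        String decomposed = Normalizer.normalize(word, Normalizer.Form.NFD);
        return decomposed.chars().anyMatch(c -> Character.getType(c) == Character.NON_SPACING_MARK);
    }

    /** The threshold is an arbitrary tuning knob, to be calibrated on real comments. */
    static boolean looksFrench(String text, double threshold) {
        return score(text) >= threshold;
    }
}
```

Something like `FrenchHeuristic.looksFrench(commentText, 0.3)` could then be called on each comment. Expect false positives: accents and short words such as "la" also appear in other languages, so this is only a starting point.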

Once you’ve figured those things out, feel free to come back to us in new threads with specific development questions.

 
HTH,
Ann

Hello Ann,

Thank you for the answer. I have already done the tutorial, but it doesn’t help me much. Moreover, I would like to identify French in all possible cases, that is to say in the names of variables, methods, functions, classes, etc.

So I confess I don’t know how to do it without using a dictionary :confused:
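Whatever the word source ends up being, identifiers first have to be cut into words before any lookup can happen. The sketch below is a rough illustration of that step in plain Java: it splits camelCase and snake_case names, then checks each piece against a word set. The class `IdentifierWords` and the six-word set are hypothetical placeholders; a real rule would load a full word list from a file and would get the identifiers from the analyzer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper for illustration only; not part of any SonarQube API. */
final class IdentifierWords {

    // Assumption: a tiny stand-in for a real French word list loaded from a file.
    private static final Set<String> FRENCH_WORDS = Set.of(
            "calculer", "montant", "facture", "nom", "utilisateur", "prix");

    private IdentifierWords() {}

    /** Splits an identifier such as calculerMontantFacture or NOM_UTILISATEUR into lowercase words. */
    static List<String> split(String identifier) {
        List<String> words = new ArrayList<>();
        // Break on underscores, dollar signs and digits, on lower-to-upper transitions,
        // and between an acronym and the capitalized word that follows it.
        String boundary = "[_$0-9]+|(?<=\\p{Ll})(?=\\p{Lu})|(?<=\\p{Lu})(?=\\p{Lu}\\p{Ll})";
        for (String part : identifier.split(boundary)) {
            if (!part.isEmpty()) {
                words.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return words;
    }

    /** True if at least one word of the identifier is in the (placeholder) French word set. */
    static boolean containsFrenchWord(String identifier) {
        return split(identifier).stream().anyMatch(FRENCH_WORDS::contains);
    }
}
```

Words shared between French and English ("table", "date", "client") and abbreviations will still need a policy of their own, whichever word list is used.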

Elio

Hi Elio,

Good luck, and prepare for looong analyses.

 
Ann