SonarQube and Non-English encoding

Our firm is based in a French-speaking region of Canada, so our code comments use accented characters such as é, à, and û.

As a result, our source code is normally encoded in Windows-1252, which happens to be the default encoding for our region.

However, when I run a scan without specifying anything special, I get the following dreaded error:
SonarQube encoding is ‘UTF-8’, Roslyn encoding is ‘windows-1252’. File will be skipped.

But only for some files. It seems that Roslyn uses UTF-8 when it can…

So I tried passing /d:sonar.sourceEncoding=windows-1252 as a scanner parameter.

And… it was worse (as there are more files with no accents): all the Windows-1252 files were now imported, but the UTF-8 files were not!

SonarQube encoding is ‘windows-1252’, Roslyn encoding is ‘UTF-8’. File will be skipped.
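
The two errors above point to a mix of encodings in the code base. A plausible explanation for the "some files only" behaviour is that files without accents are pure ASCII, which is byte-identical in UTF-8 and Windows-1252, so only files with accented characters can disagree. The following is a rough Python sketch for listing which files fall in which group; the function names and the `*.cs` pattern are my own assumptions, and it does not reproduce how Roslyn or SonarQube actually detect encodings.

```python
from __future__ import annotations

from pathlib import Path

UTF8_BOM = b"\xef\xbb\xbf"


def classify_encoding(data: bytes) -> str:
    """Return a rough encoding label for the raw bytes of a source file."""
    if data.startswith(UTF8_BOM):
        return "utf-8-bom"
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8. In this thread's setting that probably means
        # Windows-1252, but this check alone cannot prove it.
        return "non-utf-8"
    # Pure ASCII bytes are the same in UTF-8 and Windows-1252.
    return "ascii" if text.isascii() else "utf-8"


def scan(root: str, pattern: str = "*.cs") -> dict[str, list[Path]]:
    """Group files under root by the label from classify_encoding."""
    groups: dict[str, list[Path]] = {}
    for path in sorted(Path(root).rglob(pattern)):
        label = classify_encoding(path.read_bytes())
        groups.setdefault(label, []).append(path)
    return groups


def report(root: str) -> None:
    """Print one line per group, followed by the files in that group."""
    for label, paths in scan(root).items():
        print(f"{label}: {len(paths)} file(s)")
        for path in paths:
            print(f"  {path}")
```

Calling `report(".")` from the solution folder prints the groups. Files labelled "non-utf-8" are the ones that cannot match a UTF-8 setting, and files labelled "utf-8" or "utf-8-bom" are the ones that cannot match a Windows-1252 setting.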

So I tried converting all the code to UTF-8.

But then MSBuild crashed with an invalid <?> character error!

Then I tried reverting to the original code and converting everything to Windows-1252.

And I still got the following error:
SonarQube encoding is ‘windows-1252’, Roslyn encoding is ‘UTF-8’. File will be skipped.

Please help me, I’m going insane.

Hi, @Michel.Racicot

I apologize, this slipped under our radar because it didn’t have a “csharp” or “dotnet” label. Adding it now.

Did you manage to fix the problem?

It turns out that there is a semi-hidden feature in Visual Studio projects that can fix this:

You can specify an alternate code page for the entire project with the CodePage tag, like so:

<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="12.0" DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Import Project="$(MSBuildExtensionsPath)\$(MSBuildToolsVersion)\Microsoft.Common.props" Condition="Exists('$(MSBuildExtensionsPath)\$(MSBuildToolsVersion)\Microsoft.Common.props')" />
  <PropertyGroup>
    <CodePage>1252</CodePage>
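
For a solution with many projects, it may help to check which project files already carry this setting. Here is a minimal Python sketch under a few assumptions: the helper names are hypothetical, project files are assumed to end in `.csproj`, and the element is matched by its local name so that it works with or without the MSBuild XML namespace.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET
from pathlib import Path


def find_code_page(project_xml: bytes) -> str | None:
    """Return the text of the first <CodePage> element, or None if absent."""
    root = ET.fromstring(project_xml)
    for element in root.iter():
        # Drop an optional "{namespace}" prefix such as the MSBuild 2003 one.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "CodePage":
            return (element.text or "").strip()
    return None


def projects_without_code_page(root: str) -> list[Path]:
    """List .csproj files under root that have no <CodePage> element."""
    return [
        path
        for path in sorted(Path(root).rglob("*.csproj"))
        if find_code_page(path.read_bytes()) is None
    ]
```

Running `projects_without_code_page(".")` from the solution folder returns the projects that would still need the tag added by hand.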

Thanks, @Michel.Racicot !

We’ll keep an eye on this subject, as we need to better understand encoding problems and improve our error messages.