Project Name with "accents" or "ñ" character is not working in SonarCloud

scanner
sonarcloud
bitbucket
encoding
sonarqube

(Rubén Recacha) #1

Hi!

Our project names are in Spanish, but when we run the analysis from Bitbucket Pipelines (Sonar-Scanner), the non-ASCII characters (accented letters or “ñ”) in the name are not handled correctly.

EXAMPLE OF NAME: Rubén España

Our bitbucket-pipelines.yml is:

    clone:
      depth: 1

    pipelines:
      custom:
        test:
          - step:
              caches:
                - sonarcloud
              script:
                - export LANG=en_US.UTF-8
                - export LC_ALL=C
                - curl --insecure -OL https://sonarsource.bintray.com/Distribution/sonar-scanner-cli/sonar-scanner-cli-3.2.0.1227-linux.zip
                - unzip sonar-scanner-cli-3.2.0.1227-linux.zip
                - ./sonar-scanner-3.2.0.1227-linux/bin/sonar-scanner -Dsonar.host.url=https://sonarcloud.io -Dsonar.login=$<TEST> -Dsonar.organization=<TEST>

    definitions:
      caches:
        sonarcloud: ~/.sonar/cache

and the sonar-project.properties configuration file is:

    sonar.projectKey=<KEY>
    sonar.projectName=Rubén España
    sonar.sources=<SOURCES>
    sonar.language=php
    sonar.sourceEncoding=UTF-8

We have also tried the following script lines, but they do not work either:

    script:
      - export LANG = en_US.UTF-8
      - export LANG = en_US.UTF-8
      - export LC_ALL = C

In the Sonar-Scanner analysis log, the project name then always appears as: “Scan… Rub??n Espa??a”

How can we configure the image/environment or the analysis so that the project name is handled correctly?

Thank you!!

Kind Regards,
Rubén Recacha


(Fabrice Bellingard) #2

Hi Rubén, your sonar-project.properties file is encoded in UTF-8, whereas it is expected to be encoded in Latin-1 (ISO-8859-1).
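
To illustrate the mismatch, here is a minimal Python sketch (not part of the scanner, just a reproduction of the effect). It assumes the name is stored as UTF-8, read as ISO-8859-1, and then printed on a console that only supports ASCII, which is one plausible explanation for the two question marks per accented letter:

```python
# Sketch: reproduce the garbled project name reported in the thread.
name = "Rubén España"

# The file on disk holds UTF-8 bytes; "é" and "ñ" take two bytes each.
utf8_bytes = name.encode("utf-8")

# A reader that assumes ISO-8859-1 maps every single byte to one character,
# so each accented letter becomes two unrelated characters.
misread = utf8_bytes.decode("iso-8859-1")

# Assumption: the console is ASCII-only (for example with LC_ALL=C), so each
# character it cannot represent is replaced by "?".
shown = misread.encode("ascii", errors="replace").decode("ascii")

print(shown)  # Rub??n Espa??a
```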


(Rubén Recacha) #4

Hi @Fabrice_Bellingard

How can I solve it? How can I convert the file to ISO-8859-1 (with a script on the server, perhaps)?

I think the problem is Atlassian's default image (Ubuntu 14.04), which does not allow special characters (accents, ‘ñ’, etc.).

Thanks!

Regards,
Rubén Recacha


(Julien Henry) #5

Hi,

We rely on Java properties file support, and the encoding of .properties files is assumed to be ISO-8859-1 with JRE < 9. The safest way is to convert your text containing non-ASCII characters using the native2ascii tool:
https://native2ascii.net/

For example, Rubén España should be written as Rub\u00e9n Espa\u00f1a
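
If the native2ascii tool is not at hand, the same escaping can be sketched in a few lines of Python. The helper name `escape_for_properties` is hypothetical; it is an illustration of the `\uXXXX` convention, not the native2ascii implementation:

```python
def escape_for_properties(text: str) -> str:
    """Replace every non-ASCII character with Java-style \\uXXXX escapes."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            # Plain ASCII is kept as-is.
            out.append(ch)
        else:
            # Java escapes are UTF-16 code units, so characters outside the
            # Basic Multilingual Plane become a surrogate pair (two escapes).
            units = ch.encode("utf-16-be")
            for i in range(0, len(units), 2):
                out.append("\\u%04x" % int.from_bytes(units[i:i + 2], "big"))
    return "".join(out)


print(escape_for_properties("Rubén España"))  # Rub\u00e9n Espa\u00f1a
```

The result is pure ASCII, so the value in sonar-project.properties no longer depends on which encoding the file is read with.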

Another option is to try using JRE 9 or greater, since it will first try to open the .properties file as UTF-8.