Invalid character warning UTF-8

  • versions used (SonarQube 7.9.1 LTS, sonar-java-plugin 5.14.0.18788)
  • error observed

When invalid characters are detected in the code, the only notification is a warning in the scan logs, which makes the problem very hard to notice and address. We really need this kind of information to be conveyed as Issues. Also, the specific character that prompts the message appears to be a valid UTF-8 character, though one that would still be worth raising an issue about: the Unicode “Replacement Character” (U+FFFD), https://www.fileformat.info/info/unicode/char/0fffd/index.htm

This is what we see in the scan log.

[WARNING] Invalid character encountered in file /my/path/to/File.java at line 9999 for encoding UTF-8. Please fix file content or configure the encoding to be used using property 'sonar.sourceEncoding'.
  • steps to reproduce
    A Java file containing the UTF-8 byte sequence EF BF BD (U+FFFD) in a comment, like so:
package test;

/**
 * Replacement Character in UTF-8 �
 */
public class App 
{
    public static void main( String[] args ) {
    }
}
  • potential workaround
    Manually scan the SonarQube logs, or add automated log scanning to the Continuous Integration builds to create issues in JIRA or a similar tracker, so that the problems can be seen and fixed.
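
    For anyone who wants to automate that workaround, here is a minimal sketch of a log scanner. It assumes the scan output has been saved to a file and that the warning wording matches the line quoted above (it may differ in other versions). The class name InvalidCharLogScan and the Hit type are made up for illustration and are not part of SonarQube.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of SonarQube: extracts "invalid character" warnings from a saved scan log.
public class InvalidCharLogScan {

    // Pattern derived from the warning text quoted in this post; adjust if your version words it differently.
    private static final Pattern WARNING = Pattern.compile(
            "Invalid character encountered in file (.+) at line (\\d+) for encoding (\\S+)\\.");

    // One parsed warning: which file, which line, and the encoding the scanner assumed.
    public static final class Hit {
        public final String file;
        public final int line;
        public final String encoding;

        Hit(String file, int line, String encoding) {
            this.file = file;
            this.line = line;
            this.encoding = encoding;
        }

        @Override
        public String toString() {
            return file + ":" + line + " (" + encoding + ")";
        }
    }

    // Returns one Hit per log line that contains the warning; other lines are ignored.
    public static List<Hit> parse(List<String> logLines) {
        List<Hit> hits = new ArrayList<>();
        for (String logLine : logLines) {
            Matcher m = WARNING.matcher(logLine);
            if (m.find()) {
                hits.add(new Hit(m.group(1), Integer.parseInt(m.group(2)), m.group(3)));
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java InvalidCharLogScan <scan-log-file>");
            System.exit(2);
        }
        // Assumes the log file itself is readable as UTF-8.
        List<Hit> hits = parse(Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8));
        for (Hit hit : hits) {
            System.out.println(hit);
        }
        // A non-zero exit code lets a CI step fail or trigger ticket creation.
        System.exit(hits.isEmpty() ? 0 : 1);
    }
}
```

    Run against the saved log, this prints one file:line entry per warning and exits non-zero when any are found, so a CI step can fail the build or hand the list to whatever creates the tickets. It is still only a stopgap; having the scanner raise real Issues would be the proper fix.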