Suspicious multiple warnings about encoding on binary files

  • ALM used: GitHub

  • CI system used: Jenkins CI

  • Scanner command used when applicable:

    mvn -Duser.home=/home//slave/workspace/CI-PR-1111/.home org.sonarsource.scanner.maven:sonar-maven-plugin:sonar --batch-mode -pl --also-make --no-snapshot-updates -Psonar -Dsonar -Dsonar.projectName= -Dsonar.projectKey= -Dsonar.organization= -Dsonar.host.url=https://sonarcloud.io -Dsonar.token= -Dsonar.pullrequest.branch=other/branchname -Dsonar.pullrequest.key=1111 -Dsonar.pullrequest.base=develop -Dmaven.repo.local=/home//slave/workspace/CI-PR-1111/.home/.m2/repository -f /home//slave/workspace/CI-PR-1111/Parent/pom.xml

  • Languages of the repository: Java, JS, TS

  • Warnings in the command execution log:


[WARNING] Invalid character encountered in file <hidden for privacy>.dll at line 1 for encoding UTF-8. Please fix file content or configure the encoding to be used using property ‘sonar.sourceEncoding’.….
[WARNING] Invalid character encountered in file <hidden for privacy>.png at line 1 for encoding UTF-8. Please fix file content or configure the encoding to be used using property ‘sonar.sourceEncoding’.
[WARNING] Invalid character encountered in file <hidden for privacy>.woff at line 1 for encoding UTF-8. Please fix file content or configure the encoding to be used using property ‘sonar.sourceEncoding’.


We face multiple warnings about encoding inside BINARY files. Some details:

  1. It affects only pull requests (from both short-lived and long-lived branches); branch analysis is not affected.

  2. It started to happen between April 2025 and August 2025 (we cannot triage more precisely because we don’t have CI logs in-between).

It seems that something redundant is done: why does Sonar try to parse binary files?

Hi,

For that, we’d need to see your analysis configuration and your analysis logs.

It might just be easier to set an exclusion for those files.

 
HTH,
Ann

I've sent the logs privately, @ganncamp.

The analysis configuration is also available there (at least the command that was executed).

If that’s not enough please let me know.

Hi,

Thanks for the log. I see these warnings are coming from the Docker portion of the scan.

I’ve flagged this for the relevant folks. In the meantime, you can simply ignore these warnings if you like. Or, you could edit the Docker file patterns (Administration → Languages → Docker) to make sure it skips these files.

Speaking of, could you share what your Docker file patterns are currently set to, at both the global and project levels?

 
Thx,
Ann

@ganncamp

We haven’t customized anything related to Docker in SonarCloud.

Here are our settings

Project level:

File Patterns:

*.dockerfile

Dockerfile

dockerfile

Global level: I couldn’t find the corresponding settings, assuming you meant the Organization settings.

Hi,

Thanks for this. Let’s see what the language experts say when they show up.

 
Thx,
Ann

Hello @lrozenblyum,

thanks for reporting the issue!

TL;DR: The Docker analysis is not actually analyzing those files. The warnings are logged while the scanner determines each file's git status, which it needs in order to decide whether the file should be analyzed at all.

Looking into the issue in more detail, the warning is not coming directly from the Docker analysis.

The Docker Sensor does not pick its input files itself; it receives them from the Scanner Engine.

The Scanner Engine contains an optimization for Pull Request analysis: it restricts the files passed to the Docker Sensor to changed files only (logged as "Sensor IaC Docker Sensor is restricted to changed files only").

To determine which files have changed, the scanner calculates some metadata for all files. Most importantly here, it calculates the git status.

However, while calculating this metadata, the scanner also tries to reason about the charset used in each file, and that is where the WARNING log messages come from.

This is all happening before the Docker analysis, so these files never reach the Docker analyzer. They’re definitely not analyzed here and changing the Docker file patterns won’t make any difference.
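
To illustrate why a binary file trips such a charset check, here is a minimal Java sketch. It is not the scanner's actual code: the class `Utf8Check` and the method `isValidUtf8` are hypothetical names used only for illustration. It decodes bytes strictly as UTF-8 and reports whether that succeeds; the first byte of a PNG signature (0x89) is not a valid start of a UTF-8 sequence, so decoding fails immediately, which matches the "at line 1" in the warnings.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Check {

    // Hypothetical helper (not part of the Sonar scanner): returns true only
    // if the given bytes form a valid UTF-8 sequence.
    public static boolean isValidUtf8(byte[] content) {
        // REPORT makes the decoder throw instead of silently substituting
        // a replacement character for invalid bytes.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(content));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Plain text, as a Dockerfile would contain.
        byte[] text = "FROM alpine".getBytes(StandardCharsets.UTF_8);
        // The 8-byte PNG file signature; 0x89 cannot start a UTF-8 sequence.
        byte[] png = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};

        System.out.println(isValidUtf8(text)); // true
        System.out.println(isValidUtf8(png));  // false
    }
}
```

Any check of this kind will flag most .dll, .png and .woff files, regardless of whether a language analyzer would later look at them.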

If you want to exclude the files in question completely, including from the scanner's metadata step, you can add them to the exclusions via sonar.exclusions.

In your case, I would expect something like sonar.exclusions=**/*.dll to work, with similar patterns for the other binary file types.

I hope this resolves the issue, let us know if you have further questions.

I’ll additionally mention this to the Scanner experts, to see if we can improve the situation here.

Best

Jonas