SonarXML bug when parsing multiline comments in DOCTYPE definition

xml

(Jose Maria Cubel) #1

Hello.

Since upgrading my SonarQube instance to version 6.x LTS and updating my SonarXML plugin, I get an exception when the XmlSensor tries to highlight some XML code. I never had problems with those XML files before the upgrade.

Versions used and project config:

  • SonarQube: 6.7.5
  • SonarXML: 1.5.1.1452
  • SonarQube Maven Plugin: 3.5.0.1254
  • pom source encoding: iso-8859-1
  • source file encoding: iso-8859-1

Error observed:

  • This simple XML file produces an exception when analyzed (I’ve simplified the original one to focus on the problem):
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE menu [
<!--
Some comment
-->
<!ELEMENT menu (modulo)* >
]>
<menu>
</menu>
  • This is the exception I get:
[ERROR] Failed to execute goal org.sonarsource.scanner.maven:sonar-maven-plugin:3.5.0.1254:sonar (default-cli) on project foo: Unable to analyse file C:/tmp/foo/include/foo.xml: Unable to highlight file include/foo.xml: 5 is not a valid line offset for pointer. File include/foo.xml has 2 character(s) at line 7 -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.sonarsource.scanner.maven:sonar-maven-plugin:3.5.0.1254:sonar (default-cli) on project foo: Unable to analyse file C:/tmp/foo/include/foo.xml
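
To relate the message to the sample file, here is a minimal Python sketch that checks whether a (line, offset) pointer fits inside a text. The function name `is_valid_pointer` and the rule that an offset may be at most the line length are assumptions made for illustration; this is not the SonarQube implementation.

```python
# Hypothetical helper, not part of SonarQube or SonarXML.
def is_valid_pointer(text: str, line: int, offset: int) -> bool:
    """Return True if (line, offset) falls inside `text`.

    Lines are 1-based. Assumption: an offset is accepted when it is
    between 0 and the length of the line, inclusive.
    """
    lines = text.splitlines()
    if not 1 <= line <= len(lines):
        return False
    return 0 <= offset <= len(lines[line - 1])


# The simplified XML file from this post.
SAMPLE = (
    '<?xml version="1.0" encoding="iso-8859-1"?>\n'
    "<!DOCTYPE menu [\n"
    "<!--\n"
    "Some comment\n"
    "-->\n"
    "<!ELEMENT menu (modulo)* >\n"
    "]>\n"
    "<menu>\n"
    "</menu>\n"
)

# Line 7 of the sample is "]>", so it has 2 characters and offset 5 does not fit.
line_7 = SAMPLE.splitlines()[6]
pointer_ok = is_valid_pointer(SAMPLE, 7, 5)
```

Under these assumptions `line_7` is `]>` and `pointer_ok` is `False`, which matches the "2 character(s) at line 7" part of the message: the highlighter seems to compute a position that lies past the end of the `]>` line.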

Steps to reproduce

  • Analyze the provided XML file with the SonarQube Maven Scanner

Potential workaround

  • Based on several tests I’ve performed, my guess is that the exception is raised when the XML file has a multiline comment whose opening delimiter is immediately followed by a newline, with no whitespace in between.
  • Adding a space between the opening delimiter and the newline ("<!-- ") avoids the exception.
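
The workaround can be sketched as a small Python script that finds comment openers directly followed by a line break and adds the space. This is only an illustration of the guess above: the names `BARE_COMMENT_OPEN`, `find_bare_comment_opens` and `apply_workaround` are hypothetical, and the pattern reflects my tests, not a confirmed description of the analyzer's behaviour.

```python
import re

# "<!--" immediately followed by a line break (LF or CRLF), nothing in between.
BARE_COMMENT_OPEN = re.compile(r"<!--(?=\r?\n)")


def find_bare_comment_opens(text: str) -> list[int]:
    """Return the 1-based line numbers where "<!--" is directly followed by a newline."""
    return [
        text.count("\n", 0, match.start()) + 1
        for match in BARE_COMMENT_OPEN.finditer(text)
    ]


def apply_workaround(text: str) -> str:
    """Insert a single space after each bare "<!--", leaving the line break untouched."""
    return BARE_COMMENT_OPEN.sub("<!-- ", text)


# Small demonstration on a fragment of the sample file.
fragment = "<!DOCTYPE menu [\n<!--\nSome comment\n-->\n]>\n"
affected_lines = find_bare_comment_opens(fragment)
patched = apply_workaround(fragment)
```

When applying this to real files, read and write them with the project encoding (iso-8859-1 here) and with `newline=""` so that existing line endings are preserved, for example `open(path, encoding="iso-8859-1", newline="")`.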

Can you confirm this bug?


(Jose Maria Cubel) #2

I have another XML file that raises a similar exception.

<!DOCTYPE glassfish-web-app PUBLIC "-//GlassFish.org//DTD
GlassFish Application Server 3.1 Servlet 3.0//EN"
"http://glassfish.org/dtds/glassfish-web-app_3_0-1.dtd">
<glassfish-web-app>
    <parameter-encoding default-charset="UTF-8" />
</glassfish-web-app>

In this case, there’s no multiline comment in the XML file itself but the referenced DTD has some multiline comments in its definition.


(Michael Gumowski) #4

Hey @jmcubel ,

Thanks a lot for your very precise reproducer and early analysis. It’s quite a nice bug. It is indeed related to how the XML analyzer is handling the (lack of) spaces in comments when parsing.

I have created the following JIRA ticket to track it: SONARXML-73

I have not been able to reproduce the issue with your second example. However, I’m quite sure it does not come from the DTD, as we do not try to read it, and the analyzer does not check the XML file’s consistency either. Maybe it’s also a matter of a trailing space somewhere, but so far I have not been able to make the analysis fail on this one.

Cheers,
Michael