HTML: RSPEC-1096 does not detect title when there is no head start tag

RSPEC-1096 requires that every HTML page has a title element, which is correct.
However, when using SonarLint in Visual Studio Code, the following code snippet triggers an error message:

<!DOCTYPE html>
<html>
  <title lang="en">Test Page for Linters</title>
  <body lang="en">

This triggers the “bug” "Add a <title> to this page", even though a <title> is evidently present.
Note that the above code is valid HTML because the start and end tags for head (like the start and end tags for html and body) are optional.

Note also that the error message disappears when I delete the <html> start tag, as if SonarLint doesn’t really understand how to parse HTML.

Hello @ChristopheS,

Thank you for your valuable feedback.

You’re correct in saying that the start and end tags of an html element, including the body element, can be omitted. However, unless I’m mistaken, the HTML specification indicates that the title element must be enclosed within a head element. While your snippet is syntactically valid, it is semantically invalid due to the absence of a head element enclosing the title.

The current rule implementation assumes that a title element can only be present if it’s inside a head element. If not, it considers there to be no (semantically valid) title element. Having said that, I agree that the error message triggered by this code can be misleading since a title element is indeed present. This leads us to consider two potential adjustments to the rule behavior:

  1. We could simply reduce the noise by not triggering an error when a title element is present without a corresponding head element.
  2. Alternatively, we could still trigger an error but with a revised message that emphasizes the absence of a head element for the title.

In my opinion, the second option aligns more closely with the HTML specification, as it indicates that the title element isn’t “correctly” present. Would you agree that this option would be an appropriate fix for the rule?

Cheers,

This is an example of the classic confusion between tags and elements. html and title are elements, not tags; whereas <html>, </html>, <title> and </title> are tags, not elements.
When the HTML specification says that an element’s start tag, its end tag, or both are optional, HTML parsers are expected to infer the presence and scope of that element (or node) from the presence of other elements.
Because the HTML specification says that the start and end tags of the elements html and head are optional, the presence of those elements can be inferred from other elements, especially elements that are required. Thus, when an HTML parser does not encounter the start tags <html> and <head> and then encounters <title>Home page</title>, it is expected to know that this element can only occur inside a head element. It therefore creates a DOM node for the head element, to which it can then add the element <title>Home page</title>. In other words, even in the absence of the start tag <head>, the parser creates a head element node in the DOM. This kind of rule is why the following code is both syntactically and semantically valid:

<!DOCTYPE html>
<title>Minimal Valid HTML Page</title>
<p>This is a minimalistic but valid <abbr>HTML</abbr> page.

You can load this in a browser and check that it creates the following DOM:

[image: screenshot of the DOM tree created by the browser]

(You can test this on the page Minimal Valid HTML Page.)

Based on this, it should be clear that the title element is actually in a head element, as required by the HTML specification.
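
The inference described above can be sketched in a few lines of Python. This is only a toy model built on the standard library's html.parser tokenizer, not the tree-construction algorithm from the HTML specification, and the names ImpliedHeadSketch, locate_title, HEAD_CONTENT and TEXT_HOLDERS are made up for this illustration. It assumes that any head-only element seen before body content belongs to a head element, whether or not a <head> start tag was written.

```python
from html.parser import HTMLParser
from typing import Optional

# Elements handled by the "in head" insertion mode (simplified list).
HEAD_CONTENT = {"base", "link", "meta", "noscript", "script", "style", "template", "title"}
# Head elements whose text content must not be mistaken for body text.
TEXT_HOLDERS = {"title", "script", "style"}


class ImpliedHeadSketch(HTMLParser):
    """Toy model of head inference, not the real tree-construction algorithm."""

    def __init__(self) -> None:
        super().__init__()
        self.body_started = False        # True once body content has been seen
        self.text_owner: Optional[str] = None    # open title/script/style, if any
        self.saw_head_start_tag = False  # whether a literal <head> tag occurred
        self.title_parent: Optional[str] = None  # "head", "body", or None

    def handle_starttag(self, tag, attrs):
        if self.text_owner is not None or tag == "html":
            return
        if tag == "head":
            self.saw_head_start_tag = True
            return
        if tag not in HEAD_CONTENT:
            # <body> or any flow content ends the (possibly implied) head.
            self.body_started = True
            return
        if tag in TEXT_HOLDERS:
            self.text_owner = tag
        if tag == "title" and self.title_parent is None:
            # Before body content, the title lands in the (implied) head;
            # afterwards it ends up somewhere inside the body.
            self.title_parent = "body" if self.body_started else "head"

    def handle_endtag(self, tag):
        if tag == self.text_owner:
            self.text_owner = None

    def handle_data(self, data):
        # Non-whitespace text outside title/script/style also starts the body.
        if self.text_owner is None and data.strip():
            self.body_started = True


def locate_title(html: str) -> ImpliedHeadSketch:
    """Run the sketch over a document and return the parser with its findings."""
    parser = ImpliedHeadSketch()
    parser.feed(html)
    parser.close()
    return parser
```

Calling locate_title on either snippet from this thread gives title_parent == "head" while saw_head_start_tag stays False, which is the distinction between the head element and the <head> tag that the linter misses.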

The current rule implementation assumes that a title element can only be present if it’s inside a head element. If not, it considers there to be no (semantically valid) title element.

This is not a correct description of what the linter actually does. What the linter gets wrong is the following:

  1. If there is no <head> start tag, it assumes that there is no head element.
  2. If there is no <head> start tag, it does not consider the title element as a child of the head element (even though the presence of <title>...</title> signals that the parser is currently inside the head, as explained above)…

  1. We could simply reduce the noise by not triggering an error when a title element is present without a corresponding head element.

This would be the correct approach. The other option is in contradiction with the HTML specification.
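
To make the difference concrete, here is a rough Python sketch of the two behaviors. It is not SonarLint's actual implementation, which I have not looked at; naive_has_title is only my guess at a check that depends on a literal <head> start tag, and tolerant_has_title is a hypothetical version of option 1. Both function names are invented for this example.

```python
from html.parser import HTMLParser
from typing import List


class _StartTagCollector(HTMLParser):
    """Records the names of all start tags in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.start_tags: List[str] = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


def _start_tags(html: str) -> List[str]:
    collector = _StartTagCollector()
    collector.feed(html)
    collector.close()
    return collector.start_tags


def naive_has_title(html: str) -> bool:
    """Assumed faulty behavior: a title only counts after a literal <head> tag."""
    tags = _start_tags(html)
    return "head" in tags and "title" in tags[tags.index("head"):]


def tolerant_has_title(html: str) -> bool:
    """Hypothetical option 1: a title element counts, with or without <head>."""
    return "title" in _start_tags(html)
```

With the snippet from the first post, naive_has_title returns False and tolerant_has_title returns True. The tolerant check is deliberately crude (it would also accept a title that appears inside the body), so a real fix would still need to look at where the parser places the element.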


Ticket created. Thanks!
