Non-capturing group without quantifier specified by S5850 and proscribed by S6395

Windows 11
VS Code
Java
SonarLint v4.3.0 (not in connected mode)

I’ve found a rule conflict.
In this line of code:
Pattern fieldPat=Pattern.compile("(?<=^|;)([^;]*+;|[^;]++$)");
the regex alternation "[^;]*+;|[^;]++$"
violates the rule "Alternatives in regular expressions should be grouped when used with anchors" (java:S5850). However, when I make it compliant by adding a non-capturing group, like so:
"[^;]*+;|(?:[^;]++$)"
the group:
"(?:[^;]++$)"
violates the rule "Non-capturing groups without quantifier should not be used" (java:S6395).
Rule java:S5850 strongly implies that "[^;]*+;|(?:[^;]++$)" (effectively "a|(?:b$)") is compliant code, based on the explicitly compliant snippet "(?:^a)|b|(?:c$)".

Hello @Musky8181, and welcome to the Sonar Community.

The rule S5850 correctly raises an issue on the code:

Pattern fieldPat=Pattern.compile("(?<=^|;)([^;]*+;|[^;]++$)");

because the scope of the end-of-string anchor, i.e. $, is ambiguous.
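
To illustrate the ambiguity, here is a minimal sketch using a simplified pattern rather than the one from the question. The class name and the sample input "ax" are made up for the illustration. In Java, an ungrouped `a|b$` is parsed as `a|(?:b$)`, so the anchor constrains only the last alternative.

```java
import java.util.regex.Pattern;

class AnchorScopeDemo {
    // Ungrouped: '$' binds only to the second alternative, i.e. a|(?:b$).
    static final Pattern UNGROUPED = Pattern.compile("a|b$");
    // Grouped: '$' applies to whichever alternative matched.
    static final Pattern GROUPED = Pattern.compile("(?:a|b)$");

    // True if the pattern matches anywhere in the input.
    static boolean found(Pattern p, String input) {
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found(UNGROUPED, "ax")); // true: 'a' may match anywhere
        System.out.println(found(GROUPED, "ax"));   // false: 'a' would have to be at the end
    }
}
```

A reader who expects the second behavior but writes the first pattern gets matches they did not intend, which appears to be the situation the rule is meant to flag.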

The rule asks you to group the alternatives explicitly to show where the $ applies. You have two options:

  1. apply $ to both alternatives
Pattern fieldPat = Pattern.compile("(?<=^|;)(([^;]*+;|[^;]++)$)"); // alternatives are grouped together
  2. apply $ only to the second alternative
Pattern fieldPat = Pattern.compile("(?<=^|;)([^;]*+;|([^;]++$))"); // the $ is grouped with the second alternative
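
The two groupings are not interchangeable: they match different things. The sketch below runs both patterns over a sample input. The input "a;b;c", the class name, and the helper `matches` are assumptions made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupingOptionsDemo {
    // '$' applies to both alternatives.
    static final Pattern BOTH = Pattern.compile("(?<=^|;)(([^;]*+;|[^;]++)$)");
    // '$' applies only to the second alternative.
    static final Pattern SECOND_ONLY = Pattern.compile("(?<=^|;)([^;]*+;|([^;]++$))");

    // Collects every match of the pattern in the input, in order.
    static List<String> matches(Pattern p, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String input = "a;b;c"; // hypothetical sample input
        System.out.println(matches(BOTH, input));        // [c]
        System.out.println(matches(SECOND_ONLY, input)); // [a;, b;, c]
    }
}
```

With the anchor applied to both alternatives, only the final field can match, so the choice depends on whether every field or only the last one is wanted.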

You do not need to introduce a non-capturing group to be compliant. The fix is to group the alternatives explicitly so that the scope of each anchor is unambiguous.

Cheers,
Angelo

Thanks! Option 2 passes both rules and captures the matches I want. Just 2 gripes, though:

  1. This introduces a superfluous capturing group. It doesn’t seem to have any performance impact, but it does complicate group referencing a little.
  2. The rule S5850 documentation specifically recommends grouping alternation parts and their anchors using non-capturing groups, which makes sense, really. I think the problem rule here is S6395. It already carves out an exception for non-capturing groups containing alternations. This should be extended to cover non-capturing groups added for clarity.
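
For anyone wondering what the referencing gripe looks like in practice, here is a rough sketch. The class and method names are made up for the illustration. The extra parentheses in option 2 add a second capturing group, which is only populated when the anchored alternative is the one that matched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExtraGroupDemo {
    // Pattern from the question: one capturing group.
    static final Pattern ORIGINAL = Pattern.compile("(?<=^|;)([^;]*+;|[^;]++$)");
    // Option 2 from the reply: the added parentheses create group 2.
    static final Pattern OPTION_2 = Pattern.compile("(?<=^|;)([^;]*+;|([^;]++$))");

    // Number of capturing groups declared by the pattern.
    static int groupCount(Pattern p) {
        return p.matcher("").groupCount();
    }

    // Group 2 of the first match; null if there is no match
    // or if group 2 did not take part in the match.
    static String secondGroupOfFirstMatch(String input) {
        Matcher m = OPTION_2.matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(groupCount(ORIGINAL));             // 1
        System.out.println(groupCount(OPTION_2));             // 2
        System.out.println(secondGroupOfFirstMatch("a;b;c")); // null: first match is "a;"
        System.out.println(secondGroupOfFirstMatch("c"));     // c
    }
}
```

Group 1 still holds the whole field in both patterns, so code that reads only group 1 should be unaffected, but any groups added later in the pattern would be renumbered.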

Hello @Musky8181, thanks for sharing these two valid points.

I’m unsure if S6395 can distinguish when the non-capturing group is added for clarity, for example, in corner cases where anchors are used. I’ll check with the team and trigger a discussion.

In the meantime, the best option is to comply with both rules by giving the non-capturing group a quantifier. What about using the optional quantifier?

Pattern fieldPat = Pattern.compile("(?<=^|;)([^;]*+;|(?:[^;]++$)?)");
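
One thing worth checking before adopting this variant: making the group optional lets the second alternative match the empty string. The sketch below compares it with the pattern from the question. The sample inputs, class name, and helper `matches` are assumptions made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalQuantifierDemo {
    // Pattern from the question.
    static final Pattern ORIGINAL = Pattern.compile("(?<=^|;)([^;]*+;|[^;]++$)");
    // Variant with an optional non-capturing group.
    static final Pattern OPTIONAL = Pattern.compile("(?<=^|;)([^;]*+;|(?:[^;]++$)?)");

    // Collects every match of the pattern in the input, in order.
    static List<String> matches(Pattern p, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // Same result when the last field is non-empty.
        System.out.println(matches(ORIGINAL, "a;b;c").equals(matches(OPTIONAL, "a;b;c"))); // true
        // With a trailing ';' the optional variant adds an empty match at the end.
        System.out.println(matches(ORIGINAL, "a;b;").size()); // 2
        System.out.println(matches(OPTIONAL, "a;b;").size()); // 3
    }
}
```

If the input can end with a ';' or be empty, the extra empty match may or may not be what is wanted, so it is probably worth testing against real data.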

Cheers,
Angelo