jerone
(jerone)
September 10, 2023, 10:01am
1
When a regex character class contains several special character escapes, such as `\n`, `\r`, `\s`, and `\t`, SonarCloud rule typescript:S5869 reports them as duplicates.
const result = /^[\r\n\s\t]+/gm.exec(source);
https://sonarcloud.io/project/issues?open=AYJ_Asgte7FPvNTvNf5F&id=jerone_eslint-plugin-angular-template-consistent-this
gab
(Gabriel Vivas)
September 11, 2023, 11:36am
3
Hi @jerone ,
I think this is being raised because the `\s` character class escape already includes the `\r`, `\n`, and `\t` characters.
You should be able to write:
const result = /^\s+/gm.exec(source);
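A quick way to convince yourself that the rewrite is safe is to run both patterns on the same input. This is only a sketch: the helper `matchLeadingWhitespace` and the sample string are made up for illustration and are not part of the original project.

```typescript
// Compare the flagged pattern with the suggested rewrite on one input.
function matchLeadingWhitespace(input: string): { flagged: string | null; rewritten: string | null } {
  const flagged = /^[\r\n\s\t]+/gm.exec(input); // pattern reported by S5869
  const rewritten = /^\s+/gm.exec(input);       // suggested rewrite
  return {
    flagged: flagged ? flagged[0] : null,
    rewritten: rewritten ? rewritten[0] : null,
  };
}

// Hypothetical input: CR, LF, tab, and two spaces before the text.
const leading = matchLeadingWhitespace("\r\n\t  first line");
console.log(leading.flagged === leading.rewritten); // true
```

Both patterns consume the same leading run of whitespace here, which is what you would expect if `\r`, `\n`, and `\t` add nothing to `\s`.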
You can read more about it in MDN:
Unless I’m missing something!
Gab
Edit: I was struggling to find the right name for these; the official terminology seems to be "character class escape sequences".
jerone
(jerone)
September 11, 2023, 8:36pm
4
Oh wow, how did I miss that? You are absolutely right.
`\s` is equivalent to `[\f\n\r\t\v\u0020\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff]`.
Based on that information, maybe this rule could use a more descriptive explanation for this particular situation. Every other duplication makes sense, but the `\s` escape is itself a combination of other special characters.
It's not clear that the other special characters are duplicates of characters already covered by `\s`.
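For anyone who wants to check that equivalence rather than take it on trust, here is a rough sketch that compares `\s` with the explicit class over every UTF-16 code unit. The names `explicitWhitespace`, `shorthandWhitespace`, and `findWhitespaceMismatches` are invented for this example, and the result assumes a reasonably current JavaScript engine.

```typescript
// The explicit class quoted above, anchored so it tests exactly one character.
const explicitWhitespace =
  /^[\f\n\r\t\v\u0020\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff]$/;
const shorthandWhitespace = /^\s$/;

// Collect every UTF-16 code unit on which the two patterns disagree.
function findWhitespaceMismatches(): number[] {
  const mismatches: number[] = [];
  for (let code = 0; code <= 0xffff; code++) {
    const ch = String.fromCharCode(code);
    if (explicitWhitespace.test(ch) !== shorthandWhitespace.test(ch)) {
      mismatches.push(code);
    }
  }
  return mismatches;
}

// An empty list means the two patterns accept the same characters.
console.log(findWhitespaceMismatches());
```

An empty result confirms that `\r`, `\n`, and `\t` are all inside `\s`, so listing them next to it in the same class is redundant.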
gab
(Gabriel Vivas)
September 12, 2023, 10:24am
5
Glad this helped.
You have a good point about improving the rule description.
I created a ticket to extend the explanation in the rule.
I think we can elaborate on `\s`, `\w`, and `\d`, since those might trip up other people too!
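To illustrate how `\w` and `\d` can hide the same kind of overlap, here is a small sketch. The helper `sameOnAscii` and the two example classes are hypothetical, and I have not verified how the rule reports these exact patterns; the point is only that the extra members do not change what the class matches.

```typescript
// Does every ASCII character get the same verdict from both patterns?
function sameOnAscii(a: RegExp, b: RegExp): boolean {
  for (let code = 0; code < 128; code++) {
    const ch = String.fromCharCode(code);
    if (a.test(ch) !== b.test(ch)) return false;
  }
  return true;
}

// \d is [0-9] and \w is [A-Za-z0-9_], so the extra members below add nothing.
const wordClassRedundant = sameOnAscii(/^[\w\d_]$/, /^\w$/);
const digitClassRedundant = sameOnAscii(/^[\d0-9]$/, /^\d$/);
console.log(wordClassRedundant, digitClassRedundant); // true true
```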
opened 10:23AM - 12 Sep 23 UTC
| Label | Description |
| --- | --- |
| Rule | S5869 "Duplicate regex chars" … |
| Subject | Add clarification in the description about special character escapes |
| Community | https://community.sonarsource.com/t/false-positive-for-typescript-s5869-duplicate-regex-chars-with-special-character-classes/98995 |
### Explanation
A community member opened a thread for rule S5869 (duplicate chars in regex character class) thinking there was a false positive:
```
const result = /^[\r\n\s\t]+/gm.exec(source);
```
However, the rule is working as expected, since `\s` includes `\r`, `\n`, `\t`, among other invisible characters.
So the regex could be written this way:
```
const result = /^\s+/gm.exec(source);
```
This means the rule is accurate; however, this was not obvious from the rule description.
Understandably, users would expect to see an explicit duplicate character since this is the example provided in the rule:
```
/[0-99]/ // Noncompliant, this won't actually match strings with two digits
/[0-9.-_]/ // Noncompliant, .-_ is a range that already contains 0-9 (as well as various other characters such as capital letters)
```
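The behaviour described in those two comments can be checked directly. The snippet below is only an illustration: the `^` and `$` anchors and the variable names are added here so that each test covers exactly one character class, and they are not part of the rule's own examples.

```typescript
const singleDigitClass = /^[0-99]$/; // same set of characters as [0-9]
const wideRangeClass = /^[0-9.-_]$/; // .-_ spans U+002E to U+005F

console.log(singleDigitClass.test("7"));  // true
console.log(singleDigitClass.test("42")); // false: a class matches one character
console.log(wideRangeClass.test("A"));    // true: capital letters fall inside .-_
console.log(wideRangeClass.test("a"));    // false: lowercase letters lie above U+005F
```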
### Suggestion
We could improve the rule description to mention that there are special escape characters that work like character classes:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Regular_expressions/Character_class_escape
* We should mention: `\d`, `\D`, `\s`, `\S`, `\w`, `\W`, since some developers might not be familiar with them
* We could consider adding a specific example for `\s`, like in the case reported by the user in the community thread.
I think this will make the rule more useful, and help with the learning aspect.
system
(system)
Closed
September 22, 2023, 6:25am
8
This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.