Don't replace another useful rule with a far less useful one

Rule java:S1850 has been deprecated and the description says to use java:S2589 instead.

This is another example of what I mentioned before (Don't replace a useful rule with a useless one): replacing highly specific, pattern-based rules with vague general ones.

In this case, S1850 flags places where an instanceof operator always returns true or always returns false. The rule is of type bug, I assume on the grounds that code that looks stupid probably wasn't intended that way, so it's probably a bug. (That seems to be the rationale for a lot of the rules of type bug.)
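
Something like this (my own made-up example) is the kind of thing S1850 catches:

    String name = "Bob";
    if (name instanceof String) {  // always true: name is a non-null String here
        System.out.println("got a string");
    }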

The suggested replacement rule, S2589, points to all boolean expressions whose values are statically predictable. It’s a smell. In our code, a lot of the flagged code is of the form:

if (foo == null || (foo != null && …))

and SQ sees that the second comparison is guaranteed to be true, and the line could be simplified to

if (foo == null || …)

Obviously the dev either didn't understand boolean short-circuiting, or maybe was forced to follow some policy about null checks.

So while, yes, the second form is preferable, it's pretty clear what the intent was, and the code is merely "smelly." instanceof, on the other hand, is probably less well understood by programmers in general, so a statically predictable result is more likely to rest on a mistaken assumption (i.e., it's more likely a bug).

Hello Mister Pi,

I am not sure I am following you here. The example you provide seems like a perfectly valid case where a report is desired. Which rule is the one you think is problematic? Why do you think so?

The link I included discussed another case of essentially the same thing. What I see is rules that cover very specific patterns being deprecated in favor of more general replacements. While I can see the benefit (fewer rules to list), there is also a benefit to these more specialized rules.

In the other case (see the link), there was a rule that checked for certain patterns indicating that the wrong short-circuit logical operator was being used in some kind of null check, something like "if (foo != null || foo.bar() …", which is clearly wrong: if foo is null, the first test is false, so evaluation falls through to the second test and attempts to dereference foo. Clearly && should have been used instead of ||. But this rule has been deprecated in favor of a more generic rule about possible null pointer dereferences.
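
To make that concrete, here's a self-contained sketch of the broken pattern (Foo and bar() are made-up names):

    class Foo {
        boolean bar() { return true; }

        static boolean buggy(Foo foo) {
            // When foo is null, the first operand is false, so || goes on to
            // evaluate foo.bar() and throws a NullPointerException.
            return foo != null || foo.bar();
        }

        static boolean intended(Foo foo) {
            // && short-circuits: foo.bar() runs only when foo is non-null.
            return foo != null && foo.bar();
        }
    }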

The problem is, the old rule is close to 100% accurate, whereas the general null-pointer rule tackles a more difficult (actually, impossible) task and therefore has a high rate of false positives. (I'm not dissing SQ for not solving a problem that is intrinsically impossible to solve! :slight_smile: ) So it's a lot more productive to hand devs a list of 10 bugs that are mostly bona fide than to hand them several hundred that are mostly FPs, and have them tell me to pound sand after they determine that the first 10 or so are FPs.

So the rule I brought up here is similar. The deprecated rule determines statically that an instanceof always returns true (or false). The premise behind the rule (which I think is a good one) is that this probably indicates the coder INTENDED the check to be true sometimes and false other times (else why would it be there in the first place), but implemented it wrong. (That’s the reasoning behind a lot of the rules of type bug: this is valid code but WHY DID YOU DO THIS? You must have meant something else.)

The favored replacement rule, first of all, is only a smell, so it's more likely to be deprioritized. It also catches a lot of cases that are merely redundant checks. And yes, redundant code is clutter and qualifies as a smell, but it's (mostly) harmless, whereas the deprecated rule java:S1850 seems to be pointing out genuine problems.

Thanks for the clarification; this helped me understand your argument much better! :slight_smile: I see your point, especially with the effective reclassification of this issue from ‘bug’ to ‘code smell’. I can also understand the argument around merging an accurate and very specific rule with a more general but potentially less accurate one.

Let me trigger some discussion on this topic, I’ll come back to it here as soon as I can report more.


Glad I was able to explain my thoughts before someone there got the idea to combine all rules into One Rule to Rule Them All: “Code Shouldn’t Suck”! :slight_smile:

But on a more serious note: the docs explain that Security Hotspots are separated from Vulnerabilities essentially according to an estimated probability of accuracy: the hotspots are more likely to be FP because they’re intrinsically harder to check. Is this based on any actual empirical data, or is it just an educated guess? If you have empirical data about FP rates for each rule then you could use that as a guide: don’t merge rule X into rule Y if p(X is FP) << p(Y is FP).

Of course, even if Y is actually a superset of X (e.g., S2589 is a superset of S1850), one might still want both. But having the same code flagged twice would be annoying. So for the benefit of the user, the scanner should have a policy like "if the code is tagged for X, then don't tag it for Y" (though Y could still fire if X is turned off in the quality profile, for instance).
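
Just to illustrate the kind of policy I mean (purely hypothetical pseudocode, obviously not how SQ is actually implemented):

    import java.util.Set;

    class OverlapPolicy {
        // Report the general rule y at a location only if the specific
        // rule x didn't already flag it, e.g. because x is disabled
        // in the quality profile or simply didn't fire there.
        static boolean shouldReportGeneral(String x, String y,
                                           Set<String> activeRules,
                                           Set<String> rulesFiredHere) {
            return !activeRules.contains(x) || !rulesFiredHere.contains(x);
        }
    }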

Hi,

Ehm… that’s not what we meant. If the docs say exactly that, then could you please point me to the page?

A Security Hotspot is a security-sensitive piece of code where there might be a vulnerability depending on context.

I think they actually dropped the Security Hotspot rule about cookie setting, but it’s a great example so I’m going to use it here anyway.

Let’s say I’m setting an unencrypted cookie in your browser. If that cookie contains your current sock color, then it’s no big deal. If it contains PII, or your credit card information… that’s a big deal! Algorithmically, we can’t tell. Human review is required. And that’s what characterizes a Security Hotspot.
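Something like this, say, with the Servlet API (a made-up snippet; "prefs" is just an illustrative cookie name):

    import javax.servlet.http.Cookie;
    import javax.servlet.http.HttpServletResponse;

    class CookieWriter {
        void remember(HttpServletResponse response, String value) {
            Cookie cookie = new Cookie("prefs", value);
            // No setSecure(true), so the cookie can travel over plain HTTP.
            // Fine if value is your sock color; a big deal if it's PII.
            response.addCookie(cookie);
        }
    }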

Does that make sense?

Ann

Well, on Rules | SonarQube Docs it says:

For Vulnerabilities, the target is to have more than 80% of issues be true-positives. … It is expected that more than 80% of the [security hotspots] will be quickly resolved as “Reviewed” after review by a developer.

But from what you’re saying, it sounds like the decision to classify a rule as a vulnerability or a hotspot is based on a high-level analysis of the rule rather than on any empirical data. Thanks for clarifying.

If you want a good example to cite here, use S2068 about hard-coded credentials. It basically just looks for strings like "PASSWORD", but those can be as harmless as "this is the string to use as the prompt."
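
For example (made-up strings):

    class Config {
        // Genuinely risky: a hard-coded credential.
        static final String PASSWORD = "s3cr3t!";

        // Harmless, but the same keyword match can flag it.
        static final String PASSWORD_PROMPT = "Please enter your password: ";
    }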
