We are using SonarQube 8.2 Community Edition with all plugins up to date. We are very experienced with SQ, having used it since version 3.
For a long time we have been noticing old issues raised again as new issues, and we have tried hard to understand the causes. We have researched the explanations here and on the web and ruled out every suggestion we have found. We know that upgrades can do this as a natural part of the process; however, we see it on files that haven't been touched, and when the platform has not been altered. It is tightly locked down.
Here is an example:
Our Develop branch was originally branched from Master in 2018. Here is the history of an issue on that branch:
No. You’ll see that same issue/same line is considered “old” regardless of the availability of blame data.
That’s a bit of internal housekeeping. Non-Closed issues are tied to a line number in the file. When we close an issue, we remove its line number because over time the code can drift away from the original issue line. Presenting a closed issue at an unrelated line is terribly confusing. (I’ve seen and been confused by it.)
Nope.
It would be helpful in diagnosing this if you looked at the closed issues on the file. Can you find the original instance? And if so, how long ago was it closed?
Actually I’m a little confused by that first screenshot. It shows an Open issue on line 66. The second screenshot shows… a different catch block with the issue on line 71. Is the one on line 66 still Open?
These are the same lines; only one has moved a few places in the file. We are using the Master branch as the original, suspending any analysis on it so that we can keep the issue history intact to compare against when the Develop branch is analysed. If there is a better way of looking at this history (even from the database), please let me know.
This Master file was last analysed on March 9 with v7.9.1. The edit that altered the code in the catch block was made on March 21 and analysed with the updated platform, v8.2 - you can see the altered code in the logging changes within the catch block.
This March 21 analysis appears to have removed the previous issues and their history and (re)created the Exception issue as new. Do you know why that might be?
Using the Status=Closed filter on the Issues page, were you able to find the original, Closed issue? And if so, what does its change log look like?
Regarding why the issue would have been closed, as described in the docs:
Issues are automatically closed (status: Closed) when:
an issue (of any status) has been properly fixed => Resolution: Fixed
an issue no longer exists because the related coding rule has been deactivated or is no longer available (i.e. the plugin has been removed) => Resolution: Removed
So an analysis at some point didn’t find the issue and it was closed. Now the next question is why a new issue was opened rather than the old issue being re-opened. The answer there is that it must not have “matched” the closed issue (see matching criteria per my docs link in the previous post) or the old issue was closed >30 days (with an intervening analysis) before the issue was re-raised.
So here's the flow (sketched in code just after it):
analysis day 1: issue raised
analysis day 2: issue disappears for whatever reason; issue Closed
…
analysis day 35: Closed issue cleaned out of DB
analysis day 36: issue reappears: raised as new issue
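To make the timing concrete, here is a minimal sketch (in Java, and emphatically not SonarQube's actual implementation) of the retention logic in the flow above; the rule key, line hash, and dates are invented for illustration:

```java
import java.time.LocalDate;

// Sketch only: a Closed issue stays matchable for a retention window;
// once purged, a re-detected issue can only come back as brand-new.
class ClosedIssuePurgeSketch {
    static final int RETENTION_DAYS = 30; // default cleanout period

    record ClosedIssue(String ruleKey, String lineHash, LocalDate closedOn) {}

    /** True while the closed issue is still in the DB and available for matching. */
    static boolean stillMatchable(ClosedIssue issue, LocalDate analysisDate) {
        return !issue.closedOn().plusDays(RETENTION_DAYS).isBefore(analysisDate);
    }

    public static void main(String[] args) {
        ClosedIssue closed = new ClosedIssue("java:S2221", "a1b2c3", LocalDate.of(2020, 3, 9));
        // An analysis ~36 days after closure finds the same problem again, but the
        // closed issue has been purged, so it cannot be re-opened: it is raised as New.
        System.out.println(stillMatchable(closed, LocalDate.of(2020, 4, 14))); // false
    }
}
```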
The real question is why the issue is being closed in the first place. Issues are closed when analysis no longer finds them. Causes of that:
the code is actually fixed
the rule is removed from the relevant profile and/or a different profile is used
the code is removed (excluded) from analysis
So… you might take a look at your Activity tab. Do you see Events indicating that different profiles were applied? Do you have exclusions defined in places other than the UI that might be being applied intermittently?
Thanks very much for your detailed reply, it is much appreciated.
I cannot find it as a closed issue. In fact, clearing filters, selecting Closed issues, and then filtering by the file returns no results. Perhaps I am using it wrong…
The fact that you can't find it as a closed issue tells me that more than likely it was closed more than 30 days ago and cleaned out of the DB automatically. So that brings us back to why it was closed. The contents of your profile haven't changed; what about which profile is being applied? You'll see an Event in your Activity stream each time a different profile is used for analysis versus the previous analysis.
For exclusions, you might be able to deduce this from a LoC graph, also available on the Activity page.
We’ve done a fair amount of work to make sure they aren’t. If you can provide examples where this is happening, we’d love to see them.
Neither changes to the rules within the profile nor a change of profile has been made since then. We tend to do annual profile reviews where we clone the profile, make changes together with the teams, switch to the new profile, and then lock it down so it can't be changed. This approach has been in place since as far back as 2014.
We have done some more digging here and pulled up a few things.
A case where a statement (legally) spread over two lines was combined onto one, and a previous issue found four years ago was flagged as new eight days ago.
Is it normal for this to raise a new issue in this case? Is it because the hash taken of the statement only covers the first line, so the old issue is auto-closed and then re-raised because of this line change?
The filter for Closed issues on this project shows 0. The project has a history going back to March 2018 with hundreds of changes and is regularly analysed; however, no Closed issues are showing. Other projects do show Closed issues.
We observe this behaviour most often following merges from feature branches into the Develop branch, probably because a lot of changes happen at once. In the example below, the feature branch altered the catch block in a file, with code changes to improve the logging within it, but following the merge the issue about not using a specific exception type is suddenly raised again even though that line has not changed.
Original code (as seen in the frozen Master Project analysis):
Is this because an exception block is treated as part of the same exception statement, so that too many changes to the block will close all issues on both the block AND the exception statement, and then create new issues from what is found?
In the next example, a rule that has been present since 2015 is suddenly triggered on code in the newly analysed feature branch (analysed with 8.2), whereas in the earlier analysis (7.9.1) it was not raised. I should note that the function it was within was refactored into two functions, but little else was changed. What is the reason for this?
Even if this is appearing due to an upgraded plugin between these two versions (and that may not be the reason), should it not be backdated in any case?
It seems to appear on Java code only and not to be isolated to particular plugins or rules.
I note this link, where the intermittent availability of bytecode appears to produce similar symptoms. The build system is Maven, and analysis is only attempted after the build and test/coverage steps.
Teams are required not to introduce new issues in branches, and their performance reporting is based on this. This means they are quite sensitive to this behaviour and very keen to find explanations.
There’s a lot going on here. I may not get through it all but this will likely be an important reference for all of it: Understanding which Issues are “New”
I believe you've guessed correctly; the line hash changed enough that the issue didn't match the old one and was re-raised as New.
Closed issues are cleaned out after 30 days. I guess this is just an accident of timing.
Your first screenshot shows this issue as closed and reopened so I suspect more has gone on here than you realized. But okay, it was open before the merge. After the merge it’s New. I see that the line number changed from 219 to 232 (so it fails the first test “same rule, with the same line number and with the same line hash”). In addition, it appears that the code all around it changed as well so the “detect block move” wouldn’t have kicked in here because the “block” didn’t move; it changed massively. I’ll let you step through the rest of the match criteria yourself; I think you’ll see that it doesn’t meet any of them.
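To illustrate the first of those criteria, here is a rough Java sketch of the "same rule, same line number, same line hash" test; the types and the whitespace-stripping hash are stand-ins of mine, not SonarQube's actual internals:

```java
import java.util.Objects;

// Sketch of the first-pass match; the real tracker applies several
// fallback criteria (line hash alone, block move detection, etc.) after this.
class IssueMatchSketch {
    record TrackedIssue(String ruleKey, int line, String lineHash) {}

    /** Same rule, same line number, same hash of the line's content. */
    static boolean firstPassMatch(TrackedIssue previous, TrackedIssue current) {
        return previous.ruleKey().equals(current.ruleKey())
                && previous.line() == current.line()
                && Objects.equals(previous.lineHash(), current.lineHash());
    }

    /** Stand-in for the line-content hash (whitespace stripped, then hashed). */
    static String hashOf(String lineContent) {
        return Integer.toHexString(lineContent.replaceAll("\\s", "").hashCode());
    }

    public static void main(String[] args) {
        TrackedIssue before = new TrackedIssue("java:S112", 219, hashOf("catch (Exception e) {"));
        TrackedIssue after  = new TrackedIssue("java:S112", 232, hashOf("catch (Exception e) { // reworked logging"));
        System.out.println(firstPassMatch(before, after)); // false: line number and hash both changed
    }
}
```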
The standard answer for this one is that new releases bring not just new rules, but smarter versions of old rules as well. That said, I can see from the rule description that this particular rule comes from fb-contrib, so you should really ask in that community.
Re: 2
We increased the closed-issue expiry period from the 30 days to 365 days around six months ago, so we would expect to see many, many closed issues from this project, which has a large team working on it.
We still believe something odd is going on with this project. It has to be said that it is very large (>500K lines) and has many files that are very complex and very poorly written, with overly complex code that breaches many, many rules. I have seen files of 8 or 10 thousand lines or more, with over 1,000 persistent issues that have not been fixed for years. In cases such as this, is it possible for the "detect move block" routine or other routines to be overwhelmed by seemingly simple changes and to reflag issues as new?
Re: 6
Is there a way to force Sonar not to analyse if no bytecode is found, or to flag that bytecode is missing in some places?
Am I right in thinking that the process of reflagging issues is part of the way Sonar guarantees it reports all issues? It can be thought of as a 'catch all the rest' technique: when it can't match an issue found in the new scan with one found in the previous scan (seemingly around 0.001% of the time), it closes those old unmatched issues and raises them as new.
If so, this unfortunately conflicts with our desire to measure the team's improvement by a count of '0 new issues' in large, complex projects where this 'catch all the rest' happens in larger numbers than expected. In this project, with 73,000 issues, it is seemingly happening in the low hundreds of cases across a typical feature-branch workload when merged back into Develop.
Weeell… You must have made that config change before the upgrade to 8.2, because the cleanout period got hard-coded to 30 days in the versions before 7.9 (I'll find the ticket in Jira if you want me to, but I'm feeling lazy at the mo). So it's definitely a 30-day cleanout for you.
This stirs some vague memories of talk of circuit breakers in that logic when very large files are involved. Laziness aside, I’m not sure that’s even documented in a ticket (“implementation details”) but I think it’s plausible that block move detection could fail (max-out) in such cases.
Right outcome, wrong philosophy. We do our best not to raise old issues as new. You've seen the algorithm for that. During each analysis we look at all the issues the language analyzers found, apply the matching algorithm, and compare them to the ones on file from the most recent analysis (a code sketch follows this list):
issue in both sets: “still open” → nothing to do.
issue in existing, not in new set: issue has been fixed in code → close it
issue in new set but not existing: raise new issue
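Here is that sketch: a deliberately minimal illustration (issues reduced to plain keys, matching assumed already applied) of the three-way reconciliation above, not SonarQube's actual code:

```java
import java.util.Set;

// Sketch: reconcile the previously-open issue set with the freshly-found set.
class IssueReconcileSketch {
    static void reconcile(Set<String> existing, Set<String> found) {
        for (String issue : existing) {
            if (!found.contains(issue)) {
                System.out.println("close " + issue); // fixed in code -> Closed
            }
            // present in both sets: still open, nothing to do
        }
        for (String issue : found) {
            if (!existing.contains(issue)) {
                System.out.println("raise new " + issue); // unmatched -> New
            }
        }
    }

    public static void main(String[] args) {
        reconcile(Set.of("A", "B"), Set.of("B", "C")); // closes A, raises C as new
    }
}
```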
This used to be how it worked, and I'm not sure when that restriction went away. I think your best bet here is to do some scripting to analyze conditionally.
Regarding “missing bytecode in some places”, you do get warnings in the analysis logs for missing class files.
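As a concrete sketch of that conditional scripting, a hypothetical pre-analysis guard might look like the following (Java, since the build is already Java/Maven; target/classes is Maven's default output directory, and the class name is my invention). CI would run it after the build and skip sonar-scanner on a non-zero exit:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Sketch: refuse to proceed to analysis when no compiled classes exist,
// so an analysis without bytecode never reaches SonarQube.
public class BytecodeGuard {
    public static void main(String[] args) throws IOException {
        Path classesDir = Path.of(args.length > 0 ? args[0] : "target/classes");
        boolean hasBytecode = false;
        if (Files.exists(classesDir)) {
            try (Stream<Path> files = Files.walk(classesDir)) {
                hasBytecode = files.anyMatch(p -> p.toString().endsWith(".class"));
            }
        }
        if (!hasBytecode) {
            System.err.println("No bytecode under " + classesDir + "; skipping analysis.");
            System.exit(1); // CI treats a non-zero exit as "do not run sonar-scanner"
        }
    }
}
```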
We appreciated your answers to the above, which helped immensely. Thank you, and apologies for not replying with thanks in May.
We have discovered this behaviour happening again on a feature branch analysis for the same Java project discussed above.
To test this, here is what we did:
Created 2 branches from the same source branch with identical source code.
Ran the analysis 3 times on each with NO code changes.
In both branches, new issues were flagged on the 2nd and 3rd analyses that weren't flagged on the original analysis, even though there were no code changes - i.e. the same commit was analysed three times for each branch. Although the total number of issues is the same in each project (75,164), the number of new issues since the first analysis is different in each project (969 in one, 765 in the other), indicating that the analyses on each branch are finding issues inconsistently.
The issues raised were spread across hundreds of files and are not of the same type. The attached zip contains more details.
We are at a loss to explain why this could be happening. On smaller projects with fewer issues we are not seeing (or at least not noticing) this, and we wonder whether the size and complexity of this project is causing these inconsistencies. We have noted the characteristics of this project previously.
In both image sets you’ve shown me an issues graph that apparently has the same number of issues in all analyses. I.e. I see perfectly parallel, horizontal stripes in both images.
Sorry, but this is really impossible to diagnose without more context. You should examine the logs of the build/analysis to make sure they’re consistent with each other.
Thanks for getting back to me. The zip had 6 images for each of the two branches. The counts are different, and the new issues are showing when there have been no code or platform changes. The reason those graphs look horizontal is the scale: there are 75,000 issues and fewer than 1,000 have changed.
The other four screenshots (the overview and the issues) show the impact quite clearly, and I would draw your attention to those.
I see 6 images total: Issues, Activity, and Overview for A and for B.
Without having some detail on at least a few “wrongly new” issues, this is going to be very difficult to diagnose. Do you think you could isolate a few cases? And if so, it would be quite interesting to see those issues’ change logs (click on the date in the issue box to see the change log).
The behaviour is quite easy to reproduce with this analysis. Simply reanalysing without changing code seems to produce random new issues each time.
Given this is a very large project with >70,000 unresolved issues, perhaps tolerances are exceeded somewhere. To that end, after some experimentation, when we remove the Checkstyle plugin the behaviour stops for all but one issue: one branch finds 1 issue, the other does not, even though the code in each branch is the same (and has not changed since the branches were created).
This may simply be because the number of rules is now reduced (although only by a very small number), or it may indicate a problem with the plugin.
This is very interesting! It still makes me wonder how/why the Checkstyle issues aren’t meeting the old-issue match criteria, but that’s really out of our hands. If you can narrow this down to being a Checkstyle issue, I’m sure those guys would love a report (raised in their GH project).
I’m waiting with bated breath to hear what rule this is…?
TBH, I've really never heard of issue volume being a problem before. This is one reason I've been stumped by the behavior you report.
Running again, it picked up two issues this time… again with no code changes.
They are in different code files and are from different plugins:
findbugs:NP_GUARANTEED_DEREF: Null value is guaranteed to be dereferenced
java:S2259: Null pointers should not be dereferenced
A copied, identical analysis did not pick these up. And on another machine configured the same way, neither was picked up.
Then, removing the FindBugs plugin and restoring the Checkstyle plugin, both of the above issues disappeared, but back came 1,200 of the same Checkstyle issues as before… a different count from the previous times, but of the same types:
checkstyle:com.puppycrawl.tools.checkstyle.checks.imports.ImportOrderCheck: Import Order
checkstyle:com.puppycrawl.tools.checkstyle.checks.imports.CustomImportOrderCheck: Custom Import Order
So it doesn't seem to be limited to the Checkstyle plugin. Removing it may resolve this for this project, but on another project new issues may appear again, from different plugins.
I can’t speak for Checkstyle or FindBugs. You’ve listed one rule provided by SonarQube’s own analysis: java:S2259. In the context of that rule, the news that you don’t always run these analyses on the same environments is quite interesting. For NPE detection specifically, this could have something to do with the classpath on the machine - think not just libraries but versions of libraries. If a library is present on one machine but missing on another that could explain why such an issue is intermittently detected; analysis can’t step through code that isn’t there.
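To make that concrete, here is a contrived Java illustration (all names hypothetical); in the real scenario the interface below would live in an external jar whose presence on the analysis classpath could vary between machines:

```java
// Contrived example: whether the analyzer can flag the dereference below under
// java:S2259 depends on being able to inspect the library's bytecode and learn
// that lookup() may return null. The "library" is stubbed in here so the
// snippet compiles; SomeLibrary and lookup() are invented names.
class ClasspathSensitivity {
    interface SomeLibrary {            // stands in for an external dependency
        String lookup(String key);     // in the real jar, this can return null
    }

    String describe(SomeLibrary lib) {
        String value = lib.lookup("missing-key"); // null when the key is absent
        return value.trim(); // S2259 candidate: flaggable only when the analyzer
                             // can see from the library's bytecode that null is possible
    }
}
```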
I would be very interested to know how reproducible this issue is when the analyses are all run on the same environment.
We do run the analysis on the same environment. The alternate machines described were used just to test out and confirm this problem wasn’t specific to one machine/configuration.
Because the behaviour does not seem to be isolated to just one plugin, it seems most likely a problem with the core platform.
Removing plugins will naturally drop the rules being analysed, and the issues counted, which might make it seem the problem had gone, but that may be masking an underlying cause.
This is a large and complex project with more than 73K issues and I am wondering again if these aspects are at the root of the cause.
Perhaps we need to split this Maven Java analysis into separate modules or groups of modules.