RSPEC-5977 (tests and random): rule text can be much improved

Hello.

The title of this rule is fine: “Tests should use fixed data instead of randomized data”, but the text inside can be much improved, as it:

  • gives wrong reasons to avoid random data, and not the fundamental one,
  • gives wrong solutions, and not the most trivial one.

I do not propose a better version here, I’m just pointing things out, that will make this post long enough already :slight_smile:

Wrong reason 1: “Using random values in tests will not necessarily check edge cases.”
That can be true of fixed values too, and on the contrary, looping many rounds on carefully designed random values helps a lot to cover edge cases that would otherwise be hard to figure out (like this one: https://bugs.openjdk.org/browse/JDK-8136874).

Wrong reason 2: “it will make test logs a lot harder to read.”
Indeed, random values, when floating-point, can be harder to read and grasp than simple values like “1000.0”, but the primary purpose of a test is not to be readable, it is to detect eventual issues, for which non-trivial values might be required.

Wrong solution 1: “it is better to use easily readable hardcoded values.”
It’s only enough for simple treatments, for which everyone just does that anyway. The more a treatment is complex, the more you want to test it, and with a lot of values, the more it is impractical to hardcode them.

Wrong solution 2: “You should however make sure that random values are logged so that you can reproduce failures.”
It’s not practical to replay something from a possibly large bunch of logged values.

Wrong solution 3: “there is one valid case for random data in tests: when testing every value would make tests impractically slow.”
You don’t need random values to test many cases in a loop.

Wrong solution 4: “You can for example use property-based testing libraries”.
Such libraries might be helpful, depending on the kind of things you need to test, but they might also be inapplicable (for example for stress-unit-tests of classes or modules of various sizes, that you might want to run for a few seconds by default but possibly for hours for occasional deeper checks), and therefore should not be indicated as a general replacement for random values.

The fundamental reason to avoid randomness is this one: as its name implies, it is not… deterministic!
You want tests to be deterministic:

  • So that when debugging, you can replay exactly the same scenario over and over, progress in its understanding as you add logs or study it in a debugger, and are able to compare logs to see the effects of your modifications (and just that).
  • So that you can be confident that you fixed the bug as soon as you don’t see it happen anymore, without risk that it’s just luck.
  • So that the CI pipeline doesn’t randomly fail at improper times without any code modification. You can always run longer or actually random tests outside of it if you want to hunt for more issues than it can currently detect.

The trivial solution to replace random values is to use values that are… pseudo-random!
In practice, that just means using a constant seed for Random(), giving it as constructor argument or with setSeed().
If there are multiple Random instances, the constant seed can be given just to a first one, and random.nextLong() used for other seeds.
If you want to use an actual random seed, for example to cover more state space over time as you run the tests, you just need to log that seed, or to use for example the current time rounded to the hour, which should allow to easily figure out which actual seed has been used for a bug to occur.

NB1: The rule could also check for usages of Math.random() and StrictMath.random().

NB2: For having determinist tests, just avoiding random values might not be enough, as you might also need to get rid of other sources of randomness, such as system hashCode() (can be fixed by overriding it), and threading (can be fixed by using virtual time and scheduling - unless you are testing concurrency primitives), but these are out of scope for this rule (except maybe for the first one).

1 Like

This is the rule description. I agree with the remarks of @Jeff_Hain .

On top of that, I would like to see the rule implemented for both C#/.NET and TypeScript.

Hi Jeff and Corniel! Thank you for your feedback! I just created a ticket (SONARJAVA-5410) to track this issue.