Automated tests are a must IMO - this gives you the confidence to refactor code and be (reasonably) sure that you haven’t broken anything.
Certainly during initial development it is useful to create a lot of unit tests as “scaffolding” for the project, and you may then choose to take some of those away in favour of integration/system/e2e tests as the project matures. A downside of that approach is that it can be hard or impossible to obtain coverage information, or indeed to identify which of the many changes made is the one that broke the application. Personally I prefer to keep the unit tests even if the code is covered by higher-level testing.
Agree with @Alix on testing your public functions - don’t test internal implementation details unless it is fundamental to the code that is under test.
Also agree with @Alix that tests should break for as few reasons as possible - ideally a test has only one “logical” reason for failure. By “logical” I mean that it could encompass a number of assertions - imagine you are testing a clone method and you want to test that every property has been cloned - this may require many actual property-by-property comparisons but is “logically” just testing that the clone worked.
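As a minimal sketch of that idea - the Widget type here is invented purely for illustration - a clone test might have several physical assertions but only one logical reason to fail:

```csharp
using Xunit;

// Hypothetical type used only for this example.
public sealed class Widget
{
    public string Name { get; init; } = "";
    public int Size { get; init; }
    public bool Enabled { get; init; }

    public Widget Clone() => new() { Name = Name, Size = Size, Enabled = Enabled };
}

public sealed class WidgetTests
{
    [Fact]
    public void Clone_Copies_Every_Property()
    {
        var original = new Widget { Name = "gizmo", Size = 42, Enabled = true };

        var clone = original.Clone();

        // Several property-by-property comparisons, but logically a single
        // assertion: "the clone is a faithful copy of the original".
        Assert.Equal(original.Name, clone.Name);
        Assert.Equal(original.Size, clone.Size);
        Assert.Equal(original.Enabled, clone.Enabled);
    }
}
```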
In terms of naming tests, I prefer more descriptive names that describe the intent of the test - you could imagine the test names being akin to a series of bullet points which describe the behaviour of a method / class.
In C# we have the concept of “inner classes” so typically the outer class would represent the class under test, then one inner class per method in the class under test, and then each individual test name would be an expressive natural language-like description of the expected behaviour of the method under test - e.g. When_Foo_Is_Bar_Then_Baz()
More concretely, as an example: we use Swashbuckle for generating OpenAPI documentation and have an extension method, AddOperationIdSelector, for adding an operation ID selector. Our inner test class name is exactly the name of the method under test, AddOperationIdSelector, and then we have tests like this:
Selects_The_OperationId_From_The_Method_Name_If_There_Is_No_RouteInfo_And_No_RelativePath_Exists
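Laid out as a sketch, the structure looks like this (the class bodies are placeholders, not the real Swashbuckle extension tests, and the second test name is invented just to show the pattern):

```csharp
using Xunit;

// Outer class = the class under test; one nested class per method under test;
// each test name reads like a sentence describing expected behaviour.
public static class SwaggerExtensionsTests
{
    public sealed class AddOperationIdSelector
    {
        [Fact]
        public void Selects_The_OperationId_From_The_Method_Name_If_There_Is_No_RouteInfo_And_No_RelativePath_Exists()
        {
            // ...
        }

        [Fact]
        public void Selects_The_OperationId_From_The_RouteInfo_When_It_Is_Present()
        {
            // ...
        }
    }
}
```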
The idea is that a developer can just skim the names of the tests to understand the expected behaviour - the tests act as “living documentation” for the code.
You only need to actually delve into the body of a test if you really need to examine the details of that test.
It is also important that the test name expresses the intent correctly - if the implementation of the test does not match its name, that is an error. Otherwise it can be hard to distinguish between what the test author was trying to test and the test simply asserting the current behaviour, even if that is not the intended behaviour.
Ideally, test names (and the tests themselves) do not reference implementation-specific details, making them more resilient to changes in the underlying code. There’s nothing worse than a tiny change in implementation forcing you to update multiple tests - either their names or their implementations - because they are not resilient against internal changes to the methods.
Granted this is not always possible, particularly when developing low level libraries.
Tests are written in Arrange/Act/Assert style - ideally the “Act” is a single line and, as mentioned before, ideally there is only one assertion (or at least only one “logical” assertion).
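A minimal sketch of that shape (PriceCalculator is a made-up class, defined here only so the example is self-contained):

```csharp
using Xunit;

// Hypothetical class under test.
public sealed class PriceCalculator
{
    private readonly decimal _discount;
    public PriceCalculator(decimal discount) => _discount = discount;
    public decimal Apply(decimal total) => total * (1 - _discount);
}

public sealed class PriceCalculatorTests
{
    [Fact]
    public void Applies_The_Discount_To_The_Total()
    {
        // Arrange
        var calculator = new PriceCalculator(discount: 0.10m);
        var expected = 90m;

        // Act - a single line
        var actual = calculator.Apply(100m);

        // Assert - one logical assertion
        Assert.Equal(expected, actual);
    }
}
```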
We try to ensure tests are clear and unambiguous - e.g. assertions should refer to defined expected and actual variables/constants. For example:
- Check that an array’s length is equal to the expected array’s length, not the “magic” value 2 which happens to be the size of the array when you initially created the test.
- Check that actual.someProperty == expected.someProperty rather than a “magic” value like true - this can be a particular problem if multiple properties happen to have the same value: is Foo == true because Bar == true, or is it just a coincidence that they are the same?
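As a sketch of that second point, assuming a hypothetical Settings type with two flags that happen to share a value:

```csharp
using Xunit;

// Hypothetical type: two unrelated flags that happen to both be true.
public sealed class Settings
{
    public bool Foo { get; init; }
    public bool Bar { get; init; }

    public Settings Copy() => new() { Foo = Foo, Bar = Bar };
}

public sealed class SettingsTests
{
    [Fact]
    public void Copy_Preserves_Foo()
    {
        var expected = new Settings { Foo = true, Bar = true };

        var actual = expected.Copy();

        // Compare against the expected object, not a magic literal:
        Assert.Equal(expected.Foo, actual.Foo);

        // Assert.True(actual.Foo) would also pass, but if Foo and Bar are
        // both true it cannot tell you *which* property made it true.
    }
}
```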
I encourage developers to be careful not to test something that is not relevant to the behaviour under test, which could make the test brittle to unrelated changes. For example, if you expect an array to contain a specific value, do not test that it is the 0th element in the array simply because there is only one item in the array at the time you write the test - test that the array contains the item you are interested in. If the item moves to another position in the array due to some later refactor, your test should still pass.
Of course if it is important that the item is the 0th in the array then you should test that but that is generally not the case IME
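A minimal sketch of the resilient version (GetTags is a hypothetical method invented for the example):

```csharp
using Xunit;

public sealed class TagListTests
{
    // Hypothetical method under test, returning a list of tag names.
    private static string[] GetTags() => new[] { "alpha", "beta" };

    [Fact]
    public void Includes_The_Beta_Tag()
    {
        var tags = GetTags();

        // Resilient: passes wherever "beta" appears in the array.
        Assert.Contains("beta", tags);

        // Brittle alternative - avoid unless the position genuinely matters:
        // Assert.Equal("beta", tags[0]);
    }
}
```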
RE: TDD - I am a big fan of TDD and would love to do it all the time…however I have been a developer since long before TDD was a twinkle in anybody’s eye, and change is hard for a hoary old developer like me…so I try to practice TDD - and indeed I find it very useful when trying to “feel my way around a problem” - but quite often I don’t do TDD per se.
What I do however is write the tests in the same coding session so that, whilst I am not strictly following TDD because I might write a method and then the tests for it (immediately, not at some unspecified future point in time), I do end up with testable code and I do experience it from the consumption side and sometimes adjust classes/methods based on my own experiences when writing the tests.
This may not be the most efficient way of doing it but it works for me and the end result is essentially the same - certainly it is better than not writing the tests at all or simply resolving to come back to them later!
Finally, bringing this around to SonarQube, I know that the product distinguishes between “test” code and “non test” code and runs only a handful of rules against tests - mainly around ensuring that each test class has tests and each test has at least one assert. My view is that test code is as important as application code - it lives as long, it is prone to refactoring, it is important to be able to understand - and therefore it should be maintained to a very similar standard as the code under test.
This means we disable “test” detection in Sonar and consider all code to be application code - the upside is we get all the quality rules run against it; the downside is that we lose the test-specific rules around “there should be at least one assert” etc.
Ideally we would be able to configure test analysis as testing specific rules + a large subset of the application code rules.