Most of the time when I talk about unit testing, or when I encourage my colleagues to write unit tests, I emphasize writing GOOD Unit Tests: tests which are easy to write, test only one thing and run in isolation. I always make a clear distinction between these very granular, highly isolated unit tests, which I call GOOD Unit Tests, and Integration Tests. I roughly call an Integration Test any automated test that has more than one point of failure, or that exercises several external objects (other classes, services, databases, etc.). These tests verify that all those pieces work well together. What I want to focus on instead are the very granular unit tests, which verify the basic correctness of each unit in isolation and assert only one thing. When such a test fails I know exactly where the problem is, from the test name alone.
The greatest benefit we get from writing GOOD Unit Tests (easy to write, testing only one thing, running in isolation) is the better design that results in the production code. These unit tests put positive pressure on the production code, making it better. I am not chasing bug-free code, nor full regression coverage, with these unit tests, but a higher quality code design when I am working with teams that are less experienced in doing good design.
Yes, there are drawbacks as well. We might go to the other extreme and have the production code design suffer from this granular unit testing, resulting in classes that are too small, with code smells like feature envy or unnecessary complexity. I totally see DHH's point during the 'Is TDD Dead?' debate that this might lead to creating interfaces or abstract classes only for the sake of mocking. Indeed, it may come to making interfaces which do not create abstractions and which break encapsulation. Granular unit testing will not help prevent these; on the contrary. When our code design becomes too granular, practices like eliminating duplication and reducing the dependencies of a class while refactoring, plus a consistent code structure, help to keep things in balance. In the end it all depends on your context: what problem you want to solve, what your constraints are, how experienced your team is, and many other aspects.
We couple things by nature. It is more common to put everything in one class or one function than to think about separating concerns by factors of change. In most cases when I look over poorly designed code I see too much coupling (or coupling without benefits) and poor cohesion. Everything is done in one or two big classes. In almost all cases it is clear that if someone had tried to write GOOD Unit Tests on that code, they would have been forced to refactor it towards a better design; it would not have been possible to write small unit tests otherwise. This would have turned the code into better code. Maybe not the best design, but certainly better. It would break the one thing into more, and the breaking would be driven by test cases, which leads to a good separation of concerns. I often see that GOOD Unit Tests move the design from one like this
One big thing
towards one like this
which is clearly better than the first one, maybe even better than this
Coupled things. Everything is talking to everything
, and clearly better than this
Very many tiny, small things
, which is at the other extreme from the first one. We want to balance things and get somewhere in the middle. (These design examples are taken from Juval Lowy's presentation on 'Modular Application Design', which I value a lot.)
Another important advantage of GOOD Unit Tests is that they are easy to write and to maintain, and therefore they have low costs. For example, consider the following setup:
where function f() of class A calls g() of class B, which in turn calls h() of class C. After g() returns, f() may make a call to h’() depending on the result of g(). A quite common setup in an OO program. Now, if we want to write a test to verify the correctness of f() and only f(), it will not be easy. When we write the arrange part of the test, we need to take into account the preconditions and postconditions of all the functions, and find test data that steers the code flow along the path we want in our test scenario. This makes our arrange code hard to write. The assert part will be difficult as well: the expected result of f() may need to take into account calculations made by the other functions. Even if we manage to write the test, it will be hard to maintain. There is also a high chance that this test breaks when refactoring or other changes occur in classes B or C. Another disadvantage of such an integration test is that when it fails we cannot clearly say where the bug is. It may be in class B or C, even though those are not the classes under test. In conclusion, this kind of test has high costs (difficult to write and maintain) and low benefits (I don't know where the bug is when it fails).
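To make the difficulty concrete, here is a minimal sketch of that setup. The class and method names mirror the A/B/C and f/g/h naming from the text, but the concrete arithmetic is invented for illustration:

```python
# Hypothetical coupled setup: A -> B -> C, with each class creating
# its own dependency, as described in the text.

class C:
    def h(self, x):
        return x * 2              # some calculation deep in the call chain

class B:
    def __init__(self):
        self.c = C()              # B creates its own dependency

    def g(self, x):
        return self.c.h(x) + 1

class A:
    def __init__(self):
        self.b = B()              # A creates its own dependency too

    def f(self, x):
        result = self.b.g(x)
        if result > 10:           # f branches on what g (and thus h) computed
            return self.b.c.h(result)   # the h'() call from the text
        return result
```

To exercise a specific branch of f() we must pick inputs that steer the whole chain, and the expected value in the assert depends on calculations made inside B and C, not just inside A.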
The alternative is to refactor my code and replace B and C with interfaces which abstract them. This allows me to replace them with fakes (mocks or stubs) in my tests, which are totally under the control of my test code. Now I can easily write many GOOD Unit Tests for class A, in isolation, to verify its basic correctness. I can write one test to verify ONE thing. Looking more closely at the production code, we'll see design patterns and SOLID principles emerge from this refactoring. For example, we'll tend to program against interfaces, we'll depend on abstractions and not on implementation details (partially what DIP says), we'll have our class dependencies visible, and we'll move towards more extensible and reusable code (partially what OCP says).
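A sketch of that refactoring, under the same hypothetical A/B naming: B is hidden behind an abstraction, the dependency is injected and visible, and the test replaces it with a stub it fully controls.

```python
from abc import ABC, abstractmethod

# The abstraction extracted from B (name invented for illustration).
class BInterface(ABC):
    @abstractmethod
    def g(self, x): ...

class A:
    def __init__(self, b: BInterface):    # dependency is visible and injected
        self.b = b

    def f(self, x):
        result = self.b.g(x)
        return result * 10 if result > 0 else 0

# In the test, a stub stands in for the real B:
class StubB(BInterface):
    def __init__(self, canned):
        self.canned = canned

    def g(self, x):
        return self.canned                # no real logic, fully controlled

def test_f_scales_positive_results():
    a = A(StubB(canned=3))
    assert a.f(42) == 30                  # verifies ONE thing: f's own logic

def test_f_returns_zero_for_non_positive():
    a = A(StubB(canned=-1))
    assert a.f(42) == 0
```

Both the arrange and the assert parts now involve only A's own behavior; nothing in the test breaks when the internals of the real B or C change.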
It is good to focus on covering all the logic and calculations with unit tests. This means the code that has ifs, whiles, fors, etc. and mathematical or other data structure operations. Covering plumbing code, data conversion or basic variable assignments brings little value. When we strive to achieve good coverage of the logic, because it is easier to fully cover it with GOOD Unit Tests than with Integration Tests, another positive effect happens: the logic gets pushed away from the external frameworks and libraries. The external libraries tend to get abstracted, and our core logic separated from the plumbing code that integrates them. This leads to a better separation of concerns. Our business logic tends to be pushed away from the presentation and from the data access towards the middle. Isn't this something we've always wanted?
Logic is pushed from Presentation, Data Access and Cross-Cutting Concerns
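One small illustration of that separation, with invented names: the calculation worth covering lives in a pure function, while the plumbing around it stays thin and carries no logic of its own.

```python
# Logic worth unit testing: it branches and calculates.
def apply_discount(total, is_loyal_customer):
    if total <= 0:
        return 0.0
    rate = 0.10 if is_loyal_customer else 0.05
    return round(total * (1 - rate), 2)

# Plumbing that merely wires things together; covering it with
# granular unit tests brings little value.
def handle_checkout_request(request, repository):
    customer = repository.load(request["customer_id"])
    return {"amount_due": apply_discount(request["total"], customer["loyal"])}
```

The pure function can be covered exhaustively without any framework, database or mock, which is exactly the pressure that pushes logic towards the middle.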
Another aspect of GOOD Unit Tests is that we can fully and easily test all the scenarios the code needs to handle, in a very detailed way. I usually think of the example of a running car. If the car does not move, it is clearly broken. However, we don't know which part is not working, so we need to verify each part separately to find and fix the problem. Moreover, even if the car moves, we can't tell from this integration test alone whether all the parts function at the correct parameters. It may be that only after driving more than 1000 km without stopping will the engine overheat, because of a malfunction in the cooling parts. Coming back to code, I rely on GOOD Unit Tests to verify the basic correctness in detail, with high coverage. As in the car example, there may be scenarios that I cannot test otherwise, or that I would simply ignore because they are too hard to cover with an integration test. For instance, how would you test that your code behaves correctly when the hard disk is full? Would you fill it up to run a test? Can you do that on the CI server?
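One way to make such a scenario testable is to hide the file system behind an abstraction and let a fake simulate the failure. This is a sketch with invented names; only the standard OSError is real:

```python
from abc import ABC, abstractmethod

# Hypothetical abstraction over the file system.
class Storage(ABC):
    @abstractmethod
    def write(self, path, data): ...

# Fake used only in tests: it behaves like a full disk.
class FullDiskStorage(Storage):
    def write(self, path, data):
        raise OSError(28, "No space left on device")   # errno 28 = ENOSPC

class ReportSaver:
    def __init__(self, storage: Storage):
        self.storage = storage

    def save(self, report):
        try:
            self.storage.write("report.txt", report)
            return "saved"
        except OSError:
            return "retry-later"      # the behavior we want to verify

def test_save_handles_full_disk():
    saver = ReportSaver(FullDiskStorage())
    assert saver.save("quarterly numbers") == "retry-later"
```

The test runs instantly on any machine, including the CI server, without touching a real disk.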
Therefore, we target high coverage with unit tests, and we verify that each of the services works well in isolation. The unit tests can go into detail at lower cost. Then on top we may have a few Integration Tests, which do not go into much detail because they are costly. They just check that all the services and components can work together and that all the configurations are consistent. The integration tests usually follow some happy-flow scenarios and verify that all the components they touch are running. They do not target correctness the way the unit tests do. The figure below shows three classes fully tested in isolation by unit tests (blue arrows) and two integration tests (purple arrows) on top, which touch all the working components, but fewer code paths.
Integration tests touch all classes, but have lower coverage
In theory, both with isolated Unit Tests and with Integration Tests we could reach full coverage of the entire code we write. With unit tests we need to write many pairs of isolated Collaboration Tests and Contract Tests to reach full coverage. With Integration Tests we need to be able to specify and simulate all the test scenarios. In general, the cost of writing and maintaining integration tests which cover all the details grows exponentially, and such tests are very fragile to change. Any change in the system is very likely to break them, which makes them infeasible. That is why we need to test intermediate results. We break the end-to-end tests into smaller tests, until we get to the GOOD Unit Tests, which are at the lowest level, checking each detail. By doing this, not only do we get to high coverage at lower costs, but we also get better production code in the process. Integration Tests rarely bring any benefit to the design of the production code. However, they contribute to the test suite by covering component integration and configuration consistency.
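A hedged sketch of such a Collaboration/Contract pair, with invented names: the collaboration test checks that one class talks to its collaborator as expected (using a mock), while the contract test checks that the real collaborator honors the assumption the mock encoded.

```python
from unittest.mock import Mock

# Production code (hypothetical, for illustration only).
class PriceCalculator:
    def unit_price(self, sku):
        return {"apple": 2, "pear": 3}.get(sku, 0)

class Basket:
    def __init__(self, calculator):
        self.calculator = calculator

    def total(self, skus):
        return sum(self.calculator.unit_price(s) for s in skus)

# Collaboration test: Basket asks its calculator for each price.
def test_basket_collaborates_with_calculator():
    calculator = Mock()
    calculator.unit_price.return_value = 5
    assert Basket(calculator).total(["apple", "pear"]) == 10
    assert calculator.unit_price.call_count == 2

# Contract test: the real calculator fulfils what the mock assumed,
# e.g. unknown SKUs yield 0 rather than raising.
def test_calculator_contract_unknown_sku():
    assert PriceCalculator().unit_price("unknown") == 0
```

Together the pair covers the same ground an end-to-end test would, but each half stays small, isolated and cheap to maintain.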
To conclude, I always treat my unit test code as a first-class citizen. It is separated from production and other test code. I like my unit tests to be small, easy to write, to test only one thing and to run in isolation. The highest benefit I want from them is to put pressure on my production code to make it of better quality. I write them in very short cycles together with the production code, even if I write the code first, because I want to refactor in small steps. I rely on them for basic correctness, and I target high coverage of the code that does logic and calculations. Separately, I write different levels of Integration Tests. They check that several services or components (which I already know function well in isolation) can work together. They may also check interactions with external services, frameworks or data sources. Sometimes I also have integration tests for regression testing, performance or load testing.