Why not to use unit tests (and use integration tests instead)

During the last few years, while working on different projects, I always ended up in the same inevitable discussion: how should we test the code? Often I heard that unit tests are a must and that they should cover most of the code and its flows. Maybe integration tests can be added afterwards, but they are complex to make, slow to set up and slow to run. So instead of setting those up, let’s write more unit tests. I think unit tests still serve their purpose, but they have a lot of downsides in comparison to integration tests. I think unit tests are overrated and integration tests should be preferred. Of course there are also E2E tests to test multiple services in a single test, but in this post I’ll stay within the boundaries of testing a single service and look at unit tests versus integration tests.

Reason 1: Mocks of unit tests are time consuming to write

To be able to unit test a method or class that has dependencies, each dependency should be replaced with a mock. This mock must be built so that it imitates the behavior well enough for the unit test to work. The time to write the necessary mocks can vary, but in the worst case it can take a lot of time. If the method you want to test has multiple dependencies and the logic of each dependency is relatively complex to mock, writing the mocks might take longer than writing the actual code you want to test. It might even be necessary to change the code you want to test, so that its dependencies can be replaced by mocks.

A mock is always a simplified copy of the actual code: it implements only the methods, and the flows within those methods, that the tested method needs. When the implementation of the tested method changes, for example because it calls a different method on the dependency, that method has to be mocked as well. This makes mocks inflexible and time consuming to maintain.
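To make this a bit more concrete, here is a minimal sketch in Python using unittest.mock from the standard library. The price service and the names order_total, order_total_batched, get_price and get_prices are hypothetical and only serve as illustration. The mock knows exactly one method, so a refactoring that keeps the behavior the same but calls a different method on the dependency already breaks the test setup.

```python
from unittest.mock import Mock


def order_total(price_service, item_ids):
    # Original implementation: one call to the dependency per item.
    return sum(price_service.get_price(item_id) for item_id in item_ids)


def order_total_batched(price_service, item_ids):
    # Refactored implementation: same result, but a single batched call.
    return sum(price_service.get_prices(item_ids).values())


def make_price_mock():
    # The mock only knows the one method the original code used.
    prices = {"apple": 2, "pear": 3}
    mock = Mock(spec=["get_price"])
    mock.get_price.side_effect = lambda item_id: prices[item_id]
    return mock


def refactor_breaks_mock():
    # Returns True when the refactored code cannot run against the old mock.
    try:
        order_total_batched(make_price_mock(), ["apple", "pear"])
    except AttributeError:
        return True
    return False
```

The original function works fine with the mock, while refactor_breaks_mock() reports that the batched version fails on the missing get_prices method, even though nothing about the observable behavior was meant to change.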

Reason 2: Unit tests create duplicate logic

In the previous reason I already mentioned that mocks should mirror the actual dependency that they imitate. In extreme cases the mock could be almost a copy of the implementation of the dependency, including its logic. At this point the same logic lives in two places and, as is well known within programming, code duplication is bad. If the logic within the dependency is changed while the logic in the mock isn’t, it can lead to a false feeling of safety when all the tests are green.
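A small sketch of how this can go wrong, again in Python with unittest.mock. DiscountPolicy and checkout are hypothetical names, and the percentages are made up for the example. The real rule was changed from 10% to 20%, but the mock still carries a copy of the old rule, so the unit test keeps passing on behavior that no longer exists.

```python
from unittest.mock import Mock


class DiscountPolicy:
    """Hypothetical dependency; its rule was changed from 10% to 20%."""

    def discount(self, total):
        return total * 20 // 100 if total >= 100 else 0


def checkout(policy, total):
    # Code under test: applies whatever discount the policy returns.
    return total - policy.discount(total)


def make_stale_mock():
    # The mock duplicates the OLD rule (10%) and was never updated.
    mock = Mock(spec=DiscountPolicy)
    mock.discount.side_effect = (
        lambda total: total * 10 // 100 if total >= 100 else 0
    )
    return mock


# The unit test with the stale mock still sees the old price...
unit_result = checkout(make_stale_mock(), 200)
# ...while the real dependency produces a different one.
real_result = checkout(DiscountPolicy(), 200)
```

A unit test asserting on unit_result stays green, although the value the service would actually return is real_result.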

If an integration test had been written for the above case, this would not have happened, because the logic is only in a single place: the actual code. Of course integration tests still mock calls to other services than the one being tested, but there’s also a solution for this. There are often plenty of libraries available to mock common databases or cloud resources, for example https://github.com/spulec/moto for AWS SDK calls in Python or https://github.com/testcontainers for common databases in multiple programming languages. Unless you like to make your own implementation of MySQL or AWS, please use those.
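As a minimal sketch of testing against a real engine instead of a hand-written mock, the example below uses sqlite3 from the Python standard library as a stand-in for a database that would be started through Testcontainers. That substitution is an assumption to keep the example self-contained, and UserRepository and register are hypothetical names.

```python
import sqlite3


class UserRepository:
    """Hypothetical repository that sends real SQL to a real engine."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS users ("
            "id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL)"
        )

    def add(self, email):
        # The connection context manager commits, or rolls back on error.
        with self.conn:
            cursor = self.conn.execute(
                "INSERT INTO users (email) VALUES (?)", (email,)
            )
        return cursor.lastrowid

    def find_by_email(self, email):
        return self.conn.execute(
            "SELECT id, email FROM users WHERE email = ?", (email,)
        ).fetchone()


def register(repo, email):
    """Hypothetical service function under test."""
    normalized = email.strip().lower()
    try:
        return repo.add(normalized)
    except sqlite3.IntegrityError:
        # The UNIQUE constraint is enforced by the database, not by a mock.
        raise ValueError("email already registered") from None
```

A test can create the repository with sqlite3.connect(":memory:") and call register twice with the same address. The duplicate is rejected by the database’s own constraint, so there is no copy of that rule to keep in sync in the test code.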

Reason 3: Unit tests hide unreachable code

For most programming languages there’s a way to measure the flows within the code that unit tests cover. It’s a nice feeling to see this number increase when you write unit tests, and on top of that you can also brag to other teams that your team’s code coverage is higher than theirs. Although I think code coverage of unit tests for an isolated, complex method is really insightful, I don’t think code coverage says a lot for a whole service when it’s mostly covered with unit tests.

When the service is running in production, users or other systems don’t directly call a specific method within the service, but they call the interface. At the interface the user can only pass a limited number of combinations of parameters, and the same goes for the outgoing calls from the service to other services. If you have written integration tests and a large part of the code is not covered, that is important information. It is a clear hint that either your integration tests don’t cover all of the flows or the code is unreachable. If the code is unreachable, why is it there and can’t it just be removed? If you only had unit tests, the code would be tested and fully covered, but it would be hard to judge whether the code is actually used within the service.
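A hedged sketch of such a situation in Python; day_for_time and handle_request are hypothetical names, and handle_request stands in for the real interface of a service. A unit test can reach the timezone branch directly and coverage will mark it as covered, while no request through the interface ever gets there.

```python
from datetime import datetime, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def day_for_time(moment, tz_offset_hours=0):
    """Hypothetical helper: weekday of `moment` at a given UTC offset."""
    if tz_offset_hours != 0:
        # A unit test can cover this branch by calling the helper directly.
        moment = moment + timedelta(hours=tz_offset_hours)
    return WEEKDAYS[moment.weekday()]


def handle_request(params):
    """Hypothetical service interface: the only entry point for callers."""
    moment = datetime.fromisoformat(params["time"])
    # The interface never forwards an offset, so the branch above is
    # unreachable for every real request.
    return {"day": day_for_time(moment)}
```

A unit test calling day_for_time with an offset reports the branch as covered. An integration test that only goes through handle_request leaves it uncovered, which is exactly the hint that the branch may be dead code.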

Reason 4: Unit tests are not that much quicker than integration tests

One big reason to use unit tests over integration tests is that they are quicker. This is actually one of the big reasons why the test pyramid prescribes fewer integration tests than unit tests. This is true, but I think that with the developments of the last few years the difference is getting smaller. Nowadays software is often written in a microservice architecture, so an integration test covers less code than when a monolith is tested. Also, newer tools like Docker (used by the earlier linked TestContainers) make it possible to quickly spin up a fake dependency. For example, instead of mocking the AWS SDK, it’s also possible to spin up a DynamoDB fake using TestContainers and make actual HTTP calls. This also allows testing performance, connection pooling and more. These developments make the gap between running integration tests and unit tests much smaller. From my experience, if the integration tests are set up properly, for example with Docker, they take around 2 seconds per test case. Of course this is slower than the milliseconds it takes to run unit tests, but I still think the duration of integration tests is workable, especially with all the benefits they add.

Reason 5: Unit tests are a bad way for documenting how a service works

Writing documentation is often not that interesting and can be time consuming, while you could be writing actual code. Often people say that tests are also a form of documentation, so what’s wrong with unit tests as documentation? In my opinion unit tests are not a good way of documenting how a service works and what it does. Unit tests only cover a small piece of the puzzle. The test tells you what that specific puzzle piece does, but it doesn’t show you the full picture. This can often be seen in the ambiguous names of unit tests, like: MyDeluxeCalender_GetDayForTimeAndTimezone_TestForTimeWith2300InCETTimeZone. It’s possible to find out what the test does, but it doesn’t show why it’s tested. Probably some user input can lead to this test case, but maybe not, who knows? It’s possible to change the name of the unit test to what it tests (in this case whether the day is flipped over due to the timezone), but it still doesn’t describe the context of this method. It could be that this method is always called with UTC as timezone, meaning the code can be simplified a lot (see Reason 3).

Integration tests show what goes into the service and what comes out, without having to know the implementation. This makes it easier for people who didn’t build it to understand what the service does. It’s even possible to use (Cucumber) feature files to define the test cases, so people can understand what the service does without knowing the abstractions, frameworks or even programming language that were used. If people want to dive deeper into how the code works, they can debug an integration test case and step through the called methods. This makes it much easier to understand how all the puzzle pieces fit together.

Conclusion

The above reasons are just an opinion and I’m curious if you think differently about them. I could be completely wrong and I think there’s always something to learn. Maybe for a follow-up post I’ll collect some real-world examples of the above reasons to make them more clear.

In the above reasons I didn’t answer the big question: why do people choose unit tests over integration tests? I’m not a psychologist, but I think it has something to do with how developers think. They often like to solve complex problems, which is a good trait for a programmer. Thinking in complex solutions to solve complex problems can become a habit, even when it’s not necessary to complicate things. Unit tests are a good way to test this complex logic and confirm that your complex solution works within the scope of the tests, but unit tests don’t tell you whether the written code has to be as complicated as it is within the context of the service. The service ends up being relatively complex and, since there’s a lack of integration tests, it’s hard to tell what each piece of the code does within the context of the service. No one dares to touch the code, because they don’t want to be the one to cause bugs in production.

Of course a lot has already been written about testing, and to be honest I got most of my inspiration from the post Unit Testing is Overrated by Oleksii Holub. His blog goes much more in depth and his other post Prefer Fakes Over Mocks is also interesting to read. So kudos to him.