Here is the second article written by our collaborator Julien Busset, Study and Development Engineer at Cat-Amania.

At Touraine Tech, Benjamin Cavy introduced us to his passion for mutations. It’s not about mutating guinea pigs to get a fuchsia-colored one, but about mutating code to see if it can survive more or less random modifications. Of course, I know you’re all experts, your code is flawlessly impeccable (ahem…) and your unit tests (UTs) cover 100% of your code and all pass (ahem ahem… excuse me, I have a tickle in my throat). That said, in case of doubt, mutation testing can be a tool that will still help you significantly improve the resilience of your code by putting your UTs to the test.

By the way, what is a unit test?

Writing unit tests is undeniably part of good development practice; that no longer needs to be demonstrated. These tests ensure that methods return the expected output for a given input. We don’t test everything, but we test as much as possible, and above all, we test what is useful, what must work at all costs. To find out whether our tests cover most of our code, we can use tools like the very popular SonarQube. Among other things, these tools report code coverage, that is, the percentage of code lines executed during tests. Many companies set a quality gate for their applications that includes, among other things, a minimum code coverage.

“To test is to doubt”… and that includes doubting your tests.

But does the quantity of tests guarantee the quality of tests? A lack of quantity certainly guarantees a lack of quality, but for Benjamin Cavy, the reverse is not necessarily true: I can execute a lot of code in my tests, but only verify the accuracy of part of the expected output. For example, if a method is supposed to multiply an input number by 2, then store the result in a database, and finally return it to the method caller, then I can create a test that checks if my method returns 6 when I input 3… But have I really checked if the result is also stored in the database? Whether the verification has been done or not, the test coverage will be complete for this function. I admit that the example is trivial, but after all, it can happen: how can I make sure I’m really testing, even with 100% code coverage? An executed line of code is not necessarily a tested line of code if a check of what it does is not performed.
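To make the example concrete, here is a minimal sketch in plain Java. All names (`Doubler`, `InMemoryStore`, `doubleAndStore`) are hypothetical, the "database" is just an in-memory list, and simple boolean checks stand in for JUnit assertions so that the snippet stays self-contained. Both tests execute every line of the method, but only the second one verifies the storage step.

```java
import java.util.ArrayList;
import java.util.List;

public class DoublerExample {

    /** Hypothetical stand-in for the database: it only remembers what was saved. */
    static class InMemoryStore {
        final List<Integer> saved = new ArrayList<>();

        void save(int value) {
            saved.add(value);
        }
    }

    /** The method described in the text: double the input, store it, return it. */
    static class Doubler {
        private final InMemoryStore store;

        Doubler(InMemoryStore store) {
            this.store = store;
        }

        int doubleAndStore(int input) {
            int result = input * 2;
            store.save(result); // executed by both tests, verified by only one
            return result;
        }
    }

    /** Weak test: full line coverage, but only the return value is checked. */
    static boolean weakTest() {
        InMemoryStore store = new InMemoryStore();
        int result = new Doubler(store).doubleAndStore(3);
        return result == 6;
    }

    /** Stronger test: same coverage, but the stored value is checked too. */
    static boolean strongTest() {
        InMemoryStore store = new InMemoryStore();
        int result = new Doubler(store).doubleAndStore(3);
        return result == 6 && store.saved.equals(List.of(6));
    }

    public static void main(String[] args) {
        System.out.println("weak test passes: " + weakTest());     // true
        System.out.println("strong test passes: " + strongTest()); // true
    }
}
```

Both tests pass and both yield the same coverage figure. If the `store.save(result)` line were deleted, however, `weakTest` would still return true, while `strongTest` would return false.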

Mutation Testing, or Code Put to the Test of Natural Selection and Genetic Drift

It’s highly unlikely that your code will mutate on its own… However, it’s entirely possible that a less competent colleague inadvertently breaks your code. In such a case, it would be very useful for your unit tests to raise the alarm and prevent any regression. According to Benjamin, a good test is a test that fails when it should. This is where mutation testing comes into play: the principle is to make random but viable modifications to your code (mutations), and then run your unit tests against the resulting code. If the unit tests detect the modification, meaning at least one of them fails, then the mutant is killed by your unit tests. Otherwise, the mutant survives your unit tests, which implies the change wouldn’t be detected if it happened for real. The goal is to have your unit tests kill as many mutants as possible. The proportion of mutants killed, read alongside test coverage, gives you the ‘strength’ of your tests (as a percentage), which is a good indicator of the resilience of your code.
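The kill-or-survive idea can be sketched by hand in a few lines of Java. This is only an illustration under assumptions: the `isAdult`-style rule, the mutant and the two test suites are all hypothetical, and a real tool generates and runs the mutants automatically instead of having you write them. The mutant here changes `>=` into `>`, a typical boundary change.

```java
import java.util.function.IntPredicate;

public class MutationSketch {

    /** Original rule (hypothetical): an adult is 18 or older. */
    static final IntPredicate ORIGINAL = age -> age >= 18;

    /** Hand-written mutant: the boundary operator >= has become >. */
    static final IntPredicate MUTANT = age -> age > 18;

    /** Weak suite: checks values far from the boundary only. */
    static boolean weakSuite(IntPredicate isAdult) {
        return isAdult.test(30) && !isAdult.test(5);
    }

    /** Stronger suite: also checks the boundary value itself. */
    static boolean strongSuite(IntPredicate isAdult) {
        return weakSuite(isAdult) && isAdult.test(18);
    }

    /** A mutant survives when the suite still passes against it. */
    static String verdict(boolean suitePassesOnMutant) {
        return suitePassesOnMutant ? "survived" : "killed";
    }

    public static void main(String[] args) {
        // Both suites pass on the original code.
        System.out.println("weak suite on original: " + weakSuite(ORIGINAL));     // true
        System.out.println("strong suite on original: " + strongSuite(ORIGINAL)); // true

        // Only the stronger suite notices the mutation.
        System.out.println("mutant vs weak suite: " + verdict(weakSuite(MUTANT)));     // survived
        System.out.println("mutant vs strong suite: " + verdict(strongSuite(MUTANT))); // killed
    }
}
```

Both suites execute the same single line of production code, so coverage cannot tell them apart; only the mutant does.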

How does it work?

Well, it works quite well! Your humble servant tested an implementation of this concept on a Spring Boot application. This implementation, presented by Benjamin, is called PIT, and was largely developed by Henry Coles. My app uses Maven, which makes it very easy to use PIT: just a few lines to add pitest-maven to the pom [ed. note: don’t forget the pitest-junit5-plugin dependency if you’re using JUnit 5 like me ^^’], and you’re good to go! After a quick build, I received a nice report about 18 minutes later (for ~2k lines of code; the unit tests load no Spring context, but they do use a MockServer). The report shows the mutants that survived, much like a Sonar report, with the faulty code lines highlighted in red. I found some untested loggers (not very serious), but also some method calls whose deletion went unnoticed by the tests (!). I achieved 70% test strength, which is good but not top-notch, so there’s room for improvement. Of course, it’s possible to configure the tool, for example to exclude certain classes from mutation testing.

As with Sonar, I find that the main benefit is educational: if you’re an old sea dog, you may be able to do without it (although it’s sometimes useful to revisit the basics), but if you’re still a fresh-water sailor, you can learn to navigate the sea of unit tests more effectively with this great tool.