Mutation Testing: Are Your Tests Actually Testing Anything?
Your tests pass with 100% code coverage. But does that mean your tests are good? Mutation testing helps you answer this question by changing your code and seeing if your tests fail.
You’ve written a function and a suite of unit tests for it. You run your tests, and the output is beautiful: 100% Coverage. You feel confident that your code is well-tested and robust.
But what if I told you that 100% coverage can be dangerously misleading? Consider this function:
function isPositive(num: number): boolean {
  return num > 0;
}
And this test:
test('isPositive should return true for positive numbers', () => {
  expect(isPositive(5)).toBe(true);
});
This test suite achieves 100% line coverage. But is it a good test? What if we introduced a bug in the function?
// Bug introduced: changed > to >=
function isPositive(num: number): boolean {
  return num >= 0; // This is wrong! 0 is not positive.
}
If we run our test suite again, the test expect(isPositive(5)).toBe(true) still passes. Our tests gave us 100% coverage but failed to detect a bug. This is where Mutation Testing comes in.
The Core Idea: If the Code Changes, the Tests Should Break
Mutation testing is a technique where you intentionally introduce small changes (“mutations”) into your source code and then run your test suite. The goal is to see if your tests can “kill” the mutant.
- Mutant: A version of your source code with one small change (e.g., > is changed to >=, + is changed to -, true is changed to false).
- Killed Mutant: A mutant is considered “killed” if at least one of your tests fails. This is good! It means your tests are effective enough to detect the change.
- Survived Mutant: A mutant “survives” if all of your tests still pass. This is bad. It indicates a weakness in your test suite. Your tests aren’t specific enough to catch that particular bug.
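To make the killed/survived distinction concrete, here is a minimal hand-rolled sketch in Python. It is not how any real mutation testing tool works internally; it applies a single > to >= mutation with the standard ast module and checks two toy suites against the mutant. The names is_positive, BoundaryMutator, weak_suite and strong_suite are hypothetical and exist only for this illustration.

```python
import ast

# Source of the function under test (a Python version of isPositive).
SOURCE = """
def is_positive(num):
    return num > 0
"""


class BoundaryMutator(ast.NodeTransformer):
    """Replace every `>` comparison with `>=` (a single kind of mutation)."""

    def visit_Compare(self, node):
        self.generic_visit(node)
        node.ops = [ast.GtE() if isinstance(op, ast.Gt) else op for op in node.ops]
        return node


def load(tree):
    """Compile a module AST and return the is_positive function it defines."""
    namespace = {}
    exec(compile(tree, "<mutation-sketch>", "exec"), namespace)
    return namespace["is_positive"]


def weak_suite(fn):
    """Toy suite: only checks a clearly positive number."""
    return fn(5) is True


def strong_suite(fn):
    """Toy suite: also checks the boundary value 0."""
    return fn(5) is True and fn(0) is False


def is_killed(suite, mutant_fn):
    """A mutant is killed when the suite fails against it."""
    return not suite(mutant_fn)


original = load(ast.parse(SOURCE))
mutated_tree = ast.fix_missing_locations(BoundaryMutator().visit(ast.parse(SOURCE)))
mutant = load(mutated_tree)

print(is_killed(weak_suite, mutant))    # False: the mutant survives
print(is_killed(strong_suite, mutant))  # True: the mutant is killed
```

Both suites pass against the original function; only the one that checks the boundary value notices the mutation.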
The quality of your test suite is measured by the Mutation Score Indicator (MSI):
MSI = (Killed Mutants / Total Mutants) * 100
A high score (e.g., >80%) means your tests detect most of the mutations the tool generated.
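As a quick worked example of the formula, the sketch below computes the score from raw counts. The function name mutation_score is an assumption made for this illustration; real tools may also treat timed-out or uncovered mutants in their own way, which this sketch ignores.

```python
def mutation_score(killed: int, total: int) -> float:
    """Return the mutation score as a percentage: killed / total * 100."""
    if total <= 0:
        raise ValueError("total must be a positive number of mutants")
    if not 0 <= killed <= total:
        raise ValueError("killed must be between 0 and total")
    return killed / total * 100


print(mutation_score(1, 2))  # 50.0
print(mutation_score(2, 2))  # 100.0
```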
TypeScript Example with Stryker
Stryker is a widely used mutation testing framework for JavaScript and TypeScript. Let’s use it on our isPositive example.
First, install Stryker:
npm install --save-dev @stryker-mutator/core @stryker-mutator/jest-runner
Then, create a stryker.conf.json file. Since Stryker 4, TypeScript files are mutated out of the box, so the older "mutator": "typescript" setting is no longer needed:
{
  "$schema": "./node_modules/@stryker-mutator/core/schema/stryker-schema.json",
  "reporters": ["html", "clear-text", "progress"],
  "testRunner": "jest",
  "coverageAnalysis": "perTest"
}
Now, let’s run npx stryker run. Stryker will perform the following steps:
- Run the original tests to make sure they pass (a sanity check).
- Create mutants. For return num > 0;, it will generate mutants like:
  - return num >= 0; (Boundary Operator Mutant)
  - return num < 0; (Conditional Operator Mutant)
  - return num === 0;
  - return false; (Boolean Literal Mutant)
- For each mutant, it runs your tests. With coverageAnalysis set to perTest, only the tests that cover the mutated code are run, which saves time.
Our original test expect(isPositive(5)).toBe(true) will kill some mutants (like return false), but it will fail to kill return num >= 0;. This mutant survives.
The Stryker report will look something like this (simplified; a real run lists every mutant it generated, so the counts will differ):
----------------|---------|-------------|
File            | % score | # killed    |
----------------|---------|-------------|
All files       |   50.00 | 1           |
isPositive.ts   |   50.00 | 1/2 mutants |
----------------|---------|-------------|
# Ran 2 mutants in total.
# 1 survived, 1 killed.
The report clearly shows that one mutant survived. To kill it, we need to add a better test case.
Improving the Tests
Let’s add a test for the zero case.
test('isPositive should be false for zero', () => {
  expect(isPositive(0)).toBe(false);
});
Now, if we run Stryker again, the mutant return num >= 0; will be killed, because our new test will fail (isPositive(0) would return true, but we expect false). Our mutation score will go up to 100%.
Python Example with Mutmut
Mutmut is a popular mutation testing tool for Python.
pip install mutmut pytest
The Code and Initial Test
# calculator.py
def add(a, b):
    return a + b

# test_calculator.py
from calculator import add

def test_add():
    assert add(1, 2) == 3
This test gives 100% line coverage. Now, let’s run mutmut run.
Mutmut will create mutants. For return a + b, a key mutant will be return a - b. Our single test assert add(1, 2) == 3 will fail against this mutant (since 1 - 2 is not 3), so this mutant is killed.
But what about this code?
# operators.py
def is_admin(user):
    # Bug: This should be user.role == "admin"
    if user.role != "guest":
        return True
    return False

# test_operators.py
from operators import is_admin

class MockUser:
    def __init__(self, role):
        self.role = role

def test_is_admin_for_admin_user():
    admin = MockUser("admin")
    assert is_admin(admin) is True
This test passes, but it only exercises the branch that returns True; the final return False never runs, so it does not even reach 100% line coverage. If we run mutmut, it might change if user.role != "guest": to if True:. Our current test only uses an admin user, so is_admin will still return True, and the mutant will survive.
To kill this mutant, we need to test the negative case:
def test_is_admin_for_guest_user():
    guest = MockUser("guest")
    assert is_admin(guest) is False
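This new test kills the mutant, making our test suite more robust. Note that it still does not catch the bug flagged in the code comment: a user with any other role (say, "editor") is still treated as an admin, so a test for a non-admin, non-guest role is needed as well.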
The Value of Mutation Testing
- It tests your tests. It’s the ultimate auditor of your test suite’s quality.
- It exposes weaknesses in your assertions. Often, a surviving mutant is a sign that your test assertions are too loose or are not checking all the important outputs of a function.
- It forces you to test edge cases. It’s the best way to ensure you’ve tested boundary conditions (>, >=, ==) and other tricky parts of your logic.
- It provides a more reliable quality metric than code coverage. A high mutation score is a much stronger indicator of test quality than a high coverage percentage.
Downsides
- It’s slow. Running your entire test suite for every single mutation can be very time-consuming. It’s often run in a CI pipeline overnight, not on every commit.
- Some mutants are equivalent. Sometimes a mutation results in code that is functionally identical to the original (e.g., changing x = i + 1 to x = i - -1). These are called “equivalent mutants” and cannot be killed. You may need to manually ignore them.
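The equivalent-mutant example can be checked directly. In the sketch below, original and mutant are hypothetical wrappers around the two expressions. Sampling inputs cannot prove equivalence in general, but here it illustrates why no test input can tell the two versions apart.

```python
def original(i):
    return i + 1


def mutant(i):
    # Subtracting -1 is the same as adding 1, so behavior is unchanged
    return i - -1


# Collect every sampled input on which the two versions disagree.
disagreements = [i for i in range(-1000, 1001) if original(i) != mutant(i)]
print(disagreements)  # []
```

Because no input distinguishes the two functions, a surviving mutant like this says nothing about the quality of the tests.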
Mutation testing is a powerful tool that pushes you to write better, more precise tests. It moves you beyond simple line coverage and helps you build a truly effective automated testing safety net.