The tale of legacy code

It is one of those days again. It's me sitting here – looking at code I don't understand. The wise elders who wrote it in ancient times have vanished. The secret scrolls of documentation that some tales claim existed are gone. It's me sitting here – adding another piece of code that someone else will find some time in the future and wonder why it is there.
Like presumably most of you, I don't enjoy working with code that is not mine. With code that I don't know well. With code that I don't feel comfortable changing. In short: I don't like working with legacy code.
Since I don't like working with it, I find it only fair to no longer produce it for others.
|“To me, legacy code is simply code without tests”
Michael C. Feathers – Working Effectively with Legacy Code
Once I accepted that well-written tests can increase the understandability of code, make it easier to maintain in the long run, and therefore prevent it from becoming legacy code, I started to add tests everywhere I worked. More precisely: automated tests. And the only type of automated test I knew at that point in time: unit tests.
It all worked fine on fresh projects and I started to see positive results. But when I applied this approach to bigger existing projects, it often felt like using a hammer to drive in a screw. Using unit tests everywhere was exactly that – using the wrong tool for the job.
In this post I would like to show the different kinds of tests I use to add coverage to an existing (untested) code base.
Know your goals
There is more than one reason why one might want to add tests – and it’s important to know yours before choosing your tool.
The first and obvious one is:
|“Ensure that software behaves like expected”.|
That goal implies the prevention of defects in the first place, but I would like to state it explicitly:
|“Prevent defects to reach release state”|
Since in reality bugs will appear anyway, the third goal is:
|“Reduce the costs to fix a defect”|
Defining the costs is not an easy task and the existing studies (NIST, IBM) are controversial topics in the community. In the end the bottom line of those studies is: the costs to fix a defect increase with time and with the software lifecycle stage. So I decided to go with a definition that takes this bottom line into account and matches my day-to-day experience. If my software is working properly, I change one line of code, and it suddenly breaks – the issue can only be in the one line that I just wrote. Easy fix, easy life.
If somebody tells me that something is not working in a system that I wrote a year ago, I need to learn that system first (again). In addition, the technical knowledge of the person reporting a bug contributes heavily to the costs. A fellow peer telling me “If I try to buy a product that is out of stock, I receive a null reference exception in the Inventory System at line X” is way easier to track down than a customer telling me “When I press the buy button nothing happens!”
So this experience leads to the following formula:
|Costs to fix a defect = stage it was found in x complexity of the system x time that passed since the implementation|
The test types presented in the next chapters each aim to reduce one of the multiplicands, and with it the overall risk and costs.
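To make the multiplicative structure of the formula concrete, here is a small sketch in Python. The stage weights and the example numbers are made up purely for illustration; only the shape of the formula – three factors multiplied together – comes from the text above.

```python
# Illustrative sketch of the cost formula: stage x complexity x time.
# The weight values below are invented for demonstration purposes.

STAGE_WEIGHT = {"implementation": 1, "testing": 5, "release": 25}

def fix_cost(stage: str, system_complexity: float, days_since_change: float) -> float:
    """Relative cost to fix a defect found at a given stage."""
    return STAGE_WEIGHT[stage] * system_complexity * days_since_change

# A bug caught while typing is cheap: 1 x 1 x 1
cheap = fix_cost("implementation", system_complexity=1, days_since_change=1)

# The same bug reported from a released, complex system a year later:
# 25 x 10 x 365 - orders of magnitude more expensive.
expensive = fix_cost("release", system_complexity=10, days_since_change=365)
```

Whatever concrete weights one picks, the point stands: shrinking any single factor shrinks the product.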
Know your poison
Adding tests will always come at a price, but it will also bring benefits. One needs to know both well and choose wisely.
Documented Manual Tests
One way to achieve the first goal is to write a checklist, hand it over to our QA team and let them run it manually. That is a valid, written test, and it counts toward your test coverage. The important word here is written!
Unwritten tests look fine as long as they discover bugs, but they introduce a problem: the status “all working” does not describe a defined state. Questions like “Was the shop checked?”, “Was something purchased with a real credit card?”, “Was an invalid credit card tried?” remain unanswered.
With written manual tests I can reach high test coverage at very low test creation costs, but these tests come with multiple challenges:
- They are very expensive to run
- Execution time is very long
- Because they are so expensive, you can't run them very often (usually once every iteration, maybe once a week)
According to the formula above, the last point will increase the costs of fixing a bug that is found during testing, and therefore increase the costs of this type of test. Overall, manual tests are an easy way to establish a basic layer of testing and work towards the first goal, but reaching high test coverage with them will prove expensive. Still, manual tests will remain part of the test suite for all areas where the implementation costs of other test types are extremely high.
Behaviour Tests

Behaviour tests are the first approach to automated testing: they mimic the user's behaviour and check the results automatically. I can take the written manual tests and implement whatever is possible. Every manual test that I automate can be removed from the manual test suite to reduce costs. Even though this test suite can easily run for several hours, it can run nightly and thus reduce the factor “time passed”.
In my day-to-day work I actually pick subsets that cover the area I'm working on and run them multiple times a day. With that approach I can reach close to 100% test coverage at comparably low costs, and I can completely satisfy my first goal:
- “Ensure that Software behaves like expected”
However, I cannot reasonably reduce the “time passed” below the hourly range, and I have not touched the complexity factor at all. Those topics are covered by the next categories.
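As a minimal sketch of such an automated behaviour test: the `Shop` class and its `click` method below are invented stand-ins for a real game's UI layer, so the example is self-contained. The test does exactly what a manual checklist entry would describe – press the buy button, then check gold and inventory.

```python
# Hypothetical game facade: `click` simulates user input reaching the UI.

class Shop:
    def __init__(self):
        self.gold = 100
        self.inventory = []

    def click(self, button: str) -> None:
        """Simulated user input: route a UI click to game logic."""
        if button == "buy_sword":
            self.gold -= 30
            self.inventory.append("sword")

def test_buying_a_sword_through_the_ui():
    # Mirrors the manual checklist: "press buy, verify gold and inventory".
    shop = Shop()
    shop.click("buy_sword")
    assert shop.gold == 70
    assert "sword" in shop.inventory
```

In a real engine the `click` would go through the actual input and rendering stack, which is exactly why these tests are slow but exercise everything.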
System Tests

System tests are the first layer that reduces complexity, usually by removing the input layer: instead of triggering functionality through user input, they invoke functions directly on the systems, while still using the full environment. That reduces the complexity by a great amount. If something fails in a behaviour test but does not fail in a system test, I can exclude all non-UI code as the source of the bug – and that is usually a huge chunk of our code.
System tests are also the first approach to getting a grip on the rarity of a bug. Once I get rid of the UI layer and have full control over the execution timing, I can write tests that execute system calls very quickly or in overlapping fashion, to reproduce issues that only happen under rare combinations. I could also run a test that triggers functions in a random order, to name just a few examples.
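A sketch of what this looks like in practice, using a hypothetical `ShopSystem` (invented for this example): the test calls the system's function directly, with no simulated clicks in between, so a failure here cannot be the UI's fault.

```python
# Hypothetical shop system, called directly - no UI layer involved.

class ShopSystem:
    def __init__(self, gold: int = 100):
        self.gold = gold
        self.inventory: list[str] = []

    def buy(self, item: str, price: int) -> bool:
        if price > self.gold:
            return False  # out of gold: purchase rejected
        self.gold -= price
        self.inventory.append(item)
        return True

def test_buy_rejects_purchase_when_gold_runs_out():
    shop = ShopSystem(gold=50)
    assert shop.buy("sword", 30) is True
    assert shop.buy("shield", 30) is False  # only 20 gold left
    assert shop.inventory == ["sword"]
```

Because the calls are plain functions, hunting rare ordering bugs becomes possible too – for example shuffling a list of such calls with a fixed random seed, something user input would never let you control.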
While that all sounds awesome, system tests are the first test layer that adds requirements to my production code: it must have its view and game logic separated. Since it is common and good practice not to mix view and business logic, the changes needed to prepare for system tests are usually pretty small.
If parts of the codebase don't reflect this separation of concerns, there is a tough decision to make: either refactor them so system tests can be properly implemented, or miss out on the benefits and stop refining the test levels further down for this code. It will still be covered by behaviour tests to fulfil the first goal, but it will never achieve the second one.
Integration Tests

The idea behind integration tests is one of the oldest algorithmic ideas in computer science: divide and conquer. If I have a problem, I divide it into two roughly equal parts and solve the smaller problems separately. If a test fails in both the behaviour and the system layer, I can exclude the UI as the source of the defect, but I'm still left with a huge chunk of code to check. That's where integration tests come into play.
Integration tests are the first layer where I introduce mocking. Mocking means replacing a complex system with a dummy that has exactly one hard-coded behaviour. That behaviour is only correct under one specific circumstance, but that doesn't matter – I write the behaviour so it matches my test case.
Suppose my game has two systems and a bug. It would be nice to know which system is the cause – that would save me half the searching time. Usually the problem is that the two systems depend on each other. This is where the mock comes in handy: with a mock I simulate that the second system definitely works correctly. If I now run the test, the system that fails is the problem. If neither fails, I know that the problem lies in the communication between the systems.
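As a sketch of that idea, assuming two hypothetical systems (a `Checkout` depending on an `Inventory`, both invented for this example), Python's `unittest.mock` makes the hard-coded dummy a one-liner:

```python
from unittest.mock import Mock

# Hypothetical system under test: Checkout depends on an Inventory system.

class Checkout:
    def __init__(self, inventory):
        self.inventory = inventory

    def buy(self, item: str) -> str:
        if not self.inventory.in_stock(item):
            return "out of stock"
        self.inventory.reserve(item)
        return "purchased"

def test_checkout_with_mocked_inventory():
    inventory = Mock()
    inventory.in_stock.return_value = True  # hard-coded: always in stock

    checkout = Checkout(inventory)
    assert checkout.buy("sword") == "purchased"

    # If this assertion fails, Checkout is the culprit: the Inventory
    # side is "correct by construction" inside this test.
    inventory.reserve.assert_called_once_with("sword")
```

The mock's behaviour is only valid for this one test case – and that is exactly the point: if the test fails, the blame can only fall on `Checkout` or on the communication between the two.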
With the integration test I split the huge “non-ui” part that is left from the system test in chunks that belong together.
For example: inventory, battle, town building, etc. From my experience, integration tests split the code into the 5-10 main areas of a game, which reduces the complexity down to around 10-20% of the full game.
Since integration tests can already mock lengthy behaviour like backend connections or waiting for animations, they run a little faster – usually in the ten-minute range, and if you run only some of them, even in the minute range. So this way I'm also able to reduce the “time passed”.
The required mocking raises the bar on my code quality, since it demands that my code follows the Dependency Inversion Principle.
Depending on the existing code quality, the changes are usually not very risky or complex. They are just work that scales with the number of usages (so if I have a singleton used in 100 files, I need to update 100 files).
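The typical shape of that refactor is small, as this sketch shows. The `AudioService` and `Door` classes are hypothetical; the point is only the mechanical change from a singleton lookup to an injected dependency, which is what makes a test double possible at all.

```python
# Before the refactor, Door would reach for a global singleton, e.g.
# AudioService.instance().play(...). After it, the dependency is injected.

class AudioService:
    def play(self, clip: str) -> None:
        pass  # a real implementation would call into the audio backend

class Door:
    def __init__(self, audio: AudioService):
        self._audio = audio  # injected, not fetched from a singleton

    def open(self) -> None:
        self._audio.play("door_creak")

class RecordingAudio(AudioService):
    """Test double: records requested clips instead of playing sound."""
    def __init__(self):
        self.played: list[str] = []

    def play(self, clip: str) -> None:
        self.played.append(clip)

def test_door_plays_creak_sound():
    audio = RecordingAudio()
    Door(audio).open()
    assert audio.played == ["door_creak"]
```

Each call site that previously used the singleton now receives the service from the outside – tedious in 100 files, but mechanical and low-risk.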
Component Tests

Component tests also require the Dependency Inversion Principle – now on a per-class level. They are the first layer that tests only one system (usually 1-5 classes) at a time – the rest is mocked.
That usually reduces the complexity down to 1%-10% of your whole game.
Since all time-based systems can be mocked away, component tests are also the first layer that can run within a single frame. That leads to an execution time in the seconds-to-minutes range, so they can run on every commit, which again reduces the “time passed”. I usually run component tests 5-10 times a day.
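Mocking time away is what makes the single-frame execution possible. A sketch with a hypothetical `BuildTimer` component: the clock is injected, so the test can jump a minute forward instantly instead of actually waiting.

```python
# Hypothetical component: a build timer with an injected time source.

class BuildTimer:
    def __init__(self, clock, duration: float):
        self._clock = clock
        self._finish_at = clock() + duration

    def is_done(self) -> bool:
        return self._clock() >= self._finish_at

def test_build_finishes_after_duration():
    fake_now = [0.0]

    def clock() -> float:
        return fake_now[0]  # mocked time source under test control

    timer = BuildTimer(clock, duration=60.0)
    assert not timer.is_done()

    fake_now[0] = 61.0  # jump forward a minute without waiting
    assert timer.is_done()
```

The production code would pass in a real clock (e.g. the engine's game time); the test substitutes one it fully controls.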
Unit Tests

Unit tests are the crown of automated testing: one test tests exactly one function – the rest is mocked.
Unit tests add a new requirement to your code: the Single Responsibility Principle. You can only reduce complexity with unit tests if the code you test does only one thing at a time.
Unit tests reduce the complexity down to single methods. They always run within one frame, in the milliseconds-to-seconds range (the run time is usually roughly equal to the compile time).
This means they can run every time I let the compiler run – so after nearly every changed line of code – which reduces the “time passed” to nearly zero. The complexity is reduced to one function, most of the time only a couple of lines.
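At this level a test needs no environment at all. A minimal sketch, with a hypothetical `apply_discount` function standing in for any piece of code that does exactly one thing:

```python
# One function, one responsibility - the ideal unit under test.

def apply_discount(price: int, percent: int) -> int:
    """Return the price after a percentage discount, rounded down."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (100 - percent) // 100

def test_apply_discount():
    assert apply_discount(200, 25) == 150
    assert apply_discount(99, 0) == 99
    assert apply_discount(50, 100) == 0
```

When a test this small fails, the search space is a handful of lines – the complexity factor of the formula is about as low as it can get.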
I have shown that test coverage is not only about unit tests – it is the sum of all test approaches, including the non-automated ones.
Writing tests to verify the correctness of your software is only part of the truth, which splits into three areas:
1. Detect the existence of defects
2. Reducing the time that it takes to fix a defect.
3. Preventing bugs in the first place – if you find an issue during feature development it may not even be considered a bug.
If you have an existing project that does not yet have the full test coverage I showed: start top-down, to keep the requirements (and therefore changes) to your production code as low as possible.
If you started bottom-up (with unit tests), you might need to change large parts of your production code without having tests that cover your changes. On top of that, the cost of adding a small amount of coverage is already quite high and leaves you with an unsatisfying result.
It doesn't help to have unit tests for 5% of your code while the rest has nothing – because unless the bug is in those 5%, you have an effective coverage of 0%.
If you start on an empty project, you could consider going bottom-up and having a look at Test-Driven Development.