Fortran - Testing - Unit tests and test suites
Matthias Noback
In the previous post, we spent some time improving a number of “temporary” tests. We introduced an assertion function to compare two real values, and we prevented failing tests from stopping the entire test program.
Moving tests to their own procedures
The approach of adding tests directly in the main test program block doesn’t really scale, as the saying goes. polyline_length may be a simple function, but for more complicated functions with multiple branches and loops, we’d have to write many of these tests. The test program keeps growing, and eventually it becomes a mess. It doesn’t help that all local variables have to be declared at the top. Even if that wasn’t needed, it isn’t very clear where a test starts or ends. Everything happens in the same scope, potentially becoming a memory management issue too. This also makes it hard to delete tests we no longer want. Removing lines from a big test program likely breaks other tests, or we forget to remove things we no longer need.
A simple solution is to move every test to its own procedure, with its own local state that is automatically cleaned up when the procedure returns. To ensure the error counting mechanism in the test program can still be used, we’ll let each test procedure return a logical test_failed result. As an example, we extract the following function for testing the polyline with 2 points. Note that we assign the result of assert_equals directly to the test_failed result variable. This correctly indicates that the test itself fails if the assertion fails:
function test_polyline_with_2_points() result(test_failed)
logical :: test_failed
real(kind=real64), dimension(2, 2) :: two_points
print *, 'Calculate length of polyline with 2 points'
! ...
test_failed = assert_equals(5.0_real64, &
polyline_length(two_points), &
1.0e-10_real64)
end function test_polyline_with_2_points
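The body of the test is elided above. As a rough sketch of how the pieces could fit together, the following self-contained example fills in the gaps under a few assumptions: polyline_length is a stand-in that sums the distances between consecutive points stored column-wise, and assert_equals is a stand-in that returns .true. when the values differ by more than the tolerance. The real procedures from the previous posts may well look different, and the module name polyline_sketch is made up for this example.

```fortran
module polyline_sketch
    use iso_fortran_env, only: real64
    implicit none
    private
    public :: polyline_length, assert_equals, test_polyline_with_2_points

contains

    ! Assumed stand-in: sums the distances between consecutive points,
    ! where points(:, i) holds the x and y coordinates of point i.
    function polyline_length(points) result(length)
        real(kind=real64), dimension(:, :), intent(in) :: points
        real(kind=real64) :: length
        integer :: i

        length = 0.0_real64
        do i = 2, size(points, 2)
            length = length + norm2(points(:, i) - points(:, i - 1))
        end do
    end function polyline_length

    ! Assumed stand-in: returns .true. (assertion failed) when the two
    ! values differ by more than the given tolerance.
    function assert_equals(expected, actual, tolerance) result(assertion_failed)
        real(kind=real64), intent(in) :: expected, actual, tolerance
        logical :: assertion_failed

        assertion_failed = abs(expected - actual) > tolerance
        if (assertion_failed) then
            print *, 'Expected', expected, 'but got', actual
        end if
    end function assert_equals

    function test_polyline_with_2_points() result(test_failed)
        logical :: test_failed
        real(kind=real64), dimension(2, 2) :: two_points

        print *, 'Calculate length of polyline with 2 points'

        ! A 3-4-5 triangle: from (0, 0) to (3, 4)
        two_points = reshape([0.0_real64, 0.0_real64, &
                              3.0_real64, 4.0_real64], [2, 2])

        test_failed = assert_equals(5.0_real64, &
                                    polyline_length(two_points), &
                                    1.0e-10_real64)
    end function test_polyline_with_2_points

end module polyline_sketch

program demo
    use polyline_sketch, only: test_polyline_with_2_points
    implicit none

    if (test_polyline_with_2_points()) error stop 'test failed'
    print *, 'test passed'
end program demo
```

All state the test needs (the two_points array) lives inside the test function, so nothing leaks into the calling program.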
The main program block looks like this now:
integer :: error_counter
logical :: test_failed
error_counter = 0
test_failed = test_polyline_with_2_points()
if (test_failed) then
error_counter = error_counter + 1
end if
test_failed = test_polyline_with_3_points()
if (test_failed) then
error_counter = error_counter + 1
end if
if (error_counter > 0) then
print *, error_counter, 'tests failed'
error stop
end if
It just calls all test procedures. There’s no irrelevant local state left, which gives us a nice and clean test program. The only code quality issue here is the duplication after each test. We’ll be able to fix that soon.
Moving tests to their own modules
It’s sad, but the code that’s written in a “quick test program” often doesn’t survive. Once a function is verified to return the correct result, the code often gets deleted. We lose the time and effort that went into writing this “throw-away” code. We also lose the intention that the original author had; we lose their understanding of what the function should do and which of its behaviors they were interested in. But we also lose the safety net for future changes to this function: if we make a mistake when doing so, we likely won’t even notice it.
To prevent all these losses, let’s make sure these tests get a proper place in the code base: not inside a test program, but inside their own module. Fortunately, this is easy to do: test functions only have local variables, so we can literally just cut-and-paste the test procedures into a separate module; let’s call it test_polyline. There we may even add new tests for polyline-related code. As always, we have to keep an eye on these modules: if they grow too big, maybe because new tests are only tangentially related, we should move some tests to new test modules.
module test_polyline
use iso_fortran_env, only: real64
implicit none(type, external)
private
public :: test_polyline_with_2_points
public :: test_polyline_with_3_points
contains
function test_polyline_with_2_points() result(test_failed)
! ...
Generalizing the test program
When we move test procedures to their own modules (where we make them public), we also have to import them in the main test program to keep everything working:
program tester
use test_polyline, only: test_polyline_with_2_points, &
test_polyline_with_3_points
! ...
test_failed = test_polyline_with_2_points()
if (test_failed) then
error_counter = error_counter + 1
end if
! ...
In the long run, the number of use statements for all of our test modules will really start to add up. Also, we still have this problem of code duplication: every time we call a test, we need those if (test_failed) lines. What if we could make an array of tests and loop through them, removing the code duplication, but also removing the need to call test procedures by their specific name? In other words: what if we could generalize the test loop itself, like this:
do i = 1, size(tests)
test_failed = tests(i)()
if (test_failed) then
error_counter = error_counter + 1
end if
end do
If tests could be an array of procedure pointers, we’d be done. Unfortunately, Fortran doesn’t allow creating an array of (procedure) pointers. To work around this limitation, we have to define a derived type with a procedure pointer as a data component. First, let’s define what a test procedure should look like, using an abstract interface: it should always have no arguments, but return a logical. The existing test procedures already match this interface:
module test_framework
implicit none(type, external)
private
abstract interface
function test_procedure_interface() result(test_failed)
implicit none(type, external)
logical :: test_failed
end function test_procedure_interface
end interface
end module test_framework
Note that we’re placing this interface in a new test_framework module.
Now we declare a new derived type, let’s call it unit_test_t, which also lives in the test_framework module. It holds a procedure pointer to the actual test procedure. When declaring this component, we specify the interface we just defined:
type :: unit_test_t
procedure(test_procedure_interface), pointer :: test_procedure
end type unit_test_t
Compiling this leads to an interesting error:
For a type-bound procedure that has the PASS binding attribute, the first dummy argument must have the same declared type as the type being defined.
As we know from previous articles, regular type-bound procedures are expected to have the instance on which the procedure is called as their first argument (e.g. class(unit_test_t), intent(in) :: self). Surprisingly, procedure pointer components follow the same rule: unless we say otherwise, they get the pass attribute by default, so the compiler expects the target procedure to accept the instance as its first argument. To get rid of the error, we can specify the nopass attribute:
type :: unit_test_t
- procedure(test_procedure_interface), pointer :: test_procedure
+ procedure(test_procedure_interface), pointer, nopass :: test_procedure
end type unit_test_t
Note that this is a very interesting language feature. Type-bound procedures are “hard-coded”; their behavior is determined at compile time. With procedure pointers as data components, we can change the behavior of a derived type at runtime, because we “bind” a procedure with the correct interface to an existing derived type instance dynamically, even from outside the module.
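To make the runtime aspect concrete, here is a minimal sketch in which the same unit_test_t instance is pointed at two different procedures, one after the other. The module name rebinding_sketch and the procedures always_passes and always_fails are invented for this illustration; only the type and the interface follow the definitions above.

```fortran
module rebinding_sketch
    implicit none
    private
    public :: unit_test_t, always_passes, always_fails

    abstract interface
        function test_procedure_interface() result(test_failed)
            implicit none
            logical :: test_failed
        end function test_procedure_interface
    end interface

    type :: unit_test_t
        procedure(test_procedure_interface), pointer, nopass :: &
            test_procedure => null()
    end type unit_test_t

contains

    ! Hypothetical test procedure that never fails
    function always_passes() result(test_failed)
        logical :: test_failed
        test_failed = .false.
    end function always_passes

    ! Hypothetical test procedure that always fails
    function always_fails() result(test_failed)
        logical :: test_failed
        test_failed = .true.
    end function always_fails

end module rebinding_sketch

program demo
    use rebinding_sketch, only: unit_test_t, always_passes, always_fails
    implicit none
    type(unit_test_t) :: unit_test

    ! Bind a procedure to the existing instance...
    unit_test%test_procedure => always_passes
    print *, 'failed?', unit_test%test_procedure()

    ! ...and later replace it with another one that has the same interface
    unit_test%test_procedure => always_fails
    print *, 'failed?', unit_test%test_procedure()
end program demo
```

The first call reports F and the second reports T, even though unit_test is the same variable in both cases. The pointer assignment happens in the program, outside the module that defines the type.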
For now, let’s populate an array of these new unit_test_t instances directly inside the test program. Later we can let the test modules do it themselves:
program tester
use test_polyline, only: test_polyline_with_2_points, &
test_polyline_with_3_points
use test_framework, only: unit_test_t
implicit none(type, external)
integer :: error_counter
integer :: unit_test_index
logical :: test_failed
type(unit_test_t), dimension(:), allocatable :: unit_tests
error_counter = 0
unit_tests = [ &
unit_test_t(test_polyline_with_2_points), &
unit_test_t(test_polyline_with_3_points) &
]
do unit_test_index = 1, size(unit_tests)
test_failed = .false.
test_failed = unit_tests(unit_test_index)%test_procedure()
if (test_failed) then
error_counter = error_counter + 1
end if
end do
if (error_counter > 0) then
print *, error_counter, 'tests failed'
error stop
end if
end program tester
Note that creating the array of unit tests still requires explicit references to the test procedures, meaning they have to be explicitly imported as well. The list will be very long, and every time we write a new test we have to add it to the test program itself. If we work in a team of eager testers, we’ll have a lot of merge conflicts at this location.
We can reduce the amount of coupling, and thereby the change rate for this code location, by letting the test module populate an array with its own unit test procedures. For instance, we can extract those unit test instantiation lines to a dedicated function in the test_polyline module:
module test_polyline
! ...
public :: collect_polyline_tests
contains
function collect_polyline_tests() result(unit_tests)
type(unit_test_t), dimension(:), allocatable :: unit_tests
unit_tests = [ &
unit_test_t(test_polyline_with_2_points), &
unit_test_t(test_polyline_with_3_points) &
]
end function collect_polyline_tests
! ...
end module test_polyline
We make that function public, so the test program can invoke it:
-use test_polyline, only: test_polyline_with_2_points, &
- test_polyline_with_3_points
+use test_polyline, only: collect_polyline_tests
type(unit_test_t), dimension(:), allocatable :: unit_tests
-unit_tests = [ &
- unit_test_t(test_polyline_with_2_points), &
- unit_test_t(test_polyline_with_3_points) &
- ]
+unit_tests = collect_polyline_tests()
Note that the only public procedure of the test_polyline module is now collect_polyline_tests, so test_polyline_with_2_points and test_polyline_with_3_points remain private. Nevertheless, the test program is able to call these procedures. The fact that these procedures are private to their module is irrelevant; as long as we target these procedures in test_polyline, they can still be called by their pointers inside the test program. This is a powerful yet somewhat surprising feature of procedure pointers. It allows us to expose private functionality to other modules, without the need to expose actual procedure names.
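This behavior is easy to try out in isolation. The following sketch uses made-up names (private_target_sketch, callback_t, get_callback, hidden_answer) and has nothing to do with polylines; it only shows that a procedure which is private to its module can still be invoked through a pointer handed out by a public function.

```fortran
module private_target_sketch
    implicit none
    private
    public :: callback_t, get_callback

    abstract interface
        function answer_interface() result(answer)
            implicit none
            integer :: answer
        end function answer_interface
    end interface

    type :: callback_t
        procedure(answer_interface), pointer, nopass :: run => null()
    end type callback_t

contains

    ! The only public procedure: hands out a pointer to a private one
    function get_callback() result(callback)
        type(callback_t) :: callback

        callback%run => hidden_answer
    end function get_callback

    ! Not listed as public, so this name is invisible outside the module
    function hidden_answer() result(answer)
        integer :: answer

        answer = 42
    end function hidden_answer

end module private_target_sketch

program demo
    use private_target_sketch, only: callback_t, get_callback
    implicit none
    type(callback_t) :: callback

    callback = get_callback()

    ! Calls hidden_answer without ever mentioning its name
    print *, callback%run()
end program demo
```

The program prints 42. Adding hidden_answer to the only list of the use statement would not compile, because the name itself is not accessible; the pointer is what gives us access.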
Collecting unit tests from multiple test modules
What if we have multiple test modules? We’d have many similar collect_*_tests functions that we have to call from the test loop. We can easily define a common abstract interface for those procedures again:
! in the `test_framework` module:
abstract interface
function collect_unit_tests_interface() result(unit_tests)
import :: unit_test_t
implicit none(type, external)
type(unit_test_t), dimension(:), allocatable :: unit_tests
end function collect_unit_tests_interface
end interface
Just as before, we also define a derived type that can hold the reference to the collect procedure, let’s say test_suite_t (which lives in test_framework too):
! in the `test_framework` module:
type :: test_suite_t
procedure(collect_unit_tests_interface), pointer, nopass :: collect
end type test_suite_t
In the test program we can now do two loops: one on the array of test suites, one on the array of test procedures provided by the test suite (or actually, test module):
program tester
use test_polyline, only: collect_polyline_tests
use test_framework, only: unit_test_t, &
test_suite_t
implicit none(type, external)
integer :: error_counter
integer :: unit_test_index
integer :: test_suite_index
logical :: test_failed
type(test_suite_t), dimension(:), allocatable :: test_suites
type(unit_test_t), dimension(:), allocatable :: unit_tests
error_counter = 0
test_suites = [test_suite_t(collect_polyline_tests)]
do test_suite_index = 1, size(test_suites)
unit_tests = test_suites(test_suite_index)%collect()
do unit_test_index = 1, size(unit_tests)
test_failed = .false.
test_failed = unit_tests(unit_test_index)%test_procedure()
if (test_failed) then
error_counter = error_counter + 1
end if
end do
end do
! ...
end program tester
At this point we get a strange compiler error:
error #8180: The procedure pointer and the procedure target must
have matching result types. [COLLECT_POLYLINE_TESTS]
test_suite_t(collect_polyline_tests)
This seems incorrect, because the collect_unit_tests_interface defines the following as its return type:
type(unit_test_t), dimension(:), allocatable :: unit_tests
And the collect_polyline_tests procedure matches this perfectly:
type(unit_test_t), dimension(:), allocatable :: unit_tests
The way to fix this is to introduce a factory function for test_suite_t instances. This is always a good idea, because it supports encapsulation of its data components, but I’m surprised I have to do it in this case. This is the factory function:
! in the `test_framework` module:
function new_test_suite(collect_procedure) result(test_suite)
procedure(collect_unit_tests_interface) :: collect_procedure
type(test_suite_t) :: test_suite
test_suite%collect => collect_procedure
end function new_test_suite
After making this public, we can use it in the program:
-test_suites = [test_suite_t(collect_polyline_tests)]
+test_suites = [new_test_suite(collect_polyline_tests)]
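So far we have only registered a single suite. As a sketch of what this looks like with more than one test module, the example below assembles the framework pieces from this post and adds two invented test modules, test_first and test_second, with trivial tests that pass or fail unconditionally. Those two modules and their procedures are placeholders, not part of the polyline code.

```fortran
module test_framework
    implicit none
    private
    public :: unit_test_t, test_suite_t, new_test_suite

    abstract interface
        function test_procedure_interface() result(test_failed)
            implicit none
            logical :: test_failed
        end function test_procedure_interface
    end interface

    type :: unit_test_t
        procedure(test_procedure_interface), pointer, nopass :: test_procedure
    end type unit_test_t

    abstract interface
        function collect_unit_tests_interface() result(unit_tests)
            import :: unit_test_t
            implicit none
            type(unit_test_t), dimension(:), allocatable :: unit_tests
        end function collect_unit_tests_interface
    end interface

    type :: test_suite_t
        procedure(collect_unit_tests_interface), pointer, nopass :: collect
    end type test_suite_t
contains
    function new_test_suite(collect_procedure) result(test_suite)
        procedure(collect_unit_tests_interface) :: collect_procedure
        type(test_suite_t) :: test_suite
        test_suite%collect => collect_procedure
    end function new_test_suite
end module test_framework

! Hypothetical test module: registers the same passing test twice
module test_first
    use test_framework, only: unit_test_t
    implicit none
    private
    public :: collect_first_tests
contains
    function collect_first_tests() result(unit_tests)
        type(unit_test_t), dimension(:), allocatable :: unit_tests
        unit_tests = [unit_test_t(test_passes), unit_test_t(test_passes)]
    end function collect_first_tests

    function test_passes() result(test_failed)
        logical :: test_failed
        test_failed = .false.
    end function test_passes
end module test_first

! Hypothetical test module: one test that fails on purpose
module test_second
    use test_framework, only: unit_test_t
    implicit none
    private
    public :: collect_second_tests
contains
    function collect_second_tests() result(unit_tests)
        type(unit_test_t), dimension(:), allocatable :: unit_tests
        unit_tests = [unit_test_t(test_fails)]
    end function collect_second_tests

    function test_fails() result(test_failed)
        logical :: test_failed
        test_failed = .true.
    end function test_fails
end module test_second

program tester
    use test_framework, only: unit_test_t, test_suite_t, new_test_suite
    use test_first, only: collect_first_tests
    use test_second, only: collect_second_tests
    implicit none
    integer :: error_counter, test_suite_index, unit_test_index
    type(test_suite_t), dimension(:), allocatable :: test_suites
    type(unit_test_t), dimension(:), allocatable :: unit_tests

    error_counter = 0
    test_suites = [new_test_suite(collect_first_tests), &
                   new_test_suite(collect_second_tests)]

    do test_suite_index = 1, size(test_suites)
        unit_tests = test_suites(test_suite_index)%collect()
        do unit_test_index = 1, size(unit_tests)
            if (unit_tests(unit_test_index)%test_procedure()) then
                error_counter = error_counter + 1
            end if
        end do
    end do
    print *, error_counter, 'tests failed'
end program tester
```

With two passing tests in the first suite and one failing test in the second, the counter ends up at 1. Adding another test module now costs one use statement and one extra element in the test_suites array.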
Could the test framework discover all test modules and test procedures automatically?
There is no mechanism built into the language to allow some kind of enumeration at runtime of all the modules, or all the procedures in those modules. If there were a language feature like that we could easily filter the modules and procedures that start with test_ and automatically register them inside our test framework. Such a language feature is often called reflection and is used in libraries like test frameworks to make test discovery easy. In the case of Fortran, we have to hard-code the test module and procedure names in our code. Alternatives are imaginable though. We could for example take the following steps:
- Parse the .f90 files in our test/ folder
- Collect the names of test modules and test procedures in those files
- Generate the test program that enumerates them all
- Build the test program
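To give an idea of what the parsing and collecting steps could involve, here is a deliberately naive sketch. It works on a few in-memory source lines instead of real files, and the helper extract_test_name is made up for this example. It only recognizes lines that start with a lowercase function keyword directly followed by a name beginning with test_; a real generator would also need to handle other casing, prefixes like pure, subroutines and continuation lines.

```fortran
module discovery_sketch
    implicit none
    private
    public :: extract_test_name

contains

    ! Returns the function name when the line declares a function whose
    ! name starts with "test_", and an empty string otherwise.
    function extract_test_name(line) result(name)
        character(len=*), intent(in) :: line
        character(len=:), allocatable :: name
        character(len=*), parameter :: keyword = 'function '
        character(len=:), allocatable :: trimmed, rest
        integer :: name_end

        name = ''
        trimmed = trim(adjustl(line))
        if (index(trimmed, keyword // 'test_') /= 1) return

        ! Drop the keyword, then cut at the opening parenthesis (if any)
        rest = trimmed(len(keyword) + 1:)
        name_end = index(rest, '(') - 1
        if (name_end < 1) name_end = len(rest)
        name = trim(rest(:name_end))
    end function extract_test_name

end module discovery_sketch

program demo
    use discovery_sketch, only: extract_test_name
    implicit none
    character(len=80), dimension(5) :: source_lines
    character(len=:), allocatable :: name
    integer :: i

    ! Stand-in for the lines read from a test module source file
    source_lines(1) = 'module test_polyline'
    source_lines(2) = 'contains'
    source_lines(3) = '    function test_polyline_with_2_points() result(test_failed)'
    source_lines(4) = '    end function test_polyline_with_2_points'
    source_lines(5) = '    function test_polyline_with_3_points() result(test_failed)'

    ! Emit one registration expression per discovered test function
    do i = 1, size(source_lines)
        name = extract_test_name(source_lines(i))
        if (len(name) > 0) print '(a)', 'unit_test_t(' // name // ')'
    end do
end program demo
```

For these five lines the program prints a unit_test_t(...) expression for each of the two test functions and skips the module, contains and end function lines. Turning that output into a complete, compilable test program is the part that makes this approach a real pre-build step.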
The approach I’m taking in these articles is a lot simpler and doesn’t require pre-build steps. It remains a bit cumbersome to declare tests explicitly, but honestly it’s not that much work. Every test module needs to be declared only once, and from that moment on the only overhead is to add a call to new_unit_test for every new test procedure. It takes only a short time to get used to that.
We have been adding more and more code to the new test_framework module. Giving it that name inspires us to build it out some more. It also inspires another thought: shouldn’t we test our test framework? We’ll continue with this journey in the next post.