Fortran - Testing - Improving temporary test programs
Matthias Noback
How can we know that the function we wrote works as intended? We could run it and manually verify its correctness. The simplest way to do this is to call the function in the main program block, print the output, and compare it with what we expect. Say our function calculates the length of a polyline, stored as a two-dimensional array of reals representing (x, y) coordinates:
pure function polyline_length(coordinates) result(length)
    real(kind=real64), dimension(:, :), intent(in) :: coordinates
    real(kind=real64) :: length

    real(kind=real64) :: distance
    integer :: index

    length = 0.0_real64

    ! Sum the distances between consecutive points
    do index = 1, size(coordinates, 1) - 1
        distance = sqrt((coordinates(index, 1) - &
                         coordinates(index + 1, 1))**2 + &
                        (coordinates(index, 2) - &
                         coordinates(index + 1, 2))**2)
        length = length + distance
    end do
end function polyline_length
Temporary test programs
We could modify the main program block of our actual program, but it’s a lot simpler and safer to create a separate “throw-away” test program with only the code we need: a short program that sets up some coordinates, calls the function, then prints the result:
program tester
    use iso_fortran_env, only: real64

    implicit none(type, external)

    real(kind=real64), dimension(2, 2) :: two_points

    ! x = 0, y = 0
    two_points(1, 1) = 0.0_real64
    two_points(1, 2) = 0.0_real64

    ! x = 3, y = 4
    two_points(2, 1) = 3.0_real64
    two_points(2, 2) = 4.0_real64

    ! Should be 5.0
    print *, polyline_length(two_points)
end program tester
Visually this amounts to the following polyline:

[figure: a single segment from (0, 0) to (3, 4)]
Manually comparing output
Running the program, we can compare the value mentioned in the comment (“5.0”) with the actual output (5.00000000000000). We happily conclude: this function works. Or does it? It’s supposed to work for a larger number of points, so how many points should we check? Passing 2 points works, but maybe we should also try the function with 3 points (which define 2 polyline segments). We may also want to show that it works with distances that are not whole numbers, unlike the 5.0 above:
real(kind=real64), dimension(2, 2) :: two_points
real(kind=real64), dimension(3, 2) :: three_points
! ... (the two_points test)
! x = 0, y = 0
three_points(1, 1) = 0.0_real64
three_points(1, 2) = 0.0_real64
! x = 3, y = 4
three_points(2, 1) = 3.0_real64
three_points(2, 2) = 4.0_real64
! x = 4, y = 5
three_points(3, 1) = 4.0_real64
three_points(3, 2) = 5.0_real64
! Should be 5.0 + square root of 2, so about 6.41
print *, polyline_length(three_points)
These coordinates represent the following polyline:

[figure: two segments, from (0, 0) to (3, 4) and from (3, 4) to (4, 5)]
Looking at the output of the test program, we see that the function still behaves well: it returns 6.41421356237309, which is indeed 5.0 plus the square root of 2.
Using assertion functions
The pattern of this ad hoc testing workflow is: we think of some input and define the expected output, then we run the program and manually compare the actual output to our expectations. Although this approach gives us quick feedback on the quality of our work, it’s still relatively tedious. We can make it easier by delegating the comparison to the computer itself. Let’s write a subroutine that we can pass two reals to: the value we expect, and the value that the function actually returned. If these values don’t match, we just stop execution, explaining what was wrong:
subroutine assert_same(expected, actual)
    real(kind=real64), intent(in) :: expected
    real(kind=real64), intent(in) :: actual

    if (actual /= expected) then
        print *, 'Actual: ', actual
        print *, 'Expected: ', expected

        error stop 'Reals are not the same'
    end if
end subroutine assert_same
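To get a feel for how it behaves, here is a hypothetical stand-alone example (not part of the test program): asserting on two identical literals does nothing, while asserting on two different values prints both and stops the program.

call assert_same(5.0_real64, 5.0_real64)   ! passes silently
call assert_same(5.0_real64, 6.0_real64)   ! prints both values, then stops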
I’m using some traditional testing terminology here: an assertion states that something should be true; if it isn’t, we want the program to stop execution, because something is wrong and we should know about it.
We can now replace our existing print statements with calls to assert_same, e.g.
-print *, polyline_length(three_points)
+call assert_same(6.41421356237309_real64, &
+    polyline_length(three_points))
Running this, we have to conclude that our assert_same procedure is broken. It pretty much always fails:
Actual: 6.41421356237309
Expected: 6.41421356237309
Reals are not the same
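The printed values look identical, and yet the /= comparison claims they differ. A quick throw-away experiment (a hypothetical one-liner, not part of the test program) makes the problem visible by printing the difference instead of the values themselves:

! Hypothetical one-off check: how large is the mismatch really?
print *, polyline_length(three_points) - 6.41421356237309_real64

With real64 values this prints a tiny but non-zero number, somewhere on the order of 1.0e-15, which is enough for /= to consider the two reals different.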
Dealing with error margins
When comparing reals we can’t use == and /=, because floating-point numbers only have limited precision. A computed value and a hard-coded value are therefore likely to differ by a tiny amount. This difference can be considered a rounding error or error margin.

We can improve the comparison between reals by switching concepts: we shouldn’t say “assert same” but “assert equals”, to indicate that these two numbers aren’t exactly the same, but would be considered “equal” when used in calculations. Then we also need to define how much difference we would tolerate. This tolerance should be another real, passed as an argument, so we can fine-tune the error margin (often called “epsilon”) when needed. Note: since we are potentially dealing with negative differences, we have to make the difference absolute before we compare it to the (presumably positive) epsilon.
subroutine assert_equals(expected, actual, epsilon)
    real(kind=real64), intent(in) :: expected
    real(kind=real64), intent(in) :: actual
    real(kind=real64), intent(in) :: epsilon

    if (abs(actual - expected) > epsilon) then
        print *, 'Actual: ', actual
        print *, 'Expected: ', expected
        print *, 'Epsilon: ', epsilon

        error stop 'Reals are not equal'
    end if
end subroutine assert_equals
Whenever we call assert_equals, we also pass a very small allowed difference as the last argument:
call assert_equals(6.41421356237309_real64, &
polyline_length(three_points), &
1.0e-10_real64)
This produces the desired effect: the test passes, because the hard-coded real and the real returned by polyline_length can be considered equal. As long as a call to assert_equals doesn’t stop the program, we know that the actual and expected values really matched. We no longer have to manually compare output. If the program reaches the end of the code, everything should be okay.
Not stopping on the first failed assertion
There is one problem related to that error stop statement in assert_equals: as soon as the first test fails, the whole program stops. It would be much more useful if all tests were executed, even when one of them fails. We would then get an overview of all failing tests, not just the first one, which helps us find the issue faster. And when we fix the code, we can quickly find out whether our change fixed all the failing tests, or left some of them still failing.
The simplest way to do this is to let assertion functions indicate whether the comparison failed, instead of abruptly stopping the program. We could do this by passing a mutable (intent(out)) argument, e.g. assertion_failed, which the subroutine may set to .true.:
-subroutine assert_equals(expected, actual, epsilon)
+subroutine assert_equals(expected, actual, epsilon, assertion_failed)
real(kind=real64), intent(in) :: expected
real(kind=real64), intent(in) :: actual
real(kind=real64), intent(in) :: epsilon
+ logical, intent(out) :: assertion_failed
+ assertion_failed = .false.
if (abs(actual - expected) > epsilon) then
print *, 'Actual: ', actual
print *, 'Expected: ', expected
print *, 'Epsilon: ', epsilon
- error stop 'Reals are not equal'
+ print *, 'Reals are not equal'
+ assertion_failed = .true.
end if
end subroutine assert_equals
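For reference, with these changes applied the subroutine now reads in full:

subroutine assert_equals(expected, actual, epsilon, assertion_failed)
    real(kind=real64), intent(in) :: expected
    real(kind=real64), intent(in) :: actual
    real(kind=real64), intent(in) :: epsilon
    logical, intent(out) :: assertion_failed

    ! Assume success until the comparison proves otherwise
    assertion_failed = .false.

    if (abs(actual - expected) > epsilon) then
        print *, 'Actual: ', actual
        print *, 'Expected: ', expected
        print *, 'Epsilon: ', epsilon
        print *, 'Reals are not equal'

        assertion_failed = .true.
    end if
end subroutine assert_equals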
We update the test program as follows. To keep track of failed tests we add two local variables, making sure the counter starts at zero before the first test:

integer :: error_counter
logical :: assertion_failed

! ... and before the first test:
error_counter = 0
To every assertion we pass assertion_failed:
call assert_equals(5.0_real64, &
polyline_length(two_points), 1.0e-10_real64, &
assertion_failed)
And after every call to assert_equals, we check whether assertion_failed was set to .true.:
if (assertion_failed) then
    error_counter = error_counter + 1
end if
Then at the end of the program we know if any call to assert_equals failed by inspecting the error_counter variable:
if (error_counter > 0) then
    print *, error_counter, 'tests failed'

    error stop
end if
Here is the example output when the assertions fail:
Actual: 6.00000000000000
Expected: 5.00000000000000
Epsilon: 1.000000000000000E-010
Reals are not equal
...
2 tests failed
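Putting all the pieces together, the whole test program might now look roughly like this. This is just a sketch that assumes polyline_length and the updated assert_equals live in the contains section of the test program; keeping them in a separate module would work just as well. With a correct polyline_length, both assertions pass and the program ends without printing anything.

program tester
    use iso_fortran_env, only: real64

    implicit none(type, external)

    real(kind=real64), dimension(2, 2) :: two_points
    real(kind=real64), dimension(3, 2) :: three_points
    integer :: error_counter
    logical :: assertion_failed

    error_counter = 0

    ! x = 0, y = 0
    two_points(1, 1) = 0.0_real64
    two_points(1, 2) = 0.0_real64
    ! x = 3, y = 4
    two_points(2, 1) = 3.0_real64
    two_points(2, 2) = 4.0_real64

    ! Should be 5.0
    call assert_equals(5.0_real64, &
        polyline_length(two_points), 1.0e-10_real64, &
        assertion_failed)
    if (assertion_failed) then
        error_counter = error_counter + 1
    end if

    ! x = 0, y = 0
    three_points(1, 1) = 0.0_real64
    three_points(1, 2) = 0.0_real64
    ! x = 3, y = 4
    three_points(2, 1) = 3.0_real64
    three_points(2, 2) = 4.0_real64
    ! x = 4, y = 5
    three_points(3, 1) = 4.0_real64
    three_points(3, 2) = 5.0_real64

    ! Should be 5.0 + square root of 2
    call assert_equals(6.41421356237309_real64, &
        polyline_length(three_points), 1.0e-10_real64, &
        assertion_failed)
    if (assertion_failed) then
        error_counter = error_counter + 1
    end if

    ! Report the overall result
    if (error_counter > 0) then
        print *, error_counter, 'tests failed'
        error stop
    end if

contains

    ! The function under test, from the top of this post
    pure function polyline_length(coordinates) result(length)
        real(kind=real64), dimension(:, :), intent(in) :: coordinates
        real(kind=real64) :: length
        real(kind=real64) :: distance
        integer :: index

        length = 0.0_real64
        do index = 1, size(coordinates, 1) - 1
            distance = sqrt((coordinates(index, 1) - &
                             coordinates(index + 1, 1))**2 + &
                            (coordinates(index, 2) - &
                             coordinates(index + 1, 2))**2)
            length = length + distance
        end do
    end function polyline_length

    ! The assertion subroutine developed above
    subroutine assert_equals(expected, actual, epsilon, assertion_failed)
        real(kind=real64), intent(in) :: expected
        real(kind=real64), intent(in) :: actual
        real(kind=real64), intent(in) :: epsilon
        logical, intent(out) :: assertion_failed

        assertion_failed = .false.
        if (abs(actual - expected) > epsilon) then
            print *, 'Actual: ', actual
            print *, 'Expected: ', expected
            print *, 'Epsilon: ', epsilon
            print *, 'Reals are not equal'
            assertion_failed = .true.
        end if
    end subroutine assert_equals
end program tester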
Printing a description of the test
Our test program is getting better by the minute, but we still have a maintainability issue: if a test fails, we don’t see which test it was, which makes debugging unnecessarily hard. One way to fix this is to print a short description of what we’re about to test before every test:
+print *, 'Calculate length of polyline with 2 points'
! x = 0, y = 0
two_points(1, 1) = 0.0_real64
two_points(1, 2) = 0.0_real64
! ...
If an assertion fails, we can now visually link the error message to the test:
Calculate length of polyline with 2 points
...
Reals are not equal
In the next post we’ll consider what happens when the number of tests keeps growing, and what we can do to keep our test program maintainable. After all, now that we’re spending some time improving the quality of our temporary tests, it would be sad to delete them, wasting the development effort. As the authors of the function, we know how it’s supposed to work, and keeping this knowledge codified in tests will prove to be a great help to future maintainers. So let’s make these tests permanent…