Fortran - Testing - Improving the design of the test framework - Part 1
Matthias Noback
The right approach to software design, in my opinion, is to work with what you have, build more functionality on top of it, notice the design issues that emerge, and then fix those issues by redesigning ad hoc. In my experience, most software design efforts go wrong in one of two ways:
- We realize there are design issues, yet we don’t fix them. Maybe it’s too scary.
- We fix design issues, but too early, when we can’t yet know if the new design is better. We lack feedback from actual use.
In this article series, I’m happy to report, I haven’t spent too much time designing upfront. But now I want to tackle some issues that I encountered while adding more tests. Let’s look at one of our tests:
function test_empty_string() result(test_failed)
class(test_failed_t), allocatable :: test_failed
type(assertion_failed_t), allocatable :: assertion_failed
assertion_failed = assert_equals('', &
str_to_upper(''))
if (allocated(assertion_failed)) then
test_failed = assertion_failed
end if
end function test_empty_string
The current design of the test framework has the following problems:
- Every test requires a number of lines to be duplicated: we need to manually convert a failed assertion into a failed test result.
- The framework and the tests “abuse” allocatable return values for indicating whether a test or an assertion failed.
- It’s nice that the conceptual and logical link “the test fails if the assertion fails” is visible in the code, but this link is repeated over and over. Besides, the types we use for them are part of the same hierarchy. Do they really need to be separate concepts?
I realize using allocatable variables is quite “idiomatic” for Fortran programming, even using them as intent(out) arguments or function results. However, there are several huge disadvantages of using them to communicate function results:
- We always have to check if they are allocated before we can use them.
- Memory allocation and checking for it is a very low-level (even machine-level), technical thing. For code that is otherwise very high-level, it’s not great that we have to rely on allocated checks.
- We can’t pass an allocatable return value directly as an allocatable argument again. This means we always have to assign the return value to a local variable first, only then can we pass that variable again as an argument to another function.
The last point deserves a code sample. Because assert_equals returns a type(assertion_failed_t), allocatable, the following is impossible:
call process(assert_equals( &
! ...
))
You have to store the return value of assert_equals in a local variable, matching the specification of the return value:
type(assertion_failed_t), allocatable :: assertion_failed
assertion_failed = assert_equals( &
! ...
)
call process(assertion_failed)
In other words, making a return variable allocatable to indicate optionality prevents us from using the function call directly inside an expression.
In the following sections I’m making some changes that will fix all the design issues, including those related to the use of allocatable return values. This will make it a lot easier to write tests, since it will require less code, and it will be harder to make mistakes like forgetting to return after a failed assertion. On top of that, it paves the way for us to add some useful new features.
A reusable type for “test result”
First we get rid of the dedicated types for failed assertions and failed tests. We just stick to test_failed_t. This required some find-and-replace actions. We can also change usages of class(test_failed_t) into type(test_failed_t) because we “collapsed” the hierarchy and want to deal only with this specific type:
-class(test_failed_t), allocatable :: test_failed
+type(test_failed_t), allocatable :: test_failed
-type(assertion_failed_t), allocatable :: assertion_failed
+type(test_failed_t), allocatable :: assertion_failed
The use of allocatable here means: if it’s allocated, it means the test failed. If it’s not, it means the test passed. As we discussed, we should get rid of allocatables that represent optionality. The way to do that is to:
- Remove allocatable for variables of this type.
- Add a logical data component to the type that can be set to .true. if there was a test or assertion failure.
- Generalize the name to indicate it can be used for both failed and passed tests/assertions.
-type :: test_failed_t
+type :: test_result_t
+ logical :: failed = .false.
character(len=:), allocatable :: message
-end type test_failed_t
+end type test_result_t
In our tests and assertion functions, wherever we used an allocatable variable we now declare a regular variable. We can just start to populate its data components in case of an assertion or test error:
-type(test_failed_t), allocatable :: test_failed
+type(test_result_t) :: test_result
-test_failed = test_failed_t('The error message')
+test_result%failed = .true.
+test_result%message = 'The error message'
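To see the effect of this change in isolation, here is a minimal, self-contained sketch. The assert_true function and the module and program names are made up for this illustration and are not part of the framework; the point is only that the function always returns a usable test_result_t, whether the check passed or not.

```fortran
module test_result_sketch
    implicit none
    private
    public :: test_result_t, assert_true

    type :: test_result_t
        ! Default initialization: a fresh result counts as "not failed".
        logical :: failed = .false.
        character(len=:), allocatable :: message
    end type test_result_t

contains

    ! Hypothetical assertion, used only for this illustration.
    function assert_true(condition) result(assertion_result)
        logical, intent(in) :: condition
        type(test_result_t) :: assertion_result

        if (.not. condition) then
            assertion_result%failed = .true.
            assertion_result%message = 'Condition is not true.'
        end if
    end function assert_true

end module test_result_sketch

program demo
    use test_result_sketch, only: test_result_t, assert_true
    implicit none
    type(test_result_t) :: passing, failing

    passing = assert_true(1 + 1 == 2)
    failing = assert_true(1 + 1 == 3)

    ! No allocated() checks: both variables hold a complete result.
    print '(A, L1)', 'first failed:  ', passing%failed
    print '(A, L1)', 'second failed: ', failing%failed
    if (failing%failed) print '(A)', failing%message
end program demo
```

The message component is still allocatable, but only because its length varies; it no longer doubles as the pass/fail signal.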
Tests now look as follows:
function test_move_point_to_the_right() result(test_result)
type(test_result_t) :: test_result
type(test_result_t) :: assertion_result
type(point_t), allocatable :: expected, actual
expected = point_t(3.0_real64, 4.0_real64)
actual = move_x(point_t(1.0_real64, 4.0_real64), 2.0_real64)
assertion_result = assert_equals(expected, actual)
if (assertion_result%failed) then
test_result%failed = .true.
test_result%message = assertion_result%message
end if
end function test_move_point_to_the_right
We still need that local variable assertion_result. We no longer check for allocated though, because we can safely assume the function will always return a test_result_t. Instead, we can immediately check for %failed. Anyway, we still have a lot of code duplication because of this. We’ll fix that next.
Since tests will now always return a test_result_t, the test runner itself needs to be modified:
-class(test_failed_t), allocatable :: test_failed
+type(test_result_t) :: test_result
-if (allocated(test_failed)) then
+if (test_result%failed) then
Letting the test result handle assertion results
In order to remove the duplicated code for copying the assertion result into the test result, we may extract this code to a type-bound procedure on test_result_t itself, so it can handle or process an assertion result. We first add the type-bound procedure process:
type :: test_result_t
logical :: failed = .false.
character(len=:), allocatable :: message
contains
+ procedure :: process => test_result_process
end type test_result_t
subroutine test_result_process(self, assertion_result)
class(test_result_t), intent(inout) :: self
type(test_result_t), intent(in) :: assertion_result
if (assertion_result%failed) then
self%failed = .true.
self%message = assertion_result%message
end if
end subroutine test_result_process
At once, we can get rid of the local variable assertion_result and the duplicated if clause:
function test_empty_string() result(test_result)
type(test_result_t) :: test_result
- type(test_result_t) :: assertion_result
-
- assertion_result = assert_equals('', str_to_upper(''))
- if (assertion_result%failed) then
- test_result%failed = .true.
- test_result%message = assertion_result%message
- end if
+ call test_result%process(assert_equals('', str_to_upper('')))
end function test_empty_string
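Putting the pieces of this section together, a complete program could look roughly like the sketch below. The integer version of assert_equals, the test_one_plus_one test, and the module and program names are assumptions made for this example; the type and its process procedure follow the code shown above.

```fortran
module process_sketch
    implicit none
    private
    public :: test_result_t, assert_equals, test_one_plus_one

    type :: test_result_t
        logical :: failed = .false.
        character(len=:), allocatable :: message
    contains
        procedure :: process => test_result_process
    end type test_result_t

contains

    ! Copies a failed assertion result into the test result.
    subroutine test_result_process(self, assertion_result)
        class(test_result_t), intent(inout) :: self
        type(test_result_t), intent(in) :: assertion_result

        if (assertion_result%failed) then
            self%failed = .true.
            self%message = assertion_result%message
        end if
    end subroutine test_result_process

    ! Hypothetical integer assertion, for illustration only.
    function assert_equals(expected, actual) result(assertion_result)
        integer, intent(in) :: expected, actual
        type(test_result_t) :: assertion_result

        if (expected /= actual) then
            assertion_result%failed = .true.
            assertion_result%message = 'Integers are not equal.'
        end if
    end function assert_equals

    ! A deliberately failing test, so the demo has something to report.
    function test_one_plus_one() result(test_result)
        type(test_result_t) :: test_result

        ! The function result is passed straight on; no local variable needed.
        call test_result%process(assert_equals(3, 1 + 1))
    end function test_one_plus_one

end module process_sketch

program demo
    use process_sketch, only: test_result_t, test_one_plus_one
    implicit none
    type(test_result_t) :: test_result

    test_result = test_one_plus_one()
    if (test_result%failed) then
        print '(A)', 'FAIL: ' // test_result%message
    else
        print '(A)', 'PASS'
    end if
end program demo
```

Because the dummy argument assertion_result is a plain intent(in) value rather than an allocatable, the call to assert_equals can sit directly in the argument list.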
Encapsulating data components of test_result_t
Usually when a refactoring project leads to more types or more specific types being introduced, or when we start using type-bound procedures instead of direct data component assignments, several things happen:
- We end up making implicit concepts explicit. For example, earlier we only had a concept in our code for “test failed”, now we have a concept for “test result”, and zooming in on that concept we learn that a test may fail or not.
- We recognize further steps that could be taken. The code becomes simpler, self-explanatory, and it’s easier to reason with the concepts that we now have. For instance, we may recognize a potential link between test_results_t (plural) and test_result_t (singular). We considered test results to be just a counter for failed, passed, and all tests. But now we realize these numbers can just as easily be derived from an array of all the actual test_result_t instances.
- We also recognize more opportunities for encapsulation, i.e. manipulating the state of our type instances via type-bound procedures instead of directly manipulating data components. Encapsulation gives us more options for future refactorings inside those types, without affecting users.
Let’s work on this last idea. All assertion functions will have code that’s similar to this:
assertion_result%failed = .true.
assertion_result%message = 'Reals are not equal.'
We can combine these two steps in a single type-bound procedure:
type :: test_result_t
logical :: failed = .false.
character(len=:), allocatable :: message
contains
procedure :: process => test_result_process
+ procedure :: fail => test_result_fail
end type test_result_t
The implementation of test_result_fail combines the two steps:
subroutine test_result_fail(self, message)
class(test_result_t), intent(inout) :: self
character(len=*), intent(in) :: message
self%failed = .true.
self%message = message
end subroutine test_result_fail
Because message is a required argument, this ensures that whenever failed is .true., there will also be a message.
All assertion functions may now be updated as follows:
-assertion_result%failed = .true.
-assertion_result%message = 'Reals are not equal.'
+call assertion_result%fail('Reals are not equal.')
To complete the encapsulation process, we should make the data components private. Since tests often need to check if an assertion failed before they make another assertion, we need to provide at least a public type-bound procedure has_failed(), which returns the current value of the failed data component:
type :: test_result_t
logical :: failed = .false.
character(len=:), allocatable :: message
contains
procedure :: process => test_result_process
procedure :: fail => test_result_fail
+ procedure :: has_failed => test_result_has_failed
end type test_result_t
The test_result_has_failed procedure is very simple:
pure function test_result_has_failed(self) result(has_failed)
class(test_result_t), intent(in) :: self
logical :: has_failed
has_failed = self%failed
end function test_result_has_failed
Now only the test framework itself needs direct access to the data components of test_result_t, so we can make them private:
type :: test_result_t
+ private
logical :: failed = .false.
character(len=:), allocatable :: message
contains
procedure :: process => test_result_process
procedure :: fail => test_result_fail
procedure :: has_failed => test_result_has_failed
end type test_result_t
This makes them effectively immutable outside the framework itself: tests can’t “cheat” with test results by modifying the state, even if it happens by accident.
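As a rough sketch of what this looks like from the outside, the program below uses the type only through its type-bound procedures. The module and program names are invented for the example, and process is left out to keep it short.

```fortran
module encapsulation_sketch
    implicit none
    private
    public :: test_result_t

    type :: test_result_t
        private
        logical :: failed = .false.
        character(len=:), allocatable :: message
    contains
        procedure :: fail => test_result_fail
        procedure :: has_failed => test_result_has_failed
    end type test_result_t

contains

    subroutine test_result_fail(self, message)
        class(test_result_t), intent(inout) :: self
        character(len=*), intent(in) :: message

        self%failed = .true.
        self%message = message
    end subroutine test_result_fail

    pure function test_result_has_failed(self) result(has_failed)
        class(test_result_t), intent(in) :: self
        logical :: has_failed

        has_failed = self%failed
    end function test_result_has_failed

end module encapsulation_sketch

program demo
    use encapsulation_sketch, only: test_result_t
    implicit none
    type(test_result_t) :: test_result

    print '(A, L1)', 'failed before: ', test_result%has_failed()
    call test_result%fail('Something went wrong.')
    print '(A, L1)', 'failed after:  ', test_result%has_failed()

    ! The next line is commented out on purpose: the compiler should
    ! reject it, because the component is private to the module.
    ! test_result%failed = .false.
end program demo
```

Note that nothing outside the module can read the message either, which is why printing stays inside the framework.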
For users of the test framework, it’s now very clear how to use test_result_t. You can only modify it in ways that the framework allows. If your IDE understands Fortran code (like VS Code with the Modern Fortran extension), its “auto-complete” overview lists exactly the things you can do with a test_result_t variable: process, fail, and has_failed.

Preventing “uncaught” errors
With the improved and encapsulated test_result_t type, we can do some interesting things. As an example, we can prevent a mistake that’s easily made by developers who write tests or custom assertion functions. We discussed this before; if a test has multiple assertions (which is quite common), we should not forget to return, or else our assertion error will go unnoticed, and may even get overwritten by another error:
if (len(expected) /= len(actual)) then
call assertion_result%fail('Strings have different lengths.')
! Don't forget:
return
end if
if (expected /= actual) then
call assertion_result%fail('Strings have different content.')
! Don't forget:
return
end if
Let’s say the developer forgets the first return statement. Then the next time a call to fail() happens, the failed data component will already be .true. and a message has been set too. If we take no precautions, fail() will set failed to .true. again, which has no effect, but it will overwrite the message. It would be more helpful if we could keep the original message, and somehow make a note of the fact that the developer forgot to return. A good idea would be to add another data component uncaught which we set to .true. if fail() is called twice:
type :: test_result_t
private
logical :: failed = .false.
+ logical :: uncaught = .false.
character(len=:), allocatable :: message
contains
procedure :: process => test_result_process
procedure :: fail => test_result_fail
procedure :: has_failed => test_result_has_failed
end type test_result_t
And the modifications to the fail() procedure:
subroutine test_result_fail(self, message)
class(test_result_t), intent(inout) :: self
character(len=*), intent(in) :: message
+ if (self%failed) then
+ self%uncaught = .true.
+ return
+ end if
self%failed = .true.
self%message = message
end subroutine test_result_fail
Finally, we should update print_test_result_failed so it prints a special message in the case of an uncaught error:
subroutine print_test_result_failed(test_result)
type(test_result_t), intent(in) :: test_result
write (stdout, '(A)') 'FAIL'
+ if (test_result%uncaught) then
+ write (stdout, '(A)', advance='no') 'UNCAUGHT ERROR '
+ end if
write (stdout, '(A)') test_result%message
end subroutine print_test_result_failed
We also need to propagate the value of the uncaught flag from the assertion result to the test result:
subroutine test_result_process(self, assertion_result)
class(test_result_t), intent(inout) :: self
type(test_result_t), intent(in) :: assertion_result
if (assertion_result%failed) then
call self%fail(assertion_result%message)
+ if (assertion_result%uncaught) then
+ self%uncaught = assertion_result%uncaught
+ end if
end if
end subroutine test_result_process
With these changes in place, the output for the example with the forgotten return will contain:
UNCAUGHT ERROR Strings have different lengths.
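To try this out end to end, here is a self-contained sketch. The assert_same_string function is a hypothetical assertion that deliberately leaves out both return statements; the module and program names are also made up. The fail and print procedures follow the code above.

```fortran
module uncaught_sketch
    use, intrinsic :: iso_fortran_env, only: stdout => output_unit
    implicit none
    private
    public :: test_result_t, assert_same_string, print_test_result_failed

    type :: test_result_t
        private
        logical :: failed = .false.
        logical :: uncaught = .false.
        character(len=:), allocatable :: message
    contains
        procedure :: fail => test_result_fail
    end type test_result_t

contains

    subroutine test_result_fail(self, message)
        class(test_result_t), intent(inout) :: self
        character(len=*), intent(in) :: message

        ! A second call keeps the first message and only raises the flag.
        if (self%failed) then
            self%uncaught = .true.
            return
        end if
        self%failed = .true.
        self%message = message
    end subroutine test_result_fail

    ! Hypothetical assertion that forgets to return after a failure.
    function assert_same_string(expected, actual) result(assertion_result)
        character(len=*), intent(in) :: expected, actual
        type(test_result_t) :: assertion_result

        if (len(expected) /= len(actual)) then
            call assertion_result%fail('Strings have different lengths.')
        end if
        if (expected /= actual) then
            call assertion_result%fail('Strings have different content.')
        end if
    end function assert_same_string

    subroutine print_test_result_failed(test_result)
        type(test_result_t), intent(in) :: test_result

        write (stdout, '(A)') 'FAIL'
        if (test_result%uncaught) then
            write (stdout, '(A)', advance='no') 'UNCAUGHT ERROR '
        end if
        write (stdout, '(A)') test_result%message
    end subroutine print_test_result_failed

end module uncaught_sketch

program demo
    use uncaught_sketch, only: test_result_t, assert_same_string, &
        print_test_result_failed
    implicit none
    type(test_result_t) :: assertion_result

    ! Different lengths and different content: fail() is called twice.
    assertion_result = assert_same_string('abc', 'xy')
    call print_test_result_failed(assertion_result)
end program demo
```

The first message survives, and the UNCAUGHT ERROR prefix tells the developer that a return is missing somewhere after it.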
Reporting “risky” tests
When a test makes no assertion, and the execution of the test procedure just finishes at some point, the test result will be “passed” (at least, it’s not “failed”). This is a programming mistake: there should be at least one call to an assertion function, because that’s the way we can indicate what our expectations are. Only if all assertions pass (even if it’s just one assertion) can we call the test properly passed. Any other test should be reported as “risky”. We might even go as far as to fail such a test manually.
A nice solution is to add an integer data component called assertion_count to test_result_t:
type :: test_result_t
private
logical :: failed = .false.
logical :: uncaught = .false.
character(len=:), allocatable :: message
+ integer :: assertion_count = 0
contains
procedure :: process => test_result_process
procedure :: fail => test_result_fail
procedure :: has_failed => test_result_has_failed
end type test_result_t
We increment this value on every call to process(assertion_result). We count assertions recursively: if one assertion uses two assertions itself, the total should be three:
subroutine test_result_process(self, assertion_result)
class(test_result_t), intent(inout) :: self
type(test_result_t), intent(in) :: assertion_result
self%assertion_count = self%assertion_count + &
assertion_result%assertion_count + 1
if (assertion_result%failed) then
! ...
end if
end subroutine test_result_process
Finally, we add an is_risky type-bound procedure to test_result_t:
type :: test_result_t
! ...
contains
procedure :: process => test_result_process
procedure :: fail => test_result_fail
+ procedure :: is_risky => test_result_is_risky
end type test_result_t
pure function test_result_is_risky(self) result(is_risky)
class(test_result_t), intent(in) :: self
logical :: is_risky
is_risky = self%assertion_count == 0
end function test_result_is_risky
We can use this new information in the output printer. First, we combine the separate procedures for passed and failed tests into a single print_test_result subroutine that receives the test_result_t instance. Then we can make a nice decision tree to print the right information in the terminal:
subroutine print_test_result(test_result)
type(test_result_t), intent(in) :: test_result
if (test_result%failed) then
write (stdout, '(A)') 'FAIL'
if (test_result%uncaught) then
write (stdout, '(A)', advance='no') 'UNCAUGHT ERROR '
end if
write (stdout, '(A)') test_result%message
else
if (test_result%is_risky()) then
write (stdout, '(A)') 'RISKY'
else
write (stdout, '(A)') 'PASS'
end if
end if
end subroutine print_test_result
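The following sketch shows the risky check in a small, runnable form. The assert_true function, the print_label helper, and the module and program names are assumptions for this example, and the uncaught flag is left out to keep it short.

```fortran
module risky_sketch
    implicit none
    private
    public :: test_result_t, assert_true

    type :: test_result_t
        private
        logical :: failed = .false.
        character(len=:), allocatable :: message
        integer :: assertion_count = 0
    contains
        procedure :: process => test_result_process
        procedure :: fail => test_result_fail
        procedure :: has_failed => test_result_has_failed
        procedure :: is_risky => test_result_is_risky
    end type test_result_t

contains

    subroutine test_result_process(self, assertion_result)
        class(test_result_t), intent(inout) :: self
        type(test_result_t), intent(in) :: assertion_result

        ! Count this assertion plus any assertions it made itself.
        self%assertion_count = self%assertion_count + &
            assertion_result%assertion_count + 1
        if (assertion_result%failed) then
            call self%fail(assertion_result%message)
        end if
    end subroutine test_result_process

    subroutine test_result_fail(self, message)
        class(test_result_t), intent(inout) :: self
        character(len=*), intent(in) :: message

        self%failed = .true.
        self%message = message
    end subroutine test_result_fail

    pure function test_result_has_failed(self) result(has_failed)
        class(test_result_t), intent(in) :: self
        logical :: has_failed

        has_failed = self%failed
    end function test_result_has_failed

    pure function test_result_is_risky(self) result(is_risky)
        class(test_result_t), intent(in) :: self
        logical :: is_risky

        is_risky = self%assertion_count == 0
    end function test_result_is_risky

    ! Hypothetical assertion, used only for this illustration.
    function assert_true(condition) result(assertion_result)
        logical, intent(in) :: condition
        type(test_result_t) :: assertion_result

        if (.not. condition) call assertion_result%fail('Condition is not true.')
    end function assert_true

end module risky_sketch

program demo
    use risky_sketch, only: test_result_t, assert_true
    implicit none
    type(test_result_t) :: with_assertion, without_assertion

    ! Only the first result ever processes an assertion.
    call with_assertion%process(assert_true(.true.))

    call print_label('with assertion', with_assertion)
    call print_label('without assertion', without_assertion)

contains

    ! Hypothetical helper mirroring the decision tree above.
    subroutine print_label(name, test_result)
        character(len=*), intent(in) :: name
        type(test_result_t), intent(in) :: test_result

        if (test_result%has_failed()) then
            print '(A)', name // ': FAIL'
        else if (test_result%is_risky()) then
            print '(A)', name // ': RISKY'
        else
            print '(A)', name // ': PASS'
        end if
    end subroutine print_label

end program demo
```

The result that processed one passing assertion should be labelled PASS, while the untouched one should be labelled RISKY.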
In the next post we’ll continue with the redesign of the framework by splitting the one large test_framework module into smaller modules and submodules.