Fortran - Testing - Unit tests and test suites

Matthias Noback

January 16, 2026

In the previous post, we spent some time improving a number of “temporary” tests. We introduced an assertion function to compare two real values, and we prevented failing tests from stopping the entire test program.

Moving tests to their own procedures

The approach of adding tests directly in the main test program block doesn’t really scale, as the saying goes. polyline_length may be a simple function, but for more complicated functions with multiple branches and loops, we’d have to write many of these tests. The test program keeps growing, and eventually it becomes a mess. It doesn’t help that all local variables have to be declared at the top. Even if that weren’t needed, it wouldn’t be very clear where a test starts or ends. Everything happens in the same scope, so data allocated for one test stays around for all the tests that follow, which can become a memory management issue too. Sharing one scope also makes it hard to delete tests we no longer want: removing lines from a big test program likely breaks other tests, or we forget to remove things we no longer need.

A simple solution is to move every test to its own procedure, with its own local state that is automatically cleaned up when the procedure returns. To ensure the error counting mechanism in the test program can still be used, we’ll let each test procedure return a logical test_failed result. As an example, we extract the following function for testing the polyline with 2 points. Note that we assign the result of assert_equals directly to the test_failed result variable. This indicates correctly that the test itself fails if the assertion fails:

function test_polyline_with_2_points() result(test_failed)
   logical :: test_failed

   real(kind=real64), dimension(2, 2) :: two_points

   print *, 'Calculate length of polyline with 2 points'

   ! ...

   test_failed = assert_equals(5.0_real64, &
                      polyline_length(two_points), &
                      1.0e-10_real64)
end function test_polyline_with_2_points

The main program block looks like this now:

integer :: error_counter
logical :: test_failed

! Without an explicit starting value the counter is undefined
error_counter = 0

test_failed = test_polyline_with_2_points()
if (test_failed) then
   error_counter = error_counter + 1
end if

test_failed = test_polyline_with_3_points()
if (test_failed) then
   error_counter = error_counter + 1
end if

if (error_counter > 0) then
   print *, error_counter, 'tests failed'
   error stop
end if

It just calls all test procedures. There’s no irrelevant local state left, which gives us a nice and clean test program. The only code quality issue here is the duplication after each test. We’ll be able to fix that soon.
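To see all the pieces working together, here is a minimal, self-contained sketch. The polyline_length and assert_equals functions below are hypothetical stand-ins for the ones from the previous posts: I’m assuming that every column of the points array holds the x and y coordinates of one point, and that assert_equals returns .true. when the assertion fails. The module name polyline_sketch is made up for this example.

```fortran
module polyline_sketch
   use iso_fortran_env, only: real64

   implicit none(type, external)

   private

   public :: polyline_length
   public :: assert_equals

contains
   ! Hypothetical stand-in: sums the distances between consecutive points.
   ! Assumption: each column of `points` holds the coordinates of one point.
   pure function polyline_length(points) result(length)
      real(kind=real64), dimension(:, :), intent(in) :: points
      real(kind=real64) :: length

      integer :: point_index

      length = 0.0_real64
      do point_index = 2, size(points, 2)
         length = length + norm2(points(:, point_index) - points(:, point_index - 1))
      end do
   end function polyline_length

   ! Hypothetical stand-in: returns .true. when the assertion FAILS
   function assert_equals(expected, actual, tolerance) result(assertion_failed)
      real(kind=real64), intent(in) :: expected, actual, tolerance
      logical :: assertion_failed

      assertion_failed = abs(expected - actual) > tolerance
      if (assertion_failed) then
         print *, 'Expected', expected, 'but got', actual
      end if
   end function assert_equals
end module polyline_sketch

program tester
   use iso_fortran_env, only: real64
   use polyline_sketch, only: polyline_length, assert_equals

   implicit none(type, external)

   integer :: error_counter
   logical :: test_failed

   error_counter = 0

   test_failed = test_polyline_with_2_points()
   if (test_failed) then
      error_counter = error_counter + 1
   end if

   if (error_counter > 0) then
      print *, error_counter, 'tests failed'
      error stop
   end if
   print *, 'All tests passed'

contains
   ! The test has its own local state; nothing leaks into the main program
   function test_polyline_with_2_points() result(test_failed)
      logical :: test_failed

      real(kind=real64), dimension(2, 2) :: two_points

      print *, 'Calculate length of polyline with 2 points'

      ! From (0, 0) to (3, 4): a 3-4-5 triangle, so the length is 5
      two_points(:, 1) = [0.0_real64, 0.0_real64]
      two_points(:, 2) = [3.0_real64, 4.0_real64]

      test_failed = assert_equals(5.0_real64, &
                                  polyline_length(two_points), &
                                  1.0e-10_real64)
   end function test_polyline_with_2_points
end program tester
```

In this sketch the test procedure is an internal procedure of the test program, which keeps the example in one file. The local test_failed result of the function shadows the variable with the same name in the main program, so the two don’t interfere.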

Moving tests to their own modules

It’s sad, but the code that’s written in a “quick test program” often doesn’t survive. Once a function is verified to return the correct result, the code often gets deleted. We lose the time and effort that went into writing this “throw-away” code. We also lose the intention that the original author had; we lose their understanding of what the function should do and which of its behaviors they were interested in. But we also lose the safety net for future changes to this function: if we make a mistake when doing so, we likely won’t even notice it.

To prevent all these losses, let’s make sure these tests get a proper place in the code base: not inside a test program, but inside their own module. Fortunately, this is easy to do: test functions only have local variables, so we can literally just cut-and-paste the test procedures into a separate module; let’s call it test_polyline. There we may even add new tests for polyline-related code. As always, we have to keep an eye on these modules: if they grow too big, maybe because new tests are only tangentially related, we should move some tests to new test modules.

module test_polyline
   use iso_fortran_env, only: real64

   implicit none(type, external)

   private

   public :: test_polyline_with_2_points
   public :: test_polyline_with_3_points
   
contains
   function test_polyline_with_2_points() result(test_failed)
! ...

Generalizing the test program

When we move test procedures to their own modules (where we make them public), we also have to import them in the main test program to keep everything working:

program tester
   use test_polyline, only: test_polyline_with_2_points, &
                            test_polyline_with_3_points

   ! ...

   test_failed = test_polyline_with_2_points()
   if (test_failed) then
      error_counter = error_counter + 1
   end if
   ! ...

In the long run, the number of use statements for all of our test modules will really start to add up. Also, we still have this problem of code duplication: every time we call a test, we need those if (test_failed) lines. What if we could make an array of tests and loop through them, removing the code duplication, but also removing the need to call test procedures by their specific name? In other words: what if we could generalize the test loop itself, like this:

do i = 1, size(tests)
   test_failed = tests(i)()
   if (test_failed) then
      error_counter = error_counter + 1
   end if
end do

If tests could be an array of procedure pointers, we’d be done. Unfortunately, Fortran doesn’t allow arrays of pointers, and that includes procedure pointers. To work around this limitation, we define a derived type with a procedure pointer as a data component, and create an array of that type instead. First, let’s define what a test procedure should look like, using an abstract interface: it takes no arguments and returns a logical. The existing test procedures already match this interface:

module test_framework
   implicit none(type, external)

   private
   
   abstract interface
      function test_procedure_interface() result(test_failed)
         implicit none(type, external)

         logical :: test_failed
      end function test_procedure_interface
   end interface
end module test_framework

Note that we’re placing this interface in a new test_framework module.

Now we declare a new derived type, let’s call it unit_test_t, which also lives in the test_framework module. It holds a procedure pointer to the actual test procedure. When declaring this component, we specify the interface we just defined:

type :: unit_test_t
   procedure(test_procedure_interface), pointer :: test_procedure
end type unit_test_t

Compiling this leads to an interesting error:

For a type-bound procedure that has the PASS binding attribute, the first dummy argument must have the same declared type as the type being defined.

As we know from previous articles, regular type-bound procedures are expected to have the instance on which the procedure is called as their first argument (e.g. class(unit_test_t), intent(in) :: self). Perhaps surprisingly, procedure pointer components follow the same rule: like type-bound procedures, they have the pass attribute by default, so the compiler expects the interface to declare such a passed-object argument. Our test procedures take no arguments at all, so to get rid of the error we specify the nopass attribute:

type :: unit_test_t
-  procedure(test_procedure_interface), pointer :: test_procedure
+  procedure(test_procedure_interface), pointer, nopass :: test_procedure
end type unit_test_t

Note that this is a very interesting language feature. Type-bound procedures are “hard-coded”; their behavior is determined at compile time. With procedure pointers as data components, we can change the behavior of a derived type at runtime, because we “bind” a procedure with the correct interface to an existing derived type instance dynamically, even from outside the module.
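The following sketch shows this runtime “rebinding” in isolation. The module name rebinding_sketch and the two procedures always_passes and always_fails are made up for this example; the interface and the derived type are the ones we defined above, with one addition: I give the pointer a null() default, which is my own choice and not required.

```fortran
module rebinding_sketch
   implicit none(type, external)

   private

   public :: unit_test_t
   public :: always_passes
   public :: always_fails

   abstract interface
      function test_procedure_interface() result(test_failed)
         implicit none(type, external)

         logical :: test_failed
      end function test_procedure_interface
   end interface

   type :: unit_test_t
      ! Assumption for this sketch: start out unassociated
      procedure(test_procedure_interface), pointer, nopass :: test_procedure => null()
   end type unit_test_t

contains
   ! Hypothetical test procedure that never fails
   function always_passes() result(test_failed)
      logical :: test_failed

      test_failed = .false.
   end function always_passes

   ! Hypothetical test procedure that always fails
   function always_fails() result(test_failed)
      logical :: test_failed

      test_failed = .true.
   end function always_fails
end module rebinding_sketch

program rebinding_demo
   use rebinding_sketch, only: unit_test_t, always_passes, always_fails

   implicit none(type, external)

   type(unit_test_t) :: unit_test

   ! Bind a first procedure to the instance
   unit_test%test_procedure => always_passes
   if (unit_test%test_procedure()) error stop 'expected a passing test'
   print *, 'Failed?', unit_test%test_procedure()

   ! Same instance, same call, different behavior after rebinding
   unit_test%test_procedure => always_fails
   if (.not. unit_test%test_procedure()) error stop 'expected a failing test'
   print *, 'Failed?', unit_test%test_procedure()
end program rebinding_demo
```

The call site stays exactly the same; only the target of the pointer changes. The first print statement reports F, the second one T.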

For now, let’s populate an array of these new unit_test_t instances directly inside the test program. Later we can let the test modules do it themselves:

program tester
   use test_polyline, only: test_polyline_with_2_points, &
                            test_polyline_with_3_points
   use test_framework, only: unit_test_t

   implicit none(type, external)

   integer :: error_counter
   integer :: unit_test_index
   logical :: test_failed
   type(unit_test_t), dimension(:), allocatable :: unit_tests

   ! Without an explicit starting value the counter is undefined
   error_counter = 0

   unit_tests = [ &
                unit_test_t(test_polyline_with_2_points), &
                unit_test_t(test_polyline_with_3_points) &
                ]

   do unit_test_index = 1, size(unit_tests)
      test_failed = .false.

      test_failed = unit_tests(unit_test_index)%test_procedure()
      if (test_failed) then
         error_counter = error_counter + 1
      end if
   end do

   if (error_counter > 0) then
      print *, error_counter, 'tests failed'
      error stop
   end if
end program tester

Note that creating the array of unit tests still requires explicit references to the test procedures, meaning they have to be explicitly imported as well. The list will be very long, and every time we write a new test we have to add it to the test program itself. If we work in a team of eager testers, we’ll have a lot of merge conflicts at this location.

We can reduce the amount of coupling, and thereby the change rate for this code location, by letting the test module populate an array with its own unit test procedures. For instance, we can extract those unit test instantiation lines to a dedicated function in the test_polyline module:

module test_polyline
   use test_framework, only: unit_test_t

   ! ...
   public :: collect_polyline_tests

contains
   function collect_polyline_tests() result(unit_tests)
      type(unit_test_t), dimension(:), allocatable :: unit_tests

      unit_tests = [ &
                   unit_test_t(test_polyline_with_2_points), &
                   unit_test_t(test_polyline_with_3_points) &
                   ]
   end function collect_polyline_tests
   
   ! ...
end module test_polyline

We make that function public, so the test program can invoke it:

-use test_polyline, only: test_polyline_with_2_points, &
-                         test_polyline_with_3_points
+use test_polyline, only: collect_polyline_tests

type(unit_test_t), dimension(:), allocatable :: unit_tests

-unit_tests = [ &
-            unit_test_t(test_polyline_with_2_points), &
-            unit_test_t(test_polyline_with_3_points) &
-            ]
+unit_tests = collect_polyline_tests()

Note that the only public procedure of the test_polyline module is now collect_polyline_tests, so test_polyline_with_2_points and test_polyline_with_3_points remain private. Nevertheless, the test program is able to call these procedures. The fact that these procedures are private to their module is irrelevant; as long as we target these procedures in test_polyline, they can still be called by their pointers inside the test program. This is a powerful yet somewhat surprising feature of procedure pointers. It allows us to expose private functionality to other modules, without the need to expose actual procedure names.
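To convince ourselves that this really works, we can try it in a stripped-down setting. Everything in this sketch is hypothetical and only serves as an illustration: the module callback_sketch, the type callback_t with its invoke component, and the private function secret_answer.

```fortran
module callback_sketch
   implicit none(type, external)

   private

   ! Note: secret_answer is NOT in the list of public names
   public :: callback_t
   public :: get_callback

   abstract interface
      function answer_interface() result(answer)
         implicit none(type, external)

         integer :: answer
      end function answer_interface
   end interface

   type :: callback_t
      procedure(answer_interface), pointer, nopass :: invoke => null()
   end type callback_t

contains
   ! The only way for other program units to get hold of secret_answer
   function get_callback() result(callback)
      type(callback_t) :: callback

      callback%invoke => secret_answer
   end function get_callback

   ! Private to this module: it can't be imported with a use statement
   function secret_answer() result(answer)
      integer :: answer

      answer = 42
   end function secret_answer
end module callback_sketch

program callback_demo
   use callback_sketch, only: callback_t, get_callback

   implicit none(type, external)

   type(callback_t) :: callback

   callback = get_callback()

   ! The program never mentions secret_answer by name, but it can call it
   if (callback%invoke() /= 42) error stop 'unexpected answer'
   print *, 'The answer is', callback%invoke()
end program callback_demo
```

Adding secret_answer to the only list of the use statement in the program would be a compile error, because the name is not accessible outside the module. The pointer doesn’t care about names; it only needs a target with a matching interface.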

Collecting unit tests from multiple test modules

What if we have multiple test modules? We’d have many similar collect_*_tests functions that we have to call from the test loop. We can easily define a common abstract interface for those procedures again:

! in the `test_framework` module:

abstract interface
   function collect_unit_tests_interface() result(unit_tests)
      import :: unit_test_t

      implicit none(type, external)

      type(unit_test_t), dimension(:), allocatable :: unit_tests
   end function collect_unit_tests_interface
end interface

Just as before, we also define a derived type that can hold the reference to the collect procedure, let’s say test_suite_t (which lives in test_framework too):

! in the `test_framework` module:

type :: test_suite_t
   procedure(collect_unit_tests_interface), pointer, nopass :: collect
end type test_suite_t

In the test program we can now do two loops: one on the array of test suites, one on the array of test procedures provided by the test suite (or actually, test module):

program tester
   use test_polyline, only: collect_polyline_tests
   use test_framework, only: unit_test_t, &
                             test_suite_t

   implicit none(type, external)

   integer :: error_counter
   integer :: unit_test_index
   integer :: test_suite_index
   logical :: test_failed
   type(test_suite_t), dimension(:), allocatable :: test_suites
   type(unit_test_t), dimension(:), allocatable :: unit_tests

   ! Without an explicit starting value the counter is undefined
   error_counter = 0
   test_suites = [test_suite_t(collect_polyline_tests)]

   do test_suite_index = 1, size(test_suites)
      unit_tests = test_suites(test_suite_index)%collect()

      do unit_test_index = 1, size(unit_tests)
         test_failed = .false.

         test_failed = unit_tests(unit_test_index)%test_procedure()
         if (test_failed) then
            error_counter = error_counter + 1
         end if
      end do
   end do
   ! ...
end program tester

At this point we get a strange compiler error:

error #8180: The procedure pointer and the procedure target must 
have matching result types.   [COLLECT_POLYLINE_TESTS]
 test_suite_t(collect_polyline_tests)

This seems incorrect, because the collect_unit_tests_interface defines the following as its return type:

type(unit_test_t), dimension(:), allocatable :: unit_tests

And the collect_polyline_tests procedure matches this perfectly:

type(unit_test_t), dimension(:), allocatable :: unit_tests

The way to fix this is to introduce a factory function for test_suite_t instances. This is always a good idea, because it supports encapsulation of its data components, but I’m surprised I have to do it in this case. This is the factory function:

! in the `test_framework` module:

function new_test_suite(collect_procedure) result(test_suite)
   procedure(collect_unit_tests_interface) :: collect_procedure

   type(test_suite_t) :: test_suite
   
   test_suite%collect => collect_procedure
end function new_test_suite

After making this public, we can use it in the program:

-test_suites = [test_suite_t(collect_polyline_tests)]
+test_suites = [new_test_suite(collect_polyline_tests)]
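Putting the pieces of this article together gives us something like the following sketch. The test_framework module contains the interfaces, types and factory function shown above. The test_example module with its two trivial tests is hypothetical; it takes the place of test_polyline so that the example doesn’t depend on code from earlier posts. One of the tests fails on purpose, to show that the counter works.

```fortran
module test_framework
   implicit none(type, external)

   private

   public :: unit_test_t, test_suite_t, new_test_suite

   abstract interface
      function test_procedure_interface() result(test_failed)
         implicit none(type, external)
         logical :: test_failed
      end function test_procedure_interface
   end interface

   type :: unit_test_t
      procedure(test_procedure_interface), pointer, nopass :: test_procedure
   end type unit_test_t

   abstract interface
      function collect_unit_tests_interface() result(unit_tests)
         import :: unit_test_t
         implicit none(type, external)
         type(unit_test_t), dimension(:), allocatable :: unit_tests
      end function collect_unit_tests_interface
   end interface

   type :: test_suite_t
      procedure(collect_unit_tests_interface), pointer, nopass :: collect
   end type test_suite_t

contains
   function new_test_suite(collect_procedure) result(test_suite)
      procedure(collect_unit_tests_interface) :: collect_procedure
      type(test_suite_t) :: test_suite

      test_suite%collect => collect_procedure
   end function new_test_suite
end module test_framework

! Hypothetical test module, standing in for test_polyline
module test_example
   use test_framework, only: unit_test_t

   implicit none(type, external)

   private

   public :: collect_example_tests

contains
   function collect_example_tests() result(unit_tests)
      type(unit_test_t), dimension(:), allocatable :: unit_tests

      unit_tests = [ &
                   unit_test_t(test_that_passes), &
                   unit_test_t(test_that_fails) &
                   ]
   end function collect_example_tests

   function test_that_passes() result(test_failed)
      logical :: test_failed
      test_failed = .false.
   end function test_that_passes

   function test_that_fails() result(test_failed)
      logical :: test_failed
      test_failed = .true.
   end function test_that_fails
end module test_example

program tester
   use test_example, only: collect_example_tests
   use test_framework, only: unit_test_t, test_suite_t, new_test_suite

   implicit none(type, external)

   integer :: error_counter, unit_test_index, test_suite_index
   type(test_suite_t), dimension(:), allocatable :: test_suites
   type(unit_test_t), dimension(:), allocatable :: unit_tests

   error_counter = 0
   test_suites = [new_test_suite(collect_example_tests)]

   do test_suite_index = 1, size(test_suites)
      unit_tests = test_suites(test_suite_index)%collect()
      do unit_test_index = 1, size(unit_tests)
         if (unit_tests(unit_test_index)%test_procedure()) then
            error_counter = error_counter + 1
         end if
      end do
   end do

   print *, error_counter, 'tests failed'
   ! Only for this sketch: exactly one test is supposed to fail
   if (error_counter /= 1) error stop 'expected exactly one failing test'
end program tester
```

Adding a second test module now means adding one use statement and one extra new_test_suite(...) element to the test_suites array; the loops themselves don’t change.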

Could the test framework discover all test modules and test procedures automatically?

There is no mechanism built into the language to allow some kind of enumeration at runtime of all the modules, or all the procedures in those modules. If there were a language feature like that, we could easily filter the modules and procedures that start with test_ and automatically register them inside our test framework. Such a language feature is often called reflection and is used in libraries like test frameworks to make test discovery easy. In the case of Fortran, we have to hard-code the test module and procedure names in our code. Alternatives are imaginable though. We could for example take the following steps:

1. Parse the .f90 files in our test/ folder
2. Collect the names of test modules and test procedures in those files
3. Generate the test program that enumerates them all
4. Build the test program
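To give an idea of what the parsing and collecting steps could look like, here is a rough sketch. It is not a real discovery tool: to keep the example self-contained it scans a hard-coded array of source lines instead of reading files from disk, and it assumes that every test is declared on a single line that starts with function test_ in lowercase. Declarations with a prefix such as pure or a type in front of the function keyword would be missed.

```fortran
program discover_tests
   implicit none(type, external)

   character(len=*), parameter :: keyword = 'function '
   character(len=*), parameter :: prefix = keyword//'test_'

   character(len=80), dimension(7) :: source_lines
   character(len=:), allocatable :: line
   integer :: line_index, name_start, name_end, discovered

   ! Hypothetical content of a test module; a real tool would read
   ! the .f90 files in the test/ folder instead
   source_lines = [character(len=80) :: &
                   'module test_polyline', &
                   'contains', &
                   '   function test_polyline_with_2_points() result(test_failed)', &
                   '   end function test_polyline_with_2_points', &
                   '   function test_polyline_with_3_points() result(test_failed)', &
                   '   end function test_polyline_with_3_points', &
                   'end module test_polyline']

   discovered = 0
   name_start = len(keyword) + 1

   do line_index = 1, size(source_lines)
      ! Strip indentation and trailing blanks
      line = trim(adjustl(source_lines(line_index)))

      ! Keep only lines that START with "function test_"; lines like
      ! "end function test_..." start with "end" and are skipped
      if (index(line, prefix) /= 1) cycle

      ! The procedure name runs up to the opening parenthesis
      name_end = index(line, '(') - 1
      if (name_end < name_start) cycle

      ! Print the line we would otherwise have to write by hand
      print '(3a)', 'unit_test_t(', line(name_start:name_end), ')'
      discovered = discovered + 1
   end do

   if (discovered /= 2) error stop 'expected to discover two tests'
end program discover_tests
```

For the sample lines this prints one unit_test_t(...) line for each of the two test functions. A generator for the complete test program would need a lot more than this, for example handling of continuation lines, comments and mixed case.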

The approach I’m taking in these articles is a lot simpler and doesn’t require pre-build steps. It remains a bit cumbersome to declare tests explicitly, but honestly it’s not that much work. Every test module needs to be declared only once, and from that moment on the only overhead is to add a unit_test_t(...) entry to the module’s collect function for every new test procedure. It takes only a short time to get used to that.

We have been adding more and more code to the new test_framework module. Giving it that name invites us to build it out further. It also inspires another thought: shouldn’t we test our test framework? We’ll continue with this journey in the next post.