Fortran - Testing - Improving temporary test programs
Matthias Noback
How can we know that the function we wrote works as intended? We could run it and manually verify its correctness. The simplest way to do this is to call the function in the main program block, print the output, and compare it with what we expect. Say our function calculates the length of a polyline, stored as a two-dimensional array of reals representing (x, y) coordinates:
pure function polyline_length(coordinates) result(length)
    real(kind=real64), dimension(:, :), intent(in) :: coordinates
    real(kind=real64) :: length

    real(kind=real64) :: distance
    integer :: index

    length = 0.0_real64

    ! Sum the distances between consecutive points
    do index = 1, size(coordinates, 1) - 1
        distance = sqrt((coordinates(index, 1) - &
                         coordinates(index + 1, 1))**2 + &
                        (coordinates(index, 2) - &
                         coordinates(index + 1, 2))**2)
        length = length + distance
    end do
end function polyline_length
Temporary test programs
We could modify the main program block of our actual program, but it’s a lot simpler and safer to create a separate “throw-away” test program with only the code we need: a short program that sets up some coordinates, calls the function, then prints the result:
program tester
    use iso_fortran_env, only: real64

    implicit none(type, external)

    real(kind=real64), dimension(2, 2) :: two_points

    ! x = 0, y = 0
    two_points(1, 1) = 0.0_real64
    two_points(1, 2) = 0.0_real64

    ! x = 3, y = 4
    two_points(2, 1) = 3.0_real64
    two_points(2, 2) = 4.0_real64

    ! Should be 5.0
    print *, polyline_length(two_points)
end program tester
Visually this amounts to the following polyline:

[figure: a single segment from (0, 0) to (3, 4)]
Manually comparing output
Running the program, we can compare the value mentioned in the comment (“5.0”) with the actual output (5.00000000000000). We happily conclude: this function works. Or does it? It’s supposed to work for a larger number of points, so how many points should we check? Passing 2 points works, but maybe we should also try the function with 3 points (which define 2 polyline segments). We may also want to show that it works with distances that are not whole numbers, unlike the 5.0 above:
real(kind=real64), dimension(2, 2) :: two_points
real(kind=real64), dimension(3, 2) :: three_points
! ... (the two_points test)
! x = 0, y = 0
three_points(1, 1) = 0.0_real64
three_points(1, 2) = 0.0_real64
! x = 3, y = 4
three_points(2, 1) = 3.0_real64
three_points(2, 2) = 4.0_real64
! x = 4, y = 5
three_points(3, 1) = 4.0_real64
three_points(3, 2) = 5.0_real64
! Should be 5.0 + square root of 2, so about 6.41
print *, polyline_length(three_points)
These coordinates represent the following polyline:

[figure: two segments, from (0, 0) to (3, 4) and from (3, 4) to (4, 5)]
Looking at the output of the test program, we see that the function still behaves well: it returns 6.41421356237309, which is indeed 5.0 plus the square root of 2.
Using assertion functions
The pattern of this ad hoc testing workflow is: we think of some input and define the expected output, then we run the program and manually compare the actual output to our expectations. Although this approach gives us quick feedback on the quality of our work, it’s still relatively tedious. We can make it easier by delegating the comparison to the computer itself. Let’s write a subroutine that we can pass two reals to: the value we expect, and the value that the function actually returned. If these values don’t match, we just stop execution, explaining what was wrong:
subroutine assert_same(expected, actual)
    real(kind=real64), intent(in) :: expected
    real(kind=real64), intent(in) :: actual

    if (actual /= expected) then
        print *, 'Actual: ', actual
        print *, 'Expected: ', expected

        error stop 'Reals are not the same'
    end if
end subroutine assert_same
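To get a feel for how it behaves, here is a hypothetical stand-alone example (not part of the test program): asserting on two identical literals does nothing, while asserting on two different values prints both and stops the program.

call assert_same(5.0_real64, 5.0_real64)   ! passes silently
call assert_same(5.0_real64, 6.0_real64)   ! prints both values, then stops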
I’m using some traditional testing terminology here: an assertion states that something should be true; if it isn’t, we want the program to stop execution, because something is wrong and we should know about it.
We can now replace our existing print statements with calls to assert_same, e.g.
-print *, polyline_length(three_points)
+call assert_same(6.41421356237309_real64, &
+    polyline_length(three_points))
Running this, we have to conclude that our assert_same procedure is broken. It pretty much always fails:
Actual: 6.41421356237309
Expected: 6.41421356237309
Reals are not the same
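The printed values look identical, and yet the /= comparison claims they differ. A quick throw-away experiment (a hypothetical one-liner, not part of the test program) makes the problem visible by printing the difference instead of the values themselves:

! Hypothetical one-off check: how large is the mismatch really?
print *, polyline_length(three_points) - 6.41421356237309_real64

With real64 values this prints a tiny but non-zero number, somewhere on the order of 1.0e-15, which is enough for /= to consider the two reals different.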
Dealing with error margins
When comparing reals we can’t use == and /=, because floating-point numbers only have limited precision. A computed value and a hard-coded value are therefore likely to differ by a tiny amount. This difference can be considered a rounding error or error margin.

We can improve the comparison between reals by switching concepts: we shouldn’t say “assert same” but “assert equals”, to indicate that these two numbers aren’t exactly the same, but would be considered “equal” when used in calculations. Then we also need to define how much difference we would tolerate. This tolerance should be another real, passed as an argument, so we can fine-tune the error margin (often called “epsilon”) when needed. Note: since we are potentially dealing with negative differences, we have to make the difference absolute before we compare it to the (presumably positive) epsilon.
subroutine assert_equals(expected, actual, epsilon)
    real(kind=real64), intent(in) :: expected
    real(kind=real64), intent(in) :: actual
    real(kind=real64), intent(in) :: epsilon

    if (abs(actual - expected) > epsilon) then
        print *, 'Actual: ', actual
        print *, 'Expected: ', expected
        print *, 'Epsilon: ', epsilon

        error stop 'Reals are not equal'
    end if
end subroutine assert_equals
Whenever we call assert_equals, we also pass a very small allowed difference as the last argument:
call assert_equals(6.41421356237309_real64, &
polyline_length(three_points), &
1.0e-10_real64)
This produces the desired effect: the test passes, because the hard-coded real and the real returned by polyline_length can be considered equal. As long as a call to assert_equals doesn’t stop the program, we know that the actual and expected values really matched. We no longer have to manually compare output. If the program reaches the end of the code, everything should be okay.
Not stopping on the first failed assertion
There is one problem related to that error stop statement in assert_equals: as soon as the first test fails, the whole program stops. It would be much more useful if all tests were executed, even when one of them fails. We would then get an overview of all failing tests, not just the first one, which helps us find the issue faster. And when we fix the code, we can quickly find out whether our change fixed all the failing tests, or left some of them still failing.
The simplest way to do this is to let assertion functions indicate whether the comparison failed, instead of abruptly stopping the program. We could do this by passing a mutable (intent(out)) argument, e.g. assertion_failed, which the subroutine may set to .true.:
-subroutine assert_equals(expected, actual, epsilon)
+subroutine assert_equals(expected, actual, epsilon, assertion_failed)
real(kind=real64), intent(in) :: expected
real(kind=real64), intent(in) :: actual
real(kind=real64), intent(in) :: epsilon
+ logical, intent(out) :: assertion_failed
+ assertion_failed = .false.
if (abs(actual - expected) > epsilon) then
print *, 'Actual: ', actual
print *, 'Expected: ', expected
print *, 'Epsilon: ', epsilon
- error stop 'Reals are not equal'
+ print *, 'Reals are not equal'
+ assertion_failed = .true.
end if
end subroutine assert_equals
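For reference, with these changes applied the subroutine now reads in full:

subroutine assert_equals(expected, actual, epsilon, assertion_failed)
    real(kind=real64), intent(in) :: expected
    real(kind=real64), intent(in) :: actual
    real(kind=real64), intent(in) :: epsilon
    logical, intent(out) :: assertion_failed

    ! Assume success until the comparison proves otherwise
    assertion_failed = .false.

    if (abs(actual - expected) > epsilon) then
        print *, 'Actual: ', actual
        print *, 'Expected: ', expected
        print *, 'Epsilon: ', epsilon
        print *, 'Reals are not equal'

        assertion_failed = .true.
    end if
end subroutine assert_equals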
We update the test program as follows. To keep track of failed tests we add two local variables, making sure the counter starts at zero before the first test:

integer :: error_counter
logical :: assertion_failed

! ... and before the first test:
error_counter = 0
To every assertion we pass assertion_failed:
call assert_equals(5.0_real64, &
polyline_length(two_points), 1.0e-10_real64, &
assertion_failed)
And after every call to assert_equals, we check whether assertion_failed was set to .true.:
if (assertion_failed) then
    error_counter = error_counter + 1
end if
Then at the end of the program we know if any call to assert_equals failed by inspecting the error_counter variable:
if (error_counter > 0) then
    print *, error_counter, 'tests failed'

    error stop
end if
Here is the example output when the assertions fail:
Actual: 6.00000000000000
Expected: 5.00000000000000
Epsilon: 1.000000000000000E-010
Reals are not equal
...
2 tests failed
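Putting all the pieces together, the whole test program might now look roughly like this. This is just a sketch that assumes polyline_length and the updated assert_equals live in the contains section of the test program; keeping them in a separate module would work just as well. With a correct polyline_length, both assertions pass and the program ends without printing anything.

program tester
    use iso_fortran_env, only: real64

    implicit none(type, external)

    real(kind=real64), dimension(2, 2) :: two_points
    real(kind=real64), dimension(3, 2) :: three_points
    integer :: error_counter
    logical :: assertion_failed

    error_counter = 0

    ! x = 0, y = 0
    two_points(1, 1) = 0.0_real64
    two_points(1, 2) = 0.0_real64
    ! x = 3, y = 4
    two_points(2, 1) = 3.0_real64
    two_points(2, 2) = 4.0_real64

    ! Should be 5.0
    call assert_equals(5.0_real64, &
        polyline_length(two_points), 1.0e-10_real64, &
        assertion_failed)
    if (assertion_failed) then
        error_counter = error_counter + 1
    end if

    ! x = 0, y = 0
    three_points(1, 1) = 0.0_real64
    three_points(1, 2) = 0.0_real64
    ! x = 3, y = 4
    three_points(2, 1) = 3.0_real64
    three_points(2, 2) = 4.0_real64
    ! x = 4, y = 5
    three_points(3, 1) = 4.0_real64
    three_points(3, 2) = 5.0_real64

    ! Should be 5.0 + square root of 2
    call assert_equals(6.41421356237309_real64, &
        polyline_length(three_points), 1.0e-10_real64, &
        assertion_failed)
    if (assertion_failed) then
        error_counter = error_counter + 1
    end if

    ! Report the overall result
    if (error_counter > 0) then
        print *, error_counter, 'tests failed'
        error stop
    end if

contains

    ! The function under test, from the top of this post
    pure function polyline_length(coordinates) result(length)
        real(kind=real64), dimension(:, :), intent(in) :: coordinates
        real(kind=real64) :: length
        real(kind=real64) :: distance
        integer :: index

        length = 0.0_real64
        do index = 1, size(coordinates, 1) - 1
            distance = sqrt((coordinates(index, 1) - &
                             coordinates(index + 1, 1))**2 + &
                            (coordinates(index, 2) - &
                             coordinates(index + 1, 2))**2)
            length = length + distance
        end do
    end function polyline_length

    ! The assertion subroutine developed above
    subroutine assert_equals(expected, actual, epsilon, assertion_failed)
        real(kind=real64), intent(in) :: expected
        real(kind=real64), intent(in) :: actual
        real(kind=real64), intent(in) :: epsilon
        logical, intent(out) :: assertion_failed

        assertion_failed = .false.
        if (abs(actual - expected) > epsilon) then
            print *, 'Actual: ', actual
            print *, 'Expected: ', expected
            print *, 'Epsilon: ', epsilon
            print *, 'Reals are not equal'
            assertion_failed = .true.
        end if
    end subroutine assert_equals
end program tester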
Printing a description of the test
Our test program is getting better by the minute, but we still have a maintainability issue: if a test fails, we don’t see which test it was, which makes debugging unnecessarily hard. One way to fix this is to print a short description of what we’re about to test before every test:
+print *, 'Calculate length of polyline with 2 points'
! x = 0, y = 0
two_points(1, 1) = 0.0_real64
two_points(1, 2) = 0.0_real64
! ...
If an assertion fails, we can now visually link the error message to the test:
Calculate length of polyline with 2 points
...
Reals are not equal
In the next post we’ll consider what happens when the number of tests keeps growing, and what we can do to keep our test program maintainable. After all, now that we’re spending some time improving the quality of our temporary tests, it would be sad to delete them, wasting the development effort. As the authors of the function, we know how it’s supposed to work, and keeping this knowledge codified in tests will prove to be a great help to future maintainers. So let’s make these tests permanent…