Fortran - Errors and error handling - Part 2
Matthias Noback
In the previous post we looked at several ways to tell the user of a function that something could go wrong, or that the function might not be able to produce the result the user was looking for. In the case of an average function, we want to communicate that in some cases the function can’t produce a meaningful result, namely when the array of numbers is empty. This isn’t really an error; we simply have no way to answer the question “what’s the average of this array?”. To communicate that through the function signature, we tried adding a “success” return value and making the calculated average an intent(out) argument. The function modifies the variable passed by the user to contain the average value, but only if the array is not empty:
function average_with_success_result(numbers, average) result(success)
    real(kind=wp), dimension(:), intent(in) :: numbers
    real(kind=wp), intent(out) :: average
    logical :: success

    if (size(numbers) == 0) then
        success = .false.
    else
        success = .true.
        average = sum(numbers)/size(numbers)
    end if
end function average_with_success_result
We already noticed the potential problems the user runs into when they forget to check the success return value: average will have a real value, but an unpredictable one. There’s another surprising aspect of this function: besides returning a value, it also modifies an argument. This, in my opinion, is confusing behavior for a function.
In similar situations people may turn the function into a subroutine, “demoting” the return value to another intent(out) argument. But this looks even weirder:
subroutine average_no_return_value(numbers, average, success)
    real(kind=wp), dimension(:), intent(in) :: numbers
    real(kind=wp), intent(out) :: average
    logical, intent(out) :: success

    if (size(numbers) == 0) then
        success = .false.
    else
        success = .true.
        average = sum(numbers)/size(numbers)
    end if
end subroutine average_no_return_value
Here’s how you’d call such a subroutine:
call average_no_return_value(numbers, avg, success)
if (success) then
    print *, avg
end if
In my opinion, this is a strange solution. An average function doesn’t sound like a function that would need to modify its arguments, or one that doesn’t even have a return value. Subroutines should be used for producing side effects, like writing to a file or to the terminal, or even modifying some module variable (more about this in a future post). We don’t have to use a subroutine in this case, and we shouldn’t. But we still have to implement the optionality of the result…
A single, optional return value
In some other languages you can define a return value as “optional” with a so-called Option type. It’s a generic or template type that allows a function to define what type of value it may or may not return, e.g. integer, real, etc. Option itself is an abstract type, and has two concrete subtypes: Some and None. Some holds an actual value (e.g. an actual real), but None doesn’t. The first benefit of such a type-based approach is that a function can now have just one return type (Option), which is polymorphic (at runtime it will be either Some or None). There’s no need for additional success or error return values, or mutable intent(out) arguments.
In Fortran, we can’t define a generic option/some/none type because the language doesn’t support generics or templates. So we can’t have something like option_t[real] or option_t[integer]; we always have to define explicit option types for each value type we want to return optionally from a function.
In the case of a real, we’d define an abstract type optional_real_t, and two concrete types some_real_t and no_real_t. Only some_real_t actually has a real data component. By the way, I’m using the name optional_real_t instead of option_real_t, and no_real_t instead of none_real_t, to make them more grammatically appealing.
module optional
    use iso_fortran_env, only: wp => real64
    implicit none(type, external)

    private
    public :: optional_real_t
    public :: no_real_t
    public :: some_real_t

    type, abstract :: optional_real_t
    end type optional_real_t

    type, extends(optional_real_t) :: no_real_t
        ! Nothing here
    end type no_real_t

    type, extends(optional_real_t) :: some_real_t
        real(kind=wp) :: value
    end type some_real_t
end module optional
Note: there are no type-bound procedures (yet) for these types. Also, I’m using an arbitrary “working precision” for the real value here. Given that the real kind has to be a compile-time parameter, you have to redefine optional_real_t and related types for every kind of real you want to support in your project.
Now we can modify the average function to return an optional_real_t, and update the code to return either some_real_t with an actual real value inside, or no_real_t:
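! Caveat: the Fortran 2018 standard doesn't allow a pure function to have a
! polymorphic allocatable result; stricter compilers may reject `pure` here.
pure function optional_average(numbers) result(res)
    real(kind=wp), dimension(:), intent(in) :: numbers
    class(optional_real_t), allocatable :: res

    if (size(numbers) == 0) then
        res = no_real_t()
    else
        res = some_real_t(sum(numbers)/size(numbers))
    end if
end function optional_average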
Note: the return type becomes class(optional_real_t) to indicate that it will be any subtype of optional_real_t. If you use class for a function result, you have to make it allocatable (or a pointer) too, because the actual type is only determined at runtime. Assigning a no_real_t or some_real_t value will automatically allocate res.
Retrieving the value from some_real_t
When using this function, we also need a local variable with the same declaration as res. We then let the function populate this variable for us. However, if we want to get access to the real inside some_real_t, we can’t do it right away. That’s because only some_real_t has this real value; no_real_t doesn’t. And we don’t know yet whether the function returned some_real_t or no_real_t. To figure this out, we have to use a select type statement. Inside that statement we can run code only when the provided value has exactly a specific type (using type is (derived_type_name)) or when its type is the given type or one of its extensions (using class is (derived_type_name)):
avg = optional_average(some_numbers)

! This doesn't compile, `optional_real_t` has no `value` component:
! print *, avg%value

select type (avg)
type is (some_real_t)
    print *, avg%value
type is (no_real_t)
    print *, 'No real result'
end select
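For reference, here is how the pieces above might fit together in a single file. This is a minimal sketch rather than the definitive layout: the module name statistics, the program name demo and the helper subroutine report are assumptions made for this example, while the types and optional_average are the ones from this post. I've left pure off the function in this sketch because, as far as I can tell, the Fortran 2018 standard doesn't allow a pure function to have a polymorphic allocatable result, and compilers differ in whether they diagnose it.

```fortran
module optional
    use iso_fortran_env, only: wp => real64
    implicit none(type, external)

    private
    public :: wp, optional_real_t, no_real_t, some_real_t

    type, abstract :: optional_real_t
    end type optional_real_t

    type, extends(optional_real_t) :: no_real_t
    end type no_real_t

    type, extends(optional_real_t) :: some_real_t
        real(kind=wp) :: value
    end type some_real_t
end module optional

! Hypothetical module name, chosen for this sketch only
module statistics
    use optional, only: wp, optional_real_t, no_real_t, some_real_t
    implicit none(type, external)

    private
    public :: optional_average
contains
    ! Not declared pure here: the result is polymorphic and allocatable
    function optional_average(numbers) result(res)
        real(kind=wp), dimension(:), intent(in) :: numbers
        class(optional_real_t), allocatable :: res

        if (size(numbers) == 0) then
            res = no_real_t()
        else
            res = some_real_t(sum(numbers)/size(numbers))
        end if
    end function optional_average
end module statistics

program demo
    use optional, only: wp, optional_real_t, no_real_t, some_real_t
    use statistics, only: optional_average
    implicit none(type, external)

    call report([1.0_wp, 2.0_wp, 3.0_wp])
    call report([real(kind=wp) ::])   ! zero-sized array
contains
    ! Hypothetical helper: prints the average, or a message if there is none
    subroutine report(numbers)
        real(kind=wp), dimension(:), intent(in) :: numbers
        ! Same declaration as `res` inside optional_average
        class(optional_real_t), allocatable :: avg

        avg = optional_average(numbers)

        select type (avg)
        type is (some_real_t)
            print '(a, f0.2)', 'Average: ', avg%value
        type is (no_real_t)
            print '(a)', 'No real result'
        end select
    end subroutine report
end program demo
```

The first call should take the some_real_t branch (the average of 1, 2 and 3 is 2), the second the no_real_t branch.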
In each type is or class is branch of a select type statement, the compiler “remembers” the matched type. So inside the some_real_t branch it recognizes that you can indeed access the value data component, because some_real_t has it. In the no_real_t branch you can’t, and the compiler will produce an error if you (accidentally) try to access it.
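If the local variable is only needed for the type check, select type also accepts an expression as its selector, as long as you give it an associate name. The sketch below calls optional_average directly inside the select type statement. The module name optional_sketch, the program name and the associate name avg are choices made for this example only, and the types are repeated so that the example compiles on its own.

```fortran
module optional_sketch
    use iso_fortran_env, only: wp => real64
    implicit none(type, external)

    private
    public :: wp, optional_real_t, no_real_t, some_real_t, optional_average

    type, abstract :: optional_real_t
    end type optional_real_t

    type, extends(optional_real_t) :: no_real_t
    end type no_real_t

    type, extends(optional_real_t) :: some_real_t
        real(kind=wp) :: value
    end type some_real_t
contains
    function optional_average(numbers) result(res)
        real(kind=wp), dimension(:), intent(in) :: numbers
        class(optional_real_t), allocatable :: res

        if (size(numbers) == 0) then
            res = no_real_t()
        else
            res = some_real_t(sum(numbers)/size(numbers))
        end if
    end function optional_average
end module optional_sketch

program associate_name_demo
    use optional_sketch, only: wp, some_real_t, no_real_t, optional_average
    implicit none(type, external)

    ! No class(optional_real_t) variable is declared here: the function
    ! result is bound to the associate name `avg` inside the construct.
    select type (avg => optional_average([4.0_wp, 6.0_wp]))
    type is (some_real_t)
        print '(a, f0.2)', 'Average: ', avg%value
    type is (no_real_t)
        print '(a)', 'No real result'
    end select
end program associate_name_demo
```

The associate name only exists inside the construct, so the result can't be used by accident afterwards. Whether that's preferable to a named local variable is mostly a matter of taste.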
A select type statement doesn’t have to be “complete”: if you’re only interested in some_real_t, you don’t have to add a branch for no_real_t:
select type (avg)
type is (some_real_t)
    print *, avg%value
end select
Also, you can have a “catch all” branch at the end with class default:
select type (avg)
type is (some_real_t)
    print *, avg%value
type is (no_real_t)
    print *, 'No real result'
class default
    print *, 'Unexpected return value'
end select
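The examples so far only use type is, so here is a small sketch of how class is and class default behave next to it. To have something for class is to match, it adds a purely hypothetical extension of some_real_t called labeled_real_t; that type, the module name optional_sketch and the describe function are inventions for this example, not part of the design in this post.

```fortran
module optional_sketch
    use iso_fortran_env, only: wp => real64
    implicit none(type, external)

    private
    public :: wp, optional_real_t, no_real_t, some_real_t
    public :: labeled_real_t, describe

    type, abstract :: optional_real_t
    end type optional_real_t

    type, extends(optional_real_t) :: no_real_t
    end type no_real_t

    type, extends(optional_real_t) :: some_real_t
        real(kind=wp) :: value
    end type some_real_t

    ! Hypothetical extension, only here to give `class is` something to match
    type, extends(some_real_t) :: labeled_real_t
        character(len=8) :: label
    end type labeled_real_t
contains
    function describe(opt) result(text)
        class(optional_real_t), intent(in) :: opt
        character(len=:), allocatable :: text

        select type (opt)
        type is (some_real_t)
            ! Only matches when the dynamic type is exactly some_real_t
            text = 'exactly some_real_t'
        class is (some_real_t)
            ! Matches extensions of some_real_t not caught by a `type is` guard
            text = 'an extension of some_real_t'
        class default
            text = 'something else'
        end select
    end function describe
end module optional_sketch

program type_guard_demo
    use optional_sketch, only: wp, no_real_t, some_real_t, labeled_real_t, describe
    implicit none(type, external)

    print '(a)', describe(some_real_t(1.0_wp))
    print '(a)', describe(labeled_real_t(1.0_wp, 'metres'))
    print '(a)', describe(no_real_t())
end program type_guard_demo
```

At most one branch runs. An exact type is match takes precedence over class is, and class default is only reached when nothing else matches, which is the case for no_real_t here.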
Conclusion
Now, how is this different from the earlier example where the user had to check success and is only then allowed to use the modified intent(out) argument avg?
call average_no_return_value(numbers, avg, success)
if (success) then
    print *, avg
end if
The new approach comes with several improvements actually:
- We don’t have two separate values (success and avg). There’s now only one value we need to deal with. This simplifies the usage of the function.
- We can’t make the mistake of forgetting to check success but still using the real. We can only use the real value when we have matched the type of the return value to some_real_t.
- The new function is pure; it doesn’t have to modify any of its arguments. A function that modifies its arguments is not very safe to work with because it may lead to surprising behavior and bugs.
- The new function is easier to test with a single assertion. We will look into unit testing later, and then we’ll see that a single return value/result is much easier to check than multiple ones.
In conclusion, we’ve been able to solve a design issue with the average function by extending the type system. When the language doesn’t support an otherwise very useful design concept like optionality, we can implement it ourselves by adding one or more types. We then increase type safety, while decreasing the chance of making mistakes.
Often there are alternative solutions too. In the next post we’ll look at another type-based approach that helps us get rid of this special case of calculating the average of “no numbers”.