Fortran - Errors and error handling - Part 1 - Exploration
Matthias Noback
Fortran doesn’t have exceptions. You can’t jump out of a function when something unexpected happens and then deal with the problem outside the function with the help of a try/catch block. As an alternative, intrinsic (built-in) statements like write, read and open allow you to pass an integer variable as one of their arguments, to which they will assign a value indicating whether an error occurred. For example, when you try to open a file that doesn’t exist, you will get a non-zero value in the variable passed as the argument for iostat:
integer :: open_iostat
integer :: file_unit
open (file='does_not_exist.txt', iostat=open_iostat, &
      status='old', newunit=file_unit)
print *, open_iostat
if (open_iostat /= 0) then
    error stop 'Could not open file for reading'
end if
Note that calling error stop means the whole program will stop (with some non-zero exit code in this case, but you can determine that value by providing an integer instead of a string message).
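Besides iostat, the open statement accepts an iomsg specifier (available since Fortran 2003) that fills a character variable with an explanation of what went wrong. The following is a minimal sketch, not a definitive implementation: the buffer length of 256 and the stop code 2 are arbitrary assumptions, and the message text depends on the compiler.

```fortran
program open_with_iomsg
    implicit none

    integer :: open_iostat
    integer :: file_unit
    ! Buffer length is an assumption; the message text is compiler-dependent
    character(len=256) :: open_iomsg

    open (file='does_not_exist.txt', iostat=open_iostat, iomsg=open_iomsg, &
          status='old', newunit=file_unit)

    if (open_iostat /= 0) then
        ! iomsg is only defined when an error occurred, so read it here only
        print *, 'Could not open file: ', trim(open_iomsg)
        ! An integer stop code (2 is arbitrary) is usually reported
        ! to the operating system as the exit status
        error stop 2
    end if

    close (file_unit)
end program open_with_iomsg
```

Printing the message before calling error stop keeps the diagnostic text, while the integer stop code gives a calling script something to test.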
We may find this kind of error handling in a lot of user code as well. However, I think we can do a lot better when writing functions ourselves.
Let’s first look at the various ways of dealing with errors that may be found “in the wild”. In the next articles we’ll consider some less common approaches to dealing with edge cases or exceptional situations, which are inspired by functional programming concepts once more.
The naive way; ignoring problems
The first way of dealing with problems in a function is to ignore them. In the previous example, not checking the value of open_iostat doesn’t result in any kind of runtime error. Even a read statement using a file_unit that was never assigned a value won’t necessarily result in a fatal error. If you want to terminate the program, you have to do it yourself.
Let’s consider another function that might fail to do what it promises: calculate the average of an array of reals. The “naive implementation” is this:
pure function average_naive(numbers) result(res)
    real(kind=wp), dimension(:), intent(in) :: numbers
    real(kind=wp) :: res
    res = sum(numbers)/size(numbers)
end function average_naive
Why is it naive? Because the argument for numbers may be an array of any size, including 0. In that case, both sum() and size() will return 0, so the average will be calculated as 0/0. In many languages division by zero results in some kind of fatal error. But in Fortran, with typical compiler settings, it doesn’t. The function still returns a real, but if we print its value, it shows NaN: not-a-number. The funny thing is that we can still do things with such a real, e.g. multiply it, divide it, etc. Every new calculation result will still be NaN, though.
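To see this behaviour without relying on how a compiler prints the value, we can test for NaN with ieee_is_nan from the intrinsic ieee_arithmetic module. This is a minimal sketch; the definition of wp as double precision is an assumption, since its definition isn’t shown here, and the program name is made up for the example.

```fortran
program detect_nan
    use, intrinsic :: ieee_arithmetic, only: ieee_is_nan
    implicit none

    ! Assumption: wp is double precision
    integer, parameter :: wp = kind(1.0d0)

    ! A zero-sized array: there are no numbers to average
    real(kind=wp), dimension(0) :: no_numbers
    real(kind=wp) :: avg

    avg = average_naive(no_numbers)

    ! Test the result itself, then a value derived from it
    print *, 'Is the average NaN?      ', ieee_is_nan(avg)
    print *, 'Is twice the average NaN?', ieee_is_nan(2.0_wp*avg)

contains

    pure function average_naive(numbers) result(res)
        real(kind=wp), dimension(:), intent(in) :: numbers
        real(kind=wp) :: res

        res = sum(numbers)/size(numbers)
    end function average_naive

end program detect_nan
```

Note that this assumes floating-point exceptions are not trapped; some compilers offer options that turn an invalid operation like 0/0 into a runtime error instead.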
A user of this function doesn’t want to get NaNs because this ruins their own data. So they need to do extra work before they call such a “bad” function as average_naive. They have to read the function’s code: in what ways may it give bad, incorrect, useless, or dangerous results? They will conclude that they have to check the size of the array before calling the function:
if (size(my_numbers) > 0) then
    avg = average_naive(my_numbers)
else
    ! Maybe do something else
end if
This is still a simple example, but if a function like this is used in many places in the code base, you will see this same if clause copied there as well. If you don’t see it in a place where average_naive is used, it’s a mistake: it should be there. But if indeed the if statement has been copied, then we should be aware of the following:
- This is code that tends to become outdated; a check like this may no longer be needed at a future time, if the implementation of the function changes.
- Whenever some pre-condition for calling the function changes, all these similar code fragments have to be updated in the same way, leading to a risky code change in many different places that may not all be under test.
- If a new pre-condition needs to be added, we also have to modify all the places where the function is called.
A sensible default or alternative
In some cases it’s okay to use a default/fallback return value, when a specific answer couldn’t be provided. As an example, when there are no numbers to calculate the average of, we may be tempted to return not NaN but 0.0:
pure function average_with_default(numbers) result(res)
    real(kind=wp), dimension(:), intent(in) :: numbers
    real(kind=wp) :: res
    if (size(numbers) == 0) then
        res = 0.0_wp
    else
        res = sum(numbers)/size(numbers)
    end if
end function average_with_default
In a mathematical sense, this isn’t correct. The answer to 0/0 is “not defined” or “meaningless”. However, depending on what users want, it could be that 0.0 is an acceptable answer somehow.
For similar functions where a default value would be good enough, you may also give the control to the user and let them pass an argument that should be used as the default. For instance, if the user likes 1.0 or NaN better, we allow them to pass it to the function:
pure function average_with_user_provided_default(numbers, default) result(res)
    real(kind=wp), dimension(:), intent(in) :: numbers
    real(kind=wp), intent(in) :: default
    real(kind=wp) :: res
    if (size(numbers) == 0) then
        res = default
    else
        res = sum(numbers)/size(numbers)
    end if
end function average_with_user_provided_default
Users can call it like this:
print *, average_with_user_provided_default(no_numbers, 1.0_wp)
print *, average_with_user_provided_default(no_numbers, 0.0_wp/0.0_wp)
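Some compilers may reject the constant expression 0.0_wp/0.0_wp at compile time (gfortran, for instance, can report a division by zero unless range checking is disabled). A more portable way to obtain a NaN is ieee_value from the intrinsic ieee_arithmetic module. The following is a minimal sketch, again assuming that wp is double precision; the program name and the nan variable are made up for the example.

```fortran
program nan_as_default
    use, intrinsic :: ieee_arithmetic, only: ieee_value, ieee_quiet_nan
    implicit none

    ! Assumption: wp is double precision
    integer, parameter :: wp = kind(1.0d0)

    real(kind=wp), dimension(0) :: no_numbers
    real(kind=wp) :: nan

    ! The first argument only selects the kind of the result
    nan = ieee_value(1.0_wp, ieee_quiet_nan)

    print *, average_with_user_provided_default(no_numbers, 1.0_wp)
    print *, average_with_user_provided_default(no_numbers, nan)

contains

    pure function average_with_user_provided_default(numbers, default) result(res)
        real(kind=wp), dimension(:), intent(in) :: numbers
        real(kind=wp), intent(in) :: default
        real(kind=wp) :: res

        if (size(numbers) == 0) then
            res = default
        else
            res = sum(numbers)/size(numbers)
        end if
    end function average_with_user_provided_default

end program nan_as_default
```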
One downside is that if you let the user specify a default value, it’s impossible to find out if the return value was the actual result of the calculation, or if it was the provided default value. In that sense it may be better to let the user do the fallback instead of the function.
Relying on a fallback value, even when you let the user pass this value, is rarely a good option. For a user it’s often more useful to know what went wrong or that the function deviated from the “happy path” in some way. That way they can decide what to do instead.
Using a boolean success return value
An implementation that may help the user make an informed decision is to change the return value into a boolean (logical) to indicate whether the function successfully performed its task. In this case the return value means: was it able to calculate the average?
function average_with_success_result(numbers, average) result(success)
    real(kind=wp), dimension(:), intent(in) :: numbers
    real(kind=wp), intent(out) :: average
    logical :: success
    if (size(numbers) == 0) then
        success = .false.
    else
        success = .true.
        average = sum(numbers)/size(numbers)
    end if
end function average_with_success_result
The function will only assign a value to the intent(out) dummy argument if the size of numbers is larger than 0.
success = average_with_success_result(no_numbers, avg)
if (success) then
    print *, 'Success', avg
else
    print *, 'Failure', avg
end if
If the user forgets to check the value of success, then the avg variable will still be a real. It’s not NaN, but a real that you can do calculations with. In my case it was 3.952525166729972E-323, but it could be anything. The mistake is easy to make, and its consequences may go unnoticed for a long time. Also, the usefulness of a logical return value is quite limited. What was the exact reason that the function didn’t succeed?
Using an error return value
To indicate why something went wrong, we may return a specific integer error code instead of a logical:
function average_with_error_result(numbers, average) result(error)
    real(kind=wp), dimension(:), intent(in) :: numbers
    real(kind=wp), intent(out) :: average
    integer :: error
    if (size(numbers) == 0) then
        error = ERROR_AVERAGE_NO_NUMBERS
    else
        error = ERROR_NO_ERROR
        ! Keep the sum in kind wp; real() without a kind would convert to default real
        average = sum(numbers)/size(numbers)
    end if
end function average_with_error_result
The error constants are defined at the top of the module so they can be imported by users:
integer, parameter :: ERROR_NO_ERROR = 0
integer, parameter :: ERROR_AVERAGE_NO_NUMBERS = 1
In this case there is just one actual error condition, but it’s easy to come up with other functions that may fail for different reasons, like a function that opens a file. In fact, the open statement we looked at earlier also uses an integer to return errors, which can be many different things.
We still have the problem that the user may not check the error return value, and just start using the avg variable, even though it may contain some random real value. The advantage over using a success return value is that we can indicate to the user what exactly went wrong. However, they have to analyze the return value themselves:
err = average_with_error_result(no_numbers, avg)
if (err == ERROR_NO_ERROR) then
    print *, 'Success', avg
elseif (err == ERROR_AVERAGE_NO_NUMBERS) then
    print *, 'Failure: there were no numbers', avg
end if
They need to know the location of those error parameters in order to import them. If they want to show an error message, they need to know which integer options there are, yet there is no way to enumerate them. They are not forced to use the constants either: they may copy the integer values associated with them, even if it’s just in a simple comparison like /= 0.
The lack of enumeration can be solved in different ways. And producing a human-readable error message is often solved with a dedicated function like get_error_message(error_code). What can’t be solved is the lack of context in the error: we can’t add additional information to an integer error code.
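A function like get_error_message could look roughly like the sketch below. The module name average_errors, the message texts and the “Unknown error” fallback are assumptions made for this example; only the two error constants come from the code above.

```fortran
module average_errors
    implicit none
    private
    public :: ERROR_NO_ERROR, ERROR_AVERAGE_NO_NUMBERS, get_error_message

    integer, parameter :: ERROR_NO_ERROR = 0
    integer, parameter :: ERROR_AVERAGE_NO_NUMBERS = 1

contains

    ! Hypothetical helper: maps an error code to a human-readable message
    pure function get_error_message(error_code) result(message)
        integer, intent(in) :: error_code
        character(len=:), allocatable :: message

        select case (error_code)
        case (ERROR_NO_ERROR)
            message = 'No error'
        case (ERROR_AVERAGE_NO_NUMBERS)
            message = 'Cannot calculate the average of zero numbers'
        case default
            ! Codes copied by hand or added later end up here
            message = 'Unknown error'
        end select
    end function get_error_message

end module average_errors

program show_error_message
    use average_errors, only: ERROR_AVERAGE_NO_NUMBERS, get_error_message
    implicit none

    print *, get_error_message(ERROR_AVERAGE_NO_NUMBERS)
end program show_error_message
```

The message is fixed per error code, which illustrates the limitation: there is no place to mention, say, which array was empty.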
All of these options for error handling lack encapsulation. The error handling code replicates outside the function how the code works internally. We need a better way. In the next post we’ll discuss how to implement an optional return value, to indicate that the function doesn’t always produce a usable result. Later we’ll look at a better way to represent errors than just as an integer.