Fortran - Errors and error handling - Part 7 - Fatal errors

Matthias Noback

July 21, 2025

We’ve encountered several ways of designing functions so that they can fail without stopping the program and without becoming risky or awkward to use. We introduced the error_t type, which is very flexible. It can give the caller some information that helps them understand what went wrong and how it can be fixed. By allowing errors to be wrapped inside others, we can create chains of errors that describe the problem at various abstraction levels. This gives control back to the user: how do they want to deal with an error? Would they like to try something else? Or, in the end, should we just stop trying and quit the program?

It is generally advisable not to stop a program as soon as you notice something is wrong. There is often some work that needs to be done to clean things up. But when the time comes, you need to stop the program in the best way possible… What does that mean?

The program has to return a non-zero exit code. This helps other programs that run our program (like a terminal, shell, script, etc.) detect the failure.
We should show a detailed error message, providing as many helpful clues as possible, so the user can fix the problem and run the program again.
We should include some tracing information for a programmer who wants to find out where in the code the fatal error occurred.

Non-zero exit codes

We can quit a Fortran program using a stop statement:

stop 'Error'
! Output: Error
! Exit code: 0

This is confusing: we stop because of an error, but the exit code is 0, indicating success. We can provide a number higher than 0 instead of a string:

stop 1
! Output: 1
! Exit code: 1

The exit code is okay, but the output is just the exit code itself. It’s better to print a message, then exit with the provided code. You can’t give both an exit code and a message to stop. We can add a quiet specifier, though, to prevent the exit code from being added to the output:

stop 1, quiet = .true.
! Output: [nothing]
! Exit code: 1

Providing a specific error code may not be a portable thing to do. The best value or value range to use may vary between operating systems, so it would be a good idea to let the compiler pick a value for “failure”. This can be accomplished with the error stop statement:

error stop 'Error'
! Output: Error
! Exit code: 128

If needed, we can still pick a different exit code:

error stop 129
! Output: 129
! Exit code: 129

As with stop, this shows the exit code in the output… Luckily, we can silence the output again:

error stop 129, quiet=.true.
! Output: [nothing]
! Exit code: 129

A good suggestion would be to use error stop for fatal errors, and stop for situations where the program did what was expected of it but shouldn’t wait until the main program block finishes.
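To make that distinction concrete, here is a minimal sketch. The program name, the --help flag and the messages are made up for illustration; only stop and error stop themselves come from the text above.

```fortran
program stop_vs_error_stop
   implicit none

   character(len=32) :: arg

   ! If no argument was given, arg is filled with blanks
   call get_command_argument(1, arg)

   select case (trim(arg))
   case ('--help')
      print '(a)', 'Usage: demo [--help] <input>'
      ! The program did what was asked: a normal, early exit
      stop
   case ('')
      ! Nothing to work with: a fatal error, the compiler picks the exit code
      error stop 'Missing input'
   end select

   print '(a)', 'Processing '//trim(arg)
end program stop_vs_error_stop
```

Running it with --help should result in exit code 0, and running it without arguments in whatever non-zero code the compiler uses for error stop.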

Printing an error

Both stop and error stop can print messages, but when we pass a string we no longer have control over the exit code. So it’s better to separate the responsibility of printing a useful error message on screen, and exiting with a specific error code. These can be two separate steps in a subroutine fatal_error:

subroutine fatal_error(error, exit_code)
   class(error_t), intent(in) :: error
   integer, intent(in), optional :: exit_code

   print *, error%get_message()

   if (present(exit_code)) then
      error stop exit_code, quiet = .true.
   else
      error stop
   end if
end subroutine fatal_error

We accept an instance of our own error_t type and print the result of calling get_message(). We created that function earlier to return the error’s own message, with any previous error message concatenated to it. Note that exit_code is an optional argument:

! Providing a specific exit code
call fatal_error(error_t('Something went wrong'), 129)

! Using the default exit code
call fatal_error(error_t('Something went wrong'))

When printing errors, it’s best practice to use a different output “channel”. Instead of stdout, which is used by print, we should explicitly send our output to stderr. This has to be done with a write statement, passing the error_unit from the intrinsic module iso_fortran_env as the write target:

subroutine fatal_error(error, exit_code)
+  use, intrinsic :: iso_fortran_env, only: error_unit
  
   class(error_t), intent(in) :: error
   integer, intent(in), optional :: exit_code

-  print *, error%get_message()
+  write (error_unit, fmt=*) error%get_message()

   if (present(exit_code)) then
      error stop exit_code, quiet = .true.
   else
      error stop
   end if
end subroutine fatal_error

Note that we’re importing error_unit inside the subroutine itself instead of at the top of the module definition. This works fine, although I’m not sure if it should be promoted to become a best practice. One advantage is that if we ever (re)move this procedure, we don’t have to clean up the use statement(s) at the top of the module. In a sense, adding imports to a subroutine is the most cohesive thing to do. But of course, in many cases some code duplication is the downside of this approach.
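To see why a separate channel is useful, here is a small sketch that writes a regular result to stdout and a diagnostic to stderr. The program name and both messages are invented for this example; output_unit and error_unit are the named constants from iso_fortran_env.

```fortran
program output_channels
   use, intrinsic :: iso_fortran_env, only: output_unit, error_unit

   implicit none

   ! Regular program output goes to stdout
   write (output_unit, '(a)') 'result: 42'

   ! Diagnostics go to stderr, so they do not end up in redirected output
   write (error_unit, '(a)') 'warning: input file was empty'
end program output_channels
```

When the output of this program is redirected to a file in a shell (for example with > result.txt), only the first line should end up in the file, while the warning still shows up in the terminal.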

Tracing the origin of an error

Languages that have exceptions built-in also have a mechanism to capture a stack trace of the location where the exception was produced. The stack trace gets passed as data alongside the exception, and can be rendered to the user. This kind of mechanism isn’t available in Fortran. We have to write something ourselves.

There are several things we can do. For instance, we can at least record the source location where we created an error_t instance. Any time we want to do that, we have to pass this information explicitly. Let’s start by defining a location_t type that holds a file name and a line number:

type :: location_t
   character(len=:), allocatable :: file
   integer :: line
end type location_t

We then allow the location to be stored on an error_t instance:

type :: error_t
   character(len=:), allocatable :: message
   class(error_t), allocatable :: previous
+  class(location_t), allocatable :: location
contains
   procedure :: get_message => error_get_message
end type error_t

To show the location as part of the message we can update the type-bound procedure get_message to concatenate the location but only if it’s allocated:

pure recursive function error_get_message(self) result(res)
   class(error_t), intent(in) :: self
   character(len=:), allocatable :: res

   character(len=32) :: temp

+  res = self%message
+  if (allocated(self%location)) then
+     write (temp, *) self%location%line
+     res = res//' in '//self%location%file//' on line '//trim(adjustl(temp))
+  end if
   if (allocated(self%previous)) then
      res = res//' Previous error: '//self%previous%get_message()
   end if
end function error_get_message

Note that we have to do some complicated work before we can concatenate an integer to a string. We’ll find an easier way to do it in another post.
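In the meantime, one possible simplification is to hide the conversion in a small helper function. The sketch below is an assumption, not something defined earlier in this series: the module name and the to_string function are made up. It uses the i0 edit descriptor, which writes the integer without leading blanks, so adjustl is no longer needed.

```fortran
module int_to_string_sketch
   implicit none

   private
   public :: to_string

contains

   ! Hypothetical helper: converts a default integer to a string
   pure function to_string(value) result(res)
      integer, intent(in) :: value
      character(len=:), allocatable :: res

      character(len=32) :: buffer

      ! i0 uses the minimum width, so the digits start in column 1
      write (buffer, '(i0)') value
      res = trim(buffer)
   end function to_string

end module int_to_string_sketch

program demo_to_string
   use int_to_string_sketch, only: to_string

   implicit none

   print '(a)', 'on line '//to_string(10)
   ! Output: on line 10
end program demo_to_string
```

Writing to an internal file (the buffer variable) is allowed inside a pure function, so get_message could call such a helper and stay pure.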

When creating the error we can now also pass the file and line number:

call fatal_error(error_t('Something went wrong', &
                         location=location_t('file.f90', 10)))

Note that we have to use a named argument (location=) because the default structure constructor for error_t expects the second argument to be of type error_t (to match the previous data component).

This shows the following message on screen:

Something went wrong in file.f90 on line 10

Of course, hard-coding the file name and line number is asking for trouble. It would be a nightmare to ensure these values stay up-to-date. Instead, we can use a pre-processor like cpp to do the work for us. It has macros for the file name and line number. With FPM, you can easily enable the pre-processor by adding to fpm.toml:

[preprocess]
[preprocess.cpp]

Now we can write:

call fatal_error(error_t('Something went wrong', &
                         location=location_t(__FILE__, __LINE__)))

One thing that is not very user-friendly and leads to code duplication is the use of a derived type and a named argument. The quick solution for that is to define an interface for error_t, effectively overriding the structure constructor. If a user passes a string (the message), another string (the file), and an integer (the line), then the function will do the rest:

   interface error_t
      module procedure :: create_error_with_message_and_location
   end interface
   
contains

   pure function create_error_with_message_and_location(message, file, line) result(res)
      character(len=*), intent(in) :: message
      character(len=*), intent(in) :: file
      integer, intent(in) :: line
      type(error_t) :: res

      res = error_t(message, location=location_t(file, line))
   end function create_error_with_message_and_location

Now we can use it like this, and the result will be the same as before.

call fatal_error(error_t('Something went wrong', __FILE__, __LINE__))

We can create any number of variants we need for the error_t interface, e.g. with or without a previous error, etc.
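As an illustration, a variant that also accepts a previous error could look roughly like the sketch below. The module name and the function create_error_with_previous_and_location are hypothetical, and the types are repeated here (without get_message) only to keep the example self-contained. The file name and line number are literals so the sketch compiles without a pre-processor; in real code you would pass __FILE__ and __LINE__.

```fortran
module error_variants_sketch
   implicit none

   private
   public :: error_t, location_t

   type :: location_t
      character(len=:), allocatable :: file
      integer :: line
   end type location_t

   type :: error_t
      character(len=:), allocatable :: message
      class(error_t), allocatable :: previous
      class(location_t), allocatable :: location
   end type error_t

   interface error_t
      module procedure :: create_error_with_previous_and_location
   end interface

contains

   ! Hypothetical variant: message, previous error, file and line
   pure function create_error_with_previous_and_location( &
      message, previous, file, line) result(res)
      character(len=*), intent(in) :: message
      class(error_t), intent(in) :: previous
      character(len=*), intent(in) :: file
      integer, intent(in) :: line
      type(error_t) :: res

      res%message = message
      ! Copy the previous error and the location into the new error
      allocate (res%previous, source=previous)
      allocate (res%location, source=location_t(file, line))
   end function create_error_with_previous_and_location

end module error_variants_sketch

program demo_error_variants
   use error_variants_sketch, only: error_t

   implicit none

   type(error_t) :: low_level, high_level

   ! One argument: this still resolves to the default structure constructor
   low_level = error_t('File not found')

   ! Four arguments: this resolves to the function behind the interface
   high_level = error_t('Could not load settings', low_level, 'demo.f90', 42)

   print '(a)', high_level%message//' <- '//high_level%previous%message
end program demo_error_variants
```

Here the components are filled one by one with allocate(..., source=...) instead of going through the structure constructor. That is only a choice made for this sketch; calling the structure constructor with named arguments, as in the function above, should work as well.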

Stack traces

Having a file and line number in the error itself is nice. But most likely we also want to find out what happened before the error occurred. In other words, we want the stack trace: which procedure calls have led to this problem. When actually trying to figure out what went wrong, we most likely need an interactive debugging session anyway, but the stack trace helps us find out where to start.

Unfortunately, there’s no built-in way to get a stack trace either. Compilers provide their own ways of doing this. For example, the IFX compiler we’re using has a subroutine tracebackqq, provided by its own ifcore module. Before we can use it, the compiler should be able to find it (this will be the case when you use oneAPI’s setvars script). Because we’re using FPM we need to define ifcore as an external module, so it doesn’t try to find the ifcore module in the src folder. We do this in fpm.toml:

[build]
external-modules = ["ifcore"]

Now we can add the traceback to our fatal_error subroutine:

subroutine fatal_error(error, exit_code)
+  use ifcore, only: tracebackqq
   use, intrinsic :: iso_fortran_env, only: error_unit

   class(error_t), intent(in) :: error
   integer, intent(in), optional :: exit_code

   write (error_unit, fmt=*) error%get_message()

+  call tracebackqq()

   ! ...
end subroutine fatal_error

The output will look like this:

Something went wrong in errors_part_7.f90 on line 21
Image              PC                Routine            Line        Source             
errors_part_7      0000000000405C18  fatal_error                38  errors_part_7_module.f90
errors_part_7      00000000004055D7  demo_fatal_error           21  errors_part_7.f90
errors_part_7      00000000004052FE  main                       11  errors_part_7.f90
errors_part_7      00000000004052CD  Unknown               Unknown  Unknown
libc.so.6          0000780C403821CA  Unknown               Unknown  Unknown
libc.so.6          0000780C4038228B  __libc_start_main     Unknown  Unknown
errors_part_7      00000000004051E5  Unknown               Unknown  Unknown

It includes the names of the subroutines, the files, and the line numbers, at least in a debug build. A release build will likely not have this information, although you can turn it on there as well if you want.

When we check the exit code of the program, it turns out to be 0. That’s because tracebackqq() also stops the program. If we don’t want that, we have to pass -1 as the argument for user_exit_code:

-call tracebackqq()
+call tracebackqq(user_exit_code=-1)

Note that using subroutines like tracebackqq() is compiler-dependent. If you have to support multiple compilers, you may also have to provide alternatives or disable some code. In such cases you can use pre-processor macros like this:

#ifdef __INTEL_COMPILER
      call tracebackqq(user_exit_code=-1)
#endif
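Building on that, a compiler-independent wrapper could be sketched as follows. The module and the print_stack_trace subroutine are hypothetical names. For gfortran the sketch assumes its backtrace subroutine, which is a GNU extension and not standard Fortran. Note that the use statement for ifcore needs the same guard as the call itself, otherwise other compilers will fail to find the module. The file has to be compiled with the pre-processor enabled.

```fortran
module stack_trace_sketch
#ifdef __INTEL_COMPILER
   use ifcore, only: tracebackqq
#endif

   implicit none

   private
   public :: print_stack_trace

contains

   ! Hypothetical wrapper that hides the compiler-specific calls
   subroutine print_stack_trace()
#if defined(__INTEL_COMPILER)
      ! -1 makes tracebackqq return instead of stopping the program
      call tracebackqq(user_exit_code=-1)
#elif defined(__GFORTRAN__)
      ! GNU extension, not available with strict standard conformance flags
      call backtrace()
#else
      ! Unknown compiler: no traceback available, so do nothing
      continue
#endif
   end subroutine print_stack_trace

end module stack_trace_sketch

program demo_stack_trace
   use stack_trace_sketch, only: print_stack_trace

   implicit none

   call print_stack_trace()
   print '(a)', 'Still running after the traceback'
end program demo_stack_trace
```

With a wrapper like this, fatal_error only has to call print_stack_trace() and no longer needs to know which compiler is in use.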

There’s a nice fortran-error-handling library that implements many of the ideas offered in this article series. It offers standardized, portable solutions, e.g. for the backtrace/stack trace problem. The library can be installed with FPM.

This post concludes the series on error handling.
