Fortran: Enumeration, part 3
Matthias Noback
In the previous post we properly encapsulated the integer
level with a log_level_t
derived type, achieving type safety.
Now, let’s consider the following feature request: we want to show the log level in front of the message. We could do it with decoration of course, but in the scope of this article, let’s do it in the log()
subroutine directly. In this case we’re “lucky” that we can access the module-private data component level
, so we can do a select case
on it to determine what string should be printed:
subroutine log(message, level)
character(len=*), intent(in) :: message
type(log_level_t), intent(in) :: level
character(len=:), allocatable :: level_as_string
select case (level%level)
case (0)
level_as_string = 'DEBUG'
case (1)
level_as_string = 'INFO'
case (2)
level_as_string = 'WARNING'
case (3)
level_as_string = 'ERROR'
case (4)
level_as_string = 'FATAL'
end select
print *, level_as_string, ' ', message
end subroutine log
This works, but the design isn’t great:
- We’re replicating the full list of integer constants outside the level derived type. This breaks the encapsulation we’ve so carefully implemented.
- When we add a new factory method, we shouldn’t forget to add a new case here. The compiler won’t remind us, and there’s no visible link between these two places. This is a very common code smell…
Restoring encapsulation
When you break encapsulation by reaching inside a type for one of its private
data components, there’s often a simple solution: move the code to a procedure bound to the type itself. In this case it would be most helpful if the log_level_t
instance itself could produce the string label:
subroutine log(message, level)
character(len=*), intent(in) :: message
type(log_level_t), intent(in) :: level
print *, level%as_string(), ' ', message
end subroutine log
We need to declare the as_string
procedure:
type :: log_level_t
private
integer :: level
+contains
+ procedure :: as_string => log_level_as_string
end type log_level_t
And the module procedure itself:
pure function log_level_as_string(self) result(level_as_string)
class(log_level_t), intent(in) :: self
character(len=:), allocatable :: level_as_string
select case (self%level)
case (0)
level_as_string = 'DEBUG'
case (1)
level_as_string = 'INFO'
! And so on ...
end select
end function log_level_as_string
At least this helps keep the integer
level
inside log_level_t
. But we still have that duplicate list of possible integer
values, that doesn’t automatically track the list of actually supported log levels.
An idea to keep the list automatically in sync is to tie the number and the label together: the factory functions that create the log_level_t
instances should provide both these values, which are then stored as data components on log_level_t
:
module logging_base
! ...
type :: log_level_t
private
integer :: level
+ character(len=:), allocatable :: label
contains
procedure :: as_string => log_level_as_string
end type log_level_t
contains
! ...
function log_level_debug(self) result(res)
class(log_level_factory_t), intent(in) :: self
type(log_level_t) :: res
res%level = 0
+ res%label = 'DEBUG'
end function log_level_debug
! And so on...
end module logging_base
We no longer need the select case
statement; we can just return the label
data component from log_level_t
’s as_string()
procedure:
pure function log_level_as_string(self) result(level_as_string)
class(log_level_t), intent(in) :: self
character(len=:), allocatable :: level_as_string
level_as_string = self%label
end function log_level_as_string
We’ve solved the problem of the disconnected list of log levels! But what’s still somewhat strange about the current design is that every time you retrieve a log level from the factory, it will create a copy of the label too. That’s because the label has to end up in a data component, which is essentially a variable. However, it doesn’t have to be a variable, because we never change it.
Subtypes
To fix this, we should stop relying on data components and instead rely only on functions. We do this by extending the type system: log_level_t
becomes an abstract
type.
-type :: log_level_t
+type, abstract :: log_level_t
private
integer :: level
+ character(len=:), allocatable :: label
contains
- procedure :: as_string => log_level_as_string
+ procedure(log_level_as_string), deferred :: as_string
end type
We removed the label
data component, since we no longer want to store the label in a variable. Instead, we want as_string
to return it as a constant string. Since the label still needs to be different for each specific log level, each subtype needs to implement the deferred
procedure as_string
itself, following the specified interface
log_level_as_string
:
interface
pure function log_level_as_string(self) result(level_as_string)
import log_level_t
implicit none(type, external)
class(log_level_t), intent(in) :: self
character(len=:), allocatable :: level_as_string
end function log_level_as_string
end interface
Specific log levels become concrete types that extend log_level_t
, like debug_level_t
:
type, extends(log_level_t) :: debug_level_t
contains
procedure :: as_string => debug_level_as_string
end type debug_level_t
! And so on for info, warning, etc.
contains
pure function debug_level_as_string(self) result(level_as_string)
class(debug_level_t), intent(in) :: self
character(len=:), allocatable :: level_as_string
level_as_string = 'DEBUG'
end function debug_level_as_string
! And so on for info, warning, etc.
The factory functions should be modified to return a concrete level type, e.g.
function log_level_factory_debug(self) result(res)
class(log_level_factory_t), intent(in) :: self
- type(log_level_t) :: res
+ type(debug_level_t) :: res
res%level = 0
- res%label = 'DEBUG'
end function log_level_factory_debug
Finally, log()
should expect not a specific type but any subtype of log_level_t
, which you can do with the polymorphic specification class
:
subroutine log(message, level)
character(len=*), intent(in) :: message
- type(log_level_t), intent(in) :: level
+ class(log_level_t), intent(in) :: level
print *, level%as_string(), ' ', message
end subroutine log
Admittedly, this approach requires more code, but it comes with a big advantage: the compiler will assist you when you want to expand the type hierarchy. For instance, if you want to support a new level, like an alert
level:
type, extends(log_level_t) :: alert_level_t
end type alert_level_t
The compiler (IFX in this case) will tell you:
error #8322: A deferred binding is inherited by non-abstract type;
It must be overridden. [AS_STRING]
procedure(log_level_as_string), deferred :: as_string
--------------------------------------------------^
Unfortunately, it doesn’t tell you which type does not have a binding for as_string
(alert_level_t
)…
The gfortran compiler says:
67 | type, extends(log_level_t) :: alert_level_t
| 1
Error: Derived-type ‘alert_level_t’ declared at (1) must be
ABSTRACT because ‘as_string’ is DEFERRED and not overridden
This is more helpful as it points to the place where something is wrong. But it gives the wrong solution: we don’t want to make alert_level_t
abstract
. At least it should mention the alternative option: “… or implement the deferred procedure as_string
”.
So, if you ever want to require some other new behavior from all log levels, simply add another deferred
procedure to the abstract
log_level_t
type. The compiler will tell you if you’ve updated all your subtypes.
Encapsulation and cohesion
In a previous post we saw how a point in 2D space is much better off being represented by a point_t
derived type, which is much richer in behavior than the separate real
x
and y
values. After introducing a new type, logic that is spread over the code base easily finds a new home. If you don’t know where to start, it’s helpful to look for places that break encapsulation: code that “reaches inside the type” for some data to do something with. Try to move that code to a type-bound procedure. When all the users of the type are only calling procedures, the data components can be made private
. Then the derived type has full control over the internally used data, including their types, and is free to change how it behaves internally, without breaking user code.
In this 3-part series on enumeration, we upgraded a primitive/scalar type (integer
) to a derived type and saw how this leads to additional behavior being added to it in the form of procedures. In the case of the rather simple concept of a log level there may be limit to that process, but we already introduced the is_less_severe_than
procedure, and the as_string
procedure, which definitely helped us encapsulate all the log-level-related data and logic.
Another reason to encapsulate the as_string
logic was that we didn’t want to replicate the list of integer constants used by log_level_factory_t
in a select case
statement outside the derived type. The reason is that if one thing changes (log_level_factory_t
), the other has to change too (the select case
). This is a specific example, but it occurs in many different ways, which is why it has often been generalized as the design rule to keep together what changes together (also known as cohesion). Make the distance as small as possible, so you can’t make a mistake there. A special case of cohesion is if you not only keep the things that change together really close together, but if you can actually make them the same thing. That’s what we did in this post.