Forgive me for my bad pun. As I mentioned in my previous Bash post, I'm going to show you some ways in which you can improve the design of Bash scripts. Again, it's a weird language, and a lot of what's below probably won't feel natural to you. Anyway, here we go.
I started out with a piece of code that looked like this:
BUILD_DIR="build"
function clean_up() {
    rm -r "$BUILD_DIR"
}
clean_up
Function arguments
Inside a function you can use all global and environment variables, which easily leads to smelly code like the snippet above: clean_up will behave differently based on what's in the global variable BUILD_DIR. This makes the function itself quite unpredictable, but also error-prone, as the value of BUILD_DIR may at one point not contain the name of a directory, or even be an empty string. Usually we would fix this by providing the path to the directory we'd like to remove as an argument of the function call, like this:
function clean_up() {
    rm -r "$1"
}
clean_up "$BUILD_DIR"
You may recognize this $1 syntax from previous Bash encounters: the variables $1, $2, and so on are the arguments that you provide when you run the script at the command line. Likewise, when calling a Bash function, $1, $2, and so on represent the arguments that the caller provided (by the way, I really like this symmetry between calling functions and running programs).
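To make that symmetry a bit more concrete, here is a minimal sketch. The describe_args function is a made-up helper, not part of the script we're improving; it only prints what it received:

```shell
#!/usr/bin/env bash

# Hypothetical helper, for illustration only: reports the arguments
# the caller passed to the function.
function describe_args() {
    echo "count: $#"     # $# holds the number of arguments
    echo "first: $1"
    echo "second: $2"
}

describe_args "build" "dist"
# Prints:
# count: 2
# first: build
# second: dist
```

Inside the function, $1 and $2 refer to the function's own arguments, not to the ones the script was started with.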
Passing the directory as an argument (although not a named, nor a typed argument) is good practice. It makes the function reusable. And equally important: predictable. Its behavior won't be influenced by changes in global variables.
Input validation
The only problem so far is that the clean_up function doesn't perform any input validation at all. You can even call this function without any argument and Bash won't complain about it: $1 simply expands to an empty string.
In order to function correctly, the following pre-conditions need to be met:
- Argument 1 needs to be provided.
- It should represent the path to an existing directory.
We can easily accomplish this by adding a -d test. An empty argument fails that test too, so it covers both pre-conditions. However, we can't really throw an exception if the directory doesn't exist. The best thing we can do is follow Unix conventions:
- Print an error message to stderr.
- exit with a non-zero exit code.
That way, the process running our script knows that it encountered a problem. We print the error message to stderr to prevent other processes from automatically processing the output in case the script is part of a longer chain of commands (e.g. command-a | command-b > output.txt).
function clean_up() {
    if [[ ! -d "$1" ]]; then
        echo "Argument 1 should be the path of an existing directory" 1>&2
        exit 1
    fi
    rm -r "$1"
}
Note that we echo and exit where we would normally like to throw an exception. echo prints to stdout (file descriptor 1), but we'd like to print to stderr (file descriptor 2). We accomplish this by redirecting the output from 1 to 2: 1>&2.
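If you'd like to see the difference between the two streams, here's a small sketch. The report function is hypothetical and exists only for this demonstration; it writes one line to each stream, and command substitution captures only what went to stdout:

```shell
#!/usr/bin/env bash

# Hypothetical helper, for illustration only: writes one line to
# stdout and one line to stderr.
function report() {
    echo "regular output"
    echo "error message" 1>&2
}

# $(...) only captures stdout; the error message bypasses the
# capture and still ends up on stderr.
captured="$(report)"
echo "captured: $captured"
```

The same thing happens in a pipeline: the next command only receives what was written to stdout.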
The clean_up function is starting to look reasonable. However, $1 is still a bad variable name: it doesn't explain what it represents. We'd rather call it directory, which is easy to do:
directory="$1"
if [[ ! -d "$directory" ]]; then
    #...
fi
rm -r "$directory"
Local, named variables
That's much better already! However, by default, variables in Bash are global. This means that once we set a variable inside a function, it will still be available outside that function after the function has been called:
function clean_up() {
    directory="$1"
    # ...
}
clean_up "build"
# This will show "build":
echo "$directory"
Bash has a way to mark variables as local to the current function: put the local keyword in front of the variable name, like this: local directory="$1". However, I recommend using declare, which also makes the variable local when used inside a function and has many more options, even allowing some rudimentary typing. Let's use the -r option in this case to mark the variable as read-only (PHP could also benefit from such an option by the way).
function clean_up() {
    declare -r directory="$1"
    #...
}
clean_up "build"
# This will show an empty string:
echo "$directory"
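As for the "rudimentary typing" mentioned above, here's a rough sketch of what that could look like. The count_items function and its variables are invented for illustration; -i gives a variable the integer attribute, so += performs addition instead of string concatenation:

```shell
#!/usr/bin/env bash

# Hypothetical function, for illustration only.
function count_items() {
    declare -i count="$1"     # -i: integer attribute
    declare -r label="items"  # -r: read-only

    count+=1                  # arithmetic, because of -i

    # Reassigning a read-only variable fails. The subshell keeps that
    # failure from affecting the rest of the function.
    if ! ( label="things" ) 2>/dev/null; then
        echo "label is read-only"
    fi

    echo "$count $label"
}

count_items 41
# Prints:
# label is read-only
# 42 items
```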
A nice debugging suggestion is to use declare -p to print whatever variables (including environment variables) have been declared at that point in the script.
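Here's a minimal sketch of that, assuming we only care about a single variable: passing a name to declare -p limits the output to that variable. show_state is a hypothetical function used only for this demonstration:

```shell
#!/usr/bin/env bash

# Hypothetical function, for illustration only.
function show_state() {
    declare -r directory="$1"
    # Prints the variable together with its attributes, in a form
    # that Bash could read back in.
    declare -p directory
}

show_state "build"
# Prints: declare -r directory="build"
```

The -r in the output tells you the variable is read-only, which is handy when you're not sure how something was declared.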
For completeness' sake, this is the full code of the final solution:
#!/usr/bin/env bash
function clean_up() {
    declare -r directory="$1"
    if [[ ! -d "$directory" ]]; then
        echo "Argument 1 should be the path of an existing directory" 1>&2
        exit 1
    fi
    rm -r "$directory"
}
clean_up "build"
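If you want to see both paths in action, something like the following should work. This is just a sketch: the scratch directory comes from mktemp -d, and each call runs in a subshell (the parentheses) so that exit 1 doesn't end the demo itself:

```shell
#!/usr/bin/env bash

function clean_up() {
    declare -r directory="$1"
    if [[ ! -d "$directory" ]]; then
        echo "Argument 1 should be the path of an existing directory" 1>&2
        exit 1
    fi
    rm -r "$directory"
}

# Success path: a throwaway directory created just for this demo.
scratch_dir="$(mktemp -d)"
( clean_up "$scratch_dir" )
echo "existing directory: exit code $?"

# Failure path: no argument at all. The error message is discarded
# here so that only the exit code shows up.
( clean_up ) 2>/dev/null
echo "missing argument: exit code $?"
```

The first call should report exit code 0 and the second one exit code 1, which is exactly what a calling process would see.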
Conclusion
In this article we've improved the design of the clean_up function. This is what you'd call a "command" function: it does something, it has side effects, and it may either succeed or fail, providing no particular return value. In another article I'll show you a query function that needs fixing.