Some may say that you shouldn’t write shell beyond a certain, very low bar of complexity. If you reach for arrays, certainly associative arrays (gasp!), or if your script approaches 20, 50, or 100 (how dare you!) lines, maybe you want a “real” language.
Everyone’s bar is different, but I’d wager actual options parsing is above it for most. I think that view is misguided; parsing options in shell can be valuable enough, and done with low enough complexity, to more than pay for itself on this scale. The problem, I think, is a lack of familiarity (did you even know you could parse options in shell?) coupled with confusing alternatives and an information-dense (read: overwhelming) documentation style in the space.
I’ve arrived at a narrow pattern of shell options parsing that has drastically improved my scripts, without introducing much by way of downside. By accepting some limitations, I think I’ve found a good 80/20 of benefit to complexity in this space.
Skeleton
Here is how I begin any script I write:
#!/bin/sh
usage() {
  cat <<'EOM'
TODO
EOM
}

while getopts h opt; do
  case "$opt" in
    h)
      usage
      exit 0
      ;;
    \?)
      usage >&2
      exit 64
      ;;
  esac
done

shift $((OPTIND - 1))

printf '>%s<\n' "$@" # For demonstration purposes
I used to do this “when I needed”, but I’m done fooling myself. I always end up wanting this, and I’m always happy when I’ve done it from the start. Seeing usage front-and-center, top-of-file, is great. Being able to expect -h in a script that isn’t used very often is extremely useful, for me and my team.
Let’s break down what’s happening:
The getopts program is typically a shell built-in and is specified by POSIX. This means you can use it in pretty much any shell, but it will be less featureful; no long options, for example. I prefer this over getopt, which is a separate external utility; GNU getopt does support long options, but its behavior varies by platform. I actually don’t care too much about POSIX compatibility, and I most often write scripts with a bash shebang, but I just find getopt’s usage very clunky. Your mileage may vary.
The h is the optstring or “options string”. It defines the options you are going to parse for. In this case, I’m saying the single option h without any arguments. I’ll extend it later and you’ll see how its syntax works.
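As a quick preview of that syntax, here are a few illustrative optstrings (none of these are part of the skeleton above):

getopts h opt      # a single flag: -h
getopts fh opt     # two flags: -f and -h
getopts fo:h opt   # -f and -h, plus -o ARG; the ':' means -o takes an argument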
opt is the name of the variable that getopts will place each parsed option into, once per iteration of the loop.
case "$opt" in
h)
usage
exit 0
;;
\?)
usage >&2
exit 64
;;
esac
As mentioned, this will loop with $opt set to each (valid) flag we see, or ? if we were given something invalid. If given h, I print usage information to stdout and exit successfully. The invalid branch is similar, except it prints to stderr and exits unsuccessfully.
I prefer to let getopts print its own error on invalid items:

% ./example -h
TODO
% ./example -f
./example: illegal option -- f
TODO
% echo $?
64

I think its messages are perfectly clear and I’m happy to not manage them myself. You can suppress these messages by prefixing the options string with :. See the manpage for more details.
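If you do want to manage the messages yourself, here is a minimal sketch of that “silent” mode. With a leading :, getopts reports an unknown option by setting opt to ? and a missing argument by setting opt to :, placing the offending option character in $OPTARG in both cases:

while getopts :fo:h opt; do
  case "$opt" in
    # ... handle f, o, and h as before ...
    :)
      echo "option -$OPTARG requires an argument" >&2
      exit 64
      ;;
    \?)
      echo "unknown option -$OPTARG" >&2
      exit 64
      ;;
  esac
done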
shift $((OPTIND - 1))
printf '>%s<\n' "$@"
Lastly, we shift past the parsed options. That way, anything we don’t handle in getopts is $@ at this point in the script:
% ./example foo bar "baz bat"
>foo<
>bar<
>baz bat<
% ./example -f foo bar "baz bat"
./example: illegal option -- f
TODO
And since we’re parsing options “for real” instead of ad hoc, we get some behavior for free, such as -- to separate option-like arguments, which is needed to support that last example:
% ./example -- -f foo bar "baz bat"
>-f<
>foo<
>bar<
>baz bat<
Flag options
Now, let’s parse another option:
usage() {
  cat <<'EOM'
Usage: thing [-fh]

Options
  -f    Force the thing
  -h    Print this help
EOM
}

force=0

while getopts fh opt; do
  case "$opt" in
    f)
      force=1
      ;;

    # ...
  esac
done
Here you see one downside compared to “real” languages’ options parsers: we have to do things 3 times.

- The argument to getopts must contain f
- The case statement must look for f
- The usage function must document -f
If you configure ShellCheck in your editor (you should!), it can at least protect you from most mistakes in item 2.
Options with arguments
Now, let’s add an option with an argument:
usage() {
  cat <<'EOM'
Usage: thing [-fh] <-o PATH>

Options
  -f    Force the thing
  -o    Output file
  -h    Print this help
EOM
}

force=0
output=

while getopts fo:h opt; do
  case "$opt" in
    # ...

    o)
      output=$OPTARG
      ;;

    # ...
  esac
done

if [ -z "$output" ]; then
  echo "-o is required" >&2
  usage >&2
  exit 64
fi
As before, the same 3 things:

- Add o: to the options string; the : indicates an argument is required
- Look for o in the case; the argument will be present in $OPTARG
- Document accordingly in usage
And we see a new downside: required options are on us to enforce. This is certainly error-prone, but again, I’m shooting for the 80/20 on complexity vs featurefulness. If getopts somehow supported declaring options as required, it would then need to also support defaulting, and going in that direction can cause the complexity to spiral too far for POSIX. For what it’s worth, I agree with where they’ve drawn the line; and leaving that to us makes defaulting pretty easy:
usage() {
  cat <<'EOM'
Usage: thing [-fh] [-o PATH]

Options
  -f    Force the thing
  -o    Output file, default is stdout
  -h    Print this help
EOM
}

output=/dev/stdout

while getopts # ...
Complete example
This snippet should be a good copy-paste source for the limit of what POSIX getopts provides:
#!/bin/sh
usage() {
  cat <<'EOM'
Usage: thing-mover [-fh] [-o PATH] [--] <THING> [THING...]

Move things into some output.

Options:
  -f    Overwrite output even if it exists
  -o    Output path, default is stdout
  -h    Show this help

Arguments:
  THING    Thing to move
EOM
}

force=0
output=/dev/stdout

while getopts fo:h opt; do
  case "$opt" in
    f)
      force=1
      ;;
    o)
      output=$OPTARG
      ;;
    h)
      usage
      exit 0
      ;;
    \?)
      usage >&2
      exit 64
      ;;
  esac
done

shift $((OPTIND - 1))

if [ $# -eq 0 ]; then
  echo "At least one thing is required" >&2
  usage >&2
  exit 64
fi

for thing in "$@"; do
  if thing_exists "$thing"; then
    if [ "$force" -ne 1 ]; then
      echo "Thing exists!" >&2
      exit 1
    fi
  fi

  move_thing "$thing" "$output"
done
NOTE: Normally I would just do nothing if no things were passed, as a form of “define errors out of existence”, but I’m enforcing the argument for demonstration purposes here.
23 Mar 2021, tagged with shell
Have you ever wanted to mock a program on your system so you could write fast and reliable tests around a shell script which calls it? Yeah, I didn’t think so.
Well I did, so here’s how I did it.
Cram
Verification testing of shell scripts is surprisingly easy. Thanks to Unix, most shell scripts have limited interfaces with their environment. Assertions against stdout can often be enough to verify a script’s behavior.

One tool that makes these kinds of executions and assertions easy is cram.
Cram’s mechanics are very simple. You write a test file like this:

The ls command should print one column when passed -1

  $ mkdir foo
  > touch foo/bar
  > touch foo/baz

  $ ls -1 foo
  bar
  baz
Any line beginning with an indented $ is executed (with > allowing multi-line commands). The indented text below such commands is compared with the actual output at that point. If it doesn’t match, the test fails and a contextual diff is shown.

With this philosophy, retrofitting tests onto an already working script is incredibly easy. You just put in a command, run the test, then insert whatever the actual output was as the assertion. Cram’s --interactive flag is meant for exactly this. Aces.
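For example, given a hypothetical test file example.t with missing or stale assertions, an interactive run looks something like this, offering to write the actual output back as the expectation:

% cram --interactive example.t
!
--- example.t
+++ example.t.err
(contextual diff here)
Accept this change? [yN] y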
Not Quite
Suppose your script calls a program internally whose behavior depends on transient things which are outside of your control. Maybe you call curl, which of course depends on the state of the internet between you and the server you’re accessing. With the output changing between runs, these tests become more trouble than they’re worth.
What’d be really great is if I could do the following:

- Intercept calls to the program
- Run the program normally, but record “the response”
- On subsequent invocations, just replay the response and don’t call the program

This means I could run the test suite once, letting it really call the program, but record the stdout, stderr, and exit code of the call. The next time I run the test suite, nothing would actually happen. The recorded response would be replayed instead, my script wouldn’t know the difference, and everything would pass reliably and instantly.
In case you didn’t notice, this is VCR.
The only limitation here is that a mock must be completely effective while only mimicking the stdout, stderr, and exit code of what it’s mocking. A command that creates files, for example, which are used by other parts of the script, could not be mocked this way.
Mucking with PATH
One way to intercept calls to executables is to prepend $PATH with some controllable directory. Files placed in this leading directory will be found first in command lookups, allowing us to handle the calls.

I like to write my cram tests so that the first thing they do is source a test/helper.sh, so this makes a nice place to do such a thing:

test/helper.sh

export PATH="$TESTDIR/..:$TESTDIR/bin:$PATH"

This ensures that a) the executable in the source directory is used and b) anything in test/bin will take precedence over system commands.
Now all we have to do to mock foo is add a test/bin/foo which will be executed whenever our Subject Under Test calls foo.
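A test file might then begin like this (my-script and its output are hypothetical stand-ins for whatever you’re testing; cram exports $TESTDIR as the directory containing the test file):

  $ . "$TESTDIR/helper.sh"
  $ my-script --help
  usage: my-script ...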
Record/Replay
The logic of what to do in a mock script is straightforward:

- Build a unique identifier for the invocation
- Look up a stored “response” by that identifier
- If not found, run the program and record said response
- Reply with the recorded response to satisfy the caller
We can easily abstract this in a short, generic proxy:

test/bin/act-like

#!/usr/bin/env bash
program="$1"; shift
base="${program##*/}"

# identify this invocation by the program name and a hash of its arguments
fixtures="${TESTDIR:-test}/fixtures/$base/$(echo "$*" | md5sum | cut -d ' ' -f 1)"

if [[ ! -d "$fixtures" ]]; then
  # no recording yet: really run the program and capture its response
  mkdir -p "$fixtures"
  "$program" "$@" >"$fixtures/stdout" 2>"$fixtures/stderr"
  echo $? >"$fixtures/exit_code"
fi

# replay the recorded response
cat "$fixtures/stdout"
cat "$fixtures/stderr" >&2

read -r exit_code <"$fixtures/exit_code"
exit "$exit_code"
With this in hand, we can record any invocation of anything we like (so long as we only need to mimic the stdout, stderr, and exit code).
test/bin/curl
#!/usr/bin/env bash
act-like /usr/bin/curl "$@"
test/bin/makepkg
#!/usr/bin/env bash
act-like /usr/bin/makepkg "$@"
test/bin/pacman
#!/usr/bin/env bash
act-like /usr/bin/pacman "$@"
Success!
After my next test run, I find the following:
$ tree test/fixtures
test/fixtures
├── curl
│   ├── 008f2e64f6dd569e9da714ba8847ae7e
│   │   ├── exit_code
│   │   ├── stderr
│   │   └── stdout
│   ├── 2c5906baa66c800b095c2b47173672ba
│   │   ├── exit_code
│   │   ├── stderr
│   │   └── stdout
│   ├── c50061ffc84a6e1976d1e1129a9868bc
│   │   ├── exit_code
│   │   ├── stderr
│   │   └── stdout
│   ├── f38bb573029c69c0cdc96f7435aaeafe
│   │   ├── exit_code
│   │   ├── stderr
│   │   └── stdout
│   ├── fc5a0df540104584df9c40d169e23d4c
│   │   ├── exit_code
│   │   ├── stderr
│   │   └── stdout
│   └── fda35c202edffac302a7b708d2534659
│       ├── exit_code
│       ├── stderr
│       └── stdout
├── makepkg
│   └── 889437f54f390ee62a5d2d0347824756
│       ├── exit_code
│       ├── stderr
│       └── stdout
└── pacman
    └── af8e8c81790da89bc01a0410521030c6
        ├── exit_code
        ├── stderr
        └── stdout

11 directories, 24 files
Each hash-directory, representing one invocation of the given program, contains the full response in the form of stdout, stderr, and exit_code files.
I run my tests again. This time, rather than calling any of the actual programs, the responses are found and replayed. The tests pass instantly.
24 Aug 2013, tagged with shell
If you’re like me (which you’re probably not…), you enjoy listening to your music with the great music playing daemon known as mpd. You also have your entire collection on shuffle.
Occasionally, I’ll fall into a valley of bad music and end up hitting next far too much to get to a good song. For this reason, I wrote goodsong.
What is it?
Essentially, you press one key command to say “the currently playing song is good”, then press a different key to say “play me a good song”.

Goodsong accomplishes exactly that. It creates a playlist file to which you can auto-magically add the currently playing song with the command goodsong. Subsequently, running goodsong -p will play a random track from that same list.
Here’s the --help:

usage: goodsong [ -p | -ls ]

options:
  -p,--play     play a random good song
  -ls,--list    print your list with music dir prepended
  none          note the currently playing song as good
Installation
Goodsong is available in its current form in my git repo.
Usage
Using goodsong is easy. You can always just run it from the CLI, but I find it’s best when bound to keys. I’ll leave the method for that up to you; xbindkeys is a nice WM-agnostic way to bind some keys, or you can use a WM-specific configuration to do so.

Personally, I keep Alt-g as goodsong and Alt-Shift-g as goodsong -p.
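As a sketch, those bindings in an ~/.xbindkeysrc might look like this (mod1 is typically the Alt key):

# ~/.xbindkeysrc
"goodsong"
  mod1 + g

"goodsong -p"
  shift + mod1 + g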
You’re going to have to spend some time logging songs as “good” before the -p option becomes useful.
I recently received a patch from a reader for this script. It adds a few features which I’ve happily merged in:

- Various methods are employed to try and determine exactly what mpd.conf you’re currently running with at the time
- The goodsong list is now a legitimate playlist file stored in your playlist_directory as specified in mpd.conf
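For illustration only (this is not necessarily what the patch does), locating mpd.conf and reading playlist_directory from it could look something like:

# hypothetical lookup, not goodsong's actual code
for f in "${XDG_CONFIG_HOME:-$HOME/.config}/mpd/mpd.conf" \
         "$HOME/.mpdconf" /etc/mpd.conf; do
  if [ -f "$f" ]; then
    conf=$f
    break
  fi
done

# mpd.conf entries look like: playlist_directory "~/music/playlists"
playlist_dir=$(sed -n 's/^playlist_directory[[:space:]]*"\(.*\)"/\1/p' "$conf")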
05 Dec 2009, tagged with shell
Do not use this for bad things, m’kay?
What it looks like
Usage
usage: dvdcopy [ --option(=<argument>) ] [...]
~/.dvdcopy.conf will be read first if it's found (even if --config is passed). for syntax, see the help entry for the --config option. commandline arguments will overrule what's defined in the config. invalid options are ignored.
options:

  --config=<file>                read any of the below options from a
                                 file, note that you must strip the
                                 '--' and set any argument-less
                                 options specifically to either true
                                 or false

                                 there is no error if <file> doesn't
                                 exist

  --directory=<directory>        set the working directory, default
                                 is ./dvdcopy

  --keep_files                   keep all intermediate files; note
                                 that they will be removed the next
                                 time dvdcopy is run regardless of
                                 this option

  --device=<file>                set the reader/burner, default is
                                 /dev/sr0

  --title=<number>               set the title, default is longest

  --size=<number>                set the desired output size in KB,
                                 default is 4193404

  --limit=<number>               set the number of times to attempt a
                                 read/burn before giving up, default
                                 is 15

  --mpeg_only                    stop after transcoding the mpeg

  --dvd_only                     stop after authoring the dvd

  --iso_only                     stop after generating the iso

  --mpeg_dir=<directory>         set a save location for the
                                 intermediate mpeg file, default is
                                 blank -- don't save it

  --dvd_dir=<directory>          set a save location for the
                                 intermediate vob folder, default is
                                 blank -- don't save it

  --iso_dir=<directory>          set a save location for the
                                 intermediate iso file, default is
                                 blank -- don't save it

  --mencoder_options=<options>   pass additional arbitrary arguments
                                 to mencoder, multiple options should
                                 be quoted and there is no validation
                                 on these; you'll need to know what
                                 you're doing. the options are placed
                                 after '-dvd-device <device>' but
                                 before all others

  --quiet                        be quiet

  --verbose                      be verbose

  --force                        disable any options validation,
                                 useful if ripping from an image file

  --help                         print this
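Going by the --config description above (strip the '--', spell out flag options as true or false), a config file might look something like this entirely hypothetical example:

# ~/.dvdcopy.conf (hypothetical)
directory=/home/you/dvdcopy
device=/dev/sr0
size=4193404
keep_files=true
quiet=false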
What’s it do?
Pop in a standard DVD9 (~9GB) and type dvdcopy. The script will calculate the video bitrate required to create an ISO under 4.3GB (standard DVD5). It will then use mencoder to create an authorable image and burn it back to a disc playable on any standard player.
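The arithmetic behind that bitrate calculation is roughly as follows; this sketch uses made-up numbers and is not dvdcopy's exact code:

# illustrative only: derive a video bitrate (kbps) from a target size
target_kb=4193404   # desired output size in KB (the --size default)
length_s=7200       # title length in seconds (hypothetical)
audio_kbps=192      # audio bitrate to reserve (hypothetical)

# KB * 8 = kilobits; divide by seconds for total kbps, then reserve audio
vbitrate=$(( target_kb * 8 / length_s - audio_kbps ))
echo "$vbitrate"    # => 4467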
Defaults are sane (IMO), but can be adjusted through the config file or the options passed at runtime (or both). I’ve now added a lot of cool features as described in the help.
How to get it
Install the AUR package here.
Grab the source from my git repo here.
05 Dec 2009, tagged with shell