Automated Unit Testing in Haskell

Hspec is a BDD library for writing Rspec-style tests in Haskell. In this post, I’m going to describe setting up a Haskell project using this test framework. What we’ll end up with is a series of tests which can be run individually (at the module level), or all together (as part of packaging). Then I’ll briefly mention Guard (a Ruby tool) and how we can use that to automatically run relevant tests as we change code.

Project Layout

For any of this to work, our implementation and test modules must follow a particular layout:

├── src
│   └── Text
│       ├── Liquid
│       │   ├── Context.hs
│       │   ├── Parse.hs
│       │   └── Render.hs
│       └── Liquid.hs
└── test
    ├── SpecHelper.hs
    ├── Spec.hs
    └── Text
        └── Liquid
            ├── ParseSpec.hs
            └── RenderSpec.hs

Notice that for each implementation module (under ./src) there is a corresponding spec file at the same relative path (under ./test) with a consistent, conventional name (<ModuleName>Spec.hs). For this post, I’m going to outline the first few steps of building the Parse module of the above source tree which happens to be my liquid library, a Haskell implementation of Shopify’s template system.

Hspec Discover

Hspec provides a useful function called hspec-discover. If your project follows the conventional layout above, you can simply create a file like so:


{-# OPTIONS_GHC -F -pgmF hspec-discover #-}

And when that file is executed, all of your specs will be found and run together as a single suite.


I like to create a central helper module which gets imported into all specs. It simply exports our test framework and implementation code:


module SpecHelper
    ( module Test.Hspec
    , module Text.Liquid.Parse
    ) where

import Test.Hspec
import Text.Liquid.Parse

This file might not seem worth it now, but as you add more modules, it becomes useful quickly.

Baby’s First Spec


module Text.Liquid.ParseSpec where

import SpecHelper

spec :: Spec
spec = do
    describe "Text.Liquid.Parse" $ do
        context "Simple text" $ do
            it "parses exactly as-is" $ do
                let content = "Some simple text"

                parseTemplate content `shouldBe` Right [TString content]

main :: IO ()
main = hspec spec

With this first spec, I’ve already made some assumptions and design decisions.

The API into our module will be a single parseTemplate function which returns an Either type (commonly used to represent success or failure). The Right value (conventionally used for success) will be a list of template parts. One such part can be constructed with the TString function and is used to represent literal text with no interpolation or logic. This is the simplest template part possible and is therefore a good place to start.

The spec function is what will be found by hspec-discover and rolled up into a project-wide test. I’ve also added a main function which just runs said spec. This allows me to easily run the spec in isolation, which you should do now:

$ runhaskell -isrc -itest test/Text/Liquid/ParseSpec.hs

The first error you should see is an inability to find Test.Hspec. Go ahead and install it:

$ cabal install hspec

You should then get a similar error for Text.Liquid.Parse then some more about functions and types that are not yet defined. Let’s go ahead and implement just enough to get past that:


module Text.Liquid.Parse where

type Template = [TPart]

data TPart = TString String

parseTemplate :: String -> Either Template String
parseTemplate = undefined

The test should run now and give you a nice red failure due to the attempted evaluation of undefined.

Since implementing Parse is not the purpose of this post, I won’t be moving forward in that direction. Instead, I’m going to show you how to set this library up as a package which can be cabal installed and/or cabal tested by end-users.

For now, you can pass the test easily like so:


parseTemplate :: String -> Either Template String
parseTemplate str = Right [TString str]

For TDD purists, this is actually the correct thing to do here: write the simplest implementation to pass the test (even if you “know” it’s not going to last), then write another failing test to force you to implement a little more. I don’t typically subscribe to that level of TDD purity, but I can see the appeal.


We’ve already got Spec.hs which, when executed, will run all our specs together:

$ runhaskell -isrc -itest test/Spec.hs

We just need to wire that into the Cabal packaging system:


name:          liquid
version:       0.0.0
license:       MIT
copyright:     (c) 2013 Pat Brisbin
author:        Pat Brisbin <pbrisbin@gmail.com>
maintainer:    Pat Brisbin <pbrisbin@gmail.com>
build-type:    Simple
cabal-version: >= 1.8

  hs-source-dirs: src

  exposed-modules: Text.Liquid.Parse

  build-depends: base == 4.*

test-suite spec
  type: exitcode-stdio-1.0

  hs-source-dirs: test

  main-is: Spec.hs

  build-depends: base  == 4.*
               , hspec >= 1.3
               , liquid

With this in place, testing our package is simple:

$ cabal configure --enable-tests
$ cabal build
$ cabal test
Building liquid-0.0.0...
Preprocessing library liquid-0.0.0...
In-place registering liquid-0.0.0...
Preprocessing test suite 'spec' for liquid-0.0.0...
Linking dist/build/spec/spec ...
Running 1 test suites...
Test suite spec: RUNNING...
Test suite spec: PASS
Test suite logged to: dist/test/liquid-0.0.0-spec.log
1 of 1 test suites (1 of 1 test cases) passed.


Another thing I like to setup is the automatic running of relevant specs as I change code. To do this, we can use a tool from Ruby-land called Guard. Guard is a great example of a simple tool doing one thing well. All it does is watch files and execute actions based on rules defined in a Guardfile. Through plugins and extensions, there are a number of pre-built solutions for all sorts of common needs: restarting servers, regenerating ctags, or running tests.

We’re going to use guard-shell which is a simple extension allowing for running shell commands and spawning notifications.

$ gem install guard-shell

Next, create a Guardfile:


# Runs the command and prints a notification
def execute(cmd)
  if system(cmd)
    n 'Build succeeded', 'hspec', :success
    n 'Build failed', 'hspec', :failed

def run_all_tests
  execute %{
    cabal configure --enable-tests &&
    cabal build && cabal test

def run_tests(mod)
  specfile = "test/#{mod}Spec.hs"

  if File.exists?(specfile)
    files = [specfile]
    files = Dir['test/**/*.hs']

  execute "ghc -isrc -itest -e main #{files.join(' ')}"

guard :shell do
  watch(%r{.*\.cabal$})          { run_all_tests }
  watch(%r{test/SpecHelper.hs$}) { run_all_tests }
  watch(%r{src/(.+)\.hs$})       { |m| run_tests(m[1]) }
  watch(%r{test/(.+)Spec\.hs$})  { |m| run_tests(m[1]) }

Much of this Guardfile comes from this blog post by Michael Xavier. His version also includes cabal sandbox support, so be sure to check it out if that interests you.

If you like to bundle all your Ruby gems (and you probably should) that can be done easily, just see my main liquid repo as that’s how I do things there.

In one terminal, start guard:

$ guard

Finally, simulate an edit in your module and watch the test automatically run:

$ touch src/Text/Liquid/Parse.hs

And there you go, fully automated unit testing in Haskell.

01 Dec 2013, tagged with testing, haskell, cabal, hunit, ruby, guard

Mocking Bash

Have you ever wanted to mock a program on your system so you could write fast and reliable tests around a shell script which calls it? Yeah, I didn’t think so.

Well I did, so here’s how I did it.


Verification testing of shell scripts is surprisingly easy. Thanks to Unix, most shell scripts have limited interfaces with their environment. Assertions against stdout can often be enough to verify a script’s behavior.

One tool that makes these kind of executions and assertions easy is cram.

Cram’s mechanics are very simple. You write a test file like this:

The ls command should print one column when passed -1

  $ mkdir foo
  > touch foo/bar
  > touch foo/baz

  $ ls -1 foo

Any line beginning with an indented $ is executed (with > allowing multi-line commands). The indented text below such commands is compared with the actual output at that point. If it doesn’t match, the test fails and a contextual diff is shown.

With this philosophy, retrofitting tests on an already working script is incredibly easy. You just put in a command, run the test, then insert whatever the actual output was as the assertion. Cram’s --interactive flag is meant for exactly this. Aces.

Not Quite

Suppose your script calls a program internally whose behavior depends on transient things which are outside of your control. Maybe you call curl which of course depends on the state of the internet between you and the server you’re accessing. With the output changing between runs, these tests become more trouble than they’re worth.

What’d be really great is if I could do the following:

  1. Intercept calls to the program
  2. Run the program normally, but record “the response”
  3. On subsequent invocations, just replay the response and don’t call the program

This means I could run the test suite once, letting it really call the program, but record the stdout, stderr, and exit code of the call. The next time I run the test suite, nothing would actually happen. The recorded response would be replayed instead, my script wouldn’t know the difference and everything would pass reliably and instantly.

In case you didn’t notice, this is VCR.

The only limitation here is that a mock must be completely affective while only mimicking the stdout, stderr, and exit code of what it’s mocking. A command that creates files, for example, which are used by other parts of the script could not be mocked this way.

Mucking with PATH

One way to intercept calls to executables is to prepend $PATH with some controllable directory. Files placed in this leading directory will be found first in command lookups, allowing us to handle the calls.

I like to write my cram tests so that the first thing they do is source a test/helper.sh, so this makes a nice place to do such a thing:


export PATH="$TESTDIR/..:$TESTDIR/bin:$PATH"

This ensures that a) the executable in the source directory is used and b) anything in test/bin will take precedence over system commands.

Now all we have to do to mock foo is add a test/bin/foo which will be executed whenever our Subject Under Test calls foo.


The logic of what to do in a mock script is straight forward:

  1. Build a unique identifier for the invocation
  2. Look up a stored “response” by that identifier
  3. If not found, run the program and record said response
  4. Reply with the recorded response to satisfy the caller

We can easily abstract this in a generic, 12 line proxy:


#!/usr/bin/env bash
program="$1"; shift

fixtures="${TESTDIR:-test}/fixtures/$base/$(echo $* | md5sum | cut -d ' ' -f 1)"

if [[ ! -d "$fixtures" ]]; then
  mkdir -p "$fixtures"
  $program "$@" >"$fixtures/stdout" 2>"$fixtures/stderr"
  echo $? > "$fixtures/exit_code"

cat "$fixtures/stdout"
cat "$fixtures/stderr" >&2

read -r exit_code < "$fixtures/exit_code"

exit $exit_code

With this in hand, we can record any invocation of anything we like (so long as we only need to mimic the stdout, stderr, and exit code).


#!/usr/bin/env bash
act-like /usr/bin/curl "$@"


#!/usr/bin/env bash
act-like /usr/bin/makepkg "$@"


#!/usr/bin/env bash
act-like /usr/bin/pacman "$@"


After my next test run, I find the following:

$ tree test/fixtures
├── curl
│   ├── 008f2e64f6dd569e9da714ba8847ae7e
│   │   ├── exit_code
│   │   ├── stderr
│   │   └── stdout
│   ├── 2c5906baa66c800b095c2b47173672ba
│   │   ├── exit_code
│   │   ├── stderr
│   │   └── stdout
│   ├── c50061ffc84a6e1976d1e1129a9868bc
│   │   ├── exit_code
│   │   ├── stderr
│   │   └── stdout
│   ├── f38bb573029c69c0cdc96f7435aaeafe
│   │   ├── exit_code
│   │   ├── stderr
│   │   └── stdout
│   ├── fc5a0df540104584df9c40d169e23d4c
│   │   ├── exit_code
│   │   ├── stderr
│   │   └── stdout
│   └── fda35c202edffac302a7b708d2534659
│       ├── exit_code
│       ├── stderr
│       └── stdout
├── makepkg
│   └── 889437f54f390ee62a5d2d0347824756
│       ├── exit_code
│       ├── stderr
│       └── stdout
└── pacman
    └── af8e8c81790da89bc01a0410521030c6
        ├── exit_code
        ├── stderr
        └── stdout

11 directories, 24 files

Each hash-directory, representing one invocation of the given program, contains the full response in the form of stdout, stderr, and exit_code files

I run my tests again. This time, rather than calling any of the actual programs, the responses are found and replayed. The tests pass instantly.

24 Aug 2013, tagged with bash, testing, mocks, cram, aurget, arch