What follows is a literate haskell file runnable via

`ghci`

. The raw source for this page can be found here.
While reading Understanding Computation again last night, I was going back through the chapter where Tom Stuart describes deterministic and non-deterministic finite automata. These simple state machines seem like little more than a teaching tool, but he eventually uses them as the implementation for a regular expression matcher. I thought seeing this concrete use for such an abstract idea was interesting and wanted to re-enforce the ideas by implementing such a system myself -- with Haskell, of course.

Before we get started, we'll just need to import some libraries:

```
import Control.Monad.State
import Data.List (foldl')
import Data.Maybe
```

We're going to model a subset of regular expression patterns.

```
data Pattern =
Empty -- ""
| Literal Char -- "a"
| Concat Pattern Pattern -- "ab"
| Choose Pattern Pattern -- "a|b"
| Repeat Pattern -- "a*"
```

Checking for matches against these patterns means converting them to Deterministic Finite Automata or DFAs, then letting the DFAs operate on input strings and give us a useful answer.

Since this is a recursive data type, we're going to have to recursively create and combine DFAs. For example, in a `Concat`

pattern, we'll need to turn both sub-patterns into DFAs then combine those in some way. In the Ruby implementation, Mr. Stuart used `Object.new`

to ensure unique state identifiers between all the DFAs he has to create. We can't do that in Haskell. There's no global object able to provide some guaranteed-unique value.

What we're going to do to get around this is conceptually simple, but appears complicated because it makes use of monads. All we're doing is defining a list of identifiers at the beginning of our program and drawing from that list whenever we need a new identifier. Because we can't maintain that as a variable we constantly update every time we pull an identifier out, we'll use the `State`

monad to mimic mutable state through our computations.

I apologize for the naming confusion here. This

`State`

type is from the Haskell library and has nothing to with the states of our DFAs.
Our identifiers are just `Int`

s, but we'll call them `DFAState`

s throughout the system.

`type DFAState = Int`

We can then take the polymorphic `State s a`

type, and fix the `s`

variable as a list of (potential) identifiers.

`type DSI a = State [DFAState] a`

We can now use `evalState`

to provide the infinite list of integers as the pool to draw from, execute some stateful action, and ultimately return the result.

```
runDSI :: DSI a -> a
runDSI f = evalState f [1..]
```

This makes it simple to create a `nextId`

action which requests the next identifier from this list as well as updates the computation's state, removing it as a future option before presenting that next identifier as its result.

```
nextId :: DSI DFAState
nextId = do
(x:xs) <- get
put xs
return x
```

As long as our program is a single action passed to `runDSI`

(which we can build incrementally and compose monadically), we're guaranteed to get a unique state identifier every time we want one.

Who needs mutable state to write programs?

DFAs are very simple machines. They have some states and some rules. They read characters in and move from state to state according to those rules. Some states are special, they're known as "accept" states. What we're going to do is construct a DFA whose rules for moving from state to state are derived from the nature of the pattern it represents. Only if the DFA we construct moves to an accept state for a given string of input does it mean the string matches that pattern.

```
matches :: String -> Pattern -> Bool
matches s = (`accepts` s) . runDSI . toDFA
```

We can test this out in `ghci`

:

```
ghci> "" `matches` Empty
True
ghci> "abc" `matches` Empty
False
```

And use it in an example `main`

:

```
main :: IO ()
main = do
-- This AST represents the pattern /ab|cd*/:
let p = Choose
(Concat (Literal 'a') (Literal 'b'))
(Concat (Literal 'c') (Repeat (Literal 'd')))
print $ "xyz" `matches` p
-- => False
print $ "cddd" `matches` p
-- => True
```

A DFA is a machine with a set of rules, one or more current states (to handle non-determinism), and one or more accept states.

```
data DFA = DFA
{ rules :: [Rule]
, currentStates :: [DFAState]
, acceptStates :: [DFAState]
} deriving Show
```

A rule defines what characters tell the machine to change states and which state to move into.

```
data Rule = Rule
{ fromState :: DFAState
, inputChar :: Maybe Char
, nextStates :: [DFAState]
} deriving Show
```

The reason `inputChar`

and `nextStates`

are not single values is because we need to use this deterministic machine to model a non-deterministic one. It's possible (and required, in fact) to have a rule like *If in State 1 and an "a" is read, go to State 2 or State 3*. We can model that by having both 2 and 3 in `nextStates`

for that rule. It's also possible (and also required) to have such a thing as a "Free Move". This means that the machine can change states without reading any input. This is useful if there is a state reachable via a Free Move which has a normal rule for the character about to be read. The machine can "jump" to that state, then follow that rule. These are modelled here by a `Nothing`

value in the `inputChar`

field.

If, after processing some input, any of the machine's current states are in its list of "accept" states, the machine has accepted the input.

```
accepts :: DFA -> [Char] -> Bool
accepts dfa = accepted . foldl' process dfa
where
accepted :: DFA -> Bool
accepted dfa = any (`elem` acceptStates dfa) (currentStates dfa)
```

Processing a single character means finding any followable rules for the given character and the current machine state, and following them.

```
process :: DFA -> Char -> DFA
process dfa c = case findRules c dfa of
-- Invalid input should cause the DFA to go into a failed state.
-- We can do that easily, just remove any acceptStates.
[] -> dfa { acceptStates = [] }
rs -> dfa { currentStates = followRules rs }
findRules :: Char -> DFA -> [Rule]
findRules c dfa = filter (ruleApplies c dfa) $ rules dfa
```

A rule applies if

- The read character is a valid input character for the rule, and
- That rule applies to an available state

```
ruleApplies :: Char -> DFA -> Rule -> Bool
ruleApplies c dfa r =
maybe False (c ==) (inputChar r) &&
fromState r `elem` availableStates dfa
```

An "available" state is one which we're currently in, or can reach via Free Moves.

```
availableStates :: DFA -> [DFAState]
availableStates dfa = currentStates dfa ++ freeStates dfa
```

The process of finding free states (those reachable via Free Moves) gets a bit hairy. We need to start from our current state(s) and follow any Free Moves *recursively*. This ensures that Free Moves which lead to other Free Moves are correctly accounted for.

```
freeStates :: DFA -> [DFAState]
freeStates dfa = go [] (currentStates dfa)
where
go acc [] = acc
go acc ss =
let ss' = followRules $ freeMoves dfa ss
in go (acc ++ ss') ss'
```

Free Moves from a given set of states are rules for those states which have no input character.

```
freeMoves :: DFA -> [DFAState] -> [Rule]
freeMoves dfa ss = filter (\r ->
(fromState r `elem` ss) && (isNothing $ inputChar r)) $ rules dfa
```

Of course, the states that result from following rules are simply the concatenation of those rules' next states.

```
followRules :: [Rule] -> [DFAState]
followRules = concatMap nextStates
```

Now we can model a DFA and see if it accepts a string or not. You could test this in `ghci`

by defining a DFA in state 1 with an accept state 2 and a single rule that moves the machine from 1 to 2 if the character "a" is read.

```
ghci> let dfa = DFA [Rule 1 (Just 'a') [2]] [1] [2]
ghci> dfa `accepts` "a"
True
ghci> dfa `accepts` "b"
False
```

Pretty cool.

Our conversion function, `toDFA`

will live in the `DSI`

monad, allowing it to call `nextId`

at will. This gives it the following type signature:

`toDFA :: Pattern -> DSI DFA`

Every pattern is going to need at least one state identifier, so we'll pull that out first, then begin a case analysis on the type of pattern we're dealing with:

```
toDFA p = do
s1 <- nextId
case p of
```

The empty pattern results in a predictably simple machine. It has one state which is also an accept state. It has no rules. If it gets any characters, they'll be considered invalid and put the machine into a failed state. Giving it no characters is the only way it can remain in an accept state.

` Empty -> return $ DFA [] [s1] [s1]`

Also simple is the literal character pattern. It has two states and a rule between them. It moves from the first state to the second only if it reads that character. Since the second state is the only accept state, it will only accept that character.

```
Literal c -> do
s2 <- nextId
return $ DFA [Rule s1 (Just c) [s2]] [s1] [s2]
```

We can model a concatenated pattern by first turning each sub-pattern into their own DFAs, and then connecting the accept state of the first to the start state of the second via a Free Move. This means that as the combined DFA is reading input, it will only accept that input if it moves through the first DFAs states into what used to be its accept state, hop over to the second DFA, then move into its accept state. Conceptually, this is exactly how a concatenated pattern should match.

*Note that freeMoveTo will be shown after.*

```
Concat p1 p2 -> do
dfa1 <- toDFA p1
dfa2 <- toDFA p2
let freeMoves = map (freeMoveTo dfa2) $ acceptStates dfa1
return $ DFA
(rules dfa1 ++ freeMoves ++ rules dfa2)
(currentStates dfa1)
(acceptStates dfa2)
```

We can implement choice by creating a new starting state, and connecting it to both sub-patterns' DFAs via Free Moves. Now the machine will jump into both DFAs at once, and the composed machine will accept the input if either of the paths leads to an accept state.

```
Choose p1 p2 -> do
s2 <- nextId
dfa1 <- toDFA p1
dfa2 <- toDFA p2
let freeMoves =
[ freeMoveTo dfa1 s2
, freeMoveTo dfa2 s2
]
return $ DFA
(freeMoves ++ rules dfa1 ++ rules dfa2) [s2]
(acceptStates dfa1 ++ acceptStates dfa2)
```

A repeated pattern is probably hardest to wrap your head around. We need to first convert the sub-pattern to a DFA, then we'll connect up a new start state via a Free Move (to match 0 occurrences), then we'll connect the accept state back to the start state (to match repetitions of the pattern).

```
Repeat p -> do
s2 <- nextId
dfa <- toDFA p
let initMove = freeMoveTo dfa s2
freeMoves = map (freeMoveTo dfa) $ acceptStates dfa
return $ DFA
(initMove : rules dfa ++ freeMoves) [s2]
(s2: acceptStates dfa)
```

And finally, our little helper which connects some state up to a DFA via a Free Move.

```
where
freeMoveTo :: DFA -> DFAState -> Rule
freeMoveTo dfa s = Rule s Nothing (currentStates dfa)
```

I want to give a big thanks to Tom Stuart for writing Understanding Computation. That book has opened my eyes in so many ways. I understand why he chose Ruby as the book's implementation language, but I find Haskell to be better-suited to these sorts of modeling tasks. Hopefully he doesn't mind me exploring that by rewriting some of his examples.

Every time I read Learn You a Haskell, I get something new out of it. This most recent time through, I think I've finally gained some insight into the `Applicative`

type class.

I've been writing Haskell for some time and have developed an intuition and explanation for `Monad`

. This is probably because monads are so prevalent in Haskell code that you can't help but get used to them. I knew that `Applicative`

was similar but weaker, and that it should be a super class of `Monad`

but since it arrived later it is not. I now think I have a general understanding of how `Applicative`

is different, why it's useful, and I would like to bring anyone else who glossed over `Applicative`

on the way to `Monad`

up to speed.

The `Applicative`

type class represents applicative functors, so it makes sense to start with a brief description of functors that are *not* applicative.

A functor is any container-like type which offers a way to transform a normal function into one that operates on contained values.

Formally:

```
fmap :: Fuctor f -- for any functor,
=> ( a -> b) -- take a normal function,
-> (f a -> f b) -- and make one that works on contained values
```

Some prefer to think of it like this:

```
fmap :: Functor f -- for any functor,
=> (a -> b) -- take a normal function,
-> f a -- and a contained value,
-> f b -- and return the contained result of applying that
-- function to that value
```

Thanks to currying, the two are completely equivalent.

This is the first small step in the ultimate goal between all three of these type classes: allow us to work with values with context (in this case, a container of some sort) as if that context weren't present at all. We give a normal function to `fmap`

and it sorts out how to deal with the container, whatever it may be.

To say that a functor is "applicative", we mean that the contained value *can be applied*. This is just another way of saying it's a function.

An applicative functor is any container-like type which offers a way to transform a *contained* function into one that can operate on contained values.

```
(<*>) :: Applicative f -- for any applicative functor,
=> f (a -> b) -- take a contained function,
-> (f a -> f b) -- and make one that works on contained values
```

Again because of currying, we can also think of it like this:

```
(<*>) :: Applicative f -- for any applicative functor,
=> f (a -> b) -- take a contained function,
-> f a -- and a contained value,
-> f b -- and return a contained result
```

Applicative functors also have a way to take an un-contained function and put it into a container:

```
pure :: Applicative f -- for any applicative functor,
=> (a -> b) -- take a normal function,
-> f (a -> b) -- and put it in a container
```

In actuality, the type signature is just `a -> f a`

. Since `a`

literally means "any type", it can certainly represent the type `(a -> b)`

too.

`pure :: Applicative f => a -> f a`

Understanding this is very important for understanding the usefulness of `Applicative`

. Even though the type signature for `(<*>)`

starts with `f (a -> b)`

, it can just as easily be used with functions taking any number of arguments.

Consider the following:

`:: f (a -> b -> c) -> f a -> f (b -> c)`

Is this `(<*>)`

or not?

In stead of writing its signature with `b`

, lets use a question mark:

`(<*>) :: f (a -> ?) -> f a -> f ?`

Indeed it is. Just substitute the type `(b -> c)`

for every `?`

rather than the simple `b`

in the actual class definition.

What you just saw was a very concrete example of currying. When we say "a function of *n* arguments", we're actually lying. All functions in Haskell take exactly one argument. Multi-argument functions are really just single-argument functions that return other single-argument functions that accept the remaining arguments via the same process.

Using the question mark approach, we see that multi-argument functions are simply the form:

```
f :: a -> ?
f = -- ...
```

And it's entirely legal for that `?`

to be replaced with `(b -> ?)`

, and for *that* `?`

to be replaced with `(c -> ?)`

and so on ad infinitum. Thus you have *the appearance* of multi-argument functions.

As is common with Haskell, this results in what appears to be happy coincidence, but is actually the product of developing a language on top of such a consistent mathematical foundation. You'll notice that after using `(<*>)`

on a function of more than one argument, the result is not a wrapped result, but another wrapped function -- does that sound familiar? Exactly, it's an applicative functor.

Let me say that again: if you partially apply a function of more than one argument using `(<*>)`

, you end up with another applicative functor which can be given to `(<*>)`

yet again with another wrapped value to supply the remaining argument to that original function. This can continue as long as the function needs more arguments. Just like normal function application.

Consider what this might look like if you start with a plain old function that (conceptually) takes more than one argument, but the values that it wants to operate on are wrapped in some container.

```
-- A normal function
f :: (a -> b -> c)
f = -- ...
-- One contained value, suitable for its first argument
x :: Applicative f => f a
x = -- ...
-- Another contained value, suitable for its second
y :: Applicative f => f b
y = -- ...
```

How do we pass `x`

and `y`

to `f`

to get some overall result? Easy, you wrap the function with `pure`

then use `(<*>)`

repeatedly:

```
result :: Applicative f => f c
result = pure f <*> x <*> y
```

The first portion of that expression is very interesting: `pure f <*> x`

. What is this bit doing? It's taking a normal function and applying it to a contained value. Wait a second, normal functors know how to do that!

Since in Haskell every `Applicative`

is also a `Functor`

, that means it could be rewritten as just `fmap f x`

, turning the whole expression into `fmap f x <*> y`

.

Never satisfied, Haskell introduced a function called `(<$>)`

which is just `fmap`

but infix. With this alias, we can write:

`result = f <$> x <*> y`

Not only is this epically concise, but it looks exactly like `f x y`

which is how this code would be written if there were no containers involved. Here we have another, more powerful step towards the goal of writing code that has to deal with some context (in our case, still that container) without actually having to care about that context. You write your function like you normally would, then just pepper `(<$>)`

and `(<*>)`

between the arguments.

With both `Functor`

and `Applicative`

, anything and everything was wrapped. Both arguments to `(<*>)`

are wrapped, the result is wrapped, and `pure`

wraps something up. We never have to deal with *unwrapping* anything.

Simply put, a `Monad`

is a type that can do everything an `Applicative`

can do plus handle unwrapping. However, it can't just unwrap values willy-nilly. It can only unwrap a value in a very specific case: while passing it to a function which returns a wrapped result.

Formally:

```
(>>=) :: Monad m -- for any monad,
=> m a -- take wrapped value
-> a -> m b -- and a function which needs it unwrapped
-> m b -- unwrap it, and apply that function
```

Let's look at this through the lens of currying:

```
(>>=) :: Monad m -- for any monad,
=> m a -- take a wrapped value
-> (a -> m b -> m b) -- and return a function which can take an
-- unwrapped value and a wrapped one and
-- return another wrapped one
```

This clarifies why it's the only way we can support unwrapping. We're taking a wrapped value and producing *a function which operates on an unwrapped value*. The type signature describes the nature of this function: it takes yet another wrapped value as argument and produces a wrapped value of the same type as its result.

This gives us the needed flexibility to implement unwrapping. Consider a type like `Maybe`

. If we were able to unwrap values at any point and return them directly, we'd be in trouble when we come across a `Nothing`

. If, on the other hand, our type signature says we ourselves have to return a wrapped result, we can take the reasonable step of not unwrapping anything and simply returning another `Nothing`

.

The above type signature ensures that's **always** an option.

Haskell has no generic function of the type `Monad m => m a -> a`

. Without that, there is no opportunity for unwrapping something that can't be unwrapped. Haskell does have a function called `join`

with the signature `Monad m => m (m a) => m a`

. This is indeed a function that just unwraps a value directly, but because the type signature enforces that the value coming in is doubly-wrapped and the value going out is still wrapped, we can maintain our safety. Yay type systems.

Up until now, we've been calling these types wrappers, containers, or contexts. With `Monad`

it can be easier to think of them as *actions*. An action implies that *something else* may occur as a result of evaluating this otherwise pure function: side-effects. These can be real-world side effects in the case of `IO`

, or context-changing side effects in the case of `Maybe`

or `List`

.

Unwrapping as a concept should then be replaced with *evaluating* or *running* an action, it's when any side-effects will be realized. Again in the case of `Maybe`

, when we attempt to unwrap a `Nothing`

value via `(>>=)`

, that's the point at which the entire computation becomes a `Nothing`

.

Once we've made that conceptual leap, we can think about *dependant*, or *sequenced* actions. In the case of `IO`

, we have an expectation that actions will be performed in a particular order. In the case of `Maybe`

, we need to know that if an *earlier* function returns `Nothing`

, the *later* functions will know about it.

The ability for a `Monad`

to be unwrapped or evaluated combined with the type signature of `(>>=)`

provides for sequencing because it enforces that the left hand side is evaluated before the right hand side. This must be true because the left hand value has to be evaluated (i.e. unwrapped) for the right hand side to even be evaluable at all.

With all of this background knowledge, I came to a simple mental model for applicative functors vs monads: *Monad is for series where Applicative is for parallel*.

We use a monad for composing multiple actions (values with context) into a single action (a new value with context). We use applicative for the same reason. The difference lies (of course) in how that composition is carried out. With a monad, each action is evaluated in turn and the results of each are fed into the next via `(>>=)`

. This implies ordering. With an applicative functor, every value is unwrapped in turn as functions are applied via `(<*>)`

and the results are combined into a single value in "parallel".

Let's walk through a real example.

In an application I'm working on, I'm doing OAuth based authentication. My domain has the following (simplified) user type:

```
data User = User
{ userFirstName :: Text
, userLastName :: Text
, userEmail :: Text
}
```

During the process of authentication, an OAuth endpoint provides me with some profile data which ultimately comes back as an association list:

```
type Profile = [(Text, Text)]
-- Example:
-- [ ("first_name", "Pat" )
-- , ("last_name" , "Brisbin" )
-- , ("email" , "me@pbrisbin.com")
-- ]
```

Within this list, I can find user data via the `lookup`

function which takes a key and returns a `Maybe`

value. I had to write the function that builds a `User`

out of this list of profile values. I also had to propagate any `Maybe`

values by returning `Maybe User`

.

First, let's write this without exploiting the fact that `Maybe`

is a monad or an applicative:

```
buildUser :: Profile -> Maybe User
buildUser p =
case lookup "first_name" p of
Nothing -> Nothing
Just fn -> case lookup "last_name" p of
Nothing -> Nothing
Just ln -> case lookup "email" p of
Nothing -> Nothing
Just e -> Just $ User fn ln e
```

Oof.

Treating `Maybe`

as a `Monad`

makes this much, much cleaner:

```
buildUser :: Profile -> Maybe User
buildUser p = do
fn <- lookup "first_name" p
ln <- lookup "last_name" p
e <- lookup "email" p
return $ User fn ln e
```

Up until a few weeks ago, I would've stopped there and been extremely proud of myself and Haskell. Haskell for supplying such a great abstraction for potential failed lookups, and myself for knowing how to use it.

Hopefully, the content of this blog post has made it clear that we can do better.

Think about the thing we're modelling here. A monad is best used for sequencing dependant actions with side-effects. Does it matter in what order we look things up? If one key's not found, we want `Nothing`

regardless of which key it is or when it goes missing. What we're really doing here is taking the three values with context (the `Maybe`

profile values) and combining them all together via the `User`

data constructor.

This is `Applicative`

, I know this.

```
-- f :: a -> b -> c -> d
User :: Text -> Text -> Text -> User
-- x :: f a
lookup "first_name" p :: Maybe Text
-- y :: f b
lookup "last_name" p :: Maybe Text
-- z :: f c
lookup "email" p :: Maybe Text
-- result :: f d
-- result = f <$> x <*> y <*> z
buildUser :: Profile -> Maybe User
buildUser p = User
<$> lookup "first_name" p
<*> lookup "last_name" p
<*> lookup "email" p
```

And now, I understand when to reach for `Applicative`

over `Monad`

. Perhaps you do too?

Lately at work, I've been fortunate enough to work on a JSON API which I was given the freedom to write in Yesod. I was a bit hesitant at first since my only Yesod experience has been richer html-based sites and I wasn't sure what support (if any) there was for strictly JSON APIs. Rails has a number of conveniences for writing concise controllers and standing up APIs quickly -- I was afraid Yesod may be lacking.

I quickly realized my hesitation was unfounded. The process was incredibly smooth and Yesod comes with just as many niceties that allow for rapid development and concise code when it comes to JSON-only API applications. Couple this with all of the benefits inherent in using Haskell, and it becomes clear that Yesod is well-suited to sites of this nature.

In this post, I'll outline the process of building such a site, explain some conventions I've landed on, and discuss one possible pitfall when dealing with model relations.

This post assumes you're familiar with Haskell and Yesod. It also won't cover some important but un-interesting aspects of API design. We'll give ourselves arbitrary requirements and I'll show only the code required to meet those.

Specifically, the following will not be discussed:

- Haskell basics
- Yesod basics
- Authentication
- Embedding relations or side-loading
- Dealing with created-at or updated-at fields

To begin, let's get a basic Yesod site scaffolded out. How you do this is up to you, but here's my preferred steps:

```
$ mkdir ./mysite && cd ./mysite
$ cabal sandbox init
$ cabal install yesod-platform yesod-bin
$ yesod init --bare
$ cabal install --dependencies-only
$ yesod devel
```

The scaffold comes with a number of features we won't need. You don't have to remove them, but if you'd like to, here they are:

- Any existing models
- Any existing routes/templates
- Authentication
- Static file serving

For our API example, we'll consider a site with posts and comments. We'll keep things simple, additional models or attributes would just mean more lines in our JSON instances or more handlers of the same basic form. This would result in larger examples, but not add any value to the tutorial.

Let's go ahead and define the models:

**config/models**

```
Post
title Text
content Text
Comment
post PostId
content Text
```

It's true that we can add a `json`

keyword in our model definition and get derived `ToJSON`

/`FromJSON`

instances for free on all of our models; we won't do that though. I find these JSON instances, well, ugly. You'll probably want your JSON to conform to some conventional format, be it jsonapi or Active Model Serializers. Client side frameworks like Ember or Angular will have better built-in support if your API conforms to something conventional. Writing the instances by hand is also more transparent and easily customized later.

Since what we do doesn't much matter, only that we do it, I'm going to write JSON instances and endpoints to appear as they would in a Rails project using Active Model Serializers.

**Model.hs**

```
share [mkPersist sqlOnlySettings, mkMigrate "migrateAll"]
$(persistFileWith lowerCaseSettings "config/models")
-- { "id": 1, "title": "A title", "content": "The content" }
instance ToJSON (Entity Post) where
toJSON (Entity pid p) = object
[ "id" .= (String $ toPathPiece pid)
, "title" .= postTitle p
, "content" .= postContent p
]
instance FromJSON Post where
parseJSON (Object o) = Post
<$> o .: "title"
<*> o .: "content"
parseJSON _ = mzero
-- { "id": 1, "post_id": 1, "content": "The comment content" }
instance ToJSON (Entity Comment) where
toJSON (Entity cid c) = object
[ "id" .= (String $ toPathPiece cid)
, "post_id" .= (String $ toPathPiece $ commentPost c)
, "content" .= commentContent c
]
-- We'll talk about this later
--instance FromJSON Comment where
```

Let's start with a RESTful endpoint for posts:

**config/routes**

```
/posts PostsR GET POST
/posts/#PostId PostR GET PUT DELETE
```

Since our API should return proper status codes, let's add the required functions to `Import.hs`

, making them available everywhere:

**Import.hs**

```
import Network.HTTP.Types as Import
( status200
, status201
, status400
, status403
, status404
)
```

Next we write some handlers:

**Handlers/Posts.hs**

```
getPostsR :: Handler Value
getPostsR = do
posts <- runDB $ selectList [] [] :: Handler [Entity Post]
return $ object ["posts" .= posts]
postPostsR :: Handler ()
postPostsR = do
post <- parseJsonBody_ :: Handler Post
_ <- runDB $ insert post
sendResponseStatus status201 ("CREATED" :: Text)
```

You'll notice we need to add a few explicit type annotations. Normally, Haskell can infer everything for us, but in this case the reason for the annotations is actually pretty interesting. The `selectList`

function will return any type that's persistable. Normally we would simply treat the returned records as a particular type and Haskell would say, "Aha! You wanted a Post" and then, as if by time travel, `selectList`

would give us appropriate results.

In this case, all we do with the returned `posts`

is pass them to `object`

. Since `object`

can work with any type than can be represented as JSON, Haskell doesn't know which type we mean. We must remove the ambiguity with a type annotation somewhere.

**Handlers/Post.hs**

```
getPostR :: PostId -> Handler Value
getPostR pid = do
post <- runDB $ get404 pid
return $ object ["post" .= post]
putPostR :: PostId -> Handler Value
putPostR pid = do
post <- parseJsonBody_ :: Handler Post
runDB $ replace pid post
sendResponseStatus status200 ("UPDATED" :: Text)
deletePostR :: PostId -> Handler Value
deletePostR pid = do
runDB $ delete pid
sendResponseStatus status200 ("DELETED" :: Text)
```

I love how functions like `get404`

and `parseJsonBody_`

allow these handlers to be completely free of any error-handling concerns, but still be safe and well-behaved.

There's going to be a small annoyance in our comment handlers which I alluded to earlier by omitting the `FromJSON`

instance on `Comment`

. Before we get to that, let's take care of the easy stuff:

**config/routes**

```
/posts/#PostId/comments CommentsR GET POST
/posts/#PostId/comments/#CommentId CommentR GET PUT DELETE
```

**Handlers/Comments.hs**

```
getCommentsR :: PostId -> Handler Value
getCommentsR pid = do
comments <- runDB $ selectList [CommentPost ==. pid] []
return $ object ["comments" .= comments]
-- We'll talk about this later
--postCommentsR :: PostId -> Handler ()
```

For the single-resource handlers, we're going to assume that a `CommentId`

is unique across posts, so we can ignore the `PostId`

in these handlers.

**Handlers/Comment.hs**

```
getCommentR :: PostId -> CommentId -> Handler Value
getCommentR _ cid = do
comment <- runDB $ get404 cid
return $ object ["comment" .= comment]
-- We'll talk about this later
--putCommentR :: PostId -> CommentId -> Handler ()
deleteCommentR :: PostId -> CommentId -> Handler ()
deleteCommentR _ cid = do
runDB $ delete cid
sendResponseStatus status200 ("DELETED" :: Text)
```

Up until now, we've been able to define JSON instances for our model, use `parseJsonBody_`

, and `insert`

the result. In this case however, the request body will be lacking the Post ID (since it's in the URL). This means we need to parse a different but similar data type from the JSON, then use that and the URL parameter to build a `Comment`

.

**Helpers/Comment.hs**

```
-- This datatype would be richer if Comment had more attributes. For now
-- we only have to deal with content, so I can use a simple newtype.
newtype CommentAttrs = CommentAttrs Text
instance FromJSON CommentAttrs where
parseJSON (Object o) = CommentAttrs <$> o .: "content"
parseJSON _ = mzero
toComment :: PostId -> CommentAttrs -> Comment
toComment pid (CommentAttrs content) = Comment
{ commentPost = pid
, commentContent = content
}
```

This may seem a bit verbose and even redundant, and there's probably a more elegant way to get around this situation. Lacking that, I think the additional safety (vs the obvious solution of making `commentPost`

a `Maybe`

) and separation of concerns (vs putting this in the model layer) is worth the extra typing. It's also very easy to use:

**Handlers/Comments.hs**

```
import Helpers.Comment
postCommentsR :: PostId -> Handler ()
postCommentsR pid = do
_ <- runDB . insert . toComment pid =<< parseJsonBody_
sendResponseStatus status201 ("CREATED" :: Text)
```

**Handlers/Comment.hs**

```
import Helpers.Comment
putCommentR :: PostId -> CommentId -> Handler
putCommentR pid cid = do
runDB . replace cid . toComment pid =<< parseJsonBody_
sendResponseStatus status200 ("UPDATED" :: Text)
```

We don't need a type annotation on

`parseJsonBody_`

in this case. Since the result is being passed to `toComment pid`

, Haskell knows we want a `CommentAttrs`

and uses its `parseJSON`

function within `parseJsonBody_`

With a relatively small amount of time and code, we've written a fully-featured JSON API using Yesod. I think the JSON instances and API handlers are more concise and readable than the analogous Rails serializers and controllers. Our system is also far safer thanks to the type system and framework-provided functions like `get404`

and `parseJsonBody_`

without us needing to explicitly deal with any of that.

I hope this post has shown that Yesod is indeed a viable option for projects of this nature.

In lecture 5A of Structure & Interpretation of Computer Programs, Gerald Sussman introduces the idea of assignments, side effects and state. Before that, they had been working entirely in purely functional Lisp which could be completely evaluated and reasoned about using the substitution model. He states repeatedly that this is a horrible thing as it requires a far more complex view of programs. At the end of the lecture, he shows a compelling example of why we must introduce this horrible thing anyway; without it, we cannot decouple parts of our algorithms cleanly and would be reduced to huge single-function programs in some critical cases.

The example chosen in SICP is estimating π using Cesaro's method. The method states that the probability that any two random numbers' greatest common divisor equals 1 is itself equal to 6/π^{2}.

Since I know Ruby better than Lisp (and I'd venture my readers do too), here's a ported version:

```
def estimate_pi(trials)
p = monte_carlo(trials) { cesaro }
Math.sqrt(6 / p)
end
def cesaro
rand.gcd(rand) == 1
end
def monte_carlo(trials, &block)
iter = ->(trials, passed) do
if trials == 0
passed
else
if block.call
iter.call(trials - 1, passed + 1)
else
iter.call(trials - 1, passed)
end
end
end
iter.call(trials, 0) / trials.to_f
end
```

I've written this code to closely match the Lisp version which used a recursive iterator. Unfortunately, this means that any reasonable number of trials will exhaust Ruby's stack limit.

The code above also assumes a `rand`

function which will return different random integers on each call. To do so, it must employ mutation and hold internal state:

```
def rand
@x ||= random_init
@x = random_update(@x)
@x
end
```

Here I assume the same primitives as Sussman does, though it wouldn't be difficult to wrap Ruby's built-in `rand`

to return integers instead of floats. The important thing is that this function needs to hold onto the previously returned random value in order to provide the next.

Sussman states that without this impure `rand`

function, it would be very difficult to decouple the `cesaro`

function from the `monte_carlo`

one. Without utilizing (re)assignment and mutation, we would have to write our estimation function as one giant blob:

```
def estimate_pi(trials)
iter = ->(trials, passed, x1, x2) do
if trials == 0
passed
else
x1_ = rand_update(x2)
x2_ = rand_update(x1_)
if x1.gcd(x2) == 1
iter.call(trials - 1, passed + 1, x1_, x2_)
else
iter.call(trials - 1, passed, x1_, x2_)
end
end
end
x1 = rand_init
x2 = rand_update(x1)
p = iter.call(trials, 0, x1, x2) / trials.to_f
Math.sqrt(6 / p)
end
```

Ouch.

It's at this point Sussman stops, content with his justification for adding mutability to Lisp. I'd like to explore a bit further: what if remaining pure were non-negotiable? Are there other ways to make decoupled systems and elegant code without sacrificing purity?

Let's start with a non-mutating random number generator:

```
class RGen
def initialize(seed = nil)
@seed = seed || random_init
end
def next
x = random_update(@seed)
[x, RGen.new(x)]
end
end
def rand(g)
g.next
end
```

This allows for the following implementation:

```
def estimate_pi(trials)
p = monte_carlo(trials) { |g| cesaro(g) }
Math.sqrt(6 / p)
end
def cesaro(g)
x1, g1 = rand(g)
x2, g2 = rand(g1)
[x1.gcd(x2) == 1, g2]
end
def monte_carlo(trials, &block)
iter = ->(trials, passed, g) do
if trials == 0
passed
else
ret, g_ = block.call(g)
if ret
iter.call(trials - 1, passed + 1, g_)
else
iter.call(trials - 1, passed, g_)
end
end
end
iter.call(trials, 0, RGen.new) / trials.to_f
end
```

We've moved out of the single monolithic function, which is a step in the right direction. The additional generator arguments being passed all over the place makes for some readability problems though. The reason for that is a missing abstraction; one that's difficult to model in Ruby. To clean this up further, we'll need to move to a language where purity was in fact non-negotiable: Haskell.

In Haskell, the type signature of our current `monte_carlo`

function would be:

```
monteCarlo :: Int -- number of trials
-> (RGen -> (Bool, RGen)) -- the experiment
-> Double -- result
```

Within `monte_carlo`

, we need to repeatedly call the block with a fresh random number generator. Calling `RGen#next`

gives us an updated generator along with the next random value, but that must happen within the iterator block. In order to get it out again and pass it into the next iteration, we need to return it. This is why `cesaro`

has the type that it does:

`cesaro :: RGen -> (Bool, RGen)`

`cesaro`

depends on some external state so it accepts it as an argument. It also affects that state so it must return it as part of its return value. `monteCarlo`

is responsible for creating an initial state and "threading" it though repeated calls to the experiment given. Mutable state is "faked" by passing a return value as argument to each computation in turn.

You'll also notice this is a similar type signature as our `rand`

function:

`rand :: RGen -> (Int, RGen)`

This similarity and process is a generic concern which has nothing to do with Cesaro's method or performing Monte Carlo tests. We should be able to leverage the similarities and separate this concern out of our main algorithm. Monadic state allows us to do exactly that.

For the Haskell examples, I'll be using `System.Random.StdGen`

in place of the `RGen`

class we've been working with so far. It is exactly like our `RGen`

class above in that it can be initialized with some seed, and there is a `random`

function with the type `StdGen -> (Int, StdGen)`

.

The abstract thing we're lacking is a way to call those function successively, passing the `StdGen`

returned from one invocation as the argument to the next invocation, all the while being able to access that `a`

(the random integer or experiment outcome) whenever needed. Haskell, has just such an abstraction, it's in `Control.Monad.State`

.

First we'll need some imports.

```
import System.Random
import Control.Monad.State
```

Notice that we have a handful of functions with similar form.

`(StdGen -> (a, StdGen))`

What `Control.Monad.State`

provides is a type that looks awfully similar.

`data State s a = State { runState :: (s -> (a, s)) }`

Let's declare a type synonym which fixes that `s`

type variable to the state we care about: a random number generator.

`type RGenState a = State StdGen a`

By replacing the `s`

in `State`

with our `StdGen`

type, we end up with a more concrete type that looks as if we had written this:

```
data RGenState a = RGenState
{ runState :: (StdGen -> (a, StdGen)) }
```

And then went on to write all the various instances that make this type useful. By using such a type synonym, we get all those instances and functions for free.

Our first example:

```
rand :: RGenState Int
rand = state random
```

We can "evaluate" this action with one of a number of functions provided by the library, all of which require some initial state. `runState`

will literally just execute the function and return the result and the updated state (in case you missed it, it's just the record accessor for the `State`

type). `evalState`

will execute the function, discard the updated state, and give us only the result. `execState`

will do the inverse: execute the function, discard the result, and give us only the updated state.

We'll be using `evalState`

exclusively since we don't care about how the random number generator ends up after these actions, only that it gets updated and passed along the way. Let's wrap that up in a function that both provides the initial state and evaluates the action.

```
runRandom :: RGenState a -> a
runRandom f = evalState f (mkStdGen 1)
-- runRandom rand
-- => 7917908265643496962
```

Unfortunately, the result will be the same every time since we're using a constant seed. You'll see soon that this is an easy limitation to address after the fact.

With this bit of glue code in hand, we can re-write our program in a nice modular way without any actual mutable state or re-assignment.

```
estimatePi :: Int -> Double
estimatePi n = sqrt $ 6 / (monteCarlo n cesaro)
cesaro :: RGenState Bool
cesaro = do
x1 <- rand
x2 <- rand
return $ gcd x1 x2 == 1
monteCarlo :: Int -> RGenState Bool -> Double
monteCarlo trials experiment = runRandom $ do
outcomes <- replicateM trials experiment
return $ (length $ filter id outcomes) `divide` trials
where
divide :: Int -> Int -> Double
divide a b = fromIntegral a / fromIntegral b
```

Even with a constant seed, it works pretty well:

```
main = print $ estimatePi 1000
-- => 3.149183286488868
```

It's easy to fall into the trap of thinking that Haskell's type system is limiting in some way. The `monteCarlo`

function above can only work with random-number-based experiments? Pretty weak.

Consider the following refactoring:

```
estimatePi :: Int -> RGenState Double
estimatePi n = do
p <- monteCarlo n cesaro
return $ sqrt (6 / p)
cesaro :: RGenState Bool
cesaro = do
x1 <- rand
x2 <- rand
return $ gcd x1 x2 == 1
monteCarlo :: Monad m => Int -> m Bool -> m Double
monteCarlo trials experiment = do
outcomes <- replicateM trials experiment
return $ (length $ filter id outcomes) `divide` trials
where
divide :: Int -> Int -> Double
divide a b = fromIntegral a / fromIntegral b
main :: IO ()
main = print $ runRandom $ estimatePi 1000
```

The minor change made was moving the call to `runRandom`

all the way up to `main`

. This allows us to pass stateful computations throughout our application without ever caring about that state except at this highest level.

This would make it simple to add true randomness (which requires `IO`

) by replacing the call to `runRandom`

with something that pulls entropy in via `IO`

rather than using `mkStdGen`

.

```
runTrueRandom :: RGenState a -> IO a
runTrueRandom f = do
s <- newStdGen
evalState f s
main = print =<< runTrueRandom (estimatePi 1000)
```

One could even do this conditionally so that your random-based computations became deterministic during tests.

Another important point here is that `monteCarlo`

can now work with *any Monad!* This makes perfect sense: The purpose of this function is to run experiments and tally outcomes. The idea of an *experiment* only makes sense if there's some outside force which might change the results from run to run, *but who cares what that outside force is?* Haskell don't care. Haskell requires we only specify it as far as we need to: it's some Monad `m`

, nothing more.

This means we can run IO-based experiments via the Monte Carlo method with the same `monteCarlo`

function just by swapping out the monad:

What if Cesaro claimed the probability that the current second is an even number is equal to 6/π^{2}? Seems reasonable, let's model it:

```
-- same code, different name / type
estimatePiIO :: Int -> IO Double
estimatePiIO n = do
p <- monteCarlo n cesaroIO
return $ sqrt (6 / p)
cesaroIO :: IO Bool
cesaroIO = do
t <- getCurrentTime
return $ even $ utcDayTime t
monteCarlo :: Monad m => Int -> m Bool -> m Double
monteCarlo trials experiment = -- doesn't change at all!
main :: IO ()
main = print =<< estimatePiIO 1000
```

I find the fact that this expressiveness, generality, and polymorphism can share the same space as the strictness and incredible safety of this type system fascinating.

Hspec is a BDD library for writing Rspec-style tests in Haskell. In this post, I'm going to describe setting up a Haskell project using this test framework. What we'll end up with is a series of tests which can be run individually (at the module level), or all together (as part of packaging). Then I'll briefly mention Guard (a Ruby tool) and how we can use that to automatically run relevant tests as we change code.

For any of this to work, our implementation and test modules must follow a particular layout:

```
Code/liquid/
├── src
│ └── Text
│ ├── Liquid
│ │ ├── Context.hs
│ │ ├── Parse.hs
│ │ └── Render.hs
│ └── Liquid.hs
└── test
├── SpecHelper.hs
├── Spec.hs
└── Text
└── Liquid
├── ParseSpec.hs
└── RenderSpec.hs
```

Notice that for each implementation module (under `./src`

) there is a corresponding spec file at the same relative path (under `./test`

) with a consistent, conventional name (`<ModuleName>Spec.hs`

). For this post, I'm going to outline the first few steps of building the `Parse`

module of the above source tree which happens to be my liquid library, a Haskell implementation of Shopify's template system.

Hspec provides a useful function called `hspec-discover`

. If your project follows the conventional layout above, you can simply create a file like so:

**test/Spec.hs**

`{-# OPTIONS_GHC -F -pgmF hspec-discover #-}`

And when that file is executed, all of your specs will be found and run together as a single suite.

I like to create a central helper module which gets imported into all specs. It simply exports our test framework and implementation code:

**test/SpecHelper.hs**

```
module SpecHelper
( module Test.Hspec
, module Text.Liquid.Parse
) where
import Test.Hspec
import Text.Liquid.Parse
```

This file might not seem worth it now, but as you add more modules, it becomes useful quickly.

**test/Text/Liquid/ParseSpec.hs**

```
module Text.Liquid.ParseSpec where
import SpecHelper
spec :: Spec
spec = do
describe "Text.Liquid.Parse" $ do
context "Simple text" $ do
it "parses exactly as-is" $ do
let content = "Some simple text"
parseTemplate content `shouldBe` Right [TString content]
main :: IO ()
main = hspec spec
```

With this first spec, I've already made some assumptions and design decisions.

The API into our module will be a single `parseTemplate`

function which returns an `Either`

type (commonly used to represent success or failure). The `Right`

value (conventionally used for success) will be a list of template parts. One such part can be constructed with the `TString`

function and is used to represent literal text with no interpolation or logic. This is the simplest template part possible and is therefore a good place to start.

The `spec`

function is what will be found by `hspec-discover`

and rolled up into a project-wide test. I've also added a `main`

function which just runs said `spec`

. This allows me to easily run the spec in isolation, which you should do now:

`$ runhaskell -isrc -itest test/Text/Liquid/ParseSpec.hs`

The first error you should see is an inability to find `Test.Hspec`

. Go ahead and install it:

`$ cabal install hspec`

You should then get a similar error for `Text.Liquid.Parse`

then some more about functions and types that are not yet defined. Let's go ahead and implement just enough to get past that:

**src/Text/Liquid/Parse.hs**

```
module Text.Liquid.Parse where
type Template = [TPart]
data TPart = TString String
parseTemplate :: String -> Either Template String
parseTemplate = undefined
```

The test should run now and give you a nice red failure due to the attempted evaluation of `undefined`

.

Since implementing `Parse`

is not the purpose of this post, I won't be moving forward in that direction. In stead, I'm going to show you how to set this library up as a package which can be `cabal install`

ed and/or `cabal test`

ed by end-users.

For now, you can pass the test easily like so:

**src/Text/Liquid/Parse.hs**

```
parseTemplate :: String -> Either Template String
parseTemplate str = Right [TString str]
```

For TDD purists, this is actually the correct thing to do here: write the simplest implementation to pass the test (even if you "know" it's not going to last), then write another failing test to force you to implement a little more. I don't typically subscribe to that level of TDD purity, but I can see the appeal.

We've already got `Spec.hs`

which, when executed, will run all our specs together:

`$ runhaskell -isrc -itest test/Spec.hs`

We just need to wire that into the Cabal packaging system:

**liquid.cabal**

```
name: liquid
version: 0.0.0
license: MIT
copyright: (c) 2013 Pat Brisbin
author: Pat Brisbin <pbrisbin@gmail.com>
maintainer: Pat Brisbin <pbrisbin@gmail.com>
build-type: Simple
cabal-version: >= 1.8
library
hs-source-dirs: src
exposed-modules: Text.Liquid.Parse
build-depends: base == 4.*
test-suite spec
type: exitcode-stdio-1.0
hs-source-dirs: test
main-is: Spec.hs
build-depends: base == 4.*
, hspec >= 1.3
, liquid
```

With this in place, testing our package is simple:

```
$ cabal configure --enable-tests
...
$ cabal build
...
$ cabal test
Building liquid-0.0.0...
Preprocessing library liquid-0.0.0...
In-place registering liquid-0.0.0...
Preprocessing test suite 'spec' for liquid-0.0.0...
Linking dist/build/spec/spec ...
Running 1 test suites...
Test suite spec: RUNNING...
Test suite spec: PASS
Test suite logged to: dist/test/liquid-0.0.0-spec.log
1 of 1 test suites (1 of 1 test cases) passed.
```

Another thing I like to setup is the automatic running of relevant specs as I change code. To do this, we can use a tool from Ruby-land called Guard. Guard is a great example of a simple tool doing one thing well. All it does is watch files and execute actions based on rules defined in a `Guardfile`

. Through plugins and extensions, there are a number of pre-built solutions for all sorts of common needs: restarting servers, regenerating ctags, or running tests.

We're going to use guard-shell which is a simple extension allowing for running shell commands and spawning notifications.

`$ gem install guard-shell`

Next, create a `Guardfile`

:

**Guardfile**

```
# Runs the command and prints a notification
def execute(cmd)
if system(cmd)
n 'Build succeeded', 'hspec', :success
else
n 'Build failed', 'hspec', :failed
end
end
def run_all_tests
execute %{
cabal configure --enable-tests &&
cabal build && cabal test
}
end
def run_tests(mod)
specfile = "test/#{mod}Spec.hs"
if File.exists?(specfile)
files = [specfile]
else
files = Dir['test/**/*.hs']
end
execute "ghc -isrc -itest -e main #{files.join(' ')}"
end
guard :shell do
watch(%r{.*\.cabal$}) { run_all_tests }
watch(%r{test/SpecHelper.hs$}) { run_all_tests }
watch(%r{src/(.+)\.hs$}) { |m| run_tests(m[1]) }
watch(%r{test/(.+)Spec\.hs$}) { |m| run_tests(m[1]) }
end
```

Much of this `Guardfile`

comes from this blog post by Michael Xavier. His version also includes cabal sandbox support, so be sure to check it out if that interests you.

If you like to bundle all your Ruby gems (and you probably should) that can be done easily, just see my main liquid repo as that's how I do things there.

In one terminal, start guard:

`$ guard`

Finally, simulate an edit in your module and watch the test automatically run:

`$ touch src/Text/Liquid/Parse.hs`

And there you go, fully automated unit testing in Haskell.