Home

haskell

Regular Expression Evaluation via Finite Automata

What follows is a literate haskell file runnable via ghci. The raw source for this page can be found here.

While reading Understanding Computation again last night, I was going back through the chapter where Tom Stuart describes deterministic and non-deterministic finite automata. These simple state machines seem like little more than a teaching tool, but he eventually uses them as the implementation for a regular expression matcher. I thought seeing this concrete use for such an abstract idea was interesting and wanted to re-enforce the ideas by implementing such a system myself -- with Haskell, of course.

Before we get started, we'll just need to import some libraries:

import Control.Monad.State
import Data.List (foldl')
import Data.Maybe

Patterns and NFAs

We're going to model a subset of regular expression patterns.

data Pattern
    = Empty                   -- ""
    | Literal Char            -- "a"
    | Concat Pattern Pattern  -- "ab"
    | Choose Pattern Pattern  -- "a|b"
    | Repeat Pattern          -- "a*"
    deriving Show

With this, we can build "pattern ASTs" to represent regular expressions:

ghci> let p = Choose (Literal 'a') (Repeat (Literal 'b')) -- /a|b*/

It's easy to picture a small parser to build these out of strings, but we won't do that as part of this post. Instead, we'll focus on converting these patterns into Nondeterministic Finite Automata or NFAs. We can then use the NFAs to determine if the pattern matches a given string.

To explain NFAs, it's probably easiest to explain DFAs, their deterministic counter parts, first. Then we can go on to describe how NFAs differ.

A DFA is a simple machine with states and rules. The rules describe how to move between states in response to particular input characters. Certain states are special and flagged as "accept" states. If, after reading a series of characters, the machine is left in an accept state it's said that the machine "accepted" that particular input.

An NFA is the same with two notable differences: First, an NFA can have rules to move it into more than one state in response to the same input character. This means the machine can be in more than one state at once. Second, there is the concept of a Free Move which means the machine can jump between certain states without reading any input.

Modeling an NFA requires a type with rules, current states, and accept states:

type SID = Int -- State Identifier

data NFA = NFA
    { rules         :: [Rule]
    , currentStates :: [SID]
    , acceptStates  :: [SID]
    } deriving Show

A rule defines what characters tell the machine to change states and which state to move into.

data Rule = Rule
    { fromState  :: SID
    , inputChar  :: Maybe Char
    , nextStates :: [SID]
    } deriving Show

Notice that nextStates and currentStates are lists. This is to represent the machine moving to, and remaining in, more than one state in response to a particular character. Similarly, inputChar is a Maybe value because it will be Nothing in the case of a rule representing a Free Move.

If, after processing some input, any of the machine's current states are in its list of "accept" states, the machine has accepted the input.

accepts :: NFA -> [Char] -> Bool
accepts nfa = accepted . foldl' process nfa

  where
    accepted :: NFA -> Bool
    accepted nfa = any (`elem` acceptStates nfa) (currentStates nfa)

Processing a single character means finding any followable rules for the given character and the current machine state, and following them.

process :: NFA -> Char -> NFA
process nfa c = case findRules c nfa of
    -- Invalid input should cause the NFA to go into a failed state. 
    -- We can do that easily, just remove any acceptStates.
    [] -> nfa { acceptStates = [] }
    rs -> nfa { currentStates = followRules rs }

findRules :: Char -> NFA -> [Rule]
findRules c nfa = filter (ruleApplies c nfa) $ rules nfa

A rule applies if

  1. The read character is a valid input character for the rule, and
  2. That rule applies to an available state
ruleApplies :: Char -> NFA -> Rule -> Bool
ruleApplies c nfa r =
    maybe False (c ==) (inputChar r) &&
    fromState r `elem` availableStates nfa

An "available" state is one which we're currently in, or can reach via Free Moves.

availableStates :: NFA -> [SID]
availableStates nfa = currentStates nfa ++ freeStates nfa

The process of finding free states (those reachable via Free Moves) gets a bit hairy. We need to start from our current state(s) and follow any Free Moves recursively. This ensures that Free Moves which lead to other Free Moves are correctly accounted for.

freeStates :: NFA -> [SID]
freeStates nfa = go [] (currentStates nfa)

  where
    go acc [] = acc
    go acc ss =
        let ss' = followRules $ freeMoves nfa ss
        in go (acc ++ ss') ss'

Free Moves from a given set of states are rules for those states which have no input character.

freeMoves :: NFA -> [SID] -> [Rule]
freeMoves nfa ss = filter (\r ->
    (fromState r `elem` ss) && (isNothing $ inputChar r)) $ rules nfa

Of course, the states that result from following rules are simply the concatenation of those rules' next states.

followRules :: [Rule] -> [SID]
followRules = concatMap nextStates

Now we can model an NFA and see if it accepts a string or not. You could test this in ghci by defining an NFA in state 1 with an accept state 2 and a single rule that moves the machine from 1 to 2 if the character "a" is read.

ghci> let nfa = NFA [Rule 1 (Just 'a') [2]] [1] [2]
ghci> nfa `accepts` "a"
True
ghci> nfa `accepts` "b"
False

Pretty cool.

What we need to do now is construct an NFA whose rules for moving from state to state are derived from the nature of the pattern it represents. Only if the NFA we construct moves to an accept state for a given string of input does it mean the string matches that pattern.

matches :: String -> Pattern -> Bool
matches s = (`accepts` s) . toNFA

We'll define toNFA later, but if you've loaded this file, you can play with it in ghci now:

ghci> "" `matches` Empty
True
ghci> "abc" `matches` Empty
False

And use it in an example main:

main :: IO ()
main = do
    -- This AST represents the pattern /ab|cd*/:
    let p = Choose
            (Concat (Literal 'a') (Literal 'b'))
            (Concat (Literal 'c') (Repeat (Literal 'd')))

    print $ "xyz" `matches` p
    -- => False

    print $ "cddd" `matches` p
    -- => True

Before I show toNFA, we need to talk about mutability.

A Bit About Mutable State

Since Pattern is a recursive data type, we're going to have to recursively create and combine NFAs. For example, in a Concat pattern, we'll need to turn both sub-patterns into NFAs then combine those in some way. In the Ruby implementation, Mr. Stuart used Object.new to ensure unique state identifiers between all the NFAs he has to create. We can't do that in Haskell. There's no global object able to provide some guaranteed-unique value.

What we're going to do to get around this is conceptually simple, but appears complicated because it makes use of monads. All we're doing is defining a list of identifiers at the beginning of our program and drawing from that list whenever we need a new identifier. Because we can't maintain that as a variable we constantly update every time we pull an identifier out, we'll use the State monad to mimic mutable state through our computations.

I apologize for the naming confusion here. This State type is from the Haskell library and has nothing to with the states of our NFAs.

First, we take the parameterized State s a type, and fix the s variable as a list of (potential) identifiers:

type SIDPool a = State [SID] a

This makes it simple to create a nextId action which requests the next identifier from this list as well as updates the computation's state, removing it as a future option before presenting that next identifier as its result.

nextId :: SIDPool SID
nextId = do
    (x:xs) <- get
    put xs
    return x

This function can be called from within any other function in the SIDPool monad. Each time called, it will read the current state (via get), assign the first identifier to x and the rest of the list to xs, set the current state to that remaining list (via put) and finally return the drawn identifier to the caller.

Pattern ⇒ NFA

Assuming we have some function buildNFA which handles the actual conversion from Pattern to NFA but is in the SIDPool monad, we can evaluate that action, supplying an infinite list as the potential identifiers, and end up with an NFA with unique identifiers.

toNFA :: Pattern -> NFA
toNFA p = evalState (buildNFA p) [1..]

As mentioned, our conversion function, lives in the SIDPool monad, allowing it to call nextId at will. This gives it the following type signature:

buildNFA :: Pattern -> SIDPool NFA

Every pattern is going to need at least one state identifier, so we'll pull that out first, then begin a case analysis on the type of pattern we're dealing with:

buildNFA p = do
    s1 <- nextId

    case p of

The empty pattern results in a predictably simple machine. It has one state which is also an accept state. It has no rules. If it gets any characters, they'll be considered invalid and put the machine into a failed state. Giving it no characters is the only way it can remain in an accept state.

        Empty -> return $ NFA [] [s1] [s1]

Also simple is the literal character pattern. It has two states and a rule between them. It moves from the first state to the second only if it reads that character. Since the second state is the only accept state, it will only accept that character.

        Literal c -> do
            s2 <- nextId

            return $ NFA [Rule s1 (Just c) [s2]] [s1] [s2]

We can model a concatenated pattern by first turning each sub-pattern into their own NFAs, and then connecting the accept state of the first to the start state of the second via a Free Move. This means that as the combined NFA is reading input, it will only accept that input if it moves through the first NFAs states into what used to be its accept state, hop over to the second NFA, then move into its accept state. Conceptually, this is exactly how a concatenated pattern should match.

Note that freeMoveTo will be shown after.

        Concat p1 p2 -> do
            nfa1 <- buildNFA p1
            nfa2 <- buildNFA p2

            let freeMoves = map (freeMoveTo nfa2) $ acceptStates nfa1

            return $ NFA
                (rules nfa1 ++ freeMoves ++ rules nfa2)
                (currentStates nfa1)
                (acceptStates nfa2)

We can implement choice by creating a new starting state, and connecting it to both sub-patterns' NFAs via Free Moves. Now the machine will jump into both NFAs at once, and the composed machine will accept the input if either of the paths leads to an accept state.

        Choose p1 p2 -> do
            s2 <- nextId
            nfa1 <- buildNFA p1
            nfa2 <- buildNFA p2

            let freeMoves =
                    [ freeMoveTo nfa1 s2
                    , freeMoveTo nfa2 s2
                    ]

            return $ NFA
                (freeMoves ++ rules nfa1 ++ rules nfa2) [s2]
                (acceptStates nfa1 ++ acceptStates nfa2)

A repeated pattern is probably hardest to wrap your head around. We need to first convert the sub-pattern to an NFA, then we'll connect up a new start state via a Free Move (to match 0 occurrences), then we'll connect the accept state back to the start state (to match repetitions of the pattern).

        Repeat p -> do
            s2 <- nextId
            nfa <- buildNFA p

            let initMove = freeMoveTo nfa s2
                freeMoves = map (freeMoveTo nfa) $ acceptStates nfa

            return $ NFA
                (initMove : rules nfa ++ freeMoves) [s2]
                (s2: acceptStates nfa)

And finally, our little helper which connects some state up to an NFA via a Free Move.

  where
    freeMoveTo :: NFA -> SID -> Rule
    freeMoveTo nfa s = Rule s Nothing (currentStates nfa)

That's It

I want to give a big thanks to Tom Stuart for writing Understanding Computation. That book has opened my eyes in so many ways. I understand why he chose Ruby as the book's implementation language, but I find Haskell to be better-suited to these sorts of modeling tasks. Hopefully he doesn't mind me exploring that by rewriting some of his examples.

published on 07 Apr 2014, tagged with haskell

Applicative Functors

Every time I read Learn You a Haskell, I get something new out of it. This most recent time through, I think I've finally gained some insight into the Applicative type class.

I've been writing Haskell for some time and have developed an intuition and explanation for Monad. This is probably because monads are so prevalent in Haskell code that you can't help but get used to them. I knew that Applicative was similar but weaker, and that it should be a super class of Monad but since it arrived later it is not. I now think I have a general understanding of how Applicative is different, why it's useful, and I would like to bring anyone else who glossed over Applicative on the way to Monad up to speed.

The Applicative type class represents applicative functors, so it makes sense to start with a brief description of functors that are not applicative.

Values in a Box

A functor is any container-like type which offers a way to transform a normal function into one that operates on contained values.

Formally:

fmap :: Fuctor f     -- for any functor,
     => (  a ->   b) -- take a normal function,
     -> (f a -> f b) -- and make one that works on contained values

Some prefer to think of it like this:

fmap :: Functor f -- for any functor,
     => (a -> b)  -- take a normal function,
     -> f a       -- and a contained value,
     -> f b       -- and return the contained result of applying that 
                  -- function to that value

Thanks to currying, the two are completely equivalent.

This is the first small step in the ultimate goal between all three of these type classes: allow us to work with values with context (in this case, a container of some sort) as if that context weren't present at all. We give a normal function to fmap and it sorts out how to deal with the container, whatever it may be.

Functions in a Box

To say that a functor is "applicative", we mean that the contained value can be applied. This is just another way of saying it's a function.

An applicative functor is any container-like type which offers a way to transform a contained function into one that can operate on contained values.

(<*>) :: Applicative f -- for any applicative functor,
      => f (a ->   b)  -- take a contained function,
      -> (f a -> f b)  -- and make one that works on contained values

Again because of currying, we can also think of it like this:

(<*>) :: Applicative f -- for any applicative functor,
      => f (a -> b)    -- take a contained function,
      -> f a           -- and a contained value,
      -> f b           -- and return a contained result

Applicative functors also have a way to take an un-contained function and put it into a container:

pure :: Applicative f -- for any applicative functor,
     =>   (a -> b)    -- take a normal function,
     -> f (a -> b)    -- and put it in a container

In actuality, the type signature is just a -> f a. Since a literally means "any type", it can certainly represent the type (a -> b) too.

pure :: Applicative f => a -> f a

Understanding this is very important for understanding the usefulness of Applicative. Even though the type signature for (<*>) starts with f (a -> b), it can just as easily be used with functions taking any number of arguments.

Consider the following:

:: f (a -> b -> c) -> f a -> f (b -> c)

Is this (<*>) or not?

In stead of writing its signature with b, lets use a question mark:

(<*>) :: f (a -> ?) -> f a -> f ?

Indeed it is. Just substitute the type (b -> c) for every ? rather than the simple b in the actual class definition.

Curried All the Way Down

What you just saw was a very concrete example of currying. When we say "a function of n arguments", we're actually lying. All functions in Haskell take exactly one argument. Multi-argument functions are really just single-argument functions that return other single-argument functions that accept the remaining arguments via the same process.

Using the question mark approach, we see that multi-argument functions are simply the form:

f :: a -> ?
f = -- ...

And it's entirely legal for that ? to be replaced with (b -> ?), and for that ? to be replaced with (c -> ?) and so on ad infinitum. Thus you have the appearance of multi-argument functions.

As is common with Haskell, this results in what appears to be happy coincidence, but is actually the product of developing a language on top of such a consistent mathematical foundation. You'll notice that after using (<*>) on a function of more than one argument, the result is not a wrapped result, but another wrapped function -- does that sound familiar? Exactly, it's an applicative functor.

Let me say that again: if you partially apply a function of more than one argument using (<*>), you end up with another applicative functor which can be given to (<*>) yet again with another wrapped value to supply the remaining argument to that original function. This can continue as long as the function needs more arguments. Just like normal function application.

A "Concrete" Example

Consider what this might look like if you start with a plain old function that (conceptually) takes more than one argument, but the values that it wants to operate on are wrapped in some container.

-- A normal function
f :: (a -> b -> c)
f = -- ...

-- One contained value, suitable for its first argument
x :: Applicative f => f a
x = -- ...

-- Another contained value, suitable for its second
y :: Applicative f => f b
y = -- ...

How do we pass x and y to f to get some overall result? Easy, you wrap the function with pure then use (<*>) repeatedly:

result :: Applicative f => f c
result = pure f <*> x <*> y

The first portion of that expression is very interesting: pure f <*> x. What is this bit doing? It's taking a normal function and applying it to a contained value. Wait a second, normal functors know how to do that!

Since in Haskell every Applicative is also a Functor, that means it could be rewritten as just fmap f x, turning the whole expression into fmap f x <*> y.

Never satisfied, Haskell introduced a function called (<$>) which is just fmap but infix. With this alias, we can write:

result = f <$> x <*> y

Not only is this epically concise, but it looks exactly like f x y which is how this code would be written if there were no containers involved. Here we have another, more powerful step towards the goal of writing code that has to deal with some context (in our case, still that container) without actually having to care about that context. You write your function like you normally would, then just pepper (<$>) and (<*>) between the arguments.

A Missing Piece

With both Functor and Applicative, anything and everything was wrapped. Both arguments to (<*>) are wrapped, the result is wrapped, and pure wraps something up. We never have to deal with unwrapping anything.

Simply put, a Monad is a type that can do everything an Applicative can do plus handle unwrapping. However, it can't just unwrap values willy-nilly. It can only unwrap a value in a very specific case: while passing it to a function which returns a wrapped result.

Formally:

(>>=) :: Monad m  -- for any monad,
      => m a      -- take wrapped value
      -> a -> m b -- and a function which needs it unwrapped
      -> m b      -- unwrap it, and apply that function

Let's look at this through the lens of currying:

(>>=) :: Monad m           -- for any monad,
      => m a               -- take a wrapped value
      -> (a -> m b -> m b) -- and return a function which can take an 
                           -- unwrapped value and a wrapped one and 
                           -- return another wrapped one

This clarifies why it's the only way we can support unwrapping. We're taking a wrapped value and producing a function which operates on an unwrapped value. The type signature describes the nature of this function: it takes yet another wrapped value as argument and produces a wrapped value of the same type as its result.

This gives us the needed flexibility to implement unwrapping. Consider a type like Maybe. If we were able to unwrap values at any point and return them directly, we'd be in trouble when we come across a Nothing. If, on the other hand, our type signature says we ourselves have to return a wrapped result, we can take the reasonable step of not unwrapping anything and simply returning another Nothing.

The above type signature ensures that's always an option.

Haskell has no generic function of the type Monad m => m a -> a. Without that, there is no opportunity for unwrapping something that can't be unwrapped. Haskell does have a function called join with the signature Monad m => m (m a) => m a. This is indeed a function that just unwraps a value directly, but because the type signature enforces that the value coming in is doubly-wrapped and the value going out is still wrapped, we can maintain our safety. Yay type systems.

Wrapper ⇒ Action, Unwrapping ⇒ Sequencing

Up until now, we've been calling these types wrappers, containers, or contexts. With Monad it can be easier to think of them as actions. An action implies that something else may occur as a result of evaluating this otherwise pure function: side-effects. These can be real-world side effects in the case of IO, or context-changing side effects in the case of Maybe or List.

Unwrapping as a concept should then be replaced with evaluating or running an action, it's when any side-effects will be realized. Again in the case of Maybe, when we attempt to unwrap a Nothing value via (>>=), that's the point at which the entire computation becomes a Nothing.

Once we've made that conceptual leap, we can think about dependant, or sequenced actions. In the case of IO, we have an expectation that actions will be performed in a particular order. In the case of Maybe, we need to know that if an earlier function returns Nothing, the later functions will know about it.

The ability for a Monad to be unwrapped or evaluated combined with the type signature of (>>=) provides for sequencing because it enforces that the left hand side is evaluated before the right hand side. This must be true because the left hand value has to be evaluated (i.e. unwrapped) for the right hand side to even be evaluable at all.

What's the Point?

With all of this background knowledge, I came to a simple mental model for applicative functors vs monads: Monad is for series where Applicative is for parallel.

We use a monad for composing multiple actions (values with context) into a single action (a new value with context). We use applicative for the same reason. The difference lies (of course) in how that composition is carried out. With a monad, each action is evaluated in turn and the results of each are fed into the next via (>>=). This implies ordering. With an applicative functor, every value is unwrapped in turn as functions are applied via (<*>) and the results are combined into a single value in "parallel".

Let's walk through a real example.

Building a User

In an application I'm working on, I'm doing OAuth based authentication. My domain has the following (simplified) user type:

data User = User
    { userFirstName :: Text
    , userLastName  :: Text
    , userEmail     :: Text
    }

During the process of authentication, an OAuth endpoint provides me with some profile data which ultimately comes back as an association list:

type Profile = [(Text, Text)]

-- Example:
-- [ ("first_name", "Pat"            )
-- , ("last_name" , "Brisbin"        )
-- , ("email"     , "me@pbrisbin.com")
-- ]

Within this list, I can find user data via the lookup function which takes a key and returns a Maybe value. I had to write the function that builds a User out of this list of profile values. I also had to propagate any Maybe values by returning Maybe User.

First, let's write this without exploiting the fact that Maybe is a monad or an applicative:

buildUser :: Profile -> Maybe User
buildUser p =
    case lookup "first_name" p of
        Nothing -> Nothing
        Just fn -> case lookup "last_name" p of
            Nothing -> Nothing
            Just ln -> case lookup "email" p of
                Nothing -> Nothing
                Just e  -> Just $ User fn ln e

Oof.

Treating Maybe as a Monad makes this much, much cleaner:

buildUser :: Profile -> Maybe User
buildUser p = do
    fn <- lookup "first_name" p
    ln <- lookup "last_name" p
    e  <- lookup "email" p

    return $ User fn ln e

Up until a few weeks ago, I would've stopped there and been extremely proud of myself and Haskell. Haskell for supplying such a great abstraction for potential failed lookups, and myself for knowing how to use it.

Hopefully, the content of this blog post has made it clear that we can do better.

Series vs Parallel

Think about the thing we're modelling here. A monad is best used for sequencing dependant actions with side-effects. Does it matter in what order we look things up? If one key's not found, we want Nothing regardless of which key it is or when it goes missing. What we're really doing here is taking the three values with context (the Maybe profile values) and combining them all together via the User data constructor.

This is Applicative, I know this.

-- f :: a    -> b    -> c    -> d
User :: Text -> Text -> Text -> User

-- x                  :: f     a
lookup "first_name" p :: Maybe Text

-- y                 :: f     b
lookup "last_name" p :: Maybe Text

-- z             :: f     c
lookup "email" p :: Maybe Text

-- result :: f d
-- result = f <$> x <*> y <*> z
buildUser :: Profile -> Maybe User
buildUser p = User
    <$> lookup "first_name" p
    <*> lookup "last_name" p
    <*> lookup "email" p

And now, I understand when to reach for Applicative over Monad. Perhaps you do too?

published on 30 Mar 2014, tagged with haskell applicative

Writing JSON APIs with Yesod

Lately at work, I've been fortunate enough to work on a JSON API which I was given the freedom to write in Yesod. I was a bit hesitant at first since my only Yesod experience has been richer html-based sites and I wasn't sure what support (if any) there was for strictly JSON APIs. Rails has a number of conveniences for writing concise controllers and standing up APIs quickly -- I was afraid Yesod may be lacking.

I quickly realized my hesitation was unfounded. The process was incredibly smooth and Yesod comes with just as many niceties that allow for rapid development and concise code when it comes to JSON-only API applications. Couple this with all of the benefits inherent in using Haskell, and it becomes clear that Yesod is well-suited to sites of this nature.

In this post, I'll outline the process of building such a site, explain some conventions I've landed on, and discuss one possible pitfall when dealing with model relations.

Note: The code in this tutorial was extracted from a current project and is in fact working there. However, I haven't test-compiled the examples exactly as they appear in the post. It's entirely possible there are typos and the like. Please reach out on Twitter or via email if you run into any trouble with the examples.

What We Won't Cover

This post assumes you're familiar with Haskell and Yesod. It also won't cover some important but un-interesting aspects of API design. We'll give ourselves arbitrary requirements and I'll show only the code required to meet those.

Specifically, the following will not be discussed:

Getting Started

To begin, let's get a basic Yesod site scaffolded out. How you do this is up to you, but here's my preferred steps:

$ mkdir ./mysite && cd ./mysite
$ cabal sandbox init
$ cabal install alex happy yesod-bin
$ yesod init --bare
$ cabal install --dependencies-only
$ yesod devel

The scaffold comes with a number of features we won't need. You don't have to remove them, but if you'd like to, here they are:

Models

For our API example, we'll consider a site with posts and comments. We'll keep things simple, additional models or attributes would just mean more lines in our JSON instances or more handlers of the same basic form. This would result in larger examples, but not add any value to the tutorial.

Let's go ahead and define the models:

config/models

Post
  title Text
  content Text

Comment
  post PostId
  content Text

JSON

It's true that we can add a json keyword in our model definition and get derived ToJSON/FromJSON instances for free on all of our models; we won't do that though. I find these JSON instances, well, ugly. You'll probably want your JSON to conform to some conventional format, be it jsonapi or Active Model Serializers. Client side frameworks like Ember or Angular will have better built-in support if your API conforms to something conventional. Writing the instances by hand is also more transparent and easily customized later.

Since what we do doesn't much matter, only that we do it, I'm going to write JSON instances and endpoints to appear as they would in a Rails project using Active Model Serializers.

Model.hs

share [mkPersist sqlOnlySettings, mkMigrate "migrateAll"]
    $(persistFileWith lowerCaseSettings "config/models")

-- { "id": 1, "title": "A title", "content": "The content" }
instance ToJSON (Entity Post) where
    toJSON (Entity pid p) = object
        [ "id"      .= (String $ toPathPiece pid)
        , "title"   .= postTitle p
        , "content" .= postContent p
        ]

instance FromJSON Post where
    parseJSON (Object o) = Post
        <$> o .: "title"
        <*> o .: "content"

    parseJSON _ = mzero

-- { "id": 1, "post_id": 1, "content": "The comment content" }
instance ToJSON (Entity Comment) where
    toJSON (Entity cid c) = object
        [ "id"      .= (String $ toPathPiece cid)
        , "post_id" .= (String $ toPathPiece $ commentPost c)
        , "content" .= commentContent c
        ]

-- We'll talk about this later
--instance FromJSON Comment where

Routes and Handlers

Let's start with a RESTful endpoint for posts:

config/routes

/posts         PostsR GET POST
/posts/#PostId PostR  GET PUT DELETE

Since our API should return proper status codes, let's add the required functions to Import.hs, making them available everywhere:

Import.hs

import Network.HTTP.Types as Import
    ( status200
    , status201
    , status400
    , status403
    , status404
    )

Next we write some handlers:

Handlers/Posts.hs

getPostsR :: Handler Value
getPostsR = do
    posts <- runDB $ selectList [] [] :: Handler [Entity Post]

    return $ object ["posts" .= posts]

postPostsR :: Handler ()
postPostsR = do
    post <- parseJsonBody_ :: Handler Post
    _    <- runDB $ insert post

    sendResponseStatus status201 ("CREATED" :: Text)

You'll notice we need to add a few explicit type annotations. Normally, Haskell can infer everything for us, but in this case the reason for the annotations is actually pretty interesting. The selectList function will return any type that's persistable. Normally we would simply treat the returned records as a particular type and Haskell would say, "Aha! You wanted a Post" and then, as if by time travel, selectList would give us appropriate results.

In this case, all we do with the returned posts is pass them to object. Since object can work with any type than can be represented as JSON, Haskell doesn't know which type we mean. We must remove the ambiguity with a type annotation somewhere.

Handlers/Post.hs

getPostR :: PostId -> Handler Value
getPostR pid = do
    post <- runDB $ get404 pid

    return $ object ["post" .= post]

putPostR :: PostId -> Handler Value
putPostR pid = do
    post <- parseJsonBody_ :: Handler Post

    runDB $ replace pid post

    sendResponseStatus status200 ("UPDATED" :: Text)

deletePostR :: PostId -> Handler Value
deletePostR pid = do
    runDB $ delete pid

    sendResponseStatus status200 ("DELETED" :: Text)

I love how functions like get404 and parseJsonBody_ allow these handlers to be completely free of any error-handling concerns, but still be safe and well-behaved.

Comment Handlers

There's going to be a small annoyance in our comment handlers which I alluded to earlier by omitting the FromJSON instance on Comment. Before we get to that, let's take care of the easy stuff:

config/routes

/posts/#PostId/comments            CommentsR GET POST
/posts/#PostId/comments/#CommentId CommentR  GET PUT DELETE

Handlers/Comments.hs

getCommentsR :: PostId -> Handler Value
getCommentsR pid = do
    comments <- runDB $ selectList [CommentPost ==. pid] []

    return $ object ["comments" .= comments]

-- We'll talk about this later
--postCommentsR :: PostId -> Handler ()

For the single-resource handlers, we're going to assume that a CommentId is unique across posts, so we can ignore the PostId in these handlers.

Handlers/Comment.hs

getCommentR :: PostId -> CommentId -> Handler Value
getCommentR _ cid = do
    comment <- runDB $ get404 cid

    return $ object ["comment" .= comment]

-- We'll talk about this later
--putCommentR :: PostId -> CommentId -> Handler ()

deleteCommentR :: PostId -> CommentId -> Handler ()
deleteCommentR _ cid = do
    runDB $ delete cid

    sendResponseStatus status200 ("DELETED" :: Text)

Handling Relations

Up until now, we've been able to define JSON instances for our model, use parseJsonBody_, and insert the result. In this case however, the request body will be lacking the Post ID (since it's in the URL). This means we need to parse a different but similar data type from the JSON, then use that and the URL parameter to build a Comment.

Helpers/Comment.hs

-- This datatype would be richer if Comment had more attributes. For now 
-- we only have to deal with content, so I can use a simple newtype.
newtype CommentAttrs = CommentAttrs Text

instance FromJSON CommentAttrs where
    parseJSON (Object o) = CommentAttrs <$> o .: "content"
    parseJSON _          = mzero

toComment :: PostId -> CommentAttrs -> Comment
toComment pid (CommentAttrs content) = Comment
    { commentPost    = pid
    , commentContent = content
    }

This may seem a bit verbose and even redundant, and there's probably a more elegant way to get around this situation. Lacking that, I think the additional safety (vs the obvious solution of making commentPost a Maybe) and separation of concerns (vs putting this in the model layer) is worth the extra typing. It's also very easy to use:

Handlers/Comments.hs

import Helpers.Comment

postCommentsR :: PostId -> Handler ()
postCommentsR pid = do
    _ <- runDB . insert . toComment pid =<< parseJsonBody_

    sendResponseStatus status201 ("CREATED" :: Text)

Handlers/Comment.hs

import Helpers.Comment

putCommentR :: PostId -> CommentId -> Handler
putCommentR pid cid = do
    runDB . replace cid . toComment pid =<< parseJsonBody_

    sendResponseStatus status200 ("UPDATED" :: Text)
We don't need a type annotation on parseJsonBody_ in this case. Since the result is being passed to toComment pid, Haskell knows we want a CommentAttrs and uses its parseJSON function within parseJsonBody_

Conclusion

With a relatively small amount of time and code, we've written a fully-featured JSON API using Yesod. I think the JSON instances and API handlers are more concise and readable than the analogous Rails serializers and controllers. Our system is also far safer thanks to the type system and framework-provided functions like get404 and parseJsonBody_ without us needing to explicitly deal with any of that.

I hope this post has shown that Yesod is indeed a viable option for projects of this nature.

published on 22 Feb 2014, tagged with haskell yesod

Random Numbers without Mutation

In lecture 5A of Structure & Interpretation of Computer Programs, Gerald Sussman introduces the idea of assignments, side effects and state. Before that, they had been working entirely in purely functional Lisp which could be completely evaluated and reasoned about using the substitution model. He states repeatedly that this is a horrible thing as it requires a far more complex view of programs. At the end of the lecture, he shows a compelling example of why we must introduce this horrible thing anyway; without it, we cannot decouple parts of our algorithms cleanly and would be reduced to huge single-function programs in some critical cases.

The example chosen in SICP is estimating π using Cesaro's method. The method states that the probability that any two random numbers' greatest common divisor equals 1 is itself equal to 6/π2.

Since I know Ruby better than Lisp (and I'd venture my readers do too), here's a ported version:

def estimate_pi(trials)
  p = monte_carlo(trials) { cesaro }

  Math.sqrt(6 / p)
end

def cesaro
  rand.gcd(rand) == 1
end

def monte_carlo(trials, &block)
  iter = ->(trials, passed) do
    if trials == 0
      passed
    else
      if block.call
        iter.call(trials - 1, passed + 1)
      else
        iter.call(trials - 1, passed)
      end
    end
  end

  iter.call(trials, 0) / trials.to_f
end

I've written this code to closely match the Lisp version which used a recursive iterator. Unfortunately, this means that any reasonable number of trials will exhaust Ruby's stack limit.

The code above also assumes a rand function which will return different random integers on each call. To do so, it must employ mutation and hold internal state:

def rand
  @x ||= random_init
  @x   = random_update(@x)

  @x
end

Here I assume the same primitives as Sussman does, though it wouldn't be difficult to wrap Ruby's built-in rand to return integers instead of floats. The important thing is that this function needs to hold onto the previously returned random value in order to provide the next.

Sussman states that without this impure rand function, it would be very difficult to decouple the cesaro function from the monte_carlo one. Without utilizing (re)assignment and mutation, we would have to write our estimation function as one giant blob:

def estimate_pi(trials)
  iter = ->(trials, passed, x1, x2) do
    if trials == 0
      passed
    else
      x1_ = rand_update(x2)
      x2_ = rand_update(x1_)

      if x1.gcd(x2) == 1
        iter.call(trials - 1, passed + 1, x1_, x2_)
      else
        iter.call(trials - 1, passed, x1_, x2_)
      end
    end
  end

  x1 = rand_init
  x2 = rand_update(x1)

  p = iter.call(trials, 0, x1, x2) / trials.to_f

  Math.sqrt(6 / p)
end

Ouch.

It's at this point Sussman stops, content with his justification for adding mutability to Lisp. I'd like to explore a bit further: what if remaining pure were non-negotiable? Are there other ways to make decoupled systems and elegant code without sacrificing purity?

RGen

Let's start with a non-mutating random number generator:

class RGen
  def initialize(seed = nil)
    @seed = seed || random_init
  end

  def next
    x = random_update(@seed)

    [x, RGen.new(x)]
  end
end

def rand(g)
  g.next
end

This allows for the following implementation:

def estimate_pi(trials)
  p = monte_carlo(trials) { |g| cesaro(g) }

  Math.sqrt(6 / p)
end

def cesaro(g)
  x1, g1 = rand(g)
  x2, g2 = rand(g1)

  [x1.gcd(x2) == 1, g2]
end

def monte_carlo(trials, &block)
  iter = ->(trials, passed, g) do
    if trials == 0
      passed
    else
      ret, g_ = block.call(g)

      if ret
        iter.call(trials - 1, passed + 1, g_)
      else
        iter.call(trials - 1, passed, g_)
      end
    end
  end

  iter.call(trials, 0, RGen.new) / trials.to_f
end

We've moved out of the single monolithic function, which is a step in the right direction. The additional generator arguments being passed all over the place makes for some readability problems though. The reason for that is a missing abstraction; one that's difficult to model in Ruby. To clean this up further, we'll need to move to a language where purity was in fact non-negotiable: Haskell.

In Haskell, the type signature of our current monte_carlo function would be:

monteCarlo :: Int                    -- number of trials
           -> (RGen -> (Bool, RGen)) -- the experiment
           -> Double                 -- result

Within monte_carlo, we need to repeatedly call the block with a fresh random number generator. Calling RGen#next gives us an updated generator along with the next random value, but that must happen within the iterator block. In order to get it out again and pass it into the next iteration, we need to return it. This is why cesaro has the type that it does:

cesaro :: RGen -> (Bool, RGen)

cesaro depends on some external state so it accepts it as an argument. It also affects that state so it must return it as part of its return value. monteCarlo is responsible for creating an initial state and "threading" it though repeated calls to the experiment given. Mutable state is "faked" by passing a return value as argument to each computation in turn.

You'll also notice this is a similar type signature as our rand function:

rand :: RGen -> (Int, RGen)

This similarity and process is a generic concern which has nothing to do with Cesaro's method or performing Monte Carlo tests. We should be able to leverage the similarities and separate this concern out of our main algorithm. Monadic state allows us to do exactly that.

RGenState

For the Haskell examples, I'll be using System.Random.StdGen in place of the RGen class we've been working with so far. It is exactly like our RGen class above in that it can be initialized with some seed, and there is a random function with the type StdGen -> (Int, StdGen).

The abstract thing we're lacking is a way to call those function successively, passing the StdGen returned from one invocation as the argument to the next invocation, all the while being able to access that a (the random integer or experiment outcome) whenever needed. Haskell, has just such an abstraction, it's in Control.Monad.State.

First we'll need some imports.

import System.Random
import Control.Monad.State

Notice that we have a handful of functions with similar form.

(StdGen -> (a, StdGen))

What Control.Monad.State provides is a type that looks awfully similar.

data State s a = State { runState :: (s -> (a, s)) }

Let's declare a type synonym which fixes that s type variable to the state we care about: a random number generator.

type RGenState a = State StdGen a

By replacing the s in State with our StdGen type, we end up with a more concrete type that looks as if we had written this:

data RGenState a = RGenState
    { runState :: (StdGen -> (a, StdGen)) }

And then went on to write all the various instances that make this type useful. By using such a type synonym, we get all those instances and functions for free.

Our first example:

rand :: RGenState Int
rand = state random

We can "evaluate" this action with one of a number of functions provided by the library, all of which require some initial state. runState will literally just execute the function and return the result and the updated state (in case you missed it, it's just the record accessor for the State type). evalState will execute the function, discard the updated state, and give us only the result. execState will do the inverse: execute the function, discard the result, and give us only the updated state.

We'll be using evalState exclusively since we don't care about how the random number generator ends up after these actions, only that it gets updated and passed along the way. Let's wrap that up in a function that both provides the initial state and evaluates the action.

runRandom :: RGenState a -> a
runRandom f = evalState f (mkStdGen 1)

-- runRandom rand
-- => 7917908265643496962

Unfortunately, the result will be the same every time since we're using a constant seed. You'll see soon that this is an easy limitation to address after the fact.

With this bit of glue code in hand, we can re-write our program in a nice modular way without any actual mutable state or re-assignment.

estimatePi :: Int -> Double
estimatePi n = sqrt $ 6 / (monteCarlo n cesaro)

cesaro :: RGenState Bool
cesaro = do
    x1 <- rand
    x2 <- rand

    return $ gcd x1 x2 == 1

monteCarlo :: Int -> RGenState Bool -> Double
monteCarlo trials experiment = runRandom $ do
    outcomes <- replicateM trials experiment

    return $ (length $ filter id outcomes) `divide` trials

  where
    divide :: Int -> Int -> Double
    divide a b = fromIntegral a / fromIntegral b

Even with a constant seed, it works pretty well:

main = print $ estimatePi 1000
-- => 3.149183286488868

And For My Last Trick

It's easy to fall into the trap of thinking that Haskell's type system is limiting in some way. The monteCarlo function above can only work with random-number-based experiments? Pretty weak.

Consider the following refactoring:

estimatePi :: Int -> RGenState Double
estimatePi n = do
  p <- monteCarlo n cesaro

  return $ sqrt (6 / p)

cesaro :: RGenState Bool
cesaro = do
  x1 <- rand
  x2 <- rand

  return $ gcd x1 x2 == 1

monteCarlo :: Monad m => Int -> m Bool -> m Double
monteCarlo trials experiment = do
  outcomes <- replicateM trials experiment

  return $ (length $ filter id outcomes) `divide` trials

  where
    divide :: Int -> Int -> Double
    divide a b = fromIntegral a / fromIntegral b

main :: IO ()
main = print $ runRandom $ estimatePi 1000

The minor change made was moving the call to runRandom all the way up to main. This allows us to pass stateful computations throughout our application without ever caring about that state except at this highest level.

This would make it simple to add true randomness (which requires IO) by replacing the call to runRandom with something that pulls entropy in via IO rather than using mkStdGen.

runTrueRandom :: RGenState a -> IO a
runTrueRandom f = do
    s <- newStdGen

    evalState f s

main = print =<< runTrueRandom (estimatePi 1000)

One could even do this conditionally so that your random-based computations became deterministic during tests.

Another important point here is that monteCarlo can now work with any Monad! This makes perfect sense: The purpose of this function is to run experiments and tally outcomes. The idea of an experiment only makes sense if there's some outside force which might change the results from run to run, but who cares what that outside force is? Haskell don't care. Haskell requires we only specify it as far as we need to: it's some Monad m, nothing more.

This means we can run IO-based experiments via the Monte Carlo method with the same monteCarlo function just by swapping out the monad:

What if Cesaro claimed the probability that the current second is an even number is equal to 6/π2? Seems reasonable, let's model it:

-- same code, different name / type
estimatePiIO :: Int -> IO Double
estimatePiIO n = do
  p <- monteCarlo n cesaroIO

  return $ sqrt (6 / p)

cesaroIO :: IO Bool
cesaroIO = do
  t <- getCurrentTime

  return $ even $ utcDayTime t

monteCarlo :: Monad m => Int -> m Bool -> m Double
monteCarlo trials experiment = -- doesn't change at all!

main :: IO ()
main = print =<< estimatePiIO 1000

I find the fact that this expressiveness, generality, and polymorphism can share the same space as the strictness and incredible safety of this type system fascinating.

published on 09 Feb 2014, tagged with haskell

Automated Unit Testing in Haskell

Hspec is a BDD library for writing Rspec-style tests in Haskell. In this post, I'm going to describe setting up a Haskell project using this test framework. What we'll end up with is a series of tests which can be run individually (at the module level), or all together (as part of packaging). Then I'll briefly mention Guard (a Ruby tool) and how we can use that to automatically run relevant tests as we change code.

Project Layout

For any of this to work, our implementation and test modules must follow a particular layout:

Code/liquid/
├── src
│   └── Text
│       ├── Liquid
│       │   ├── Context.hs
│       │   ├── Parse.hs
│       │   └── Render.hs
│       └── Liquid.hs
└── test
    ├── SpecHelper.hs
    ├── Spec.hs
    └── Text
        └── Liquid
            ├── ParseSpec.hs
            └── RenderSpec.hs

Notice that for each implementation module (under ./src) there is a corresponding spec file at the same relative path (under ./test) with a consistent, conventional name (<ModuleName>Spec.hs). For this post, I'm going to outline the first few steps of building the Parse module of the above source tree which happens to be my liquid library, a Haskell implementation of Shopify's template system.

Hspec Discover

Hspec provides a useful function called hspec-discover. If your project follows the conventional layout above, you can simply create a file like so:

test/Spec.hs

{-# OPTIONS_GHC -F -pgmF hspec-discover #-}

And when that file is executed, all of your specs will be found and run together as a single suite.

SpecHelper

I like to create a central helper module which gets imported into all specs. It simply exports our test framework and implementation code:

test/SpecHelper.hs

module SpecHelper
    ( module Test.Hspec
    , module Text.Liquid.Parse
    ) where

import Test.Hspec
import Text.Liquid.Parse

This file might not seem worth it now, but as you add more modules, it becomes useful quickly.

Baby's First Spec

test/Text/Liquid/ParseSpec.hs

module Text.Liquid.ParseSpec where

import SpecHelper

spec :: Spec
spec = do
    describe "Text.Liquid.Parse" $ do
        context "Simple text" $ do
            it "parses exactly as-is" $ do
                let content = "Some simple text"

                parseTemplate content `shouldBe` Right [TString content]

main :: IO ()
main = hspec spec

With this first spec, I've already made some assumptions and design decisions.

The API into our module will be a single parseTemplate function which returns an Either type (commonly used to represent success or failure). The Right value (conventionally used for success) will be a list of template parts. One such part can be constructed with the TString function and is used to represent literal text with no interpolation or logic. This is the simplest template part possible and is therefore a good place to start.

The spec function is what will be found by hspec-discover and rolled up into a project-wide test. I've also added a main function which just runs said spec. This allows me to easily run the spec in isolation, which you should do now:

$ runhaskell -isrc -itest test/Text/Liquid/ParseSpec.hs

The first error you should see is an inability to find Test.Hspec. Go ahead and install it:

$ cabal install hspec

You should then get a similar error for Text.Liquid.Parse then some more about functions and types that are not yet defined. Let's go ahead and implement just enough to get past that:

src/Text/Liquid/Parse.hs

module Text.Liquid.Parse where

type Template = [TPart]

data TPart = TString String

parseTemplate :: String -> Either Template String
parseTemplate = undefined

The test should run now and give you a nice red failure due to the attempted evaluation of undefined.

Since implementing Parse is not the purpose of this post, I won't be moving forward in that direction. In stead, I'm going to show you how to set this library up as a package which can be cabal installed and/or cabal tested by end-users.

For now, you can pass the test easily like so:

src/Text/Liquid/Parse.hs

parseTemplate :: String -> Either Template String
parseTemplate str = Right [TString str]

For TDD purists, this is actually the correct thing to do here: write the simplest implementation to pass the test (even if you "know" it's not going to last), then write another failing test to force you to implement a little more. I don't typically subscribe to that level of TDD purity, but I can see the appeal.

Cabal

We've already got Spec.hs which, when executed, will run all our specs together:

$ runhaskell -isrc -itest test/Spec.hs

We just need to wire that into the Cabal packaging system:

liquid.cabal

name:          liquid
version:       0.0.0
license:       MIT
copyright:     (c) 2013 Pat Brisbin
author:        Pat Brisbin <pbrisbin@gmail.com>
maintainer:    Pat Brisbin <pbrisbin@gmail.com>
build-type:    Simple
cabal-version: >= 1.8

library
  hs-source-dirs: src

  exposed-modules: Text.Liquid.Parse

  build-depends: base == 4.*

test-suite spec
  type: exitcode-stdio-1.0

  hs-source-dirs: test

  main-is: Spec.hs

  build-depends: base  == 4.*
               , hspec >= 1.3
               , liquid

With this in place, testing our package is simple:

$ cabal configure --enable-tests
...
$ cabal build
...
$ cabal test
Building liquid-0.0.0...
Preprocessing library liquid-0.0.0...
In-place registering liquid-0.0.0...
Preprocessing test suite 'spec' for liquid-0.0.0...
Linking dist/build/spec/spec ...
Running 1 test suites...
Test suite spec: RUNNING...
Test suite spec: PASS
Test suite logged to: dist/test/liquid-0.0.0-spec.log
1 of 1 test suites (1 of 1 test cases) passed.

Guard

Another thing I like to setup is the automatic running of relevant specs as I change code. To do this, we can use a tool from Ruby-land called Guard. Guard is a great example of a simple tool doing one thing well. All it does is watch files and execute actions based on rules defined in a Guardfile. Through plugins and extensions, there are a number of pre-built solutions for all sorts of common needs: restarting servers, regenerating ctags, or running tests.

We're going to use guard-shell which is a simple extension allowing for running shell commands and spawning notifications.

$ gem install guard-shell

Next, create a Guardfile:

Guardfile

# Runs the command and prints a notification
def execute(cmd)
  if system(cmd)
    n 'Build succeeded', 'hspec', :success
  else
    n 'Build failed', 'hspec', :failed
  end
end

def run_all_tests
  execute %{
    cabal configure --enable-tests &&
    cabal build && cabal test
  }
end

def run_tests(mod)
  specfile = "test/#{mod}Spec.hs"

  if File.exists?(specfile)
    files = [specfile]
  else
    files = Dir['test/**/*.hs']
  end

  execute "ghc -isrc -itest -e main #{files.join(' ')}"
end

guard :shell do
  watch(%r{.*\.cabal$})          { run_all_tests }
  watch(%r{test/SpecHelper.hs$}) { run_all_tests }
  watch(%r{src/(.+)\.hs$})       { |m| run_tests(m[1]) }
  watch(%r{test/(.+)Spec\.hs$})  { |m| run_tests(m[1]) }
end

Much of this Guardfile comes from this blog post by Michael Xavier. His version also includes cabal sandbox support, so be sure to check it out if that interests you.

If you like to bundle all your Ruby gems (and you probably should) that can be done easily, just see my main liquid repo as that's how I do things there.

In one terminal, start guard:

$ guard

Finally, simulate an edit in your module and watch the test automatically run:

$ touch src/Text/Liquid/Parse.hs

And there you go, fully automated unit testing in Haskell.

published on 01 Dec 2013, tagged with testing haskell cabal hunit ruby guard

Using Notify-OSD for XMonad Notifications

In my continuing efforts to strip my computing experience of any non-essential parts, I've decided to ditch my statusbars. My desktop is now solely a grid of tiled terminals (and a browser). It's quite nice. The only thing I slightly missed, however, was notifications when one of my windows set Urgency. This used to trigger a bright yellow color for that workspace in my dzen-based statusbar.

A Brief Tangent:

Windows have these properties called "hints" which they can set on themselves at will. These properties can be read by Window Managers in an effort to do the Right Thing. Hints are how a Window tells the Manager, "Hey, I should be full-screen" or, "I'm a dialog, float me on top of everything". One such hint is WM_URGENT.

WM_URGENT is how windows get your attention. It's what makes them flash in your task bar or bounce in your dock. If you're using a sane terminal, it should set WM_URGENT on itself if the program running within it triggers a "bell".

By telling applications like mutt or weechat to print a bell when I get new email or someone nick-highlights me, I can easily get notifications of these events even from applications that are running within screen, in an ssh session, on some server far, far away. Pretty neat.

Now that I'm without a status bar, I need to be notified some other way. Enter Notify-OSD.

Notify-OSD

Notify-OSD is part of the desktop notification system of GNOME, but it can be installed standalone and used to send notifications from the command-line very easily:

$ notify-send "A title" "A message"

So how do we get XMonad to send a useful notification via notify-send whenever a window sets the WM_URGENT hint? Enter the UrgencyHook.

UrgencyHook

Setting a custom urgency hook is very easy, but not exactly intuitive. What we're actually doing is declaring a custom data type, then making it an instance of the UrgencyHook typeclass. The single required function to be a member of this typeclass is an action which will be run whenever a window sets urgency. Conveniently, it's given the window with urgency as an argument. We can use this to format our notification.

First off, add the module imports we'll need:

import XMonad.Hooks.UrgencyHook
import XMonad.Util.NamedWindows
import XMonad.Util.Run

import qualified XMonad.StackSet as W

Then make that custom datatype and instance:

data LibNotifyUrgencyHook = LibNotifyUrgencyHook deriving (Read, Show)

instance UrgencyHook LibNotifyUrgencyHook where
    urgencyHook LibNotifyUrgencyHook w = do
        name     <- getName w
        Just idx <- fmap (W.findTag w) $ gets windowset

        safeSpawn "notify-send" [show name, "workspace " ++ idx]

Finally, update main like so:

main :: IO ()
main = xmonad
     $ withUrgencyHook LibNotifyUrgencyHook
     $ defaultConfig
        { -- ...
        , -- ...
        }

To test this, open a terminal in some workspace and type:

$ sleep 3 && printf "\a"

Then immediately focus away from that workspace. In a few seconds, you should see a nice pop-up like:

notify-send 

You can see the title of the notification is the window name and I use the message to tell me the workspace number. In this case, the name is the default "urxvt" for a terminal window, but I also use a few wrapper scripts to open urxvt with the -n option to set its name to something specific which will then come through in any notifications from that window.

If that doesn't work, it's likely your terminal doesn't set Urgency on bells. For rxvt at least, the setting is:

URxvt*urgentOnBell: true
URxvt*visualBell:   false

In Xresources or Xdefaults, whichever you use.

published on 15 Oct 2013, tagged with xmonad haskell notify-osd

The Advent of IO

What if we wanted to write a Haskell program to behave something like this:

$ runhaskell hello.hs
Hello who?

$ runhaskell hello.hs Pat
Hello Pat

$ runhaskell hello.hs -u Pat
Hello PAT

One implementation may look like this:

main :: IO ()
main = do
    args <- getArgs

    let name = case args of
                ("-u":n:_) -> map toUpper n
                (     n:_) -> n
                otherwise  -> "who?"

    putStrLn $ "Hello " ++ name

And almost immediately, the budding Haskell programmer is met with a number of confusing concepts: What the heck is IO ()? What does <- mean? When questions like these are raised, the answer is "well, because Monad." Not very enlightening.

Haskell's IO monad is an amazingly elegant solution to a very thorny problem, but why is it so hard to wrap one's head around? I think the reason it can be so confusing is that we come at it backwards, we see this elegant result but know not the problem it solves.

In the Beginning

In the very early days of Haskell, there was no IO monad. In stead, programs used a somewhat confusing [Response] -> [Request] model (some details can be found here).

It was clear that if Haskell were to become generally useful, there had to be something better, something that allowed more intuitive interactions with the outside word. The problem was extending this idea of a globally accessible Outside World without sacrificing the purity of the program.

Recently, while pondering the State monad, I had an epiphany which confirms how the problem was solved: Every function is still pure.

How is this possible? Well, first we have to look at IO actions as any other form of stateful computation. Then we just have to prove to ourselves that stateful computations can be done in a pure way.

Take a program like this:

main :: IO ()
main = doTheThing

doTheThing :: IO ()
doTheThing = do
    putStrLn "one"
    putStrLn "two"

It's common to refer to these functions as impure and having side effects. We look at an imperative line like putStrLn and assume that the function is "reaching out" affecting the outside world by printing text to some terminal it has not received as a direct input, and is therefore impure.

This mis-characterization isn't itself bad, we do need a way to differentiate Haskell functions which "live in IO" vs those that don't. Pure vs impure seems like good enough categories, but it's not entirely correct and can lead folks astray when more complex concepts are introduced.

Imagine if we in stead wrote the program like this:

main :: World -> (World, ())
main world = doTheThing world

putStrLn :: String -> World -> (World, ())
putStrLn str world = appendText (str ++ "\n") (terminal world)

doTheThing :: World -> (World, ())
doTheThing world =
    let (world1, _) = (putStrLn "one") world
        (world2, _) = (putStrLn "two") world1

    in (world2, ())

I've purposely left appendText undefined and not told you what World is, but you can still confirm that these functions act only on their direct inputs, thus remaining completely pure. If we accept that there is some notion of a World to which we can appendText provided by the Haskell language, then the above is a completely accurate de-sugaring of the original program.

To further explore this idea, I went through the mental exercise of building the IO monad myself by substituting my own World into the confines of a very simple alternate main syntax.

I hope you'll find it as illustrative as I did.

Limiting Main.main

Let's pretend that Haskell is in its infancy and the designers have punted the idea of IO. They've chosen in stead to flesh out the rest of the language with vastly simpler semantics for a program's main.

In this hypothetical language, a program's main function is of the type [String] -> String. When executed, the Haskell runtime will provide the program's commandline arguments to your main function as a list of Strings. Whatever String your main function returns will then be printed on stdout.

Let's try out this language on our sample problem:

import Data.Char (toUpper)

main1 :: [String] -> String
main1 args = sayHello1 args

sayHello1 :: [String] -> String
sayHello1 args = "Hello " ++ (nameFromArgs1 args)

nameFromArgs1 :: [String] -> String
nameFromArgs1 ("-u":name:_) = map toUpper name
nameFromArgs1 (     name:_) = name
nameFromArgs1            _  = "who?"

Obviously things could be done simpler, but I've purposely written it using two functions: one which requires access to program input and one which affects program output. This will make our exercise much more interesting as we move toward monadic IO.

Our current method of passing everything that's needed as direct arguments and getting back anything that's needed as direct results works well for simple cases, but it doesn't scale. When we consider that the input to and output of main might eventually be a rich object representing the entire outside world (file handles, TCP sockets, environment variables, etc), it becomes clear that passing these resources down into and back out of any functions we wish to use is simply not workable.

However, passing the data directly in and getting the result directly out is the only way to keep functions pure. It's also the only way to keep them honest. If any one function needs access to some piece of the outside world, any functions which use it also need that same access. This required access propagates all the way up to main which is the only place that data is available a-priori.

What if there were a way to continue to do this but simply make it easier on the eyes (and fingers) through syntax or abstraction?

Worldly Actions

The solution to our problem begins by defining two new types: World and Action.

A World is just something that represents the commandline arguments given to main and the String which must be returned by main for our program to have any output. At this point in time, there's no other aspects of the world that we have access to or could hope to affect.

data World = World
    { input  :: [String]
    , output :: String
    }

An Action is a function which takes one World and returns a different one along with some result. The differences between the given World and the returned one are known as the function's side-effects. Often, we don't care about the result itself and only want the side-effects, in these cases we'll use Haskell's () (known as Bottom, or Unit) as the result.

sayHello2 :: World -> (World, ())
sayHello2 w =
    let (w', n) = nameFromArgs2 w

    in (w' { output = output w ++ "Hello " ++ n }, ())

nameFromArgs2 :: World -> (World, String)
nameFromArgs2 w =
    case input w of
        ("-u":name:_) -> (w, map toUpper name)
        (     name:_) -> (w, name)
        otherwise     -> (w, "who?")

Now we can rewrite main to just convert its input and output into a World which gets passed through our world-changing functions.

main2 :: [String] -> String
main2 args =
    let firstWorld    = World args ""
        (newWorld, _) = sayHello2 firstWorld

    in output newWorld

In the above, we've just accepted that World -> (World, a) is this thing we call an Action. There's no reason to be implicit about these things in Haskell, so let's give it a name.

newtype Action w a = Action { runAction :: (w -> (w, a)) }

In order to create a value of this type, we simply need to give a world-changing function to its constructor. The runAction accessor allows us to pull the actual world-changing function back out again. Once we have the function itself, we can execute it on any value of type w and we'll get a new value of type w along with a result of type a.

As mentioned, we often don't care about the result and want to run an Action only for its side-effects. This next function makes running an action and discarding its result easy:

execAction :: Action w a -> w -> w
execAction a w = let (w', _) = (runAction a) w in w'

This becomes immediately useful in our newest main:

main3 :: [String] -> String
main3 args = output $ execAction (Action sayHello2) (World args "")

You'll notice we need to pass sayHello2 to the Action constructor before giving it to execAction. This is because sayHello2 is just the world-changing function itself. For reasons that should become clear soon, we don't want to do this, it would be better for our world-changing functions to be actual Actions themselves.

Before we address that, let's define a few helper Actions:

-- | Access a world's input without changing it
getArgs :: Action World [String]
getArgs = Action (\w -> (w, input w))

-- | Change a world by appending str to its output buffer
putStrLn :: String -> (Action World ())
putStrLn str = Action (\w ->
    (w { output = (output w) ++ str ++ "\n"}, ()))

Now let's fix our program:

sayHello3 :: Action World ()
sayHello3 = Action (\w ->
    let (w', n) = (runAction nameFromArgs3) w

    in (runAction (putStrLn $ "Hello " ++ n)) w')

nameFromArgs3 :: Action World String
nameFromArgs3 = Action (\w ->
    let (w', args) = (runAction getArgs) w

    in case args of
        ("-u":name:_) -> (w', map toUpper name)
        (     name:_) -> (w', name)
        otherwise     -> (w', "who?"))

This allows us to use sayHello3 directly in main:

main4 :: [String] -> String
main4 args = output $ execAction sayHello3 (World args "")

Things are still pretty clunky, but one thing to notice is that now all of the world-changing things are of the same type, specifically Action World a. Getting things to all be the same type has exposed the underlying duplication involved with sequencing lists of actions over some world.

A Monad is Born

One obvious duplication is taking two Actions and combining them into one Action which represents passing a World through them, one after another.

combine :: Action w a -> Action w b -> Action w b
combine f g = Action (\w ->
    -- call the first action on the world given to produce a new world,
    let (w',  _) = (runAction f) w

        -- then call the second action on that new world
        (w'', b) = (runAction g) w'

    -- to produce the final world and result
    in (w'', b))

f = combine (putStrLn "one") (putStrLn "two")

execAction f $ World [] ""
-- => World [] "one\ntwo\n"

What about functions like putStrLn which aren't themselves an Action until they've been given their first argument? How can we combine those with other Actions?

pipe :: Action w a -> (a -> Action w b) -> Action w b
pipe f g = Action (\w ->
    -- call the first action on the world given to produce a new world 
    -- and a result of type a,
    let (w',  a) = (runAction f) w

        -- then give the result of type a to the second function which 
        -- turns it into an action which can be called on the new world
        (w'', b) = (runAction (g a)) w'

    -- to produce the final world and result
    in (w'', b))

f = pipe getArgs (putStrLn . head)

execAction f $ World ["Pat"] ""
-- => World ["Pat"] "Pat\n"

pipe and combine both require their first argument be an Action, but what if all we have is a non-Action value?

-- turn the value into an Action by returning it as the result along 
-- with the world given
promote :: a -> Action w a
promote x = Action (\w -> (w, x))

f = pipe (promote "Hello world") putStrLn

execAction f $ World [] ""
-- => World [] "Hello world\n"

Finally, we can remove that duplication and make our code much more readable:

sayHello4 :: Action World ()
sayHello4 = pipe nameFromArgs4 (\n -> putStrLn $ "Hello " ++ n)

nameFromArgs4 :: Action World String
nameFromArgs4 =
    pipe getArgs (\args ->
        promote $ case args of
                    ("-u":name:_) -> map toUpper name
                    (     name:_) -> name
                    otherwise     -> "who?")

Turns out, the behaviors we've just defined have a name: Monad. And once you've made your type a Monad (by defining these three functions), any and all functions which have been written to deal with Monads (which is a lot) will now be able to work with your type.

To show that there are no tricks here, I'll even use the functions we've defined as the implementation in our real Monad instance:

instance Monad (Action w) where
    return = promote
    (>>=)  = pipe

-- As our first free lunch, Haskell already provides "combine" in terms 
-- of >>=. A combination is just a pipe but with the result of the first 
-- action discarded.
(>>) f g = f >>= \_ -> g

Now our functions are looking like real Haskell syntax:

sayHello5 :: Action World ()
sayHello5 = nameFromArgs5 >>= (\n -> putStrLn $ "Hello " ++ n)

nameFromArgs5 :: Action World String
nameFromArgs5 =
    getArgs >>= \args ->
        return $ case args of
                    ("-u":name:_) -> map toUpper name
                    (     name:_) -> name
                    otherwise     -> "who?"

Do It to It

Now that we've made our type a real Monad, and now that we understand what functions like return and (>>=) mean, we can make the final leap to the more imperative looking code we started with.

Haskell has something called "do-notation". All it is is a form of pre-processing which transforms expressions like this:

f = do
  args <- getArgs

  putStrLn $ head args

Into expressions like this:

f = getArgs >>= (\args -> putStrLn $ head args)

Either syntax is valid Haskell, and I use both freely depending on the scenario. Let's go ahead and rewrite our functions in do-notation:

sayHello6 :: Action World ()
sayHello6 = do
    name <- nameFromArgs5

    putStrLn $ "Hello " ++ name

nameFromArgs6 :: Action World String
nameFromArgs6 = do
    args <- getArgs

    return $ case args of
                ("-u":name:_) -> map toUpper name
                (     name:_) -> name
                otherwise     -> "who?"

It's hard to believe that, to this point, we have no such thing as IO. These functions simply describe how to make one World from another, and that only actually happens when main puts sayHello together with some initial World via execAction.

What we've done is built the system we want for IO all the way up to main. We've given any function in our system "direct" access to program input and output, all that's required is they make themselves Actions. Through the use of the Monad typeclass and do-notation, making functions Actions has become quite pleasant while keeping everything entirely pure.

Final Touches

Let's say that instead of being a primitive [String] -> String, we'll let main be itself an Action World (). Then we can let the Haskell runtime handle constructing a World, calling execAction main on it, then outputting whatever output there is in the new World we get back.

Then, let's imagine we didn't have our simplistic World type which only deals with commandline arguments and an output string. Imagine we had a rich World that knew about environment variables, file handles, and memory locations. That type would live in an impure space with access to all the richness of reality, but we could use pure Actions to describe how to read its files or access its volatile memory.

Things might end up like this:

type IO a = Action World a

main :: IO ()
main = do
    args <- getArgs

    let name = case args of
                ("-u":n:_) -> map toUpper n
                (     n:_) -> n
                otherwise  -> "who?"

    putStrLn $ "Hello " ++ name
$ runhaskell hello.hs -u io
Hello IO

published on 28 Jul 2013, tagged with haskell monad io

Parsing DATABASE_URL

A while back, I made a post about deploying yesod apps to heroku. The method used back then is no longer required (thank God!) and deploying to heroku is super simple these days. So simple, in fact, that I won't reiterate those instructions here, this post is about something a bit more specific.

Chances are, your app is using a database. And you probably don't want to hard-code those database credentials in your (probably shared) source code. What you'd rather do is parse them out of the DATABASE_URL environment variable provided by heroku.

Well, here is how you do that:

herokuConf

Eventually, I might wrap this up in a cabal package you can install, but for now just create a helper like this:

Helpers/Heroku.hs

module Helpers.Heroku (herokuConf) where

import Prelude
import Data.ByteString (ByteString)
import Data.Text (Text)
import Data.Text.Encoding (encodeUtf8)
import Database.Persist.Postgresql (PostgresConf(..))
import Web.Heroku (dbConnParams)

import qualified Data.Text as T

herokuConf :: IO PostgresConf
herokuConf = do
    params <- dbConnParams

    return PostgresConf
        { pgConnStr  = formatParams params
        , pgPoolSize = 10 -- Adjust this as you see fit!
        }

    where
        formatParams :: [(Text, Text)] -> ByteString
        formatParams = encodeUtf8 . T.unwords . map toKeyValue

toKeyValue :: (Text, Text) -> Text
toKeyValue (k, v) = k `T.append` "=" `T.append` v
This relies on the heroku package, so be sure you add that to the build-depends in your cabal file.

makeFoundation

Now, modify your application loading like so:

Application.hs

import Helpers.Heroku

makeFoundation :: AppConfig DefaultEnv Extra -> IO App
makeFoundation conf = do
    -- ...

    dbconf <- if development
                -- default behavior when in development
                then withYamlEnvironment "config/postgresql.yml" (appEnv conf)
                    Database.Persist.loadConfig >>=
                    Database.Persist.applyEnv

                -- but parse DATABASE_URL in non-development
                else herokuConf

    -- ...

    return foundation

That's it. Commit, push, enjoy!

published on 07 Jun 2013, tagged with yesod haskell heroku postgresql

For the Library Authors

Recently, Yesod released version 1.2. You can read the announcement here, the changelog here, and a detailed blog post about the subsite rewrite here. These resources do a great job of getting users' apps up on 1.2. This post won't rehash those, it is in stead intended for those of you (like myself) who maintain libraries dependent on Yesod.

The large refactor to the transformer stack and its implications in how subsites are written made it non-trivial to port my markdown, comments, and pagination libraries over to 1.2; I imagine many other such authors are in the same position and might appreciate a little guidance.

I don't claim to know why or how all this stuff works, but at least you can benefit from my trial and error.

I apologize for the lack of narrative or conclusion here, this is pretty much just a list of things I had to take care of during the update process...

Transformer Stack

You can basically find-and-replace all of these:

fooHandler :: X -> Y -> GHandler s m a
fooWidget  :: X -> Y -> GWidget s m a

Into these:

fooHandler :: X -> Y -> HandlerT m IO a
fooWidget  :: X -> Y -> WidgetT m IO a

Lifting

Anywhere you use lift to run a Handler action from within a Widget now needs to use handlerToWidget for the same purpose.

Route to Master

Subsites and their masters are now very well isolated, this means you no longer need code like this in a master site's hander:

tm <- getRouteToMaster
redirect $ tm SomeRoute

It can be simplified to just:

redirect SomeRoute

Which is way better.

The function getRouteToMaster does still exist as getRouteToParent, and it should be used (only) to route to a master site's route from within the subsite's handler.

Subsite Declaration

If you author a subsite, here is where your largest changes will be. There's a handy demo app which serves as a great reference.

Subsites now have a two-phase construction much like in Foundation.hs. So, where you might've had a single module like this:

module CommentsAdmin
  ( CommentsAdmin
  , getCommentsAdmin
  , Route(..)
  ) where

CommentsAdmin = CommentsAdmin

getCommentsAdmin :: a -> CommentsAdmin
getCommentsAdmin = const CommentsAdmin

mkYesodSub "CommentsAdmin"
    [ ClassP ''YesodComments [ VarT $ mkName "master" ] ]
    [parseRoutes|
        /                            CommentsR      GET
        /edit/#ThreadId/#CommentId   EditCommentR   GET POST
        /delete/#ThreadId/#CommentId DeleteCommentR GET POST
        |]

You now need a separate file to define the routes:

CommentsAdmin/Routes.hs

module CommentsAdmin.Routes where

CommentsAdmin = CommentsAdmin

mkYesodSubData "CommentsAdmin" [parseRoutes|
    /                            CommentsR      GET
    /edit/#ThreadId/#CommentId   EditCommentR   GET POST
    /delete/#ThreadId/#CommentId DeleteCommentR GET POST
    |]

And import/use them separately:

CommentsAdmin.hs

module Foo
  ( CommentsAdmin
  , getCommentsAdmin
  , module CommentsAdmin.Routes
  ) where

import CommentsAdmin.Routes

getCommentsAdmin :: a -> CommentsAdmin
getCommentsAdmin = const CommentsAdmin

instance YesodComments m => YesodSubDispatch CommentsAdmin (HandlerT m IO)
    where yesodSubDispatch = $(mkYesodSubDispatch resourcesCommentsAdmin)
There's probably a way around this, but I had enough wrestling to do.

You'll also want to make a Handler synonym for your subsite routes:

type Handler a = forall master. YesodComments master
               => HandlerT CommentsAdmin (HandlerT master IO) a

getCommentsR :: Handler RepHtml

It's fine to use Handler as long as you don't export it.

Subsite Actions

What you do from within a subsite will definitely need some tweaking, but that's mostly because the old way was very klunky and the new way is much cleaner.

If you want to call any functions in the context of the main site, just use lift. Usually, this'll be lift $ defaultLayout, but also, as may be common, if you have a Typeclass on your master site providing some functionality (like loading comments), you need to use lift to call those functions from within subsite handlers.

Persistent Fields

If you derive PersistField also now derive PersistFieldSql. I don't know the motivation behind the split, but as a user dog-fooding my own library, I soon realized I needed both instances on my Markdown type.

Persistent Actions

If you have a library exposing functions which are meant to be called within runDB, you probably already know those type signatures can get messy.

Well, they stay messy, but at least I can tell you what you need to change. Mine went from this:

selectPaginated :: ( PersistEntity val
                   , PersistQuery m1
                   , PersistEntityBackend val ~ PersistMonadBackend m1
                   , MonadLift (GHandler s m) m1
                => Int
                -> [Filter val]
                -> [SelectOpt val]
                -> m1 ([Entity val], GWidget s m ())

To this:

selectPaginated :: ( PersistEntity val
                   , (PersistQuery (YesodPersistBackend m (HandlerT m IO)))
                   , (PersistMonadBackend (YesodPersistBackend m (HandlerT m IO)) ~ PersistEntityBackend val)
                   , (MonadTrans (YesodPersistBackend m))
                   , Yesod m
                   )
                => Int
                -> [Filter val]
                -> [SelectOpt val]
                -> YesodDB m ([Entity val], WidgetT m IO ())

I probably could've added the Yesod m constraint and used the YesodDB alias prior to 1.2, but oh well.

published on 31 May 2013, tagged with yesod haskell

Developing Web Applications with Yesod

The following was written for issue 7 of Web & PHP magazine. Please, if you enjoy this article (or my articles in general), take the two minutes to register there and download the full PDF to show your support.

Why Haskell?

There's much more to Haskell than just the buzz-words like laziness and parallelism -- which are completely deserved, by the way. Having pure computations defined as side-effect-free morphisms that take and return immutable datatypes allows the compiler to do amazing optimizations. This frees you to write elegant, readable code but get near-C performance at the same time.

Runtime errors. grepping through source to find some method you just rewrote to ensure it's not incorrectly called somewhere. Wondering if that expression represents a String or a Boolean. Wondering how that template behaves when user is nil. Unit tests. Hitting deploy and frantically browsing the site to make sure things still work. These are the hazards of a dynamic language. These are things that all go away when you use a language like Haskell.

It's been my experience that when developing in Haskell: if it compiles, it works. I'd say, conservatively, that 93% of every bug I've ever written in Haskell has been caught immediately by the compiler. That's a testament to both the compiler and the number of bugs I'm able to produce in Haskell code. It's amazingly freeing to gain such a level of confidence in the correctness of your code simply by seeing its successful compilation.

My hope for this article is to illustrate this experience by building out a simple site in the Haskell web framework Yesod. Yesod is just one of many web frameworks in Haskell, but it's the one I'm most comfortable with. I encourage you to check it out at yesodweb.com, there are many features and considerations that I won't be touching on here.

Yesod Development

In order to develop a Yesod site, you'll need the Glasglow Haskell Compiler, along with some additional build tools. These can all be installed by setting up the Haskell Platform. There are installers for Windows, OSX, and most Linux distributions have it in their repositories.

Once the Haskell Platform is setup, type:

$ cabal update
$ cabal install yesod-platform

This one time installation of the framework can take while.

The Lemonstand

The blog example feels a bit overdone, doesn't it? In stead, let's build a lemonade stand (which I'll refer to as "The Lemonstand" from now on). We won't get too crazy with features, we just want a few pages and some database interaction so you can see how this framework can be used.

Much of the site will be provided by the code-generating tool called the Yesod Scaffold.

Yesod Init

The Yesod scaffolding tool will build out a sample site showing some of the more common and useful patterns used in Yesod sites. It will build you a simple "hello world" site with important features like persistence, authentication and static file serving already coded out. You can then edit and extend this site to quickly build out features.

It's important to note that this is not the way to structure a Yesod application, is just one way to do it. That said, this organisational structure has been refined over a long period of time and comes with many benefits.

To start our project, we do the following:

$ yesod init

We'll answer a couple of questions about ourselves and our project. I'm calling it "lemonstand" and choosing the sqlite database type since it does not require a separate server.

The first thing we have to do is pull in any additional dependencies (like the driver for the type of database we chose to use).

$ cd lemonstand
$ cabal install

In order for authentication via Google to work (a feature we'll use down the line), we need to make one small change to config/settings.yml. Please update the development block like so:

Development:
  <<: *defaults
  approot: "http://localhost:3000"

With that bit of housekeeping out of the way, go ahead and fire up the development server:

$ yesod devel

You should see lots of output about compilation, database migrations, etc. Most importantly is "Devel application launched: http://localhost:3000". Go ahead and checkout the sample site by visiting that URL in your browser.

default screenshot 

Now we're ready to hack!

Models

In a "production" lemonade stand app, we might get a little more complex with the data modeling, but to keep this demo simple, I'm going to model the system simply as well.

The scaffold already comes with the concept of a User and authentication, so we'll keep that as-is. The second most important concept will be Orders which our users can create through a typical check-out flow.

Orders will have many Lemonades which have size, price, and quantity.

Open up Model.hs. This is where we'll place our core data type definitions. You should notice a line about persistFile. What this does is parse the text file "config/models" and generate some Haskell datatypes for us. This line also adds the required boilerplate to persist these types to the database and well as the initial migration code. This is where your User model comes from.

We'll get to this file in a second, but first we're going to define some data types that won't be persisted.

Go ahead and add the following after the import lines but before the share line:

type Price = Double
type Qty   = Int

What these are are type aliases. They just allow you to refer to one type as another ([Char] is aliased to String in the standard Prelude, for instance).

If and when we later make functions that deal with Lemonades, we'll see type signatures like this:

-- | Calculate the total price of multiple lemonades
totalPrice :: [Lemonade] -> Price
totalPrice = ...

And not like this:

-- | Calculate the total price of multiple lemonades
totalPrice :: [Lemonade] -> Double
totalPrice = ...

Which is not as descriptive. It's a little thing, but it goes a long way.

We're also going to create an additional data type that won't (itself) be stored as a database record.

data Size = Small
          | Medium
          | Large
          deriving (Show, Read, Eq, Ord, Enum, Bounded)

You might be familiar with this concept as an enum. In Haskell, the concept of enumeration types are just a degenerative form of algebraic data types where the constructors take no arguments.

Don't worry about the deriving line. That just tells Haskell to go ahead and use sane defaults when performing common operations with this type like converting it to string or comparing two values for equality. With this deriving in place, Haskell knows that Small can be shown as "Small" and that Medium == Medium.

Even though we don't want to store Sizes directly in the database as records, we do plan to have fields of other records be of the type Size. To allow this, we just have to ask Yesod to generate some boilerplate on this type:

derivePersistField "Size"

Easy.

When you hit save on this file, you should see in your terminal still running yesod devel that it's recompiled your sources and restarted your development server. The important thing is that it does this successfully each time you make a change. When you introduce a bug, you'll get a compiler error directing you to the problem. This immediate and accurate feedback is important to the development process as we'll see later on.

Next, we'll go ahead and add some database models. Open up config/models.

You'll see some models are already present, we'll just add more to the bottom of the file:

Order
    user UserId

Lemonade
    order OrderId Maybe
    size  Size
    price Price
    qty   Qty

This is exactly as if you had handwritten the Haskell data types:

data Order = Order
    { orderUser :: UserId
    }

data Lemonade = Lemonade
    { lemonadeOrder :: Maybe OrderId
    , lemonadeSize  :: Size
    , lemonadePrice :: Price
    , lemonadeQty   :: Qty
    }

In addition to the above declarations, Yesod will add all of the boilerplate needed for values of these types to be (de)serialized and persisted to or restored from the database.

Again, save the file and make sure it compiles.

Notice that I used the Maybe type on lemonadeOrder. In Haskell, the this type is defined as:

data Maybe a = Just a | Nothing

This allows you to have a function which can return some a or Nothing at all. This is how Haskell can maintain type safety even when you need the concept of an optional parameter or return value.

I'm assuming here that we might want to describe Lemonades that aren't yet associated with an Order. We'll see if that turns out to be the case.

Route Handling

Before we start making further changes, let me provide some context on how the current homepage is rendered. We'll be mimicking this pattern for our other pages.

Every URL that your app responds to is listed in config/routes, so go ahead and open that file.

You'll see some scaffold-provided routes already. /static and /auth use a concept called Subsites to provide additional functionality to your app (namely static file serving and user authentication). We'll not go into this any further as it can get hairy quickly and for the purposes of this article, we can treat these as black boxes.

The rest of the entries are normal routes. For these, you provide:

  1. The relative URL you answer to (we'll get to variable pieces later)
  2. The data type of the route (again, more later)
  3. The supported methods (GET, POST, etc)

Let's look at HomeR.

In your Foundation.hs file there's another line similar to the persistFile line in Model.hs. It works much the same way in that it will parse this flat file (config/routes) and generate some Haskell code for us.

When the parser comes across this HomeR line, it's going to do a number of things. Conceptually, it's something like the following:

  1. HomeR is made a valid constructor for values of type Route which is used by the framework to route requests to your handler functions.
  2. The functions in charge of rendering and parsing URLs can now translate to and from this HomeR type.

In order to accomplish this, two functions need to be in scope: getHomeR and postHomeR. This is because we've specified GET and POST as supported methods.

So, whenever a GET request comes in for "/", Yesod will now translate that URL into the data type HomeR and know to call getHomeR which is a function that returns an HTML response (RepHtml).

If you were to define a route like "/users/#UserId UsersR GET", then your required function getUsersR would have the type UserId -> RepHtml. Since your URL has a variable in it, that piece will match as a UserId and it will be given as the first argument to your handler function -- all in an entirely type safe way.

Let's add a route for buying some lemonade:

/checkout CheckoutR GET POST

While we're here, remove the POST from HomeR since we'll no longer be using that.

When you save this file you should see some problems in your compiler window:

[7 of 7] Compiling Application      ( Application.hs, 
dist/build/Application.o )

Application.hs:30:1: Not in scope: `getCheckoutR'

Application.hs:30:1: Not in scope: `postCheckoutR'

Well, look at that. We've introduced a bug, and it was caught immediately.

Since the app now needs to answer requests for "/checkout" by calling your handler functions, they need to be there or you'd have runtime errors. There is very little potential for runtime errors in Haskell, and this is just our first example of why: the compiler catches us ahead of time.

So let's fix it. The following steps might feel a bit tedious, and in Yesod version 1.1 there is a tool to do them for you, however I think that doing things like this manually at least once is useful.

Add the following around line 36 of lemonstand.cabal:

Handler.Checkout

This tells the build system to include this new source file we'll create.

Add the following around line 26 of Application.hs:

import Handler.Checkout

This imports that module (still not written) into the scope where these functions are needed.

Finally, create the file Handler/Checkout.hs:

module Handler.Checkout where

import Import

getCheckoutR :: Handler RepHtml
getCheckoutR = undefined

postCheckoutR :: Handler RepHtml
postCheckoutR = undefined

We've really just traded one runtime error for another as visiting that page will result in the app calling undefined which will fail. However, we've made the compiler happy and can move onto other things and come back to these later.

Templates

Let's open up Handler/Home.hs and see how our current home page is rendered.

We're going to strip out just about everything here. Similar code will be added later in other handlers, and I'd like you to see those concepts then rather than now.

Rewrite the file so it looks like this:

-- leave everything up to and including the import line as-is.

getHomeR :: Handler RepHtml
getHomeR = do
    -- Use the default overall layout, you'll amost always do this.
    defaultLayout $ do

        -- The page title.
        setTitle "Lemonade Stand"

        -- The template to render.
        $(widgetFile "homepage")

You may notice, you've triggered another compiler error, quite a few actually: not in scope: aDomId.

Our templates reference a variable which we've just removed. Please, take a moment to appreciate type-safe templates. No runtime error, no silent nil-handling, we get an up-front compiler error indicating exactly where the problem is. How cool is that?

In the process of fixing this, I'll also try to provide a little more context.

$(widgetFile "homepage") is a very useful function. What it does is look in your templates directory for any HTML, CSS and Javascript templates for your "homepage". These templates will be combined into a Widget. Widgets can be nested and combined quite naturally throughout your application. In the end, they will all be rolled up into one final Widget and served as a single response. All style sheets and scripts will be concatenated, minified (when configured to do so) and ordered correctly -- all without you having to think about it.

For us, this means templates/homepage.{hamlet,lucius,julius} are being found and compiled.

Julius is Javascript templating, it's essentially a straight passthrough except with variable interpolation. You can go ahead and remove it now, we won't use it on this page.

$ rm templates/homepage.julius

Lucius is a superset of CSS. It was designed to allow existing CSS to be pasted directly in and have it still compile and work. On top of this, it allows for variable interpolation and some Less-like extensions like nesting and mixins. Open up the template and remove the style block referencing aDomId.

Hamlet is the most complex of Yesod's templaters. Open up the template and fill it with the following content:

<h1>_{MsgHello}

<p>
  Click 
  <a href=@{CheckoutR}>here
  \ to buy some Lemonade!

We're going to leave _{MsgHello} in place. The _{ } interpolation will check your messages file for translations and show different content based on the user's preferred language.

@{ } is a Route interpolation. As you might've guessed, it's used to show internal links in a type safe way. Now that we've removed the aDomId references things are compiling, but it's important to realize that had we added this link to CheckoutR in here before actually adding that route to our app, we'd get a similar compiler error. No more dead links in your application, any URLs that don't resolve will immediately show up as compiler errors.

If we had a route as mentioned before for users ("/users/#UserId") we'd have to use something like @{UsersR aUserId} and the compiler would infer and enforce that aUserId is, in fact, a UserId.

There is a lot of functionality in Hamlet templates, some of which we'll get to when we build out our next page. What you can do right now is refresh your browser and see your changes.

homepage screenshot 

Forms

Let's head back to Handler/Checkout.hs. We're going to add a very simple form where the user can pick the size of their lemonade and checkout.

First we'll declare a form:

lemonadeForm :: Form Lemonade
lemonadeForm = renderTable $ Lemonade
    <$> pure Nothing
    <*> areq (selectField optionsEnum) "Size" Nothing
    <*> pure 0.0
    <*> areq intField "Quantity" Nothing

There's a few things going on here worth looking at. First of all, each line represents a record of the Lemonade data type. When shown, this form will have fields according to what's listed and map those values back to a value of type Lemonade when the form is processed. The lines that use pure provide values when processed, but don't actually show any fields.

Where going to cheat here and completely ignore Price. Dealing with dependent fields (setting price based on size, for example) can get tricky, so we're just going to set the price server-side after the size and quantity have been submitted.

Before we can test out this form, there's one thing we need to change about our Foundation.hs. We're going to use the function requireAuthId to force users to authenticate before checking out. This function also gives us the Id of the current user.

To allow this, we've got to change the module exports of Foundation.hs like so:

module Foundation
    ( App (..)
    , Route (..)
    , AppMessage (..)
    , resourcesApp
    , Handler
    , Widget
    , Form
    , maybeAuth
    , requireAuth
    , requireAuthId -- <- add this
    , module Settings
    , module Model
    ) where

With that in place, we can sketch out the Handler now:

getCheckoutR :: Handler RepHtml
getCheckoutR = do
    -- force authentication and tell us who they are
    uid <- requireAuthId

    -- run the defined form. give us a result, the html and an encoding 
    -- type
    ((res,form), enctype) <- runFormPost $ lemonadeForm

    case res of
        -- if a form was posted we get a Lemonade
        FormSuccess l -> do
            -- process it and give us the order id
            oid <- processOrder uid l

            -- TODO: redirect to Thank You page here
            return ()

        -- in all other cases just "fall through"
        _ -> return ()

    -- and display the page
    defaultLayout $ do
        setTitle "Checkout"
        $(widgetFile "checkout")

postCheckoutR :: Handler RepHtml
postCheckoutR = getCheckoutR

processOrder :: UserId -> Lemonade -> Handler OrderId
processOrder = undefined

When requireAuthId is encountered for an unauthenticated user, they will be redirected to login. The scaffold site uses the GoogleEmail plugin which allows users to login using their gmail accounts via Open Id. This authentication system can of course be changed, extended or removed, but we're going to just use it as is.

We're also using a common idiom here: the same Handler handles both GET and POST requests. In the case of a GET, the form result (res) will be FormMissing, that case statement will fall through and the form will be displayed. In the case of a POST, the form result will be FormSuccess, we'll execute processOrder (which we've left undefined for now) and redirect to a "Thank You" page.

Additionally, if there were errors in the parameters, the results would be FormErrors which is handled the same way as a GET (fall through to displaying the form) except this time, the form's HTML will include those errors so they're visible to the user to correct and resubmit.

Upon saving this, we should have another compiler error. We've told yesod to look for "checkout" templates, but there are none. So let's create just "templates/checkout.hamlet":

<h1>Checkout
<p>What size lemonade would you like?
<form enctype="#{enctype}" method="post">
  <table>
    ^{form}
    <tr>
      <td>&nbsp;
      <td>
        <button type="submit">Checkout

Simple variable interpolation is done via #{ }, while embedding one template (like form) into another is done via ^{ }.

form screenshot 

Now that we've got the form showing, we can replace our undefined business logic with some actual updates:

-- | Take a constructed Lemonade and store it as part of a new order in 
--   the database, return the id of the created order.
processOrder :: UserId -> Lemonade -> Handler OrderId
processOrder uid l = runDB $ do
    oid <- insert $ Order uid
    _   <- insert $ l { lemonadeOrder = Just oid
                      , lemonadePrice = priceForSize $ lemonadeSize l
                      }

    return oid

    where
        priceForSize :: Size -> Price
        priceForSize Small  = 0.99
        priceForSize Medium = 1.99
        priceForSize Large  = 2.99

Make sure that compiles, then add in the actual redirect:

getCheckoutR :: Handler RepHtml
getCheckoutR = do
    uid <- requireAuthId

    ((res,form), enctype) <- runFormPost $ lemonadeForm

    case res of
        FormSuccess l -> do
            oid <- processOrder uid l

            -- redirect to a "Thank You" page which takes an order id as 
            -- a parameter.
            redirect $ ThankYouR oid

        _ -> return ()

    defaultLayout $ do
        setTitle "Checkout"
        $(widgetFile "checkout")

Hopefully, you've noticed the compiler error this introduces. Can you guess how to fix it?

We've told our Application to redirect to ThankYouR but that route does not exist. Again, no runtime error, just a clear compiler error.

So, follow the advice of the compiler and add the route declaration to config/routes:

/thank_you/#OrderId ThankYouR GET

Again we get the expected compiler error that getThankYouR is not in scope.

In the interest of time and variety, we'll not create an entirely different module, or template for the Thank You page, We'll inline everything right here in Handler/Checkout.hs:

getThankYouR :: OrderId -> Handler RepHtml
getThankYouR oid = defaultLayout $ do
    setTitle "Thanks!"

    [whamlet|
        <h1>Thank You!
        <p>Your order is ##{toPathPiece oid}
        |]

thank you screenshot 

Conclusion

Obviously, The Lemonstand is quite lacking. The user never gets to see price, there's no concept of buying multiple Lemonades of varying Sizes, and the overall UI/UX is pretty terrible.

These are all things that can be fixed, but this article is already getting quite long, so I'll have to leave them for another time. Hopefully you've seen a good enough mix of theory and practice to agree that there are benefits to working on web applications (or any software) in a purely functional language like Haskell.

published on 01 Nov 2012, tagged with haskell yesod published

Yesod Deployments with Keter

Keter is Michael Snoyman's answer to the Yesod deployment "problem". This is something I've been looking for a good solution to for some time now and keter does a really great job.

Keter is meant to run as a service. It sets up its own workspace and watches for keter "bundles" to be dropped into an incoming/ directory. A keter bundle is just a gzipped tarball of your app's executable, any support files and directories it might expect to have in its current working directory, and a configuration file to tell keter a few simple things about how to manage it.

When a bundle is found in incoming/, keter will:

Keter also manages multiple versions of the same app through a zero-downtime deployment. It will bring up a new version in its own folder and wait until the current version is done serving requests before sending it a SIGTERM and removing its folder.

Though this guide will focus on getting a yesod application deploying with keter on Arch Linux, attempts will be made to explain things in a general enough way that the instructions will allow you to get this going on any distro.

This guide also assumes you've got postgresql setup and working and will manage it outside of keter. Basically, you've already got a running site and a (possibly sub-optimal) method of deployment -- I'm going to cover the transition to a keter-based approach.

Keter

First of all, install keter. At the time of this writing, we need to run the git version since it contains the code needed to customize the nginx start/reload commands.

$ git clone https://github.com/snoyberg/keter
$ cd keter
$ cabal configure
$ cabal build
$ sudo cp dist/build/keter/keter /usr/bin/keter

The last step is optional, just know that you'll need the keter binary somewhere in root's $PATH.

Next, we'll setup a configuration file to tell keter where to place its working files and how to start and reload nginx.

/etc/keter.yaml:

root: /opt/keter
nginx:
  start:
    - /etc/rc.d/nginx
    - start
  reload:
    - /etc/rc.d/nginx
    - reload

And a file to run keter as a service:

/etc/rc.d/keter:

#!/bin/bash

. /etc/rc.conf
. /etc/rc.d/functions

PIDFILE=/var/run/keter.pid

case "$1" in
  start)
    stat_busy 'Starting Keter'
    keter /etc/keter.yaml &>/dev/null &
    echo $! >"$PIDFILE"

    if [[ $? -gt 0 ]]; then
      stat_fail
    else
      add_daemon keter
      stat_done
    fi
    ;;

  stop)
    stat_busy 'Stopping Keter'
    read -r pid < $PIDFILE

    kill $pid || kill -9 $pid

    if [[ $? -gt 0 ]]; then
      stat_fail
    else
      rm_daemon keter
      stat_done
    fi
    ;;

  restart)
    $0 stop
    sleep 3
    $0 start
    ;;
  *)
    echo "usage: $0 {start|stop|restart}"
esac
exit 0

Don't start keter just yet, we've got a few more steps.

Nginx

If you've already got a site being reversed proxied to by nginx, that's good, but it's likely that keter will complete this task differently than you're currently doing it. We'll manually update our configs to the "keter way" first, so the transition to keter goes as smoothly as possible.

You've probably got everything in a single config file; we're going to modularize by site. Keter will write a server block to /etc/nginx/sites-enabled/keter containing the reverse proxy declaration. There's no reason we can't get setup that way now and verify it's working without keter's involvement.

/etc/nginx/conf/nginx.conf

user you;
worker_processes 1;

events {
  worker_connections 1024;
}

http {
  # you can run multiple sites by setting up any number of files in 
  # sites-enabled and having each respond to a specific server_name, 
  # your keterized apps will just be one of them.
  include /etc/nginx/sites-enabled/*;
}

/etc/nginx/sites-enabled/keter

server {
    listen 80;
    server_name example.com
    location / {
       # keter will use a dynamic port in the 4000s, if you let your 
       # current setup use something outside that range you can leave 
       # your current app running when you start keter for the first 
       # time. that way, if it doesn't work, you're no worse off than 
       # you were before.
       proxy_pass http://127.0.01:3001
    }
}
It's been my experience that starting only the keter service will not then bring up nginx. Not sure if this is intended or a bug; just be aware that you need to start the nginx service yourself. Keter only seems to handle sending the reload command on deployments.

Your App

Now we are ready to keterize our app! All it really takes is one additional config file:

config/keter.yaml

exec: ../dist/build/yourapp/yourapp
args:
  - production
host: example.com

I also write a small script to handle the process of building the app and placing the tarball in keter's incoming directory:

config/deploy

#!/bin/bash -ex

cabal clean
cabal configure
cabal build

strip dist/build/yourapp/yourapp

rm -rf static/tmp/

# you can use this to tar directly into the incoming folder, but you 
# need write access to it
tar czfv - dist/build/yourapp/yourapp config static > /opt/keter/incoming/yourapp.keter

# if you don't want to provide a normal user that access, you can split 
# the command and use sudo on the mv
tar czfv yourapp.keter dist/build/yourapp/yourapp config static
sudo mv yourapp.keter /opt/keter/incoming/

Try it

Finally, let's try it out:

# Start the keter service:
sudo /etc/rc.d/keter start

# Tail the log in a separate terminal so you can see any problems
tail -f /opt/keter/log/keter/current.log

# Deploy!
./config/deploy

You should see output like the following in the tailing terminal:

2012-06-01 14:42:07.85: Unpacking bundle '/opt/keter/incoming/yourapp.keter' into folder: /opt/keter/temp/yourapp-0
2012-06-01 14:42:08.54: Created process: config/../dist/build/yourapp/yourapp
2012-06-01 14:42:10.55: App finished reloading: yourapp

And /etc/nginx/sites-enabled/keter should've been overwritten with something like:

server {
    listen 80;
    server_name example.com;
    location / {
       proxy_pass http://127.0.0.1:4003;
       proxy_set_header X-Real-IP $remote_addr;
    }
}

Make sure your site's still working and you're all set!

At this point you can kill off any old version you might've had running and go on developing and deploying at will simply by dropping new keter bundles.

Systemd

If you've made the switch to systemd, there are only a few differences compared to above.

First of all, change the keter config file to use the newer commands:

/etc/keter.yaml:

root: /opt/keter
nginx:
  start:
    - systemctl
    - start
    - nginx.service
  reload:
    - systemctl
    - reload
    - nginx.service

Secondly, rather than creating an rc.d file, create a (much simpler) service file

/etc/systemd/system/keter.service

[Unit]
Description=Keter Deployment Handler
After=local-fs.target network.target

[Service]
ExecStart=/usr/bin/keter /etc/keter.yaml

[Install]
WantedBy=multi-user.target

Recently, a post of mine made it to the front page of Hacker News and I was bombarded with traffic for about 5 hours. Aside from the general network slowness of serving from behind a residential Comcast IP, the site held up surprisingly well. CPU and Memory were no issue. One problem I did run into however was file handles.

Turns out, systemd limits any service it manages to 4096 file handles by default. So, if you expect to get decent traffic, it can be a good idea to increase this. Adding LimitNOFILE=<number> to the [Service] block above does the trick. The special value infinity is also available.

Finally, use the following to start the service and enable it at boot.

# systemctl start keter.service
# systemctl enable keter.service

Benefits

There are a couple of non-obvious benefits to the keter system:

  1. It works equally well for local or remote servers

If you're deploying your app to a remote server just (have keter running there and) change your deployment script to end with:

tar czfv - dist/build/yourapp/yourapp config static |\
  ssh example.com 'cat > ~/keter/incoming/yourapp.keter'
  1. It works equally well for non-yesod apps too

The only real requirement is that the executable respect the $PORT environment variable when choosing how to listen. This is becoming an increasingly popular pattern with hosted solutions like heroko and nodester so any well-behaved app should probably do this anyway.

Besides that, you've just got to make a proper bundle: a config/keter.yaml, your executable and any other files or directories your app expects to have present in its current directory.

Downsides

Keter is in its early stages of development so it's not without its failings. Mainly, it's not very flexible -- you're expected to use the nginx reverse proxy approach with a single executable backend server.

You're also unable to setup any static file serving tricks at this time (though there is code in Keter to handle it, and I've been playing with some ideas in my own fork).

Those issues notwithstanding, I'm still finding the approach incredibly streamlined and useful for both my local deployments of pbrisbin.com and some remote servers I deploy to. I was able to ditch a number of scattered service files and bash scripts that had been hobbled together to fill this gap.

Well done Michael.

published on 01 Jun 2012, tagged with haskell yesod keter

Maybe In Ruby

Sometimes it's fun to do something completely useless.

Recently, I wrote a post about how awesome the Maybe type is in Haskell. In the post, I talked about Functors and Monads and how Maybe can help us understand them.

Shortly thereafter, I was bored on the train one day and decided to implement Maybe and its functor instance in ruby.

In this post I'll be relying on the fact that obj.(args) is translated to obj.call(args) in newer rubies. I find it makes the example read better.

Maybe

So we need an object that can represent "Just something" or "Nothing". Ruby already has the concept of nil, so we'll piggy back on that and just wrap it all in some sugar.

class Maybe
  def initialize(value)
    @value = value
  end

  def nothing?
    @value.nil?
  end

  def just?
    !nothing?
  end

  def value
    if just?
      @value
    else
      raise "Can't get something from nothing."
    end
  end

  # we'll need this to prove the laws
  def ==(other)
    if just? && other.just?
      return value == other.value
    end

    nothing? && other.nothing?
  end
end

def Just(x)
  raise "Can't make something from nothing." if x.nil?

  Maybe.new(x)
end

Nothing = Maybe.new(nil)

Functions

We can't map functions to methods because methods need targets, they can't stand on their own. As an example, take id (which we'll be using later on). One might be tempted to define it like this:

def id(x)
  x
end

This won't work for our purposes since that method (defined on the global object Object) can't be passed around, partially applied or composed.

It's more convenient to do it like this:

# ruby 1.9
id = ->(x) { x }

# ruby 1.8
id = lambda { |x| x }

Now you've got an isolated, callable id object which you can pass around.

Partial Application

Functions need to be partially applied. That means you can give a function a few of the arguments it expects and get back another function which you can then pass around and eventually call with the additional arguments given at that later point:

class Partial
  def initialize(f, *args)
    @f, @args = f, args
  end

  def call(*args)
    new_args = @args + args

    @f.(*new_args)
  end
end

def partial(f, *args)
  Partial.new(f, *args)
end

max = ->(x,y) { x >= y ? x : y }

max.(4, 5) # => 5

max5 = partial(max, 5)

max5.(6) # => 6
max5.(4) # => 5

[4, 5, 6].map { |i| max5.(i) } # => [5, 5, 6]

Composition

Two functions, when composed together, return a new function which represents the first being applied to the result of the second being applied to the argument given.

class Compose
  def initialize(f, g)
    @f, @g = f, g
  end

  def call(x)
    @f.( @g.( x ) )
  end
end

def compose(f, g)
  Compose.new(f, g)
end

get_len = ->(s) { s.length   }
add_bar = ->(s) { s + "_bar" }

get_len_with_bar = compose(get_len, add_bar)

get_len_with_bar.("foo") # => 7
This is all so much easier in Haskell...

Functor

Now that we can define functions, partially apply them and compose them together, we can finally prove the Functor laws for our new Maybe class.

Let's start by defining fmap, just as it is in Haskell:

# fmap f (Just x) = Just (f x)
# fmap _ Nothing  = Nothing
fmap = ->(f, x) do
  if x.just?
    Just(f.(x.value))
  else
    Nothing
  end
end
Strictly speaking, fmap's behavior is type-dependant. So a real implementation (for some definition of "real") would probably make a method on Object which needs to be overridden by any classes that are proper Functors. We won't worry about that here...

First law, the identity operation must behave the same when it's fmapped.

id = ->(x) { x }

fmap_id = partial(fmap, id)

# fmap id = id
fmap_id.(Nothing)     == id.(Nothing)     # => true
fmap_id.(Just("foo")) == id.(Just("foo")) # => true

So far so good.

Second law, fmapping a composed function is no different than composing the result of each function fmapped separately.

f = ->(s) { s + "_bar" }
g = ->(s) { s.length   }

f_g = compose(f, g)

fmap_f_g = partial(fmap, f_g)

fmap_f = partial(fmap, f)
fmap_g = partial(fmap, g)

fmap_f_fmap_g = compose(fmap_f, fmap_g)

# fmap (f . g) == fmap f . fmap g
fmap_f_g.(Nothing)     == fmap_f_fmap_g.(Nothing)    # => true
fmap_f_g.(Just("foo")) == fmap_f_fmap_g.(Just("foo") # => true

As suspected, our new Ruby-Maybe is a proper Functor.

Monad?

Is our class a Monad?

# >>=
f = ->(ma, f) do
  if ma.just?
    f.(ma.value)
  else
    Nothing
  end
end

# >>
f = ->(ma, mb) do
  if ma.just?
    mb
  else
    Nothing
  end
end

# return
f = ->(x) do
  Just(x)
end

# fail
f = -> do
  Nothing
end

Proving the laws is left as an exercise to the reader...

published on 01 May 2012, tagged with ruby haskell

Maybe Is Just Awesome

In Haskell, functions must always return the same consistent type. There is also no concept of nil or null built into the language. This is not meant to handicap you, and the expressiveness and polymorphic-ness of Haskell's types mean it certainly does not. One way to handle such situations where functions (conceptually) may or may not return a value is through the Maybe type.

data Maybe a = Just a | Nothing

Maybe is a perfect and simple solution for this situation. It says that, for all values of type a, we can construct values that are either Just that or Nothing.

This type is also perfect for illustrating some of Haskell's more math-heavy concepts. If we take all the potential a values as one category and all the potential Maybe a values as another, then we can use this type to describe the Functors of Category Theory. If we also think about the difference between some a and its Maybe a counterpart as some sort of state to be managed throughout an execution chain, then we can also use this type to describe a Monad. In both cases, the benefits are more concise code and a greater understanding of these abstract concepts that we can take with us to more complex type interactions.

Functor

A Functor is a way to transform a function (more formally a morphism) that acts on one category of objects into one that can act on objects in another category. In Haskell, this concept is captured by the Functor typeclass. It states that for any type t that takes one argument (like Maybe), we can make it an instance of Functor by defining the translation function fmap:

fmap :: Functor t => (a -> b) -> (t a -> t b)

This specifies precisely how a function that acts on one set of types (a -> b) can be used on types that are wrapped versions of these (t a -> t b).

So how is Maybe a Functor?

instance Functor Maybe where
    fmap _ Nothing  = Nothing
    fmap f (Just a) = Just (f a)

Seems straight forward, if the value is Just we apply the morphism to the underlying object and rewrap the result in Just. Trying to apply a morphism to a Nothing value just results in Nothing.

Monad

A Monad is a very scary term to Haskell noobies. Mainly because the first Monad we are introduced to is IO. It's used for any computation that affects (or draws on) the outside world. We're told that it handles potential failures and manages state between computations. Most times we accept it as magic and blindly memorize the do notation and counter-intuitive return statements.

We can take a step back, talk about a Monad in very general terms, then describe how Maybe types work as a Monad. Given that understanding, we can get a better handle on what State and IO Monads are doing (even if we still have to think of it as a bit of magic).

A Monad is a way to chain multiple computations together and manage how that chain of actions works as a whole. The Monad laws will manage the state between these actions (ensuring dependent actions are run in the correct order since they might rely on more than just their direct arguments) and also any failing cases (if some action fails, future actions are aborted and the whole expression is a failure).

Somewhat surprisingly, any type can act as a Monad by defining a few simple functions. I'm going to show and talk about them separately because I think it can go a long way to understanding Monads in general.

instance Monad Maybe where
    -- (>>=) :: m a -> (a -> m b) -> m b
    (Just x) >>= k = k x
    Nothing  >>= _ = Nothing

Here we're just showing how to chain two dependant actions together -- that's really all it is. The first "action" is a wrapped value (m a), the second argument is a function which acts on the unwrapped value producing a new wrapped value (a -> m b). For Maybe we just have to account for the Just and Nothing cases appropriately.

    -- (>>) :: m a -> m b -> m b
    (Just _) >> k = k
    Nothing  >> _ = Nothing

Here we're showing how to chain two independant actions together. We're still preserving the fact that if the first action "fails" the second action is not run, but in this case the result of the first action has no bearing on the second.

    -- return :: a -> m a
    return = Just

return is simply a way to take some non-monadic value and treat it as a Monadic action. In our case wrapping a value in Just does just that.

    -- fail :: String -> m a
    fail _ = Nothing

There's also the concept of outright failure. For us it's simple: Nothing is the failure case. The reason for the String argument is that Haskell allows you to include a message with the failure. There's much contention in the Haskell community around including fail in the Monad type class, but we won't get into that here as Maybe has a pretty simple implementation of it.

It should also be noted that the do and <- notation that everyone is used to can be "de-sugared" down to an expression using only the above 4 functions. If you're having trouble seeing how an expression is leveraging the above laws to do what it does, it can be a good exercise to de-sugar it by hand.

The super interesting thing (I find) about the above instances of Functor and Maybe is that we're not making Maybe a an instance of anything, we're describing only the behavior of Maybe. The types being wrapped up are irrelevant (they can even be further wrapped in Maybe or IO -- crazy).

Leaving those details out of it, or more importantly being able to leave those details out of it is just another case of Haskell's type system leading to elegant and generalized code.

Example Time

So why do we care? Well, besides using Maybe as an illustration for hard-to-grasp concepts like Functors and Monads, knowing when to use those instances of Maybe can really cut down on code clutter and lead to elegant solutions when you've got a lot of Maybe-heavy code.

Let's say you've got a user model in your webapp with an optional email field. This field is a custom type Email but you've got a function for translating it to Text. You've also got another general function for displaying Text values on the page as Html.

Because you were thinking ahead and you knew there'd be a lot of Maybe Text values in use throughout your site, you've coded your display function to accept maybe values and show an empty string in these cases.

userEmail :: User -> Maybe Email
userEmail = undefined

emailToText :: Email -> Text
emailToText = undefined

display :: Maybe Text -> Html
display = undefined

In the described ecosystem your going to have a lot of core Maybe values and a lot of value-manipulation functions not in Maybe. To put this in category terms, you've got a lot of morphisms in the non-maybe category and a lot of objects in the maybe category. You're going to want to fmap that.

Here's how the code looks without leveraging the fact that Maybe is a Functor:

displayUserEmail :: User -> Html
displayUserEmail u = let me = userEmail u
                     in display $ case me of
                         Just e  -> Just (emailToText e))
                         Nothing -> Nothing

Not terrible, but notice how fmap shrinks it right up:

displayUserEmail :: User -> Html
displayUserEmail = display . fmap emailToText . userEmail

Not only does it make the code clearer and cleaner, but it serves a common purpose: you're going to have a lot of value-manipulating functions that should operate on basic values and not care about any wrapping. Just because you've got a lot of these values wrapped up in Maybe, that shouldn't stop you from using these morphisms from the other category in this one. The nature of that Maybe wrapper allows fmap to easily handle the translation for you.

Sure, you could write a small function that takes functions that operate on normal values and allows them to be used on maybe values (and I think I did just that at one point) -- but this concept of a Functor abstracts all that down to a simple generic fmap that can be used with a zillion different compound "wrapper" types.

Guess what? IO is a Functor too.

-- something like this:
prettyNow :: IO String
prettyNow = do
  now <- getCurrentTime

  return $ formatPretty now

-- can be shorter:
prettyNow = fmap formatPretty getCurrentTime

Oh, and if you're interested in seeing how do notation is de-sugared, here's that first, non-functor version but without the do notation:

prettyNow :: IO ()
prettyNow =
    getCurrentTime >>= \now ->
        return (formatPretty now)
Coming back to our Monadic laws, you can imagine that if getCurrentTime failed in some way (and we know IO has some implementation for fail) then the entire expression will be fail simply because of the mechanics behind >>=.

Using Maybe as a Monad allows for even more verbose "stair-case" code to become much more readable. For this example, we've got a series of functions that translate values from one type to another. Any of these functions can fail if the input is not as expected and they capture this by returning maybe values:

textToXml :: Text -> Maybe Xml
textToXml = undefined

xmlToJson :: Xml -> Maybe Json
xmlToJson = undefined

jsonToResponse :: Json -> Maybe Response
jsonToResponse = undefined

As before, here's that code written in a way that does not leverage Maybe's monadic properties:

textToResponse :: Text -> Maybe Response
textToResponse t = let mx = textToXml t
                   in case mx of
                       Nothing -> Nothing
                       Just x  -> let mj = xmlToJson x
                                  in case mj of
                                      Nothing -> Nothing
                                      Just j  -> jsonToRepsonse j

What do you have here? A series of dependant computations where if any one of them fails we want the whole expression to fail. Strictly using what we've learned in this post, we can simplify this to the following:

textToResponse :: Text -> Maybe Response
textToResponse t = textToXml t >>= xmlToJson >>= jsonToResponse

And if you prefer do notation (I do), then we could write the above like so:

textToResponse :: Text -> Maybe Response
textToResponse t = do
    x <- textToXml t
    j <- xmlToJson x
    r <- jsonToResponse j

    return r

The r <- and return r is redundant but I think it shows more clearly the interaction between the as and Maybe as.

You can even mix do notations within each other:

main :: IO ()
main = do
    -- this is IO
    text <- getSomeText
  
    let mresponse = do
            -- but this is Maybe
            x <- textToXml t
            j <- xmlToJson x
            r <- jsonToResponse j
  
            return r
  
    -- and IO again
    sendResponse mresponse

So hopefully you've all learned a little bit through this post. I know it was helpful for me to write it all out. We've seen that Maybe is a type that is complex enough to be used in a variety of different contexts but also simple enough to illustrate those contexts in an easier to grasp way. We've also seen that using these higher-level qualities of Maybe can lead to smaller, easier to read code.

published on 06 Mar 2012, tagged with haskell

Live Search

I've had some fun recently, adding full-text search support to the posts on the site to try and make a simple-but-still-useful archive.

I'd like to post a bit about the feature and how it works. It's got a few moving parts so I'm going to break it up a bit.

This post will focus on the backend, setting up sphinx, providing content to it from a yesod application, and executing a search from within a handler. The second post will go into the front-end javascript that I implemented for a pretty simple but effective search-as-you-type interface.

For the full context, including required imports and supporting packages, please see this feature in the wild.

Sphinx

Sphinx is a full-text search tool. This assumes you've got some concept of "documents" hanging around with lots of content you want to search through by key word.

What sphinx does is let you define a source -- a way to get at all of the content you have in a digestible format. It will then consume all that content and build an index which you can search very efficiently returning a list of Ids. You can then use those Ids to display the results to your users.

There are other aspects re: weighting and attributes, but I'm not going to go into that here.

The first thing you need to do (after installing sphinx) is to get your content into a sphinx-index.

If you've got the complete text you'll be searching actually in your database, sphinx can natively pull from mysql or postgresql. In my case, the content is stored on disk in markdown files. For such a scenario, sphinx allows an "xmlpipe" source.

What this means is that you provide sphinx with a command to fetch an xml document containing the content it should index.

Now, if you've got a large amount of content, you're going to want to use clever conduit/enumerator tricks to stream the xml to the indexer in constant memory. That's what's being done in this example. I'm doing something a little bit more naive -- for two reasons:

  1. I need to break out into IO to get the content. This is difficult from within a lifted conduit Monad, etc.
  2. I don't have that much shit -- the thing indexes in almost no time and using almost no memory even with this naive approach.

So, here's the simple way:

getSearchXmlR :: Handler RepXml
getSearchXmlR = do
    -- select all posts
    posts <- runDB $ selectList [] []

    -- convert each post into an xml block
    blocks <- liftIO $ forM posts $ \post -> do
        docBlock (entityKey post) (entityVal post)

    -- concat those blocks together to one xml document
    fmap RepXml $ htmlToContent $ mconcat blocks

    where
        htmlToContent :: Html -> Handler Content
        htmlToContent = hamletToContent . const

docBlock :: PostId -> Post -> IO Html
docBlock pid post = do
    let file = pandocFile $ postSlug post

    -- content is kept in markdown files on disk, if the file can't be 
    -- found, try to use the in-db description, else just give up.
    exists <- doesFileExist file
    mkd    <- case (exists, postDescr post) of
        (True, _         ) -> markdownFromFile file
        (_   , Just descr) -> return descr
        _                  -> return $ Markdown "nothing?"

    return $
        -- this is the simple document structure expected by sphinx's 
        -- "xmlpipe" source
        [xshamlet|
            <document>
                <id>#{toPathPiece pid}
                <title>#{postTitle post}
                <body>#{markdownToText mkd}
            |]

    where
        markdownToText :: Markdown -> Text
        markdownToText (Markdown s) = T.pack s

With this route in place, a sphinx source can be setup like the following:

source pbrisbin-src
{
    type        = xmlpipe
        xmlpipe_command = curl http://localhost:3001/search/xmlpipe
}

index pbrisbin-idx
{
    source      = pbrisbin-src
    path        = /var/lib/sphinx/data/pbrisbin
    docinfo     = extern
    charset_type    = utf-8
}

Notice how I actually hit localhost? Since pbrisbin.com is reverse proxied via nginx to 3 warp instances running on 3001 through 3003 there's no need to go out to the internet, dns, and back through nginx -- I can just hit the backend directly.

With that setup, we can do a test search to make sure all is well:

$ sphinx-indexer --all # setup the index, ensure no errors
$ sphinx-search mutt
Sphinx 2.1.0-id64-dev (r3051)
Copyright (c) 2001-2011, Andrew Aksyonoff
Copyright (c) 2008-2011, Sphinx Technologies Inc 
(http://sphinxsearch.com)

using config file '/etc/sphinx/sphinx.conf'...
index 'pbrisbin-idx': query 'mutt ': returned 6 matches of 6 total in 
0.000 sec

displaying matches:
1. document=55, weight=2744, gid=1, ts=Wed Dec 31 19:00:01 1969
2. document=62, weight=2728, gid=1, ts=Wed Dec 31 19:00:01 1969
3. document=73, weight=1736, gid=1, ts=Wed Dec 31 19:00:01 1969
4. document=68, weight=1720, gid=1, ts=Wed Dec 31 19:00:01 1969
5. document=56, weight=1691, gid=1, ts=Wed Dec 31 19:00:01 1969
6. document=57, weight=1655, gid=1, ts=Wed Dec 31 19:00:01 1969

words:
1. 'mutt': 6 documents, 103 hits

Sweet.

Haskell

Now we need to be able to execute these searches from haskell. This part is actually going to be split up into two sub-parts: first, the interface to sphinx which returns a list of SearchResults for a given query, and second, the handler to return JSON search results to some abstract client.

I've started to get used to the following "design pattern" with my yesod sites:

Keep Handlers as small as possible.

I mean no bigger than this:

getFooR :: Handler RepHtml
getFooR = do
    things      <- getYourThings

    otherThings <- doRouteSpecificStuffTo things

    defaultLayout $ do
        setTitle "..."
        $(widgetFile "...")

And that's it. Some of my handlers break this rule, but many of them fell into it accidentally. I'll be going through and trying to enforce it throughout my codebase soon.

For this reason, I've come to love per-handler helpers. Tuck all that business logic into a per-handler or per-model (which often means the same thing) helper and export a few smartly named functions to call from within that skinny handler.

Anyway, I digress -- Here's the sphinx interface implemented as Helpers.Search leveraging gweber's great sphinx package:

The below helper actually violates my second "design pattern": Keep Helpers generic and could be generalized away from anything app-specific by simply passing a few extra arguments around. You can see a more generic example here.
sport :: Int
sport = 9312

index :: String
index = "pbrisbin-idx"

-- here's what I want returned to my Handler
data SearchResult = SearchResult
    { resultSlug    :: Text
    , resultTitle   :: Text
    , resultExcerpt :: Text
    }

-- and here's how I'll get it:
executeSearch :: Text -> Handler [SearchResult]
executeSearch text = do
    res <- liftIO $ query config index (T.unpack text)

    case res of
        Ok sres -> do
            let pids = map (Key . PersistInt64 . documentId) $ matches sres

            posts <- runDB $ selectList [PostId <-. pids] []

            forM posts $ \(Entity _ post) -> do
                excerpt <- liftIO $ do
                    context <- do
                        let file = pandocFile $ postSlug post

                        exists <- doesFileExist file
                        mkd    <- case (exists, postDescr post) of
                            (True, _         ) -> markdownFromFile file
                            (_   , Just descr) -> return descr
                            _                  -> return $ Markdown "nothing?"

                        return $ markdownToString mkd

                    buildExcerpt context (T.unpack text)

                return $ SearchResult
                            { resultSlug    = postSlug post
                            , resultTitle   = postTitle post
                            , resultExcerpt = excerpt
                            }

        _ -> return []

    where
        markdownToString :: Markdown -> String
        markdownToString (Markdown s) = s

        config :: Configuration
        config = defaultConfig
            { port   = sport
            , mode   = Any
            }

-- sphinx can also build excerpts. it doesn't do this as part of the 
-- search itself but once you have your results and some context, you 
-- can ask sphinx to do it after the fact, as I do above.
buildExcerpt :: String -- ^ context
             -> String -- ^ search string
             -> IO Text
buildExcerpt context qstring = do
    excerpt <- buildExcerpts config [concatMap escapeChar context] index qstring
    return $ case excerpt of
        Ok bss -> T.pack $ C8.unpack $ L.concat bss
        _      -> ""

    where
        config :: E.ExcerptConfiguration
        config = E.altConfig { E.port = sport }

        escapeChar :: Char -> String
        escapeChar '<' = "&lt;"
        escapeChar '>' = "&gt;"
        escapeChar '&' = "&amp;"
        escapeChar c   = [c]

OK, so now that I have a nice clean executeSearch which I don't have to think about, I can implement a JSON route to actually be used by clients:

getSearchR :: Text -> Handler RepJson
getSearchR qstring = do
    results <- executeSearch qstring

    objects <- forM results $ \result -> do
        return $ object [ ("slug"   , resultSlug    result)
                        , ("title"  , resultTitle   result)
                        , ("excerpt", resultExcerpt result)
                        ]

    jsonToRepJson $ array objects

Gotta love that skinny handler, does its structure look familiar?

You can see the result by visiting search/j/mutt for example.

In the next post, I'll give you the javascript that consumes this, creating the search-as-you-type interface you see on the Archives page.

published on 29 Jan 2012, tagged with haskell website yesod

Lighttpd Reverse Proxy

This site was previously served via lighttpd using fastcgi. My haskell source was compiled using Network.Wai.Handler.FastCGI and the binary was placed at /srv/http/app.cgi to be handled by lighttpd's mod_fastcgi.

I decided to switch it up and let Warp serve the haskell app directly, then proxy certain urls through to it via lighttpd.

This how-to will outline the steps needed to get this setup and comment a little bit on what all the moving parts do.

This guide assumes your code is structured roughly like the 0.9.1 scaffolder, and your Application.hs exports that withYourApp function which is used by main.hs and compiled into a binary and executed.

My application is called "DevSite" (I don't know why), so anywhere you see that in this guide, just assume I mean your foundation type / app name.

Why

Compiling to fastcgi was starting to feel kind of icky. Warp is all grown up now and capable of serving content mighty quickly. Often a problem with my app would result in lighttpd silently failing and leaving troublesome pid files around.

It's nicer to let that front-facing server sit there running, none-the-wiser that I'm constantly developing and recompiling the app that it's forwarding to. Installing and starting a compiled Warp binary will give me greater feedback in the event something goes awry.

Fortunately, I already had the backbone of url-rewriting going on to get requests to app.cgi so I just needed to update that to pull http traffic from another port on localhost rather than actually call a CGI process to get the response.

Lighttpd 1.5 built the proxy framework with the intention of superseding mod_fastcgi and providing that feature simply by telling you to proxy to a fastcgi application in the same way you would to another domain. This meant I just had to update my syntax for 1.5, then it was almost as easy as s/fastcgi/http/ing the config.

There's also the minor benefit that I no longer need duplicated support files (like client-session key and favicon) between development and production.

The Moving Parts

Lighttpd will rewrite / redirect urls in a few stages:

  1. Certain urls will be handled by lighttpd itself. I like lighttpd for static file serving. It's got a pretty directory listing, it's fast, and it makes it super easy to setup a /static/private which enforces simple http-auth for access -- very handy.

  2. Everything else will be rewritten (once) to /proxy/$1.

  3. Anything coming in for /proxy/.* (presumably via rewrite) will go to another port on localhost where my Warp server will take over.

Lighttpd can also load-balance over multiple instances of Warp (nifty!).

The Setup

First let's get lighttpd setup. I'm using mod_proxy_core which is only available in lighttpd-1.5+. If you're on Arch, you can install aur/lighttpd-svn.

Import some modules:

server.modules = ( "mod_rewrite"
                 , "mod_proxy_core"
                 , "mod_proxy_backend_http"
                 )

Setup the "stage-one" redirects:

# notice that by rewriting to /proxy$1 and not /proxy/$1 we get the 
# desired behavior where / becomes /proxy/ and /what/ever becomes 
# /proxy/what/ever.
url.rewrite-once = ( "^/static.*" => "$0"
                   , "(.*)"       => "/proxy$1"
                   )

Finally, setup the actual proxying:

$HTTP["url"] =~ "^/proxy.*" {
  # straight, http pass-through
  proxy-core.protocol        = "http"

  # lighttpd will manage its own queue and send requests to whichever 
  # instance has the shortest queue
  proxy-core.balancer        = "sqf"

  # these are the 5 Warp instances we'll start
  proxy-core.backends        = ( "127.0.0.1:3001"
                               , "127.0.0.1:3002"
                               , "127.0.0.1:3003"
                               , "127.0.0.1:3004"
                               , "127.0.0.1:3005"
                               )

  # strip the /proxy prefix
  proxy-core.rewrite-request = ( "_uri" => ( "^/proxy(.*)" => "$1" ) )
}

Now that we've got that going we need to spin up some Warp instances to serve out anything lighttpd redirects from /proxy.

Luckily the scaffolded main.hs allows us to pass a port on the command line, so we'll just start up a bunch of instances of our app all listening on a different port.

Script It Out

I like to script this process of starting and stopping the multiple Warp instances. To facilitate this, we need to create some support directories alongside your source code:

mkdir tmp/{pid,log}

With those in place, feel free to take the following functions and incorporate them into some server management script:

instances='1 2 3 4 5'

start_devsite() {
  local n

  echo 'Starting worker processes...'
  for n in $instances; do
    devsite -p=300$n > tmp/log/$n.log 2> tmp/log/${n}_errors.log &
    echo $! > tmp/pid/$n.pid
  done
}

stop_devsite() {
  local pid n

  echo 'Stopping worker processes...'
  for n in $instances; do
    if [[ -f tmp/pid/$n.pid ]]; then
      read -r pid < tmp/pid/$n.pid
      if [[ -n $pid ]]; then
        kill $pid
        rm tmp/pid/$n.pid
      fi
    fi
  done
}

Once you execute the start function, you should see 5 processes running listening on ports 3001 through 3005. Lighttpd is already setup to forward to those apps in a load-balanced way so go ahead and see if it worked!

published on 10 Sep 2011, tagged with haskell lighttpd proxy website yesod

Comments

Recently I decided I no longer like disqus. It was a great hands-off commenting system, but it had its downsides. I decided to make a change.

I have my own commenting system which I wrote as a yesod module. The initial reason I didn't go with this when I first moved to yesod (and lost my existing homebrew php commenting system) was because it didn't support authentication. I knew that an unauthenticated "enter comments" box had like a 100 to 1 ratio of spam to real comments.

When I put up rentersreality, I took the time to build authentication into my comments system by natively supporting YesodAuth so that I could use my module there. With that in place, the only thing keeping me back was losing my existing comments (more on that later).

This module really speaks to the flexibility of the framework. With about 8 lines of code I (the end user) was able to add tightly integrated commenting to my yesod site. Comments are stored in my database and users are authenticated using my authentication model. Furthermore, I (the developer) was able to put together a system that integrates this way even with my limited haskell-foo thanks to the openness and robustness of existing modules like yesod-auth and yesod-persistent.

I decided to make a few other changes along with this one. First, I moved the site from sqlite to postgresql. I originally went with sqlite because I was intimidated by postgres. After moving rentersreality to postgres, I realized it was no harder to get set up and offered great benefits in speed, reliability and maintenance tools. Secondly, I had to ditch my simple auth setup (one manually added user) in favor of a more typical auth setup. Now, anyone can authenticate with their open id (to be able to comment) and I just maintain an isAdmin flag manually to allow me to do the me-only stuff.

Why did you do it?

For the sake of completeness here are the Pros and Cons of the recent shift:

Pros:

Since I develop the comments module, I know that I'll get preferential treatment when it comes to bug fixes and features. Also, it's awesome.

Pages load faster and usage is clean, pure GET and POST html-forms.

Comments are parsed by Pandoc the same way my post content itself is. This means that you have the full expressive power of this awesome markdown system (any non dangerous html is possible plus syntax highlighting and other nice features).

Both visually and architecturally, the comments are deeply integrated into my site. This is in spite of the code itself being completely modularized and reusable anywhere. Yay haskell.

Cons:

I have a plan for this.

I kind of sidestepped the whole javascript edit-in-place usage scenario and in stead created a full-blown management sub-site for commenting users. By following the "comments" link in my sidebar you can see all the comments you've left on this site and edit/delete them at-will. It's a really clean method which provides a lot of functionality while being a) javascript-free and b) still completely modularized out of my specific site and reusable anywhere.

Quote could be done with some javascript to just populate the box with a markdown blockquote for you. Reply (and the implied threading) would require a re-engineering that I might not be willing to go through for a while.

I'm not sure how much of a use there is for this on a site like mine but with the new administration sub-site it would be almost trivial to add this functionality -- maybe I'll do this soon.

To existing commenters

If you've commented on this site before I want to restore your comments, but I need your help.

What I need you to do is go ahead and login once, choose a username and along with the disqus account you commented on previously.

If you commented on the old-old system, I could still restore your comments, you'll just have to decide the best way to let me know what comments they were (the username you used or the thread/nature of the comments, etc).

With this information, I'll be able to reinstate your comments and link them to your new identifier on the site.

I hope you'll help me with this; but if not, I understand.

Play around

So go ahead and use this page to try out the commenting system. See what kind of markdown results in what.

If you want any comments (here or on other pages) edited or removed, I can always be reached by email. I don't mind running a quick sql statement on your behalf.

Let me know of any bugs you find but don't worry about css-fails, those should get fixed almost immediately (I just need the content present to find them).

published on 08 Jul 2011, tagged with haskell website yesod

Anatomy of a Yesod Application

subtitle: how to stay sane when developing for the web in haskell

This post was originally about how I structure my Yesod applications and where it differs from the scaffold tool. I've since done a bit of a 180 and started to really like the scaffold tool and its structure.

For that reason, I've rewritten the post to outline that structure, its benefits and some strategies I use above what it provides to develop and deploy Yesod applications.

Note that this information is 0.8-specific and with the 0.9 and 1.0 versions of Yesod, this post will be obsolete (until I update it again).

I recently had the chance to do a coding exercise for a job interview. Due to some misunderstandings of the task on my part I had to do it twice. I'm still embarrassed about it, but I did end up getting the job so all's well that ends well...

Anyway, the cool thing about doing it twice is that the first time, I did it in Rails and the second time in Yesod and that gave me a chance to evaluate the two frameworks in somewhat of a side-by-side.

I'm not going to get into a big discussion on the pros and cons of each one in general -- though I will say they both excelled at staying out of my way and getting me a base for the exercise very quickly. What I'd rather talk about is structure.

During my brief time hacking in Rails, I quickly grew to like the "convention over configuration" philosophy. If you create a model/foo and a controller/foo and a test/foo, then foo itself just magically works. I liked it.

I hadn't used the Yesod scaffold tool since about 0.1, but I knew I needed to get a site up quickly so I decided to use it for this exercise. I found that the new structure was very organized and well thought out. It gave me a similar convention-driven feeling, create hamlet/foo and cassius/foo then widgetFile foo would just work.

I think the framework could use a bit more of this approach to make the yesod-scaffolding tool as versatile as the ruby one. Mechanisms like widgetFile could be more plentiful and library provided (rather than scaffolded into your Settings.hs).

The yesod scaffold basically built you an "example" site which you can rewrite to your needs. In contrast, the ruby scaffold tool(s) let you say "I have some data structure like X" and it goes and creates a bunch of code to make X work. You're obviously still going to rewrite a lot of the generated code, but it's not a 100% guarantee like with yesod init.

Putting on my yesod-end-user cap (vs the usual yesod-contributer one), I would love to see yesod work more like rails: yesod init should give you a simple status-page site with links to all the documentation (you could still chose tiny, or database-driven and get all that setup at this point too). Then, yesod scaffold --whatever commands could be used to build up a CRUD interface with your actual data types.

Hmm, that turned into a bit of a wine about how rails is better than yesod -- that is not my opinion in general. There are tons of reasons I prefer yesod overall, I was just really impressed with rails scaffolding abilities.

Scaffold

Enough comparison, let's talk about yesod as it is now.

The scaffold tool sets up the following basic structure:

/path/to/site
|-- config
|   `-- ...
|-- hamlet
|   `-- ...
|-- cassius
|   `-- ...
|-- julius
|   `-- ...
|-- Handlers
|   `-- ...
|-- Model.hs
|-- YourSite.hs
|-- Controller.hs
`-- yoursite.cabal

config is going to hold your Settings.hs module along with some text files where you define your routes and models. I also like to throw the main executables' source files in there which I'll discuss later.

hamlet, cassius, and julius will contain the templates files for html, css, and javascript respectively. One awesome new development is the aforementioned function widgetFile which I use 100% of the time regardless of what templates the page actually calls for. If you write, say, addWidget $(widgetFile "foo"), that will splice in the templates hamlet/foo.hamlet, cassius/foo.cassius, and julius/foo.julius and just ignores non-existent files.

Model.hs, YourSite.hs, and Controller.hs are pretty self-explanatory, and are either entirely site-dependant or scaffold-generated so I'm not going to discuss them.

One other cool feature of the scaffold is how it sets up the imports and exports in YourSite.hs. It handles all of the major imports (like Yesod itself, etc) and those that need to be qualified (like Settings) and then reexports them as a single clean interface. This means that all of your Handlers, Helpers, etc can just import YourSite and be done with it. Very nice, very clean.

Handlers usually contains one module per route and only defines the route handling functions. I try to keep any support functions in either per-handler or site-wide Helpers.

One last note: do yourself a favor and keep/maintain the generated cabal file. It's a nice way to prevent breakage (when dependencies are updated) and keep dev vs prod options straight. It's also nice to keep all the object and interface files hidden under an ignorable dist directory.

Development

For development, I use an easy simple-server approach. The haskell is as follows:

import Controller (withServer)
import System.IO (hPutStrLn, stderr)
import Network.Wai.Middleware.Debug (debug)
import Network.Wai.Handler.Warp (run)

main :: IO ()
main = do
    let port = 3000
    hPutStrLn stderr $ "Application launched, listening on port " ++ show port
    withServer $ run port . debug

I then keep a simple shell script that runs it:

#!/bin/bash -e

touch config/Settings.hs
runhaskell -Wall -iconfig config/simple-server.hs

The touch just ensures that anything set by CPP options are up to date every time I ./devel.

Deployment

By using the cabal file, deployments are pretty easy. I use lighttpd as my server-of-choice (I also let it do the static file serving), so I need to compile to fastcgi.

I keep exactly one copy of any static files (including my main css) and it lives only in the production location. To support this, I define a staticLink function in Settings.hs which is conditional on the PRODUCTION flag.

If I'm developing locally, staticLink "foo" would return http://the-real-domain/static/foo so that the file is linked from its live location. When running in production, that function just returns /static/foo which is what I would actually want in the html.

I find this approach is way simpler than any other way I've done static file serving.

My cabal file builds an executable from config/mysite.hs which looks like this:

import Controller (withServer)
import Network.Wai.Handler.FastCGI (run)

main :: IO ()
main = withServer run

Then I've got another shell script to make deployments a single-command operation:

#!/bin/bash -e

app="${1:-/srv/http/app.cgi}"

sudo true

# this command just adds an auto-incrementing git tag so that if there's 
# some issue, I can just checkout the last tag and redeploy. this 
# completely sidesteps the need to backup the binary itself
deptag

touch config/Settings.hs
cabal install --bindir=./

# cabal will install to ./myapp as defined in the cabal file so we just 
# stop the service and replace the binary
sudo /etc/rc.d/lighttpd stop
sudo mv myapp "$app"
sudo /etc/rc.d/lighttpd start

This approach can be easily extended to a non-local deployment. In the case of rentersreality, the site lives on a slicehost. Its deployment file looks like this:

#!/bin/bash -e

ip="${1:-rentersreality.com}"

deptag # tag deployments

touch config/Settings.hs
cabal install --bindir=./

scp ./renters "$ip":~/

ssh -t "$ip" '
  sudo /etc/rc.d/lighttpd stop        &&
  sudo mv ./renters /srv/http/app.cgi &&
  sudo /etc/rc.d/lighttpd start       &&
  sleep 3
'

I found that after executing the remote command I had to sleep so that the process could detach correctly. Things would end up in a bad state if I disconnected right away.

I must say, since moving to the more structured approach and utilizing cabal install as the main deployment step, I have had far less issues with developing and deploying my apps.

To see two sites that are currently using this structure, just browse the projects on my github.

published on 29 Apr 2011, tagged with haskell yesod website

Lazy Haskell

Let's say that you have a list of values and you needed to check if any of those values satisfied some condition.

This can be solved easily with Haskell's any function, but let's say you didn't have that.

Here's an alternative method using foldr.

any' :: (a -> Bool) -> [a] -> Bool
any' p list = foldr ((||) . p) False list

any' (== 'x') ['x','y','z']
-- True

any' even [1,3,5,7]
-- False

To understand how this works, you've got to wrap your head around two non-trivial functions for haskell beginners: foldr and ..

Dot

. takes two functions and composes them to create a new function. This is best seen by way of example.

If you know that length counts items in a list and even tests if a number is even, then when someone asks you to write a function that determines if the number of characters in a string is even, it should be little more than these two words put together some way...

-- types refresher:
-- 
-- String = [Char]
-- length :: [a] -> Int
-- even   ::        Int -> Bool
-- 

-- correct, but ugly
stringIsEven :: String -> Bool
stringIsEven s = even (length s)

-- better, but too long
stringIsEven s = even $ length s

-- perfect
stringIsEven = even . length
Remember, $ is function application while . is function composition.

I think that gives you a general idea for how it works. Now let's translate that to the specific example at hand:

(||) is just another function. Think about that for a minute. Welcome to haskell.

(||) takes two Bools and returns a Bool; True if either one of its arguments is True.

(||) True False
-- True

(||) False False
-- False

Curious how (||) is defined in haskell's Prelude?

(||) :: Bool -> Bool -> Bool
True  || _ = True
False || x = x

Wow.

Haskell's laziness means there's no special tricks needed to make if statements "short circuit". Haskell won't evaluate the second expression if the first is True because it's simply never needed.

OK, back to our function.

p is provided as the first argument to our any' function and we know that it's type is (a -> Bool). This means it has to be a test that will check a value and return True or False.

So, what might the type of ((||) . p) be?

This composed function (and you've really got to think of it as one function) will take some value, a as its first argument. It will apply p to it which gives an intermediate Bool. That Bool is then passed through as the first argument to (||).

(||), having gotten its first argument already, only needs one more argument. Since it's not supplied by anything else, it's now required as an argument to the composed function.

-- suppose p is already defined like so:
p :: Char -> Bool
p = (== 'z')

-- types refresher:
-- 
-- p    :: Char -> Bool
-- (||) ::         Bool -> Bool -> Bool
-- 

((||) . p) 'z' False
-- True

((||) . p) 'x' True
-- True


-- ((||) . p) :: Char -> Bool -> Bool

Easy, right?

Folds

The next crazy function is foldr. A fold in the general sense is a way to reduce a list.

If you've got a list of items, a reducing function, and some initial value, then a fold is the process of using these three things to reduce the list to a single value.

This can be seen in the type of foldr. A great deal of information can be learned in haskell by simply taking a look at types; that's why haddocks are so invaluable.

foldr :: (a -> b -> b) -- ^ a reducing function
      -> b             -- ^ some initial value
      -> [a]           -- ^ a list of items
      -> b             -- ^ the resultant single value

Take care to note the type of the reducing function.

It must accept as its first argument the same type as your list of items contains and as its second argument the same type as your initial value.

Its result is also the same type as your initial value. This is important because the initial value and the result of the previous application of foldr must be the same type if we want the required recursion to be type safe.

Often, the types a and b are the same (as in sum' explained below), but this is not required.

foldr and foldl are different in the direction of the fold: folding to the right or folding to the left. In some cases this doesn't matter, in others it does.

Let's look at a folding sum as a concrete example:

sum' :: [Int] -> Int
sum' xs = foldr (+) 0 xs

sum' [1,2,3,4,5]
-- 15

-- how's it work?
foldr (+) 0 [1,2,3,4,5]
-- 15

-- the reducing function is applied with its second argument as the 
-- initial value
result     = (+) 1 0  -- 1 + 0  = 1

-- now we apply the same function but use the result of the previous 
-- application as the new initial value and act on the next element
result'    = (+) 2 1  -- 2 + 1  = 3

-- rinse and repeat until all elements are used up
result''   = (+) 3 3  -- 3 + 3  = 6
result'''  = (+) 4 6  -- 4 + 6  = 10
result'''' = (+) 5 10 -- 5 + 10 = 15

((((0 + 1) + 2) + 3) + 4) + 5 
-- 15

Here's another breakdown with the recursion explicitly shown rather than the values it represents:

foldr (+) 0 [1,2,3,4,5]

result     =                                 (+) 1 0
result'    =                         (+) 2 $ (+) 1 0
result''   =                 (+) 3 $ (+) 2 $ (+) 1 0
result'''  =         (+) 4 $ (+) 3 $ (+) 2 $ (+) 1 0
result'''' = (+) 5 $ (+) 4 $ (+) 3 $ (+) 2 $ (+) 1 0
-- 15

This is an easy example where you can see clearly how things work out. In our case it's a bit more complex, but the principle is the same:

-- assume p is defined like so
p :: Char -> Bool
p = (== 'b') 


foldr ((||) . p) False ['a','b','c']
-- True

-- value breakdown:
result   = ((||) . p) 'a' False -- (== 'b') 'a' || False = False
result'  = ((||) . p) 'b' False -- (== 'b') 'b' || False = True   DING!
result'' = ((||) . p) 'c' True  -- (== 'b') 'c' || True  = True

-- recursion breakdown:
result    =                                   ((||) . p) 'a' False
result''  =                  ((||) . p) 'b' $ ((||) . p) 'a' False -- <- this is the only
result''' = ((||) . p) 'c' $ ((||) . p) 'b' $ ((||) . p) 'a' False --    line ever evaulated
-- True

So the whole thing reduces to True, just as we'd expect.

Why?

This was a really slow and deliberate explanation. I did it this way because I had a real-world situation where I had to come to this exact understanding to solve a problem. OK, not really to solve some dire problem per say, but to do something I wanted to do in an elegant way...

I wanted to walk you all through it from the start only so someone not-so-familiar with haskell might a) see its beauty and b) actually understand the single line of code I'm going to show you in a few more paragraphs.

Sorry.

In my window manager of choice, XMonad, there is a means to test a window's property and take some action depending on the result.

The simplest example is move windows with class "firefox" to the "web" workspace.

className =? "firefox" --> doShift "web"

Easy.

There's also a means to OR rules like these together.

With this, I can say move windows with class "firefox" OR title "chrome" to the "web" workspace.

className =? "firefox" <||> title =? "chrome" --> doShift "web"

The two functions (=?) and (<||>) behave exactly like their normal (==) and (||) counterparts. They're just lifted into a Query Monad. This is a concept you don't need to comprehend right now, just know that there's no elegant way to apply any in this context.

That made it difficult to write a simple function: matchAny that could be a test if any of a window's properties (class, title, name, or role) matched the test string.

Now the any' exercise isn't looking so unrealistic, is it?

-- types refresher:
-- 
-- any :: (a -> Bool) -> [a] -> Bool
-- 
-- we need the same thing, just lifted to the "Query" context:
-- liftAny :: (a -> Query Bool) -> [a] -> Query Bool
-- 

-- our any reimplimentation from ealier:
any'    p list = foldr ((||)   . p)         False  list

-- almost identical:
liftAny p list = foldr ((<||>) . p) (return False) list
return False is simply the lifted version of False just like (=?) is the lifted version of (==)...

Now my manage hooks can leverage a list comprehension for a much more concise and readable rule.

matchAny :: String -> Query Bool
matchAny s = liftAny (=? s) [className, title, name, role]

myManageHook = composeAll [ matchAny s --> action | (s, action) <- myActions ]

    where

        myActions = [ ("rdesktop"  , doFloat         )
                    , ("Xmessage"  , doCenterFloat   )
                    , ("Gmrun"     , doCenterFloat   )
                    , ("Uzbl"      , doShift "2-web" )
                    , ("Uzbl-core" , doShift "2-web" )
                    , ("Chromium"  , doShift "2-web" )
                    , ("irssi"     , doShift "3-chat")
                    ]

Finally

So why is this post about laziness?

foldr ((||) . (== True)) False [False, False, True, undefined, undefined]
-- True

That statement "short circuits". That's only possible because of lazy evaluation.

published on 09 Apr 2011, tagged with haskell xmonad

Landlord Reviews

Last weekend, when our heat wasn't working (again), I had an idea: What if there were a site where I could bitch about my landlord? Then, people who were about to sign a lease could check on this site and see if their would-be landlord sucks... before they sign that lease.

After confirming with the girlfriend marketing that this was actually a half-way decent idea, I started to get excited about it. A few short hours later and I had a decent mock-up.

Full disclosure: It's at that point that we found ratemylandlord.com, sigh.

Though pretty deflating, it's not exactly the same. In fact, that site's kind of old and doesn't have the super-cool jQuery-ness of mine. I'm thinking I could do things differently enough to at least warrant putting my site out there.

Personally, I like my UI a lot better.

Live!

I give you... Landlord Reviews Renters' reality. As it is, you can leave a positive or negative review and search reviews by landlord name or partial address.

I'm making this post to ask for beta-testers. Go, make some fake reviews, play around with the search boxes. If you're feeling motivated, report bugs to me via email or on github.

Todos and Known Bugs

I do have plans to make this epically useful. Any feature requests, just send 'em my way.

Here's the current list I've been procrastinating on:

And some things I already know I need to fix:

The Source

The site is written in haskell (what else) compiled to a fastcgi executable. All I have to do is scp it up to my slice and it just works.

A beautiful thing.

I use slicehost. They have an Arch image. They rock.

If you're interested, you can view the source on my github; pull-requests always welcome.

published on 25 Mar 2011, tagged with haskell website boston

Posts Database

This post is crazy out of date. If your interested in the updated ways in which I accomplish the same goals on yesod 0.9, feel free to checkout the site's source.

For the longest time since my move to Yesod, I've had the listing of posts on this site embedded in its source. This is not only poor practice, but it made it kludgy to add a new post the site. I'd have to write the post, add the Post info to the source file, then recompile my development sources and deploy them to production.

This marriage of code and content made it impossible to write posts and develop anything new in the framework at the same time because both actions required a "push to PROD".

The actual content for each post is stored in a markdown file and parsed at run time, so I could at least edit content while the site is live. Ideally, I would've put the post meta-data (title, date published, rss description, and tags) as a header in this markdown file and have that parsed at runtime as well. This would've been difficult when it came to the rss description. Parsing multi-line tokens is no fun.

So given the requirement that post content and post meta-data would live separately I decided to give Persist a try. I could store post information in a small sqlite database and access it through some management pages to add, remove, and update post meta-data on my site.

This approach required me to work through three aspects of Yesod that aren't extremely well documented: Persistent, Forms, and Authentication. I figured I'd share what I did in this post and maybe others can benefit (or point out what I've done wrong).

I'm going to skip over any required extensions or imports just to keep the code presented here simple and somewhat readable.

If you're able to use Yesod in some way, you can probably decipher ghc errors enough to figure out what's needed.

Persistent

First up was getting CUD actions possible on Posts (the existing data type).

I had to make my site an instance of YesodPersist which will allow abstract data base actions to happen in the GHandler Monad:

-- I added this to Settings.hs:
dataBase :: String
dataBase = "posts.db3"

withConnectionPool :: MonadInvertIO m => (ConnectionPool -> m a) -> m a
withConnectionPool = withSqlitePool dataBase 10

-- And this to the main DevSite.hs file:
instance YesodPersist DevSite where
    type YesodDB DevSite = SqlPersist
    runDB db = fmap connPool getYesod >>= runSqlPool db

Next up, I had to use some template haskell to define the storable entries:

-- I elected to put this code right in Posts.hs which handles all the 
-- Post related stuff already:
share2 mkPersist (mkMigrate "migratePosts") [$persist|
SqlPost
    slug        String
    date        UTCTime Desc
    title       String
    descr       String
    UniqueSqlPost slug
SqlTag
    post SqlPostId Eq
    name String Asc
|]

This creates two tables. One to hold all the post information and a second to just hold the Post-to-Tag relationships.

We also have to add the migration function to the main method which will initialize these tables on first run:

-- This goes in my Controller.hs:
withServer :: (Application -> IO a) -> IO a
withServer f = withConnectionPool $ \p -> do
    runSqlPool (runMigration migratePosts) p -- right here
    let h = DevSite p
    toWaiApp h >>= f

With all the boilerplate in place, we can write some functions to actually do the selects, inserts, and deletes (I cheat and actually do a delete-then-insert for any updates):

-- | The data type of a single post, this is what we actually want to work 
--   with in all our other code
data Post = Post
    { postSlug  :: String
    , postDate  :: UTCTime
    , postTitle :: String
    , postDescr :: String
    , postTags  :: [String]
    }

-- | Select n recent posts from the database and return them
selectPosts :: Int -> Handler [Post]
selectPosts n = mapM go =<< runDB (selectList [] [SqlPostDateDesc] n 0)

    where
        go :: (Key SqlPost, SqlPost) -> Handler Post
        go (sqlPostKey, sqlPost) = do
            -- tags for this post
            sqlTags <- runDB $ selectList [SqlTagPostEq sqlPostKey] [SqlTagNameAsc] 0 0
            return Post
                { postSlug  = sqlPostSlug  sqlPost
                , postDate  = sqlPostDate  sqlPost
                , postTitle = sqlPostTitle sqlPost
                , postDescr = sqlPostDescr sqlPost
                , postTags  = fmap (sqlTagName . snd) sqlTags
                }

-- | Insert a post into the database
insertPost :: Post -> Handler ()
insertPost post = do
    let sqlPost = SqlPost
            { sqlPostSlug  = postSlug post
            , sqlPostDate  = postDate post
            , sqlPostTitle = postTitle post
            , sqlPostDescr = postDescr post
            }

    -- insert the Post record
    sqlPostKey <- runDB $ insert sqlPost

    -- insert each tag record
    mapM_ (go sqlPostKey) $ postTags post
    
    where
        go :: SqlPostId -> String -> Handler SqlTagId
        go key tag = runDB (insert $ SqlTag key tag)

-- | Delete an existing post by slug
deletePost :: String -> Handler ()
deletePost slug = do
    sqlPost <- runDB $ getBy $ UniqueSqlPost slug
    case sqlPost of
        Just (sqlPostKey, _) -> do
            -- delete the post and the tags
            runDB $ deleteBy $ UniqueSqlPost slug
            runDB $ deleteWhere [SqlTagPostEq sqlPostKey]
        Nothing -> return ()

As an example of the simple case, selecting a post out of the data base and displaying it, here is my handler for a GET request on the post route plus some of the immediate support functions:

-- | This was already in place with the hardcoded Posts so I just put it 
--   in the Handler Monad and call from the database instead.
-- 
--   I know, it's silly to not make the slug part of a targetted select, 
--   but with such a small data base this is fine and much easier to 
--   code.
-- 
getPostBySlug :: String -> Handler [Post]
getPostBySlug slug = do
    allPosts <- selectPosts 0
    return $ filter ((== slug) . postSlug) allPosts

-- | Used with posts so that we have post-specific info within scope
--   while still abstracting the overall template/css
postLayout :: Post -> Handler RepHtml
postLayout post = do
    mmesg       <- getMessage
    (t, h)      <- breadcrumbs
    postContent <- loadPostContent post -- parses the markdown file

    pc <- widgetToPageContent $ do
        setTitle $ string $ "pbrisbin - " ++ postTitle post
        addCassius $(S.cassiusFile "root-css")
    hamletToRepHtml $(S.hamletFile "post-layout")

-- | Load a Post
getPostR :: String -> Handler RepHtml
getPostR slug = do
    posts <- getPostBySlug slug
    case posts of
        []       -> notFound
        (post:_) -> postLayout post

So now that that part's done, we need a page where we can edit and delete the existing posts. That will require a form.

Forms

I probably have a bit more boilerplate here than I need, but oh well. I had a lot of this code already in yesod-comments where I need a bit more customization in the form so I reused it.

The first thing we need is an overall page which has an "Add new post" form at the top and a table of existing posts with links to edit and delete them:

-- | The overall template showing the input box and a list of existing
--   posts
managePostTemplate :: String -> Widget () -> Enctype -> Widget ()
managePostTemplate title form enctype = do
    posts <- liftHandler $ selectPosts 0
    [$hamlet|
    <div .post_input>
        <h3>#{string title}

        <form enctype=#{enctype} method="post"
            ^{form}

    <div .posts_existing>
        <h3>Existing posts:

        <table>
            <tr>
                <th>Title
                <th>Description
                <th>Edit
                <th>Delete

            $forall post <- posts
                <tr>
                    <td>
                        <a href=@{PostR $ postSlug post}> #{shortenShort $ postTitle post}
                    <td>#{shortenLong $ postDescr post}
                    <td>
                        <a href=@{EditPostR $ postSlug post}> edit
                    <td>
                        <a href=@{DelPostR $ postSlug post} delete
    |]

    where 
        shortenLong  = shorten 40 
        shortenShort = shorten 15 
        shorten n s  = if length s > n then take n s ++ "..." else s

Don't worry about EditPostR or DelPostR yet, we'll get to those.

Now we need to code the Form itself. The way I do it is I create a data type whose records represent the Form fields. Then, when I run the form, I'll use a function to convert that datatype into the thing I really want from the Form (a Post) with any required conversions or time stamping happening there.

To make things a little more flexible, I'm going to pass an initial argument to most of these functions. If that argument is Just Post, then I'm editing an existing post and I will pre-populate the "new" form with its information and update rather than insert on submit. If that first argument is Nothing, then it's a truly new Post and I'll continue as such.

This is why I pass title to the function above; it might say "Edit ..." or "Add ..." accordingly.

-- the form data type
data PostForm = PostForm
    { formSlug  :: String
    , formTitle :: String
    , formTags  :: String
    , formDescr :: Textarea
    }

-- | Convert form input into a Post and update the db.
updatePostFromForm :: Maybe Post -> PostForm -> Handler ()
updatePostFromForm p pf = do
    postDate' <- if isJust p 
        -- preserve original publish date
        then return $ postDate $ fromJust p
        else liftIO getCurrentTime
    let post = Post
            { postSlug  = formSlug pf
            , postTitle = formTitle pf
            , postDescr = unTextarea $ formDescr pf
            , postDate  = postDate'
            , postTags  = parseTags $ formTags pf
            }
    if isJust p
        then do
            -- delete the original and insert a new version
            deletePost (postSlug post)
            insertPost post
            setMessage $ [$hamlet| %em post updated! |]
        else do
            insertPost post
            setMessage $ [$hamlet| %em post added! |]

    redirect RedirectTemporary ManagePostsR

-- | some minor changes to 
--   <https://github.com/fortytools/lounge/blob/master/Handler/Entry.hs#L57>
parseTags :: String -> [String]
parseTags [] = []
parseTags s  = let (l,s') = break (==',') $ dropWhile (==',') s
    in trim l : case s' of
        []      -> []
        (_:s'') -> parseTags s''

    where 
        trim  = trim' . trim' 
        trim' = reverse . dropWhile isSpace

-- | Display the new post form inself. If the first argument is Just,
--   then use that to prepopulate the form
postForm :: Maybe Post -> FormMonad (FormResult PostForm, Widget ())
postForm post = do
    (slug       , fiSlug       ) <- stringField   "post slug:"   $ fmap postSlug  post
    (title      , fiTitle      ) <- stringField   "title:"       $ fmap postTitle post
    (tags       , fiTags       ) <- stringField   "tags:"        $ fmap (formatTags . postTags) post
    (description, fiDescription) <- textareaField "description:" $ fmap (Textarea . postDescr)  post
    return (PostForm <$> slug <*> title <*> tags <*> description, [$hamlet|
        <table>
            ^{fieldRow fiSlug}
            ^{fieldRow fiTitle}
            ^{fieldRow fiTags}
            ^{fieldRow fiDescription}
            <tr>
                <td>
                    &nbsp;
                <td colspan="2">
                    <input type="submit" value=#{buttonText}>
        |])

    where
        fieldRow fi = [$hamlet|
            <tr>
                <th>
                    <label for=#{fiIdent fi}> #{fiLabel fi}
                    <div .tooltip> #{fiTooltip fi}
                <td>
                    ^{fiInput fi}
                <td>
                    $maybe error <- fiErrors fi
                        #{error}
                    $nothing
                        &nbsp;
            |]

        formatTags = intercalate ", "
        buttonText = string $ if isJust post then "Update post" else "Add post"

-- | Run the post form and insert or update based on the entered data
runPostForm :: Maybe Post -> Widget ()
runPostForm post = do
    ((res, form), enctype) <- liftHandler . runFormMonadPost $ postForm post
    case res of
        FormMissing    -> return ()
        FormFailure _  -> return ()
        FormSuccess pf -> liftHandler $ updatePostFromForm post pf

    managePostTemplate title form enctype

    where 
        title = if isJust post 
            then "Edit post:" 
            else "Add new post:"

Anyway, with all that boilerplate out of the way, we can define our routes.

We need to add the following to our main parseRoutes function first:

-- | Define all of the routes and handlers
mkYesodData "DevSite" [$parseRoutes|
...

/manage                ManagePostsR GET POST
/manage/edit/#String   EditPostR    GET POST
/manage/delete/#String DelPostR     GET

/auth AuthR Auth getAuth
|]

We'll get to that AuthR bit a little later, but with the three routes defined we can create our handler functions for our various actions:

-- | Manage posts
getManagePosts/ :: Handler RepHtml
getManagePostsR = pageLayout $ do
    setTitle $ string "pbrisbin - Manage posts"
    addHamlet [$hamlet| %h1 Manage Posts |]
    runPostForm Nothing

postManagePostsR :: Handler RepHtml
postManagePostsR = getManagePostsR

-- | Edit post
getEditPostR :: String -> Handler RepHtml
getEditPostR slug = do
    post <- getPostBySlug slug
    case post of
        []        -> notFound
        (post':_) -> pageLayout $ do
            setTitle $ string "pbrisbin - Edit post"
            addHamlet [$hamlet| %h1 Edit Post |]
            runPostForm $ Just post'

postEditPostR :: String -> Handler RepHtml
postEditPostR = getEditPostR

-- | Delete post
getDelPostR :: String -> Handler RepHtml
getDelPostR slug = do
    deletePost slug
    setMessage $ [$hamlet| %em post deleted! |]
    redirect RedirectTemporary ManagePostsR

At least that part is pretty easy after all the upfront forms work.

Now, you can start up your test server and head to localhost:3000/manage to try it out.

Manage Posts Screenshot 

The problem now is that if you were push this live, everyone could mess with your data base. We need authentication.

Authentication

This was the most difficult part of the whole thing.

There's very little documentation on yesod-auth and any real world examples I could find were for social-networking type authentication where every login box had "register for a new account" logic tied into it. This was not what I needed.

I wanted to store a username and hashed password in my new Persistent database and validate incoming users (just me) against that.

I ended up writing my own AuthPlugin modeled after Yesod.Helpers.Auth.Email which does just that.

I'm not going to go into the details, if you want to view the code it's on my github. I'd rather describe briefly how to use it.

This has now been added to the main yesod-auth package.

After copying the source for the module into your app directory, make your site an instance of YesodAuth and put my plugin in your list:

instance YesodAuth DevSite where
    type AuthId DevSite = UserId

    -- these are site specific
    loginDest _  = ManagePostR
    logoutDest _ = RootR

    getAuthId    = getAuthIdHashDB AuthR 
    showAuthId _ = showIntegral
    readAuthId _ = readIntegral
    authPlugins  = [authHashDB]

That's where the /auth addition to parseRoutes is used.

Add the migration function to your controller:

withServer :: (Application -> IO a) -> IO a
withServer f = withConnectionPool $ \p -> do
    runSqlPool (runMigration migratePosts) p
    runSqlPool (runMigration migrateUsers) p -- right here
    let h = DevSite p

And that's it!

Well kinda. You'll need to add users to the database by hand, but that's not too hard, here's how I did mine:

$ echo -n 'MySuperAwesomePassword' | sha1sum
bf1bfb9af6e50018dacf19e1618e4fb5f981c14e  -
$ sqlite3 posts.db3
SQLite version 3.7.4
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> insert into user (username,password) 
   ...> values ('pbrisbin','bf1bfb9af6e50018dacf19e1618e4fb5f981c14e');
sqlite> .exit
$

And it's incredibly easy to add authentication to any page, here's how I adjusted my management routes:

getManagePostsR :: Handler RepHtml
getManagePostsR = do
    _ <- requireAuth -- that's it, right there!

    postForm <- runPostForm Nothing
    pageLayout $ do
        ...

Now when you go to any page with authentication required, you're hit with a Login box:

Login Screenshot 

Login and manage your posts.

published on 09 Jan 2011, tagged with haskell website

Site Migration

20:24 rson: if there is anything i could ever suggest that you'd listen to, let
            it be this.  do it.

Wise words from someone who's been there before. That's rson telling me that I should move my site to some sort of framework. Make things cleaner, easier to maintain, and get away from that goddamn php I seem to be so fond of.

I had been thinking about doing this myself for quite some time. As silly as it sounds, I was unhappy with my urls. The whole site (from a purely url-appearance standpoint) was inconsistent. I dreamed for /feed/ and /posts/my_post/.

I could also feel my spider web of php and html spiralling away from me. I was spending too much time monitoring comments, tweaking the syntax highlighting, and figuring out the best way to format bread crumbs based on not only filepath but also custom translations from content.php to all posts and similar.

Yesod

Then I found Yesod, a web framework based on haskell. As anyone who's ever been to this site knows, I love haskell. It's just a cool language. So if I were going to move to some sort of framework, this would be it.

So, using the Yesod Docs, the haddock documentation, and even the actual source for the Yesod Docs, I was able to hobble my site over to the framework. It wasn't easy, but there's a lot of benefit there.

My breadcrumbs went from 100 lines of php to about 14 lines of haskell. And those 14 lines are simply defining what Routes are children of what other Routes.

My posts have tags now. This extra bit of post-metadata was even added later without disrupting any existing code.

My Rss feed is dynamically created whenever it's loaded.

And probably most important of all, urls used throughout the site are type safe, compile-time-guaranteed to be valid.

What that means is that I don't type the url directly, I insert a haskell function that corresponds to those pages' Routes. And no, they aren't built from regular expressions; each Route is generated as a distinct type as defined by me.

Routes can also have arguments. Right now you're viewing the output of the PostR Route using site_migration as its argument. But the best part of all that is that the compiler validates every link in my site each time it's compiled to ensure it's in scope and type checks!

Sell Out!

As part of the transition, I'm also giving up some control over code snippets and comments. I enjoyed the DIY approach but it was getting cumbersome (and less and less KISS as things went on).

Instead, I'm stealing two more ideas from the Yesod Docs site. The new site uses git's gist feature for code snippets and disqus for comments. I know, I originally said I, "didn't want to farm comments out to 3rd party javascript," but disqus is really nice and I'm getting sick of all the overhead that comes with my homebrew php setup.

I'm really sorry to anyone who's left comments so far on the site. I appreciate them greatly. I still have them and I'll continue to look into ways to port them over to disqus, but so far, it's not looking too promising.

I've changed my approach to posts and am now using pandoc to write them. This means that I don't need gist anymore thanks to pandoc's great syntax highlighting features. I'm also working on my own Yesod module for Comments to get things back the way it was on the old site. That's a bit of a work in progress at the moment and will be its own post when it's done... I'll be keeping disqus around for a while.

Lighttpd

Another change I'm making is from Apache over to Lighttpd (pronounced: lighty). To be honest, I just couldn't get (Fast)CGI working with apache and I had it running with lighttpd in minutes. Hopefully it'll be faster and easier to maintain too, we'll see...

So anyway, enjoy the new site; let me know if anything is broken or missing -- I'm still in the process of migrating old posts, so give me some time before reporting that.

The site's source is also in my git repo if anyone's interested.

published on 10 Oct 2010, tagged with haskell website

XMonad Modules

This page is to serve as both an apology and an announcement. I've recently modularized my xmonad.hs. I'm sorry.

This is no longer true. I've since gone through a bit of a config cleanse, deciding it makes my life easier to live closer to defaults and not carry around a lot of extra configuration or features (that I don't actively use).

As part of this cleanse, I've stripped my config back down to a very lean xmonad.hs that can easily live within the confines of a single file.

Who cares?

I know of at least one person who stops by my site on a regular basis to update his xmonad.hs to match the latest version of mine. I've also seen, on a few occasions, someone mention that they use brisbin33's xmonad config when discussing an issue on the forums or in IRC. True, for all I know, there could be only three people using some form of my config -- but to them, I'm sorry.

Anyone who blindly updates to my most recent xmonad.hs may get hit with the following error:

  xmonad.hs:21:7:
      Could not find module `ScratchPadKeys':
         Use -v to see a list of the files searched for.

  Failed, modules loaded: none.

That's because I've offloaded some of the more module-ish chunks of my config into, well, modules.

Why?

I noticed, when browsing the XMonad source (I know, shut-up), that the default recompile command includes the option -ilib this tells ghc to include source files in ./lib. It was a light-bulb moment.

I had gathered some pretty sophisticated code in my little xmonad.hs: custom data types and instances, reusable utilities, etc. Why not put them in their own files and import them into a nice clean config as I would with any normal contrib module?

So, if you're following my xmonad.hs, please continue to do so. Just be advised you'll need a few files in lib if you want to use the functionality they offer.

published on 31 Aug 2010, tagged with dzen haskell website xmonad

Haskell RSS Reader

I've been looking for a good Haskell project for a while now. The language is just awesome, and I've been getting more and more comfortable with it lately thanks to reading Real World Haskell. I even got the opportunity to write some haskell for a project at work (I'm a consultant on a Microsoft product, crazy).

I wanted something challenging but doable; something to keep me interested but still stretch my abilities. I had made some smaller utilities to manage the pages on my site, so I was getting familiar with parsing XML using some haskell libraries as well as starting to wrap my head around the IO Monad a bit more. Well, I just completed (what I think is) a slick little RSS reader using just haskell and dzen.

For those that don't know, RSS feeds are basically just site headlines; a very simple XML page that lists items, each item containing a title, description, and link.

So my reader would read in a listing of feed urls, put together all of the RSS items from each url, and then display them using dzen.

I put it in the upper right of my left monitor, configured to look like part of my existing dzen status bars.

The title text remains static and is clickable (opens the url of the feed item), and the description text is a ticker text that rolls by right-to-left one character at a time.

Installation

First, you would have to download RssReader.hs and Dzen.hs from my old xmonad library and place them in a directory along side a file called rssreader.hs. This file would serve the same purpose xmonad.hs does for XMonad: it would be both a configuration file and the main application itself, gluing together imported functions into a runnable main.

Here's an example:

import Dzen
import RssReader

-- 
-- this is it, the whole application in one line!
-- 

main :: IO ()
main = spawnDzen dzenConf >>= spawnReader readerConf

-- 
-- and the configuration part...
-- 

-- set a width and some text formatting
readerConf :: ReaderConf
readerConf = defaultReaderConf
  { titleFormat = dzenFG "#909090"
  , descrFormat = shorten 200 
  , tickerWidth = 150 
  }

  where
    -- some helpers
    dzenFG c s  = concat ["^fg(", c, ")", s, "^fg()"]
    shorten n s = if length s > n then (take n s) ++ "..." else s

-- start with the default dzen and override some things
dzenConf :: DzenConf
dzenConf = defaultDzen
  { x_position  = Just $ Percent 60 -- start 60% across screen 0
  , width       = Just $ Percent 40 -- and span the other 40%
  , font        = Just "Verdana-8"  -- if you have an xft-capable dzen                                                        
  , fg_color    = Just "#606060"
  , bg_color    = Just "#303030"
  }

Once that's all set, you can run ghc --make -o rssreader rssreader.hs inside this directory to create an executable which you can run standalone.

Dependencies

The following packages would be required either from Hackage or your distribution's package manager:

Hackage Arch linux
http extra/haskell-http
tagsoup aur/haskell-tagsoup

Known Issues

Some unprintable characters seem to still come through. I try to clean the strings as much as possible, but I still see boxes in dzen from time to time.

The rssreader and the spawned dzen are not tied together process-wise. This means that you can kill rssreader and a frozen dzen remains, or you can quit the dzen and rssreader will be left as a zombie.

published on 15 Aug 2010, tagged with dzen haskell xmonad

Scratchpad Everything

If you've read my recent post on using a scratchpad in XMonad, and if you've actually implemented this in your own setup, you probably know how useful it is. For those that don't know what I'm talking about, you basically setup a simple keybinding that calls up a terminal (usually floated, but managed by its own specific manageHook) to be used briefly before being banished away by the same keybinding.

Recently, I found that you can apply this functionality to any application you'd like.

ScratchMixer

I have my music playing through MPD all the time. Occasionally, I'll like to play some other media, a youtube video or what have you. When I do this, I call up ossxmix, adjust down MPD, and adjust up my browser (per application volumes are awesome by the way).

I realized that this was a perfect scratchpad scenario. I was calling up this application for just a second, using it, then sending it away. This simple activity was requiring that I M-p, type ossxmix, hit enter, layout-shuffle, adjust volumes, then M-S-c every single time. What was I thinking?

XMonad.Util.NamedScratchpad

My last writeup used the contrib module XMonad.Util.Scratchpad which, though it has a shorter name, simply provided wrapper functions for the things I'm now using from XMonad.Util.NamedScratchpad.

In the parent extension, things are much more transparent and free. For me, this lead to a much cleaner config file too. I wish I had been using things this way from the start.

So of course, we'll need to add import XMonad.Util.NamedScratchpad to the top of our config file.

Please refer back to my previous post for information regarding some boilerplate code. This writeup assumes you have a main-do block that calls out myManageHook and myKeys to be defined as separate functions. I also won't be going into hiding the NSP

Scratchpads

The Named Scratchpad extension exposes a new data type that can be used to represent a scratchpad. The following four things must be specified to fully describe a scratchpad:

Those last two data types might sound scary, but they aren't. If you think of the fact that most users define custom window management in a list of (Query Bool --> ManageHook) and one representation of this might be (className =? "Firefox" --> doFloat) that should give you an idea of the sorts of functions that you should use to fill those last two slots for your scratchpads.

The haddocks for this module talk about everything that's available, but here's a commented version of my declaration:

myScratchPads = [ NS "mixer"    spawnMixer findMixer manageMixer -- one scratchpad
                , NS "terminal" spawnTerm  findTerm  manageTerm  -- and a second
                ]   

  where

    spawnMixer  = "ossxmix"                               -- launch my mixer
    findMixer   = className =? "Ossxmix"                  -- its window has a ClassName of "Ossxmix"
    manageMixer = customFloating $ W.RationalRect l t w h -- and I'd like it fixed using the geometry below:

      where

        h = 0.6       -- height, 60% 
        w = 0.6       -- width, 60% 
        t = (1 - h)/2 -- centered top/bottom
        l = (1 - w)/2 -- centered left/right

    spawnTerm  = myTerminal ++ " -name scratchpad"       -- launch my terminal
    findTerm   = resource  =? "scratchpad"               -- its window will be named "scratchpad" (see above)
    manageTerm = customFloating $ W.RationalRect l t w h -- and I'd like it fixed using the geometry below

      where

        -- reusing these variables is ok since they're confined to their own 
        -- where clauses 
        h = 0.1       -- height, 10% 
        w = 1         -- width, 100%
        t = 1 - h     -- bottom edge
        l = (1 - w)/2 -- centered left/right

So you can see I have a list containing two scratchpads. The datatype syntax requires the "NS" plus the four things I've listed above.

You'll notice I liberally use sub-functions via where clauses. You can think of these as simple variables and if parenthesized and placed directly where they're called out, they would work exactly the same. I think this is clearer and it should be fairly obvious how it works.

The beauty of all this is that it's almost all that's needed. Each scratchpad has a name which can be bound to a key; even better, the whole scratchpad list will be managed with one simple addition to your manageHook.

I inserted the following keybindings:

myKeys = [ ...
         , ...

         , ("M4-t"   , scratchTerm )
         , ("M4-S-m" , scratchMixer)

         , ...
         ] 

         where

           -- this simply means "find the scratchpad in myScratchPads that is 
           -- named terminal and launch it"
           scratchTerm  = namedScratchpadAction myScratchPads "terminal"
           scratchMixer = namedScratchpadAction myScratchPads "mixer"
I'm using EZConfig notation in my keybindings.

And tacked the following onto the end of my managehook:

myManageHook = ([ -- whatever it might be...
                , ...
                , ...

                -- this manages the entire list of scratchpads 
                -- based on the query and hook listed for each
                ]) <+> namedScratchpadManageHook myScratchPads

That's it, a scratch terminal and a scratch mixer; but most importantly, simple and transparent tools for adding any arbitrary application (graphical or in-term) as a scratchpad application.

One final note about testing: As you're tweaking your queries and hooks, be sure to call up the application, close it, then Mod-Q and test your changes. If you've got a scratchpad still open from before your last config change, it will still be using the old ManageHook.

published on 14 Jun 2010, tagged with haskell xmonad

XMonad Scratchpad

It's been a while since I've made an XMonad post. Thought a good one might be details regarding the scratchpad extension from -contrib.
This can be confusing to set up, but oh-so useful. If you've ever used a quake (or yakuake?) terminal (I have not), you'll know what I'm talking about. It's basically a small terminal that sits idle on a non-visible workspace. You can call it up with a quick keybind, use it for whatever, then banish it away again with the same keybind.

You just have to use it for a while to realize how useful it really is.
My goal for this post is to distill out of my xmonad.hs just the scratchpad functionality so that someone with an existing xmonad.hs could easily plug this into their setup with minimal fuss.

Prerequisites

I'm going to assume that your existing xmonad.hs defines a function called myManageHook and another called myTerminal. If this is not the case, take a look at the below snippet; I think you'll be able to figure out how to rework whatever you do have into this format.

main = do
  xmonad $ defaultConfig
    { terminal   = myTerminal
    , manageHook = myManageHook
    , ...
    , ...
    }

myTerminal = "urxvt"

-- you could have some crazy long managehook 
-- or simply defaultManageHook
myManageHook = ...

Imports

You'll need to import some things to make this functionality available.
Make sure you've got -contrib installed and add the following to the top of your xmonad.hs:

import XMonad.Util.Scratchpad

Pretty easy, huh?

ManageHook

We're going to add an additional manageHook to manage the scratchPad specifically. XMonad makes it easy to just tack manageHooks onto whatever you have existing by using <+> which is an inFix operator that takes two manageHooks and returns a manageHook. So...

myManageHook = ([ -- whatever it is, probably some list of things...
                , ...
                , ...
                ]) <+> manageScratchPad

-- then define your scratchpad management separately:
manageScratchPad :: ManageHook
manageScratchPad = scratchpadManageHook (W.RationalRect l t w h)

  where

    h = 0.1     -- terminal height, 10%
    w = 1       -- terminal width, 100%
    t = 1 - h   -- distance from top edge, 90%
    l = 1 - w   -- distance from left edge, 0%

What I've done is used RationalRect to define a rectagle of where I'd like the scratchpad to ppear. h, w, t, and l are entered as percentage screen size. So in the above, I've got a rectangle that spans the monitor's entire width and is 10% its height. By specifying h and w, t and l are already defined since I want it to be on the bottom edge of the screen.

KeyBinds

I'm not really going to get specific with the key binding part. Personally, I use EZConfig. Everyone seems to have their own syntax/style of binding keys in xmonad; usually it's just the way it was in the first config you copied from, whatever. Just know that someway-somehow you'll need to bind a key to...

myKeys = [ ( ... , ...        )
         , ( ... , scratchPad ) -- spawn a scratchpad terminal
         ]

         where 

           scratchPad = scratchpadSpawnActionTerminal myTerminal

Make sense?

Extra credit

At this point, you should have a functioning scratchpad. Remember, any changes to the manageHook require you to exit and reopen the scratchpad terminal to see the effect.

Using this scratchpad module creates a workspace called NSP where the scratchpad resides when it's not visible. You'll notice, this workspace will show up in any dzen or xmobar you've got going on. But with some changes to our logHook we can filter that out of the workspace list pretty easily.

If you're not using a custom logHook, you've pretty much got two choices at this point: head over to the docs on xmonad.org and find some drop-in filter-out-NSP module and figure out how add it (I know it's there but I could not for the life of me get it working), or just figure out how to get a custom logHook going.

What I'm about to go into assumes you've already got something like the following defined in your xmonad.hs:

myLogHook h = dynamicLogWithPP $ defaultPP
  { ppCurrent         = dzenColor color1 color2 . pad
  , ppVisible         = dzenColor color1 color2 . pad
  , ppUrgent          = dzenColor color1 color2 . pad . dzenStrip
  , ppLayout          = dzenColor color1 color2 . pad
  , ppHidden          = dzenColor color1 color2 . pad
  , ppHiddenNoWindows = namedOnly
  , ppTitle           = shorten 100 
  , ppSep             = " "
  , ppWsSep           = ""
  , ppOutput          = hPutStrLn h
  }
The above requires other contrib modules, changes to main, and special imports to get working. As I've said, I'm leaving it as an exercise for the reader to set up his or her own logHook.

Once we've got this, filtering out the NSP workspace is pretty straight forward. Here's the above again, but this time with the NSP workspace filtered out, hopefully you'll be able to modify things as needed to make this work with your setup.

myLogHook h = dynamicLogWithPP $ defaultPP
  { ppCurrent         = dzenColor color1 color2 . pad
  , ppVisible         = dzenColor color1 color2 . pad
  , ppUrgent          = dzenColor color1 color2 . pad . dzenStrip
  , ppLayout          = dzenColor color1 color2
  , ppLayout          = dzenColor color1 color2 . pad
  , ppHidden          = dzenColor color1 color2 . pad . noScatchPad -- haskell makes it so easy,
  , ppHiddenNoWindows = noScratchPad                                -- just tack on another function
  , ppTitle           = shorten 100 
  , ppSep             = " "
  , ppWsSep           = ""
  , ppOutput          = hPutStrLn h
  }

  where
    -- then define it down here: if the workspace is NSP then print
    -- nothing, else print it as-is
    noScratchPad ws = if ws == "NSP" then "" else ws

Good luck!

published on 10 Apr 2010, tagged with haskell xmonad

XMonad's IM Layout

One of my favorite modules from xmonad-contrib is the IM layout. It's a tiling algorithm designed to handle your roster and chat windows in the best way possible. Here I'm going to outline how I set this up in my xmonad.hs.

What it looks like

Personally, I want my roster tiled in its own properly-sized location on one side, and all other IM related windows floating. I also want any of those IM windows to automatically be pushed to the IM workspace.

IM Layout Screenshot 

Here we can see that my roster is fit in its own little tile on the left, sized properly. The rest of the screen is tiled as a grid of whatever other applications I open. My chat window is always floating.

So, how would you set this up?

Imports and Main

This post assumes you've imported the required modules and you have a main function setup as shown:

-- imports
import XMonad.Hooks.ManageDocks
import XMonad.Hooks.ManageHelpers
import XMonad.Layout.IM
import XMonad.Layout.PerWorkspace

import qualified XMonad.StackSet as W

-- main
main = xmonad $ defaultConfig
    { ...
    -- all of our changes will take place in the myLayout and
    -- myManageHook definitions.
    , layoutHook = myLayout
    , manageHook = myManageHook
    }   

The Layout Hook

Here's a simple layoutHook that adds the IM extension on a specific workspace and has the added bonus that you can cycle between all of your "standard" layouts in the space that's not taken up by the roster.

Also, if your IM client isn't open, the workspace will behave like any other.

-- Layouts
myLayout = avoidStruts $ onWorkspace "3-im" imLayout $ standardLayouts

  where
    --          numMasters, resizeIncr, splitRatio
    tall = Tall 1           0.02        0.5

    -- define the list of standardLayouts
    standardLayouts = tall ||| Mirror tall ||| Full

    -- notice that withIM, which normally acts on one layout, can also 
    -- work on a list of layouts (yay recursive data types!)
    imLayout = withIM (1/10) (Role "roster") (standardLayouts)

I've defined the function standardLayouts, which we'll use on all workspaces. Then for "3-im", I define imLayout which uses the withIM modifier.

This really highlights what XMonad brings as a WM, something you don't get with most other tilers; because we are using haskell as the config language, we have all kinds of native tricks at our disposal. The reason the above works is that both (someLayout) and (someLayout ||| someOtherLayout ||| yetAnotherLayout) are valid as arguments to withIM due to their identical (existential) types. If the compiler allows it, we can be pretty sure it'll behave as we intended.

Now you should have a nice IM layout setup, go ahead and M-q, then head to workspace 3 (or whatever) and fire up your IM client

Feel free to stop here if you plan on having your IM Chat windows tiled. I think that's kind of ugly, so I choose to have any IM-related window besides the roster floated by default. My manageHook takes care of that.

The Manage Hook

-- Manage hook
myManageHook = composeAll
    [ -- whatever you might already have, plus...

    -- move all IM windows to IM workspace
    , className =? "Gajim.py" --> doShift "3-chat"

    -- and float everything but the roster
    , classNotRole ("Gajim.py", "roster") --> doFloat
    ]

    where
        classNotRole :: (String, String) -> Query Bool
        classNotRole (c,r) = className =? c <&&> role /=? r

        role = stringProperty "WM_WINDOW_ROLE"

This will move all IM windows to the IM workspace, and float anything that's of the IM Class but not the roster's Role.

You can use the commandline tool xprop to find out a window's properties for use in these manageHooks.

published on 05 Dec 2009, tagged with haskell xmonad

Status Bars in XMonad

One of the trickiest things for a new user of XMonad is adding a statusbar. This is mainly because xmonad's statusbar support is so transparent and extensible, that any documentation for setting it up could be completely different from any other. Do you want a dynamicLog? A customLog? xmobar? dzen? One bar? Two?

Here I'll outline my method. Two separate dzen2 bars aligned to look like one bar across the top of your screen. The left fed by an xmonad dynamicLogWithPP to show workspaces (with coloring and urgencyHooks), the current layout and the current window title. The right fed by conky to show music, system stats and of course the time.

Many thanks go to moljac and lifeafter2am on the Arch forums. They offered their xmonad.hs's to me and helped get me setup this way.

What it looks like

Full desktop:

XMonad Shot 

And with an urgencyHook notification (Workspace turns a different color):

XMonad Shot Urgent 

To achieve this, we set up a dynamicLog in xmonad.hs and adjust our main function to output to this bar and also spawn our right bar as fed by conky.

Imports and the Main function

Your imports and main function will look like this:

import XMonad.Util.Run
import XMonad.Hooks.DynamicLog
import XMonad.Hooks.ManageDocks
import XMonad.Hooks.UrgencyHook

main = do
    -- spawn our left and right bars. in my case, I use two monitors,
    -- I want one bar on each, and my version of dzen supports the -xs
    -- argument for specifying on which screen to appear. if your
    -- situation is different in some way, use -w and -x to give your
    -- bars appriate width and x offsets for your needs.
    d <- spawnPipe "dzen2 -p -xs 1 -ta l -e 'onstart=lower'"

    spawn $ "conky -c ~/.xmonad/data/conky/dzen | " ++
                "dzen2 -p -xs 2 ta -r -e 'onstart=lower'"

    xmonad $ withUrgencyHook NoUrgencyHook $ defaultConfig
        { ...
        , logHook = myLogHook d

        -- having these call out to external functions makes it easier 
        -- to add the "no overlap" stuff later on. if you don't have 
        -- myLayoutHook or myManageHook, you can continue to use the 
        -- xmonad defaults by declaring them like so:
        -- 
        -- > myManageHook = manageHook defaultConfig
        -- > myLayoutHook = layoutHook defaultConfig
        -- 
        , manageHook = myManageHook
        , layoutHook = myLayoutHook
        }

Don't worry about the things we haven't defined yet, I'll get to those. Also, the conky config file which I use can be found in my xmonad repo.

Your LogHook

Your logHook will setup the output of workspaces, layouts, and titles to the left dzen. You can customize the formatting, padding, shortening, etc.

Here's a commented version of myLogHook which, hopefully, is illustrative enough to not warrant further explanation.

-- 
-- Loghook
-- 
-- note: some of these colors may differ from what's in the
-- screenshot, it changes daily
-- 
myLogHook h = dynamicLogWithPP $ defaultPP

    -- display current workspace as darkgrey on light grey (opposite of 
    -- default colors)
    { ppCurrent         = dzenColor "#303030" "#909090" . pad 

    -- display other workspaces which contain windows as a brighter grey
    , ppHidden          = dzenColor "#909090" "" . pad 

    -- display other workspaces with no windows as a normal grey
    , ppHiddenNoWindows = dzenColor "#606060" "" . pad 

    -- display the current layout as a brighter grey
    , ppLayout          = dzenColor "#909090" "" . pad 

    -- if a window on a hidden workspace needs my attention, color it so
    , ppUrgent          = dzenColor "#ff0000" "" . pad . dzenStrip

    -- shorten if it goes over 100 characters
    , ppTitle           = shorten 100  

    -- no separator between workspaces
    , ppWsSep           = ""

    -- put a few spaces between each object
    , ppSep             = "  "

    -- output to the handle we were given as an argument
    , ppOutput          = hPutStrLn h
    }

No Overlap

The last thing you should do is add two little things to make sure you leave a gap for the new statusbar:

-- add avoidStruts to your layoutHook like so
myLayoutHook = avoidStruts $ {- whatever you had before... -}

-- add manageDocks to your managehook like so
myManageHook = manageDocks <+> {- whatever you had before ... -}

Happy haskelling!

published on 05 Dec 2009, tagged with dzen haskell xmonad