tee-io Lessons Learned

A while back, I launched a side project called tee-io. It’s sort of like a live pastebin. You use its API to create a command and then send it buffered output, usually a line at a time. Creating the command gives you a URL where you can watch the output arrive in real time. We use it at work to monitor commands run by our bot, instead of waiting for a (potentially long) command to finish and report all the output back to us at once.

[screenshot: tee-io in action]
While working on this project, which is built with Yesod, I started to settle on some conventions for things I’ve not seen written up in the wild. I’d like to collect my thoughts here, both for myself and in case these conventions are useful to others.

Worker

One thing tee-io does that I think is common but under-served in the tutorial space is background work. In addition to the main warp-based binary, it’s often necessary to run something on a schedule and do periodic tasks. In tee-io’s case, I want to archive older command output to S3 every 10 minutes.

My approach is to define a second executable target:

executable              tee-io-worker
    if flag(library-only)
        buildable:      False

    main-is:            main-worker.hs
    hs-source-dirs:     app
    build-depends:      base
                      , tee-io

    ghc-options:        -Wall -Werror -threaded -O2 -rtsopts -with-rtsopts=-N

This is basically a copy-paste of the existing executable, and the implementation is also similar:

import Prelude (IO)
import Worker (workerMain)

main :: IO ()
main = workerMain

workerMain uses the “unsafe” handler function to run a Handler action as IO:

workerMain :: IO ()
workerMain = handler $ do
    timeout <- appCommandTimeout . appSettings <$> getYesod
    archiveCommands timeout

archiveCommands :: Second -> Handler ()
archiveCommands timeout = runDB $ -- ...

Making the heavy lifting a Handler () means I have access to logging, the database, and any other configuration present in a fully-inflated App value. It’s certainly possible to write this directly in IO, but the only real downside to Handler is that if I accidentally try to do something request or response-related, it won’t work. In my opinion, pragmatism outweighs principle in this case.

Logging

One of the major functional changes I make to a scaffolded Yesod project is around AppSettings, and specifically logging verbosity.

I like to avoid the #define DEVELOPMENT stuff as much as possible. It’s required for template-reloading and similar settings because there’s no way to give the functions that need to know those settings an IO context. For everything else, I prefer environment variables.

In keeping with that spirit, I replace the compile-time, logging-related configuration fields with a single, env-based log-level:

Settings.hs

instance FromJSON AppSettings where
    parseJSON = withObject "AppSettings" $ \o -> do
        let appStaticDir = "static"
        appDatabaseConf <- fromDatabaseUrl
            <$> o .: "database-pool-size"
            <*> o .: "database-url"
        appRoot <- o .: "approot"
        appHost <- fromString <$> o .: "host"
        appPort <- o .: "port"
        appIpFromHeader <- o .: "ip-from-header"
        appCommandTimeout <- fromIntegral
            <$> (o .: "command-timeout" :: Parser Integer)
        S3URL appS3Service appS3Bucket <- o .: "s3-url"
        appMutableStatic <- o .: "mutable-static"

        appLogLevel <- parseLogLevel <$> o .: "log-level"
        -- ^ here

        return AppSettings{..}

      where
        parseLogLevel :: Text -> LogLevel
        parseLogLevel t = case T.toLower t of
            "debug" -> LevelDebug
            "info" -> LevelInfo
            "warn" -> LevelWarn
            "error" -> LevelError
            _ -> LevelOther t

config/settings.yml

approot: "_env:APPROOT:http://localhost:3000"
command-timeout: "_env:COMMAND_TIMEOUT:300"
database-pool-size: "_env:PGPOOLSIZE:10"
database-url: "_env:DATABASE_URL:postgres://teeio:teeio@localhost:5432/teeio"
host: "_env:HOST:*4"
ip-from-header: "_env:IP_FROM_HEADER:false"
log-level: "_env:LOG_LEVEL:info"
mutable-static: "_env:MUTABLE_STATIC:false"
port: "_env:PORT:3000"
s3-url: "_env:S3_URL:https://s3.amazonaws.com/tee.io"

I don’t use config/test-settings.yml and prefer to inject whatever variables are appropriate for the given context directly. To make that easier, I load .env files through my load-env package in the appropriate places.
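
The loading itself is a one-liner. As a minimal sketch, assuming the scaffold’s usual appMain as the entry point, it looks something like this (loadEnv is a no-op when no .env file exists):

import LoadEnv (loadEnv)
import Application (appMain)

main :: IO ()
main = do
    loadEnv -- reads .env from the current directory, if present
    appMain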

.env (development)

COMMAND_TIMEOUT=5
LOG_LEVEL=debug
MUTABLE_STATIC=true
S3_URL=https://s3.amazonaws.com/tee.io.development

.env.test

DATABASE_URL=postgres://teeio:teeio@localhost:5432/teeio_test
LOG_LEVEL=error
S3_URL=http://localhost:4569/tee.io.test

Now I can adjust my logging verbosity in production with a simple heroku config:set, whereas before I needed a compilation and deployment to do that!
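
For example:

heroku config:set LOG_LEVEL=debug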

Yesod applications log in a few different ways, so there are a handful of touch-points where we need to check this setting. To make that easier, I put a centralized helper alongside the data type in Settings.hs:

allowsLevel :: AppSettings -> LogLevel -> Bool
AppSettings{..} `allowsLevel` level = level >= appLogLevel

The first place to use it is the shouldLog member of the Yesod instance:

shouldLog App{..} _source level = appSettings `allowsLevel` level

Second is the logging middleware. It’s a little tricky to get the right behavior here because, with the default scaffold, this logging always happens: it has no concept of level and makes no use of shouldLog.

The approach I landed on was to change the destination to (basically) /dev/null if we’re not logging at INFO or lower. That’s equivalent to if these messages were tagged INFO and respected our configured level, which seems accurate to me. The big win here is they no longer mess up my test suite output.

makeLogWare foundation = mkRequestLogger def
    { outputFormat = if appSettings foundation `allowsLevel` LevelDebug
        then Detailed True
        else Apache apacheIpSource
    , destination = if appSettings foundation `allowsLevel` LevelInfo
        then Logger $ loggerSet $ appLogger foundation
        else Callback $ \_ -> return ()
    }

One last thing, specific to tee-io, is that I can use this setting to turn on debug logging in the AWS library I use:

logger <- AWS.newLogger (if appSettings foundation `allowsLevel` LevelDebug
    then AWS.Debug
    else AWS.Error) stdout

It’s pretty nice to set LOG_LEVEL=debug and start getting detailed logging for all AWS interactions. Kudos to amazonka for having great logging too.

REPL-Driven-Development

DevelMain.hs has quickly become my preferred way to develop Yesod applications. This file ships with the scaffold and defines a module for starting, stopping, or reloading an instance of your development server directly from the REPL:

stack repl --ghc-options="-DDEVELOPMENT -O0 -fobject-code"
λ> :l DevelMain
λ> DevelMain.update
Devel application launched: http://localhost:3000

The big win here in my opinion is that, in addition to viewing changes in your local browser, you naturally fall into a REPL-based workflow. It’s not something I was actively missing in Yesod projects, but now that I’m doing it, it feels really great.

I happen to have a nice Show instance for my settings, which I can see with handler:

λ> appSettings <$> handler getYesod
log_level=LevelDebug host=HostIPv4 port=3000 root="http://localhost:3000"
  db=[user=teeio password=teeio host=localhost port=5432 dbname=teeio]
  s3_bucket=tee.io.development command_timeout=5s

(Line breaks added for readability, here and below.)

And I can investigate or alter my local data easily with db:

λ> db $ selectFirst [] [Desc CommandCreatedAt]
Just (Entity
      { entityKey = CommandKey
          { unCommandKey = SqlBackendKey {unSqlBackendKey = 1097} }
      , entityVal = Command
          { commandToken = Token {tokenUUID = e79dae2c-020e-48d4-ac0b-6d9c6d79dbf4}
          , commandDescription = Just "That example command"
          , commandCreatedAt = 2016-02-11 14:50:19.786977 UTC
          }
      })
λ>

Finally, this makes it easy to locally test that worker process:

λ> :l Worker
λ> workerMain
16/Apr/2016:14:08:28 -0400 [Debug#SQL]
  SELECT "command"."id", ...
    FROM "command"
  LEFT OUTER JOIN "output"
     ON ("command"."id" = "output"."command")
    AND ("output"."created_at" > ?)
  WHERE (("command"."created_at" < ?)
    AND ("output"."id" IS NULL))
  ; [ PersistUTCTime 2016-04-16 18:08:23.903484 UTC
    , PersistUTCTime 2016-04-16 18:08:23.903484 UTC
    ]
16/Apr/2016:14:08:28 -0400 [Info] archive_commands count=1
  @(main:Worker /home/patrick/code/pbrisbin/tee-io/src/Worker.hs:37:7)
[Client Request] {
  host      = s3.amazonaws.com:443
  secure    = True
  method    = PUT
  target    = Nothing
  timeout   = Just 70000000
  redirects = 0
  path      = /tee.io.development/b9a74a98-0b16-4a23-94f1-5df0a01667d0
  query     = 
  headers   = ...
  body      = ...
}
[Client Response] {
  status  = 200 OK
  headers = ...
}
16/Apr/2016:14:08:28 -0400 [Debug#SQL] SELECT "id", "command", ...
16/Apr/2016:14:08:28 -0400 [Debug#SQL] DELETE FROM "output" WHERE ...
16/Apr/2016:14:08:28 -0400 [Debug#SQL] DELETE FROM "command" WHERE ...
16/Apr/2016:14:08:28 -0400 [Info] archived token=b9a74a98-0b16-4a23-94f1-5df0a01667d0
  @(main:Worker /home/patrick/code/pbrisbin/tee-io/src/Worker.hs:59:7)
λ>

Since I run with DEBUG in development, and that was picked up by the REPL, we can see all the S3 and database interactions the job goes through.

The console was one of the features I felt was lacking when first coming to Yesod from Rails. I got used to not having it, but I’m glad to see there have been huge improvements in this area while I wasn’t paying attention.

Deployment

I’ve been watching the deployment story for Yesod and Heroku change drastically over the past few years. From compiling on a VM, to a GHC build pack, to Halcyon, the experience hasn’t exactly been smooth. Well, it seems I might have been right in the conclusion of that last blog post:

Docker […] could solve these issues in a complete way by accident.

We now have a Heroku plugin for using Docker to build a slug in a container identical to their Cedar infrastructure, then extracting and releasing it via their API.

Everything we ship at work is Docker-based, so I’m very comfortable with the concepts and machine setup required (which isn’t much), and using this release strategy for my Yesod applications has been great. Your mileage may vary though: while I do feel it’s the best approach available today, there may be some bumps and yaks to shave for those not already familiar with Docker – especially if on an unfortunate operating system, like OS X.

Thanks to the good folks at thoughtbot, who are maintaining a base image for releasing a stack-based project using this Heroku plugin, making tee-io deployable to Heroku looked like this:

% cat Procfile
web: ./tee-io

% cat app.json
{
  "name": "tee.io",
  "description": "This is required for heroku docker:release"
}

% cat docker-compose.yml
# This is required for heroku docker:release
web:
  build: .

% cat Dockerfile
FROM thoughtbot/heroku-haskell-stack:lts-5.12
MAINTAINER Pat Brisbin <pbrisbin@gmail.com>

And I just run:

heroku docker:release

And that’s it!

If you’re interested in seeing any of the code examples here in the context of the real project, check out the tee-io source on GitHub.

16 Apr 2016, tagged with haskell, yesod

Writing JSON APIs with Yesod

Lately at work, I’ve been fortunate enough to work on a JSON API which I was given the freedom to write in Yesod. I was a bit hesitant at first since my only Yesod experience has been with richer, HTML-based sites and I wasn’t sure what support (if any) there was for strictly JSON APIs. Rails has a number of conveniences for writing concise controllers and standing up APIs quickly – I was afraid Yesod might be lacking.

I quickly realized my hesitation was unfounded. The process was incredibly smooth and Yesod comes with just as many niceties that allow for rapid development and concise code when it comes to JSON-only API applications. Couple this with all of the benefits inherent in using Haskell, and it becomes clear that Yesod is well-suited to sites of this nature.

In this post, I’ll outline the process of building such a site, explain some conventions I’ve landed on, and discuss one possible pitfall when dealing with model relations.

Note: The code in this tutorial was extracted from a current project and is in fact working there. However, I haven’t test-compiled the examples exactly as they appear in the post. It’s entirely possible there are typos and the like. Please reach out on Twitter or via email if you run into any trouble with the examples.

What We Won’t Cover

This post assumes you’re familiar with Haskell and Yesod. It also won’t cover some important but un-interesting aspects of API design. We’ll give ourselves arbitrary requirements and I’ll show only the code required to meet those.

Specifically, the following will not be discussed:

  • Haskell basics
  • Yesod basics
  • Authentication
  • Embedding relations or side-loading
  • Dealing with created-at or updated-at fields

Getting Started

To begin, let’s get a basic Yesod site scaffolded out. How you do this is up to you, but here are my preferred steps:

$ mkdir ./mysite && cd ./mysite
$ cabal sandbox init
$ cabal install alex happy yesod-bin
$ yesod init --bare
$ cabal install --dependencies-only
$ yesod devel

The scaffold comes with a number of features we won’t need. You don’t have to remove them, but if you’d like to, here they are:

  • Any existing models
  • Any existing routes/templates
  • Authentication
  • Static file serving

Models

For our API example, we’ll consider a site with posts and comments. We’ll keep things simple; additional models or attributes would just mean more lines in our JSON instances or more handlers of the same basic form. This would result in larger examples, but not add any value to the tutorial.

Let’s go ahead and define the models:

config/models

Post
  title Text
  content Text

Comment
  post PostId
  content Text

JSON

It’s true that we can add a json keyword in our model definition and get derived ToJSON/FromJSON instances for free on all of our models; we won’t do that though. I find these JSON instances, well, ugly. You’ll probably want your JSON to conform to some conventional format, be it jsonapi or Active Model Serializers. Client side frameworks like Ember or Angular will have better built-in support if your API conforms to something conventional. Writing the instances by hand is also more transparent and easily customized later.

Since what we do doesn’t much matter, only that we do it, I’m going to write JSON instances and endpoints to appear as they would in a Rails project using Active Model Serializers.

Model.hs

share [mkPersist sqlSettings, mkMigrate "migrateAll"]
    $(persistFileWith lowerCaseSettings "config/models")

-- { "id": 1, "title": "A title", "content": "The content" }
instance ToJSON (Entity Post) where
    toJSON (Entity pid p) = object
        [ "id"      .= (String $ toPathPiece pid)
        , "title"   .= postTitle p
        , "content" .= postContent p
        ]

instance FromJSON Post where
    parseJSON (Object o) = Post
        <$> o .: "title"
        <*> o .: "content"

    parseJSON _ = mzero

-- { "id": 1, "post_id": 1, "content": "The comment content" }
instance ToJSON (Entity Comment) where
    toJSON (Entity cid c) = object
        [ "id"      .= (String $ toPathPiece cid)
        , "post_id" .= (String $ toPathPiece $ commentPost c)
        , "content" .= commentContent c
        ]

-- We'll talk about this later
--instance FromJSON Comment where

Routes and Handlers

Let’s start with a RESTful endpoint for posts:

config/routes

/posts         PostsR GET POST
/posts/#PostId PostR  GET PUT DELETE

Since our API should return proper status codes, let’s add the required functions to Import.hs, making them available everywhere:

Import.hs

import Network.HTTP.Types as Import
    ( status200
    , status201
    , status400
    , status403
    , status404
    )

Next we write some handlers:

Handlers/Posts.hs

getPostsR :: Handler Value
getPostsR = do
    posts <- runDB $ selectList [] [] :: Handler [Entity Post]

    return $ object ["posts" .= posts]

postPostsR :: Handler ()
postPostsR = do
    post <- requireJsonBody :: Handler Post
    _    <- runDB $ insert post

    sendResponseStatus status201 ("CREATED" :: Text)

You’ll notice we need to add a few explicit type annotations. Normally, Haskell can infer everything for us, but in this case the reason for the annotations is actually pretty interesting. The selectList function will return any type that’s persistable. Normally we would simply treat the returned records as a particular type and Haskell would say, “Aha! You wanted a Post” and then, as if by time travel, selectList would give us appropriate results.

In this case, all we do with the returned posts is pass them to object. Since object can work with any type that can be represented as JSON, Haskell doesn’t know which type we mean. We must remove the ambiguity with a type annotation somewhere.
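
The annotation could just as well go at the use site instead; either spot resolves the ambiguity:

getPostsR :: Handler Value
getPostsR = do
    posts <- runDB $ selectList [] []

    return $ object ["posts" .= (posts :: [Entity Post])]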

Handlers/Post.hs

getPostR :: PostId -> Handler Value
getPostR pid = do
    post <- runDB $ get404 pid

    return $ object ["post" .= (Entity pid post)]

putPostR :: PostId -> Handler Value
putPostR pid = do
    post <- requireJsonBody :: Handler Post

    runDB $ replace pid post

    sendResponseStatus status200 ("UPDATED" :: Text)

deletePostR :: PostId -> Handler Value
deletePostR pid = do
    runDB $ delete pid

    sendResponseStatus status200 ("DELETED" :: Text)

I love how functions like get404 and requireJsonBody allow these handlers to be completely free of any error-handling concerns, but still be safe and well-behaved.

Comment Handlers

There’s going to be a small annoyance in our comment handlers which I alluded to earlier by omitting the FromJSON instance on Comment. Before we get to that, let’s take care of the easy stuff:

config/routes

/posts/#PostId/comments            CommentsR GET POST
/posts/#PostId/comments/#CommentId CommentR  GET PUT DELETE

Handlers/Comments.hs

getCommentsR :: PostId -> Handler Value
getCommentsR pid = do
    comments <- runDB $ selectList [CommentPost ==. pid] []

    return $ object ["comments" .= comments]

-- We'll talk about this later
--postCommentsR :: PostId -> Handler ()

For the single-resource handlers, we’re going to assume that a CommentId is unique across posts, so we can ignore the PostId in these handlers.

Handlers/Comment.hs

getCommentR :: PostId -> CommentId -> Handler Value
getCommentR _ cid = do
    comment <- runDB $ get404 cid

    return $ object ["comment" .= (Entity cid comment)]

-- We'll talk about this later
--putCommentR :: PostId -> CommentId -> Handler ()

deleteCommentR :: PostId -> CommentId -> Handler ()
deleteCommentR _ cid = do
    runDB $ delete cid

    sendResponseStatus status200 ("DELETED" :: Text)

Handling Relations

Up until now, we’ve been able to define JSON instances for our model, use requireJsonBody, and insert the result. In this case however, the request body will be lacking the Post ID (since it’s in the URL). This means we need to parse a different but similar data type from the JSON, then use that and the URL parameter to build a Comment.

Helpers/Comment.hs

-- This datatype would be richer if Comment had more attributes. For now 
-- we only have to deal with content, so I can use a simple newtype.
newtype CommentAttrs = CommentAttrs Text

instance FromJSON CommentAttrs where
    parseJSON (Object o) = CommentAttrs <$> o .: "content"
    parseJSON _          = mzero

toComment :: PostId -> CommentAttrs -> Comment
toComment pid (CommentAttrs content) = Comment
    { commentPost    = pid
    , commentContent = content
    }

This may seem a bit verbose and even redundant, and there’s probably a more elegant way to get around this situation. Lacking that, I think the additional safety (vs the obvious solution of making commentPost a Maybe) and separation of concerns (vs putting this in the model layer) is worth the extra typing. It’s also very easy to use:

Handlers/Comments.hs

import Helpers.Comment

postCommentsR :: PostId -> Handler ()
postCommentsR pid = do
    _ <- runDB . insert . toComment pid =<< requireJsonBody

    sendResponseStatus status201 ("CREATED" :: Text)

Handlers/Comment.hs

import Helpers.Comment

putCommentR :: PostId -> CommentId -> Handler ()
putCommentR pid cid = do
    runDB . replace cid . toComment pid =<< requireJsonBody

    sendResponseStatus status200 ("UPDATED" :: Text)

We don’t need a type annotation on requireJsonBody in this case. Since the result is being passed to toComment pid, Haskell knows we want a CommentAttrs and uses its parseJSON function within requireJsonBody.

Conclusion

With a relatively small amount of time and code, we’ve written a fully-featured JSON API using Yesod. I think the JSON instances and API handlers are more concise and readable than the analogous Rails serializers and controllers. Our system is also far safer thanks to the type system and framework-provided functions like get404 and requireJsonBody without us needing to explicitly deal with any of that.

I hope this post has shown that Yesod is indeed a viable option for projects of this nature.

22 Feb 2014, tagged with haskell, yesod

Parsing DATABASE_URL

A while back, I made a post about deploying Yesod apps to Heroku. The method used back then is no longer required (thank God!) and deploying to Heroku is super simple these days. So simple, in fact, that I won’t reiterate those instructions here; this post is about something a bit more specific.

Chances are, your app is using a database. And you probably don’t want to hard-code those database credentials in your (probably shared) source code. What you’d rather do is parse them out of the DATABASE_URL environment variable provided by heroku.

Well, here is how you do that:

herokuConf

Eventually, I might wrap this up in a cabal package you can install, but for now just create a helper like this:

Helpers/Heroku.hs

module Helpers.Heroku (herokuConf) where

import Prelude
import Data.ByteString (ByteString)
import Data.Text (Text)
import Data.Text.Encoding (encodeUtf8)
import Database.Persist.Postgresql (PostgresConf(..))
import Web.Heroku (dbConnParams)

import qualified Data.Text as T

herokuConf :: IO PostgresConf
herokuConf = do
    params <- dbConnParams

    return PostgresConf
        { pgConnStr  = formatParams params
        , pgPoolSize = 10 -- Adjust this as you see fit!
        }

    where
        formatParams :: [(Text, Text)] -> ByteString
        formatParams = encodeUtf8 . T.unwords . map toKeyValue

toKeyValue :: (Text, Text) -> Text
toKeyValue (k, v) = k `T.append` "=" `T.append` v

This relies on the heroku package, so be sure you add that to the build-depends in your cabal file.
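
In other words, something like this (the exact dependency list and bounds are up to your project):

build-depends:      base
                  , heroku
                  -- , and the rest of your dependencies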

makeFoundation

Now, modify your application loading like so:

Application.hs

import Helpers.Heroku

makeFoundation :: AppConfig DefaultEnv Extra -> IO App
makeFoundation conf = do
    -- ...

    dbconf <- if development
                -- default behavior when in development
                then withYamlEnvironment "config/postgresql.yml" (appEnv conf)
                    Database.Persist.loadConfig >>=
                    Database.Persist.applyEnv

                -- but parse DATABASE_URL in non-development
                else herokuConf

    -- ...

    return foundation

That’s it. Commit, push, enjoy!

07 Jun 2013, tagged with yesod, haskell, heroku, postgresql

For the Library Authors

Recently, Yesod released version 1.2. You can read the announcement here, the changelog here, and a detailed blog post about the subsite rewrite here. These resources do a great job of getting users’ apps up on 1.2. This post won’t rehash those, it is instead intended for those of you (like myself) who maintain libraries dependent on Yesod.

The large refactor to the transformer stack and its implications in how subsites are written made it non-trivial to port my markdown, comments, and pagination libraries over to 1.2; I imagine many other such authors are in the same position and might appreciate a little guidance.

I don’t claim to know why or how all this stuff works, but at least you can benefit from my trial and error.

I apologize for the lack of narrative or conclusion here; this is pretty much just a list of things I had to take care of during the update process…

Transformer Stack

You can basically find-and-replace all of these:

fooHandler :: X -> Y -> GHandler s m a
fooWidget  :: X -> Y -> GWidget s m a

Into these:

fooHandler :: X -> Y -> HandlerT m IO a
fooWidget  :: X -> Y -> WidgetT m IO a

Lifting

Anywhere you use lift to run a Handler action from within a Widget now needs to use handlerToWidget for the same purpose.
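
As a sketch, where getCurrentUser stands in for any Handler action you might run from within a widget:

-- before (1.1)
fooWidget = do
    user <- lift getCurrentUser
    -- ...

-- after (1.2)
fooWidget = do
    user <- handlerToWidget getCurrentUser
    -- ...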

Route to Master

Subsites and their masters are now very well isolated, which means you no longer need code like this in a master site’s handler:

tm <- getRouteToMaster
redirect $ tm SomeRoute

It can be simplified to just:

redirect SomeRoute

Which is way better.

The function getRouteToMaster does still exist as getRouteToParent, and it should be used (only) to route to a master site’s route from within the subsite’s handler.

Subsite Declaration

If you author a subsite, here is where your largest changes will be. There’s a handy demo app which serves as a great reference.

Subsites now have a two-phase construction much like in Foundation.hs. So, where you might’ve had a single module like this:

module CommentsAdmin
  ( CommentsAdmin
  , getCommentsAdmin
  , Route(..)
  ) where

data CommentsAdmin = CommentsAdmin

getCommentsAdmin :: a -> CommentsAdmin
getCommentsAdmin = const CommentsAdmin

mkYesodSub "CommentsAdmin"
    [ ClassP ''YesodComments [ VarT $ mkName "master" ] ]
    [parseRoutes|
        /                            CommentsR      GET
        /edit/#ThreadId/#CommentId   EditCommentR   GET POST
        /delete/#ThreadId/#CommentId DeleteCommentR GET POST
        |]

You now need a separate file to define the routes:

CommentsAdmin/Routes.hs

module CommentsAdmin.Routes where

data CommentsAdmin = CommentsAdmin

mkYesodSubData "CommentsAdmin" [parseRoutes|
    /                            CommentsR      GET
    /edit/#ThreadId/#CommentId   EditCommentR   GET POST
    /delete/#ThreadId/#CommentId DeleteCommentR GET POST
    |]

And import/use them separately:

CommentsAdmin.hs

module Foo
  ( CommentsAdmin
  , getCommentsAdmin
  , module CommentsAdmin.Routes
  ) where

import CommentsAdmin.Routes

getCommentsAdmin :: a -> CommentsAdmin
getCommentsAdmin = const CommentsAdmin

instance YesodComments m => YesodSubDispatch CommentsAdmin (HandlerT m IO)
    where yesodSubDispatch = $(mkYesodSubDispatch resourcesCommentsAdmin)

There’s probably a way around this, but I had enough wrestling to do.

You’ll also want to make a Handler synonym for your subsite routes:

type Handler a = forall master. YesodComments master
               => HandlerT CommentsAdmin (HandlerT master IO) a

getCommentsR :: Handler RepHtml

It’s fine to use Handler as long as you don’t export it.

Subsite Actions

What you do from within a subsite will definitely need some tweaking, but that’s mostly because the old way was very clunky and the new way is much cleaner.

If you want to call any functions in the context of the main site, just use lift. Usually this’ll be lift $ defaultLayout, but also, if you have a typeclass on your master site providing some functionality (like loading comments), you need to use lift to call those functions from within subsite handlers.
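
For example, using the Handler synonym from above, where loadComments is a hypothetical method on the master site’s typeclass:

getCommentsR :: Handler RepHtml
getCommentsR = do
    comments <- lift loadComments -- hypothetical typeclass method

    lift $ defaultLayout $ do
        setTitle "Comments"
        -- render the comments...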

Persistent Fields

If you derive PersistField, you now also need to derive PersistFieldSql. I don’t know the motivation behind the split, but as a user dog-fooding my own library, I soon realized I needed both instances on my Markdown type.
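
As a sketch, assuming a Markdown type that’s stored as text, the extra instance can be as small as this (the exact sqlType signature has shifted between persistent versions):

instance PersistFieldSql Markdown where
    sqlType _ = SqlString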

Persistent Actions

If you have a library exposing functions which are meant to be called within runDB, you probably already know those type signatures can get messy.

Well, they stay messy, but at least I can tell you what you need to change. Mine went from this:

selectPaginated :: ( PersistEntity val
                   , PersistQuery m1
                   , PersistEntityBackend val ~ PersistMonadBackend m1
                   , MonadLift (GHandler s m) m1
                   )
                => Int
                -> [Filter val]
                -> [SelectOpt val]
                -> m1 ([Entity val], GWidget s m ())

To this:

selectPaginated :: ( PersistEntity val
                   , (PersistQuery (YesodPersistBackend m (HandlerT m IO)))
                   , (PersistMonadBackend (YesodPersistBackend m (HandlerT m IO)) ~ PersistEntityBackend val)
                   , (MonadTrans (YesodPersistBackend m))
                   , Yesod m
                   )
                => Int
                -> [Filter val]
                -> [SelectOpt val]
                -> YesodDB m ([Entity val], WidgetT m IO ())

I probably could’ve added the Yesod m constraint and used the YesodDB alias prior to 1.2, but oh well.

31 May 2013, tagged with yesod, haskell

Developing Web Applications with Yesod

The following was written for issue 7 of Web & PHP magazine. Please, if you enjoy this article (or my articles in general), take the two minutes to register there and download the full PDF to show your support.

Why Haskell?

There’s much more to Haskell than just the buzz-words like laziness and parallelism – which are completely deserved, by the way. Having pure computations defined as side-effect-free morphisms that take and return immutable datatypes allows the compiler to do amazing optimizations. This frees you to write elegant, readable code but get near-C performance at the same time.

Runtime errors. Grepping through source to find some method you just rewrote to ensure it’s not incorrectly called somewhere. Wondering if that expression represents a String or a Boolean. Wondering how that template behaves when user is nil. Unit tests. Hitting deploy and frantically browsing the site to make sure things still work. These are the hazards of a dynamic language. These are things that all go away when you use a language like Haskell.

It’s been my experience that when developing in Haskell: if it compiles, it works. I’d say, conservatively, that 93% of the bugs I’ve ever written in Haskell have been caught immediately by the compiler. That’s a testament to both the compiler and the number of bugs I’m able to produce in Haskell code. It’s amazingly freeing to gain such a level of confidence in the correctness of your code simply by seeing its successful compilation.

My hope for this article is to illustrate this experience by building out a simple site in the Haskell web framework Yesod. Yesod is just one of many web frameworks in Haskell, but it’s the one I’m most comfortable with. I encourage you to check it out at yesodweb.com; there are many features and considerations that I won’t be touching on here.

Yesod Development

In order to develop a Yesod site, you’ll need the Glasgow Haskell Compiler, along with some additional build tools. These can all be installed by setting up the Haskell Platform. There are installers for Windows and OS X, and most Linux distributions have it in their repositories.

Once the Haskell Platform is setup, type:

$ cabal update
$ cabal install yesod-platform

This one-time installation of the framework can take a while.

The Lemonstand

The blog example feels a bit overdone, doesn’t it? Instead, let’s build a lemonade stand (which I’ll refer to as “The Lemonstand” from now on). We won’t get too crazy with features; we just want a few pages and some database interaction so you can see how this framework can be used.

Much of the site will be provided by the code-generating tool called the Yesod Scaffold.

Yesod Init

The Yesod scaffolding tool will build out a sample site showing some of the more common and useful patterns used in Yesod sites. It will build you a simple “hello world” site with important features like persistence, authentication and static file serving already coded out. You can then edit and extend this site to quickly build out features.

It’s important to note that this is not the way to structure a Yesod application, it’s just one way to do it. That said, this organizational structure has been refined over a long period of time and comes with many benefits.

To start our project, we do the following:

$ yesod init

We’ll answer a couple of questions about ourselves and our project. I’m calling it “lemonstand” and choosing the sqlite database type since it does not require a separate server.

The first thing we have to do is pull in any additional dependencies (like the driver for the type of database we chose to use).

$ cd lemonstand
$ cabal install

In order for authentication via Google to work (a feature we’ll use down the line), we need to make one small change to config/settings.yml. Please update the development block like so:

Development:
  <<: *defaults
  approot: "http://localhost:3000"

With that bit of housekeeping out of the way, go ahead and fire up the development server:

$ yesod devel

You should see lots of output about compilation, database migrations, etc. Most importantly is “Devel application launched: http://localhost:3000”. Go ahead and checkout the sample site by visiting that URL in your browser.

[screenshot: the default scaffold site]

Now we’re ready to hack!

Models

In a “production” lemonade stand app, we might get a little more complex with the data modeling, but to keep this demo simple, I’m going to keep the model simple as well.

The scaffold already comes with the concept of a User and authentication, so we’ll keep that as-is. The second most important concept will be Orders which our users can create through a typical check-out flow.

Orders will have many Lemonades which have size, price, and quantity.

Open up Model.hs. This is where we’ll place our core data type definitions. You should notice a line about persistFile. What this does is parse the text file “config/models” and generate some Haskell datatypes for us. This line also adds the required boilerplate to persist these types to the database as well as the initial migration code. This is where your User model comes from.

We’ll get to this file in a second, but first we’re going to define some data types that won’t be persisted.

Go ahead and add the following after the import lines but before the share line:

type Price = Double
type Qty   = Int

These are type aliases. They just allow you to refer to one type by another name ([Char] is aliased to String in the standard Prelude, for instance).

If and when we later make functions that deal with Lemonades, we’ll see type signatures like this:

-- | Calculate the total price of multiple lemonades
totalPrice :: [Lemonade] -> Price
totalPrice = ...

And not like this:

-- | Calculate the total price of multiple lemonades
totalPrice :: [Lemonade] -> Double
totalPrice = ...

Which is not as descriptive. It’s a little thing, but it goes a long way.

We’re also going to create an additional data type that won’t (itself) be stored as a database record.

data Size = Small
          | Medium
          | Large
          deriving (Show, Read, Eq, Ord, Enum, Bounded)

You might be familiar with this concept as an enum. In Haskell, enumeration types are just a degenerate form of algebraic data types whose constructors take no arguments.

Don’t worry about the deriving line. That just tells Haskell to go ahead and use sane defaults when performing common operations with this type like converting it to string or comparing two values for equality. With this deriving in place, Haskell knows that Small can be shown as “Small” and that Medium == Medium.
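
For example, in GHCi:

λ> show Medium
"Medium"
λ> Medium == Medium
True
λ> [minBound .. maxBound] :: [Size]
[Small,Medium,Large]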

Even though we don’t want to store Sizes directly in the database as records, we do plan to have fields of other records be of the type Size. To allow this, we just have to ask Yesod to generate some boilerplate on this type:

derivePersistField "Size"

Easy.

When you hit save on this file, you should see, in the terminal still running yesod devel, that it’s recompiled your sources and restarted your development server. The important thing is that it does this successfully each time you make a change. When you introduce a bug, you’ll get a compiler error directing you to the problem. This immediate and accurate feedback is important to the development process, as we’ll see later on.

Next, we’ll go ahead and add some database models. Open up config/models.

You’ll see some models are already present; we’ll just add more to the bottom of the file:

Order
    user UserId

Lemonade
    order OrderId Maybe
    size  Size
    price Price
    qty   Qty

This is exactly as if you had handwritten the Haskell data types:

data Order = Order
    { orderUser :: UserId
    }

data Lemonade = Lemonade
    { lemonadeOrder :: Maybe OrderId
    , lemonadeSize  :: Size
    , lemonadePrice :: Price
    , lemonadeQty   :: Qty
    }

In addition to the above declarations, Yesod will add all of the boilerplate needed for values of these types to be (de)serialized and persisted to or restored from the database.

Again, save the file and make sure it compiles.

Notice that I used the Maybe type on lemonadeOrder. In Haskell, this type is defined as:

data Maybe a = Just a | Nothing

This allows you to have a function which can return some a or Nothing at all. This is how Haskell can maintain type safety even when you need the concept of an optional parameter or return value.
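
For example, here’s a lookup that may come up empty (findCoupon is hypothetical, just to show the shape):

-- Just a discount for codes we know, Nothing otherwise
findCoupon :: Text -> Maybe Price
findCoupon "SUMMER" = Just 0.50
findCoupon _        = Nothing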

I’m assuming here that we might want to describe Lemonades that aren’t yet associated with an Order. We’ll see if that turns out to be the case.

Route Handling

Before we start making further changes, let me provide some context on how the current homepage is rendered. We’ll be mimicking this pattern for our other pages.

Every URL that your app responds to is listed in config/routes, so go ahead and open that file.

You’ll see some scaffold-provided routes already. /static and /auth use a concept called Subsites to provide additional functionality to your app (namely static file serving and user authentication). We’ll not go into this any further as it can get hairy quickly and for the purposes of this article, we can treat these as black boxes.

The rest of the entries are normal routes. For these, you provide:

  1. The relative URL you answer to (we’ll get to variable pieces later)
  2. The data type of the route (again, more later)
  3. The supported methods (GET, POST, etc)

Let’s look at HomeR.

In your Foundation.hs file there’s another line similar to the persistFile line in Model.hs. It works much the same way in that it will parse this flat file (config/routes) and generate some Haskell code for us.

When the parser comes across this HomeR line, it’s going to do a number of things. Conceptually, it’s something like the following:

  1. HomeR is made a valid constructor for values of type Route which is used by the framework to route requests to your handler functions.
  2. The functions in charge of rendering and parsing URLs can now translate to and from this HomeR type.

In order to accomplish this, two functions need to be in scope: getHomeR and postHomeR. This is because we’ve specified GET and POST as supported methods.

So, whenever a GET request comes in for “/”, Yesod will now translate that URL into the data type HomeR and know to call getHomeR which is a function that returns an HTML response (RepHtml).

If you were to define a route like “/users/#UserId UsersR GET”, then your required function getUsersR would have the type UserId -> Handler RepHtml. Since your URL has a variable in it, that piece will match as a UserId and it will be given as the first argument to your handler function – all in an entirely type-safe way.
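
Sketched out, that hypothetical route and handler pair might look like this:

/users/#UserId UsersR GET

getUsersR :: UserId -> Handler RepHtml
getUsersR userId = do
    -- userId has already been parsed from the URL for us
    user <- runDB $ get404 userId

    defaultLayout $(widgetFile "user")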

Let’s add a route for buying some lemonade:

/checkout CheckoutR GET POST

While we’re here, remove the POST from HomeR since we’ll no longer be using that.

When you save this file you should see some problems in your compiler window:

[7 of 7] Compiling Application      ( Application.hs, dist/build/Application.o )

Application.hs:30:1: Not in scope: `getCheckoutR'

Application.hs:30:1: Not in scope: `postCheckoutR'

Well, look at that. We’ve introduced a bug, and it was caught immediately.

Since the app now needs to answer requests for “/checkout” by calling your handler functions, they need to be there or you’d have runtime errors. There is very little potential for runtime errors in Haskell, and this is just our first example of why: the compiler catches us ahead of time.

So let’s fix it. The following steps might feel a bit tedious, and in Yesod version 1.1 there is a tool to do them for you; however, I think that doing things like this manually at least once is useful.

Add the following around line 36 of lemonstand.cabal:

Handler.Checkout

This tells the build system to include this new source file we’ll create.

Add the following around line 26 of Application.hs:

import Handler.Checkout

This imports that module (still not written) into the scope where these functions are needed.

Finally, create the file Handler/Checkout.hs:

module Handler.Checkout where

import Import

getCheckoutR :: Handler RepHtml
getCheckoutR = undefined

postCheckoutR :: Handler RepHtml
postCheckoutR = undefined

We’ve really just traded one runtime error for another, as visiting that page will result in the app calling undefined, which will fail. However, we’ve made the compiler happy and can move on to other things and come back to these later.

Templates

Let’s open up Handler/Home.hs and see how our current home page is rendered.

We’re going to strip out just about everything here. Similar code will be added later in other handlers, and I’d like you to see those concepts then rather than now.

Rewrite the file so it looks like this:

-- leave everything up to and including the import line as-is.

getHomeR :: Handler RepHtml
getHomeR = do
    -- Use the default overall layout, you'll almost always do this.
    defaultLayout $ do

        -- The page title.
        setTitle "Lemonade Stand"

        -- The template to render.
        $(widgetFile "homepage")

You may notice that you’ve triggered another compiler error (quite a few, actually): not in scope: aDomId.

Our templates reference a variable which we’ve just removed. Please, take a moment to appreciate type-safe templates. No runtime error, no silent nil-handling, we get an up-front compiler error indicating exactly where the problem is. How cool is that?

In the process of fixing this, I’ll also try to provide a little more context.

$(widgetFile "homepage") is a very useful function. What it does is look in your templates directory for any HTML, CSS and Javascript templates for your “homepage”. These templates will be combined into a Widget. Widgets can be nested and combined quite naturally throughout your application. In the end, they will all be rolled up into one final Widget and served as a single response. All style sheets and scripts will be concatenated, minified (when configured to do so) and ordered correctly – all without you having to think about it.

For us, this means templates/homepage.{hamlet,lucius,julius} are being found and compiled.

Julius is JavaScript templating; it’s essentially a straight passthrough, except with variable interpolation. You can go ahead and remove it now; we won’t use it on this page.

$ rm templates/homepage.julius

Lucius is a superset of CSS. It was designed to allow existing CSS to be pasted directly in and have it still compile and work. On top of this, it allows for variable interpolation and some Less-like extensions like nesting and mixins. Open up the template and remove the style block referencing aDomId.

Hamlet is the most complex of Yesod’s templaters. Open up the template and fill it with the following content:

<h1>_{MsgHello}

<p>
  Click 
  <a href=@{CheckoutR}>here
  \ to buy some Lemonade!

We’re going to leave _{MsgHello} in place. The _{ } interpolation will check your messages file for translations and show different content based on the user’s preferred language.

@{ } is a Route interpolation. As you might’ve guessed, it’s used to show internal links in a type-safe way. Now that we’ve removed the aDomId references, things are compiling, but it’s important to realize that had we added this link to CheckoutR in here before actually adding that route to our app, we’d get a similar compiler error. No more dead links in your application; any URLs that don’t resolve will immediately show up as compiler errors.

If we had a route as mentioned before for users (“/users/#UserId”) we’d have to use something like @{UsersR aUserId} and the compiler would infer and enforce that aUserId is, in fact, a UserId.

There is a lot of functionality in Hamlet templates, some of which we’ll get to when we build out our next page. What you can do right now is refresh your browser and see your changes.

[screenshot: the updated homepage]

Forms

Let’s head back to Handler/Checkout.hs. We’re going to add a very simple form where the user can pick the size of their lemonade and checkout.

First we’ll declare a form:

lemonadeForm :: Form Lemonade
lemonadeForm = renderTable $ Lemonade
    <$> pure Nothing
    <*> areq (selectField optionsEnum) "Size" Nothing
    <*> pure 0.0
    <*> areq intField "Quantity" Nothing

There are a few things going on here worth looking at. First of all, each line represents a field of the Lemonade data type. When shown, this form will have fields according to what’s listed and map those values back to a value of type Lemonade when the form is processed. The lines that use pure provide values when processed, but don’t actually show any fields.

We’re going to cheat here and completely ignore Price. Dealing with dependent fields (setting price based on size, for example) can get tricky, so we’re just going to set the price server-side after the size and quantity have been submitted.

Before we can test out this form, there’s one thing we need to change about our Foundation.hs. We’re going to use the function requireAuthId to force users to authenticate before checking out. This function also gives us the Id of the current user.

To allow this, we’ve got to change the module exports of Foundation.hs like so:

module Foundation
    ( App (..)
    , Route (..)
    , AppMessage (..)
    , resourcesApp
    , Handler
    , Widget
    , Form
    , maybeAuth
    , requireAuth
    , requireAuthId -- <- add this
    , module Settings
    , module Model
    ) where

With that in place, we can sketch out the Handler now:

getCheckoutR :: Handler RepHtml
getCheckoutR = do
    -- force authentication and tell us who they are
    uid <- requireAuthId

    -- run the defined form. give us a result, the html and an encoding 
    -- type
    ((res,form), enctype) <- runFormPost $ lemonadeForm

    case res of
        -- if a form was posted we get a Lemonade
        FormSuccess l -> do
            -- process it and give us the order id
            oid <- processOrder uid l

            -- TODO: redirect to Thank You page here
            return ()

        -- in all other cases just "fall through"
        _ -> return ()

    -- and display the page
    defaultLayout $ do
        setTitle "Checkout"
        $(widgetFile "checkout")

postCheckoutR :: Handler RepHtml
postCheckoutR = getCheckoutR

processOrder :: UserId -> Lemonade -> Handler OrderId
processOrder = undefined

When requireAuthId is encountered for an unauthenticated user, they will be redirected to login. The scaffold site uses the GoogleEmail plugin, which allows users to log in with their Gmail accounts via OpenID. This authentication system can of course be changed, extended, or removed, but we’re going to just use it as is.

We’re also using a common idiom here: the same Handler handles both GET and POST requests. In the case of a GET, the form result (res) will be FormMissing, that case statement will fall through and the form will be displayed. In the case of a POST, the form result will be FormSuccess, we’ll execute processOrder (which we’ve left undefined for now) and redirect to a “Thank You” page.

Additionally, if there were errors in the parameters, the result would be FormFailure, which is handled the same way as a GET (fall through to displaying the form), except this time the form’s HTML will include those errors so they’re visible to the user to correct and resubmit.

Upon saving this, we should have another compiler error. We’ve told yesod to look for “checkout” templates, but there are none. So let’s create just “templates/checkout.hamlet”:

<h1>Checkout
<p>What size lemonade would you like?
<form enctype="#{enctype}" method="post">
  <table>
    ^{form}
    <tr>
      <td>&nbsp;
      <td>
        <button type="submit">Checkout

Simple variable interpolation is done via #{ }, while embedding one template (like form) into another is done via ^{ }.

[screenshot: the checkout form]

Now that we’ve got the form showing, we can replace our undefined business logic with some actual updates:

-- | Take a constructed Lemonade and store it as part of a new order in 
--   the database, return the id of the created order.
processOrder :: UserId -> Lemonade -> Handler OrderId
processOrder uid l = runDB $ do
    oid <- insert $ Order uid
    _   <- insert $ l { lemonadeOrder = Just oid
                      , lemonadePrice = priceForSize $ lemonadeSize l
                      }

    return oid

    where
        priceForSize :: Size -> Price
        priceForSize Small  = 0.99
        priceForSize Medium = 1.99
        priceForSize Large  = 2.99

Make sure that compiles, then add in the actual redirect:

getCheckoutR :: Handler RepHtml
getCheckoutR = do
    uid <- requireAuthId

    ((res,form), enctype) <- runFormPost $ lemonadeForm

    case res of
        FormSuccess l -> do
            oid <- processOrder uid l

            -- redirect to a "Thank You" page which takes an order id as 
            -- a parameter.
            redirect $ ThankYouR oid

        _ -> return ()

    defaultLayout $ do
        setTitle "Checkout"
        $(widgetFile "checkout")

Hopefully, you’ve noticed the compiler error this introduces. Can you guess how to fix it?

We’ve told our Application to redirect to ThankYouR but that route does not exist. Again, no runtime error, just a clear compiler error.

So, follow the advice of the compiler and add the route declaration to config/routes:

/thank_you/#OrderId ThankYouR GET

Again we get the expected compiler error that getThankYouR is not in scope.

In the interest of time and variety, we’ll not create an entirely different module or template for the Thank You page. We’ll inline everything right here in Handler/Checkout.hs:

getThankYouR :: OrderId -> Handler RepHtml
getThankYouR oid = defaultLayout $ do
    setTitle "Thanks!"

    [whamlet|
        <h1>Thank You!
        <p>Your order is ##{toPathPiece oid}
        |]

[screenshot: the Thank You page]

Conclusion

Obviously, The Lemonstand is quite lacking. The user never gets to see price, there’s no concept of buying multiple Lemonades of varying Sizes, and the overall UI/UX is pretty terrible.

These are all things that can be fixed, but this article is already getting quite long, so I’ll have to leave them for another time. Hopefully you’ve seen a good enough mix of theory and practice to agree that there are benefits to working on web applications (or any software) in a purely functional language like Haskell.

01 Nov 2012, tagged with haskell, yesod, published

Deploying Yesod Apps On Heroku

Update: This post describes compiling a Yesod application locally, using a VM to achieve the compilation on a Heroku-like machine, then pushing the binary up to Heroku to be run. This is an annoying route which is no longer necessary (as mentioned in the comments), so don’t follow it. Instead, follow this guide.

The following are the steps I followed to get a non-trivial Yesod application running on Heroku.

This guide assumes you know what Heroku is, you’ve got the Toolbelt installed, and your ssh keys are set up. The wiki I followed can be found here. The Heroku “Getting started” guides were also very useful.

Setup

Create an app for your project:

$ heroku apps:create

Add a Procfile

$ cat Procfile
web: ./myapp production -p $PORT

Add a package.json

$ cat package.json
{
  "name"         : "myapp",
  "version"      : "0.0.0",
  "dependencies" : {}
}

The package.json file tricks Heroku into running us as if we were a node.js app which really just means executing the command in the Procfile.

PostgreSQL

Add the add-on:

$ heroku addons:add heroku-postgresql

Then “promote” your database, whatever that means…

$ heroku pg:info
=== HEROKU_POSTGRESQL_ORANGE_URL (DATABASE_URL)
Plan:        Dev
Status:      available
Connections: 0
PG Version:  9.1.6
Created:     2012-10-20 02:28 UTC
Data Size:   6.1 MB
Tables:      0
Rows:        0/10000 (In compliance)
Fork/Follow: Unavailable

$ heroku pg:promote ORANGE

Grab the credentials information:

$ heroku pg:credentials ORANGE
Connection info string:
   "dbname=dfh57p6tk1gqbl 
   host=ec2-54-243-228-169.compute-1.amazonaws.com port=5432 user=yphlhbhmzthocg password=4KX6f7tENj2YaAh43vWoCqfMAo sslmode=require"

And translate them into your postgresql.yml:

Production:
  <<: *defaults
  user: yphlhbhmzthocg
  password: 4KX6f7tENj2YaAh43vWoCqfMAo
  host: ec2-54-243-228-169.compute-1.amazonaws.com
  port: 5432
  database: dfh57p6tk1gqbl
  sslmode: require

As mentioned in the comments, putting credentials for a world-reachable database into publicly shared source code is a Bad Idea. In my case, the applications I place on Heroku are throw away prototypes for which this lack of security is perfectly acceptable.

Please consider carefully your own security needs.

This (untested) gist may work for pulling the database credentials from the environment.

Build

If your local hardware doesn’t match Heroku’s, you’re gonna have a bad time

There is a great pre-packaged Vagrant setup for Haskell floating around Bitbucket, but I found it was a bit broken. I made the needed changes to get it working and the resulting fork is available here.

Add it as a sub directory within your project:

$ git clone https://github.com/pbrisbin/vagrant-haskell ./vagrant

Use it to compile your binary:

$ cd ./vagrant
$ vagrant up
$ vagrant ssh
[guest]$ cabal update
[guest]$ cabal install cabal-install
[guest]$ cd /app
[guest]$ cabal install --only-dependencies
[guest]$ cabal configure -fproduction
[guest]$ cabal build

These steps will take a long time the first time around because you’re compiling GHC, the Haskell Platform, then installing all your Yesod dependencies. As long as you don’t destroy the VM, subsequent rebuilds won’t have to repeat those steps.

Deploy

I keep dist out of version control, so I just move the binary up to top-level and commit it there:

$ cp dist/build/myapp/myapp .
$ git add ./myapp
$ git commit -m 'add binary'

Deploy to Heroku:

$ git push heroku master

Read the output from the push then go view your site.

If you get an Application error when viewing your freshly deployed site, you can check to see what’s wrong via heroku logs. I direct you back to the original wiki for some troubleshooting tips.

Pushing to Heroku requires you to set up SSH keys (like any hosting service should). When you initially heroku login, it will look for an existing key and use it or create a default id_rsa.pub for you.

I actually prefer to have separate per-service keys (id_rsa.github, id_rsa.nodester, id_rsa.heroku, etc). This lets me use password-less keys for these less-critical logins and still have my main id_rsa be password-protected for logging into my own servers.

So here’s what I do:

$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/patrick/.ssh/id_rsa): /home/patrick/.ssh/id_rsa.heroku
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
$ heroku keys:clear
$ heroku keys:add /home/patrick/.ssh/id_rsa.heroku.pub

Then add the following to ~/.ssh/config:

Host heroku.com
  IdentityFile ~/.ssh/id_rsa.heroku

I have similar entries for the other services I mentioned.

Automate

Now that you’ve gone through this process once, you should automate it via a simple deploy script:

#!/bin/bash -ex
(
  cd ./vagrant

  vagrant up
  vagrant ssh -c 'cd /app &&
                  cabal clean &&
                  cabal configure -fproduction &&
                  cabal build'
)

cp dist/build/myapp/myapp .
git add ./myapp
git commit -m 'new binary'

git push heroku master

Bonus: DNS

If you own a domain which you would like to point to this Heroku app, the easiest way I’ve found is to use the Zerigo DNS add-on:

$ heroku addons:add zerigo_dns:basic

The add-on is free, but they do require you verify your account and add billing information to install it.

Update your Domain Registrar to use their name servers:

a.ns.zerigo.net
b.ns.zerigo.net

And add your domain

$ heroku domains:add mydomain.com

Wait for DNS to propagate, and you’re done.

20 Oct 2012, tagged with yesod, website, ops, cloud

Yesod Deployments with Keter

Keter is Michael Snoyman’s answer to the Yesod deployment “problem”. This is something I’ve been looking for a good solution to for some time now and keter does a really great job.

Keter is meant to run as a service. It sets up its own workspace and watches for keter “bundles” to be dropped into an incoming/ directory. A keter bundle is just a gzipped tarball of your app’s executable, any support files and directories it might expect to have in its current working directory, and a configuration file to tell keter a few simple things about how to manage it.

When a bundle is found in incoming/, keter will:

  • Create a folder to run the app in

  • Run the configured executable for the app, setting environment variables so it listens on a random port

  • Capture the process’s output to a log in the keter workspace

  • Set up an nginx virtual host entry which reverse proxies to the app, and tell nginx to reload

  • If configured to do so, set up the postgresql database that your app needs

Keter also manages multiple versions of the same app through a zero-downtime deployment. It will bring up a new version in its own folder and wait until the current version is done serving requests before sending it a SIGTERM and removing its folder.

Though this guide will focus on getting a yesod application deploying with keter on Arch Linux, I’ll try to explain things in a general enough way that you can get this going on any distro.

This guide also assumes you’ve got postgresql setup and working and will manage it outside of keter. Basically, you’ve already got a running site and a (possibly sub-optimal) method of deployment – I’m going to cover the transition to a keter-based approach.

Keter

First of all, install keter. At the time of this writing, we need to run the git version since it contains the code needed to customize the nginx start/reload commands.

$ git clone https://github.com/snoyberg/keter
$ cd keter
$ cabal configure
$ cabal build
$ sudo cp dist/build/keter/keter /usr/bin/keter

The last step is optional, just know that you’ll need the keter binary somewhere in root’s $PATH.

Next, we’ll setup a configuration file to tell keter where to place its working files and how to start and reload nginx.

/etc/keter.yaml:

root: /opt/keter
nginx:
  start:
    - /etc/rc.d/nginx
    - start
  reload:
    - /etc/rc.d/nginx
    - reload

And a file to run keter as a service:

/etc/rc.d/keter:

#!/bin/bash

. /etc/rc.conf
. /etc/rc.d/functions

PIDFILE=/var/run/keter.pid

case "$1" in
  start)
    stat_busy 'Starting Keter'
    keter /etc/keter.yaml &>/dev/null &

    if [[ $? -gt 0 ]]; then
      stat_fail
    else
      # only record the pid once we know the launch didn't fail outright
      echo $! >"$PIDFILE"
      add_daemon keter
      stat_done
    fi
    ;;

  stop)
    stat_busy 'Stopping Keter'
    read -r pid < $PIDFILE

    kill $pid || kill -9 $pid

    if [[ $? -gt 0 ]]; then
      stat_fail
    else
      rm_daemon keter
      stat_done
    fi
    ;;

  restart)
    $0 stop
    sleep 3
    $0 start
    ;;
  *)
    echo "usage: $0 {start|stop|restart}"
esac
exit 0

Don’t start keter just yet, we’ve got a few more steps.

Nginx

If you’ve already got a site being reverse proxied by nginx, that’s good, but it’s likely that keter will accomplish this differently than you’re currently doing it. We’ll manually update our configs to the “keter way” first, so the transition to keter goes as smoothly as possible.

You’ve probably got everything in a single config file; we’re going to modularize by site. Keter will write a server block to /etc/nginx/sites-enabled/keter containing the reverse proxy declaration. There’s no reason we can’t get set up that way now and verify it’s working without keter’s involvement.

/etc/nginx/conf/nginx.conf

user you;
worker_processes 1;

events {
  worker_connections 1024;
}

http {
  # you can run multiple sites by setting up any number of files in 
  # sites-enabled and having each respond to a specific server_name, 
  # your keterized apps will just be one of them.
  include /etc/nginx/sites-enabled/*;
}

/etc/nginx/sites-enabled/keter

server {
    listen 80;
    server_name example.com;
    location / {
       # keter will use a dynamic port in the 4000s; if you let your 
       # current setup use something outside that range you can leave 
       # your current app running when you start keter for the first 
       # time. that way, if it doesn't work, you're no worse off than 
       # you were before.
       proxy_pass http://127.0.0.1:3001;
    }
}

It’s been my experience that starting only the keter service will not also bring up nginx. I’m not sure if this is intended or a bug; just be aware that you need to start the nginx service yourself. Keter only seems to handle sending the reload command on deployments.
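
Under initscripts, that’s simply:

$ sudo /etc/rc.d/nginx start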

Your App

Now we are ready to keterize our app! All it really takes is one additional config file:

config/keter.yaml

exec: ../dist/build/yourapp/yourapp
args:
  - production
host: example.com

I also write a small script to handle the process of building the app and placing the tarball in keter’s incoming directory:

config/deploy

#!/bin/bash -ex

cabal clean
cabal configure
cabal build

strip dist/build/yourapp/yourapp

rm -rf static/tmp/

# you can tar directly into the incoming folder like this, but you 
# need write access to it
tar czfv - dist/build/yourapp/yourapp config static > /opt/keter/incoming/yourapp.keter

# alternatively, if you don't want to give a normal user that access, 
# build the tarball separately and use sudo on the mv:
#
#   tar czfv yourapp.keter dist/build/yourapp/yourapp config static
#   sudo mv yourapp.keter /opt/keter/incoming/

Try it

Finally, let’s try it out:

# Start the keter service:
sudo /etc/rc.d/keter start

# Tail the log in a separate terminal so you can see any problems
tail -f /opt/keter/log/keter/current.log

# Deploy!
./config/deploy

You should see output like the following in the tailing terminal:

2012-06-01 14:42:07.85: Unpacking bundle '/opt/keter/incoming/yourapp.keter' into folder: /opt/keter/temp/yourapp-0
2012-06-01 14:42:08.54: Created process: config/../dist/build/yourapp/yourapp
2012-06-01 14:42:10.55: App finished reloading: yourapp

And /etc/nginx/sites-enabled/keter should’ve been overwritten with something like:

server {
    listen 80;
    server_name example.com;
    location / {
       proxy_pass http://127.0.0.1:4003;
       proxy_set_header X-Real-IP $remote_addr;
    }
}

Make sure your site’s still working and you’re all set!

At this point you can kill off any old version you might’ve had running and go on developing and deploying at will simply by dropping new keter bundles.

Systemd

If you’ve made the switch to systemd, there are only a few differences compared to above.

First of all, change the keter config file to use the newer commands:

/etc/keter.yaml:

root: /opt/keter
nginx:
  start:
    - systemctl
    - start
    - nginx.service
  reload:
    - systemctl
    - reload
    - nginx.service

Secondly, rather than creating an rc.d file, create a (much simpler) service file:

/etc/systemd/system/keter.service

[Unit]
Description=Keter Deployment Handler
After=local-fs.target network.target

[Service]
ExecStart=/usr/bin/keter /etc/keter.yaml

[Install]
WantedBy=multi-user.target

Recently, a post of mine made it to the front page of Hacker News and I was bombarded with traffic for about 5 hours. Aside from the general network slowness of serving from behind a residential Comcast IP, the site held up surprisingly well. CPU and Memory were no issue. One problem I did run into, however, was file handles.

Turns out, systemd limits any service it manages to 4096 file handles by default. So, if you expect to get decent traffic, it can be a good idea to increase this. Adding LimitNOFILE=<number> to the [Service] block above does the trick. The special value infinity is also available.
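
For example, assuming the unit file above (16384 is an arbitrary number, chosen here purely for illustration):

[Service]
ExecStart=/usr/bin/keter /etc/keter.yaml
LimitNOFILE=16384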

Finally, use the following to start the service and enable it at boot.

# systemctl start keter.service
# systemctl enable keter.service

Benefits

There are a couple of non-obvious benefits to the keter system:

  1. It works equally well for local or remote servers

If you’re deploying your app to a remote server, just (have keter running there and) change your deployment script to end with:

tar czfv - dist/build/yourapp/yourapp config static |\
  ssh example.com 'cat > ~/keter/incoming/yourapp.keter'

  2. It works equally well for non-yesod apps too

The only real requirement is that the executable respect the $PORT environment variable when choosing how to listen (see the sketch just after this list). This is becoming an increasingly popular pattern with hosted solutions like Heroku and nodester, so any well-behaved app should probably do this anyway.

Besides that, you’ve just got to make a proper bundle: a config/keter.yaml, your executable and any other files or directories your app expects to have present in its current directory.
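
Here’s a minimal sketch of such a well-behaved executable, written against today’s wai and warp (the 2012-era signatures differed slightly):

{-# LANGUAGE OverloadedStrings #-}

import Network.HTTP.Types (status200)
import Network.Wai (Application, responseLBS)
import Network.Wai.Handler.Warp (run)
import System.Environment (getEnv)

-- answer everything with a plain 200; the interesting part is main
app :: Application
app _request respond =
    respond $ responseLBS status200 [("Content-Type", "text/plain")] "ok"

main :: IO ()
main = do
    -- keter sets $PORT for each app it starts
    port <- read <$> getEnv "PORT"
    run port app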

Downsides

Keter is in its early stages of development so it’s not without its failings. Mainly, it’s not very flexible – you’re expected to use the nginx reverse proxy approach with a single executable backend server.

You’re also unable to setup any static file serving tricks at this time (though there is code in Keter to handle it, and I’ve been playing with some ideas in my own fork).

Those issues notwithstanding, I’m still finding the approach incredibly streamlined and useful for both my local deployments of pbrisbin.com and some remote servers I deploy to. I was able to ditch a number of scattered service files and bash scripts that had been cobbled together to fill this gap.

Well done Michael.

01 Jun 2012, tagged with haskell, yesod, keter

Live Search

Note: this post describes a system for searching posts which once appeared on this site. It was removed in a fit of simplification. Please see Google’s site: keyword for any searching needs.

I’ve had some fun recently, adding full-text search support to the posts on the site to try and make a simple-but-still-useful archive.

I’d like to post a bit about the feature and how it works. It’s got a few moving parts so I’m going to break it up a bit.

This post will focus on the backend, setting up sphinx, providing content to it from a yesod application, and executing a search from within a handler. The second post will go into the front-end javascript that I implemented for a pretty simple but effective search-as-you-type interface.

For the full context, including required imports and supporting packages, please see this feature in the wild.

Sphinx

Sphinx is a full-text search tool. This assumes you’ve got some concept of “documents” hanging around with lots of content you want to search through by keyword.

What sphinx does is let you define a source – a way to get at all of the content you have in a digestible format. It will then consume all that content and build an index which you can search very efficiently returning a list of Ids. You can then use those Ids to display the results to your users.

There are other aspects re: weighting and attributes, but I’m not going to go into that here.

The first thing you need to do (after installing sphinx) is to get your content into a sphinx-index.

If the complete text you’ll be searching actually lives in your database, sphinx can natively pull from mysql or postgresql. In my case, the content is stored on disk in markdown files. For such a scenario, sphinx allows an “xmlpipe” source.

What this means is that you provide sphinx with a command to fetch an xml document containing the content it should index.

Now, if you’ve got a large amount of content, you’re going to want to use clever conduit/enumerator tricks to stream the xml to the indexer in constant memory. That’s what’s being done in this example. I’m doing something a little bit more naive – for two reasons:

  1. I need to break out into IO to get the content. This is difficult from within a lifted conduit Monad, etc.
  2. I don’t have that much shit – the thing indexes in almost no time and using almost no memory even with this naive approach.

So, here’s the simple way:

getSearchXmlR :: Handler RepXml
getSearchXmlR = do
    -- select all posts
    posts <- runDB $ selectList [] []

    -- convert each post into an xml block
    blocks <- liftIO $ forM posts $ \post ->
        docBlock (entityKey post) (entityVal post)

    -- concat those blocks together to one xml document
    fmap RepXml $ htmlToContent $ mconcat blocks

    where
        htmlToContent :: Html -> Handler Content
        htmlToContent = hamletToContent . const

docBlock :: PostId -> Post -> IO Html
docBlock pid post = do
    let file = pandocFile $ postSlug post

    -- content is kept in markdown files on disk, if the file can't be 
    -- found, try to use the in-db description, else just give up.
    exists <- doesFileExist file
    mkd    <- case (exists, postDescr post) of
        (True, _         ) -> markdownFromFile file
        (_   , Just descr) -> return descr
        _                  -> return $ Markdown "nothing?"

    return $
        -- this is the simple document structure expected by sphinx's 
        -- "xmlpipe" source
        [xshamlet|
            <document>
                <id>#{toPathPiece pid}
                <title>#{postTitle post}
                <body>#{markdownToText mkd}
            |]

    where
        markdownToText :: Markdown -> Text
        markdownToText (Markdown s) = T.pack s

With this route in place, a sphinx source can be setup like the following:

source pbrisbin-src
{
    type            = xmlpipe
    xmlpipe_command = curl http://localhost:3001/search/xmlpipe
}

index pbrisbin-idx
{
    source       = pbrisbin-src
    path         = /var/lib/sphinx/data/pbrisbin
    docinfo      = extern
    charset_type = utf-8
}

Notice how I actually hit localhost? Since pbrisbin.com is reverse proxied via nginx to 3 warp instances running on 3001 through 3003 there’s no need to go out to the internet, dns, and back through nginx – I can just hit the backend directly.

With that setup, we can do a test search to make sure all is well:

$ sphinx-indexer --all # setup the index, ensure no errors
$ sphinx-search mutt
Sphinx 2.1.0-id64-dev (r3051)
Copyright (c) 2001-2011, Andrew Aksyonoff
Copyright (c) 2008-2011, Sphinx Technologies Inc 
(http://sphinxsearch.com)

using config file '/etc/sphinx/sphinx.conf'...
index 'pbrisbin-idx': query 'mutt ': returned 6 matches of 6 total in 
0.000 sec

displaying matches:
1. document=55, weight=2744, gid=1, ts=Wed Dec 31 19:00:01 1969
2. document=62, weight=2728, gid=1, ts=Wed Dec 31 19:00:01 1969
3. document=73, weight=1736, gid=1, ts=Wed Dec 31 19:00:01 1969
4. document=68, weight=1720, gid=1, ts=Wed Dec 31 19:00:01 1969
5. document=56, weight=1691, gid=1, ts=Wed Dec 31 19:00:01 1969
6. document=57, weight=1655, gid=1, ts=Wed Dec 31 19:00:01 1969

words:
1. 'mutt': 6 documents, 103 hits

Sweet.

Haskell

Now we need to be able to execute these searches from haskell. This part is actually going to be split up into two sub-parts: first, the interface to sphinx which returns a list of SearchResults for a given query, and second, the handler to return JSON search results to some abstract client.

I’ve started to get used to the following “design pattern” with my yesod sites:

Keep Handlers as small as possible.

I mean no bigger than this:

getFooR :: Handler RepHtml
getFooR = do
    things      <- getYourThings

    otherThings <- doRouteSpecificStuffTo things

    defaultLayout $ do
        setTitle "..."
        $(widgetFile "...")

And that’s it. Some of my handlers break this rule, but many of them fell into it accidentally. I’ll be going through and trying to enforce it throughout my codebase soon.

For this reason, I’ve come to love per-handler helpers. Tuck all that business logic into a per-handler or per-model (which often means the same thing) helper and export a few smartly named functions to call from within that skinny handler.

Anyway, I digress – Here’s the sphinx interface implemented as Helpers.Search leveraging gweber’s great sphinx package:

The helper below actually violates my second “design pattern”: Keep Helpers generic. It could be generalized away from anything app-specific by simply passing a few extra arguments around. You can see a more generic example here.

sport :: Int
sport = 9312

index :: String
index = "pbrisbin-idx"

-- here's what I want returned to my Handler
data SearchResult = SearchResult
    { resultSlug    :: Text
    , resultTitle   :: Text
    , resultExcerpt :: Text
    }

-- and here's how I'll get it:
executeSearch :: Text -> Handler [SearchResult]
executeSearch text = do
    res <- liftIO $ query config index (T.unpack text)

    case res of
        Ok sres -> do
            let pids = map (Key . PersistInt64 . documentId) $ matches sres

            posts <- runDB $ selectList [PostId <-. pids] []

            forM posts $ \(Entity _ post) -> do
                excerpt <- liftIO $ do
                    context <- do
                        let file = pandocFile $ postSlug post

                        exists <- doesFileExist file
                        mkd    <- case (exists, postDescr post) of
                            (True, _         ) -> markdownFromFile file
                            (_   , Just descr) -> return descr
                            _                  -> return $ Markdown "nothing?"

                        return $ markdownToString mkd

                    buildExcerpt context (T.unpack text)

                return $ SearchResult
                            { resultSlug    = postSlug post
                            , resultTitle   = postTitle post
                            , resultExcerpt = excerpt
                            }

        _ -> return []

    where
        markdownToString :: Markdown -> String
        markdownToString (Markdown s) = s

        config :: Configuration
        config = defaultConfig
            { port   = sport
            , mode   = Any
            }

-- sphinx can also build excerpts. it doesn't do this as part of the 
-- search itself but once you have your results and some context, you 
-- can ask sphinx to do it after the fact, as I do above.
buildExcerpt :: String -- ^ context
             -> String -- ^ search string
             -> IO Text
buildExcerpt context qstring = do
    excerpt <- buildExcerpts config [concatMap escapeChar context] index qstring
    return $ case excerpt of
        Ok bss -> T.pack $ C8.unpack $ L.concat bss
        _      -> ""

    where
        config :: E.ExcerptConfiguration
        config = E.altConfig { E.port = sport }

        escapeChar :: Char -> String
        escapeChar '<' = "&lt;"
        escapeChar '>' = "&gt;"
        escapeChar '&' = "&amp;"
        escapeChar c   = [c]

OK, so now that I have a nice clean executeSearch which I don’t have to think about, I can implement a JSON route to actually be used by clients:

getSearchR :: Text -> Handler RepJson
getSearchR qstring = do
    results <- executeSearch qstring

    objects <- forM results $ \result ->
        return $ object [ ("slug"   , resultSlug    result)
                        , ("title"  , resultTitle   result)
                        , ("excerpt", resultExcerpt result)
                        ]

    jsonToRepJson $ array objects

Gotta love that skinny handler, does its structure look familiar?
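
For reference, the payload a client gets back is an array shaped roughly like this (values invented for illustration; sphinx wraps matches in <b> tags by default):

[ { "slug": "some-post"
  , "title": "Some Post"
  , "excerpt": "... use <b>mutt</b> to read mail ..."
  }
]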

In the next post, I’ll give you the javascript that consumes this, creating the search-as-you-type interface you see on the Archives page.

29 Jan 2012, tagged with haskell, website, yesod

UI Refresh

Astute readers may have noticed, the site looks a little bit different today. I know, it’s tough to discern, but if you look closely you might see… It’s now dark on light!

This is actually a small, tangential change that I made as part of a sweeping upgrade and cleanup effort. In moving to Yesod 0.10 (the 1.0 release candidate), I decided to take an axe to some of the bloatier areas of the site.

After dropping a few pounds in backend logic, I decided to keep going and attack the css as well – and by attack, I mean drop entirely.

Believe it or not just about all styling on the site is now coming from twitter’s awesome bootstrap framework.

Breadcrumbs, notices, login dropdowns, general forms, and sweet tables all without a line of styling by me.

The change does make the site less-than-great on less-than-wide monitors, but I’m not sure how many people are viewing this on mobile devices, etc. We’ll see if I need to bring back my @media queries.

Update: Bootstrap 2.0 brings a “responsive” grid, so now the site looks pretty good on just about any device.

I should be posting more in the coming weeks on some of the specific changes as well a new search feature I’m hoping to roll out soon, but I figured such a noticeable visual change should have an accompanying post… So, there it was.

27 Jan 2012, tagged with website, yesod

Lighttpd Reverse Proxy

This site was previously served via lighttpd using fastcgi. My haskell source was compiled using Network.Wai.Handler.FastCGI and the binary was placed at /srv/http/app.cgi to be handled by lighttpd’s mod_fastcgi.

I decided to switch it up and let Warp serve the haskell app directly, then proxy certain urls through to it via lighttpd.

This how-to will outline the steps needed to get this set up and comment a little bit on what all the moving parts do.

This guide assumes your code is structured roughly like the 0.9.1 scaffolder’s, and that your Application.hs exports the withYourApp function used by main.hs, which is compiled into a binary and executed.

My application is called “DevSite” (I don’t know why), so anywhere you see that in this guide, just assume I mean your foundation type / app name.

Why

Compiling to fastcgi was starting to feel kind of icky. Warp is all grown up now and capable of serving content mighty quickly. Often a problem with my app would result in lighttpd silently failing and leaving troublesome pid files around.

It’s nicer to let that front-facing server sit there running, none the wiser that I’m constantly developing and recompiling the app that it’s forwarding to. Installing and starting a compiled Warp binary will give me greater feedback in the event something goes awry.

Fortunately, I already had the backbone of url-rewriting going on to get requests to app.cgi so I just needed to update that to pull http traffic from another port on localhost rather than actually call a CGI process to get the response.

Lighttpd 1.5 built the proxy framework with the intention of superseding mod_fastcgi, providing that feature by having you proxy to a fastcgi application the same way you would to another domain. This meant I just had to update my syntax for 1.5; after that it was almost as easy as s/fastcgi/http/ing the config.

There’s also the minor benefit that I no longer need duplicated support files (like client-session key and favicon) between development and production.

The Moving Parts

Lighttpd will rewrite / redirect urls in a few stages:

  1. Certain urls will be handled by lighttpd itself. I like lighttpd for static file serving. It’s got a pretty directory listing, it’s fast, and it makes it super easy to set up a /static/private which enforces simple http-auth for access – very handy.

  2. Everything else will be rewritten (once) to /proxy/$1.

  3. Anything coming in for /proxy/.* (presumably via rewrite) will go to another port on localhost where my Warp server will take over.

Lighttpd can also load-balance over multiple instances of Warp (nifty!).

The Setup

First let’s get lighttpd set up. I’m using mod_proxy_core which is only available in lighttpd-1.5+. If you’re on Arch, you can install aur/lighttpd-svn.

Import some modules:

server.modules = ( "mod_rewrite"
                 , "mod_proxy_core"
                 , "mod_proxy_backend_http"
                 )

Setup the “stage-one” redirects:

# notice that by rewriting to /proxy$1 and not /proxy/$1 we get the 
# desired behavior where / becomes /proxy/ and /what/ever becomes 
# /proxy/what/ever.
url.rewrite-once = ( "^/static.*" => "$0"
                   , "(.*)"       => "/proxy$1"
                   )

Finally, setup the actual proxying:

$HTTP["url"] =~ "^/proxy.*" {
  # straight, http pass-through
  proxy-core.protocol        = "http"

  # lighttpd will manage its own queue and send requests to whichever 
  # instance has the shortest queue
  proxy-core.balancer        = "sqf"

  # these are the 5 Warp instances we'll start
  proxy-core.backends        = ( "127.0.0.1:3001"
                               , "127.0.0.1:3002"
                               , "127.0.0.1:3003"
                               , "127.0.0.1:3004"
                               , "127.0.0.1:3005"
                               )

  # strip the /proxy prefix
  proxy-core.rewrite-request = ( "_uri" => ( "^/proxy(.*)" => "$1" ) )
}

Now that we’ve got that going we need to spin up some Warp instances to serve out anything lighttpd redirects from /proxy.

Luckily the scaffolded main.hs allows us to pass a port on the command line, so we’ll just start up a bunch of instances of our app all listening on a different port.
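
If you’re curious, the relevant part of such a main.hs might look like this rough sketch (the flag parsing is hand-rolled for illustration, and withDevSite stands in for whatever your Controller actually exports):

import Data.List (stripPrefix)
import Data.Maybe (fromMaybe, listToMaybe, mapMaybe)
import System.Environment (getArgs)

-- take the first "-p=NNNN" argument, defaulting to 3000
portFrom :: [String] -> Int
portFrom = fromMaybe 3000 . listToMaybe . mapMaybe (fmap read . stripPrefix "-p=")

main :: IO ()
main = do
    port <- portFrom <$> getArgs

    putStrLn $ "listening on port " ++ show port
    -- the real main.hs would end with something like: withDevSite $ run port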

Script It Out

I like to script this process of starting and stopping the multiple Warp instances. To facilitate this, we need to create some support directories alongside your source code:

mkdir -p tmp/{pid,log}

With those in place, feel free to take the following functions and incorporate them into some server management script:

instances='1 2 3 4 5'

start_devsite() {
  local n

  echo 'Starting worker processes...'
  for n in $instances; do
    devsite -p=300$n > tmp/log/$n.log 2> tmp/log/${n}_errors.log &
    echo $! > tmp/pid/$n.pid
  done
}

stop_devsite() {
  local pid n

  echo 'Stopping worker processes...'
  for n in $instances; do
    if [[ -f tmp/pid/$n.pid ]]; then
      read -r pid < tmp/pid/$n.pid
      if [[ -n $pid ]]; then
        kill $pid
        rm tmp/pid/$n.pid
      fi
    fi
  done
}

Once you execute the start function, you should see 5 processes listening on ports 3001 through 3005. Lighttpd is already set up to forward to those apps in a load-balanced way, so go ahead and see if it worked!

10 Sep 2011, tagged with haskell, lighttpd, proxy, website, yesod

Comments

Note: this page describes a custom system for commenting once present on this site. As you can see, it no longer is. Please provide any comments via twitter or email. Thanks.

Recently I decided I no longer like disqus. It was a great hands-off commenting system, but it had its downsides. I decided to make a change.

I have my own commenting system which I wrote as a yesod module. The initial reason I didn’t go with this when I first moved to yesod (and lost my existing homebrew php commenting system) was because it didn’t support authentication. I knew that an unauthenticated “enter comments” box had like a 100 to 1 ratio of spam to real comments.

When I put up rentersreality, I took the time to build authentication into my comments system by natively supporting YesodAuth so that I could use my module there. With that in place, the only thing keeping me back was losing my existing comments (more on that later).

This module really speaks to the flexibility of the framework. With about 8 lines of code I (the end user) was able to add tightly integrated commenting to my yesod site. Comments are stored in my database and users are authenticated using my authentication model. Furthermore, I (the developer) was able to put together a system that integrates this way even with my limited haskell-foo thanks to the openness and robustness of existing modules like yesod-auth and yesod-persistent.

I decided to make a few other changes along with this one. First, I moved the site from sqlite to postgresql. I originally went with sqlite because I was intimidated by postgres. After moving rentersreality to postgres, I realized it was no harder to get set up and offered great benefits in speed, reliability and maintenance tools. Secondly, I had to ditch my simple auth setup (one manually added user) in favor of a more typical auth setup. Now, anyone can authenticate with their open id (to be able to comment) and I just maintain an isAdmin flag manually to allow me to do the me-only stuff.

Why did you do it?

For the sake of completeness here are the Pros and Cons of the recent shift:

Pros:

  • I wrote it.

Since I develop the comments module, I know that I’ll get preferential treatment when it comes to bug fixes and features. Also, it’s awesome.

  • Not javascript.

Pages load faster and usage is clean, pure GET and POST html-forms.

  • Markdown

Comments are parsed by Pandoc the same way my post content itself is. This means that you have the full expressive power of this awesome markdown system (any non dangerous html is possible plus syntax highlighting and other nice features).

  • Integration

Both visually and architecturally, the comments are deeply integrated into my site. This is in spite of the code itself being completely modularized and reusable anywhere. Yay haskell.

Cons:

  • I lose all existing comments

I have a plan for this.

  • No Quote, or Reply functionality.

I kind of sidestepped the whole javascript edit-in-place usage scenario and instead created a full-blown management sub-site for commenting users. By following the “comments” link in my sidebar you can see all the comments you’ve left on this site and edit/delete them at will. It’s a really clean method which provides a lot of functionality while being a) javascript-free and b) still completely modularized out of my specific site and reusable anywhere.

Quote could be done with some javascript to just populate the box with a markdown blockquote for you. Reply (and the implied threading) would require a re-engineering that I might not be willing to go through for a while.

  • No notifications or rss services

I’m not sure how much of a use there is for this on a site like mine but with the new administration sub-site it would be almost trivial to add this functionality – maybe I’ll do this soon.

To existing commenters

If you’ve commented on this site before I want to restore your comments, but I need your help.

What I need you to do is go ahead and login once, choose a username, and email it to me along with the disqus account you commented on previously.

If you commented on the old-old system, I could still restore your comments, you’ll just have to decide the best way to let me know what comments they were (the username you used or the thread/nature of the comments, etc).

With this information, I’ll be able to reinstate your comments and link them to your new identifier on the site.

I hope you’ll help me with this; but if not, I understand.

Play around

So go ahead and use this page to try out the commenting system. See what kind of markdown results in what.

If you want any comments (here or on other pages) edited or removed, I can always be reached by email. I don’t mind running a quick sql statement on your behalf.

Let me know of any bugs you find but don’t worry about css-fails, those should get fixed almost immediately (I just need the content present to find them).

08 Jul 2011, tagged with haskell, website, yesod

Anatomy of a Yesod Application

subtitle: how to stay sane when developing for the web in haskell

This post was originally about how I structure my Yesod applications and where it differs from the scaffold tool. I’ve since done a bit of a 180 and started to really like the scaffold tool and its structure.

For that reason, I’ve rewritten the post to outline that structure, its benefits and some strategies I use above what it provides to develop and deploy Yesod applications.

Note that this information is 0.8-specific and with the 0.9 and 1.0 versions of Yesod, this post will be obsolete (until I update it again).

I recently had the chance to do a coding exercise for a job interview. Due to some misunderstandings of the task on my part I had to do it twice. I’m still embarrassed about it, but I did end up getting the job so all’s well that ends well…

Anyway, the cool thing about doing it twice is that the first time, I did it in Rails and the second time in Yesod and that gave me a chance to evaluate the two frameworks in somewhat of a side-by-side.

I’m not going to get into a big discussion on the pros and cons of each one in general – though I will say they both excelled at staying out of my way and getting me a base for the exercise very quickly. What I’d rather talk about is structure.

During my brief time hacking in Rails, I quickly grew to like the “convention over configuration” philosophy. If you create a model/foo and a controller/foo and a test/foo, then foo itself just magically works. I liked it.

I hadn’t used the Yesod scaffold tool since about 0.1, but I knew I needed to get a site up quickly so I decided to use it for this exercise. I found that the new structure was very organized and well thought out. It gave me a similar convention-driven feeling, create hamlet/foo and cassius/foo then widgetFile foo would just work.

I think the framework could use a bit more of this approach to make the yesod-scaffolding tool as versatile as the ruby one. Mechanisms like widgetFile could be more plentiful and library provided (rather than scaffolded into your Settings.hs).

The yesod scaffold basically built you an “example” site which you can rewrite to your needs. In contrast, the ruby scaffold tool(s) let you say “I have some data structure like X” and it goes and creates a bunch of code to make X work. You’re obviously still going to rewrite a lot of the generated code, but it’s not a 100% guarantee like with yesod init.

Putting on my yesod-end-user cap (vs the usual yesod-contributor one), I would love to see yesod work more like rails: yesod init should give you a simple status-page site with links to all the documentation (you could still choose tiny, or database-driven and get all that setup at this point too). Then, yesod scaffold --whatever commands could be used to build up a CRUD interface with your actual data types.

Hmm, that turned into a bit of a whine about how rails is better than yesod – that is not my opinion in general. There are tons of reasons I prefer yesod overall, I was just really impressed with rails’ scaffolding abilities.

Scaffold

Enough comparison, let’s talk about yesod as it is now.

The scaffold tool sets up the following basic structure:

/path/to/site
|-- config
|   `-- ...
|-- hamlet
|   `-- ...
|-- cassius
|   `-- ...
|-- julius
|   `-- ...
|-- Handlers
|   `-- ...
|-- Model.hs
|-- YourSite.hs
|-- Controller.hs
`-- yoursite.cabal

config is going to hold your Settings.hs module along with some text files where you define your routes and models. I also like to throw the main executables’ source files in there which I’ll discuss later.

hamlet, cassius, and julius will contain the template files for html, css, and javascript respectively. One awesome new development is the aforementioned function widgetFile, which I use 100% of the time regardless of what templates the page actually calls for. If you write, say, addWidget $(widgetFile "foo"), that will splice in the templates hamlet/foo.hamlet, cassius/foo.cassius, and julius/foo.julius, ignoring any that don’t exist.
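
A typical use, sketched out (getFooR and "foo" are hypothetical):

getFooR :: Handler RepHtml
getFooR = defaultLayout $ do
    setTitle "foo"

    -- pulls in hamlet/foo.hamlet, cassius/foo.cassius and julius/foo.julius,
    -- silently skipping any that don't exist
    addWidget $(widgetFile "foo")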

Model.hs, YourSite.hs, and Controller.hs are pretty self-explanatory, and are either entirely site-dependant or scaffold-generated so I’m not going to discuss them.

One other cool feature of the scaffold is how it sets up the imports and exports in YourSite.hs. It handles all of the major imports (like Yesod itself, etc) and those that need to be qualified (like Settings) and then reexports them as a single clean interface. This means that all of your Handlers, Helpers, etc can just import YourSite and be done with it. Very nice, very clean.
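
The shape of the trick is roughly this (a sketch; the real module also handles the qualified imports like Settings and hides a few clashing names):

module YourSite
    ( module Model
    , module Yesod
    ) where

import Model
import Yesod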

Handlers usually contains one module per route and only defines the route handling functions. I try to keep any support functions in either per-handler or site-wide Helpers.

One last note: do yourself a favor and keep/maintain the generated cabal file. It’s a nice way to prevent breakage (when dependencies are updated) and keep dev vs prod options straight. It’s also nice to keep all the object and interface files hidden under an ignorable dist directory.

Development

For development, I use an easy simple-server approach. The haskell is as follows:

import Controller (withServer)
import System.IO (hPutStrLn, stderr)
import Network.Wai.Middleware.Debug (debug)
import Network.Wai.Handler.Warp (run)

main :: IO ()
main = do
    let port = 3000
    hPutStrLn stderr $ "Application launched, listening on port " ++ show port
    withServer $ run port . debug

I then keep a simple shell script that runs it:

#!/bin/bash -e

touch config/Settings.hs
runhaskell -Wall -iconfig config/simple-server.hs

The touch just ensures that anything set by CPP options is up to date every time I ./devel.

Deployment

By using the cabal file, deployments are pretty easy. I use lighttpd as my server-of-choice (I also let it do the static file serving), so I need to compile to fastcgi.

I keep exactly one copy of any static files (including my main css) and it lives only in the production location. To support this, I define a staticLink function in Settings.hs which is conditional on the PRODUCTION flag.

If I’m developing locally, staticLink "foo" would return http://the-real-domain/static/foo so that the file is linked from its live location. When running in production, that function just returns /static/foo which is what I would actually want in the html.
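
Sketched out, assuming the scaffold’s CPP setup and its PRODUCTION flag:

{-# LANGUAGE CPP #-}

-- in Settings.hs: during development, link to the live static location; 
-- in production, use the relative path the html actually wants
staticLink :: String -> String
#ifdef PRODUCTION
staticLink x = "/static/" ++ x
#else
staticLink x = "http://the-real-domain/static/" ++ x
#endif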

I find this approach is way simpler than any other way I’ve done static file serving.

My cabal file builds an executable from config/mysite.hs which looks like this:

import Controller (withServer)
import Network.Wai.Handler.FastCGI (run)

main :: IO ()
main = withServer run

Then I’ve got another shell script to make deployments a single-command operation:

#!/bin/bash -e

app="${1:-/srv/http/app.cgi}"

sudo true

# this command just adds an auto-incrementing git tag so that if there's 
# some issue, I can just checkout the last tag and redeploy. this 
# completely sidesteps the need to backup the binary itself
deptag

touch config/Settings.hs
cabal install --bindir=./

# cabal will install to ./myapp as defined in the cabal file so we just 
# stop the service and replace the binary
sudo /etc/rc.d/lighttpd stop
sudo mv myapp "$app"
sudo /etc/rc.d/lighttpd start

This approach can be easily extended to a non-local deployment. In the case of rentersreality, the site lives on a slicehost. Its deployment file looks like this:

#!/bin/bash -e

ip="${1:-rentersreality.com}"

deptag # tag deployments

touch config/Settings.hs
cabal install --bindir=./

scp ./renters "$ip":~/

ssh -t "$ip" '
  sudo /etc/rc.d/lighttpd stop        &&
  sudo mv ./renters /srv/http/app.cgi &&
  sudo /etc/rc.d/lighttpd start       &&
  sleep 3
'

I found that after executing the remote command I had to sleep so that the process could detach correctly. Things would end up in a bad state if I disconnected right away.

I must say, since moving to the more structured approach and utilizing cabal install as the main deployment step, I have had far less issues with developing and deploying my apps.

To see two sites that are currently using this structure, just browse the projects on my github.

29 Apr 2011, tagged with haskell, yesod, website