Random Numbers without Mutation

Feb 9, 2014 00:00 · 1909 words · 9 minute read

In lecture 5A of Structure & Interpretation of Computer Programs, Gerald Sussman introduces the idea of assignments, side effects and state. Before that, they had been working entirely in purely functional Lisp which could be completely evaluated and reasoned about using the substitution model. He states repeatedly that this is a horrible thing as it requires a far more complex view of programs. At the end of the lecture, he shows a compelling example of why we must introduce this horrible thing anyway; without it, we cannot decouple parts of our algorithms cleanly and would be reduced to huge single-function programs in some critical cases.

The example chosen in SICP is estimating π using Cesaro’s method. The method states that the probability that any two random numbers’ greatest common divisor equals 1 is itself equal to 6/π².

Since I know Ruby better than Lisp (and I’d venture my readers do too), here’s a ported version:

def estimate_pi(trials)
  p = monte_carlo(trials) { cesaro }

  Math.sqrt(6 / p)
end

def cesaro
  rand.gcd(rand) == 1
end

def monte_carlo(trials, &block)
  iter = ->(trials, passed) do
    if trials == 0
      passed
    else
      if block.call
        iter.call(trials - 1, passed + 1)
      else
        iter.call(trials - 1, passed)
      end
    end
  end

  iter.call(trials, 0) / trials.to_f
end

I’ve written this code to closely match the Lisp version which used a recursive iterator. Unfortunately, this means that any reasonable number of trials will exhaust Ruby’s stack limit.

The code above also assumes a rand function which will return different random integers on each call. To do so, it must employ mutation and hold internal state:

def rand
  @x ||= random_init
  @x   = random_update(@x)

  @x
end

Here I assume the same primitives as Sussman does, though it wouldn’t be difficult to wrap Ruby’s built-in rand to return integers instead of floats. The important thing is that this function needs to hold onto the previously returned random value in order to provide the next.

Sussman states that without this impure rand function, it would be very difficult to decouple the cesaro function from the monte_carlo one. Without utilizing (re)assignment and mutation, we would have to write our estimation function as one giant blob:

def estimate_pi(trials)
  iter = ->(trials, passed, x1, x2) do
    if trials == 0
      passed
    else
      x1_ = rand_update(x2)
      x2_ = rand_update(x1_)

      if x1.gcd(x2) == 1
        iter.call(trials - 1, passed + 1, x1_, x2_)
      else
        iter.call(trials - 1, passed, x1_, x2_)
      end
    end
  end

  x1 = rand_init
  x2 = rand_update(x1)

  p = iter.call(trials, 0, x1, x2) / trials.to_f

  Math.sqrt(6 / p)
end

Ouch.

It’s at this point Sussman stops, content with his justification for adding mutability to Lisp. I’d like to explore a bit further: what if remaining pure were non-negotiable? Are there other ways to make decoupled systems and elegant code without sacrificing purity?

RGen 🔗

Let’s start with a non-mutating random number generator:

class RGen
  def initialize(seed = nil)
    @seed = seed || random_init
  end

  def next
    x = random_update(@seed)

    [x, RGen.new(x)]
  end
end

def rand(g)
  g.next
end

This allows for the following implementation:

def estimate_pi(trials)
  p = monte_carlo(trials) { |g| cesaro(g) }

  Math.sqrt(6 / p)
end

def cesaro(g)
  x1, g1 = rand(g)
  x2, g2 = rand(g1)

  [x1.gcd(x2) == 1, g2]
end

def monte_carlo(trials, &block)
  iter = ->(trials, passed, g) do
    if trials == 0
      passed
    else
      ret, g_ = block.call(g)

      if ret
        iter.call(trials - 1, passed + 1, g_)
      else
        iter.call(trials - 1, passed, g_)
      end
    end
  end

  iter.call(trials, 0, RGen.new) / trials.to_f
end

We’ve moved out of the single monolithic function, which is a step in the right direction. The additional generator arguments being passed all over the place makes for some readability problems though. The reason for that is a missing abstraction; one that’s difficult to model in Ruby. To clean this up further, we’ll need to move to a language where purity was in fact non-negotiable: Haskell.

In Haskell, the type signature of our current monte_carlo function would be:

monteCarlo :: Int                    -- number of trials
           -> (RGen -> (Bool, RGen)) -- the experiment
           -> Double                 -- result

Within monte_carlo, we need to repeatedly call the block with a fresh random number generator. Calling RGen#next gives us an updated generator along with the next random value, but that must happen within the iterator block. In order to get it out again and pass it into the next iteration, we need to return it. This is why cesaro has the type that it does:

cesaro :: RGen -> (Bool, RGen)

cesaro depends on some external state so it accepts it as an argument. It also affects that state so it must return it as part of its return value. monteCarlo is responsible for creating an initial state and “threading” it though repeated calls to the experiment given. Mutable state is “faked” by passing a return value as argument to each computation in turn.

You’ll also notice this is a similar type signature as our rand function:

rand :: RGen -> (Int, RGen)

This similarity and process is a generic concern which has nothing to do with Cesaro’s method or performing Monte Carlo tests. We should be able to leverage the similarities and separate this concern out of our main algorithm. Monadic state allows us to do exactly that.

RGenState 🔗

For the Haskell examples, I’ll be using System.Random.StdGen in place of the RGen class we’ve been working with so far. It is exactly like our RGen class above in that it can be initialized with some seed, and there is a random function with the type StdGen -> (Int, StdGen).

The abstract thing we’re lacking is a way to call those function successively, passing the StdGen returned from one invocation as the argument to the next invocation, all the while being able to access that a (the random integer or experiment outcome) whenever needed. Haskell, has just such an abstraction, it’s in Control.Monad.State.

First we’ll need some imports.

import System.Random
import Control.Monad.State

Notice that we have a handful of functions with similar form.

(StdGen -> (a, StdGen))

What Control.Monad.State provides is a type that looks awfully similar.

data State s a = State { runState :: (s -> (a, s)) }

Let’s declare a type synonym which fixes that s type variable to the state we care about: a random number generator.

type RGenState a = State StdGen a

By replacing the s in State with our StdGen type, we end up with a more concrete type that looks as if we had written this:

data RGenState a = RGenState
    { runState :: (StdGen -> (a, StdGen)) }

And then went on to write all the various instances that make this type useful. By using such a type synonym, we get all those instances and functions for free.

Our first example:

rand :: RGenState Int
rand = state random

We can “evaluate” this action with one of a number of functions provided by the library, all of which require some initial state. runState will literally just execute the function and return the result and the updated state (in case you missed it, it’s just the record accessor for the State type). evalState will execute the function, discard the updated state, and give us only the result. execState will do the inverse: execute the function, discard the result, and give us only the updated state.

We’ll be using evalState exclusively since we don’t care about how the random number generator ends up after these actions, only that it gets updated and passed along the way. Let’s wrap that up in a function that both provides the initial state and evaluates the action.

runRandom :: RGenState a -> a
runRandom f = evalState f (mkStdGen 1)

-- runRandom rand
-- => 7917908265643496962

Unfortunately, the result will be the same every time since we’re using a constant seed. You’ll see soon that this is an easy limitation to address after the fact.

With this bit of glue code in hand, we can re-write our program in a nice modular way without any actual mutable state or re-assignment.

estimatePi :: Int -> Double
estimatePi n = sqrt $ 6 / (monteCarlo n cesaro)

cesaro :: RGenState Bool
cesaro = do
    x1 <- rand
    x2 <- rand

    return $ gcd x1 x2 == 1

monteCarlo :: Int -> RGenState Bool -> Double
monteCarlo trials experiment = runRandom $ do
    outcomes <- replicateM trials experiment

    return $ (length $ filter id outcomes) `divide` trials

  where
    divide :: Int -> Int -> Double
    divide a b = fromIntegral a / fromIntegral b

Even with a constant seed, it works pretty well:

main = print $ estimatePi 1000
-- => 3.149183286488868

And For My Last Trick 🔗

It’s easy to fall into the trap of thinking that Haskell’s type system is limiting in some way. The monteCarlo function above can only work with random-number-based experiments? Pretty weak.

Consider the following refactoring:

estimatePi :: Int -> RGenState Double
estimatePi n = do
  p <- monteCarlo n cesaro

  return $ sqrt (6 / p)

cesaro :: RGenState Bool
cesaro = do
  x1 <- rand
  x2 <- rand

  return $ gcd x1 x2 == 1

monteCarlo :: Monad m => Int -> m Bool -> m Double
monteCarlo trials experiment = do
  outcomes <- replicateM trials experiment

  return $ (length $ filter id outcomes) `divide` trials

  where
    divide :: Int -> Int -> Double
    divide a b = fromIntegral a / fromIntegral b

main :: IO ()
main = print $ runRandom $ estimatePi 1000

The minor change made was moving the call to runRandom all the way up to main. This allows us to pass stateful computations throughout our application without ever caring about that state except at this highest level.

This would make it simple to add true randomness (which requires IO) by replacing the call to runRandom with something that pulls entropy in via IO rather than using mkStdGen.

runTrueRandom :: RGenState a -> IO a
runTrueRandom f = do
    s <- newStdGen

    evalState f s

main = print =<< runTrueRandom (estimatePi 1000)

One could even do this conditionally so that your random-based computations became deterministic during tests.

Another important point here is that monteCarlo can now work with any Monad! This makes perfect sense: The purpose of this function is to run experiments and tally outcomes. The idea of an experiment only makes sense if there’s some outside force which might change the results from run to run, but who cares what that outside force is? Haskell don’t care. Haskell requires we only specify it as far as we need to: it’s some Monad m, nothing more.

This means we can run IO-based experiments via the Monte Carlo method with the same monteCarlo function just by swapping out the monad:

What if Cesaro claimed the probability that the current second is an even number is equal to 6/π²? Seems reasonable, let’s model it:

-- same code, different name / type
estimatePiIO :: Int -> IO Double
estimatePiIO n = do
  p <- monteCarlo n cesaroIO

  return $ sqrt (6 / p)

cesaroIO :: IO Bool
cesaroIO = do
  t <- getCurrentTime

  return $ even $ utcDayTime t

monteCarlo :: Monad m => Int -> m Bool -> m Double
monteCarlo trials experiment = -- doesn't change at all!

main :: IO ()
main = print =<< estimatePiIO 1000

I find the fact that this expressiveness, generality, and polymorphism can share the same space as the strictness and incredible safety of this type system fascinating.

haskell