Beware Never Expectations published on Apr 8, tagged with mocha, rails, ruby

Mocha expectations are incredibly useful for ruby unit testing. You can stub out all kinds of functionality you depend on, specify exactly what values those dependencies return, and validate that the object under test behaves exactly as you want it to, right down to the methods it does or doesn’t call.

Unfortunately, I’ve bumped up against one glaring case where this can get you into trouble. To make matters worse, the symptom of this situation is that your tests just always pass.

Expects never

Take a simple class like this:

class Foo
  def a_method
    another_method

  rescue => ex
    logger.error("#{ex}")
  end

  # ...

end

Say we want to update it so that another_method is only called if some condition is met.

Let’s play hardcore TDD here; we’ll write and run the test first:

class FooTest < Test::Unit::TestCase
  def test_a_method
    foo = Foo.new

    # condition met
    foo.stubs(:some_condition?).returns(true)

    # should call it
    foo.expects(:another_method).once
    foo.a_method

    # condition not met
    foo.stubs(:some_condition?).returns(false)

    # should not call it
    foo.expects(:another_method).never
    foo.a_method
  end
end

Simple, easy to follow – the test should absolutely fail with “Unexpected invocation” on that second call due to the never expectation you’ve set. There’s no reason to run this test now, right? You know it’s going to fail, right?

You run the test anyway, and it… Passes. Um, wat?

I know what you might be saying here: the fix is as simple as a single postfix unless some_condition?. So why am I insisting on figuring this out, wasting time just to see this test fail before I implement?

Well, in actuality I didn’t do things in this order. I wrote the implementation, then ran the test, saw it pass and moved on. It was only later that I regressed, broke the implementation and didn’t find out for a good while because the test never started failing.

Luckily, it hadn’t gone to production, but this scenario makes a strong case for writing and running your tests before your implementations – it’s the only chance you have to ensure your test actually covers what it should.

Anti-rescue

Let me save you the frustration of debugging this. What’s happening here is that when the method gets (incorrectly) called, Mocha raises an ExpectationError which is (by design) promptly rescued and logged.
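To see the swallowing in isolation, here is a minimal sketch that doesn’t need Mocha at all. FakeExpectationError and the hand-rolled singleton method are stand-ins made up for illustration; the only assumption is that the error raised on an unexpected call is something a bare rescue will catch, as described above.

```ruby
# Stand-in for the error a mocking library raises on an unexpected call.
# (Hypothetical class, not Mocha's real one.)
class FakeExpectationError < StandardError; end

class Foo
  attr_reader :logged

  def a_method
    another_method
  rescue => ex
    # A bare rescue catches any StandardError, including the one
    # raised by our fake "never" expectation below.
    (@logged ||= []) << ex.message
  end

  def another_method; end
end

foo = Foo.new

# Poor man's `expects(:another_method).never`: raise when called.
def foo.another_method
  raise FakeExpectationError, "unexpected invocation: another_method"
end

foo.a_method # nothing escapes, so a test wrapped around this still passes
puts foo.logged.inspect
```

The failure ends up in the log instead of in the test runner, which is exactly why the test stays green.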

I’d personally like to see Mocha not use this approach; rather count the number of calls and compare that number against what was expected later outside of your (possibly rescued) logic. This is how not-called-enough is implemented, why not let called-too-much be handled the same way?

There are a couple of ways we can work around this limitation though. One approach could be to re-raise the error when testing:

  # ...

rescue => ex
  logger.error(...)

  raise ex if Rails.env.test?
end

That’s only moderately smelly and might suit you in most cases. In my case, I couldn’t do this because swallowing all errors was by design and (of course) backed up with test coverage, so those tests would start failing if I re-raised in that environment.

That and I hate modifying implementation code specifically to support some testing-related concern.

Another option might be to specifically handle the Mocha exception:

  # ...

rescue Mocha::ExpectationError => ex
  raise ex
rescue => ex
  logger.error(...)
end

That exception class is not in scope when you’re running in production, so that wouldn’t be fun. And I’d be very against requiring the Mocha gem in non-test environments.

Rewrite never

Anyway, here’s the solution we ended up with: redefine the method to increment a class-level counter, then assert that it was never called by checking that counter afterwards.

class FooTest < Test::Unit::TestCase
  def test_a_method
    foo = Foo.new
    foo.stubs(:some_condition?).returns(false)

    assert_not_called(foo, :another_method) do
      foo.a_method
    end
  end

  private

  # Note: not thread-safe
  def assert_not_called(obj, method, &block)
    # set a class level counter
    @@counter = 0

    # redefine the method so, if called, it increments that counter
    obj.instance_eval %{
      def #{method}(*args)
        #{self.class}.instance_eval "@@counter += 1"
      end
    }

    # run your code
    yield

    # see if it was ever called
    assert_equal 0, @@counter, "#{obj}.#{method}: unexpected invocation."
  end
end

Now, do yourself a favor and run this test before you write the implementation. It’s the only way to be sure the test works and regressions will be caught down the line.
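If the class variable and string eval feel heavy, the same counting idea can be sketched with a closure. This is an alternative, not the helper we actually shipped: count_calls is a hypothetical name, and the Foo below is a cut-down version of the class from earlier in the post.

```ruby
# Hypothetical helper: replaces the method on this one object with a
# version that bumps a local counter, runs the block, returns the count.
def count_calls(obj, method_name)
  calls = 0
  obj.define_singleton_method(method_name) { |*args| calls += 1 }
  yield
  calls
end

class Foo
  def a_method
    another_method if some_condition?
  rescue => ex
    # swallow everything, as in the original class
  end

  def another_method; end

  def some_condition?
    false
  end
end

foo = Foo.new
calls = count_calls(foo, :another_method) { foo.a_method }
puts calls
```

Inside a test you would finish with assert_equal 0, calls. Because the count is checked after a_method returns, nothing has to be raised inside the rescued code, so there is nothing for the rescue to swallow.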

Anonymous Classes In Ruby published on Mar 24, tagged with ruby

Oftentimes, I find myself wanting something anonymous. This occurs quite frequently in code when you need to define, pass or call some functionality which is usually very short and only useful in this moment. Many languages provide anonymous functions (usually called lambdas) for this sort of thing: haskell has \x y -> x + y and ruby has lambda {|x,y| x + y}, Proc.new and the new ->(x,y) syntax which I’m actually not very fond of.

Sometimes, in ruby, I find myself wanting an anonymous class for much the same reasons. At first, this seemed like a silly thing to do, so I didn’t expect it to be clean or easy – but in fact, it is. Ruby itself uses anonymous classes for all sorts of things, and the syntax we’ll use to do it is almost comically obvious.

Testing

Sometimes if you’re writing a test for a module, you need to include or extend it into something to accurately test it. Here’s one approach to doing that:

# assume MyModule is defined as a module, which you want to test

class ModuleTest < Test::Unit::TestCase
  def test_the_thing
    klass = MyClass.new

    # assert something about klass now that it's included your module
  end

  private

  class MyClass
    include MyModule

    # ...

  end
end

This is fairly contrived, but I think we all agree that sometimes you need a new class to test something (like modules). Putting in some nested class for the purposes of testing seems fairly appropriate, albeit pretty smelly.

Let’s see how an anonymous class can help:

class ModuleTest < Test::Unit::TestCase
  def test_the_thing
    klass = Class.new do
      include MyModule

      # ...

    end.new

    # assert something about klass
  end
end

Not only is this a bit shorter, but I’d say it’s clearer too now that the object under test is made more prominent.
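Here’s a small runnable sketch of the same idea outside a test case. Greeter is a hypothetical module standing in for MyModule.

```ruby
# Hypothetical module standing in for MyModule.
module Greeter
  def greet
    "hello"
  end
end

# Class.new returns a class that was never assigned to a constant.
obj = Class.new do
  include Greeter
end.new

puts obj.class.name.inspect # an anonymous class has no name
puts obj.is_a?(Greeter)
puts obj.greet
```

The class behaves like any other; it just never gets a constant, so it leaves nothing behind in the test’s namespace.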

Rake tasks

I like to write rake tasks to do useful things. Sometimes one of those tasks wants to move files around. FileUtils is great for this, and it’s best used when mixed into a class.

I won’t bore you with the non-anonymous version, so here’s the one using Class.new, hopefully you can imagine it with more boilerplate:

require 'fileutils'

desc "do the damn thing"
task :run do
  Class.new do
    include FileUtils

    def run!
      mv this, that

      cp here, there

      rm the_thing
    end
  end.new.run!
end

So short!
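For something you can actually run, here’s a sketch of the same pattern against a temporary directory instead of inside a rake task. The file names are made up for illustration.

```ruby
require 'fileutils'
require 'tmpdir'

result = Dir.mktmpdir do |dir|
  this = File.join(dir, 'this.txt')
  that = File.join(dir, 'that.txt')
  File.write(this, 'contents')

  Class.new do
    include FileUtils

    def run!(from, to)
      mv from, to # FileUtils#mv, callable bare thanks to the include
      File.exist?(to) && !File.exist?(from)
    end
  end.new.run!(this, that)
end

puts result
```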

This really speaks to ruby’s flexibility when it comes to “everything is an object” and hopefully illustrates that if you understand the benefits of anonymous functions, why not start thinking about how to use anonymous classes too?

Escalate Your Scripts published on Mar 24, tagged with bash, ruby

Anyone who knows me knows I love the shell. I got my “start” in bash and still have a plethora of scripts lying around doing all sorts of useful and fun things for me. Recently, however, I tackled a task that I had attempted many times in shell script always to be met with frustration. How did I finally figure it out? I made it a rake task and did it in ruby.

Ask me last month what I thought the best tool for this job would’ve been, and 99 times out of 100 I would’ve said “shell script”. But guess what, I couldn’t do it – just never worked out. Now, after having written quite a nice little Rakefile, I can say confidently that I wish I had thought to do this sooner – and I hope I’ll think to do it again.

I want to write about this exercise mainly because I found the process to be quite enjoyable. When I needed to do imperative flow control, call system commands, and move things about the file system, I felt no resistance. More importantly, I could use all of the higher-level features to keep the code clear and clean.

And this is not just praise for ruby (though it does a good job). My recommendation is broader: when presented with a task that makes sense as a shell script, think for a second about whether it could be done in a higher-level language. You might be surprised.

The Problem

I’ve got a repo (as a lot of you probably do) that contains my main dotfiles. It’s a collection of files that are usually scattered throughout my home directory which I’ve centralized into one folder and placed under version control. The normal approach with this is to symlink these files from the central location out into the proper places under $HOME.

I wanted to automate this process. I wanted to be able to set up a new box by cloning this repo and running a single script. After that script completes, I want as much of my environment as is generally applicable to be fully configured.

The challenges here were that not all of the files in the repo made sense on every machine, some required parent directories to exist and, of course, I had to be careful not to clobber anything already present.

Nothing about this is insurmountable; the (albeit self-imposed) challenge is to do it as simply and maintainably as possible.

Objectify

The interesting thing about this script is what parts are higher level and what parts are not. So first, here are all of the higher-level bits with the scriptier parts left out:

require 'fileutils'

module Dotfiles
  def self.each(&block)
    [
      '.xcolors/jasonwryan.xcolors',
      '.xcolors/zenburn.xcolors',
      '.gitconfig',
      '.gitignore',
      '.htoprc',
      '.dir_colors',
      '.Xdefaults',
      '.zshrc',
      '.oh-my-zsh',
      '.screen',
      '.vim'
    ].each do |file|
      yield Dotfile.new(file)
    end
  end

  class Dotfile
    include FileUtils

    attr_reader :dotfile
    attr_accessor :source, :target

    def initialize(dotfile)
      @dotfile = dotfile
      @source  = File.join(pwd, dotfile)
      @target  = File.join(ENV['HOME'], dotfile)
    end

    def install!

      #
      # ...
      #

    end
  end
end

desc "updates all submodules"
task :submodules do

  #
  # ...
  #

end

desc "installs all dotfiles into the proper places"
task :install => [:submodules] do

  #
  # ...
  #

end

task :default => :install

This shows the pattern I most often follow when scripting in ruby (which is very different than programming in ruby): one top-level module to hold any script-wide logic or constants, as well as classes to represent the data you’re working with.

With an overall module and a clean API of classes and methods, you provide yourself a useful set of commands above and beyond the flow control and backtick-interpolation you would normally lean on.

You’ll also notice, in that each method, something I’m calling a Parallel Good Decision. I decided to hardcode the list of dotfile paths relative to the repo. This solved a number of problems that were leading to very smelly code. I could’ve used git ls-files or a normal glob-and-blacklist approach, but simply hardcoding this list allows finer control over what files are linked and if they are treated as files or directories.

Had I made this decision in isolation, it might have been enough for me to get that shell script approach working – but I didn’t. For some reason, only when cleaning up everything else and approaching the problem from a (slightly) higher level did I see that a simple list of relative file paths made the most sense here.

Script It Out

Now that the skeleton-slash-library code is in place, we can fill in the gaps:

require 'fileutils'

module Dotfiles
  def self.each(&block)
    # ... 
  end

  class Dotfile
    # ...

    def install!
      puts "--> installing #{dotfile} as #{target}..."
      # exist? (exists? is gone in newer rubies) is false for a dangling symlink
      if File.exist?(target) || File.symlink?(target)
        if File.symlink?(target)
          rm target, :verbose => true
        else
          mv target, "#{target}.backup", :verbose => true
        end
      end

      ln_s source, target, :verbose => true
    end
  end
end

desc "updates all submodules"
task :submodules do
  unless system('git submodule update --init --recursive')
    raise 'error initializing submodules'
  end
end

desc "installs all dotfiles into the proper places"
task :install => [:submodules] do
  Dotfiles.each(&:install!)

  vimrc = Dotfiles::Dotfile.new('.vimrc')
  vimrc.source = File.join(ENV['HOME'], '.vim', 'vimrc')
  vimrc.install!
end

task :default => :install

The stuff that’s easy is easy, the stuff that’s hard is easier and overall, the code is very clean and maintainable.
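To watch the install logic do its thing without touching a real home directory, here’s a sketch that runs it inside a temp dir. SketchDotfile is a hypothetical, simplified stand-in for Dotfiles::Dotfile: it takes source and target paths directly rather than deriving them from pwd and ENV['HOME'], and it assumes a platform where symlinks are allowed.

```ruby
require 'fileutils'
require 'tmpdir'

# Simplified stand-in for Dotfiles::Dotfile (hypothetical name).
class SketchDotfile
  include FileUtils

  def initialize(source, target)
    @source = source
    @target = target
  end

  def install!
    if File.symlink?(@target)
      rm @target                      # an old link is safe to replace
    elsif File.exist?(@target)
      mv @target, "#{@target}.backup" # a real file gets moved aside
    end

    ln_s @source, @target
  end
end

linked, backup = Dir.mktmpdir do |dir|
  repo = File.join(dir, 'repo')
  home = File.join(dir, 'home')
  FileUtils.mkdir_p([repo, home])

  source = File.join(repo, '.zshrc')
  target = File.join(home, '.zshrc')
  File.write(source, '# from the repo')
  File.write(target, '# already on the machine')

  dotfile = SketchDotfile.new(source, target)
  dotfile.install! # moves the existing file aside, then links
  dotfile.install! # second run only replaces the symlink

  [File.symlink?(target), File.read("#{target}.backup")]
end

puts linked
puts backup
```

Running install! twice is the interesting part: the pre-existing file is backed up once and the link is simply refreshed afterwards.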

Oh, and I guess it’s nice that it works.

Implicit Scope published on Oct 28, tagged with ruby, rails, work

No one can deny that rails likes to do things for you. The term “auto-magically” comes to mind. This can be a blessing and a curse.

For the most part, rails tries to give you “outs” – a few hoops here and there that, if jumped through, will let you do things in different or more manual ways. Sometimes though, it doesn’t.

Find In Batches

One of the many ORM helpers provided by rails is find_in_batches. It will repeatedly query the database with a limit and offset, handing you chunks of records to work through in sequence. Perfect for processing a very large result set in constant memory.

Order.find_in_batches(:batch_size => 10) do |orders|
  orders.length # => 10

  orders.each do |order|

    # yay order!

  end
end

The problem is that any conditions you add to find_in_batches are inherited by any and all SQL performed within its block. This is called “implicit scope” and there’s no way around it.

Why is this an issue? I’m glad you asked, here’s a real life example:

#
# SELECT * from orders
# WHERE orders.status = 'pending'
# LIMIT 0, 10;
#
# adjusting LIMIT each time round
#
Order.find_in_batches(:batch_size => 10,
                      :conditions => "orders.status = 'pending'") do |orders|

  orders.each do |order|
    #
    # UPDATE orders SET orders.status = 'timing_out'
    # WHERE orders.id     = ?
    #   AND orders.status = 'pending'; <-- oh-hey implicit scope
    #
    order.update_attribute(:status, 'timing_out')

    #
    # some long-running logic to actually "time out" the order...
    #

    #
    # UPDATE orders SET orders.status = 'timed_out'
    # WHERE orders.id     = ?
    #   AND orders.status = 'pending';
    #
    order.update_attribute(:status, 'timed_out')
  end
end

Do you see the problem? The second update fails because it can’t find the order due to the implicit scope. The first update was only successful due to coincidence.

Workaround

I would love to find a simple remove_implicit_scope macro that can get around this issue, but it’s just not there.

I even went so far as to put the update logic in a Proc or lambda hoping to bring in a binding without the implicit scope – no joy.

I had to resort to simply not using find_in_batches.

At the time, I just rewrote that piece of the code to use a while true loop. Thinking about it now, I realize I could’ve factored it out into my own find_in_batches; also, I could put it in a module so you can extend it in your model to have the better (IMO) behavior…

module FindNoScope

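  def find_in_batches(options)
    # copy first so the caller's hash isn't mutated
    options = options.dup
    limit   = options.delete(:batch_size)
    options[:limit] = limit

    offset = 0

    while true
      chunk = all(options.merge(:offset => offset))

      break if chunk.empty?

      yield chunk

      # advance inside the loop; otherwise the same chunk repeats forever
      offset += limit
    end
  end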

end

class Order < ActiveRecord::Base
  extend FindNoScope

  # ...

end

Note that the above was written blind, is completely untested, and will likely not work
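The batching loop itself can at least be exercised without a database. In this sketch a plain array stands in for the table, and each_batch is a hypothetical name; it only demonstrates the offset-and-limit walk, not anything about ActiveRecord.

```ruby
# Hypothetical stand-in for the table: a plain array, "queried" with an
# offset and a limit on each pass.
def each_batch(rows, batch_size)
  offset = 0

  loop do
    chunk = rows[offset, batch_size] || []
    break if chunk.empty?

    yield chunk
    offset += batch_size # advance inside the loop, or it never ends
  end
end

batches = []
each_batch((1..7).to_a, 3) { |chunk| batches << chunk }
puts batches.inspect
```

One thing to watch with this style of paging: if the block changes rows so they no longer match the conditions (as the status updates above do), the result set shrinks underneath the offset and some rows will be skipped.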

Ruby Eval published on Oct 25, tagged with ruby

Ruby’s instance_eval and class_eval are awesome tricks of the language that can really cut down on redundant code or let you do truly dynamic things that you’d have never thought possible.

There’s one piece of confusion around these methods that each book I’ve read goes about explaining in a slightly different way. None of them really clicked for me, so why not write my own?

The two entirely accurate but seemingly paradoxical statements are this:

Use class_eval to define instance methods

Use instance_eval to define class methods

The reason for the backwards-ness is often explained something like this:

x.class_eval treats x as a Class, so any methods you create will be instance methods.

x.instance_eval treats x as an instance, so any methods you create will be class methods.

Well that’s clear as mud…

My take

Here’s how I think about it:

Any methods you define inside of x.instance_eval will be as if you had defined them on the instance x.

Any methods you define inside of x.class_eval will be as if you had written them in the class x.

Examples should help…

class_eval

Here’s an example of class_eval

class MyClass
  def my_method
    "foo"
  end
end

MyClass.class_eval do
  def my_other_method
    "bar"
  end
end

c = MyClass.new
c.my_other_method
=> "bar"

This is exactly as if you had done the following:

class MyClass
  def my_method
    "foo"
  end

  # oh... the files are /inside/ the computer!
  def my_other_method
    "bar"
  end
end

c = MyClass.new
c.my_other_method
=> "bar"

So we used class_eval to define an instance method. Just like the book said.

Funny thing is, you can easily use class_eval to define class methods too.

class MyClass
end

MyClass.class_eval do
  def self.foo
    "foo"
  end
end

MyClass.foo
=> "foo"

So I think that whole mindset is incorrect. It’s about the context your code is evaluated in, not what you’re intending that matters.

instance_eval

Similarly, here’s how I think when I’m writing something with instance_eval:

c = MyClass.new

# notice we act *on* an instance
c.instance_eval do
  def my_other_other_method
    "baz"
  end
end

c.my_other_other_method
=> "baz"

# we've written that method *on* c, so it only exists for that 
# *instance*...
d = MyClass.new
d.my_other_other_method
=> NoMethodError...

This code is equivalent to:

c = MyClass.new

# definition on c
def c.my_other_other_method
  "baz"
end

c.my_other_other_method
=> "baz"

In the second form, it’s clearer that the method only exists on that specific instance.

One other way to look at it is this:

Methods defined with class_eval will be available to every instance of that class (making them instance methods).

Methods defined with instance_eval will only be available to that specific instance; why they’re called “class methods”, I do not know.
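One possible answer to the “class methods” question: a class is itself an instance (of Class), so when the x in x.instance_eval happens to be a class, the method lands on that one object, and a method that lives on a class object is what we usually call a class method. A quick sketch, with shout as a made-up method name:

```ruby
class MyClass; end

c = MyClass.new

# Inside both evals, self is simply the receiver...
class_self    = MyClass.class_eval { self }
instance_self = c.instance_eval { self }
puts class_self.equal?(MyClass)
puts instance_self.equal?(c)

# ...but `def` lands in different places. Here the "instance" is the
# class object itself, so the method is defined on MyClass alone.
MyClass.instance_eval do
  def shout
    "called on the class"
  end
end

puts MyClass.shout
puts MyClass.new.respond_to?(:shout)
puts MyClass.singleton_methods.inspect
```

So the book’s rule is really the special case where the receiver is a class; the general rule is still “the method goes on whatever object you called instance_eval on”.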

Anyway, hope this helps…