The Github Effect published on Feb 24, tagged with opensource, work

Recently I’ve come across a stretch of similarly themed articles. Articles discussing the increasing shift of all business to be tech-based. The conclusion for business owners is often the same: hire talented developers.

I don’t disagree. Every business that expects to have any staying power whatsoever needs some level of development. If not for their core product itself, surely for some sort of web and/or smart-phone presence. And for the former, the level of developer talent required to stay afloat can be tremendous.

In my opinion, the greatest pool of developer talent is open source software projects. The people involved in these projects are doing what they love for free. They’re good at working alone or as part of a distributed team. They use new technologies and champion new techniques and Design Patterns. They write maintainable and generalized libraries of code to be put on display like the Art that it is. Furthermore, to enjoy any level of notoriety in this field, the code has to be high-quality.

More and more, companies are realizing that talented developers are the intellectual property, not the software they create. Letting your talented developers write the software they want –under the licensing they want– is the best way to get great software.

I’d argue that the ridiculous rate of innovation is enough to offset the risk of intellectual piracy. If Google were to open source its entirety right now, could anyone use that as a competitive advantage? I say no; Google has the talented engineers to out-innovate any forks or clones – not to mention the huge head-start in terms of how to best leverage all that code.

Case in point: The two most popular browsers right now are Firefox and Chrome, both just thin layers of branding over fully open versions of themselves. Is Google worried that someone will take Chromium’s source and out-compete? Of course not. There are plenty of derivative projects using WebKit and V8 – while these are great for exploring ideas and satisfying niche markets, they make no negative impact on Chrome. Quite the opposite; the flourishing ecosystem of buzz and patches can only help.

Successful tech start-ups are show-casing the value of the open source ecosystem. Companies like etsy, thoughtbot, and paperlesspost open source a ton of their “intellectual property” via github. This gains them followers, watchers, end-users, contributors and most importantly: something to entice recruits. Whether intentional or just the organic outcome of good-natured founders, the strategy works.

Companies like the one I work for are starting to look at open-source as a valuable goal. Regardless of the motivation (the altruistic “giving back” or simply building brand recognition), the bottom line is more software is let free. Beyond the fact that this is generally good for the world, it’s good for me to work on such software – I feel unencumbered by licensing or copyrights. I can freely blog about the stuff I’m doing and share snippets with interested parties. It’s just damn easier.

What I’m seeing is people being paid to develop in an open-source landscape. Whether using or writing such tools, we are absolutely making money with them. I can’t tell you how many arguments I’ve had trying to convince people that free software has real monetary value.

We’re here – a world where investors and recruiters are actually interested in the number of watchers and followers you have on github.

Implicit Scope published on Oct 28, tagged with ruby, rails, work

No one can deny that rails likes to do things for you. The term “auto-magically” comes to mind. This can be a blessing and a curse.

For the most part, rails tries to give you “outs” – a few hoops here and there that, if jumped through, will let you do things in different or more manual ways. Sometimes though, it doesn’t.

Find In Batches

One of the many ORM helpers provided by rails is find_in_batches. It will repeatedly query the database for a limited number of records at a time (walking the table in primary key order rather than by offset), handing you chunks of records to work through in sequence. Perfect for processing a very large result set without loading it all into memory at once.

Order.find_in_batches(:batch_size => 10) do |orders|
  orders.length # => 10

  orders.each do |order|

    # yay order!

  end
end

The problem is that any conditions you add to find_in_batches are inherited by any and all sql performed within its block. This is called “implicit scope” and there’s no way around it.

Why is this an issue? I’m glad you asked. Here’s a real life example:

#
# SELECT * from orders
# WHERE orders.status = 'pending'
#   AND orders.id > ?
# ORDER BY orders.id
# LIMIT 10;
#
# (roughly) raising the id bound each time round
#
Order.find_in_batches(:batch_size => 10,
                      :conditions => "orders.status = 'pending'") do |orders|

  orders.each do |order|
    #
    # UPDATE orders SET orders.status = 'timing_out'
    # WHERE orders.id     = ?
    #   AND orders.status = 'pending'; <-- oh-hey implicit scope
    #
    order.update_attribute(:status, 'timing_out')

    #
    # some long-running logic to actually "time out" the order...
    #

    #
    # UPDATE orders SET orders.status = 'timed_out'
    # WHERE orders.id     = ?
    #   AND orders.status = 'pending';
    #
    order.update_attribute(:status, 'timed_out')
  end
end

Do you see the problem? The second update fails because it can’t find the order due to the implicit scope. The first update was only successful due to coincidence.
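To make the mechanics concrete without a database, here is a toy model in plain ruby. It is only a sketch: ToyOrders, its with_scope stack and its update method are made-up names for illustration, and this is not how rails implements scoping. It just shows how conditions that stay "in scope" for the length of a block can make a second update miss its row.

```ruby
# Toy model only: a hash stands in for the orders table, and a stack of
# condition hashes stands in for the scope. Not how rails implements it.
class ToyOrders
  def initialize(rows)
    @rows   = rows # { id => { :status => '...' } }
    @scopes = []   # conditions currently "in scope"
  end

  # Run the block with extra conditions applied to every query inside it.
  def with_scope(conditions)
    @scopes.push(conditions)
    yield
  ensure
    @scopes.pop
  end

  # Yield the ids of matching rows; the block runs inside the scope.
  def find_in_batches(conditions)
    with_scope(conditions) do
      yield @rows.keys.select { |id| in_scope?(id) }
    end
  end

  # Update a row by id, but only if it also matches every scoped condition.
  # Returns true when a row was changed, false when nothing matched.
  def update(id, attrs)
    return false unless in_scope?(id)
    @rows[id].merge!(attrs)
    true
  end

  def status(id)
    @rows[id][:status]
  end

  private

  def in_scope?(id)
    @scopes.all? do |conditions|
      conditions.all? { |key, value| @rows[id][key] == value }
    end
  end
end

orders  = ToyOrders.new({ 1 => { :status => 'pending' } })
results = []

orders.find_in_batches({ :status => 'pending' }) do |ids|
  ids.each do |id|
    results << orders.update(id, { :status => 'timing_out' }) # matches: still pending
    results << orders.update(id, { :status => 'timed_out' })  # no match: no longer pending
  end
end
```

After this runs, results holds one success followed by one miss, and the order is stuck in 'timing_out', which mirrors the real life example above.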

Workaround

I would love to find a simple remove_implicit_scope macro that can get around this issue, but it’s just not there.

I even went so far as to put the update logic in a Proc or lambda hoping to bring in a binding without the implicit scope – no joy.

I had to resort to simply not using find_in_batches.

At the time, I just rewrote that piece of the code to use a while true loop. Thinking about it now, I realize I could’ve factored it out into my own find_in_batches; also, I could put it in a module so you can extend it in your model to have the better (IMO) behavior…

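module FindNoScope

  def find_in_batches(options = {})
    options = options.dup # don't mutate the caller's hash
    limit   = options.delete(:batch_size) || 1000
    options.merge!(:limit => limit)

    offset = 0

    while true
      chunk = all(options.merge(:offset => offset))

      break if chunk.empty?

      yield chunk

      # advance to the next page; left outside the loop, this never runs
      # and the same chunk comes back forever
      offset += limit
    end
  end

end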

class Order < ActiveRecord::Base
  extend FindNoScope

  # ...

end

Note that the above was written blind, is completely untested, and will likely not work
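Since the module above needs a rails app to try out, here is a plain-ruby sketch of the limit/offset idea on its own, with an Array standing in for the table. The names each_chunk and rows are made up for illustration.

```ruby
# Plain-ruby sketch of a limit/offset loop. `rows` stands in for the
# table and Array#[] with (offset, limit) stands in for the query.
def each_chunk(rows, batch_size)
  offset = 0

  while true
    # Array#[] returns nil once offset runs past the end of the array
    chunk = rows[offset, batch_size] || []

    break if chunk.empty?

    yield chunk

    offset += batch_size # must happen inside the loop, or it never advances
  end
end

sizes = []
each_chunk((1..25).to_a, 10) { |chunk| sizes << chunk.length }
# sizes is now [10, 10, 5]
```

One caveat with paging by offset: if the block changes rows so that they stop matching the conditions, the remaining rows shift up and the next page will skip some of them.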

Developing In OS X published on Oct 13, tagged with work, mac

As everyone who happens upon this site probably knows, I prefer to develop software in linux. The toolset is just better. A good shell, and proficiency with it, is invaluable for working with files. And regardless of what windows-ey, gui-IDE-ey developers like to say – software development is working with plain text files.

My work computer is now a Macbook. It’s about a million steps in the right direction from my last work-provided computer, but it’s still not linux.

That said, it’s damn close. It’s a unix variant originally based on BSD, it’s got a good shell, and just about every linux tool I’ve grown accustomed to can be easily and quickly installed and used here.

So, this post is intended to describe the things I’ve installed and configured to get my development environment the way I like it on this platform.

The Terminal

It all starts with the terminal… and Terminal.app ain’t it. For a long time, I used iTerm simply because it supported 256 colors, which no other Mac terminal does (to be fair, and to paraphrase Jon Stewart, #IDontHaveFactsToBackThatUp).

Recently I noticed a general lag when scrolling line by line in commandline-vim inside iTerm. This was unacceptable and prompted me to try working in MacVim for some time.

MacVim was fine and all, but then I found there was an iTerm2. There’s no lag in the newer terminal version, the preferences pane seems more thought out, and it’s just generally better. So go out and download iTerm2 as your terminal-of-choice on the Mac.

The Multiplexer

A terminal multiplexer offers a number of benefits. Of these, the biggest ones in my opinion are:

  1. Detach and reattach sessions

If you work in a multiplexer, your terminal never closes. All of your work is bundled up in this workspace-terminal that’s running inside and on top of your real terminal. If your ssh connection dies, your terminal crashes, or you actively “detach”, your work is still sitting there in that workspace. You can pull it up and reattach it to some other terminal whenever you want.

You can also have multiple named sessions which you can detach and reattach to shift gears or just stay generally organized.

  2. Split into regions

In linux I have a great tiling window manager. My desktop can be neatly split into multiple terminals where I can spread out my work.

I don’t really have that on the Mac. I tried for a while to get a good WM going in X11, but it just never clicked. So as an alternative, I can use a multiplexer to split one full-screen iTerm instance into any number of tabs, and/or vertical and horizontal regions.

I typically leave one half-term column for vim (which itself can be split any number of ways) then use the other side for running a tail -f on the log, a mysql console, and possibly autotest or watchr.

  3. Keyboard driven navigation

Navigation between regions, copy/paste, and everything else is completed by fully configurable key bindings. Not needing to reach for the mouse is a huge productivity win for me.

So, what multiplexer?

Well, in my opinion screen does 1 and 3 great. It’s what I use and will always use on linux – when I have the WM to do the screen-splitting.

However, tmux owns in the screen-splitting department. So on the Mac, I recommend tmux. Google around for a good tmux.conf and spend some time with the manpage; you won’t regret it.

The Editor

In my opinion there is no alternative to a good vim setup. Luckily, it works just fine on the Mac. In fact, my vim-config worked without any modifications whatsoever.

If anyone’s interested, here are the plugins I currently roll with:

ls ~/.vim/bundle
additional-surroundings
command-t
haskellmode
hoogle
nerdcommenter
simplefold
supertab
surround
vim-endwise
vim-fugitive
vim-git
vim-rails
vim-ruby

And if you’re not using pathogen, get to googlin.

The Other Stuff

Pretty much any unix commandline utility can be installed via ports or homebrew. I recommend grabbing GNU coreutils so you’ve got a better ls and friends. bash-completion and proctools are two others that will make things feel a bit more linux-ey.

Also do yourself a favor and upgrade bash to 4.0. It comes with globstar which itself is more than worth it.

The Bottom Line

Learn to live in a terminal – use an editor and utilities that fit there. Use a multiplexer like tmux or screen in a quality terminal like iTerm2.

Test Driven Development published on Oct 2, tagged with linux, ruby, work

With my recent job shift, I’ve found myself in a much more sophisticated environment than I’m used to with respect to Software Engineering.

At my last position, there wasn’t much existing work in the X++ realm. We were breaking new ground and no one cared about elegance; if you got the thing working – more power to you.

Here, it’s slightly different.

People here are working in a sane, documented, open-source world; and they’re good. Everyone is acutely aware of what’s good design and what’s not. There’s a focus on elegant code, industry standards, solid OOP principles, and most importantly, we practice Test Driven Development.

I’m completely new to this method for development, and I gotta say, it’s quite nice.

Now, I’m not going to say that this is the be-all-end-all of development styles (I’m a functional, strictly-typed, compiler-checked code guy at heart), but I do find it quite interesting – and effective.

So why not do a write-up on it?

Test Framework

The prerequisite for doing anything in TDD is a good test framework. Luckily, ruby is pretty strong in this area. The way it works is the following:

You subclass Test::Unit::TestCase and define methods that start with test_ where you execute system logic and make assertions about certain results; and then you run that class.

Ruby looks for those methods named as test_whatever and runs them “as tests”. Running a method as a test means that errors (unexpected exceptions) and failures (any of your assertions not holding) will be logged and displayed at the end as part of the “test report”.

All of these test classes can be run automatically by a build-bot and (depending on your test coverage) you get good visibility into what’s working and what’s not.
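As a minimal sketch of what that looks like, assuming the test-unit library is available via require 'test/unit': Cart and its methods are hypothetical names made up for this example, not code from any real project.

```ruby
require 'test/unit'

# Hypothetical code under test.
class Cart
  def initialize
    @prices = []
  end

  # Add a price and return the cart so calls can be chained.
  def add(price)
    @prices << price
    self
  end

  def total
    @prices.inject(0) { |sum, price| sum + price }
  end
end

# Each method starting with test_ is picked up and run as a test.
class CartTest < Test::Unit::TestCase
  def test_empty_cart_totals_zero
    assert_equal 0, Cart.new.total
  end

  def test_total_sums_added_prices
    cart = Cart.new.add(5).add(7)
    assert_equal 12, cart.total
  end
end
```

Running the file with ruby is enough; requiring test/unit arranges for the test classes to run when the script exits and prints the report.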

This is super convenient and empowering in its own right. In a dynamic language like ruby, tests are the only way you have any level of confidence that your most recent code change doesn’t blow up in production.

So now that you’ve got this ability to write and run tests against your code base, here’s a wacky idea: write the tests first.

Test Driven

It’s amazing what this approach does to the design process.

I’ve always been the type that just starts coding. I’m completely comfortable throwing out 6 hours’ worth of code and starting over. I know my “first draft” isn’t going to be right (though it will be useful). I whole-heartedly believe in refactorings, etc. But most importantly, I need to code to sketch things out. It’s how I’ve always worked.

TDD is sort of the same thing. You do a “rough sketch” of the functionality you’ll add simply by writing tests that enforce that functionality.

You think of this opaque object – a black box. You don’t know how it does what it does, but you’re trying to test it doing it.

This automatically gives you an end-user perspective. You now focus solely on the interface, the input and the output.

This is a wise position to design from.

You also tend to design small self-contained pieces of functionality. Methods that don’t care about state, return the same output for a given input, and generally do one simple thing. Of course, you do this because these are the easiest kind of methods to test.

So, out of sheer laziness, you design a cohesive, easy to use, and completely simple interface, an API.

Now you just have to “plumb it up”. Hack until the tests pass, and you’re done. That might be an over-simplification, but it’s not off by much…
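Here is a small sketch of that flow, again assuming test-unit is available. Slug.from is a hypothetical method invented for the example: the tests come first and treat it as a black box, and the implementation below them is just the plumbing that makes them pass.

```ruby
require 'test/unit'

# Step 1: the tests, written before any implementation exists. They pin
# down the interface (input and output) of a hypothetical Slug.from.
class SlugTest < Test::Unit::TestCase
  def test_downcases_and_joins_words_with_dashes
    assert_equal 'implicit-scope', Slug.from('Implicit Scope')
  end

  def test_drops_punctuation
    assert_equal 'yay-order', Slug.from('Yay, order!')
  end

  def test_same_input_gives_same_output
    assert_equal Slug.from('The Github Effect'), Slug.from('The Github Effect')
  end
end

# Step 2: the plumbing, hacked on until the tests above pass.
# No state, one simple job: same input, same output.
module Slug
  def self.from(title)
    title.downcase.scan(/[a-z0-9]+/).join('-')
  end
end
```

Defining Slug after the test class works here because the tests only run once the whole file has loaded. Delete step 2 and every test errors on the missing constant, which is the starting point you would actually see when working test-first.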

Come to think of it, this is exactly the type of design Haskell favors. With gratuitous use of undefined, the super-high-level logic of a Haskell program can be written out with named functions to “do the heavy lifting”. If you make these functions simple enough and give them descriptive enough names, they practically write themselves.

So that’s TDD (at least my take on it). So far, I like it.