Tool agnosticism is good for you
When it comes to programming editors, frameworks, and languages, you’re likely to take one of three stances: marry, boff, or kill. There are tools that you want to use for the rest of your life, tools that you want to moonlight or tinker with, and tools you never want to use again in your life.
Tool antagonism, ranting about the tools you want to kill, is fun. It plays well on internet forums. It goes nicely with beer and friends. But on a team that isn’t using absolutely terrible tools, it’s a waste of time.
Unless your team is bizarrely like-minded, it’s likely some disagreement will arise along the lines of editors, frameworks, and languages. I’ve been that antagonistic guy. We can’t use this language, it has an annoying feature. We can’t use this framework, it doesn’t protect us from an annoying kind of bug. I’ve learned these conversations are a waste of social capital and unnecessarily divisive. Don’t be that guy.
There are usually a few tools that are appropriate to a given task. I often have a preference and choose specific tools for my personal projects. There are other tools that I’m adept at using or have used before, but find minor faults with. I’ve found it way better for me to accept that other, non-preferred tool if it’s already in place or others have convictions about using it. Better to put effort into making the project or product better than spinning wheels on programming arcanery.
When it comes to programmer tools, rational agnosticism beats antagonism every time. Train yourself to work amazingly well with a handful of tools, reach adeptness with a few others, and learn how to think with as many tools as possible.
Rails 4, all about scaling down?
To some, Rails is getting big. It brings a lot of functionality to the table. This makes apps easier to get off the ground, especially if you aren’t an expert. But, as apps grow, it can lead to pain; there’s so much machinery in Rails, it’s likely you’re abusing something. It’s easy to look at other, smaller tools and think the grass is greener over there.
Rails 4 might have answers to this temptation.
On the controller side of things, it seems likely that some form of Strobe’s Rails extensions will find their way into Rails, making it easier to create apps (or sub-apps) that are focused on providing an API and eschew the parts of ActionPack that aren’t necessary for services. The thing I like a lot about this idea is that it covers a gap between Sinatra and Rails. You can prototype your app with all the conveniences of Rails, then strip out the parts you don’t need as it grows into a lean provider of quick services. You could certainly still rewrite services in Sinatra, Grape, or Goliath, but it’s nice to have an option.
On the model side of things, people are, well, modeling. Simpler ways to use ActiveModel with ActionPack will appear in Rails 4. The components the DataMapper team is working on, in the form of Virtus, seem really interesting too. If you want to get started, you can check out ActiveAttr right now, sort of the bonus track version of ActiveModel. Chris Griego’s put a lot of solid thought into this; you definitely want to check out his slides on models everywhere; they’re lurking in your controllers, your requests, your responses, your API clients, everywhere.
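For a taste of the direction, here’s a minimal sketch of a plain Ruby model built from ActiveModel pieces; the Signup class and its attributes are made up, but the modules ship with Rails 3:

require 'active_model'

# Hypothetical example: a plain Ruby object that ActionPack can treat
# like a model, no database required.
class Signup
  include ActiveModel::Validations
  extend ActiveModel::Naming

  attr_accessor :email, :name

  validates_presence_of :email

  def initialize(attributes = {})
    attributes.each { |key, value| send("#{key}=", value) }
  end

  # Tells form helpers this object hasn't been saved anywhere
  def persisted?
    false
  end
end

signup = Signup.new(:email => "user@example.com")
signup.valid? # => true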
In short, my best guess on Rails 4, right now, is that it will continue to give developers a curated set of choices and frameworks to get their application off the ground. It will add options to grow your application’s codebase sensibly once it’s proven out.
What I know, for sure, is that the notion of Rails 4 seems really strange to me. How fast time flies. Uphill, both ways.
Write more manpages
Every program, library, framework, and application needs documentation of some sort; this much is uncontroversial. How much documentation, what kinds of documentation, and where to put that documentation are the questions that often elicit endless prognostication.
When it comes to documentation aimed at developers, there’s a spectrum. On one end, there’s zero documentation, only code. On the other end of the spectrum are literate programs; the code is intertwined with the documentation and the language is equally geared towards marking up a document and translating ideas into machine-executable code.
Somewhere along this spectrum exists a happy ideal for most programmers. Inline API docs à la JavaDoc, RDoc, and YARD have been popular for a while. Lately, tools like docco and rocco have raised enthusiasm for “semi-literate programming”. There’s also a lot of potential in projects exhaustively documenting themselves in their Cucumber features as vcr does.
All of these tools couple code with documentation, per the notion that putting them right next to each other makes it more likely documentation gets updated in sync with the code. The downside to this approach is that code gets ‘noised up’ with comments. Often this is a fair trade, but it occasionally makes navigating a file cumbersome.
It happens that Unix, in its age-old sage ways, has been storing its docs out-of-line with the relevant code for years. They’re called manpages, and they mostly don’t suck. Every C API on a modern Unix has a corresponding manpage that describes the relevant functions and structures, how to use it, and any bugs that may exist. They’re actually a pretty good source of info.
Scene change.
It so happens that Ryan Tomayko is a Unix aficionado and wrote a tool that is even better for writing manpages than the original Unix tooling. It’s called ronn, and it’s pretty rad; you write Markdown, and you get bona fide UNIX manpages plus manpage-styled HTML.
Perhaps this is a useful way to write programmer-focused docs. Keep docs out of the code, put them in manpages instead, push them to GitHub pages. Code stays focused, docs still look great.
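For illustration, here’s roughly the shape of a ronn source file. The gem and its contents are made up, but the format is ronn’s: Markdown, with a header line of the form name(section) -- description:

widget(3) -- frobnicate widgets from Ruby
=========================================

## SYNOPSIS

    require 'widget'
    Widget.frobnicate(:speed => :fast)

## DESCRIPTION

Widget frobnicates things. Describe each method, its parameters, and
any surprising behavior here, in the plain-spoken style of a manpage.

## BUGS

None. Surely.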
I took John Nunemaker’s scam gem and put this idea to the test. Here’s what the manpage looks like, with the default styling provided by ronn:

Here’s the raw ronn document:
No Ruby files were harmed in the making of this documentation.
It took me about ninety minutes to put this together. Probably 33-50% of that time was simply tinkering with ronn and making sure I was writing in the style that is typical of manpages. So we’re talking about forty-five minutes to document a mixin with seven methods. For pretty good looking output and simple tooling, that’s a very modest investment.
The potential drawbacks are the same as any kind of documentation; it could fall out-of-sync with the production code. Really, I think this is more of a workflow issue. Ideally, every commit or merged branch is reviewed to make sure that relevant docs are updated. As a baseline, the release workflow for a project should include a step to make sure docs are up-to-date.
In short, I have one datapoint that tells me that ronn is a pretty great way to generate programmer-oriented documentation. I’ll keep tinkering with it and encourage other developers to do the same.
Automated code goodness checking with cane
A few nights ago, I added Xavier Shay’s cane to Sifter. It was super simple, and cane runs surprisingly fast. Cane is designed to run as part of CI, but Sifter doesn’t really have an actual CI box. Instead, I’ve added it to our preflight script that tells us whether deploying is a good idea or not, based on our spec suite. Now the preflight script can tell us if we’ve regressed on code complexity or style as well. I’m pretty pumped about this setup.
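For the curious, the wiring is small. Here’s a sketch of the Rake task I’d use, modeled on cane’s README; the threshold is arbitrary and yours will differ:

begin
  require 'cane/rake_task'

  desc "Check code quality thresholds"
  Cane::RakeTask.new(:quality) do |cane|
    # Flag methods whose ABC complexity exceeds the threshold
    cane.abc_max = 15
  end
rescue LoadError
  warn "cane not available; skipping quality checks"
end

Then the preflight script just runs rake quality alongside the spec suite and reports any regressions.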
Next step: add glib comments on failure. I’m thinking of something like “Yo, imma let you finish but the code you’re about to deploy is not that great.”
Represent dat API
Rails is missing an abstraction when it comes to building REST APIs, in my opinion. Requests route through controllers, controllers call models or services to obtain the right objects. And then…you try to bang a JSON object together with an ERB template? It gets awkward quickly.
There’s a lot of experimentation in the wild attempting to figure out what works well here. You can bang out a bunch of presenter classes. You can describe and compose representations. You can go resource oriented.
I came across one yesterday that immediately caught my eye. You could just use lambda to implement a bunch of functions that present, decorate, or map objects from one representation to another. To borrow an example:
# Define a base representation
UrlsPresenter = lambda do
  {
    'self'    => "#{Gauges.api_url}/me",
    'gauges'  => "#{Gauges.api_url}/gauges",
    'clients' => "#{Gauges.api_url}/clients"
  }
end

# Compose the base representation with more data
UserPresenter = lambda do |user|
  {
    'id'    => user.id,
    'email' => user.email,
    'name'  => user.name,
    'urls'  => UrlsPresenter.call
  }
end

# Pass an object to the presenter and convert it to JSON
UserPresenter[user].to_json
I love that this one adds no machinery and no state. Input, function, output. With just lambda, you can describe a bunch of transformations and string them together into meaningful and interesting pipelines. I’m experimenting with this now, hoping to find an interesting way that functional programming approaches can make it simpler to build APIs with Rails.
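Composition is just more function application. A hypothetical sketch, building a collection representation out of the single-object presenter:

# Hypothetical: present a collection by mapping a presenter over it
UsersPresenter = lambda do |users|
  { 'users' => users.map { |user| UserPresenter[user] } }
end

UsersPresenter[users].to_json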
When to Class.new
In response to Why metaprogram when you can program?, an astute reader asked for an example of when you would want to use Class.new in Ruby. It’s a rarely needed method, but really fun when faced with a tasteful application. Herein, a couple ways I’ve used it and an example from the wild.
Dead-simple doubles
In my opinion, the most “wholly legitimate” frequent application of Class.new is in test code. It’s a great tool for creating test doubles, fakes, stubs, and mocks without the weight of pulling in a framework. To wit:
TinyFake = Class.new do
  def slow_operation
    "SO FAST"
  end

  def critical_operation
    @critical = true
  end

  def critical_called?
    @critical
  end
end
tiny_fake = TinyFake.new
tiny_fake.slow_operation
tiny_fake.critical_operation
tiny_fake.critical_called? == true
TinyFake functions as a fake and as a mock. We can call a dummy implementation of slow_operation without worrying about the snappiness of our suite. We can verify that a method was called in the verification section of our test method. Normally you would only do one of these things at a time, but this shows how easy it is to roll your own doubles, fakes, stubs, or mocks.
The thing I like about this approach over defining classes inside a test file or class is that it’s all scoped inside the method. We can assign the class to a local and keep the context for each test method small. This approach is also really great for testing mixins and parent classes; define a new class, add the desired functionality, and test to suit.
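For example, a throwaway class makes a mixin trivial to exercise; this sketch uses Enumerable as a stand-in for whatever mixin you’re testing:

# Exercise a mixin by defining a minimal host class
host_class = Class.new do
  include Enumerable

  def each
    yield 1
    yield 2
    yield 3
  end
end

host_class.new.map { |n| n * 2 } # => [2, 4, 6]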
DSL internals
Rack and Resque are two examples of libraries that expose an API based largely on writing a class with a specific entry point. Rack middlewares are objects with a call method that generates a response based on an environment hash and any other middlewares that are contained within the middleware. Resque expects the classes that work through enqueued jobs to define a perform method.
In practice, putting these methods in a class is the way to go. But, hypothetically, we are way too lazy to type class/end, or perhaps we want to wrap a bunch of standard instrumentation and logging around a simple chunk of code. In that case, we can write ourselves a little shortcut:
module TinyDSL
  def self.performer(&block)
    c = Class.new
    c.class_eval { define_method(:perform, block) }
    c
  end
end
Thingy = TinyDSL.performer { |*args| p args }
Thingy.new.perform("one", 2, :three)
This little DSL gives us a shortcut for defining classes that implement whatever contract is expected of performer objects. From this humble beginning, we could mix in modules to add functionality around the performer, or we could pass a parent class to Class.new to make the generated class inherit from another class.
That leads us to the sort-of shortcoming of this particular application of Class.new: if the unique function of performer is to wrap a class around a method (for instance, as part of an API exported by another library), why not just subclass or mix in that functionality in the client application? This is the question you have to ask yourself when using Class.new in this way and decide if the metaprogramming is pulling its weight.
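As a hypothetical extension of TinyDSL, threading a parent class through to Class.new might look like this:

# Hypothetical: let callers pick a parent for the generated class
module TinyDSL
  def self.performer(parent = Object, &block)
    c = Class.new(parent)
    c.class_eval { define_method(:perform, block) }
    c
  end
end

class InstrumentedBase
  # Shared logging/instrumentation would live here
end

Thingy = TinyDSL.performer(InstrumentedBase) { |*args| p args }
Thingy.ancestors.include?(InstrumentedBase) # => true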
How Class.new is used in Sinatra
Sinatra is a little language for writing web applications. The language specifies how HTTP requests are mapped to blocks of Ruby. Originally, you wrote your Sinatra applications like so:
get('/') { [200, {"Content-Type" => "text/plain"}, "Hello, world!"] }
Right before Sinatra 1.0, the team added a cleaner way to build and compose applications as Ruby classes. It looks the same, except it happens inside the scope of a class instead of the global scope:
class SomeApp < Sinatra::Base
get('/') { [200, {"Content-Type" => "text/plain"}, "Hello, world!"] }
end
It turns out that the former is implemented in terms of the latter. When you use the old, global-level DSL, it creates a new class via Class.new(Sinatra::Base) and then class_evals a block into it to define the routes. Short, clever, effective: the best sort of Class.new.
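A rough sketch of the idea, not Sinatra’s actual source:

require 'sinatra/base'

# Sketch: a top-level DSL that delegates to a generated subclass
module TopLevelDSL
  def self.application
    @application ||= Class.new(Sinatra::Base)
  end

  def self.get(path, &block)
    application.class_eval { get(path, &block) }
  end
end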
So that’s how you might see Class.new used in the wild. As with any metaprogramming or construct labeled “Advanced (!)”, the main thing to keep in mind, when you use it or when you set upon refactoring an existing usage, is whether it is pulling its conceptual weight. If there’s a simpler way to use it, do that instead.
But sometimes a nail is, in fact, a nail.
Why metaprogram when you can program?
When I sought to learn Ruby, it was for three reasons. I’d heard of this cool thing called blocks, and that they had a lot of great use cases. I read there was this thing called metaprogramming and it was easier and more practical than learning Lisp. Plus, I knew several smart, nice people who were doing Ruby so it was probably a good thing to pay attention to. As it turns out, I will never go back to a language without the first and last. I can’t live without blocks, and I can’t live without smart, kind, fun people.
Metaprogramming requires a little more nuance. I understand metaprogramming well enough to get clever with it, and I understand it well enough to mostly understand what other people’s metaprogramming does. I still struggle with the nomenclature (eigenclass, metaclass, class Class?) and I often fall back to trial and error or brute-force tinkering to get things working.
On the other hand, I think I’ve come far enough that I can start to smell out when metaprogramming is done in good taste. See, every language has a feature that is terribly abused because it’s the cool, clever thing in the language: operator overloading in Scala, monadic everything in Haskell, XML in Java, and metaprogramming in Ruby.
Adam’s Handy Guide to Metaprogramming
This guide won’t teach you how to metaprogram, but it will teach you when to metaprogram.
I want you to think twice the next time you reach for the metaprogramming hammer. It’s a great tool for building developer-friendly APIs, little languages, and using code as data. But often, it’s a step too far. Normal, everyday programming will do you just fine.
There are two principles at work here.
Don’t metaprogram when you can just program
Exhaust all your tricks before you reach for metaprogramming. Use Ruby’s mixins and method delegation to compose a class. Dip into your Gang of Four book and see if there isn’t a pattern that solves your problem.
Lots of metaprogramming is in support of callback-oriented programming. Think “before”/”after”/”around” hooks. You can do this by defining extension points in the public API for your class and mixing other modules into the class that implement logic around those public methods.
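A sketch of that style; the class and hooks here are made up, but note there’s no metaprogramming in sight:

# "Before"/"after" hooks as plain extension points
class Importer
  def run
    before_run
    import_records
    after_run
  end

  # Extension points; mixins and subclasses override these
  def before_run; end
  def after_run; end

  private

  def import_records
    # ... the actual work ...
  end
end

module LoggedRun
  def before_run
    puts "starting import"
  end

  def after_run
    puts "finished import"
  end
end

class LoggedImporter < Importer
  include LoggedRun
end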
Another common form is configuring an object or framework. Think about things that declare models, connections, or queries. Use method chaining to build or configure an object that acts as a parameter list for another method or object.
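And a sketch of the configuration style, with a chainable object standing in as a parameter list; all the names here are hypothetical:

# A chainable configuration object
class QueryOptions
  def initialize
    @options = {}
  end

  def limit(count)
    @options[:limit] = count
    self # return self so calls chain
  end

  def order(column)
    @options[:order] = column
    self
  end

  def to_hash
    @options
  end
end

options = QueryOptions.new.limit(25).order(:created_at)
# Some hypothetical finder accepts it as a parameter list:
# Post.find_with(options.to_hash)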
Use the weakest form of metaprogramming possible
Once you’ve exhausted your patterns and static Ruby tricks, it’s time to play a game: how little metaprogramming can you do and get the job done?
Various forms of metaprogramming are weaker or stronger than others. The weaker ones are harder to screw up and less likely to require a deep understanding of Ruby. The stronger ones have trade-offs that require careful application and possibly need a lot of explanation to newcomers to your codebase.
Now, I will present to you a partial ordering of metaprogramming forms, in order of weak to strong. We can bicker on their specific placement, but I’m pretty certain that the first one is far better to use frequently than the last.
- Blocks - I hesitate to call this a form of metaprogramming. But, it is sometimes abused, and it is sometimes smart to use blocks instead of tricks further down this list. That said, if you find yourself needing more than one block parameter to a method, you should consider a parameter object that holds those blocks instead.
- Dynamic message send on a static object - You set a symbol on an object and later it will send that symbol as a method selector to an object that doesn’t change at runtime. This is weak because the only thing that varies is the method that gets called (there’s a quick sketch of these dynamic forms after this list). On the other hand, you could have just used a block.
- Dynamic message send on a dynamic object - You set a symbol and a receiver object, at some point they are combined into a method call. This is stronger than the previous form because you’ve got two points of variability, which means two things to hunt down and two more things to hold in your brain.
- Class.new - I love this method so much. But, it’s a source of potential hurt when trying to understand a new piece of code. Classes magically poofing into existence at runtime makes code harder to read and navigate with simple tools. At the very least, have the civility to assign classes created this way to a constant so they feel like a normal class. Downsides, err, aside, I love this method so much; having it around is way better than not.
- define_method - I like this method a lot too. Again, it’s way better to have it around than not. It’s got two modes of use, one gnarly and one not-so-bad. If you look at how it’s used in Rails, you’ll see a lot of instances where it’s passed a string of code, sometimes with interpolations inside said string. This is the gnarly form; unfortunately, it’s also faster on MRI and maybe other runtimes. There is another form, where you pass a block to define_method and the block becomes the body of the newly defined method. This one is far easier to read. Don’t even ask me the differences in how variables are bound in that block; Evan Phoenix and Wilson Bilkovich tried to explain it to me once and I just stared at them like a yokel.
- class_eval - We’re getting into the big guns of metaprogramming now. The trick with class_eval is that it’s tricky to understand exactly which class (the metaclass or the class itself) the parameters to class_eval apply to. The upside is that’s mostly a write-time problem. It’s easy to look at code that uses class_eval and figure out what it intends to do. Just don’t put that stuff in front of me in an interview and expect me to tell you where the methods land without typing the damn thing into IRB.
- instance_eval - Same tricks as class_eval. This may have simpler semantics, but I always find myself falling back to tinkering with IRB; your mileage may vary. The one really tricky thing you can do with instance_eval (and the class << some_obj trick) is put methods on specific instances of an object. Another thing that’s better to have around than not, but always gives me pause when I see it or think I should use it.
- method_missing - Behold, the easiest form of metaprogramming to grasp and thus the most widely abused. Don’t feel like typing out methods to delegate or want to build an API that’s easy to use but impossible to document? method_missing that stuff! Builder objects are a legitimate use of method_missing. Everything else requires deep zen to justify. Remember: friends don’t let friends write objects that indiscriminately swallow messages.
- eval - You almost certainly don’t need this; almost everything else is better off as a weaker form of metaprogramming. If I see this, I expect that you’re doing something really, really clever and therefore have a well-written justification and a note from your parents.
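Here’s that promised sketch of the weaker forms, plus the friendlier block form of define_method; the names are made up:

# Weak: static receiver, only the selector varies
selector = :upcase
"hello".send(selector) # => "HELLO"

# Stronger: both receiver and selector vary at runtime
receiver = [3, 1, 2]
receiver.send(:sort) # => [1, 2, 3]

# define_method, block form: no strings of code to squint at
class Greeter
  [:hello, :goodbye].each do |word|
    define_method(word) { |name| "#{word}, #{name}" }
  end
end

Greeter.new.hello("Adam") # => "hello, Adam"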
Bonus principle!
At some point you will accidentally type “meatprogram” instead of “metaprogram”. Cherish that moment!
It’s OK to write a few more lines of code if they’re simple, concise, and easy to test. Use delegation, decorators, adapters, etc. before you metaprogram. Exhaust your GoF tricks. Read up on SOLID principles and understand how they change how you program and give you much of the flexibility that metaprogramming provides without all the trickery. When you do resort to trickery, use the simplest trickery you can. Document it, test it, and have someone review it.
When it comes to metaprogramming, it’s not about how much of the language you use. It’s about what the next person to see the code whispers under their breath. Don’t let your present self make future enemies.
Cassandra at Gowalla
Over the past year, I’ve done a lot of work making Cassandra part of Gowalla’s multi-prong database strategy. I recently spoke at Austin on Rails on this topic, doing a sort of retrospective on our adoption of Cassandra and what I learned in the process. You can check out the slide deck, or if you’re a database nerd like me, dig into the really nerdy details below.
Why does Gowalla use Cassandra?
We have a few motivations for using Cassandra at Gowalla. First off, it’s become our database of choice for applications with relatively fixed query patterns that, for us to succeed, need to handle a rapidly growing dataset. Cassandra’s read and write paths are optimized for these kinds of applications. It’s good at keeping the hot subset of a database in memory while keeping queries that require hitting disk pretty quick too.
Cassandra is also great for time-oriented applications. Any time we need to fetch data based primarily on some sort of timestamp, Cassandra is a great fit. It’s a bit unique in this regard, and that’s one of the main reasons I’m so interested in Cassandra.
Cassandra is a Dynamo-style database, which yields some nice operational aspects. If a node goes down over night, we don’t take an availability hit; the ops people can sleep through the night and fix it later. The Cassandra developers have also done a great job of eliminating the cases where one needs to restart an entire Cassandra cluster at once, which would mean downtime.
When does Gowalla not use Cassandra?
I don’t think Cassandra is all that great for iterating on prototypes. When you’re not sure what your data or queries will end up looking like, it’s hard to build a schema that works well with Cassandra. You’re also unlikely to need the strengths that a distributed, column-oriented database offers at that stage. Plus, there aren’t any options for outsourced Cassandra right now, and early-stage applications/businesses rarely want to devote expertise to hosting a database.
Applications that don’t grow data quickly, or can fit their entire dataset in memory on a pair of machines, don’t play to Cassandra’s strengths either. Given that you can get a machine with a few dozen gigabytes of memory for the cost of rent in the valley, sometimes it does pay out to scale vertically instead of horizontally as Cassandra encourages.
Cassandra applications at Gowalla
We have a handful of applications going that use Cassandra:
- Audit: Stores ActiveRecord change data to Cassandra. This was our training-wheels trial project where we experimented with Cassandra to see if it was useful for us. It was incrementally deployed using rollout and degrade. Worked well, so we proceeded.
- Chronologic: This is an activity feed service, storing the events and timelines in Cassandra. It started off life as a secondary index cache, but became a system of record in our latest release. It works great operationally, but the query/access model didn’t always jibe with how web developers expected to access data.
- Active stories: We store “joinability” data for users at a spot so we can pre-merge stories and prevent proliferation of a bunch of boring, one-person stories. This was built by Brad Fults and integrated in one pull request a few weeks before launch. The nice thing about this one was that it was able to take advantage of Cassandra’s column expiration and fit really nicely into Cassandra’s data model.
- Social graph caches: We store friend data from other systems so we can quickly list/suggest friends when they connect their Gowalla profile to Facebook or Twitter. This started life on Redis, but the data was growing too quickly. We decoupled it from Redis and wrote a Cassandra backend over a few days. We incrementally deployed it and got Redis out of the picture within two weeks. That was pretty cool.
What worked?
- Stable at launch. A couple weeks before launch, I switched to “devops” mode. Along with Adam McManus, our ops guy, we focused on tuning Cassandra for better read performance and to resolve stability problems. We ended up bringing in a DataStax consultant to help us verify we were doing the right things with Cassandra. The result of this was that, at launch, our cluster held up well and we didn’t have any Cassandra-related problems.
- Easy to tune. I found Cassandra interesting and easy to tune. There is a little bit of upfront research in figuring out exactly what the knobs mean and what the reporting tools are saying. Once I figured that out, it was easy to iteratively tweak things and see if they were having a positive effect on the performance of our cluster.
- Time-series or semi-granular data. Of the databases I’ve tinkered with, Cassandra stands out in terms of modeling time-related data. If an application is going to pull data in time-order most of the time, Cassandra is a really great place to start. I also like the column-oriented data model. It’s great if you mostly need a key-value store, but occasionally need a key-key-value store.
What would we do differently next time?
- Developer localhost setups. We started using Cassandra in the 0.6 release, when it was a giant pain to set up locally (XML configs). It’s better now, but I should have put more energy into helping the other developers on our team get Cassandra up and working properly. If I were to do it again, I’d probably look into leaning on the install scripts the cassandra gem includes, rather than Homebrew and a myriad of scripts to hack the Cassandra config.
- Eventual consistency and magic database voodoo. Cassandra does not work like MySQL or Redis. It has different design constraints and a relatively unique approach to those constraints. In advocating and explaining Cassandra, I think I pitched it too much as a database nerd and not enough as “here’s a great tool that can help us solve some problems”. I hope that CQL makes it easier to put Cassandra in front of non-database nerds in terms that they can easily relate to and immediately find productivity.
- Rigid query model. Once we got several million rows of data into Cassandra, we found it difficult to quickly change how we represented that data. It became a game of “how can we incrementally rejigger this data structure to have these other properties we just figured out we want?” I’m not sure that’s a game you can easily win at with Cassandra. I’d love to read more about building evolvable data structures in Cassandra and see how people are dealing with high-volume, evolving data.
Things we’ll try differently next time
- More like a hash, less like a database. Having developed a database-like thing, I have come to the conclusion that developers really don’t like them very much. ActiveRecord was hugely successful because it was so much more effective than anything previous to it that tried to make databases just go away. The closer a database is to one of the native data structures in the host language, the better. If it’s not a native data structure, it should be something they can create in a REPL and then say “magically save this for me!”
- Better tools and automation. That said, every abstraction leaks. Once it does, developers want simple and useful tools that let them figure out what’s going on, what the data really looks like, tinker with it, and get back to their abstracted world as quickly as possible. This starts with tools for setting up the database, continues through interacting with it (database REPL), and for operating it (logging, introspection, etc.) Cassandra does pretty well with these tools, but they’re still a bit nerdy.
- More indexes. We didn’t design our applications to use secondary indexes (a great feature) because they didn’t exist just yet. I should have spent more time integrating this into the design of our services. We got bit a lot towards the end of our release cycle because we were building all of our indexes in the application and hadn’t designed for reverse indexes. We also designed a rather coarse schema, which further complicated ad-hoc querying, which is another thing non-database-nerds love.
What’s that mean for me?
Cassandra has a lot of strengths. Once you get to a scale where you’re running data through a replicated database setup and some kind of key-value database or cache, it makes sense to start thinking about Cassandra. There are a lot of things you can do with it, and it lets you cheat in interesting ways. Take some extra time to think about the data model you build and how you’ll change it in the future. Like anything else, build tools for yourself to automate the things you do repeatedly.
Don’t use it because you read a blog post about it. Use it because it fits your application and your team is excited about using it.
Your frienemy, the ORM
When modeling how our domain objects map to what is stored in a database, an object-relational mapper often comes into the picture. And then, the angst begins. Bad queries are generated, weird object models evolve, junk-drawer objects emerge, cohesion goes down and coupling goes up.
It’s not that ORMs are a smell. They are genuinely useful things that make it easier for developers to go from an idea to a working, deployable prototype. But it’s easy to fall into the habit of treating them as a top-level concern in our applications.
Maybe that is the problem!
What if our domain models weren’t built out from the ORM? Some have suggested treating the ORM, and the persistence of our objects themselves, as mere implementation details. What might that look like?
Hide the ORM like you’re ashamed of it
Recently, I had the need to build an API for logging the progress of a data migration as we ran it over many million records, spitting out several new records for every input record. Said log ended up living in PostgreSQL.[1]
Visions of decoupled grandeur in my head, I decided that my API should not leak its databaseness out to the user. I started off trying to make the API talk directly to the PostgreSQL driver, but I wasn’t making much progress down that road. Further, I found myself reinventing things I would get for free in ActiveRecord-land.
Instead, I took a principled plunge. I surrendered to using an AR model, but I kept it tucked away inside the class for my API. My API makes several calls into the AR model, but it never leaks that ARness out to users of the API.
I liked how this ended up. I was free to use AR’s functionality within the inner model. I can vary the API and the AR model independently. I can stub out, or completely replace the model implementation. It feels like I’m doing OO right.
Enough of the suspense, let’s see a hypothetical example
User model. Everyone has a name, a city, and a URL. We can all do this in our sleep, right?
I start by defining an API. Note that all it knows is that there is some object called Model that it delegates to.
class User
  attr_accessor :name, :city, :url

  def self.fetch(key)
    Model.fetch(key)
  end

  def self.fetch_by_city(key)
    Model.fetch_by_city(key)
  end

  def save
    Model.create(name, city, url)
  end

  def ==(other)
    name == other.name && city == other.city && url == other.url
  end
end
That’s a pretty straight-forward Ruby class, eh? The RSpec examples for it aren’t elaborate either.
describe User do
  let(:name) { "Shauna McFunky" }
  let(:city) { "Chasteville" }
  let(:url) { "http://mcfunky.com" }

  let(:user) do
    User.new.tap do |u|
      u.name = name
      u.city = city
      u.url = url
    end
  end

  it "has a name, city, and URL" do
    user.name.should eq(name)
    user.city.should eq(city)
    user.url.should eq(url)
  end

  it "saves itself to a row" do
    key = user.save
    User.fetch(key).should eq(user)
  end

  it "supports lookup by city" do
    user.save
    User.fetch_by_city(user.city).should eq(user)
  end
end
Not much coupling going on here either. Coding in a blog post is full of beautiful idealism, isn’t it?
“Needs more realism”, says the critic. Obliged:
class User::Model < ActiveRecord::Base
  set_table_name :users

  def self.create(name, city, url)
    super(:name => name, :city => city, :url => url)
  end

  def self.fetch(key)
    from_model(find(key))
  end

  def self.fetch_by_city(city)
    from_model(where(:city => city).first)
  end

  def self.from_model(model)
    User.new.tap do |u|
      u.name = model.name
      u.city = model.city
      u.url = model.url
    end
  end
end
Here’s the first implementation of an actual access layer for my user model. It’s coupled to the actual user model by names, but it’s free to map those names to database tables, indexes, and queries as it sees fit. If I’m clever, I might write a shared example group for the behavior of whatever implements create, fetch, and fetch_by_city in User::Model, but I’ll leave that as an exercise to the reader.
To hook my model up when I run RSpec, I add a moderately involved before hook:
before(:all) do
  ActiveRecord::Base.establish_connection(
    :adapter => 'sqlite3',
    :database => ':memory:'
  )

  ActiveRecord::Schema.define do
    create_table :users do |t|
      t.string :name, :null => false
      t.string :city, :null => false
      t.string :url
    end
  end
end
As far as I know, this is about as simple as it gets to bootstrap ActiveRecord outside of a Rails test. So it goes.
Let’s fake that out
Now I’ve got a working implementation. Yay! However, it would be nice if I didn’t need all that ActiveRecord stuff when I’m running isolated, unit tests. Because my model and data access layer are decoupled, I can totally do that. Hold on to your pants:
require 'active_support/core_ext/class'

class User::Model
  cattr_accessor :users
  cattr_accessor :users_by_city

  def self.init
    self.users = {}
    self.users_by_city = {}
  end

  def self.create(name, city, url)
    key = Time.now.tv_sec
    hsh = {:name => name, :city => city, :url => url}
    users[key] = hsh
    users_by_city[city] = hsh
    key
  end

  def self.fetch(key)
    attrs = users[key]
    from_attrs(attrs)
  end

  def self.fetch_by_city(city)
    attrs = users_by_city[city]
    from_attrs(attrs)
  end

  def self.from_attrs(attrs)
    User.new.tap do |u|
      u.name = attrs[:name]
      u.city = attrs[:city]
      u.url = attrs[:url]
    end
  end
end
This “storage” layer is a bit more involved because I can’t lean on ActiveRecord to handle all the particulars for me. Specifically, I have to handle indexing the data in not one but two hashes. But, it fits on one screen and it’s in memory, so I get fast tests at not too much overhead.
This is a classic test fake. It’s not the real implementation of the object; it’s just enough for me to hack out tests that need to interact with the storage layer. It doesn’t tell me whether I’m doing anything wrong like a mock or stub might. It just gives me some behavior to collaborate with.
Switching my specs to use this fake is pretty darn easy. I just change my before hook to this:
before { User::Model.init }
Life is good.
Now for some overkill
Time passes. Specs are written, code is implemented to pass them. The application grows. Life is good.
Then one day the ops guy wakes up, finds the site going crazy slow, and sees that there are a couple hundred million users in the system. That’s a lot of rows. We’re gonna need a bigger database.
Migrating millions of rows to a new database is a pretty big headache. Even if it’s fancy and distributed. But, it turns out changing our code doesn’t have to tax our brains so much. Say, for example, we chose Cassandra:
require 'cassandra/0.7'
require 'active_support/core_ext/class'

class User::Model
  cattr_accessor :connection
  cattr_accessor :cf

  def self.create(name, city, url)
    generate_key.tap do |k|
      cols = {"name" => name, "city" => city, "url" => url}
      connection.insert(cf, k, cols)
    end
  end

  def self.generate_key
    SimpleUUID::UUID.new.to_guid
  end

  def self.fetch(key)
    cols = connection.get(cf, key)
    from_columns(cols)
  end

  def self.fetch_by_city(city)
    expression = connection.create_index_expression("city", city, "EQ")
    index_clause = connection.create_index_clause([expression])
    slices = connection.get_indexed_slices(cf, index_clause)
    cols = hash_from_slices(slices).values.first
    from_columns(cols)
  end

  def self.from_columns(cols)
    User.new.tap do |u|
      u.name = cols["name"]
      u.city = cols["city"]
      u.url = cols["url"]
    end
  end

  def self.hash_from_slices(slices)
    slices.inject({}) do |hsh, (k, columns)|
      column_hash = columns.inject({}) do |inner, col|
        column = col.column
        inner.update(column.name => column.value)
      end
      hsh.update(k => column_hash)
    end
  end
end
Not nearly as simple as the ActiveRecord example. But sometimes it’s about making hard problems possible even if they’re not mindless retyping. In this case, I had to implement ID/key generation for myself (Cassandra doesn’t implement any of that). I also had to do some cleverness to generate an indexed query and then to convert the hashes that Cassandra returns into my User model.
But hey, look! I changed the whole underlying database without worrying too much about mucking with my domain models. I can dig that. Further, none of my specs need to know about Cassandra. I do need to test the interaction between Cassandra and the rest of my stack in an integration test, but that’s generally true of any kind of isolated testing.
This has all happened before and it will all happen again
None of this is new. Data access layers have been a thing for a long time. Maybe institutional memory and/or scars have prevented us from bringing them over from Smalltalk, Java, or C#.
I’m just sayin’, as you think about how to tease your system apart into decoupled, cohesive, easy-to-test units, you should pause and consider the idea that pushing all your persistence needs down into an object you later delegate to can make your future self think highly of your present self.
[1] This ended up being a big mistake. I could have saved myself some pain, and our ops team even more pain, if I’d done an honest back-of-the-napkin calculation and stepped back for a few minutes to figure out a better angle on storage.
Locking and how did I get here?
I've got a bunch of browser tabs open. This is unusual; I try to have zero open. Except right now. I'm digging into something. I'm spreading ephemeral papers around on my ephemeral desk and trying to make a concept not ephemeral, at least in my head.
It all started with locking. It's a hard concept, but some programs need it. In particular, applications running across multiple machines connected by imperfect software and unreliable networks need it. And this sort of thing ends up being difficult to get right.
I've poked around with this before. Reading the code of some libraries that implement locking in a way that might come in handy to me, I check out some documentation that I've seen referenced a couple times. Redis' setnx command can function as a useful primitive for implementing locks. It turns out getset is pretty interesting too. Ohm, redis-objects and adapter-redis all implement locking using a combination of those two primitives. Then I start to dig deeper into Ohm; there's some interesting stuff here. Activity feeds with Ohm is relevant to my interests. I've got a thing for persistence tools that enumerate their philosophy. Nest seems like a useful set of concepts too.
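The recipe those libraries build on looks roughly like this; a sketch of the setnx/getset pattern from the Redis docs, not production-ready code:

# Sketch: try to acquire a lock, stealing it if the holder's lease expired.
# The key name and TTL are arbitrary.
def acquire_lock(redis, key = "lock:thing", ttl = 10)
  expires_at = Time.now.to_i + ttl + 1

  # setnx writes only if the key is absent; success means we hold the lock
  return true if redis.setnx(key, expires_at)

  # The lock exists; if its lease expired, race to take it over with getset
  if redis.get(key).to_i < Time.now.to_i
    old_expiry = redis.getset(key, expires_at).to_i
    return true if old_expiry < Time.now.to_i
  end

  false
end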
I'm mentally wandering here. Let's rewind back to what I'm really after: a way to do locking in Cassandra. There's a blog post I came across before on doing critical sections in Cassandra, but it uses ZooKeeper, so that's cheating. Then I get distracted by a thing on HBase vs. Cassandra and another perspective on Cassandra that mentions but does not really focus on locking.
And then, paydirt. A wiki page on locking in Cassandra. It may be a little rough, and might not even work, but it's worth playing with. Turns out it's an adaptation of an algorithm devised by Leslie Lamport for implementing locking with atomic primitives. It uses a bakery as an analogy. Neat.
Then I get really distracted again. I remember doozer, a distributed consensus gizmo developed by Blake Mizerany at Heroku. I get to reading its documentation and come across the protocol spec, which has an intriguing link to a Plan 9 manpage on the Plan 9 File Protocol. That somehow drives me to ponder serialization and read about TNetstrings.
At this point, my cup overfloweth. I've got locking, distributed consensus, serialization, protocols, and philosophies all on my mind. Lots of fun intellectual fodder, but I'll get nowhere if I don't stick my nose into one of them exclusively and really try to figure out what it's about. So I do. Fin.
ZeroMQ inproc implies one context
I’ve been tinkering with ZeroMQ a bit lately. Abstracting sockets like this is a great idea. However, the Ruby library, like sockets in general, is a bit light on guidance and the error messages aren’t of the form “Hey dumbie, you do it in this order!”
Here’s something that tripped me up today. ZeroMQ puts everything into a context. If you’re doing in-process communication (e.g. between two threads in Ruby 1.9), you need to share that context.
Doing it right:
# Create a context for all in-process communication
>> ctx = ZMQ::Context.new
# Set up a request socket (think of this as the client)
>> req = ctx.socket(ZMQ::REQ)
# Set up a reply socket (think of this as the server)
>> rep = ctx.socket(ZMQ::REP)
# Like a server, the reply socket binds
>> rep.bind('inproc://127.0.0.1')
# Like a client, the request socket connects
>> req.connect('inproc://127.0.0.1')
# ZeroMQ only knows about strings
>> req.send('1')
=> true
# Reply/server side got the message
>> p rep.recv
"1"
=> "1"
# Reply/server side sends response
>> rep.send("urf!")
=> true
# Request/client side got the response
>> req.recv
=> "urf!"
Doing it wrong:
# Create a second context
>> ctx2 = ZMQ::Context.new(2)
# Create another client
>> req2 = ctx2.socket(ZMQ::REQ)
# Attempt to connect to a reply socket, but it doesn't
# exist in this context
>> req2.connect('inproc://127.0.0.1')
RuntimeError: Connection refused
from (irb):16:in `connect'
from (irb):16
from /Users/adam/.rvm/rubies/ruby-1.9.2-p180/bin/irb:16:in `'
I believe what is happening here is that each ZMQ::Context gets a thread pool to manage message traffic. In the case of in-process messages, the threads only know about each other within the confines of a context.
And now you know, roughly speaking.
The rules of the yak shave
Yak shaves. They’re great fun. Like most things, yak shaving is more fun when you have some rules to guide you away from the un-fun parts:
- always have a goal, know when you’re done
- timebox it
- work on a branch so you can switch to real work if you need to
- make smaller commits than usual so you can unwind if you should go awry
- don’t worry about writing tests if you don’t know what you’re doing
- if you aren’t sure where you are going, write a test harness and iterate on that
- have a pair or buddy to talk through what you’re trying to do and how to get there
- bail out if you are starting to burn out, face diminishing returns, or think of a better way to shave the yak
Driven to drawing monsters
From my notebook:

A get class method and something about memcached was causing enough trouble to drive me to draw a little monster.
Simple Ruby pleasures
I think I first discovered the joy of take and drop in my journeys through Haskell. But it appears that, since 2008 at least, we have had the pleasure of using them in Ruby too.
Need the first or last N elements from an Enumerable? Easy!
[sourcecode language="ruby" light="true"]
ary = (1..100).to_a
ary.take(5) # => [1, 2, 3, 4, 5]
ary.drop(95) # => [96, 97, 98, 99, 100]

range = (1..100)
range.take(5) # => [1, 2, 3, 4, 5]
range.drop(95) # => [96, 97, 98, 99, 100]

hsh = {:foo => 1, :bar => 2, :baz => 3}
hsh.take(1) # => [[:foo, 1]]
hsh.drop(2) # => [[:baz, 3]]
[/sourcecode]
The real magic is when you use take along with other Enumerable goodies like select and map. Here’s one of my personal favorites amongst the code I wrote in 2010:
[sourcecode language="ruby" light="true" highlight="12,13,14"]
class QueryTracer < ActiveSupport::LogSubscriber
  ACCEPT = %r{^(app|config|lib)}.freeze
  FRAMES = 5
  THRESHOLD = 300 # In ms

  def sql(event)
    return unless event.duration > THRESHOLD

    callers = Rails.
      backtrace_cleaner.
      clean(caller).
      select { |f| f =~ ACCEPT }.
      take(FRAMES).
      map { |f| f.split(":").take(2).join(":") }.
      join(" | ")

    # Shamelessly stolen from ActiveRecord::LogSubscriber
    warning = color("SLOW QUERY", RED, true)
    name = '%s (%.1fms)' % [event.payload[:name], event.duration]
    sql = event.payload[:sql].squeeze(' ')

    warn "  #{warning}"
    warn "  #{name} #{sql}"
    warn "  Trace: #{callers}"
  end
end

QueryTracer.attach_to :active_record
[/sourcecode]
This little ditty is awesome because:
- It's super-practical. Drop this in your Rails 3 app, tail your production log, see the slow queries, go to the method in your app calling it, and fix it. Easy.
- It only activates itself when it's needed. Queries that execute quickly return immediately.
- No framework spelunking required. Rails 3's notification system handles all of it. Rails' backtrace cleaner gizmo even makes the backtraces much nicer to read.
- It chains methods to make something that reads like a nice, concise functional program.
For more Enumerable joy, read up on each_cons.
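If you haven't met it, each_cons yields every consecutive window of N elements, which makes short work of pairwise comparisons:

[sourcecode language="ruby" light="true"]
(1..6).each_cons(2).to_a
# => [[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]]

# Handy for deltas between neighboring values
[3, 7, 12].each_cons(2).map { |a, b| b - a } # => [4, 5]
[/sourcecode]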
An ode to Hashie
I was building an API wrapper this weekend. As is common when writing these sorts of things, I found myself needing something that takes semi-structured data (hashes parsed from JSON) and yields Ruby objects that are easy to work with. I've always found myself hacking these sorts of things together on a somewhat ad-hoc basis. It's fun, but a bit of a yak-shave.
This time around, I decided to see if the state of the art has advanced in this realm. Luckily, I reviewed Wynn Netherland's slides from Lone Star Ruby Conference and found exactly what I needed.
Where have you been all my life?
Intridea's Hashie is a library built on the notion of making hash-like data structures act a little more like objects and a little easier to work with. I have literally wanted something like this for years!
Suppose you have a hash like the following:
hash = { "name" => "Adam", "age" => 31, "url" => "http://therealadam.com" }
Coding up an object to store that isn't too hard, but writing the code that pulls values out of the hash and tucks them away in the right attribute on the object gets tedious quickly. Hashie's Dash class makes this trivial.
class User < Hashie::Dash
  property :name
  property :age
  property :url
end
It's even more delightful to use:
user = User.new(hash)
user.name # => "Adam"
Tons of boilerplate code, eliminated. My life is instantly better.
A great use of inheritance
It's been pointed out that ActiveRecord's use of inheritance is somewhat specious. To argue that "user is-a ActiveRecord::Base" takes a bit of hand-waving. So lately, you'll find lots of libraries insinuate themselves into classes as a mixin, rather than as a parent class. This is a little bit of you-say-potato-I-say-potato, but whatever.
In Hashie's case, I think that inheritance is being used correctly. All of the classes that Hashie provides (Mash, Dash, Trash, and Clash) inherit from Hash. So the is-a relationship holds.
Sugary data structures taste great
While I'm going on about inheritance, here's how I used to create these sorts of wrapper classes:
User = Struct.new(:name, :age, :url)
For creating simple objects that just need to hold onto some data, I really like this approach. If they end up needing behavior, they can easily grow up:
class User < Struct.new(:name, :age, :url)
  # Behavior goes here
end
I like what Hashie is doing even more though. It's enhancing a core class in a largely unobtrusive way, and doing so from the confines of a library that only those who need it can pull from.
I'd love to see more libraries like this that add extra sass to Ruby's core library. An Array that pages values out to disk on an LRU basis, or a bloom-filter-based Set, perhaps?
I'm excited about languages like Erlang, Haskell, Scala, and Clojure and what they can bring to the adventurous developer. Despite that, I feel strongly that Ruby still has plenty of really nifty tricks up its sleeve.
Examining software principles
There are too many good things to say about the Design Principles Behind Smalltalk. A few of my favorites:
Scope: The design of a language for using computers must deal with internal models, external media, and the interaction between these in both the human and the computer.
This one is really obvious until you get to the last four words. The human and the computer. Luckily we're starting to take for granted the primacy of human communication in programming lately (mostly), but when Smalltalk was created, I'm sure its designers received no shortage of grief when they steered towards humane optimizations.
Uniform metaphor: A language should be designed around a powerful metaphor that can be uniformly applied in all areas.
Smalltalk is largely objects and messages. Lisp is largely lists and functions. Erlang is largely pattern matching, functions, and actors. These aren't perfect languages, but once you deeply understand, really grasp the core concepts, you have the whole language at your command.
Operating System: An operating system is a collection of things that don't fit into a language. There shouldn't be one.
The first sentence is a great principle when considering what should go in the core of a system and what should go in the surrounding ecosystem of libraries. The second sentence is wonderfully bold, in that it cuts against what nearly every successful system has done since Smalltalk was prominent and in that it contradicts the first sentence. I'm not sure what practical use to make of this principle; its density of intrigue is what keeps me coming back to it.
Natural Selection: Languages and systems that are of sound design will persist, to be supplanted only by better ones.
I stopped worrying about what might supplant Ruby a long time ago. Someday, it will happen. And when it does, whatever succeeds Ruby will have to be really awesome to fill its shoes. I'm looking forward to seeing what that is. But the same goes for any technology; they are often replaced with something wholly awesomer than the incumbent.
I've never done it, but it seems like it would be intriguing and vastly informative to sit down with one of the systems I work on daily and try to extract these principles post-hoc. What values and principles are embedded in the system? What does that say about the team and why the system is the way it is? What principles are enablers and what bad habits should the team work to correct?
Making the complicated seem simple
Don Norman, Simplicity Is Not the Answer:
We want devices that do a lot, but that do not confuse, do not lead to frustration. Ahah! This is not about simplicity: it is about frustration. The entire debate is being framed incorrectly. Features is not the same as capability. Simplicity is not the same as usability. Simplicity is not the answer.
Norman goes on to explain how you can take a confusing mass of features and turn it into something less frustrating:
- Modularize into understandable clusters
- Map clearly from actions to results
- Model the ideas and actions cohesively
The article is about interaction design, but it fits just as well in designing programming languages and software.
The Cadence and Flow of Editing Programs
I figured out why my trysts with other editors often end up back at TextMate. It sounds a bit like this:
Tap-tap-tap-tap-tap-tap; TAP; tap-tap-tap-tap-tap-tap-tap-tap-tap-tap-tap-tap; TAP; TAP; tap-tap-tap-tap-tap-tap; TAP.
When I’ve used vi and its descendants, it sounds like this:
Tap-tap-tap-tap-tap-tap; taptaptap; tap-tap-tap-tap-tap-tap; tapTAP TAP! tap-tap-tap-tap-tap-tap-tap-tap-tap-tap-tap-tap-tap-tap-tap-tap-tap-tap. tapTAPTAPtapTAP TAP!
And Emacs sounds like this:
Tap-tap-tap-tap-tap-tap; tapTAPtapTAP. tap-tap-tap-tap-tap-tap;tap-tap-tap-tap-tap-tap;tap-tap-tap-tap-tap-tap; tapTAP TAP; TAP TAPtapTAPtapTAPTAP. tapTAPtapTAP!
Lest you fear I’ve created some Ook-like language for describing shortcuts in any known editor, let me explain what’s going on here.
Cadence
Emacs is, at its core, a Lisp machine with a text editing language wrapped around it. Every interaction with Emacs invokes a function. Handily enough, the function that adds an “a” to the file you’re editing is bound to the a key on your keyboard. Oddly enough, the function that writes the file you’re editing out to disk is bound to the combination of hitting control and x at the same time, followed by control and s at the same time. Getting them out of order matters. Control-s followed by Control-x does something entirely different.
So when you use Emacs, you type a bit, and then you run some command. Maybe you save the file, or switch to editing another file, or go to peruse a directory. So you tap for a while and then you stop tapping, move your hands ever so slightly to mash the control or alt keys, and then tap some other key, usually emphatically. The most commonly used key combinations end up being hit even more emphatically. Sit in a room full of developers using Emacs, listen closely; every once in a while, you’ll hear everyone save almost simultaneously and go back to a flurry of lower-case tapping.
Vi is slightly different from Emacs in that it is built up from two Unix commands: one for editing single lines of text, and another for moving between said lines of text. Thus, the cadence of a vi user is slightly different. Staccato taps followed by a bang as they switch from line editing to line navigation; more staccato taps, this time oddly spaced as they move between lines and place the cursor to begin their next flurry of editing; another burst of staccato text entry; a quick and emphatic tap to take them out of editing mode and then a quick but punctuated trio of taps as they invoke the command that saves the file out, a sequence of finger movements so ingrained in the vi user’s brain that it appears as more of a gesture than a triplet of discrete key presses.
Here’s a project idea for pranksters: stand in a room full of people using vi and Emacs, listen for the really emphatic taps, and trip the room’s breaker right before they all finish their emphatic save commands. Cackle as chaos ensues.
The space between the taps
A roomful of vi-users, Emacs-users, and TextMate users is a homogeneous mess of clackity-clackity to the untrained ear. Most accomplished programmers are touch typists, so what you’re likely to hear is an undifferentiated stream of rapid-fire tapping. But if you’ve used these editors enough, and wasted enough time thinking about the aesthetics they represent, you can hear the differences in the punctuation as commands are invoked by arcane combinations and sequences of keystrokes.
In Vi and Emacs, there is a concise sequence of keys you can mash to do a regular expression search, move down three lines, go to the second sentence on that line, and replace the word under the cursor with “bad-ass text editing programmer, do not offend”. It is, in part, this power that attracts, fascinates, and empowers their users.
TextMate can do this, sure. But there is very little in the way of support from the editor to do it. You mostly have to put your eye on the piece of text you wish to edit and use some primitive motion keystrokes to get the cursor where you want it. Then you use those same keystrokes to highlight the text to replace, this time holding down a modifier key, then you type in the text you want. TextMate, compared to its programmer’s editor brethren, is a language of grunts and chuffs next to the sophisticated Latin or French of vi and Emacs.
Flow
TextMate is unsophisticated next to the extensibility and conceptual unity of Emacs, or the pure practicality of vim. So why do I keep coming back to it?
It keeps me in flow.
This is a very personal answer. I’m not saying you can’t achieve a flow-state with vi or Emacs. I’m saying that while I like the idea of those editors, understand the aesthetic, and enjoy watching skilled operators using them, I get lost in the punctuation when I use them. I either forget what punctuation I should use in some text editing scenario, or I have a nagging doubt that there is some better punctuation I could be using instead.
If vi is about navigating lines and editing those lines, and Emacs is about invoking Lisp functions on files containing text, then TextMate is about primitive but direct manipulation of the text in a file. There’s very little conceptual overhead. You don’t need to know how the editor is enhanced in order to understand how to operate it. You don’t need to know when to put yourself in different modes of operation to make things happen. You just think of what you want the text to look like, you move the cursor around and you type on the keyboard.
It ain’t much, but I (often) call it home.
Breaking My Habits For Editing Programs
I’m a Unix guy, by upbringing. My first formative experiences in software development were on an early, Linux 1.x version of Debian. I’d used Windows, but always came back to Linux. When OS X got good enough around 10.2, I switched to something that didn’t require so much tinkering, so I could make more useful stuff.
Software development on Unix has skewed towards focusing on tools, languages, and text editors for quite some time. IDEs and browsers on Unix are a messy, foreign thing (just like everything else in Unix). Thus, I’ve long favored the terminal-and-editor style of development.
I’ve decided that now is the time for me to try something different. I like text editors and directly manipulating text, but I can see why some people feel naked without an IDE. The ability to pop up a level and make a more broad-stroked transformation to a program is appealing. Having code navigation and semantic awareness baked in has lots of potential.
I’ve probably said grumbly things about RubyMine in the past, but I think now is the time to give it a go. Worst thing that could happen is that I don’t like it and I go back to the infinite tinkering of Emacs or the 85% perfect experience of TextMate.
I’ll let you know how it goes.
I originally wrote that a few months ago, at the apex of my editor neurosis.
I did give RubyMine a try, and I like some parts of it. Its code navigation is pretty nice, it does an admirable job of integrating with the unique ecosystem of tools that a Ruby developer uses to manage their environment, and it does an excellent job of grokking TDD with test/unit and RSpec. RubyMine is a step in the right direction. I suspect that if I had muscle memory for IntelliJ, it would be the way to go.
But, I have muscle memory for TextMate and Emacs, and I have an affinity for being close to my tools. RubyMine felt one step disconnected from both my muscle memory and my tools. That’s quite an accomplishment; most IDEs feel several steps removed from the tools and seem to discourage developing finger-memory in favor of menu-memory. I’ll give RubyMine another try in a year, probably, see how it’s coming along. But in the mean time, it’s great to see that there is a vendor out there tackling the challenge that is tools for Ruby.
A rambling, regurgitated thought on process
Elevator pitch: I’ve found that if you want to divert a productive team into an hour or two of semi-fruitless banter, ask how the team should use Git, Pivotal Tracker, and Capistrano to manage incoming work, verify it, and deploy it to production. In reality, you should ignore all the corner-cases and figure out what will enable you to push really small chunks of work with great frequency.
Ed. What follows isn’t novel, but it was a useful change in perspective for me, so I decided to share.
I’ve been thinking a bit about software processes lately. Despite great variation in telling you how to do so, most processes seem to focus on how to do more stuff faster.
Lately, the notion of doing less has a lot of interest. Lean startups are the new-new thing and Getting Real is the old new thing; both preach getting more done and delivering value by doing less and analyzing the results more.
There are two kinds of “do less” a software developer can engage in. In the past I’ve been a little too focused on how I can take on fewer responsibilities from other parties. Literally doing less by scoping down features, putting off decisions, and focusing on things that seem like they really matter. I sometimes feel like I’ve become too eager to do less, making myself something of a cranky coder/slacker. But I digress.
Recently, I’ve been trying to tackle doing less in my habits of creating software. How can I write less code to implement a feature, not in the minimalist sense, but in the “how do I just get it to kinda work” sense? How can I take less time between starting something and getting some form of it out in the wild? How can I make my code less coupled so there are fewer changes to make when I decide it needs to do something else? How can I make this less coupled to data storage so that putting it out requires less deployment effort? How can I make changes that are less likely to cause long-term regressions? How can I make it less effort to roll back bad changes?
When I look through the lens of accomplishing more by doing less, a lot of popular software methodology seems like dead weight. Rather than trying to find a process that addresses every team member’s own scars and affections, both perceived and imaginary, it seems most useful to imagine the smallest ruleset that won’t result in uncontrollable entropy and put it into action. If something starts to hurt, imagine the simplest new rule and put it into play.
The goal, as stated above, is to get to the point where you make really tiny, maybe imperceptible changes, and push them really frequently. Everything that stands in the way is the enemy.