One step closer to a good pipeline operator for Ruby

I’ve previously yearned for something like Elm and Elixir’s |> operator in Ruby. Turns out, this clever bit of concision is in Ruby 2.5:

object.yield_self {|x| block } → an_object
# Yields self to the block and returns the result of the block.

class Object
  def yield_self
    yield(self)
  end
end

I would prefer then or even | to the verbosely literal yield_self, but I’ll take anything. Surprisingly, both of my options are legal method names!

class Object

  def then
    yield self
  end

  def |
    yield self
  end
end

require "pathname"

 then { |s| }.
 yield_self { |p| }.
 | { |source| source.each_line }.
 select { |line| line.match /^\W*def ([\S]*)/ }.
 map { |defn| p defn }
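
Since the chain above is only a fragment, here is a minimal runnable sketch of the same idea using just the built-in yield_self. It assumes Ruby 2.5 or later; the sample source string and the variable names are made up for illustration.

```ruby
# Assumes Ruby 2.5+, where Object#yield_self is built in.
# The source text below is a made-up sample, not a real file.
source = <<~RUBY
  class Greeter
    def hello
    end

    def goodbye
    end
  end
RUBY

method_names = source.
  yield_self { |text| text.each_line }.             # String -> Enumerator of lines
  select { |line| line.match?(/^\W*def (\S*)/) }.   # keep only method definitions
  map { |line| line[/^\W*def (\S*)/, 1] }.          # pull out the method name
  yield_self { |names| names.sort }                 # pipe the whole array onward

p method_names
```

Each yield_self hands the whole value to the block, whereas select and map work element by element, which is why the two mix comfortably in one chain.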

However, | already has 20+ implementations, either of the mathematical logical-OR variety or of the shell piping variety. Given the latter, maybe there’s a chance!

Next, all we need is:

  • a syntax to curry a method by name (which is in the works!)
  • a syntax to partially apply said curry
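
Until dedicated syntax exists, current Ruby can approximate both steps with Method#curry. This is only a rough sketch: add, curried_add, and add_five are hypothetical names, and yield_self again assumes Ruby 2.5 or later.

```ruby
# Hypothetical two-argument method used only for illustration.
def add(a, b)
  a + b
end

curried_add = method(:add).curry   # curry a method looked up by name
add_five    = curried_add[5]       # partially apply the first argument

bumped = [1, 2, 3].map(&add_five)  # use the partial application as a block
piped  = 10.yield_self(&add_five)  # or drop it into a pipeline step

p bumped
p piped
```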

If those two things make their way into Ruby, I can move on to my next pet feature request: a module/non-global namespace scheme à la Python, ES6, Elixir, etc. A guy can dream!

Simple Ruby pleasures

I think I first discovered the joy of take and drop in my journeys through Haskell. But it appears that, since 2008 at least, we have had the pleasure of using them in Ruby too.

Need the first or last N elements from an Enumerable? Easy!

ary = (1..100).to_a
ary.take(5) # => [1, 2, 3, 4, 5]
ary.drop(95) # => [96, 97, 98, 99, 100]

range = (1..100)
range.take(5) # => [1, 2, 3, 4, 5]
range.drop(95) # => [96, 97, 98, 99, 100]

# On Ruby 1.9 and later, hashes iterate in insertion order
hsh = {:foo => 1, :bar => 2, :baz => 3}
hsh.take(1) # => [[:foo, 1]]
hsh.drop(2) # => [[:baz, 3]]

The real magic is when you use take along with other Enumerable goodies like select and map. Here’s one of my personal favorites amongst the code I wrote in 2010:

class QueryTracer < ActiveSupport::LogSubscriber

  ACCEPT = %r{^(app|config|lib)}.freeze
  FRAMES = 5
  THRESHOLD = 300 # In ms

  def sql(event)
    return unless event.duration > THRESHOLD
    callers = Rails.
      select { |f| f =~ ACCEPT }.
      map { |f| f.split(":").take(2).join(":") }.
      join(" | ")

    # Shamelessly stolen from ActiveRecord::LogSubscriber
    warning = color("SLOW QUERY", RED, true)
    name = '%s (%.1fms)' % [event.payload[:name], event.duration]
    sql  = event.payload[:sql].squeeze(' ')

    warn "  #{warning}"
    warn "    #{name} #{sql}"
    warn "    Trace: #{callers}"
  end
end

QueryTracer.attach_to :active_record

This little ditty is awesome because:

  • It’s super-practical. Drop this in your Rails 3 app, tail your production log, see the slow queries, go to the method in your app calling it, and fix it. Easy.
  • It only activates itself when it’s needed. Queries that execute quickly return immediately.
  • No framework spelunking required. Rails 3’s notification system handles all of it. Rails’ backtrace cleaner gizmo even makes the backtraces much nicer to read.
  • It chains methods to make something that reads like a nice, concise functional program.
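
To see the chaining in isolation, here is a small sketch that runs the same select/map/take/join steps over a hand-written list of backtrace lines. The frames array is invented sample data standing in for a real backtrace; nothing here touches Rails.

```ruby
# Same filter as the tracer above: keep only frames from the app itself.
ACCEPT = %r{^(app|config|lib)}

# Made-up backtrace lines, standing in for what a real backtrace would hold.
frames = [
  "app/models/post.rb:12:in `recent'",
  "vendor/gems/foo/lib/foo.rb:3:in `bar'",
  "app/controllers/posts_controller.rb:7:in `index'",
  "lib/reports/summary.rb:40:in `build'"
]

trace = frames.
  select { |f| f =~ ACCEPT }.                    # drop frames outside the app
  map { |f| f.split(":").take(2).join(":") }.    # keep just "file:line"
  join(" | ")

puts trace
```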

For more Enumerable joy, read up on each_cons.
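
As a quick taste, each_cons slides a fixed-size window across a collection. A minimal sketch, with readings as made-up sample data:

```ruby
# Made-up sample data for illustration.
readings = [3, 5, 4, 8, 9]

# Every pair of neighbours, in order.
pairs = readings.each_cons(2).to_a

# Difference between each reading and the one before it.
deltas = readings.each_cons(2).map { |before, after| after - before }

p pairs
p deltas
```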

A language experiment writ large

For the past year, the Java ecosystem has seen interesting evolution. Java the language continues to take its place as the new safety scissors of programming, but the pieces around it are getting better. The JVM is now acknowledged inside and outside of the Java community as really good stuff. Really interesting software like Hadoop and Cassandra is built on top of Java. Integration with languages like Ruby and Python is getting pretty good.

What’s most interesting to me is that there’s a competition going on for the hearts and minds of those developers who don’t like using safety scissors. This competition is a great experiment into what developers really want in a programming language. For a language nerd such as myself, observing this experiment is a lot of fun.

On one side you’ve got Scala. Scala looks a lot like Java. But on top of that it adds shorthands and pleasantries from Ruby, a really good type system reminiscent of Haskell, and other handy functional features. When you build up a hybrid language like this, two things happen. First, a lot of people look at their checklist, find everything they need, and decide in its favor. Second, you get a pretty complex language.

Clojure, however, looks nothing like Java. It’s a Lisp; it simply can’t. Clojure borrows from Haskell too, this time taking ideas about state (and how to avoid it) and concurrency (notably software transactional memory). Clojure is a funny-looking language at first, but there are some great ideas within it. Plus, it’s a relatively small language; it’s just that it’s a different kind of simple, and almost every concept is new to many developers.

Both these languages are building up strong communities. Both are full of great people with energy and ideas. It’s quite possible that a winner-take-all situation won’t occur. I’d like that.

What’s most interesting to me is to see how people take to the languages. Will they go for the familiarity of Scala and deal with the complexity? Will they learn the simplicities of Clojure and rewire their brains? Will they prove the common wisdom wrong and learn both?

I’m watching with great interest.

Bundler, not as bad as they say

Of all the new moving parts in Rails 3, the one I see the most grousing over is Bundler. This is not surprising, as it’s a big part of how your application works and it’s right up front in the process of porting or building Rails 3 apps. From “Bundler: As Simple as What You Did Before”:

Bundler has a lot of advanced features, and it’s definitely possible to model fairly complex workflows. However, we designed the simple case to be extremely simple, and to usually be even less work than what you did before. The problem often comes when trying to handle a slightly off-the-path problem, and using a much more complex solution than you need to. This can make everything much more complicated than it needs to be.

I haven’t run into anything with Bundler that I couldn’t solve with a little critical thinking and maybe a little searching. On the other hand, Bundler has made getting dependencies straight amongst team members and deploying them to production servers far easier than it was before. I’m very glad that, while it’s not strictly within the scope of Rails, Bundler is now part of it.

An ode to Hashie

I was building an API wrapper this weekend. As is common when writing these sorts of things, I found myself needing something that takes semi-structured data (hashes parsed from JSON) and yields Ruby objects that are easy to work with. I’ve always found myself hacking these sorts of things together on a somewhat ad-hoc basis. It’s fun, but a bit of a yak-shave.

This time around, I decided to see if the state of the art has advanced in this realm. Luckily, I reviewed Wynn Netherland’s slides from Lone Star Ruby Conference and found exactly what I needed.

Where have you been all my life?

Intridea’s Hashie is a library built on the notion of making hash-like data structures act a little more like objects and a little easier to work with. I have literally wanted something like this for years!

Suppose you have a hash like the following:

hash = {
  "name" => "Adam",
  "age" => 31,
  "url" => ""
}

Coding up an object to store that isn’t too hard, but writing the code that pulls values out of the Hash and tucks them away in the right attribute on the object gets tedious quickly. Hashie’s Dash class makes this trivial.

class User < Hashie::Dash
  property :name
  property :age
  property :url
end

It’s even more delightful to use:

user = User.new(hash)
user.name # => "Adam"

Tons of boilerplate code, eliminated. My life is instantly better.
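
For comparison, here is roughly the kind of hand-rolled wrapper that Dash makes unnecessary. This is a plain-Ruby sketch with no Hashie involved; PlainUser is a hypothetical class name.

```ruby
# Hypothetical hand-written wrapper: every attribute is pulled out of the
# hash by hand, which is the tedium described above.
class PlainUser
  attr_reader :name, :age, :url

  def initialize(attributes)
    @name = attributes["name"]
    @age  = attributes["age"]
    @url  = attributes["url"]
  end
end

plain_user = PlainUser.new("name" => "Adam", "age" => 31, "url" => "")
puts plain_user.name
```

Every new field means touching the reader list and the initializer, which is exactly the repetition a declarative property list removes.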

A great use of inheritance

It’s been pointed out that ActiveRecord’s use of inheritance is somewhat specious. To argue that “user is-a ActiveRecord::Base” takes a bit of hand-waving. So lately, you’ll find lots of libraries insinuate themselves into classes as a mixin, rather than as a parent class. This is a little bit of you-say-potato-I-say-potato, but whatever.

In Hashie’s case, I think that inheritance is being used correctly. All of the classes that Hashie provides (Mash, Dash, Trash and Clash) inherit from Hash. So the is-a relationship holds.

Sugary data structures taste great

While I’m going on about inheritance, here’s how I used to create these sorts of wrapper classes:

User = Struct.new(:name, :age, :url)

For creating simple objects that just need to hold onto some data, I really like this approach. If they end up needing behavior, they can easily grow up:

class User < Struct.new(:name, :age, :url)
  # Behavior goes here
end
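
A small sketch of how the Struct approach meets a hash parsed from JSON. Person, adult?, and the row data are hypothetical names and values chosen for illustration.

```ruby
# Hypothetical Struct-based wrapper with one piece of behavior attached.
Person = Struct.new(:name, :age, :url) do
  def adult?
    age >= 18
  end
end

# Made-up data, shaped like a hash parsed from JSON.
row = { "name" => "Adam", "age" => 31, "url" => "" }

# Struct.new takes positional arguments, so pick the values out in order.
person = Person.new(*row.values_at("name", "age", "url"))

puts person.name
puts person.adult?
```

The positional constructor is the weak spot here: the caller has to know the field order, which is one reason a hash-aware wrapper is nicer for API data.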

I like what Hashie is doing even more though. It’s enhancing a core class in a largely unobtrusive way, and doing so from the confines of a library that only those who need it can pull from.

I’d love to see more libraries like this that add extra sass to Ruby’s core library. An Array that pages values out to disk on an LRU basis, perhaps, or a bloom-filter-based Set?

I’m excited about languages like Erlang, Haskell, Scala, and Clojure and what they can bring to the adventurous developer. Despite that, I feel strongly that Ruby still has plenty of really nifty tricks up its sleeve.

Examining software principles

There are too many good things to say about the Design Principles Behind Smalltalk. A few of my favorites:

Scope: The design of a language for using computers must deal with internal
models, external media, and the interaction between these in both
the human and the computer.

This one is really obvious until you get to the last four words. The
human and the computer. Luckily we’re starting to take for granted
the primacy of human communication in programming lately (mostly), but
when Smalltalk was created, I’m sure its designers received no
shortage of grief when they steered towards humane optimizations.

Uniform metaphor: A language should be designed around a powerful
metaphor that can be uniformly applied in all areas.

Smalltalk is largely objects and messages. Lisp is largely lists and
functions. Erlang is largely pattern matching, functions, and
actors. These aren’t perfect languages, but once you deeply
understand, really grasp the core concepts, you have the whole
language at your command.

Operating System: An operating system is a collection of things that don’t fit into a language. There shouldn’t be one.

The first sentence is a great principle when considering what should
go in the core of a system and what should go in the surrounding
ecosystem of libraries. The second sentence is wonderfully bold, in
that it cuts against what nearly every successful system has done
since Smalltalk was prominent and in that it contradicts the first
sentence. I’m not sure what practical use to make of this principle;
its density of intrigue is what keeps me coming back to it.

Natural Selection: Languages and systems that are of sound design will persist, to be supplanted only by better ones.

I stopped worrying about what might supplant Ruby a long time
ago. Someday, it will happen. And when it does, whatever succeeds Ruby
will have to be really awesome to fill its shoes. I’m looking
forward to seeing what that is. But the same goes for any technology;
they are often replaced with something wholly awesomer than the

I’ve never done it, but it seems like it would be intriguing and
vastly informative to sit down with one of the systems I work on daily
and try to extract these principles post-hoc. What values and
principles are embedded in the system? What does that say about the
team and why the system is the way it is? What principles are enablers
and what bad habits should the team work to correct?

A quick RVM rundown

(It so happens I’m presenting this at Dallas.rb tonight. Hopefully it can be useful to those out in internetland too.)

RVM gives you three things:

  • an easy way to use multiple versions of multiple Ruby VMs
  • the ability to manage multiple independent sets of gems
  • more sanity

First, let’s install RVM:

  • gem install rvm
  • rvm-install
  • follow the directions to integrate with your shell of choice

Now, let’s install some Rubies:

  • rvm list known will show us all the released Rubies that we can
    install (more on list)
  • rvm list rubies will show which Rubies we have locally installed
  • rvm install ree-1.8.7 gives me the latest release of the 1.8.7
    branch of Ruby Enterprise Edition
  • rvm install jruby will give me the default release for JRuby
  • rvm use jruby will switch to JRuby
  • rvm use ree will give me Ruby Enterprise Edition
  • rvm use ruby-1.8.6 will give me an old and familiar friend
  • rvm use system will put me back wherever my operating system left me

The other trick that RVM gives us is the ability to switch between
different sets of installed gems:

  • Each Ruby VM (JRuby, Ruby 1.9, Ruby 1.8, REE) has its own set of
    gems. This is a fact of life, due to differing APIs and, you know,
    underlying languages.
  • rvm use ruby-1.9.1 gives you the default Ruby 1.9 gemset
  • rvm use ruby-1.9.1%acme gives you the gemset for your work with
    Acme Corp (more on using gemsets)
  • rvm use ruby-1.9.1%wayne gives you the gemset for your work with
    Wayne Enterprises
  • rvm use ree%awesome gives you the gemset for your awesome app
  • You can export and import gemsets. This can come in handy to bring
    new people onboard. No longer will they have to sheepishly install
    gems on their first day as they work through dependencies you long
    since forgot about.

I also promised you some extra sanity:

  • RVM knows how to compile things, put Rubygems and rake in place, even apply patches and pull from specific tags. You can do more important things, like watch The View or read an eleven part series on pre-draft analysis for the Cowboys.
  • RVM lets you isolate different applications you’re working on. Got one app that doesn’t play nice with Rails 2.x installed? No problem, create a gem environment for that! Stuck in the spider-web of Merb dependencies? Isolate it in its own environment.
  • RVM makes multi-platform testing and benchmarking easy. You can easily run your test suite or performance gizmo on whatever Rubies you have installed.
  • RVM makes it easy to tinker with esoteric patchlevels and implementations. For instance, feel free to tinker with MagLev or the mput branch of MRI.

A couple other things RVM tastes great with:

  • Using Homebrew to manage packages instead of MacPorts
  • Not using sudo to install your gems
  • Managing your dotfiles on GitHub

Give attribute_mapper a try

(For the impatient: skip directly to the `attribute_mapper` gem.)

In the past couple months, I’ve worked on two different projects that needed something like an enumeration, but in their data model. Given the ActiveRecord hammer, they opted to represent the enumeration as a has-many relationship and use a separate table to represent the actual enumeration values.

To a man with an ORM, everything looks like a model

So, their code ended up looking something like this:

class Post < ActiveRecord::Base

  belongs_to :status

end

class Status < ActiveRecord::Base

  has_many :posts

end

From there, the statuses table is populated either from a migration or by seeding the data. Either way, they end up with something like this:

# Supposing statuses has a name column
Status.create(:name => 'draft')
Status.create(:name => 'reviewed')
Status.create(:name => 'published')

With that in place, they can fiddle with posts as such:

post.status = Status.find_by_name('draft')
post.status.name # => 'draft'

It gets the job done, sure. But, it adds a join to a lot of queries and abuses ActiveRecord. Luckily…

I happen to know of a better way

If what you really need is an enumeration, there’s no reason to throw in another table. You can just store the enumeration values as integers in a database column and then map those back to human-friendly labels in your code.

Before I started at FiveRuns, Marcel Molina and Bruce Williams wrote a plugin that does just this. I extracted it and here we are. It’s called attribute_mapper, and it goes a little something like this:

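class Post < ActiveRecord::Base
  # Some versions of the gem may also need: include AttributeMapper
  map_attribute :status, :to => {
    :draft => 1,
    :reviewed => 2,
    :published => 3
  }
end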

See, no extra table, no need to populate the table, and no extra model. Now, fiddling with posts goes like this:

post.status = :draft
post.status # => :draft
post.read_attribute(:status) # => 1

Further, we can poke the enumeration directly like so:

Post.statuses # => { :draft => 1, :reviewed => 2, :published => 3 }
Post.statuses.keys # => [:draft, :reviewed, :published]

Pretty handy, friend.
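
Stripped of ActiveRecord, the underlying idea is small enough to sketch in plain Ruby. This is not how attribute_mapper is implemented; PlainPost, STATUSES, and raw_status are hypothetical names, and raw_status stands in for the integer column.

```ruby
# Hypothetical plain-Ruby illustration of "store an integer, speak in labels".
class PlainPost
  STATUSES = { :draft => 1, :reviewed => 2, :published => 3 }.freeze

  # The integer that would live in the database column.
  attr_reader :raw_status

  # Accept a friendly label and store its integer; unknown labels raise KeyError.
  def status=(label)
    @raw_status = STATUSES.fetch(label)
  end

  # Map the stored integer back to its label.
  def status
    STATUSES.key(@raw_status)
  end
end

plain_post = PlainPost.new
plain_post.status = :draft
p plain_post.status
p plain_post.raw_status
```

Using fetch rather than [] means a typo in a label fails loudly instead of quietly storing nil.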

Hey, that looks familiar

If you’ve read Advanced Rails Recipes, you may find this eerily familiar. In fact, recipe #61, “Look Up Constant Data Efficiently” tackles a similar problem. And in fact, I’m migrating a project away from that approach. Well, partially. I’m leaving two models in place where the “constant” model, Status in this case, has actual code on it; that sorta makes sense, though I’m hoping to find a better way.

But, if you don’t need real behavior on your constants, attribute_mapper
is ready to make your domain model slightly simpler.

Testing declarative code

I’m a little conflicted about how and if one should write test code for declarative code. Let’s say I’m writing a MongoMapper document class. It might look something like this:

class Issue

  include MongoMapper::Document

  key :title, String
  key :body, String
  key :created_at, DateTime

end

Those key calls. Should I write a test for them? In the past, I’ve said “yes” on the principle that I was test driving the code and I needed something to fail in order to add code. Further, the growing ML-style-typing geek within me likes that writing tests for this is somewhat like constructing my own wacky type system via the test suite.

A Shoulda-flavored test might look something like this:

class IssueTest < Test::Unit::TestCase

  context 'An issue' do

    should_have_keys :title, :body, :created_at

  end

end

Ignoring the recursive rathole that I’ve now jumped into, I’m left with the question: what use is that should_have_keys? Will it help someone better understand Issue at some point in the future? Will it prevent me from naively breaking the software?
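
To make the question concrete, here is a sketch of what such a check boils down to. TinyDocument is a hypothetical stand-in for a document DSL, not MongoMapper's real implementation, and TinyIssue mirrors the Issue class above (with Time in place of DateTime to stay dependency-free).

```ruby
# Hypothetical stand-in for a declarative document DSL.
module TinyDocument
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Registry of declared keys and their types.
    def keys
      @keys ||= {}
    end

    # Record the declaration and define a reader/writer pair for it.
    def key(name, type)
      keys[name] = type
      attr_accessor name
    end
  end
end

class TinyIssue
  include TinyDocument

  key :title, String
  key :body, String
  key :created_at, Time
end

# The "test": restate the declarations and compare them with the registry.
expected_keys = [:title, :body, :created_at]
missing = expected_keys - TinyIssue.keys.keys

p missing
```

The check only restates the declarations in a second place, so it can fail only when one of the two lists is edited without the other.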

Perhaps this is the crux of the biscuit: by adding code to make certain those key calls are present, have I addressed the inherent complexity of my application or have I imposed complexity?

I’m going to experiment with swinging back towards leaving these sorts of declarations alone. The jury is still out.