Here’s a Sunday afternoon hack. It’s a “stack” machine implemented in Clojure. I intended for it to be a stack machine, no airquotes, but I got it working and realized what I’d really built was a machine with two registers and instructions that treat those two registers as a stack. Pretty weird, but it’s not bad for a weekend hack.

I’m going to break my little machine down, and highlight things that will feel refreshingly different to someone, like me, who has spent the past several years in object-oriented languages like Ruby. What follows is observations; I’m still very new to Clojure, despite familiarity with the concepts, so I’ll pass on making global judgements.

Data structures as programs as data

I’ve seen more than one Rubyist, myself included, say that code-as-data, a concept borrowed from Lisp’s syntax, is possible and regularly practiced in Ruby. DSLs and class-oriented little languages accomplish this, to some degree. In my experience, this metaprogramming is really happening at the class level, using the class to hold data that dynamic code parses to generate new behaviors.

In contrast, Clojure, being a Lisp, programs really are data. To wit, this is the crux of my stack machine; the actual stack machine program is a Clojure data structure that in turn specifies some Clojure functions to execute:

    (def program
      [['mpush 1]
       ['mpush 2]
       ['madd]
       ['mpush 4]
       ['msub]
       ['mhalt]])

    (run program)

If you’ve never looked at Clojure or Lisp code, just squint and I bet you’ll keep up. This snippet defines a global variable, of sorts, program, whose value is a list of lists (think Arrays) specifying the instructions in my stack machine program. In short, this program pushes two values on the stack, 1 and 2, adds them, pushes another value 4, subtracts 4 from the result of the addition, and then halts, which prints out the current state of the “stack” registers.

I’ve got a function named run which takes all these instructions, does some Clojure things, then hands them off to instruction functions for execution.

Some familiar idioms

Let’s look at run. It’s really simple.

    (defn run [instructions]
      (reduce execute initial-state instructions))

This function takes one argument, instructions, a Clojure collection (generally called a seq; this one in particular is a vector). Clojure has an amazing library of functions that operate on collections, just as Ruby has Enumerable. In fact, reduce in Clojure is the same idea as inject in Ruby (reduce is aliased to inject in Ruby!). The way I’m calling it says “iterate over a collection instructions, calling execute on each item; on the first iteration, use initial-state as the initial value of the accumulated collection”.

initial-state is another global variable whose value is a mapping (in Ruby, a hash) that maintains the state of the machine. It has two keys, op-a and op-b, representing my two stack-ish registers.

    (def initial-state
      {:op-a nil :op-b nil})

Now you’d expect to find an execute function that takes a collection plus a value and generates a new version of the collection, just like Ruby’s inject. And here that function is:

    (defn execute [state inst]
      (let [fun (ns-resolve *ns* (first inst))
            params (rest inst)]
        (apply fun [params state])))

This one might require extra squinting for eyes new to Clojure. execute takes two arguments, the current state of the stack machine, state, and the instruction to execute, inst. It then uses let to create local variables based on the values of function’s parameters. I use Clojure’s mechanism for turning a quoted variable name (quoting, in Lisp, means escaping a variable name so the interpreter doesn’t try to evaluate it) into a function reference. Because the instruction is of the form [instruction-name arg arg arg ...], I use first and rest to split the instruction into the function name, bound to fun and argument list, bound to params.

The meat of the function “applies” the function I extracted in the let block to the arguments I extracted out of the instruction. Think of apply like send in Ruby; it’s a way to call a function when you have a reference to it.

The sharp reader would now start searching for a bunch of functions, each of which implements an instruction for our stack machine. And so…

Some boilerplate arrives

Here is the implementation for mpush, madd, and mhalt:

    (defn mpush [params state]
      (let [a (state :op-a)
            b (state :op-b)
            v (first params)]
        {:op-a v :op-b a}))

    (defn madd [params state]
      (let [a (state :op-a)
            b (state :op-b)]
        {:op-a (+ a b) :op-b nil}))

    (defn mhalt [params state]
      (println state))

Each instruction takes some arguments and the state of the machine. They do some work and return a new state of the stack machine. Easy, and oh-so-typically functional!

These instructions are where I’d introduce something clever-ish in Ruby. That let where the register values are extracted feels really boilerplate-y. In Ruby, I know what I would do about that: a method taking a block, probably.

I’m not sure how I’d clean this up in Clojure. A macro, a function abstraction? I leave it as an exercise to the reader, and to myself, to find something that involves less copypasta each time a new instruction is implemented.

I found some pleasant surprises in this foray into Clojure:

Building programs from bottom-up functions in a functional language is at least as satisfying as doing the same with a TDD loop in an object-oriented language. It is just a conducive to dividing a problem into quickly solved blocks and then putting the whole thing together. It does, however, lack a repeatable verification artifact as a secondary output.
At first I was a little skeptical of the fact that Clojure mappings (hashes) can be treated as data structures, by passing them to functions, or as functions, by calling them using a key to extract as the parameter. In practice, this is a really awesome thing and it’s a nice way to write one’s own abstractions as well. There’s something to using higher-order functions more prevalently than Ruby does.
The JVM startup isn’t quick in absolute terms, but at this point it’s faster than almost any Rails app, and many pure Ruby apps, to boot. Damning praise for the JVM and Ruby, but the take-away is I never felt distracted our out-of-flow due to waiting around on the JVM.

Bottom line: there’s a lot to like in Clojure. It’s likely you’ll read about more forays into Clojure in this space.

Adam Keys is Thinking

I got Clojure stacks

Data structures as programs as data

Some familiar idioms

Some boilerplate arrives