Designing for Concurrency

A lot is made of how difficult it is to write multithreaded programs. No doubt, it is harder than writing a CRUD application or your own testing library. On the other hand, it’s not as difficult as writing a database or a 3D graphics engine. The point is, it’s worth learning how to do. Skipping the hubris and accepting that your program will have bugs that require discipline to track down is an enabling step in learning to write multithreaded programs.

I haven’t seen much written about the experience of writing a concurrent program and how one designs classes and programs with the rules of concurrency in mind. So let’s look at what I’ve learned about designing threaded programs so far.

The headline is this: only allow objects in consistent states and don’t rely on changing state unless you have to. Let’s first look at a class that does not embody those principles at all.

class Rectangle
  attr_accessor :width, :height

  def orientation
    if width > height
      WIDE
    else
      TALL
    end
  end

  WIDE = "WIDE".freeze
  TALL = "TALL".freeze
end

Just for fun, mentally review that code. What are the shortcomings, what could go wrong, what would you advise the writer to change?

For our purposes, the first flaw is that new Rectangle objects are in an inconsistent state. If we create an object and immediately call orientation, width and height are both nil, so the comparison raises a NoMethodError. If you’re typing along at home:

begin
  r = Rectangle.new
  puts r.orientation
rescue
  puts "whoops, inconsistent"
end

The second flaw is that our object allows bad data. We should not be able to do this:

r.width = 100
r.height = -20
puts r.orientation

Alas, we can. The third flaw is that we could accidentally share this object across threads and end up corrupting its state in one thread because of logic in another thread. This sort of bug is really difficult to figure out, so designing our objects so it can’t happen is highly desirable. We want to make this sort of code safe:

r.height = 150
puts r.orientation

When we modify width or height on a rectangle, we should get back an entirely new object.
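To make the third flaw concrete, here is a minimal sketch of one thread drawing a conclusion about a shared, mutable rectangle while another thread changes it. The MutableRectangle struct and the two queues are assumptions added for illustration; the queues only force the unlucky interleaving that a real scheduler would produce at random.

```ruby
# Hypothetical stand-in for the naive Rectangle: plain mutable state.
MutableRectangle = Struct.new(:width, :height)

shared  = MutableRectangle.new(200, 100)
checked = Queue.new # reader -> main: "I have looked at the rectangle"
resume  = Queue.new # main -> reader: "carry on"

reader = Thread.new do
  wide_before = shared.width > shared.height # the reader decides it is wide
  checked << :done
  resume.pop                                 # paused at the worst moment
  wide_after = shared.width > shared.height  # same question, different answer
  [wide_before, wide_after]
end

checked.pop
shared.height = 500 # unrelated logic in the main thread mutates the object
resume << :go

wide_before, wide_after = reader.value
puts "wide before: #{wide_before}, wide after: #{wide_after}"
```

The reader thread never touched the rectangle, yet the answer changed underneath it. That is the kind of surprise the rest of this article tries to design away.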

Let’s go about fixing each of these flaws.

Encapsulate object state with Tell, Don’t Ask

The first flaw in our Rectangle class is that it isn’t guaranteed to exist in a consistent state. We go through contortions to make sure our databases are consistent; we should do the same with our Ruby objects too. When an object is created, it should be ready to go. It should not be possible to create a new object that is inconsistent.

Further, we can solve the second flaw by enforcing constraints on our objects. We use the “Tell, Don’t Ask” principle to ensure that when users of Rectangle change the object’s state, they don’t get direct access to the object’s state. Instead, they must pass through guards that protect our object’s state.

All of that sounds fancy, but it really couldn’t be simpler. You’re probably already writing your Ruby classes this way:

class Rectangle
  attr_reader :width, :height

  def initialize(width, height)
    # Go through the setters so a new object is validated too.
    self.width = width
    self.height = height
  end

  def width=(w)
    raise "Negative dimensions are invalid" if w < 0
    @width = w
  end

  def height=(h)
    raise "Negative dimensions are invalid" if h < 0
    @height = h
  end

  def orientation
    if width > height
      WIDE
    else
      TALL
    end
  end

  WIDE = "WIDE".freeze
  TALL = "TALL".freeze
end

A lot of little things have changed in this class:

  • The constructor now requires the width and height arguments. If you don’t know the width and height, you can’t create a valid rectangle, so why let anyone get confused and create a rectangle that doesn’t work? Our constructor now encodes and enforces this requirement.
  • The width= and height= setters now enforce validation on the new values. If the constraints aren’t met, a rather blunt exception is raised. If everything is fine, the setters work just like they did in the old class.
  • Because we’ve written our own setters, we use attr_reader instead of attr_accessor.

With just a bit of code, a little explicitness here and there, we’ve now got a Rectangle whose failure potential is far smaller than the naive version. This is simply good design. Why wouldn’t you want a class that is designed not to silently blow up in your face?

The crux of the biscuit for this article is that now we have an object with a narrower, explicit interface. If we need to introduce a concurrency mechanism like locking or serialization (i.e. serial execution), we have some straightforward places to do so. An explicit interface, meaning the specific messages an object responds to, opens up a world of good design consequences!
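As a rough sketch of where a lock could go, the variant below wraps every read and write of the dimensions in a Mutex. The LockedRectangle class and its resize and dimensions methods are hypothetical names invented for this example, not part of the class above. Ruby's Mutex is not reentrant, so the code touches instance variables directly while holding the lock instead of calling other locking methods.

```ruby
# Hypothetical variant of Rectangle that guards its state with a Mutex.
class LockedRectangle
  def initialize(width, height)
    @lock = Mutex.new
    resize(width, height)
  end

  # Change both dimensions in one step so no reader sees a half-updated pair.
  def resize(width, height)
    raise "Negative dimensions are invalid" if width < 0 || height < 0
    @lock.synchronize { @width, @height = width, height }
    self
  end

  # Read both dimensions under the lock as one consistent snapshot.
  def dimensions
    @lock.synchronize { [@width, @height] }
  end

  def orientation
    width, height = dimensions
    width > height ? "WIDE" : "TALL"
  end
end

rect = LockedRectangle.new(10, 1)

# Two writers flip the rectangle between two valid shapes.
writers = [[10, 1], [1, 10]].map do |w, h|
  Thread.new { 1_000.times { rect.resize(w, h) } }
end

# The main thread reads while the writers run.
snapshots = Array.new(1_000) { rect.dimensions }
writers.each(&:join)

puts snapshots.uniq.inspect
```

Every snapshot should be one of the two shapes the writers set, never a mix of one writer's width and the other's height. Because the interface is small, the lock only has to appear in two methods.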

Lean towards immutability and value objects whenever possible

The third flaw in the naive Rectangle class is that it could accidentally be shared across threads, with consequences that may be hard to detect. We can get around that using a technique borrowed from Clojure and Erlang: immutable objects.

class Rectangle
  attr_reader :width, :height

  def initialize(width, height)
    validate_width(width)
    validate_height(height)
    @width, @height = width, height
  end

  # Returns a new Rectangle; self is never modified.
  def set_width(w)
    self.class.new(w, height)
  end

  # Returns a new Rectangle; self is never modified.
  def set_height(h)
    self.class.new(width, h)
  end

  def orientation
    if width > height
      WIDE
    else
      TALL
    end
  end

  WIDE = "WIDE".freeze
  TALL = "TALL".freeze

  private

  def validate_width(w)
    raise "Negative dimensions are invalid" if w < 0
  end

  def validate_height(h)
    raise "Negative dimensions are invalid" if h < 0
  end
end

This version of Rectangle extracts the validation logic into separate methods that the constructor calls. But look more closely at the setters. They do something you don’t often see in Ruby code. Instead of changing self, these setters create an entirely new Rectangle instance with new dimensions, and because they go through the constructor, the new instance is validated as well.

The upside to this is, if you accidentally share an object across threads, any changes to the object will result in a new object owned by the thread that initiated the change. This means you don’t have to worry about locking around these Rectangles; in practice, sharing is, at worst, copying.
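Here is a small sketch of that behavior, assuming a compact stand-in class. FrozenRectangle is a hypothetical name, and the call to freeze is my own addition so that accidental mutation raises instead of silently succeeding.

```ruby
# Compact, hypothetical stand-in for the immutable Rectangle.
class FrozenRectangle
  attr_reader :width, :height

  def initialize(width, height)
    raise "Negative dimensions are invalid" if width < 0 || height < 0
    @width, @height = width, height
    freeze # assumption: not in the original class, added to forbid mutation
  end

  # Returns a new object instead of changing self.
  def set_width(w)
    self.class.new(w, height)
  end
end

shared = FrozenRectangle.new(10, 20)

# Each thread "changes" the shared rectangle and gets its own copy back.
copies = 4.times.map { |i| Thread.new { shared.set_width(100 + i) } }.map(&:value)

puts shared.width                # 10: the shared object is untouched
puts copies.map(&:width).inspect # [100, 101, 102, 103]
```

No lock appears anywhere, because no thread can change what another thread is looking at.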

The downside to this approach is that you could end up with a proliferation of Rectangle objects in memory. This puts pressure on the Ruby GC, which might cause operational headaches further down the line. Clojure gets around this by using persistent data structures that are able to safely share their internal structures, reducing memory requirements. Hamster is one attempt at bringing such “persistent” data structures to Ruby.

Let’s think about object design some more. If you’ve read up on domain-driven design, you probably recognize that Rectangle is a value object. It doesn’t represent any particular rectangle. It binds a little bit of behavior to a domain concept our program uses.
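One practical consequence of treating Rectangle as a value object is that two rectangles with the same dimensions should compare as equal. The sketch below shows one way that could look; ValueRectangle is a hypothetical name, and the equality methods are an addition, not something the classes above define.

```ruby
# Hypothetical value object: identity is defined by its dimensions.
class ValueRectangle
  attr_reader :width, :height

  def initialize(width, height)
    @width, @height = width, height
    freeze
  end

  # Two rectangles are equal when their dimensions are equal.
  def ==(other)
    other.is_a?(ValueRectangle) && width == other.width && height == other.height
  end
  alias eql? ==

  # Equal objects must produce equal hashes so they work as Hash keys.
  def hash
    [self.class, width, height].hash
  end
end

a = ValueRectangle.new(3, 4)
b = ValueRectangle.new(3, 4)

puts a == b               # true: same dimensions, same value
puts a.equal?(b)          # false: still two distinct objects in memory
puts({ a => "found" }[b]) # found: b looks up the entry stored under a
```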

That wasn’t so hard, now was it?

I keep trying to tell people that, in some ways, writing multithreaded programs is as simple as applying common object-oriented design principles. Build objects that are always in a sensible state, don’t allow twiddling that state without going through the object’s interface, use value objects when possible, and consider using immutable value objects if you’re starting from scratch.

Following these principles drastically reduces the number of states you have to think about and thus makes it easier to reason about how the program will run with multiple threads and how to protect data with whatever form of lock is appropriate.

5 thoughts on “Designing for Concurrency”

  1. In Ruby, thread safety is NP-complete :/. Clojure: made in 2007, Ruby: made in 1994.

    Hamster looks really great though, thanks for the link. See a little CL & Clojure in here. I think the memoization module should raise an exception if it’s given a method with non-zero arity, I’ll see if the author wants any help with it.

  2. John Donson

    “I haven’t seen much written about the experience of writing a concurrent program and how one designs classes and programs with the rules of concurrency in mind.”

    You have got to be kidding. There has been endless literature written over the last 30 years about these two topics. Perhaps more than any other field in CS it’s constantly being written about.

    T. G. Mattson, B. A. Sanders, and B. L. Massingill. Patterns for Parallel Programming. Addison-Wesley, 2004.

    Is probably currently the authoritative survey work at the moment. It says parallel programming, but in their language this includes concurrency.

  3. I don’t think there’s any language where thread safety is provable, but it doesn’t mean we can’t make a best attempt. Routing packages being NP doesn’t stop FedEx or UPS.

  4. John, do you have any less formal works to suggest? IMHO, textbooks and papers are useful to some, but simple and concise treatments of concurrency are what we need to make multithreaded programming an approachable topic for programmers of all skill levels.
