AC/DC writes robust songs

AC/DC writes songs that are fundamentally very strong. They aren’t the most touching, artistically composed songs. But they’re very solid songs. They hold together well, you can sing along, they don’t ramble on longer than they should.

How robust is AC/DC’s songwriting?

[YouTube video embed]

You can throw bagpipes into one of their songs and it still holds up just fine. That’s solid songwriting.


They can't all be winners

My Tuesdays typically look like this: write/hack for my weblog, work, lunch, work, short run, and then hack with other Austin nerds at Houndstooth Coffee. As it happens, I did OK on the write/hack, awesome at my first work sprint, OK at my second work sprint, OK on my run, and I’m currently kicking ass in my evening hacks on Sifter.

They can’t all be winners. If you’ve got enough fires going, one is bound to get hot on any given day. Push through the little disappointments to reach those moments of awesomeness.


The forces of change on the US legislature

As of 2012, the major forces operating on the legislation of the US government are, unscientifically speaking:

  • 60% path dependence
  • 20% regulatory capture
  • 10% marginal progress
  • 9% political posturing

Everything else, I’d guess, is a rounding error that fits in that last 1%.

Path dependence, in short, means that once you decide to do something, it’s difficult to unwind all the decisions that follow that original decision. Once you build a military-industrial complex, farm subsidy program, or medical program for the elderly, it becomes increasingly difficult to stop doing those things. You’re invested.

Regulatory capture is a phenomenon where a regulated industry, say telecom, becomes sufficiently cozy with the institutions regulating them that they can manipulate the regulators to ease the boundaries they must operate within, or even impose rules making it difficult for new competitors to enter the industry. To some extent, the inmates run the asylum. Barring an extremely jarring event, like a financial emergency, the regulated can grow their profit margins, comfortable knowing that competitors are increasingly unlikely. More often, regulatory capture is about protecting the golden egg. Path dependence, in the form of subsidies and existing contracts, often goes hand-in-hand with regulatory capture.

Marginal progress is exactly what politicians are not rewarded for. They are rewarded for having strong ties to those with strong ties, for saying the right things, and for staying out of the public eye. Politicians don’t enhance their careers by doing what they tell their voters they seek to do.

Political posturing is what legislators are rewarded for. If they fail to accomplish what they’ve told voters they will do, they can always blameshift it away: not enough political will, distasteful political adversaries, more pressing problems elsewhere.

This seems cynical, but I’ve got my reasons:

  • I find it helps to understand the forces at play before you try to figure out what to invest optimism in.
  • Understanding a system is the first step towards making meaningful changes within it.

Actually, that’s a pretty good way to summarize my approach to understanding the systems of the world: understand the forces, learn the mechanisms, figure out how this system is interconnected to the other systems. The interconnections are the fun parts!


Keep your application state out of your queue

I’m going to pick on Resque here, since it’s nominally great and widely used. You’re probably using it in your application right now. Unfortunately, I need to tell you something unsettling.

¡DRAMA MUSIC!

There’s an outside chance your application is dropping jobs. The Resque worker process pulls a job off the queue, turns it into an object, and passes it to your class for processing. This handles the simple case beautifully. But failure cases are important, if tedious, to consider.

What happens if there’s an exception in your processing code? There’s a plugin for that. What happens if you need to restart your Resque process while a job is processing? There’s a signal for that. What if, between taking a job off the queue and fully processing it, the whole server disappears due to a network partition, hardware failure, or the FBI “borrowing” your rack?

¡DRAMA MUSIC!

Honestly, you shouldn’t treat Resque, or even Redis, like you would a relational or dynamo-style database. Redis, like memcached, is designed as a cache. You put things in it and get them out, really fast. Redis, like memcached, rarely falls over. But if it does, the remediation steps are manual. Redis doesn’t currently have a good high-availability setup (one is being contemplated).

Further, Resque assumes that clients will properly process every message they dequeue. That isn’t a bad assumption; most systems work most of the time. But if a Resque worker process fails, it loses whatever messages it holds in memory, and the Redis instance that runs your Resque queues is none the wiser.

¡DRAMA MUSIC!

In my past usage of Resque, this hasn’t been that big of a deal. Most jobs aren’t business-critical. If the occasional image doesn’t get resized or a notification email doesn’t go out, life goes on. A little logging and data tweaking cures many ills.

But, some jobs are business-critical. They need stronger semantics than Resque provides. The processing of those jobs, the state of that processing, is part of our application’s logic. We need to model those jobs in our application and store that state somewhere we can trust.

I first became really aware of this problem, and a nice solution to it, listening to the Ruby Rogues podcast. Therein, one of the panelists advised everyone to model crucial processing as state machines. The jobs become the transitions from one model state to the next. You store the state alongside an appropriate model in your database. If a job should get dropped, it’s possible to scan the database for models that are in an inconsistent state and issue the job again.

Example follows

Let’s work an example. For our imaginary application, comment notifications are really important. We want to make sure they get sent, come hell or high water. Here’s what our comment model looks like originally:

    class Comment

      after_create :send_notification

      def send_notification
        Resque.enqueue(NotifyUser, self.user_id, self.id)
      end

    end

Now we’ll add a job to send that notification:

    class NotifyUser
      @queue = :notify_user

      def self.perform(user_id, comment_id)
        # Load the user and comment, send a notification!
      end

    end

But, as I’ve pointed out with great drama, this code can drop jobs. Let’s throw that state machine in:

    class Comment
      # We'll use acts-as-state-machine. It's a classic.
      include AASM

      # We store the state of sending this notification in the aptly named
      # `notification_state` column. AASM gives us predicate methods to see if this
      # model is in the `pending?` or `sent?` states and a `notification_sent!`
      # method to go from one state to the next.
      aasm :column => :notification_state do
        state :pending, :initial => true
        state :sent

        event :notification_sent do
          transitions :to => :sent, :from => [:pending]
        end
      end

      after_create :send_notification

      def send_notification
        Resque.enqueue(NotifyUser, self.user_id, self.id)
      end

    end

Our notification has two states: pending and sent. Our web app creates it in the pending state; after the job finishes, it puts it in the sent state.

    class NotifyUser
      @queue = :notify_user

      def self.perform(user_id, comment_id)
        user    = User.find(user_id)
        comment = Comment.find(comment_id)

        user.notify_for(comment)
        # Notification success! Update the comment's state.
        comment.notification_sent!
      end

    end

This is a good start for more reliably processing jobs. However, most jobs handle an interaction between two systems. This notification is a great example: it integrates our application with a mail server or another service that handles our notifications. Those services probably aren’t tolerant of duplicate requests. If our process croaks between the time it tells the mail server to send and the time it updates the notification state in our database, we could accidentally process this notification twice. Back to square one?

¡DRAMA MUSIC!

Not quite. We can reduce our problem space once more by adding a couple more states to our model.

    class Comment
      include AASM

      aasm :column => :notification_state do
        state :pending, :initial => true
        state :sending # We have attempted to send a notification
        state :sent    # The notification succeeded
        state :error   # Something is amiss :(

        # This happens right before we attempt to send the notification
        event :notification_attempted do
          transitions :to => :sending, :from => [:pending]
        end

        # We take this transition if an exception occurs
        event :notification_error do
          transitions :to => :error, :from => [:sending]
        end

        # When everything goes to plan, we take this transition
        event :notification_sent do
          transitions :to => :sent, :from => [:sending]
        end

      end

      after_create :send_notification

      def send_notification
        Resque.enqueue(NotifyUser, self.user_id, self.id)
      end

    end

Now, when our job processes a notification, it first uses notification_attempted. Should this job fail, we’ll know which jobs we should look for in our logs. We could also get a little sneaky and monitor the number of jobs in this state if we think we’re bottlenecking around sending the actual notification. Once the job completes, we transition to the sent state. If anything goes wrong, we catch the exception and put the job in the error state. We definitely want to monitor this state and use the logs to figure out what went wrong, manually fix it, and perhaps write some code to fix bugs or add robustness.
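
Here’s a sketch of what the worker might look like with those transitions wired in. This is my illustration, assuming the AASM events above, which generate bang methods like `notification_attempted!` and state predicates like `sending?`:

    class NotifyUser
      @queue = :notify_user

      def self.perform(user_id, comment_id)
        user    = User.find(user_id)
        comment = Comment.find(comment_id)

        # Record that a worker has picked this notification up.
        comment.notification_attempted!

        user.notify_for(comment)

        # Notification success! Update the comment's state.
        comment.notification_sent!
      rescue => e
        # Something went wrong; park the comment in the error state so we
        # can find it later, then let Resque record the failure too.
        comment.notification_error! if comment && comment.sending?
        raise e
      end

    end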

The sending state is entered when at least one worker has picked up a notification and tried to send the message. Should that worker fail in sending the message or updating the database, we will know. When trouble strikes, we’ll know we have two cases to deal with: notifications that haven’t been touched at all, and notifications that were attempted and may have succeeded. The former, we’ll handle by requeueing them. The latter, we’ll probably have to write a script to grep our mail logs and see if we successfully sent the message. (You are logging everything, centrally aggregating it, and know how to search it, right?)
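
For those untouched notifications, a periodic sweep can put them back on the queue. A minimal sketch, assuming ActiveRecord and that anything still pending after ten minutes is stuck (the threshold is a made-up number; tune it to your app):

    # Re-enqueue notifications that were created but never attempted.
    # Run this from cron or something like resque-scheduler.
    Comment.where(:notification_state => 'pending').
            where('created_at < ?', 10.minutes.ago).
            find_each do |comment|
      Resque.enqueue(NotifyUser, comment.user_id, comment.id)
    end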

The truth is, the integration points between systems are a gnarly problem. You don’t so much solve the problem as get really good at detecting and responding to edge cases. Thus is life in production. But losing jobs? We can make really good progress on that. Don’t worry about your low-value jobs; start with the really important ones, and weave the state of those jobs into the rest of your application. Some day, you’ll thank yourself for doing that.


Convincing yourself you’re not done

Writer’s block gets all the attention. It robs the inspired and stunts the progress of those with a deadline to beat. It’s a starting problem.

At some point, I learned all the tricks for overcoming the start, for getting past the blank canvas. Now, I find myself challenged by the converse. I have a finishing problem. I’m always convincing myself that I’m not done.

How do I get a bunch of words to feel like a cohesive essay? What’s needed to ship this code? How do I get this awesome joke to fit into one little tweet?!

It’s an ongoing challenge. Even if the essay, code, or joke I’m working on isn’t throwing me curveballs, my head can jump in and impose one. Not eloquent enough, has a potential bug, too obtuse. My brain can come up with any number of ways to convince me that I shouldn’t call the thing done.

Here are some things I’ve been trying in order to outthink my brain:

  • Before I sit down to make something, decide what the goal for the session is. Am I trying to get started, explore a new direction, edit or refactor something, or push through the details needed to finish?
  • When I start something, outline it. What is the beginning, middle, and end of the thing? What is the result? What are the materials (example code, a demonstrative screenshot, a funny picture), and do I need to acquire or create them?
  • Put up a little resistance when the temptation to start something new strikes. Consider whether it’s an exploration or a creation. Can I easily turn it into something I can publicize (on my weblog, GitHub, etc.), or is it an intermediate or even throwaway product?

I’m not sure if any of these will prove reliable finishers. Your mileage may vary.

Here’s my desktop folder: I shoved all my previously unfinished projects into another folder and wiped the slate clean.

A cleanish slate. Game on.


Tables and lambdas, a cure for smelly cases

Lots of folks consider case expressions in Ruby a code smell. I’m not ready to write them off just yet, but I know a good replacement for some uses of case when I see it. Rad co-worker David Copeland’s Lookup Tables With Lambdas is one of those replacements. For cases where a method takes a parameter, throws it into a case, and returns a value, I can replace all that lookup business with a hash lookup. To carry the metaphor through, the hash is the lookup table. Rad.

Where it gets fun is when I need to do some kind of dynamic lookup in the hash. I don’t want that work happening at the moment the Ruby interpreter evaluates my hash literal. Reaching into my functional programming bag of tricks, I recall that lambdas can be used to defer evaluation. And that’s exactly what David recommends: if I’ve got database lookups or logic I need to embed in my tables, Ruby’s lambda comes to the rescue!
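
Here’s a minimal sketch of the idea, using my own made-up example (the `Discount` lookup stands in for anything you don’t want evaluated when the hash literal is parsed):

    # Before: a case expression that maps a plan to its price.
    def price_for(plan)
      case plan
      when :free     then 0
      when :basic    then 10
      when :business then Discount.current.apply(50) # hits the database
      else raise ArgumentError, "unknown plan: #{plan}"
      end
    end

    # After: a lookup table. Lambdas defer the expensive bits until
    # they're actually needed.
    PRICES = {
      :free     => lambda { 0 },
      :basic    => lambda { 10 },
      :business => lambda { Discount.current.apply(50) }
    }

    def price_for(plan)
      PRICES.fetch(plan) { raise ArgumentError, "unknown plan: #{plan}" }.call
    end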

This approach works great at the small-to-medium scale. That said, I always keep in mind that a bunch of methods manipulating a hash, using its keys as a convention, is an encapsulated, orthogonal object begging to happen. Remember, it’s Ruby; we can make our objects behave like hashes but still do OO- and test-driven design.


Turns out I was wrong about RSpec subjects

I was afraid that David Chelimsky was going to take away my toys! Consider, explicit use of subject in RSpec considered a smell:

The problem with this example is that the word “subject” is not very intention revealing. That might not appear problematic in this small example because you can see the declaration on line 3 and the reference on line 6. But when this group grows to where you have to scroll up from the reference to find the declaration, the generic nature of the word “subject” becomes a hindrance to understanding and slows you down.

I’m so guilty of using subject heavily. Even worse, I’ve been advocating it to others too. In my defense, it does lend a good deal of concision to specs and seemed like a golden path.

Luckily, David isn’t taking away my toys. He’s got an even better recommendation: just use a method or let with an intention-revealing name. Here’s his example:

    describe Article do
      def article; Article.new; end

      it "validates presence of :title" do
        article.should validate_presence_of(:title)
      end
    end

This is, now that I’m looking at it, way better. As this spec grows, you can add helpers for article_with_comments, article_with_author, etc., and it’s clear, right on the line where the helper is used, what’s going on. No jumping back and forth between contexts. Thumbs up!


Three Easy Essays on Distributed Systems

Ryan Smith is pretty good at thinking about distributed systems. Distributed systems, the systems we (sometimes unwittingly) create on a regular basis these days, are a complicated, dense, far-reaching topic. Ryan’s managed to take a few of its problems and concisely introduce them with simple solutions that apply to all but the largest systems.

In The Worker Pattern, he presents a novel solution to a problem you are probably tackling with background or asynchronous job queues. Teaser: do you know what the HTTP 202 status code does?

A web service that requires high throughput will undoubtedly need to ensure low latency while processing requests. In other words, the process that is serving HTTP requests should spend the least amount of time possible to serve the request. Subsequently if the server does not have all of the data necessary to properly respond to the request, it must not wait until the data is found. Instead it must let the client know that it is working on the fulfillment of the request and that the client should check back later.
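
To make that concrete, here’s a rough sketch of the idea in Sinatra. This is my illustration rather than Ryan’s code, and `Report`/`BuildReport` are hypothetical stand-ins:

    require 'sinatra'
    require 'resque'

    post '/reports' do
      report = Report.create!(:state => 'queued')
      Resque.enqueue(BuildReport, report.id)

      status 202  # Accepted: we're working on it, check back later
      headers 'Location' => "/reports/#{report.id}"
      ''
    end

    get '/reports/:id' do
      report = Report.find(params[:id])
      if report.state == 'done'
        report.to_json
      else
        status 202  # still in progress
        ''
      end
    end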

Coordinating multiple processes that need to process a dataset in bulk is tricky. Large systems usually end up needing some kind of consensus service like Doozer or ZooKeeper to keep all the worker processes from butting heads or duplicating work. Leader Election shows how, by scoping the problem space to existing tools, it becomes possible to put together a solution that scales down to small and medium-sized systems:

My environment already is dependent on Ruby & PostgreSQL so I want a solution that leverages my existing technologies. Also, I don’t want to create a table other than the one which I need to process.
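
Ryan’s essay works through his PostgreSQL-backed approach in detail. Just to give a flavor of leaning on existing tools, here’s a tiny sketch of my own (not necessarily how Ryan does it) that uses a Postgres advisory lock so only one worker acts as leader at a time:

    require 'pg'

    conn = PG.connect(:dbname => 'myapp_production')

    # pg_try_advisory_lock hands the lock to exactly one session; every
    # other process gets false and can sit this round out.
    locked = conn.exec('SELECT pg_try_advisory_lock(42) AS leader')[0]['leader']

    if locked == 't'
      begin
        process_the_dataset   # hypothetical: the bulk work goes here
      ensure
        conn.exec('SELECT pg_advisory_unlock(42)')
      end
    end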

As applications grow, they tend to maintain more and more state across more and more systems. Incidental state is problematic, especially when you have to maintain several services to keep all of it available. Applying Event Buffering mitigates many of these problems. The core idea of this one is my favorite:

We have seen several examples of how to transfer state from our client to our server. The primary reason that we take these steps to transfer state is to eliminate the number of services in our distributed system that have to maintain state. Keeping a database on a service eventually becomes an operational hazard.

Most of the systems we build on the web today are distributed systems. Ryan’s writings are an excellent introduction to thinking about and building these systems. It certainly helps to comb through research papers on the topic, but these three essays are excellent starters down the path to intentionally building distributed systems.


Ruby anthropology with Hopper

Zach Holman is doing some interesting code anthropology on the Ruby community. Consider Aggressively Probing Ruby Projects:

Hopper is a Sinatra app designed to pull down tens of thousands of Ruby projects from GitHub, snapshot each repository into ten equidistant revisions, run them through a battery of tests (which we call Probes), and hopefully come up with some deeply moving insights about how we write Ruby.

There are plenty of code metric gizmos out there. At a glance, Hopper takes a few nice steps over extant projects. Unlike previous tools, it has a clear design, an obvious extension mechanism, and the analysis tools are distinct from the reporting tools. Further, it’s designed to run out-in-the-open, on existing open source projects. This makes it immediately useful and gives it a ton of data to work with.

For entertainment, here’s some information collected on some stuff I worked on at Gowalla: Chronologic and Audit.


A real coding workspace

Do you miss the ability to take a bunch of paper, books, and writing utensils and spread them out over a huge desk or table? Me too!

Light Table is based on a very simple idea: we need a real work surface to code on, not just an editor and a project explorer. We need to be able to move things around, keep clutter down, and bring information to the foreground in the places we need it most.

This project is fantastic. It’s taking a page from the Smalltalk environments of yore, cross-referencing that with Bret Victor’s ideas on workspace interactivity. The result is a kick-in-the-pants to almost every developer’s current workflow.

There’s a lot to think about here. A lot of people focus on making their workflow faster, but what about a workspace that makes it easier to think? There’s a lot of room to design a better workspace, even if you’re not going as far as Light Table does.

There’s a project on Kickstarter to fund further development of Light Table. If you write software, it’s likely in your interest to chip in.


UserVoice's extremely detailed project workflow

Some nice people at UserVoice took the time to jot down how they manage their product. Amongst the lessons learned:

Have a set amount of time per week that will be spent on bugs

We have roughly achieved this by setting a limit on the number of bugs we’ll accept into Next Up per week. This was a bit contentious at first but has resolved a lot of strife about whether a bug is worthy. The customer team is now empowered (or burdened) with the choice of choosing which cards will move on. It’s the product development version of the Hunger Games.

This, to me, is an interesting juxtaposition. Normally, I think of bugs as things that should all be fixed, eventually. Putting some scarcity of labor into them is a great idea. Fixing bugs is great, until it negatively affects morale. Better to address the most critical and pressing bugs and then move the product ball forward. A mechanism to limit the number of bugs to fix, plus the feedback loop of recognizing those who fix bugs in an iteration (they mention this elsewhere in the article), is a great idea.


Cowboy dependencies

So you’ve written a cool open source library. It’s at the point where it’s useful. You’re pretty excited. Even better, it seems like something that might be useful at your day job. You could go ahead and integrate it. Win-win! You get to work out the rough edges on your open source project and make progress on your professional project.

This is tricky ground and it’s not as win-win as you might think. Integrating a new dependency, whether it’s one maintained by a teammate or not, requires communication. Everyone on the team will have to know about the dependency, how to work with it, and how to maintain it within the project. If there’s a deal-breaking concern with the library, consider it feedback on your library; it either needs to better address the problem, or it needs better documentation to address why the problem isn’t so much a problem.

It all comes down to communication. Adding a dependency, even if you know the person who wrote it really well, requires collaboration from your teammates. If you’re not talking to your teammates, you’re just cowboy coding.

Don’t cowboy dependencies into your project!


A Presenter is a signal

When someone says “your view or API layer needs presenters”, it’s easy to get confused. Presenter has become wildcard jargon for a lot of different sorts of things: coordinators, conductors, representations, filters, projections, helpers, etc. Even worse, many developers are in the “uncanny valley” stage of understanding the pattern; it’s close to being a thing, but not quite. I’ve come across presenters with entirely too much logic, presenters that don’t pull their weight as objects, and presenters that are merely indirection. Presenter is becoming a catch-all that stands for organizing logic more strictly than your framework typically provides for.

I could make it my mission to tell every single person they’re wrong about presenters, but that’s not productive and it’s not entirely correct. Rather, presenters are a signal. When you say “we use presenters for our API”, I hear “we found we had too much logic hanging around in our templates, so we started moving it into objects”. From there, every application is likely to vary. Some applications are template-focused and need objects focused on presentational logic. Other apps are API-focused and need more support for emitting and parsing JSON.
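
To put a concrete shape on “moving logic into objects”, here’s the kind of bare-bones presenter I have in mind. It’s a sketch with made-up names, not any particular library’s API:

    # Wraps a comment for display so the template doesn't have to know
    # about formatting rules or fallbacks.
    class CommentPresenter
      def initialize(comment)
        @comment = comment
      end

      def author_name
        @comment.user ? @comment.user.name : 'Anonymous'
      end

      def posted_on
        @comment.created_at.strftime('%B %e, %Y')
      end

      def body_html
        # Markdown rendering and escaping live here instead of the view.
        RDiscount.new(@comment.body).to_html
      end
    end

In a template, `CommentPresenter.new(@comment).posted_on` reads a lot better than an inline `strftime` call, and the formatting logic gets a home you can test.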

At first, I was a bit concerned about the explosion of options for putting more objects into the view part of your typical model-view-controller application. But as I see more applications and highly varied approaches, I’m fine with Rails not providing a standard option. Better to decide what your application really needs and craft it yourself or find something viable off the shelf.

As long as your “presenters” reduce the complexity of your templates and make logic easier to decouple and test, we’re all good, friend.


Learn Unix the Jesse Storimer way

11 Resources for Learning Unix Programming:

I tend to steer clear of the thick reference books and go instead for books that give me a look into how smart people think about programming.

I have a soft spot in my heart for books that are way too long. But Jesse’s on to something, I think. The problem with big Unix books is that they are tomes of arcane rites; most of their content just isn’t relevant to those building systems on a modern Unix (Linux) with modern tools (Java, Python, Ruby, etc.).

Jesse’s way of learning you a Unix is way better, honestly. Read concise programs, cross-reference them with manual pages. Try writing your own stuff. Rinse. Repeat.


How to approach a database-shaped problem

When it comes to caching and primary storage of an application’s data, developers are faced with a plethora of shiny tools. It’s easy to get caught up in how novel these tools are and get over enthusiastic about adopting them; I certainly have in the past! Sadly, this route often leads to pain. Databases, like programming languages, are best chosen carefully, rationally, and somewhat conservatively.

The thought process you want to go through is a lot like what former Gowalla colleague Brad Fults did at his new gig with OtherInbox. He needed to come up with a new way for them to store a mapping of emails. He didn’t jump on the database of the day, the system with the niftiest features, the one with the greatest scalability, or the one that would look best on his resume. Instead, he proceeded as follows:

  1. Describe the problem domain and narrow it down to two specific, actionable challenges
  2. Elaborate on the existing solution and its shortcomings
  3. Identify the possible databases to use and summarize their advantages and shortcomings
  4. Describe the new system and how it solves the specific challenges

Of course, what Brad wrote is post-hoc. He most likely did the first two steps in a matter of hours, took some days to evaluate each possible solution, decided which path to take, and then hacked out the system he later wrote about.

But more importantly, he cheated aggressively. He didn’t choose one database, he chose two! He identified a key unique attribute of his problem: he only needed a subset of his data to be relatively fresh. This gave him the luxury of choosing a cheaper, easier data store for the complete dataset.

In short: solve your problem, not the problem that fits the database, and cheat aggressively when you can.


How I use vim splits

[vimeo www.vimeo.com/38571167 w=500&h=394]

A five-minute exploration of how I use splits in vim to navigate between production and test code and how I move up and down through layers of abstraction. Spoiler: I use vertical splits to put test and production code side-by-side; horizontal splits come into play when I’m navigating through the layers of a web app or something with a client/server aspect to it. I haven’t customized vim to make this work; I just use the normal window keybindings and a bunch of commands from vim-rails.

I seriously heart this setup. It’s worth taking thirty minutes to figure out how you can approximate it with whatever editor setup you enjoy most. Hint: splits are really fantastic.


Tool agnosticism is good for you

When it comes to programming editors, frameworks, and languages, you’re likely to take one of three stances: marry, boff, or kill. There are tools that you want to use for the rest of your life, tools that you want to moonlight or tinker with, and tools you never want to use again in your life.

Tool antagonism, ranting about the tools you want to kill, is fun. It plays well on internet forums. It goes nicely with beer and friends. But on a team that isn’t using absolutely terrible tools, it’s a waste of time.

Unless your team is bizarrely like-minded, it’s likely some disagreement will arise along the lines of editors, frameworks, and languages. I’ve been that antagonistic guy. We can’t use this language, it has an annoying feature. We can’t use this framework, it doesn’t protect us from an annoying kind of bug. I’ve learned these conversations are a waste of social capital and unnecessarily divisive. Don’t be that guy.

There are usually a few tools that are appropriate to a given task. I often have a preference and choose specific tools for my personal projects. There are other tools that I’m adept at using or have used before, but find minor faults with. I’ve found it way better for me to accept that other, non-preferred tool if it’s already in place or others have convictions about using it. Better to put effort into making the project or product better than spinning wheels on programming arcanery.

When it comes to programmer tools, rational agnosticism beats antagonism every time. Train yourself to work amazingly well with a handful of tools, reach adeptness with a few others, and learn how to think with as many tools as possible.


Rails 4, all about scaling down?

To some, Rails is getting big. It brings a lot of functionality to the table. This makes apps easier to get off the ground, especially if you aren’t an expert. But as apps grow, it can lead to pain: there’s so much machinery in Rails that it’s likely you’re abusing something. It’s easy to look at other, smaller tools and think the grass is greener over there.

Rails 4 might have answers to this temptation.

On the controller side of things, it seems likely that some form of Strobe’s Rails extensions will find their way into Rails, making it easier to create apps (or sub-apps) that are focused on providing an API and eschew the parts of ActionPack that aren’t necessary for services. The thing I like a lot about this idea is that it covers a gap between Sinatra and Rails. You can prototype your app with all the conveniences of Rails and then, as the app grows, strip out the parts you don’t need to provide lean services. You could certainly still rewrite services in Sinatra, Grape, or Goliath, but it’s nice to have an option.

On the model side of things, people are, well, modeling. Simpler ways to use ActiveModel with ActionPack will appear in Rails 4. The components the DataMapper team is working on, in the form of Virtus, seem really interesting too. If you want to get started now, you can check out ActiveAttr, sort of the bonus-track version of ActiveModel. Chris Griego’s put a lot of solid thought into this; you definitely want to check out his slides on models everywhere: they’re lurking in your controllers, your requests, your responses, your API clients, everywhere.
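
If you want a taste of that today, ActiveModel already lets a plain Ruby object play nicely with validations and form helpers. A rough sketch of a table-less model, my own illustration:

    # A signup form object that isn't backed by a database table.
    class Signup
      include ActiveModel::Validations
      include ActiveModel::Conversion
      extend  ActiveModel::Naming

      attr_accessor :email, :plan

      validates :email, :presence => true
      validates :plan,  :inclusion => { :in => %w(free basic business) }

      def initialize(attributes = {})
        attributes.each { |name, value| send("#{name}=", value) }
      end

      # form_for asks this to decide between POST and PUT.
      def persisted?
        false
      end
    end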

In short, my best guess on Rails 4, right now, is that it will continue to give developers a curated set of choices and frameworks to get their application off the ground. It will add options to grow your application’s codebase sensibly once it’s proven out.

What I know, for sure, is that the notion of Rails 4 seems really strange to me. How fast time flies. Uphill, both ways.


This silver bullet has happened before and it will happen again

Today it's Node. Before it was Rails. Before it was PHP. Before it was Java. Cogs Bad:

There’s a whole mindset - a modern movement - that solves things in terms of working out how to link together a constellation of different utility components like noSQL databases, frontends, load balancers, various scripts and glue and so on. It’s not that one tool fits all; it’s that they want to use all the shiny new tools. And this is held up as good architecture! Good separation. Good scaling.

I’ve fallen victim to this mindset. Make everything a Rails app, solve all the problems with Ruby, store all the data in distributed databases! It’s not that any of these technologies are wrong; it’s just that they might not yet be right for the problem at hand.

You can almost age generations of programmers like tree rings. I’m of the PHP/Rails generation, though I started in the Linux generation. A few years ago, I thought I could school the Java generations. But it turns out, I’ve learned a ton from them, even when I was a bit of a hubristic brat. The Node generation of developers will teach me more still, both in finding new virtuous paths and in going down false paths so I don’t have to follow them.

That said, it would be delightful if there was a shortcut to get these new generations past the “re-invent all the things!” phase and straight into the “make useful things and constructive dialog about how they’re better” phase.


Bootstrap, subproject, and document your way to a bigger team

Zach Holman's slides on the patterns GitHub uses to scale their team, Ruby Patterns from GitHub's Codebase:

Your company is going to have tons of success, which means you'll have to hire tons of people.

My favorites:

  • Every project gets a script/bootstrap for fetching dependencies, putting data in place, and getting new people ready to go ASAP. This script comes in handy for CI too.
  • Try new techniques by deploying them only to team members at first. The example here was auto-escaping markup. They started with this only enabled for staff, instead of turning it on for everyone and feeling the hurt.
  • Build projects within projects. Inevitably, areas of functionality get so complex or generic that they want to be their own thing. Start by partitioning these things into lib/some_project, document them with a README in lib/some_project, and put the tests in test/some_project. If you need to share it across apps or scale it differently someday, you can pull those folders out and there you go.
  • Write internal, concise API docs with TomDoc. Most things only need 1-3 lines of prose to explain what’s going on. Don’t worry about generating browseable docs; just look in the code. I heart TomDoc so much. (A small example follows after this list.)
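
For flavor, here’s roughly what a TomDoc comment looks like (my example, following the conventions from the TomDoc spec):

    # Public: Duplicate some text an arbitrary number of times.
    #
    # text  - The String to be duplicated.
    # count - The Integer number of times to duplicate the text.
    #
    # Examples
    #
    #   multiplex('Tom', 4)
    #   # => 'TomTomTomTom'
    #
    # Returns the duplicated String.
    def multiplex(text, count)
      text * count
    end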

These ideas really aren’t about patterns, or scaling the volume of traffic your business can handle. They’re about scaling the size of your team and getting more effectiveness out of every person you add.