Curated
Etsy's rules of distributed systems
Architecting for change, on complex systems and change:
- Distributed systems are inherently complex.
- The outcome of change in complex systems is hard to predict.
- The outcomes of small, frequent, measurable changes are easier to predict, easier to recover from, and promote learning.
I’d have thought all the useful things to say about Etsy were said, at this point, but I’d have thought wrong!
There’s a good saying about designing distributed systems that goes something like “avoid it as long as possible”. I think these three guidelines are worth adding to that saying. Iterate, examine, repeat. Don’t make big, tricky changes. In fact, large changes you can’t recover from are nearly impossible to make anyway, so route around them entirely.
The last bit, “promote learning”, is great too. I follow distributed systems and database designers on Twitter and see tons of great papers and ideas in the exchange. More than that, always teach your teammates about the distributed systems you’re building. The more they know about the design and constraints of the system you’re making, the easier it is for them to work with those systems. If you can’t teach someone to use your system, you probably don’t understand it well enough.
Thread safety in Rails, explained!
Read up on Thread and Queue and ready for more multi-threaded Ruby reading? Aaron Patterson has written up how the thread safe option works in Rails and some tradeoffs involved in removing the option and making thread safety the default. It’s not as complicated as you might think!
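For a feel of what Thread and Queue give you on their own, here’s a minimal producer/consumer sketch in plain Ruby; nothing Rails-specific, and the squaring is just stand-in work:

```ruby
require "thread" # where Queue lives on older Rubies; harmless on newer ones

jobs    = Queue.new
results = Queue.new

# Queue is thread-safe, so the workers can pop jobs without extra locking.
workers = 4.times.map do
  Thread.new do
    while (job = jobs.pop) != :done
      results << job * job # stand-in for real work
    end
  end
end

10.times { |i| jobs << i }
workers.size.times { jobs << :done } # one poison pill per worker
workers.each(&:join)

puts results.size # => 10
```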
The only rub I can see is that, as far as I can tell, he’s talking about making this the default for production mode. Making it the default for development mode isn’t tenable if you want to keep class reloading around, which almost everyone does. It’s just a hunch, but running without thread safety in development seems weird when it’s the default in production. But, some teams run YARV in development and JRuby in production, so maybe I’m just making up things to worry about.
Ruby anthropology with Hopper
Zach Holman is doing some interesting code anthropology on the Ruby community. Consider Aggressively Probing Ruby Projects:
Hopper is a Sinatra app designed to pull down tens of thousands of Ruby projects from GitHub, snapshot each repository into ten equidistant revisions, run them through a battery of tests (which we call Probes), and hopefully come up with some deeply moving insights about how we write Ruby.
There are plenty of code metric gizmos out there. At a glance, Hopper takes a few nice steps over extant projects. Unlike previous tools, it has a clear design, an obvious extension mechanism, and the analysis tools are distinct from the reporting tools. Further, it’s designed to run out-in-the-open, on existing open source projects. This makes it immediately useful and gives it a ton of data to work with.
For entertainment, here’s some information collected on some stuff I worked on at Gowalla: Chronologic and Audit.
A real coding workspace
Do you miss the ability to take a bunch of paper, books, and writing utensils and spread them out over a huge desk or table? Me too!
Light Table is based on a very simple idea: we need a real work surface to code on, not just an editor and a project explorer. We need to be able to move things around, keep clutter down, and bring information to the foreground in the places we need it most.
This project is fantastic. It’s taking a page from the Smalltalk environments of yore, cross-referencing that with Bret Victor’s ideas on workspace interactivity. The result is a kick-in-the-pants to almost every developer’s current workflow.
There’s a lot to think about here. A lot of people focus on making their workflow faster, but what about a workspace that makes it easier to think? There’s a lot of room to design a better workspace, even if you’re not going as far as Light Table does.
There’s a project on Kickstarter to fund further development of Light Table. If you write software, it’s likely in your interest to chip in.
UserVoice's extremely detailed project workflow
Some nice people at UserVoice took the time to jot down how they manage their product. Amongst the lessons learned:
Have a set amount of time per week that will be spent on bugs
We have roughly achieved this by setting a limit on the number of bugs we’ll accept into Next Up per week. This was a bit contentious at first but has resolved a lot of strife about whether a bug is worthy. The customer team is now empowered (or burdened) with choice of choosing which cards will move on. It’s the product development version of the Hunger Games.
This, to me, is an interesting juxtaposition. Normally, I think of bugs as things that should all be fixed, eventually. Putting some scarcity of labor into them is a great idea. Fixing bugs is great, until it negatively affects morale. Better to address the most critical and pressing bugs and then move the product ball forward. A mechanism that limits the number of bugs to fix, plus the feedback loop of recognizing those who fix bugs in an iteration (they mention this elsewhere in the article), goes a long way.
Learn Unix the Jesse Storimer way
11 Resources for Learning Unix Programming:
I tend to steer clear of the thick reference books and go instead for books that give me a look into how smart people think about programming.
I have a soft spot in my heart for books that are way too long. But Jesse’s on to something, I think. The problem with big Unix books is that they are tomes of arcane rites; most of their content just isn’t relevant to those building systems on a modern Unix (Linux) with modern tools (Java, Python, Ruby, etc.).
Jesse’s way of learning you a Unix is way better, honestly. Read concise programs, cross-reference them with manual pages. Try writing your own stuff. Rinse. Repeat.
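In that spirit, here’s about the smallest Unix-programming exercise you can write in Ruby, assuming a Unix-ish platform where Kernel#fork is available: fork a child process, do some pretend work there, and reap it in the parent.

```ruby
pid = fork do
  puts "child  #{Process.pid}: pretending to do some work"
  exit 42
end

Process.wait(pid)   # block until the child exits
status = $?         # Process::Status of the child we just reaped
puts "parent #{Process.pid}: child exited with status #{status.exitstatus}"
```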
How to approach a database-shaped problem
When it comes to caching and primary storage of an application’s data, developers are faced with a plethora of shiny tools. It’s easy to get caught up in how novel these tools are and get over enthusiastic about adopting them; I certainly have in the past! Sadly, this route often leads to pain. Databases, like programming languages, are best chosen carefully, rationally, and somewhat conservatively.
The thought process you want to go through is a lot like what former Gowalla colleague Brad Fults did at his new gig with OtherInbox. He needed to come up with a new way for them to store a mapping of emails. He didn’t jump on the database of the day, the system with the niftiest features, the one with the greatest scalability, or the one that would look best on his resume. Instead, he proceeded as follows:
- Describe the problem domain and narrow it down to two specific, actionable challenges
- Elaborate on the existing solution and its shortcomings
- Identify the possible databases to use and summarize their advantages and shortcomings
- Describe the new system and how it solves the specific challenges
Of course, what Brad wrote is post-hoc. He most likely did the first two steps in a matter of hours, took some days to evaluate each possible solution, decided which path to take, and then hacked out the system he later wrote about.
But more importantly, he cheated aggressively. He didn’t choose one database, he chose two! He identified a key attribute of his problem: he only needed a subset of his data to be relatively fresh. This gave him the luxury of choosing a cheaper, easier data store for the complete dataset.
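This isn’t Brad’s actual code, just a generic sketch of the two-store idea: a hypothetical fast store for the fresh subset, backed by a cheaper store that holds everything.

```ruby
# Hypothetical sketch: a small, fast store (think in-memory cache or Redis)
# holds only the recently-touched mappings; a cheap, complete store holds
# everything. Reads try the fresh subset first and fall back to the full set.
class MappingLookup
  def initialize(fresh_store, complete_store)
    @fresh_store    = fresh_store    # fast, holds a subset
    @complete_store = complete_store # cheap, holds everything
  end

  def fetch(email)
    @fresh_store[email] || @complete_store[email]
  end

  def write(email, mapping)
    @complete_store[email] = mapping # always persisted in full
    @fresh_store[email]    = mapping # kept hot while it's recent
  end
end

# Usage, with plain hashes standing in for the real databases:
lookup = MappingLookup.new({}, {})
lookup.write("user@example.com", "mailbox-42")
lookup.fetch("user@example.com") # => "mailbox-42"
```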
In short: solve your problem, not the problem that fits the database, and cheat aggressively when you can.
This silver bullet has happened before and it will happen again
Today it's Node. Before it was Rails. Before it was PHP. Before it was Java. Cogs Bad:
There’s a whole mindset - a modern movement - that solves things in terms of working out how to link together a constellation of different utility components like noSQL databases, frontends, load balancers, various scripts and glue and so on. It’s not that one tool fits all; it’s that they want to use all the shiny new tools. And this is held up as good architecture! Good separation. Good scaling.
I’ve fallen victim to this mindset. Make everything a Rails app, solve all the problems with Ruby, store all the data in distributed databases! It’s not that any of these technologies are wrong; it’s just that they might not yet be right for the problem at hand.
You can almost age generations of programmers like tree rings. I’m of the PHP/Rails generation, though I started in the Linux generation. A few years ago, I thought I could school the Java generations. But it turns out, I’ve learned a ton from them, even when I was a bit of a hubristic brat. The Node generation of developers will teach me more still, both in finding new virtuous paths and in going down false paths so I don’t have to follow them.
That said, it would be delightful if there was a shortcut to get these new generations past the “re-invent all the things!” phase and straight into the “make useful things and constructive dialog about how they’re better” phase.
Bootstrap, subproject, and document your way to a bigger team
Zach Holman's slides on the patterns GitHub uses to scale their team, Ruby Patterns from GitHub's Codebase:
Your company is going to have tons of success, which means you'll have to hire tons of people.
My favorites:
- Every project gets a `script/bootstrap` for fetching dependencies, putting data in place, and getting new people ready to go ASAP. This script comes in handy for CI too.
- Try new techniques by deploying them only to team members at first. The example here was auto-escaping markup. They started with this only enabled for staff, instead of turning it on for everyone and feeling the hurt.
- Build projects within projects. Inevitably, areas of functionality start to get so complex or generic that they want to be their own thing. Start by partitioning these things into `lib/some_project`, document it with a README in `lib/some_project`, and put the tests in `test/some_project`. If you need to share it across apps or scale it differently someday, you can pull those folders out and there you go.
- Write internal, concise API docs with TomDoc (there’s a small sketch after this list). Most things only need 1-3 lines of prose to explain what’s going on. Don’t worry about generating browsable docs, just look in the code. I heart TomDoc so much.
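If you haven’t seen TomDoc, here’s roughly what it looks like; this example is adapted from the TomDoc spec, not lifted from GitHub’s codebase.

```ruby
# Public: Duplicate some text an arbitrary number of times.
#
# text  - The String to be duplicated.
# count - The Integer number of times to duplicate the text.
#
# Examples
#
#   multiplex("Tom", 4)
#   # => "TomTomTomTom"
#
# Returns the duplicated String.
def multiplex(text, count)
  text * count
end
```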
These ideas really aren’t about patterns, or scaling the volume of traffic your business can handle. They’re about scaling the size of your team and getting more effectiveness out of every person you add.
How to make a CIA spy, and other anecdotes
And the hilariously incompetent, such as the OSS operative whose cover was so far blown that when he dropped into his favorite restaurant, the band played “Boo! Boo! I’m a Spy.”
Interesting, new-to-me tidbits on what goes into making CIA spies, what they actually do in the field, and how the practitioners of spy craft have changed over the years. The bad news: spy recruitment doesn’t exactly work like in Spies Like Us. The good news: the CIA and its spying is closer to “just as bad/inept as you’d think” than “as diabolical as a James Bond villain”.
Own your development tools, and other cooking metaphors
Noel Rappin encourages all of us to use our development tools efficiently. If your editor or workflow aren’t working for you, get a new tool and learn to use it.
I’ve been working with another principle lately: minimize moving parts. I used to spend time setting up tools like autotest, guard, or spork. But it ended up that I spent too much time tweaking them or, even worse, figuring out exactly how they were working.
I’ve since adopted a much simpler workflow. Just a terminal, a text editor, and some scripts/functions/aliases/etc. for running the stuff I do all the time. I take note when I’m doing something repeatedly and figure out how I can automate it. Besides that, I don’t spend much time thinking about my tools. I spend time thinking about the problem in front of me. It makes a lot of sense, when you think about it.
I say you should “own” your tools and minimize moving parts because you should understand how they all work together and how they might change the behavior of your code. If you don’t own your tools in this way, you’ll end up wasting time debugging someone else’s code, i.e. a misbehaving tool. That’s just a waste of time; when you come across a tool that offends in this way, put aside a time block to fix it, or discard it outright.
What kind of HTTP API is that?
An API Ontology: if you were curious about the differences between RPC, SOAP, REST, and Hypermedia APIs, but were afraid to ask. In my opinion, this is not prescription; I don't think there's anything inherently wrong with using any of these, except SOAP. Sometimes an RPC or a simple GET is all you need.
On rolling one's own metrics kit
On instrumenting Rails, custom aggregators, bespoke dashboards, and reinventing the wheel; 37signals documents their own metrics infrastructure. They’re doing some cool things here:
- a StatsD fork that stores to Redis; for most people, this is way more sensible than the effort involved in installing Graphite, let alone maintaining it
- storing aggregated metrics to flat files; it’s super-tempting to overbuild this part, but if flat files work for you, run with it
- leaning on ActiveSupport notifications for instrumentation; I’ve tinkered with this a little and it’s awesome, so I highly recommend it if you have the means (a minimal sketch follows this list)
- building a custom reporting app on top of their metric data; anything is better than the Graphite reporting app
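If you haven’t played with ActiveSupport notifications, here’s a minimal sketch of the instrument/subscribe pair; the render.my_app event name and payload are made up for illustration.

```ruby
require "active_support/notifications"

# Subscribe to an event name; the block receives timing info plus the payload.
ActiveSupport::Notifications.subscribe("render.my_app") do |name, start, finish, id, payload|
  duration_ms = (finish - start) * 1000
  puts "#{name} took #{duration_ms.round(1)}ms #{payload.inspect}"
end

# Instrument a unit of work; subscribers fire when the block finishes.
ActiveSupport::Notifications.instrument("render.my_app", widget: "dashboard") do
  sleep 0.05 # stand-in for real work
end
```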
More like this, please.
One could take issue with them rolling this all on their own, rather than relying on existing services. If 37signals were a fresh new shop, one would have a point. Building out metrics infrastructure, even today with awesome tools like StatsD, can turn into an epic time sink. However, once you’ve been around for several years and thought enough about what you need to measure and act on in your application, rolling your own metrics kit starts to make a lot of sense. It’s fun too!
Of course, the important part is they’re measuring things and operating based on evidence. Whether you roll your own metrics infrastructure or use something off the shelf like Librato or NewRelic, operating on hard data is the coolest thing of all.
Whither code review and pairing
Jesse Storimer has great thoughts on code review and pairing. You Should be Doing Formal Code Review:
Let’s face it, developers are often overly confident in their work, and telling them that something is done wrong can be taken as a personal attack. If you get used to letting other people look at, and critique, your code then disidentification becomes a necessity. This also goes vice versa, you need to be able to talk about the code of your peers without worrying about them taking your critiques as a personal attack. The goal here is to ensure that the best code possible makes it into your final release.
I struggle with this so much, on both the giving and receiving side. When I’m reviewing code, I find myself holding back so as not to come off as saying the other person’s code is awful and offensive. On the receiving side, I often get frustrated and feel like a huge impediment has been put in front of my ability to ship code. In reality, neither is the case. Whether I’m the reviewer or the reviewee, the other party is simply trying to get the best code possible into production.
Jesse has further great points: review helps you avoid shortcuts, encourages one to review their own code (my favorite), and it makes for better code.
More recently, Jesse’s pointed out that pairing isn’t necessarily a substitute for code review: “…pairing is heavyweight and rare. Code review is lightweight and always.”
In my experience, pairing is great for cornering a problem and figuring out what the path to the solution is. Pairing is great for bringing people into the fold of a new team or project. Review is great for enforcing team standards and identifying wholly missing functionality. Review is sometimes great for finding little bugs, as is pairing.
Neither pairing nor code review is a silver bullet for better software, but when a team applies them well, really awesome things can happen.
Stand on the shoulders of others' REST mistakes
Like all API design, putting a REST API on your app is tricky business that most people learn through lots of mistakes. So stand on the shoulders of other people’s mistakes! Thus, REST worst practices:
In the REST world, the resource is key, and it’s really tempting to simply look at a Django model and make a direct link between resources and models — one model, one resource. This fails, though, as soon as you need to provide any sort of aggregated resource, and it really fails with highly denormalized models. Think about a Superhero model: a single GET /heros/superman/ ought to return all his vital stats along with a list of related Power objects, a list of his related Friend objects, etc. So the data associated with a resource might actually come out of a bunch of models. Think select_related(), except arbitrary.
Mistaking the app’s internal model for what API users want to work with was the mistake I made on the first API I wrote.
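One way out is to give the resource its own little object that aggregates whatever models it needs. A hypothetical sketch, riffing on the superhero example above (Superhero, powers, and friends are all made up here):

```ruby
# A resource object that builds the representation API clients actually want,
# pulling from several models rather than mirroring a single table.
class SuperheroResource
  def initialize(hero)
    @hero = hero
  end

  def as_json(*)
    {
      name:    @hero.name,
      powers:  @hero.powers.map(&:name),
      friends: @hero.friends.map { |friend| "/heroes/#{friend.slug}" }
    }
  end
end

# In a controller: render json: SuperheroResource.new(hero)
```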
Any big API is going to need to have dedicated servers that just serve API applications: the performance characteristics of large-scale APIs are so different from web apps in general that they almost always require separately-tuned servers.
This is how I prefer to roll my APIs lately. At the least, they should be a separate set of controllers. If you can extract a completely separate application, even better.
Crafting lightsabers, uptime the systems, a little Clojure
Herein, some great technical writings from the past week or two.
Crafting your editor lightsaber
Vim: revisited, on how to approach Vim and build your very own config from first principles. My personal take on editor/shell configurations is that it’s way better to have someone else maintain them. Find something like Janus or oh-my-zsh, tweak the things it includes to work for you, and get back to doing what you do. That said, I’m increasingly tempted to craft my own config, if only to promote the fullness and shine of my neck beard.
Uptime all the systems
Making the Netflix API More Resilient lays out the system of circuit breakers, dashboards, and automatons Netflix uses to proactively maintain API reliability in the face of external failures. Great ideas for anyone maintaining a service that needs to stay online.
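The core circuit-breaker idea is simple enough to sketch. This is a toy version, not Netflix’s implementation: fail fast with a fallback once a dependency has failed too many times in a row, then let a call through again after a cool-off period.

```ruby
# Toy circuit breaker: after `threshold` consecutive failures the breaker
# opens and calls return the fallback immediately; after `cool_off` seconds
# it lets a call through again to probe whether the dependency recovered.
class CircuitBreaker
  def initialize(threshold: 5, cool_off: 30)
    @threshold = threshold
    @cool_off  = cool_off
    @failures  = 0
    @opened_at = nil
  end

  def call(fallback:)
    return fallback if open?

    result = yield
    @failures  = 0
    @opened_at = nil
    result
  rescue StandardError
    @failures += 1
    @opened_at = Time.now if @failures >= @threshold
    fallback
  end

  private

  def open?
    @opened_at && (Time.now - @opened_at) < @cool_off
  end
end

# Usage: wrap a flaky dependency and serve a cached/empty value while open.
# breaker = CircuitBreaker.new(threshold: 3, cool_off: 10)
# recs = breaker.call(fallback: []) { RecommendationService.fetch(user_id) }
```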
List All of the Riak Keys, on the trickiness of `SELECT * FROM all_the_things`-style queries in Riak, or any distributed database, really. The short story is that these kinds of queries are impractical and not something you can do in production. The longer story is that there are ways to work around it with clever use of indexes and data structures. Make sure you check out the Riak Handbook from the same author.
A little bit of Clojure
Introducing Knockbox introduces a Clojure library for dealing with conflict resolution in data stored in distributed databases like Riak. If you’re working with any database that leaves you wondering what to do when two clients get in a race condition, these are the droids you’re looking for. I would have paid pretty good money to have known about this a few months ago.
Clojure’s Mini-languages is a great teaser on Clojure if, like me, you’ve tinkered with it before but are coming back to it. This is particularly useful if you’ve seen some Lisp or Scheme before, but are slightly confused by what’s going on with all the non-paren characters that appear in your typical Clojure program. Having taken a recent dive into the JVM ecosystem, I have to say there’s a lot to like in Clojure. If your brain understands static types but thinks better in dynamic types (mine does), give this a look.
I occasionally post links with shorter comments, if you’d like a slightly more-frequent dose of what you just read.
Quality in the inner loop
In software, this means that every piece of code and UI matters on its own, as it’s being crafted. Quality takes on more of a verb-like nature under this conception: to create quality is to care deeply about each bit of creation as it is added and to strive to improve one’s ability to translate that care into lasting skills and appreciable results.
When I wrote on “quality” a few months ago, I was thinking of it as an attribute one would use to describe the outer loop of a project. Do a bunch of work, locate areas that need more quality, put a few touches on those areas or note improvements for the next iteration, and ship it.
But what Brad is describing is putting quality into the inner loop. Work attains “the quality” as it is created, rather than as a secondary editing or review step. Little is done without considering its quality.
I’m extrapolating a bit from the letter of what Brad has written here, but that’s because I’ve been lucky enough to work with him. Indeed Brad’s work is of consistently high quality. Hopefully he’ll write more specifics about how quality code is created in the future (hint, Brad), and how much it relates to Christopher Alexander’s “quality without a name”.
Modern Von Neumann machines, how do they work?
Modern Microprocessors - A 90 Minute Guide! If you didn't find a peculiar joy in computer architecture classes or the canonical tomes on the topic by Patterson and Hennessy, this is the thing for you. It's a great dive into how modern processors work, what the design challenges and trade-offs are, and what you need to know as a software developer.
Totally unrelated: when I interned at Texas Instruments, my last project was writing tests for a pre-silicon DSP. Because there were no test devices, I had to run my code against a simulator. It simulated several million gates of logic and output the result of my program as the values on the wires coming out of the processor registers. This was fun, again in a way peculiar to my interest at the time in being a hardware designer/driver hacker. Let me tell you, every debugging tool you will ever see is better than inspecting hex values coming out of registers.
Anyway, these programs ran super slow; each run took about an hour. One day I did the math and figured out the simulator was basically running at 100 Hz. Not kilohertz or megahertz. One hundred hertz. So, yeah. In the snow, uphill, both ways.