Simplicators for sanity
For those rainy days when integrating with a not-entirely sane system is getting you down:
A Simplicator introduces a new seam into the system that did not exist when the service's byzantine API was used directly. As well as helping us test the system, I've noticed that this seam is ideal for monitoring and regulating our systems' use of external services. If a widely supported protocol is used, we can do this with off-the-shelf components.
The Simplicator is a component that lives outside the architecture of your system. It exports a sane interface to your system. You test it separately from your system. Its only purpose in life is to deal with the insanity of others.
Hell is other people’s systems; QED this is a heavenly idea.
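Here's a rough sketch of the shape I have in mind. Everything in it is made up for illustration: ByzantineBillingClient stands in for the not-entirely sane system, and BillingSimplicator is the one place that knows about its op codes and pipe-delimited strings. A real Simplicator lives outside your system and talks over a protocol; this in-process version only shows the sane interface it exports.

```ruby
# Hypothetical stand-in for a third-party client with an awkward API.
class ByzantineBillingClient
  # Returns a status code and a pipe-delimited payload,
  # for example [0, "ACCT|42|BAL|1050"].
  def exec_txn(op_code, _flags, payload)
    return [3, "ERR|UNKNOWN_OP"] unless op_code == 17

    [0, "ACCT|#{payload}|BAL|1050"]
  end
end

# The Simplicator: the only code that knows about op codes and
# pipe-delimited strings. Everything else sees a plain method.
class BillingSimplicator
  BalanceLookupFailed = Class.new(StandardError)

  BALANCE_OP = 17 # assumed op code for "look up a balance"

  def initialize(client = ByzantineBillingClient.new)
    @client = client
  end

  # Sane interface: takes an account id, returns the balance in cents.
  def balance_in_cents(account_id)
    status, raw = @client.exec_txn(BALANCE_OP, 0, account_id.to_s)
    raise BalanceLookupFailed, raw unless status.zero?

    fields = raw.split("|").each_slice(2).to_h
    Integer(fields.fetch("BAL"))
  end
end
```

Because the client is injected, the Simplicator can be tested on its own against a fake, and the rest of the system can be tested against a fake Simplicator.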
Smelly obsessions
Get Rid of That Code Smell - Primitive Obsession:
Think about it this way: would you use a string to represent a date? You could, right? Just create a string, let’s say "2012-06-25" and you’ve got a date! Well, no, not really – it’s a string. It doesn’t have semantics of a date, it’s missing a lot of useful methods that are available in an instance of Date class. You should definitely use Date class and that’s probably obvious for everybody. This is exactly what Primitive Obsession smell is about.
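To make the quoted point concrete, here's a small sketch using Ruby's standard Date class. The variable names are mine, not the article's.

```ruby
require "date"

raw  = "2012-06-25"     # just a String: it has no idea it's a date
date = Date.parse(raw)  # a Date: it knows about calendars

next_day  = date + 1                      # Date arithmetic works in days
is_monday = date.monday?                  # day-of-week predicates come for free
real_day  = Date.valid_date?(2012, 2, 30) # false: validation lives on Date too

# The String happily accepts nonsense that Date rejects.
bogus = "2012-02-30"
begin
  Date.parse(bogus)
  parsed_bogus = true
rescue ArgumentError
  parsed_bogus = false
end
```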
Rails developers can fall into another kind of obsession: framework obsession. Rails gives you folders for models, views, controllers, etc. Everything has to be one of those. Logic is shoehorned into models instead of put in objects unrelated to persistence. Controller methods and helpers grow huge with conditionals and accreted behavior.
This is partially an education and advocacy problem. Luckily, folks like Avdi Grimm, Corey Haines, Gary Bernhardt, and Steve Klabnik, amongst others, are spreading the word of how to use object oriented principles to design Rails applications without obsessing over the constructs in the Rails framework.
The second part is practice. Once you’ve educated yourself and bought into the notion that a Rails app isn’t all Rails classes, you’ve got to practice and struggle with the concepts. It won’t be pretty the first time; at least, it wasn’t for me. But with time, I’ve come to feel far better about how I design applications using both Rails principles and object-oriented principles.
How to think about organizing folders: don't.
Mountain Lion’s New File System:
Folders tend to grow deeper and deeper. As soon as we have more than a handful of notions, or (beware!) more than one hierarchical level of notions, it gets hard for most brains to build a mental model of that information architecture. While it is common to have several hierarchy levels in applications and file systems, they actually don’t work very well. We are just not smart enough to deal with notional pyramids. Trying to picture notional systems with several levels is like thinking three moves ahead in chess. Everybody believes that they can, but only a few skilled people really can do it. If you doubt this, prove me wrong by telling me what is in each file menu in your browser…
A well-considered essay on the non-recursive design of folders in iCloud, how people think about organizing documents, the emotions of organizing documents, and how it comes together in an app like iCloud. Great reading.
"Surround yourself with beautiful software"
Building an army of robots, Kyle Neath on GitHub's internal tools. The closing line of this deck is "Surround yourself with beautiful software". One of the most compelling things I've looked at this year.
Etsy's rules of distributed systems
Architecting for change. Complex systems and change:
- Distributed systems are inherently complex.
- The outcome of change in complex systems is hard to predict.
- The outcome of small, frequent, measurable changes are easier to predict, easier to recover from, and promote learning.
I’d have thought all the useful things to say about Etsy were said, at this point, but I’d have thought wrong!
There’s a good saying about designing distributed systems that goes something like “avoid it as long as possible”. I think these three guidelines are worth adding to that saying. Iterate, examine, repeat. Don’t make big, tricky changes. In fact, large changes you can’t recover from are nearly impossible to make anyway, so route around them entirely.
The last bit, “promote learning”, is great too. I follow distributed systems and database designers on Twitter and see tons of great papers and ideas in the exchange. More than that, always teach your teammates about the distributed systems you’re building. The more they know about the design and constraints of the system you’re making, the easier it is for them to work with those systems. If you can’t teach someone to use your system, you probably don’t understand it well enough.
Thread safety in Rails, explained!
Read up on Thread and Queue and ready for more multi-threaded Ruby reading? Aaron Patterson has written up how the thread safe option works in Rails and some tradeoffs involved in removing the option and making thread safety the default. It’s not as complicated as you might think!
The only rub I can see is that, as far as I can tell, he’s talking about making this the default for production mode. Making it the default for development mode isn’t tenable if you want to keep class reloading around, which almost everyone does. It’s just a hunch, but running without thread-safety in development seems weird when it’s the default in production. But, some teams run YARV in development and JRuby in production, so maybe I’m just making up things to worry about.
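If Thread and Queue are new to you, here's a minimal worker-pool sketch using only the standard library. It has nothing to do with how Rails implements its thread safe option; it only shows the two primitives working together. The :done sentinel is my own convention, not something Queue provides.

```ruby
require "thread" # harmless on current Rubies, needed for Queue on older ones

jobs    = Queue.new
results = Queue.new

# Three workers pull numbers off the shared queue and push back squares.
workers = 3.times.map do
  Thread.new do
    # Queue#pop blocks until an item is available.
    while (n = jobs.pop) != :done
      results << n * n
    end
  end
end

(1..10).each { |n| jobs << n }
workers.size.times { jobs << :done } # one stop signal per worker
workers.each(&:join)

# Workers finish in no particular order, so sort before using the results.
squares = []
squares << results.pop until results.empty?
squares.sort!
```

Queue does the locking, so the workers never touch a shared array directly.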
Tables and lambdas, a cure for smelly cases
Lots of folks consider case expressions in Ruby a code smell. I’m not ready to write them off just yet, but I know a good replacement for some uses of case when I see it. Rad co-worker David Copeland’s Lookup Tables With Lambdas is one of those replacements. For cases where a method takes a parameter, throws it into a case, and returns a value, I can replace all that lookup business with a hash lookup. To carry the metaphor through, the hash is the lookup table. Rad.
Where it gets fun is when I need to do some kind of dynamic lookup in the hash. Normally I wouldn’t want to do that when the Ruby interpreter parses my hash literal. If I reach into my functional programming bag of tricks, I recall that lambdas can be used to defer evaluation. And that’s exactly what David recommends. If I’ve got database lookups or logic I need to embed in my tables, Ruby’s lambda comes to the rescue!
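Here's a small sketch of the idea, not David's code. The shipping methods and prices are invented for illustration. Each value in the table is a lambda, so nothing is computed when the hash literal is evaluated, only when a caller looks something up.

```ruby
# Hypothetical lookup table: shipping cost in cents, keyed by method.
SHIPPING_COST = {
  pickup:  ->(_weight_kg) { 0 },
  ground:  ->(weight_kg)  { 500 + (weight_kg * 100).round },
  express: ->(weight_kg)  { 1500 + (weight_kg * 250).round }
}.freeze

# Replaces a `case method when :pickup ... when :ground ...` expression.
def shipping_cost(method, weight_kg)
  calculator = SHIPPING_COST.fetch(method) do
    raise ArgumentError, "unknown shipping method: #{method.inspect}"
  end
  calculator.call(weight_kg)
end
```

Using fetch with a block plays the role of the else branch in the case it replaces: unknown keys fail loudly instead of returning nil.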
This approach works great at the small-to-medium scale. That said, I always keep in mind that a bunch of methods manipulating a hash, using its keys as a convention, is an encapsulated, orthogonal object begging to happen. Remember, it’s Ruby; we can make our objects behave like hashes but still do OO- and test-driven design.
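When the table does outgrow a bare hash, one way to go is a small object that still answers to [] but gives the behavior a home. This is a sketch with invented names (ShippingRates, cheapest), not a prescription.

```ruby
# A hash-like object: callers keep writing rates[:ground],
# but the data and the methods that use it live together.
class ShippingRates
  def initialize(base_rates)
    @base_rates = base_rates
  end

  # Hash-style read access that fails loudly on unknown keys.
  def [](method)
    @base_rates.fetch(method) do
      raise KeyError, "no rate for #{method.inspect}"
    end
  end

  def key?(method)
    @base_rates.key?(method)
  end

  # Logic that would otherwise be a free-floating method poking at a hash.
  def cheapest
    @base_rates.min_by { |_method, cents| cents }.first
  end
end

rates = ShippingRates.new({ ground: 500, express: 1500 })
```

Now the object can be test-driven on its own, and the keys stop being a convention shared across a bunch of methods.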
Turns out I was wrong about RSpec subjects
I was afraid that David Chelimsky was going to take away my toys! Consider, explicit use of subject in RSpec considered a smell:
The problem with this example is that the word “subject” is not very intention revealing. That might not appear problematic in this small example because you can see the declaration on line 3 and the reference on line 6. But when this group grows to where you have to scroll up from the reference to find the declaration, the generic nature of the word “subject” becomes a hinderance to understanding and slows you down.
I’m so guilty of using subject heavily. Even worse, I’ve been advocating it to others too. In my defense, it does lend a good deal of concision to specs and seemed like a golden path.
Luckily, David isn’t taking away my toys. He’s got an even better recommendation: just use a method or let with an intention-revealing name. Here’s his example:
describe Article do
  def article; Article.new; end

  it "validates presence of :title" do
    article.should validate_presence_of(:title)
  end
end
This is, now that I’m looking at it, way better. As this spec grows, you can add helpers for article_with_comments, article_with_author, etc., and it’s clear what’s going on right on the line where the helper is used. No jumping back and forth between contexts. Thumbs up!
Three Easy Essays on Distributed Systems
Ryan Smith is pretty good at thinking about distributed systems. Distributed systems, the systems we (sometimes unwittingly) create on a regular basis these days, are a complicated, dense, far-reaching topic. Ryan’s managed to take a few of its problems and concisely introduce them with simple solutions that apply to all but the largest systems.
In The Worker Pattern, he presents a novel solution to a problem you are probably tackling with background or asynchronous job queues. Teaser: do you know what the HTTP 202 status code does?
A web service that requires high throughput will undoubtedly need to ensure low latency while processing requests. In other words, the process that is serving HTTP requests should spend the least amount of time possible to serve the request. Subsequently if the server does not have all of the data necessary to properly respond to the request, it must not wait until the data is found. Instead it must let the client know that it is working on the fulfillment of the request and that the client should check back later.
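Here's my own rough sketch of that request flow, not Ryan's code. The handlers return Rack-style [status, headers, body] triples but don't depend on Rack. The in-memory JOBS queue and RESULTS hash are stand-ins for whatever real queue and store you'd use, and the /reports paths are made up.

```ruby
require "securerandom"
require "thread"

# In-memory stand-ins for a real job queue and result store (assumptions).
JOBS    = Queue.new
RESULTS = {}

# POST /reports: enqueue the work and answer right away with 202 Accepted.
def create_report(params)
  id = SecureRandom.uuid
  JOBS << { id: id, params: params }
  [202, { "Location" => "/reports/#{id}" }, ["Accepted; check back later\n"]]
end

# GET /reports/:id: 200 once a worker has stored a result, 202 until then.
def show_report(id)
  if RESULTS.key?(id)
    [200, { "Content-Type" => "text/plain" }, [RESULTS[id]]]
  else
    [202, { "Location" => "/reports/#{id}" }, ["Still working\n"]]
  end
end

# A worker process would loop over this; here it handles a single job.
def work_one_job
  job = JOBS.pop
  RESULTS[job[:id]] = "report for #{job[:params].inspect}"
end
```

The web process never does the slow work itself. It hands the client a place to check back, which is what 202 Accepted is for.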
Coordinating multiple processes that need to process a dataset in bulk is tricky. Large systems usually end up needing some kind of Paxos service like Doozer or ZooKeeper to keep all the worker processes from butting heads or duplicating work. Leader Election shows how, by scoping the problem space to existing tools, it becomes possible to put together a solution that scales down to small and medium-sized systems:
My environment already is dependent on Ruby & PostgreSQL so I want a solution that leverages my existing technologies. Also, I don’t want to create a table other than the one which I need to process.
As applications grow, they tend to maintain more and more state across more and more systems. Incidental state is problematic, especially when you have to maintain several services to keep all of it available. Applying Event Buffering mitigates many of these problems. The core idea of this one is my favorite:
We have seen several examples of how to transfer state from our client to our server. The primary reason that we take these steps to transfer state is to eliminate the number of services in our distributed system that have to maintain state. Keeping a database on a service eventually becomes an operational hazard.
Most of the systems we build on the web today are distributed systems. Ryan’s writings are an excellent introduction to thinking about and building these systems. It certainly helps to comb through research papers on the topic, but these three essays are excellent starters down the path to intentionally building distributed systems.
Ruby anthropology with Hopper
Zach Holman is doing some interesting code anthropology on the Ruby community. Consider Aggressively Probing Ruby Projects:
Hopper is a Sinatra app designed to pull down tens of thousands of Ruby projects from GitHub, snapshot each repository into ten equidistant revisions, run them through a battery of tests (which we call Probes), and hopefully come up with some deeply moving insights about how we write Ruby.
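I haven't read how Hopper actually picks its snapshots, so treat this as one plausible way to choose ten equidistant revisions from a repository's history. The method name and the idea of passing commits as an ordered array are my assumptions.

```ruby
# Pick `count` evenly spaced items from an ordered list of commits,
# always including the first and the last one.
def equidistant_revisions(commits, count = 10)
  return commits.dup if commits.size <= count
  return [commits.last] if count == 1

  # Distance between picks, measured in positions in the list.
  step = (commits.size - 1).to_f / (count - 1)
  (0...count).map { |i| commits[(i * step).round] }
end

# Stand-in for a real history: 100 commits numbered oldest to newest.
history   = (0..99).to_a
snapshots = equidistant_revisions(history)
```

Short histories come back whole, so a project with five commits gets five snapshots rather than duplicates.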
There are plenty of code metric gizmos out there. At a glance, Hopper takes a few nice steps over extant projects. Unlike previous tools, it has a clear design, an obvious extension mechanism, and the analysis tools are distinct from the reporting tools. Further, it’s designed to run out-in-the-open, on existing open source projects. This makes it immediately useful and gives it a ton of data to work with.
For entertainment, here’s some information collected on some stuff I worked on at Gowalla: Chronologic and Audit.
A real coding workspace
Do you miss the ability to take a bunch of paper, books, and writing utensils and spread them out over a huge desk or table? Me too!
Light Table is based on a very simple idea: we need a real work surface to code on, not just an editor and a project explorer. We need to be able to move things around, keep clutter down, and bring information to the foreground in the places we need it most.
This project is fantastic. It’s taking a page from the Smalltalk environments of yore, cross-referencing that with Bret Victor’s ideas on workspace interactivity. The result is a kick-in-the-pants to almost every developer’s current workflow.
There’s a lot to think about here. A lot of people focus on making their workflow faster, but what about a workspace that makes it easier to think? There’s a lot of room to design a better workspace, even if you’re not going as far as Light Table does.
There’s a project on Kickstarter to fund further development of Light Table. If you write software, it’s likely in your interest to chip in.
UserVoice's extremely detailed project workflow
Some nice people at UserVoice took the time to jot down how they manage their product. Amongst the lessons learned:
Have a set amount of time per week that will be spent on bugs
We have roughly achieved this by setting a limit on the number of bugs we’ll accept into Next Up per week. This was a bit contentious at first but has resolved a lot of strife about whether a bug is worthy. The customer team is now empowered (or burdened) with choice of choosing which cards will move on. It’s the product development version of the Hunger Games.
This, to me, is an interesting juxtaposition. Normally, I think of bugs as things that should all be fixed, eventually. Putting some scarcity of labor into them is a great idea. Fixing bugs is great, until it negatively affects morale. Better to address the most critical and pressing bugs and then move the product ball forward. A mechanism to limit the number of bugs to fix, plus the feedback loop of recognizing those who fix bugs in an iteration (they mention this elsewhere in the article), is a great idea.
Cowboy dependencies
So you’ve written a cool open source library. It’s at the point where it’s useful. You’re pretty excited. Even better, it seems like something that might be useful at your day job. You could go ahead and integrate it. Win-win! You get to work out the rough edges on your open source project and make progress on your professional project.
This is tricky ground and it’s not as win-win as you might think. Integrating a new dependency, whether it’s one maintained by a team-mate or not, requires communication. Everyone on the team will have to know about the dependency, how to work with it, and how to maintain it within the project. If there’s a deal-breaking concern with the library, consider it feedback on your library; it either needs to better address the problem, or it needs better documentation to address why the problem isn’t so much a problem.
It all comes down to communication. Adding a dependency, even if you know the person who wrote it really well, requires collaboration from your teammates. If you’re not talking to your teammates, you’re just cowboy coding.
Don’t cowboy dependencies into your project!
Learn Unix the Jesse Storimer way
11 Resources for Learning Unix Programming:
I tend to steer clear of the thick reference books and go instead for books that give me a look into how smart people think about programming.
I have a soft spot in my heart for books that are way too long. But Jesse’s on to something, I think. The problem with big Unix books is that they are tomes of arcane rites; most of it just isn’t relevant to those building systems on a modern Unix (Linux) with modern tools (Java, Python, Ruby, etc.).
Jesse’s way of learning you a Unix is way better, honestly. Read concise programs, cross-reference them with manual pages. Try writing your own stuff. Rinse. Repeat.
This silver bullet has happened before and it will happen again
Today it's Node. Before it was Rails. Before it was PHP. Before it was Java. Cogs Bad:
There’s a whole mindset - a modern movement - that solves things in terms of working out how to link together a constellation of different utility components like noSQL databases, frontends, load balancers, various scripts and glue and so on. Its not that one tool fits all; its that they want to use all the shiny new tools. And this is held up as good architecture! Good separation. Good scaling.
I’ve fallen victim to this mindset. Make everything a Rails app, solve all the problems with Ruby, store all the data in distributed databases! It’s not that any of these technologies are wrong; it’s just that they might not yet be right for the problem at hand.
You can almost age generations of programmers like tree rings. I’m of the PHP/Rails generation, though I started in the Linux generation. A few years ago, I thought I could school the Java generations. But it turns out, I’ve learned a ton from them, even when I was a bit of a hubristic brat. The Node generation of developers will teach me more still, both in finding new virtuous paths and in going down false paths so I don’t have to follow them.
That said, it would be delightful if there was a shortcut to get these new generations past the “re-invent all the things!” phase and straight into the “make useful things and constructive dialog about how they’re better” phase.
Bootstrap, subproject, and document your way to a bigger team
Zach Holman's slides on patterns GitHub uses to scale their team, Ruby Patterns from GitHub's Codebase:
Your company is going to have tons of success, which means you'll have to hire tons of people.
My favorites:
- Every project gets a script/bootstrap for fetching dependencies, putting data in place, and getting new people ready to go ASAP. This script comes in handy for CI too.
- Try new techniques by deploying them only to team members at first. The example here was auto-escaping markup. They started with this only enabled for staff, instead of turning it on for everyone and feeling the hurt.
- Build projects within projects. Inevitably areas of functionality start to get so complex or generic that they want to be their own thing. Start by partitioning these things into lib/some_project, document it with a README in lib/some_project and put the tests in test/some_project. If you need to share it across apps or scale it differently someday, you can pull those folders out and there you go.
- Write internal, concise API docs with TomDoc. Most things only need 1-3 lines of prose to explain what’s going on. Don’t worry about generating browse-able docs, just look in the code. I heart TomDoc so much.
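If you haven't seen TomDoc, here's roughly what it looks like. The multiplex method is the example the TomDoc spec itself uses, reproduced from memory, so check the spec for the exact conventions.

```ruby
# Public: Duplicate some text an arbitrary number of times.
#
# text  - The String to be duplicated.
# count - The Integer number of times to duplicate the text.
#
# Examples
#
#   multiplex("Tom", 4)
#   # => "TomTomTomTom"
#
# Returns the duplicated String.
def multiplex(text, count)
  text * count
end
```

One line of prose, the arguments, and what comes back. Readable right in the source, no generated docs required.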
These ideas really aren’t about patterns, or scaling the volume of traffic your business can handle. They’re about scaling the size of your team and getting more effectiveness out of every person you add.
Own your development tools, and other cooking metaphors
Noel Rappin encourages all of us to use our development tools efficiently. If your editor or workflow isn’t working for you, get a new tool and learn to use it.
I’ve been working with another principle lately: minimize moving parts. I used to spend time setting up tools like autotest, guard, or spork. But it ended up that I spent too much time tweaking them or, even worse, figuring out exactly how they were working.
I’ve since adopted a much simpler workflow. Just a terminal, a text editor, and some scripts/functions/aliases/etc. for running the stuff I do all the time. I take note when I’m doing something repeatedly and figure out how I can automate it. Besides that, I don’t spend much time thinking about my tools. I spend time thinking about the problem in front of me. It makes a lot of sense, when you think about it.
I say you should “own” your tools and minimize moving parts because you should understand how they all work together and how they might change the behavior of your code. If you don’t own your tools in this way, you’ll end up wasting time debugging someone else’s code, i.e. a misbehaving tool. That’s just a waste of time; when you come across a tool that offends in this way, put aside a time block to fix it, or discard it outright.
What kind of HTTP API is that?
An API Ontology: if you were curious about what the differences between RPC, SOAP, REST, and Hypermedia APIs are, but were afraid to ask. In my opinion, this is not a prescription; I don't think there's anything inherently wrong with using any of these, except SOAP. Sometimes an RPC or a simple GET is all you need.