2013
Exemplary documentation: size and purpose
There’s a lot to say about programmer-focused software documentation. It’s more crucial than many developers think, yet it’s often neglected. Even when it’s not neglected, it’s often an afterthought. I’ve noticed there are three kinds of documentation I’m interested in.
When I first come across some software, I want short and focused examples of how I can use it for my own purposes. I’m not looking for a lot of theory or exposition; show me the benefit. If I can’t quickly see how the software works and makes my life easier, I’m very likely to discard it. In other words, I want shorter, “tweet-sized” documentation that sells me on the sizzle right away.

If I come back to some software, I often want to learn the whole thing in one sitting. I want a longer document that I can read through in a serial fashion to learn most or all of the concepts and details about using the code. It should cover the domain ideas of the software, the individual APIs, and how it all works together to make something. To continue the metaphor, I want a well-written, “Instapaper-length” document worthy of reading in a comfy chair.

After I start using something, I will often want to return to it to remember how to do specific things or to figure out whether a task is possible at all. This is when I lean most on traditional API documentation. One to three paragraphs, easily searched, are the ideal here. Kind of like the “Tumblr-post” of documentation.
I’ve yet to find all three of these qualities in the documentation for a single piece of software. Finding software that has done a really good job at even one of them is delight enough. I can’t imagine how excited the world at large would be if something were to have all three. There would be a lot of rejoicing.
Web design for busy coders
Here it is: I'm somewhere between horribly afraid and way-too-smart to seriously attempt front-end web work. Browsers are not the software whose bugs I am interested in knowing about.
That said, putting information on the web that doesn't look like utter dross is a kind of required literacy in our field. While bravely dipping my toes back into the front-end waters, I recently found some great tricks. Rediscovered, probably, but I'm not sure where the idea originally came from.
Most important: design in greyscale. Color is hard and can lead to tinkering. My goal is to get in and out of the front-end bits quickly, so tinkering is the enemy. Greyscale is one-dimensional, greatly simplifying matters. Give important information higher contrast and less important information (and "chrome") less contrast. Now you're done thinking about color.
Almost as important: use a fixed-width font. As a programmer, you look at them every day, so it's a touchpoint of comfort. Pick a font you don't use in your editor all day, just so you can stare at something different for a while. Copy and paste a "font stack" from the aptly named fontstacks. Make important things big and unimportant things small. Now you're done thinking about type.
The key to avoiding browser dragons, it seems, is to skip horizontal layout: pull quotes, text wrapped around images, and the like. It's pretty easy to use CSS if you only run things down the left side of the page. All the depth and despair of CSS is in trying to get things to appear off the left margin. Don't do that. Leave it to people who know how browsers work and how to manage their gnarly bugs. Now you're done thinking about layout.
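Here's a minimal sketch of all three rules as CSS; the specific values and the font stack are arbitrary starting points, not recommendations:

```css
/* Greyscale only: contrast signals importance, no color decisions. */
body    { color: #333; background: #fff; }
.chrome { color: #999; }

/* One fixed-width stack, two sizes: big for important, small for not. */
body { font-family: Consolas, "Liberation Mono", monospace; }
h1   { font-size: 2em; }
.chrome { font-size: 0.8em; }

/* No horizontal layout: everything flows down a single column. */
body { max-width: 40em; margin: 0 auto; }
```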
It's tempting to think you should make your code examples look really nice. Don't worry about it; highlighting code is of marginal value. You'll never be satisfied with how it looks. The human mind is capable of reading code without a rainbow spectrum of colors. Spend time on writing about the code, not on polishing how it's highlighted.
With all of those things out of the way, you're clear to think about the really important things: what do you need to say, how do you structure the message, what do you leave out, how do you organize all the information? That's the essence of publishing on the web, not the accidental complexity of making things look interesting.
The gift and the curse of green-field projects
The "green field" in software is a gift and a curse.
On the bright side, you have an opportunity to use the new-shiny. Past wrongs can be righted. You can move quickly, without worrying about why some code exists, how it works, or whether you should care about it. Life is good.
The peril I've found is that green-field projects are, by their nature, isolated. They don't have a deployment or monitoring story. They don't spring forth fully integrated with other critical systems. The project probably hasn't proven itself useful yet.
Letting a green-field project live in isolation too long is the root of lots of problems. I've experienced scope creep, confused expectations, and declining morale that all could have been avoided had I brought a green-field project "into the fold" sooner. But the whole point of a green field is that you don't integrate too soon, lest you spin your wheels on legacy concerns.
Green-field projects are fun and often a welcome change of pace from working within an existing system. However, succeeding on a green-field project is just as hard as, or harder than, succeeding with a legacy system. It's a different set of trade-offs that each developer has to get good at.
Hypermedia chicken, web browser egg
A lot of the hypermedia philosophy is centered around the idea that API clients should work like web browsers and plain-old Hypertext Markup Language: follow hyperlinks, leverage media types, cache data when they can, and intelligently take advantage of link semantics whenever possible.

The problem with that is that web browsers are far more capable than the HTTP clients developers are building API clients with.
Here are some things web browsers have become pretty good at:
- GET and POST requests
- Following links and redirects
- Discovering data structures via HTML forms and submitting data using that schema
- Using headers to negotiate content types
- Caching data when possible and expiring those caches
- Handling streaming data and chunked responses
- A bunch of stuff I'm probably forgetting
Here are some things the HTTP client in your typical standard library is good at:
- GET, POST, PUT, and DELETE requests
- Following redirects (if you manage to set the right options)
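To make that concrete, here's roughly what redirect-following looks like with Ruby's standard library (a sketch; the URL is a placeholder):

```ruby
require 'net/http'
require 'uri'

# Net::HTTP speaks the verbs fluently, but even following a redirect
# is left to the caller. Nothing here understands links, forms,
# content negotiation, or caching the way a browser does.
def fetch(url, redirects_left = 5)
  raise 'too many redirects' if redirects_left.zero?

  response = Net::HTTP.get_response(URI(url))
  if response.is_a?(Net::HTTPRedirection)
    fetch(response['location'], redirects_left - 1)
  else
    response
  end
end

puts fetch('http://example.com/api').body
```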
What's at play here is a chicken-and-egg problem. Client developers can't build on hypermedia principles until they are working at the level of hypermedia abstractions. Arming them with only HTTP requests and the ability to choose their encoding and schema poison is too low-level.
Protocols like HAL or Collection+JSON could light a path to solving this problem. Rather than dealing in pure data, services and consumers work with data annotated with hypermedia-like semantics for traversing data structures and creating new data. If these protocols can gain traction, it's "simply" a matter of getting HTTP clients into widespread use (read: standard libraries in stable releases of your-favorite-programming-language) that are as good at HTTP as web browsers are. At that point, API providers and API consumers could start using hypermedia principles to build APIs for those who aren't interested in the mechanics of hypermedia.
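As a sketch of the idea, here's what traversing a HAL-style resource could look like in Ruby; the document below is hypothetical, but it follows HAL's `_links` convention:

```ruby
require 'json'

# In HAL, a resource carries its own links, so a client navigates by
# relation name instead of hard-coding URLs. This document is made up
# for illustration.
order = JSON.parse('{
  "total": 30.0,
  "status": "shipped",
  "_links": {
    "self":     { "href": "/orders/123" },
    "customer": { "href": "/customers/7" }
  }
}')

# The client knows the "customer" relation, not the customer URL;
# the service is free to change hrefs without breaking clients.
customer_href = order['_links']['customer']['href']
# => "/customers/7" -- hand this to the HTTP client of your choice
```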
Until then, it seems to me that putting hypermedia principles front-and-center is suboptimal. It's akin to telling someone who wants to use your website that they need to understand MVC patterns first. It's only going to discourage and confuse them.
How to understand Saturday Night Live
Saturday Night Live is a changing thing. It’s not new like it was in the seventies, it’s not a powerhouse like it was in the nineties, and it may not be the training camp for NBC sitcoms anymore. Despite that, it’s still a big dog in the worlds of comedy and pop culture. Every time I hear or read “SNL was better when…”, I cringe a little. As far as I can tell, it isn’t true.
Everyone’s got their favorite cast: Ferrell/Shannon/Sanz/Oteri, Fey/Poehler/Rudolph, Hartman/Carvey/Myers. It seems largely to depend on when you started watching SNL, or when you were a teenager or in college. So when someone says “SNL isn’t relevant any more”, I mentally substitute “I liked SNL better with the cast I watched first.”
Over the years of watching and reading about SNL, I’ve come to understand that while the show is very much about the people on camera, Lorne Michaels is the show. The only time the show was in consistent decline was when Michaels wasn’t around, in the early eighties. For the past thirty years, claiming SNL is on the decline seems to be more of a sport than a rational argument.
Since Michaels' return, the show is subject to fractal cycles. Each night, some skits kill and some skits bomb. Generally, the front of the show is better than the back; if you stay tuned after “Weekend Update”, you should count yourself a long-time fan, willing to see some weird and/or flat skits, or asleep on the couch.
If you zoom out to look at how a season flows, you’ll again find shows that are really great and some that aren’t. My theory is that this depends entirely on the quality of the host. A mediocre host seems to bring middling material out of the writers and performers. A good or high-profile host seems to bring good-but-not-great material and pretty good performances. One of the darling hosts, like Alec Baldwin or, more recently, Jon Hamm, brings out the A-game material from the writers, and the performers play up to the occasion.
Interestingly, musical guests can bring a certain electricity too. Paul Rudd is a capable host, but pairing him with Paul McCartney led to a show that was pretty electric. I defy you to watch that episode and tell me SNL just isn’t as good as it used to be.
Zooming out to look at successive seasons, you see the same sort of up-and-down. Will Ferrell’s first season was good, but not great. He definitely left at his sketch comedy peak, and the show was briefly weaker for it. But right on his heels came the Fey/Poehler/Rudolph powerhouse. There’s an ebb and flow as casts come together, hammer out a few good seasons, and then move on to other stages.
That’s how I understand SNL. Perhaps I’m seeing others’ cognitive biases towards the show through my own. I think it’s still a relevant benchmark of American pop culture.
Context is data to burst your bubbles
Context is a slippery topic that evades attempts to define it too tightly. Some definitions cover just the immediate surroundings of an interaction. But in the interwoven space-time of the web, context is no longer just about the here and now. Instead, context refers to the physical, digital, and social structures that surround the point of use.
Great design is built around people, not devices or software. Responsive design and native app UX are tools, not solutions. Instead, we should design software that solves a problem for a real person (not a power user or one of our colleagues), given the devices available to them and the context of use they’re in.
A high information density display is no good to a parent trying to get their kids out the door. Documentation based on video tutorials is no good for someone riding a bus. A developer troubleshooting a service bottleneck needs to know more than the average response time.
As both designers of user experiences and developers of software, we need to get away from the desk and out amongst those we’re building for. It’s too easy to build for ourselves and our friends. We need to consider how others approach and use what we make. Armed with that context, we can design a solution for everyone, and not just those we share a bubble with.
Hyperthreading illustrated
I'm fond of saying hyperthreading is a lie. It's true, though: a dual hyperthreaded core is nowhere near as awesome as four real cores. That's more provocative than useful, so let me draw you some pictures.
If you zoom way out, a single core, dual cores, and a single hyperthreaded core look like this:

Note how the hyperthreaded core is really a single core with an extra set of registers and instruction/stack pointers. The reason hyperthreading is a lie is that you can't actually run four processes, or even four threads, at the same time. At best, you can run two threads with two others ready in the wings.
I'm a dilettante of processor architecture at best, but I think I can explain why chip designers would do this.

My best guess as to why you would design and release a hyperthreaded core is to increase the number of instructions you can retire (i.e. mark as completely executed) per clock cycle. Instructions retired per cycle is one of the primary metrics processor architects use for judging a design.
The enemies of instructions retired per clock cycle are memory accesses and branch mispredictions. When a processor has to go access a cache line or, worse, something out in main memory, it has nothing to do but wait. When a branch (i.e. a conditional or loop) is incorrectly speculatively executed (how awesome is it that processors start executing code paths before they even know if it's the right thing to do?), the processor ends up in the same predicament. Cache misses and branch mispredictions are at best a recipe for some overhead, and at worst a recipe for waiting around on main memory.
Hyperthreading attempts to solve this problem by keeping extra programs (threads or processes) waiting in the wings, ready to start executing as soon as another stalls on main memory. This means our execution units, the things that actually do math and execute the logic of a program, are (almost) always utilized and retiring instructions. And that gets us to our happy place: more instructions retired per clock cycle.
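Some back-of-the-envelope numbers, invented purely for illustration: suppose a thread does 300 cycles of useful work, then stalls for 100 cycles on a main-memory access. Alone, it keeps the execution units busy 300 of every 400 cycles, or 75% utilization. If a second hardware thread runs during those 100-cycle stalls, utilization approaches 100%, and the chip got there for the price of an extra set of registers and pointers rather than a whole second set of execution units.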
Why not just throw more execution units in and have real cores ready to work? I'm not sure; there must be something about how modern processor pipelines work that I don't know about which makes that too expensive to implement. That said, hyperthreading (as I understand it) is a pretty clever hack for improving the efficiency of a processor.
TextMate's beautiful and flawed extension mechanism
This is about how TextMate’s bundle mechanism was brilliant, but subtly flawed. However, to make that point, I need to drag you through a dichotomy of developer tools.
Composition vs. context in developer tools
What's the difference between a tool that developers work with and a tool developers often end up working against? Is there a useful distinction between tools that seem great at first, but end up loathed as time goes on? Neal Ford has ideas. Why Everyone (Eventually) Hates (or Leaves) Maven:
I defined two types of extensibility/programability abstractions prevalent in the development world: composable and contextual. Plug-in based architectures are excellent examples of the contextual abstraction. The plug-in API provides a plethora of data structures and other useful context developers inherit from or summon via already existing methods. But to use the API, a developer must understand what that context provides, and that understanding is sometimes expensive.
Composable systems tend to consist of finer grained parts that are expected to be wired together in specific ways. Powerful exemplars of this abstraction show up in *-nix shells with the ability to chain disparate behaviors together to create new things.
Ford identifies Maven and IDEs like Eclipse as tools that rely on contextual extension to get developers started on specific tasks very quickly. On the other hand, a composable tool exchanges task-oriented focus for greater adaptability.
Contextual systems provide more scaffolding, better “out of the box” behavior, and contextual intelligence via that scaffolding. Thus, contextual systems tend to ease the friction of initial use by doing more for you. Huge global data structures sometimes hide behind inheritance in these systems, creating a huge footprint that shows up in derived extensions via their parents. Composable systems have less implicit behavior and initial ease of use but tend to provide more granular building blocks that lead to more eventual power.
And thus, the crux of the biscuit:
Contextual tools like Ant and Maven allow extension via a plug-in API, making extensions the original authors envisioned easy. However, trying to extend it in ways not designed into the API range in difficulty from hard to impossible...
Contextual tools are great right up to the point you hit the wall of the original developer’s imagination. To proceed past that point requires a leap of one or two orders of magnitude in effort or complexity to achieve your goal which the original developer never intended.
Bundles are beautiful
Ford wrote this as a follow-on to a piece Martin Fowler wrote about how one extends their text editor. It turns out that the extension models of popular text editors, such as VIM and Emacs, are more like composable systems than contextual, plug-in-based ones.
All of this is an extremely elaborate setup for me to sing the praises of TextMate. Amongst the many things it got very right, TextMate brilliantly walked the line between a nerdy programmer’s editor and an opinionated everyday tool for a wide range of developers. It did this by exposing its extension mechanism through two tools that every developer knows: scripts and regular expressions.
To add functionality to TextMate, you make a bundle. A bundle is a convention for laying out a directory such that TextMate knows the difference between, say, a template and a syntax definition by where it sits. This works because developers know how to put things in the right folder. There were only ever five or so folders you needed to know about, so it was a simple mechanism that didn’t become a burden.
To tell TextMate how to parse a language and do nifty things like folding text, you wrote a bunch of regular expressions. If I recall, there were really only a few placeholders to wedge those regular expressions into. This worked great: though the “serious” tools use lexers and parsers, most languages are amenable to low-fidelity comprehension with a series of naive pattern matches. The downside was that languages that didn’t look like C were sometimes odd to work with.
In my opinion, the real beauty of TextMate’s bundles was that all of the behavioral enhancement was handled with shell scripts. Plain-old Unix. You could write them in Ruby, Python, bash, JavaScript, whatever fit your fancy. As long as you could read environment variables and output text (or even HTML), you could make TextMate do new things. This led to an absolute explosion of functionality provided by the community. It was a great thing.
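For flavor, here’s a sketch of the kind of script a bundle command could hold. TM_SELECTED_TEXT and TM_FILEPATH are real variables TextMate exposed to commands; the word-count behavior is just an invented example:

```ruby
#!/usr/bin/env ruby
# A bundle command is plain Unix: the editor hands over its state via
# environment variables and stdin, and whatever the script prints goes
# back into the editor.

input = ENV['TM_SELECTED_TEXT'] || STDIN.read

puts "#{ENV['TM_FILEPATH']}:"
puts "  #{input.split.length} words, #{input.length} characters"
```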
Downfall
Interestingly enough, TextMate is essentially a runtime for bundles. This is how VIM and Emacs are structured as well. TextMate just put a nicer interface around that bundle runtime. However, the way it did so was its downfall, at least for me.
Recall, a few hundred words ago, the difference between composable and contextual extensions. A contextual extension is easy to get going with, but comes up short when you imagine something the creator of the extension point didn’t. The phenomenal thing about TextMate was how well it chose its extension points and how much further those extension points took the program than what came before it. I’d estimate that about 90% of what TextMate ever needed to do, you could do with bundles. But the cost of finding that last 10% was brutal.
Eventually, I bumped up against this limitation with TextMate. I wanted split windows, I wanted full-screen modes, I wanted better ctags integration. No one could add those (when TextMate wasn’t open source, circa 2010) because they required writing Objective-C rather than using TextMate’s extension mechanism. And so, I ended up on a different editor (after several months of wandering in a philosophical desert).
The moral of the story
If possible, you should choose a composable extension mechanism (a full-blown language, probably) and use that extension mechanism to implement your system, à la Vimscript/VIM and elisp/Emacs. If you can’t do that, you can get most of the benefit from a plug-in API, but you have to choose the extension points really, really well.
Senior VP Jean-Luc Picard, of the USS Enterprise (Alpha Quadrant division)
If you’re working from the Jean-Luc Picard book of management, a nice little Twitter account of Picard-esque tips on business and life, we can be friends. Consider:
Picard management tip: Be willing to reevaluate your own behavior.
And:
Picard diplomacy tip: Fighting about economic systems is just as nonsensical as fighting about religions.
But I’m not so sure about this one:
Picard management tip: Shave.
If you’re playing from home, the fictional characters that have most influenced my way of thinking are The Ghostbusters (all of them) and Jean-Luc Picard. I also learned everything I need to know about R&B from The Blues Brothers.
SoundCloud, micro-services, and software largeness
From a monolithic Ruby on Rails app to the JVM: how SoundCloud has transitioned to a hybrid approach, with Ruby apps intermingling with Scala and Clojure apps. I think some of their ideas about what is idiomatic Rails and how to operate Ruby are not exactly on center. But their approach to the problem of a large Rails app is right on: break it up into “micro-services” small enough that, if you don’t like the code, you can rewrite it quickly.
Lest you fear this is yet another “Rails doesn’t scale!” deck, they do make a key observation. “Rails, PHP, etc. are a very good choice to start something”. Once you get past “starting” and “growing” to “successful and challenging”, you’ll face the same level of challenge no matter what you choose: Ruby or Java, MySQL or Riak. All the technologies we have today are challenged when they grow large.
So don’t let applications and services get large. Easy to say; hard, but worthwhile, to practice.
Those Who Make, by hand
Those Who Make is a series about people who craft. Physical things, by hand, that don’t come out the same every time. I love watching people make things, and I doubly love hearing their passion for whatever it is they’re making. Even more enlightening, this is a very international series. It’s not all hipster shops in San Francisco, Portland, and Brooklyn; it’s everywhere.
This is delightful stuff.
[vimeo www.vimeo.com/58998157 w=500&h=250]
How coffee is made in a colorful shop in another country, shot in the “Vimeo style” (is this a thing?): that will always get me.
Feynman's mess of jiggling things
Richard Feynman, in the process of explaining rubber bands:
[youtube https://www.youtube.com/watch?v=baXv_5z7HVY&w=420&h=315]
The world is a dynamic mess of jiggling things, if you look at it right!
This simplification delights and amuses me. The great thing is its fractal truth: you can observe our lives at many levels and conclude that they are dynamic jiggling messes.
The Rite of March
INT. OFFICE: A team of enthusiastic young folk rush to get their “game changing” app ready for SXSW. A cacophony of phone calls, typing, and organizing swag.
EXT. PATIO: A team of folks that have done the SXSW ritual before look at their calendar, note it’s almost the middle of March, and shrug. They go back to drinking a tasty beverage and working at their own pace.
Twitter's optimizations
Data point: a few of the infrastructure pieces out of Twitter have been implemented in low-level, heavy-metal C, and they’re optimizing on individual machines instead of across the architecture. Today, twitter/fatcache, a memcached-on-SSDs:
To understand why network connected SSD makes sense, it is important to understand the role distributed memory plays in large-scale web architecture. In recent years, terabyte-scale, distributed, in-memory caches have become a fundamental building block of any web architecture. In-memory indexes, hash tables, key-value stores and caches are increasingly incorporated for scaling throughput and reducing latency of persistent storage systems. However, power consumption, operational complexity and single node DRAM cost make horizontally scaling this architecture challenging. The current cost of DRAM per server increases dramatically beyond approximately 150 GB, and power cost scales similarly as DRAM density increases.
It’s fascinating to observe Twitter’s architectural growth from the outside. They quickly exceeded the capacity of typical MySQL setups, then of Ruby and Rails, then memcached alone. They’ve got distributed filesystems, streaming distributed processing pipelines, and distributed databases. Now they’re optimizing down to the utilization of their hardware, taking advantage of the memory-like latencies of SSDs. When you start caring about power and the size of your index entries, you’ve reached a whole new level of Maslow’s hierarchy of scaling.
If trends continue and Twitter remains a leader in how large-scale distributed systems are implemented, watch out. Twitter led many of us to Scala, ZooKeeper, and their own inventions like Storm and Finagle. Gird your programming and scaling fashion loins, because you’re about to learn a lot more about malloc, errno, and processor architecture than you ever wanted to know!