Fewer changes are faster to deploy than fewer changes

Itamar Turner-Trauring, Incremental results: how to succeed at large software projects:

  • Faster feedback…
  • Less unnecessary features…
  • Less cancellation risk…
  • Less deployment risk…

👏 👏 👏 👏 👏  read the whole thing, Itamar’s tale is well told.

Consider: incremental approaches consist of taking a large scope and finding smaller, still-valuable scopes inside of it. Risk is 100% proportional to scope. Time-to-deliver grows as scope grows. Cancellation and deployment risk grow as time-to-deliver grows. It’s not quite math, but it is easy to demonstrate on a whiteboard. In case you happen to need to work with someone who wants large scope and low risk and low time-to-delivery.

TIL: divide by 10 with this one weird number

Running an application across two physical databases is not a straightforward thing. One of the relatively easier ways to do it involves assigning each database instance a shard number and then arranging for all your primary key IDs to end with that number. For example, shard 0 generates IDs like 1230, 40, 482340, shard 1 generates IDs like 1231, 41, 482341, and shard 2 generates IDs like 1232, 42, 482342, etc. all the way up to 9. If you want more than 10 database shards, it gets more involved.

My brain is wired oddly, so I came to wonder how you would quickly get the shard ID for an ID (e.g. shard 1 for 1231). This is really easy with decimal math; you just divide by 10. However, we run our databases on computers that can only do binary math, so its not actually simple.

But it turns out you can do it quite fast! There’s one weird number, expressed as `0x1999999A` hexadecimal, that is very close to multiplying by the fraction 1/10 (plus further binary math and register trickery). Thus you can do this in only a few instructions on Intel processors released in the past twenty years.

I’m really glad someone else figured this out.

OAuth2 🔥-takes

Is it too late to do hottakes for something that’s been around for nearly a decade?

OAuth2 pros:

  • I can allow other sites to use my data with some confidence that, at least, my authentication information won’t leak
  • It has made really cool stuff possible at my current workplace and workplace-2
  • Libraries to make it happen in server-side apps are pretty good

Cons:

  • There are a bajillionty implementations and standard definitions of OAuth2 (for somewhat justifiable reasons)
  • If you want to tinker with an OAuth2 API, you’re in a bit of hurt because you can’t just grab a token and start playing (mostly, depending on the implementer)
  • Those open source libraries are the kind of thing that drive maintainers away pretty quickly 😬

Overall: would not uninvent this technology.

The emotional rollercoaster of extracting code

There’s a moment of despair when extracting functionality from a larger library, framework, or program. The idea grows, a seed at first and then a full-blown tree, that the coupling in this functionality isn’t all bad. A lot of people talk only about coupling and leave out cohesion. They aren’t mutually exclusive! When the two are balanced, it’s hard to come up with a reason to start extracting.

On the other hand, sometimes that moment of despair strikes when you start really digging into the domain and realize this chunk of functionality isn’t what you thought it was. Maybe it’s not coherent (see above!) or perhaps the model of the domain isn’t deep enough. This is a pretty good signal to hit the brakes on the refactoring, figure the domain out, and reconsider the course of action.

Feature envy rears its head in extractions too. Patterns of crosstalk between the existing thing and the new thing are a sure sign of feature envy. It’s tempting to say, hey maybe you really need a third thing in the middle. That’s probably making matters worse though.

That said, changing bidirectional communication to unidirectional is usually a positive thing. Same for replacing any kind of asynchronous communication with synchronous. Or replacing lockstep coordination with asynchronous messaging. Envy is tricky!

(I) often encourage starting a new service or application within your existing “mothership”. The trendy way to say this right now is “monorepo all the things” or build a “modular monolith”. I find this compelling because you can leverage a lot of existing effort into operationalizing, tooling, and infrastructure. Once you know the domain and technical concerns specific to the new thing, you can easily extract into its own thing if you need to. The other edge of a monorepolith is that path dependence is a hell of a thing. Today is almost certainly an easier day to split stuff out than tomorrow.

A thing to consider pursuing is a backend-for-frontend service in pursuit of a specific frontend. It doesn’t even have to serve an application. You may have services that are specific to mobile, desktop, apps, APIs, integrations, etc. Each of these may need drastically different rates of change, technical features, and team sets.

Probably don’t split out a service so that a bunch of specialized people can build a “center of excellence” for the rest of the organization to rely upon. This is a very fancy way to say “we are too cool for everyone else and we just can’t stand the work everyone else is doing”. On their best day, the Excellence team will be overwhelmed by the volume of work they have put in front of themselves to make Everything Good. On their worst day, they will straight give up.

If you split something out, realize you’re going to have to maintain it until you replace it. And you’re going to rebuild the airplane while it’s flying. If you’re not really into that, stop now. Just because you can’t stand Rails, relational databases, or whatever doesn’t mean you should jump into an extraction.

More ideas for framework people

A few months ago I wrote about Framework and Library people. I had great follow-up conversations with Ben Hamill, Brad Fults, and Nathan Ladd about it. Some ideas from those conversations:

  • use a well-worn framework when it addresses your technical complexities (e.g. expose functionality via the web or build a 3-d game) and your domain complexity (e.g. shopping, social networking, or multi-dimensional bowling) is your paramount concern

  • once you have some time/experience in your problem domain, start rounding off corners to leave future teammates a metaframework that reduces decision/design burdens and gives them some kind of golden path

  • frameworks may end up less useful as integration surface area increases

  • napkin math makes it hard to justify not using a framework; you have to build the thing and accept the cost of not having a community to support you and hire from

  • to paraphrase Sandi Metz on the wrong abstraction: “(Using) no abstraction is better than the wrong abstraction”; if you’ve had a bad time with a framework, chances it was an inappropriate abstraction or you used the abstraction incorrectly

The first few years of my career, I edited the wrong file all the time. I could spend hours making changes, wondering why nothing was happening, until I realized I’d been tinkering in the wrong place because I was misreading a file path or not paying close enough attention to control flow.

Fast forward to now, and I’m pretty quick to drop a raise "BLORP" in code I’m tinkering with if things aren’t working like I think they should. All hail puts debuggerering.

However, it turns out I found a new class of this operator error today. I was diligently re-running a test case, expecting new results when the test fixture file I thought was changed was the wrong file. Once I deleted the right file, I was back on my way.

Joyful and grumpy are we who can find new ways to screw up time ever day!

Chaining Ruby enumerators

I want to connect two Ruby enumerators. Give me all the values from the first, then the second, and so on. Ideally, without forcing any lazy evaluations and flat so I don’t have to think about nested stuff. Like so:

xs = [1, 2, 3].to_enum
ys = [4, 5, 6].to_enum
[xs, ys].chain.to_a # => [1, 2, 3, 4, 5, 6]

I couldn’t figure out how to do that with Ruby’s standard library alone. But, it wasn’t that hard to write my own:

def chain(*enums)
  return to_enum(:chain, *enums) unless block_given?

  enums.each { |enum| enum.each { |e| yield e } }
 end

But it seems like Ruby’s library, Enumerable in particular, is so strong I must have missed something. So, mob programmers, is there a better way to do this? A fancier enumerator-combining thing I’m missing?

Stored Procedure Modern

The idea behind Facebook’s Relay is to write declarative queries, put them next to the user interaction code that uses them, and compose those queries. It’s a solid idea. But this snippet about Relay Modern made me chuckle:

The teams realized that if the GraphQL queries instead were statically known — that is, they were not altered by runtime conditions — then they could be constructed once during development time and saved on the Facebook servers, and replaced in the mobile app with a tiny identifier. With this approach, the app sends the identifier along with some GraphQL variables, and the Facebook server knows which query to run. No more overhead, massively reduced network traffic, and much faster mobile apps.

Relay Modern adopts a similar approach. The Relay compiler extracts colocated GraphQL snippets from across an app, constructs the necessary queries, saves them on the server ahead of time, and outputs artifacts that the Relay runtime uses to fetch those queries and process their results at runtime.

How many meetings did they need before they renamed this from “GraphQL stored procedures” to “Relay Modern”?

(FWIW, I worked on a system that exposed stored procedures through a web service for client-side interaction code. It wasn’t too bad, setting aside the need to hand write SQL and XSLT.)

Feedback: timing is everything

With feedback, like jokes, timing is everything. Good feedback at a bad time won’t do the trick.

I’ve mostly experienced programming feedback through pull requests. This is way better than no feedback. However, since most pull requests occur at the end of work, and not somewhere in the middle, some kinds of feedback are not conducive to pull requests.

Suppose all feedback falls somewhere on two axes: “timeliness” and “depth”. The narrow sweet spot of code review is apparent:

Pairing and code review are not so similar
Pairing and code review are not so similar

The sweet spot in the top-right corner is when code review works best: unhurried and in-depth feedback. I’d hesitate to call the lower-right corner of hurried, minimal feedback a code review at all; it’s more like rubber stamping.

I’ve often referred to code review, flippantly, as the worst form of pairing yet invented. I’ve given a lot of code review feedback in the past that was better suited to the synchronous nature of pairing than the very asynchronous nature of code reviews. That said, I feel like pairing is an excellent way to give all manners of feedback in the moment the code is being conceived or written. You can immediately point out possible incorrectness or better designs and talk it out, with the code at hand, with your collaborator.

However, we can’t all pair all the time. Let me show you how I’m trying to better time my feedback when I can’t share it immediately.

A tale of four pull requests

Consider four PR subject lines. Which ones are appropriate for architectural ideas? What about optimization ideas? When is deep refactoring feedback appropriate? Can I look at one of these in an hour when I’m done with my current task?

  • “Hotfix Facebook Auth scope”
  • “Prevent sending email for failed payment jobs”
  • “Add tagging to admin storylines listing”
  • “WIP introduce Redis/Lua-based story indexing”

Lately, when I do pull request reviews, I use these guidelines:

  • Figure out if this PR seems like it’s a hot patch to production, a quick fix on existing work, a PR landing new functionality, or a work-in-progress checkpoint seeking feedback.
  • Bear in mind that hot patches and quick fixes are more time sensitive and need yes/no feedback on correctness more than detailed feedback.
  • For hot patches (e.g. “Hotfix FB auth”), I’m only looking for “is this correct” and “will it fix the problem?”; thumbs up or thumbs down and commentary as to what I think is missing to solve the problem. No refactoring ideas. I only touch on performance if I spot a regression.
  • For quick fixes (e.g. “Prevent sending email…”), I’m again looking for correctness and timeliness. I might leave ideas for how to improve the performance or cleanliness of the code later. Those kinds of notes are entirely up to the gumption of the other developer, though. I know the low-gumption feeling of wanting only to fix something and get on to the next thing.
  • Landing new functionality (e.g. “Add tagging…”) receives a full review cycle. Beyond baseline correctness, I’m trying to view this code through my crystal ball. When some value of N is grows, will this code slow down noticeably? Is the code structured so that future changes are easy and obvious?
  • Work-in-progress checkpoints (“WIP introduce Redis/Lua…”) are open to the full spectrum of feedback. Ideas for how to differently structure data, which APIs to export, how to structure objects, how to name the domain model, etc. are all in play. Pretty much the only thing out of play is anything that feels too close to bike shedding.
  • Bear in mind that everyone exists on a spectrum of coding specificity. More seasoned developers are likely open to ideas for restructuring code or considering novel approaches. Less seasoned developers (including seasoned developers new to the team) likely want specific guidance about which changes to make or factors they need to consider.
  • Where I may try to respond to hot patches and quick fixes in less than fifteen minutes, I may wait a couple hours before I look at new functionality or WIP reviews.
  • The most difficult part with these guidelines is how to handle ideas about refactoring on time-sensitive reviews. I want to hold the line against letting lots of little fixes accrete into a medium-sized mess. I don’t want to discourage ideas for refactorings either; I want them separately so I can act on them when I have the energy to really do them.

In short

Use different tactics when sharing feedback for code review; it’s not pairing. Identify patches, reviews, and full feedback pull requests. Sanity check patches, look for correctness in review, look for design in review. Use GitHub’s review process to indicate your feedback is “FYI” vs. “fix this before merging”. Time-to-response is most important for patches and fixes.

Above all: giving feedback is a skill you acquire with practice, empathy, and maintaining a constructive attitude.

Practically applying Clojure

Fourteen Months with Clojure. Dan McKinley on using Clojure to build AWS automation platform Skyliner:

The tricky part isn’t the language so much as it is the slang.

Also, the best and worst part of Clojure:

When the going gets tough, the tough use maps

This is probably better now that specs and schema are popular. Before, when they were mysterious maps full of Very Important State, reading Clojure code (and any kind of Lisp) was pretty challenging.

Make sure you stick around for the joke about covariance and contravariance. Those type theories, hilarious!