Encapsulation is a tradeoff too

Better understand Encapsulation. I can’t 😍 this article enough:

An individual programmer has fixed limits on how quickly they can build up instructions and later on how quickly they can correct problems. A highly-effective team can support and extend a much larger code base than the sum of its individuals, but eventually the complexity will grow beyond their abilities. There is always some physical maximum after which the work becomes excessively error prone or consistently slower or both. There is no getting around complexity, it is a fundamental limitation on scale.

Useless datapoint: my personal maximum is around three thousand lines of code, or 4–6 weeks of clean-slate effort.

So maybe I need to start encapsulating once I reach that limit?

To get the most out of encapsulation, the contents of the box must do something significantly more than just trivially implement an interface. That is, boxing off something simple is essentially negative, given that the box itself is a bump in complexity. To actually reduce the overall complexity, enough sub-complexity must be hidden away to make the box itself worth the effort.

This has been bugging me for a while. Encapsulation is treated as an unquestionable good by many developers. To question encapsulation is to adopt the opposite, that design isn’t worthwhile.

But it’s a tradeoff! Introducing encapsulation incurs a temporary increase in the net complexity of a system. Over the course of a tactical refactoring of methods and classes, the increased complexity is only observable by one or two developers doing the work.

But, if services are encapsulation (they are!), then rearranging the pieces will leave you paying for the increased complexity for days, weeks, months. Now the encapsulation takes on real costs: the risk of completing it, the burden of explaining to others what you’re doing, etc. That encapsulation better be worth it and not just a hunch!

For example, one could write a new layer on top of a technology like sockets and call it something like ‘connections’, but unless this new layer really encapsulates enough underlying complexity, like implementing a communications protocol and a data transfer format, then it has hurt rather than helped. It is ‘shallow’. What this means is that for any useful encapsulation, it must hide a significant amount of complexity, thus there should be plenty of code and data buried inside of the box that is no longer necessary to know outside of it. It should not leak out any of this knowledge. So a connection that seamlessly synchronizes data between two parties (how? We don’t know) correctly removes a chunk of knowledge out of the upper levels of the system. And it does it in a way that it is clear and easy to triage problems as being ‘in’ or ‘out’ of the box.

My experience is that encapsulation, if it happens at all, starts off shallow. Real encapsulation, where a developer can treat it as a black box, never needing to peak inside to understand the mechanisms or in/out problems, is rare. It takes the best designers of software to achieve it.

We should all be so bold as to attempt building encapsulations of that quality, but not so proud to think that we succeed at it even half the time.

In little programs, encapsulation isn’t really necessary, it might help but there just isn’t enough overall complexity to worry about. Once the system grows however, it approaches the threshold really fast. Fast enough that many software developers ignore it until it is way too late, and then the costs of correcting the code becomes unmanageable.

I feel like prefactoring a program or architecture only increases the complexity growth rate of small systems. A dominant factor in complexity is communication and coordination cost. If you start off with ten classes instead of three, or three services instead of one, you haven’t tripled your complexity, you’ve squared it (or worse).

I’m all for minimal solutions and fighting to keep things small, but not at the cost of incurring large coordination overhead.

To build big systems, you need to build up a huge and extremely complex code base. To keep it manageable you need to keep it heavily organized, but you also need to carve out chunks of it that have been done and dusted for the moment, so that you can focus your current efforts on moving the work forward. There are no short-cuts available in software development that won’t harm a project, just a lot of very careful, dedicated, disciplined work that when done correctly, helps towards continuing the active lifespan of the code.

Emphasis mine. In a successful system, size and complexity are nearly unavoidable. Almost every “best practice” and “leading edge approach” we know of is contextual and expresses trade-offs. Thus I’m left agreeing that the unsatisfying, hand-wavy craft of “careful, dedicated, disciplined work” is the principle most likely to generate code that’s improves (rather than regresses) over its lifetime.