pragprog
Michael Feathers on how code grows
Festering Code Bases and Budding Code Bases:
Some teams produce what I call a festering code base. In a festering code base, the team changes the code primarily by adding code to existing methods and adding methods to existing classes. The results are predictable. Classes and methods grow malignantly, eventually becoming thousands of lines long.
Better teams produce budding code bases. Developers create new classes and methods and delegate work outward. Periodically, they collapse structure back into a simpler form, but the dominant trend is to grow the code by creating new structure.
I'd never put much thought into how code bases grow in the past. Feathers has some interesting ideas here about the characteristics of good and not-so-good growth and how languages and tools might promote good growth.
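To make the distinction concrete for myself, here is a minimal Python sketch of the same feature grown both ways. Everything in it (the `Order` type, the two policy classes, the tax and shipping numbers) is hypothetical and made up for illustration; none of it comes from Feathers' article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    # Hypothetical domain type, invented for this sketch.
    subtotal: float
    weight_kg: float


# Festering style: every new rule is appended to the same function.
def festering_total(order: Order) -> float:
    total = order.subtotal
    total += order.subtotal * 0.08          # tax rule, added in place
    total += 5.0 + 1.5 * order.weight_kg    # shipping rule, added in place
    # ...the next rule lands here too, and the function keeps growing
    return round(total, 2)


# Budding style: each new rule becomes its own small class,
# and the original function only delegates outward.
class TaxPolicy:
    def __init__(self, rate: float) -> None:
        self.rate = rate

    def charge(self, order: Order) -> float:
        return order.subtotal * self.rate


class ShippingPolicy:
    def __init__(self, base: float, per_kg: float) -> None:
        self.base = base
        self.per_kg = per_kg

    def charge(self, order: Order) -> float:
        return self.base + self.per_kg * order.weight_kg


def budding_total(order: Order, policies) -> float:
    return round(order.subtotal + sum(p.charge(order) for p in policies), 2)
```

Both functions compute the same number. The difference is where the next rule goes: in the budding version it is a new class passed in the `policies` list, and `budding_total` does not change.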
A rambling, regurgitated thought on process
Elevator pitch: I’ve found that if you want to divert a productive team into an hour or two of semi-fruitless banter, ask how the team should use Git, Pivotal Tracker, and Capistrano to manage incoming work, verify it, and deploy it to production. In reality, you should ignore all the corner-cases and figure out what will enable you to push really small chunks of work with great frequency.
Ed. What follows isn’t novel, but it was a useful change in perspective for me, so I decided to share.
I’ve been thinking a bit about software processes lately. Despite great variation in how they tell you to do it, most processes seem to focus on how to do more stuff faster.
Lately, the notion of doing less has a lot of interest. Lean startups are the new-new thing and Getting Real is the old new thing; both preach getting more done and delivering value by doing less and analyzing the results more.
There are two kinds of “do less” a software developer can engage in. In the past I’ve been a little too focused on how I can take on fewer responsibilities from other parties: literally doing less by scoping down features, putting off decisions, and focusing on things that seem like they really matter. I sometimes feel like I’ve become too eager to do less, making myself something of a cranky coder/slacker. But I digress.
Recently, I’ve been trying to tackle doing less in my habits of creating software. How can I write less code to implement a feature, not in the minimalist sense, but in the “how do I just get it to kinda work” sense? How can I take less time between starting something and getting some form of it out in the wild? How can I make my code less coupled so there are fewer changes to make when I decide it needs to do something else? How can I make this less coupled to data storage so that putting it out requires less deployment effort? How can I make changes that are less likely to cause long-term regressions? How can I make it less effort to roll back bad changes?
When I look through the lens of accomplishing more by doing less, a lot of popular software methodology seems like dead weight. Rather than trying to find a process that addresses every team member’s own scars and affections, both perceived and imaginary, it seems most useful to imagine the smallest ruleset that won’t result in uncontrollable entropy and put it into action. If something starts to hurt, imagine the simplest new rule and put it into play.
The goal, as stated above, is to get to the point where you make really tiny, maybe imperceptible changes, and push them really frequently. Everything that stands in the way is the enemy.
Code re-use as technical debt
I have extremely mixed feelings about code re-use. I think it’s largely a red herring, never working out as well as developers would hope. After all, developers are like golfers: always optimistic about how well an approach will work or how far down the fairway their ball landed.
But here’s a real stab in the side of code re-use: in many cases, it’s tantamount to technical debt. Embrace technical debt:
For example, early on at IMVU, we incorporated in tons of open source projects. This was a huge win (and we were delighted to give credit where it was due), because it allowed our initial products to get to market much faster. The downside was that we had to combine dozens of projects whose internal architectures, coding styles, and general quality varied widely. It took us a long time to pay off all the debt that incurred – but it was worth it.
Using someone else’s code will help you keep moving now, but you stand a good chance of needing to rewrite it later.
That’s not to say it’s all bad. By the time you know you need to replace someone else’s code, you’ll have learned about the domain it covers and how you need to solve that problem in your domain.
Keep it in mind: just because you can drop someone else’s code into your app and use it, doesn’t mean it’s all roses and butterscotch.
Interviewing to seek values
Adam Wiggins, per usual, is on to something. Values:
Sharing values is the most important part of effective collaboration. If you don’t have significant overlap on values between you and your teammates, you’re going to have a tough time getting anything accomplished.
I’m starting to think that figuring out what the other person puts a premium on is the most important part of a technical interview. Is the other person passionate in the same way you are? Are the things they obsess over complementary to what you would rather gloss over? If the answer to these questions is yes, you’ll probably make awesome things together.
Software development requires empathy
If You Want to Write Useful Software, You Have to Do Tech Support:
It seems so obvious: if you want to develop software that’s useful to people, you’ve got to talk with them. But too many developers take the anti-social approach and consider customer support to be beneath their status. Besides, talking with customers would distract them from important code-slinging.
I have to remind myself, almost every day, that one of the most important qualities I can possess as a developer is empathy. Primarily for the user, their cognitive load, and what they’re trying to accomplish. But further, for the developer who comes to my code when I’m done, the guy who operates it, and everyone else down the line.
When to do test-driven development
I believe that writing code using testing[1] as a design activity yields long-term benefits that make my life easier.
Though I’m a strong believer, I’ve struggled with TDD in the past. I’ve found I get bogged down in keeping the red-green-refactor cycle going. Sometimes I have to work with code that is lacking sufficient tests, but I know I can’t boil the ocean before I proceed to whatever I’m really trying to do with the code. Other times, I’m not sure if I’m testing the right things; I could be missing tests in one place and writing too many tests in another.
Three easy pieces
Last week, I read three insightful pieces and made one discovery of my own that deepened my understanding of the practice of TDD. Allow me to share.
First off, Kent Beck has been exploring the phases in the life of a startup. The earliest stage is proving the idea. He later asserted that when you’re doing stuff like this, you can drop TDD, temporarily. I thought about this and it clicked. If you’re working on a prototype, where you’re trying to explore an idea and see if it works, you don’t want to iterate on the code, as TDD would have you do. You want to iterate on the idea. TDD will just slow you down.
On the other end of the spectrum, you’ve got maintenance programming. Once a startup, company or project has proven their idea and shipped a system, you are maintaining software. Keeping it working, living, breathing. For this sort of development, where you’re making small, focused changes without adding significant functionality, Tim Bray pointed out that TDD is critical. Using it to drive the process of fixing bugs, cleaning up the system and adding minor functionality is really handy for figuring out if you’ve broken something in some dark corner. It also helps the next guy to do the same. It’s a win, and I suspect it’s the sweet-spot of TDD.
If you imagine prototyping and maintenance as opposite ends of the software life-cycle spectrum, adding significant new features to existing software probably lies somewhere in the middle. You may need to “poke around” to decide if what you’re doing is right, like when you’re prototyping. But once you’re done, you want some way for others who have to maintain the software (such as yourself) to figure out what it’s supposed to do and whether assumptions have been broken.
Then I read an anecdote by Uncle Bob about how he worked out an ambitious new feature in FitNesse. He and his pair got a new feature working, celebrated, and called it a night. When he woke up, he realized they weren’t actually done; they still had to clean it up, by writing tests.
Oh. Duh.
This was the missing link, for me. Sometimes, you need to iterate on the idea first. If skipping the tests helps you, so it goes. Once you’ve got the idea working (and committed!), then start writing tests[2]. Once you’re happy with the structure and coverage of your code, you commit again and then push it to your peers[3].
The crux of my revelation is this: you get the benefits of TDD-as-a-design-activity by doing it. When you do it is immaterial. You just have to do it.
My own revelation is blindingly obvious in retrospect. If the going gets tough, proceed in this order: get it to work, write some tests for it, then clean up the code. Sometimes you can break this cycle if what you’re working on doesn’t take too much cognitive capacity. But if you overflow your mental buffer, you have to break it down into steps and work through the cycle. Failing to realize this was one of the causes of me bogging down in TDD.
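Here is a small sketch of that order of operations in Python, using the standard `unittest` module. The `slugify` function and its test cases are hypothetical stand-ins I made up; any function you hacked into existence while exploring an idea would do.

```python
import unittest


# Step 1: get it to work. Written while exploring the idea, no tests yet.
def slugify(title):
    out = ""
    for ch in title.lower():
        if ch.isalnum():
            out += ch
        elif out and out[-1] != "-":
            out += "-"
    return out.strip("-")


# Step 2: backfill tests that pin down what the working code does.
class SlugifyTest(unittest.TestCase):
    def test_spaces_become_single_hyphens(self):
        self.assertEqual(slugify("When to do   TDD"), "when-to-do-tdd")

    def test_punctuation_is_dropped(self):
        self.assertEqual(slugify("Oh. Duh."), "oh-duh")

    def test_empty_title(self):
        self.assertEqual(slugify(""), "")


# Step 3: with the tests green, the loop above can be cleaned up safely,
# re-running the suite after each change.
```

Save it as a module and run it with `python -m unittest`. The tests don't care that they were written second; they still catch a cleanup that changes behavior.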
Context is everything. Always.
Adding context to answer the question of when you start writing tests is something I haven’t found much writing on until recently. I’m increasingly finding that considering the situation is a great ninja-move in my quest towards writing beautiful, useful code.
[1] Call it a test, example, behavior, or story. Whatever.
[2] Jim Weirich did a great presentation on how to backfill tests on existing code.
[3] Pardon the Git-centric terminology[4].
[4] If you are not already, please start using Git immediately.
Meaningful work
Just like being awake is more than just having your eyes open, going to work should be more than just being at a workplace trading time for money. It should be meaningful. But where does meaning come from? Of course, it comes from ourselves. We put meaning into things, and share our meanings with others, and teach each other how to build meaning out of what is in front of us.
Buster’s on to something here. He’s articulating one of the qualities I find in the best developers: what they do has meaning and matters to their personality. They are working to make things that result in greater happiness for themselves and others. Their passion is manifest in the quality of their code.
You should also check out Buster’s personal site. He’s got a neat info-graphics, personal data-mining thing going on there.
The joy of enigmas
...thinking about an enigma. There it is before you—smiling, frowning, inviting, grand, mean, insipid, or savage, and always mute with an air of whispering, ‘Come and find out.’
— Joseph Conrad, Heart of Darkness
Visualizing language trade-offs
Guillaume Marceau has done some excellent work crafting the data from the venerable Computer Language Benchmarks Game into visualizations that quickly show the trade-offs of using each language. His piece, “The speed, size and dependability of programming languages,” puts each language in a small graph that simultaneously shows the execution speed and program size of each test for every language. From there, the characteristics of each language are manifest. He then goes on to consider whether functional languages display unique performance/size characteristics.
This is a must-read. It’s also great information design, proving that programming language esoterica needn’t bore the reader.
How did SQL get so popular?
Many developers, especially of the younger generation, dislike relational databases and their business-partner, SQL. It is regarded by some as the new assembly language. With all this distaste going around, how did it gain such a strong foothold in industry?
I offer you two answers: ACID and surface area.
ACID
Atomicity, consistency, isolation and durability. It’s not something most folks want to think about. To a rookie developer, it’s overwhelming. They’re not yet familiar with the semantics of the systems their programs run upon. Is fread thread-safe? “How should I know, I just learned C last semester and about fread’s parameters last week!”
The promises of a modern relational database include a compelling bullet point: your data is safe with us. Use our APIs, don’t break the rules, and we will make sure you never blow away some data and get a call at 3 AM. Rather, the DBA will, but what do you care about that guy?
So I submit to you that most programmers don’t use databases because they’re great. Rather, they have come to rely upon them because the canonical tome on transactions is heavy enough to maim small mammals and rife with formalisms. So they skip the nine-hundred page textbook and pick up the six-hundred page O’Reilly book.
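To show what that promise buys you, here is a minimal sketch of atomicity using Python's built-in `sqlite3` module. The `accounts` table, the names, and the `transfer` helper are hypothetical, invented for this example; SQLite is just the handiest relational database to demo with.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )


try:
    transfer(conn, "alice", "bob", 500)  # would overdraw alice
except sqlite3.IntegrityError:
    pass  # the CHECK constraint failed, so the credit to bob is undone too

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)  # {'alice': 100, 'bob': 0}
```

The credit to bob runs first and succeeds, yet it never becomes visible, because the debit that follows fails and the whole transaction is rolled back. Getting that behavior out of hand-rolled files is exactly the part the heavy textbook is about.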
Surface Area
Most programs that people will pay you to write involve side-effects. Further, many of those side-effects have to do with saving data off so you can perform further side-effects on it in the future.
The rookie developer typically leans first to files. Files are familiar and pervasive. But files leave a lot to said rookie. How should I structure my data? How do I load and save the data? How do I manipulate the data once it’s in memory? Even in scripting languages, with their simplified APIs, this means the rookie is faced with APIs like this:
fopen
fread
fwrite
seek
fclose
encode
decode
hash_set
hash_get
When I was but a wee lad o’ programming, I found this Gordian knot difficult to cut. But then, one day, I was told by a programmer of greater wisdom to use a database. That API looked like this:
connect
execute
fetch
next
select
insert
update
delete
It was a lot easier to understand, even though the last four are a completely different language.
So, I submit to you, that SQL also won because it was easier to understand how one might structure their programs, make them work and, if they’re lucky, get them to run quickly.
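For a side-by-side feel of the two surface areas, here is a rough Python sketch that stores the same little data set both ways. The file name, the `people` table, and the sample data are all made up for illustration, and Python's `json` and `sqlite3` modules stand in for the generic calls listed above.

```python
import json
import os
import sqlite3
import tempfile

# File approach: we pick the format, the encoding, and the load/save cycle ourselves.
path = os.path.join(tempfile.mkdtemp(), "people.json")
with open(path, "w", encoding="utf-8") as f:      # fopen / fwrite / fclose
    json.dump({"ada": 36, "linus": 28}, f)        # encode
with open(path, encoding="utf-8") as f:           # fopen / fread / fclose
    people = json.load(f)                         # decode
people["grace"] = 45                              # hash_set
with open(path, "w", encoding="utf-8") as f:      # rewrite the whole file to save
    json.dump(people, f)

# Database approach: connect, execute, fetch, and let SQL do the rest.
conn = sqlite3.connect(":memory:")                # connect
conn.execute("CREATE TABLE people (name TEXT PRIMARY KEY, age INTEGER)")
conn.executemany(                                 # insert
    "INSERT INTO people VALUES (?, ?)",
    [("ada", 36), ("linus", 28), ("grace", 45)],
)
conn.execute("UPDATE people SET age = age + 1 WHERE name = ?", ("ada",))  # update
conn.execute("DELETE FROM people WHERE name = ?", ("linus",))             # delete
rows = conn.execute(                              # select / fetch
    "SELECT name, age FROM people ORDER BY name"
).fetchall()
print(rows)  # [('ada', 37), ('grace', 45)]
```

The file version is not longer, but every decision in it (format, when to rewrite, how to find a record) was mine to make and mine to get wrong. The database version hands those decisions to someone else.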
Inflection point
I’d wager that five years from now, the generation of developers who are now upcoming won’t take the database tier for granted. Key-value stores, distributed file systems and document databases will all play into the question of “what do we do with the important data?” Sometimes, relational databases will prove useful. But increasingly, other things will too.
In the end, there are two ways to look at this: we will soon throw down the shackles of our relational overlords, or, prepare yourself for the database renaissance in programming fashion that will occur in a decade or so.
Everyone wins!
Put your objects in space
Space-based Architecture - on building and scaling your system with a tuple space, the kissing cousin of the messaging queue. I didn’t know that tuple spaces are used much in finance apps, but I’m not surprised. They’re a worthy idea.
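If you haven't run into tuple spaces before, here is a toy, single-process sketch in Python. It is not the API of any real space-based product. The operation names loosely follow Linda's `out`, `rd` and `in` (spelled `take` here, since `in` is a Python keyword), and the `TupleSpace` class and the price tuple are my own invention for illustration.

```python
import threading


class TupleSpace:
    """A tiny in-process tuple space; None in a pattern matches any value."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, tup):
        """Write a tuple into the space and wake any waiting readers."""
        with self._cond:
            self._tuples.append(tuple(tup))
            self._cond.notify_all()

    def _find(self, pattern):
        for tup in self._tuples:
            if len(tup) == len(pattern) and all(
                p is None or p == v for p, v in zip(pattern, tup)
            ):
                return tup
        return None

    def rd(self, pattern, timeout=None):
        """Return a matching tuple without removing it; None if the wait times out."""
        with self._cond:
            self._cond.wait_for(lambda: self._find(pattern) is not None, timeout)
            return self._find(pattern)

    def take(self, pattern, timeout=None):
        """Remove and return a matching tuple; None if the wait times out."""
        with self._cond:
            self._cond.wait_for(lambda: self._find(pattern) is not None, timeout)
            tup = self._find(pattern)
            if tup is not None:
                self._tuples.remove(tup)
            return tup


space = TupleSpace()
space.out(("price", "ACME", 12.5))
quote = space.rd(("price", "ACME", None))   # peek, tuple stays in the space
taken = space.take(("price", None, None))   # consume, tuple is removed
```

The kinship with a message queue shows in `take`: producers and consumers never address each other, only the space. The difference is that consumers pick work by pattern rather than by arrival order.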
John Mayer, closet software developer
“The idea is to run as many concurrent streams of production as we can.” - Is John Mayer recording an album or bootstrapping an indie app?
When technical discussions get intense
Pro-tip: trying to unwind contentious technical discussions is a losing game. There are really multiple things going on: people discussing trade-offs in absolutes, personal vendettas being aired, missing tact filters and turf protection. If you’re lucky, there’s also some useful information hidden in the turd tossing.
Solution: don’t read too deeply, go do something useful instead.
Bonus tip: talking it out, face to face, over good drinks in a nice environment is “something useful”.
The power of not knowing
It's a programmer's biggest strength when he knows what he doesn't need to know. And gaining (experience) in not knowing isn't as easy as it sounds.
Developing fluidly
Here’s a raw idea I’m playing with in my head:
Agile development is great. But, if your team doesn’t map well to it, steal ideas from agile relentlessly.
You want a fluid environment where developers can solve problems (features, defects, chores) as they see fit.
Don’t use procedures to normalize productivity or as a communication protocol.
Do have a way to communicate things that need to get done or could possibly get done.
You need a safety net. Unless you know better, that safety net is some kind of automated developer test suite.
Enable developers, don’t direct them.
Discuss.
Read slightly less, practice slightly more
Chris Wanstrath, a smart and distinguished fellow, advises us to burn our news readers and just “hear it through the grapevine.” But how far can one go with that?
For myself, reading feeds gets me a few things:
- Aesthetic where I have none. Feeds like BLDGBLOG and Coudal point me to things that make me better at what I do, in a tangential way, and a more interesting person. These are things that otherwise I wouldn’t know where to start.
- Awareness on the edges. Reading folks like Simon Willison or Jason Kottke makes sure that interesting topics in programming or erudition don’t go unseen even when I am not focused on them.
- Aggregation of ideas. This cuts two ways. Most people worth reading compress a bunch of different sources down to a manageable stream. This gives me more bang for the buck in my feed reading time. On the other hand, if a link is mentioned several times in the aggregate of feeds I subscribe to, then it’s probably worth checking out.
I can see how following interesting folks on Twitter and reading aggregators occasionally can get you some of this, but not all of it. With sources like Reddit or Hacker News, signal to noise is a problem - you can’t control who posts what. Some people have a lot of extra angst and/or spare time. Which is also the other side of the Twitter story. Some people are great to read, but a pain to put up with at times. So it goes.
When Chris' essay first hit the wires, I was tempted to adopt his ways. But, I think I’m pretty good at ignoring the need to unbold things and cut down to business. What has proved immensely useful to me was his encouragement to just code all the time and make lots of stuff. I’m just getting started with this, but already I’m liking the increased feeling of accomplishment.
Regardless, we could all probably stand to trim our feed lists and hunker down on our projects, no?
Domain Driven Design
Designing software is a tricky thing. It’s tempting to front-load it on a project. That won’t work because the start of a project is when you know the least about it. So some folks try to do as little design as possible. I’m guilty of this. However, that can lead to software that doesn’t adequately express the problem it’s trying to solve. At the other extreme, there is often a temptation to over-design software with lots of ceremony and architecture.
There’s a book that draws a reasonable compromise between these forces. I’ve been meaning to read Eric Evans' Domain Driven Design for a while now. The emphasis of the book is in collaborating with domain experts and other developers to find the essence of the problem space and then express that in software (as objects). I’ve often pointed out the utility of building applications from the language up and the problem domain down. DDD focuses precisely on the latter.
One of the core concepts in the book is the ubiquitous language that is used to describe the problem at hand. This language is used by the domain experts (customers) and the developers. The language is then woven into the design of the system. This leads to software that is more likely to succeed, both in business terms and in terms of development effort. Evans spends the first part of the book describing the particulars of this language.
He then moves on to describing the technical side of the software. Entities, value objects, services, factories, modules and repositories are terms I was already somewhat familiar with that Evans gave a more crisp and satisfying definition to. For most people, this is probably the tasty meat of the book, illuminating the way from a competent developer to an outstanding developer.
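To pin a few of those terms down for myself, here is a small Python sketch of a value object, an entity, a factory and a repository. The invoicing domain and every name in it (`Money`, `Invoice`, `InvoiceFactory`, `InMemoryInvoiceRepository`) are hypothetical examples of mine, not code from the book, and the definitions in the comments are my paraphrase rather than Evans' wording.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Money:
    """Value object: defined entirely by its attributes, and immutable."""
    amount_cents: int
    currency: str

    def plus(self, other: "Money") -> "Money":
        if other.currency != self.currency:
            raise ValueError("currency mismatch")
        return Money(self.amount_cents + other.amount_cents, self.currency)


@dataclass(eq=False)
class Invoice:
    """Entity: its identity (invoice_id) persists while its state changes."""
    invoice_id: str
    total: Money

    def add_charge(self, charge: Money) -> None:
        self.total = self.total.plus(charge)

    def __eq__(self, other):
        return isinstance(other, Invoice) and other.invoice_id == self.invoice_id

    def __hash__(self):
        return hash(self.invoice_id)


class InvoiceFactory:
    """Factory: one place that knows how to build a valid new entity."""

    @staticmethod
    def open_invoice(currency: str) -> Invoice:
        return Invoice(uuid.uuid4().hex, Money(0, currency))


class InMemoryInvoiceRepository:
    """Repository: collection-like access to entities that hides the storage."""

    def __init__(self) -> None:
        self._by_id: Dict[str, Invoice] = {}

    def add(self, invoice: Invoice) -> None:
        self._by_id[invoice.invoice_id] = invoice

    def get(self, invoice_id: str) -> Optional[Invoice]:
        return self._by_id.get(invoice_id)
```

Two `Money(500, "USD")` values are interchangeable, while two invoices are the same only if their ids match, whatever their totals. That split between "equal by value" and "equal by identity" was the piece that made the vocabulary click for me.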
The last part of the book focuses on the larger scale issues of deep design. I was particularly pleased that he covers how software design is affected by various good and bad social issues. It also gives a strategic view of the forest, where most books on software development focus on a more tactical view of each tree.
I’m fond of pointing out books that are inflection points in my way of thinking about software development. Code Complete, The Pragmatic Programmer, The Dragon Book and My Job Went To India all fall under this category. Domain Driven Design is certainly the latest addition. It makes sense of trends I see in great software and illuminates a path to make software like it myself.
If reading this review didn’t make you want to vomit, you should probably read the book posthaste.