How many design considerations are there in an almost trivial method? Let’s look at two of them. Consider this code:
def publish!
self.update_attributes(created_at: Time.now)
end
If you’ve been studying OO design and the SOLID principles, using TDD as a practice to guide you towards those ideas, there’s a missing piece here. The reference to Time
is a dependency that should be injected. In Ruby, it’s really easy for us to fix that:
def publish!(time=Time.now)
self.update_attributes(created_at: time)
end
I suspect a lot of TDDers would instinctively write the above first, skipping the first version by force of habit. But, let’s stop and think about what the drives us to want the second version.
The strength of the second version is that it is designed for test. If we need to test how this model behaves when it is published at night, or on a leap day, or the day before Arbor Day, injecting the time object makes that easier.
There are some other test-focused design direction this method could go. We could create our own object whose role is to hand out timestamps, which would allow us to reasonably stub out the time reference, instead of injecting it. I’ll bet there are other approaches lurking out there as well.
I want to look at another set of design considerations. I could design this code for testability, which often leads me to code that follows the SOLID principles which often leads me to decoupled code that is easier to change later. To many people, that’s a good thing.
However, there’s another lens I can look through: API design. How does this method hold up as a piece of behavior that developers will leverage?
Strictly speaking, the TDD’d version is a more complicated API. Even adding one optional parameter to a method carries “mass”. Consider documenting the parameter-less version:
Publishes the current post. The
created_at
timestamp is set to the current time. Returns thecreated_at
timestamp.
For numbers sake, it’s 40 words. More importantly, it reads linearly. Now let’s look at the dependency-injected version:
Publishes the current post. By default,
created_at
is set to the current time. Optionally, callers may pass in aTime
object, or any object that returns aTime
object when sent thenow
message. Thecreated_at
column is set according to thatTime
value. Returns the value of thecreated_at
timestamp.
This one is 54 words. That’s not too many more, numerically, but notice that the explanation is no longer linear. There’s a default, easy case where I don’t care about the timestamp. Then there’s a clever case where I do care about the timestamp. In real API documentation, I’d need to specify when and why I’d want to use that clever case and what it looks like.
There’s some further potential trouble lurking in this API. What if a caller passes in the wrong kind of Time
object? What if sending the now
message raises an exception? Those are important parts of the API too, both from a behavior specification perspective and when considering the user experience of using this API in code and possibly troubleshooting it when things go wrong.
My point is, that optional argument is starting to look rather weighty. Adding the code is pretty trivial. The possible interactions with the optional argument and its support cost is where it gets expensive. Like many things, it’s a trade-off.
I won’t claim to know which of these is better. Honestly, I think it comes down to a subjective view on what’s important: test design, or API design. This is where I can’t make a bold-sounding prognosis. I believe that design, even of code, is about deciding what to leave out. Everyone has to decide what to leave out for themselves.
I suppose it’s worth noting that the un-injected description isn’t as specific as the injected description. The phrase “the current time” isn’t as precise as the description for the injected version. Also, in the injected version’s description, “or any object that returns a Time object when sent the now message” isn’t true.
IMO, the first description warrants less precision because it’s an implementation detail the caller probably doesn’t care about. Even if you did specify its complete behavior, I think that specifying all the edge cases for the test-driven version would mean the API-driven version stays simpler.
But you’re not specifying the edge cases, you’re describing behavior which doesn’t exist! The injected version doesn’t send a now message to its argument. That’s solely a feature of the evaluation of the default argument.
I would have written the injected description: “Publishes the current post. The published_at timestamp is set to the given time, which defaults to the current time. Returns the published_at timestamp.”
You could design for test and API in that case using a clock object that gives you current time and mock it on your tests.
Would you like that approach?
@donaldball yep, you’re right. But, I think you’re missing the forest for the trees. The injected version does more things, even if they’re trivial. I’m not trying to say one way is better, I’m just pointing out a tradeoff that I see a lot of people fail to recognize.
@guilespi You could do that. I think it would satisfy the “rules” of mockist testing. But then you have a clock object whose only purpose in the design is to make it so you don’t violate the rules of mockist testing. I would have a hard time selling that clock object to someone who favors API design over test design.
I think DI is a tradeoff, but in my experience, a sane use of DI is worthwhile in many cases, but I don’t think it needs to be used in every case.
I’ve tested code that is filled with hard dependencies in PHP, and it’s basically impossible to cleanly test outside of integration testing, which is gross. So, to make that code testable, I pretty much have no choice but to rewrite that code with DI. It gets really gross really fast.
I’ve taken the approach to use DI in places where I want to be able to inject things, which is not everywhere. For example, when I’m writing an Obvious app, I use DI to change out the data source, but that’s pretty much it. When I’m testing the app actions I still use real Entity objects. The data source test doubles are also real objects, so the tests run fast, but act more like a controlled integration test instead of a pure unit test. It feels like a good middle ground.
This is really interesting, and I’ve been thinking about some of the same things, though I’m working on the frontend with javascript. But let me try to describe a different way of thinking about this. Let’s go back to the first version of the method, which I agree is ideal from an API perspective. The thing we want to test is “how this model behaves when it is published at night, or on a leap day…”. That’s fine, but those things don’t actually reside in this method. This method has 2 things that:
1) A call to Time.now. I’m not a ruby guy, but this is a pretty standard dependency and there’s no need to put it under test.
2) A call to update attributes. I’m not a rails guy either, but I’m pretty sure this is a well understood mechanism and it’s covered under rails tests.
So if you think about it, the thing we want to test isn’t actually in this method. It’s not actually visible. The thing we want to test is what happens with any save hooks or commit hooks from updating this model. We’re just using this method as a kickoff point for those.
That seems an intuitive approach, but as we’ve seen, it causes problems because we don’t control all the inputs to this method. What I’ve been thinking instead is can we actually move past this method and directly trigger the thing we want to test, e.g. setup a model with a created_at date of leap day 2013, then manually trigger the commit hooks. It really has nothing to do with where we get the time from.
I don’t know if this is possible in rails. But to draw a comparison to client-side code. I might want a test to see if a model does the right thing when a button is clicked. But the test doesn’t need to render into the DOM and then have some harness to simulate a click on an href. Just setup your model, set the right state, and then call model.trigger(‘change’). Or whatever your framework dictates. This has exactly the same effect, it’s just skipping the coupling with UI that’s really not part of the test.
I hope this makes sense. I’m really looking for more tools to manage the tradeoffs between strict DI constraints and trying to create usable APIs.
Would a “published_time” instance method that simply returns Time.now achieve the “best of both worlds”? The “publish!” method’s interface stays clean while providing a means to inject the time value during test? I suppose that’s still increasing the API of the class (unless the method were private) but it at least moves it out of the away from the primary goal of “publishing” (from a consumer/documentation perspective).
I think that adds more weight to the API, so it seems suboptimal to me.
In the end, it’s likely that trying to please the API people and the DI people at the same time always results in compromises that neither side likes.
Since code is read many more times than it is written I think that a clean uncluttered API should win just about everytime.
Your code examples at the top are updating the created_at timestamp. s/created_at/published_at.
Fixed. Thanks!
If we find ourselves changing an API *just* to aid testing then that is wrong to me. It is a failing of the tools or language.
There are clearly two lines of thought that should be kept separate: “what is the API/what is the contract?” and “How can I test this API?”. They are in completely different universes.
I’ve lost count of how many simple codebases have lost their way by DI changes to make them “testable”. I’ve seen simple, clean methods calling concrete types turn into a complete mash of injected interfaces and type hierarchies. Simplicity remains a poorly understood concept.
As you’ve said, developers need to see the tradeoff.
To me, it’s your method naming that is the problem. A method named ‘publish!’ indicates ‘publish now’, a method named ‘publish_at!’ or ‘publish_at_a_specific_time!’ indicates I have control of the timing (past, present, future) of when a thing is published.
With Ruby, we can easily create both using a single method and an alias:
def publish_at!(when=Time.now) # use to publish at a specific time
alias :publish! :publish_at! # use to publish at Time.now
Though the alias is not a strict definition (I can pass it an argument), the name indicates ‘publish now’, and this solution seems to satisfy both design for API and design for testing.
It’s all in the name.
Robert, I wholly agree that naming is crucial.
I have to disagree that an alias can satisfy both camps. You still bear the weight of two methods and explaining when to use which, why they both exist, etc.
I agree with Russ that this is a failing of the tools (not the language in this case).
Mocha and Timecop are great tools that would aid in the testing of methods such as these in which testing needs are different than API needs. In this specific case, all you need is Timecop. For stubbing methods other than Time.now, you can use Mocha.