Domain Models, Anemic Domain Models, and Transaction Scripts (Oh My!)
Ever work on a small project (say 5-8 developers, a few hundred thousand lines of code) and get the feeling that the codebase is unreasonably large and difficult to navigate or use/reuse? Ever notice that other people keep duplicating logic (validation logic, say) all over the place, violating the DRY principle every which way? Ever notice how difficult it is to change one part of your system without breaking lots of stuff in another part (I mean, this happens anyway, to a degree, but is it a common occurrence on your project)?
It would seem that your project might be suffering the side effects of an anemic domain model. As I have observed that these models tend to be used mostly with a transaction script style design, I will use these two terms interchangeably.
First, what are anemic domain models and transaction scripts? This has been discussed to death, and there are tons of resources that describe an anemic domain model, so I’ll spare you the repetition. In short, the approach creates a lot of dumb “shell” classes: all of your core domain objects are essentially DTOs with everything public. That is because an anemic domain model relies on the components of a transaction script (typically things called “services”) to act on those objects and modify their state, which usually breaks encapsulation. You’ll recognize it if you are constantly using classes with a “Service”, “Util”, or “Manager” suffix.
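As a rough illustration of that shape, here is a minimal sketch in C#. The `Event` and `EventScheduler` names match the calendaring example used later in this post; the `Title` and `Start` properties are assumptions made up for the sketch.

```csharp
using System;

// Anemic "shell" class: state only, with everything public and settable.
public class Event
{
    public string Title { get; set; } = "";
    public DateTime Start { get; set; }
}

// The behavior lives in a separate service that reaches into the Event.
public class EventScheduler
{
    public void MoveEventDate(Event calendarEvent, DateTime newStart)
    {
        // Any rule about moving events would have to live here,
        // and in every other place that also sets Start.
        calendarEvent.Start = newStart;
    }
}
```

Nothing stops another caller from assigning `Start` directly and skipping the service altogether, which is exactly the broken encapsulation described above.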
The wiki page for anemic domain model has a nice summary of why you should avoid using them:
- Logic cannot be implemented in a truly object oriented way unless wrappers are used, which hide the anemic data structure.
- Violation of the principles of information hiding and encapsulation.
- Necessitates a separate business layer to contain the logic otherwise located in a domain model. It also means that domain model’s objects cannot guarantee their correctness at any moment, because their validation and mutation logic is placed somewhere outside (most likely in multiple places).
- Necessitates a global access to internals of shared business entities increasing coupling and fragility.
- Facilitates code duplication among transactional scripts and similar use cases, reduces code reuse.
- Necessitates a service layer when sharing domain logic across differing consumers of an object model.
- Makes a model less expressive and harder to understand.
I’ll admit: I’ve been guilty of this very pattern (or anti-pattern, if you want to be an “object bigot”). It’s not that it’s a bad thing. In fact, as Greg Young argues, it’s a perfectly suitable pattern in some cases. Fowler himself says that there are virtues to this pattern:
The glory of Transaction Script is its simplicity. Organizing logic this way is natural for applications with only a small amount of logic, and it involves very little overhead either in performance or in understanding.
It’s hard to quantify the cutover level, especially when you’re more familiar with one pattern than the other. You can refactor a Transaction Script design to a Domain Model design, but it’s harder than it needs to be.
However much of an object bigot you become, don’t rule out Transaction Script. There are a lot of simple problems out there, and a simple solution will get you up and running faster. (PoEAA p.111-112)
So it’s not that it’s inherently a bad design. What I’ve found through my own experience is that it doesn’t scale well. What does this even mean? Again, I admit a certain level of ignorance; it’s not until about a year and a half ago that I finally “got it”. I had read through Fowler’s book a few years ago, and found it difficult to grasp what it meant to write an application using a domain model. Fowler himself only spends some 9 pages discussing the topic and, in his closing paragraph, “chickens out” on providing an end-to-end example.
In working on my current project with several junior consultants, I’ve found myself trying to explain what it means to design a software system using a domain model as opposed to a transaction script model. In doing so, I think I’ve whittled it down to a pretty simple set of examples and the simplest explanation of why a domain model is far superior to a transaction script model when working on a team of more than one 🙂
While codebases written in the style of DM or TS/ADM will probably contain the same number of classes, there is a big difference in how those classes are wired together and how much surface area a user of the codebase (let’s call it an API) will need to know.
As an aside, IMO, as soon as you’re programming in a team of more than one, you’re writing an API or a framework of some sort, even if on a very small scale and very loosely defined.
One of the key benefits of a domain model approach is that it hides the complexity of the different components behind the core objects of the problem domain. I like to think that in a transaction script model, the locus of control is placed outside of your core objects. In a well-designed domain model, the locus of control is contained within your core objects (even if big pieces of functionality are still implemented in external services).
Take a calendaring application for example. In a transaction script implementation, you’d have something like this to move a scheduled event (in C#):
```csharp
// What you would expect to see in a TS/ADM
Event _anExistingEvent = ... ;
EventScheduler scheduler = new EventScheduler();
scheduler.MoveEventDate(_anExistingEvent, DateTime.Now.AddDays(1));
```
In contrast, what you would expect to see in a domain model approach:
```csharp
Event _anExistingEvent = ... ;
_anExistingEvent.MoveEventDate(DateTime.Now.AddDays(1));
```
The difference is subtle, but there is a huge gain in usability as there is one less class involved in the interactions of your domain object; a user of your API/framework (i.e. your coworker) doesn’t have to learn about the EventScheduler to use your code. Likewise, if we consider the other things that we can do with events, we start to see the benefits of encapsulating (or rather hiding) the business logic and complex interactions within the domain object instead of in external services. For example, imagine another scenario where you want to send an event in an email:
```csharp
// TS/ADM style
Event _anExistingEvent = ...;
string _serverAddress = ...;
EventSendingService service = new EventSendingService(_serverAddress);
service.CreateMessageForEvent(_anExistingEvent);
service.Connect();
service.SendEventMessage();
```

```csharp
// DM approach:
Event _anExistingEvent = ...;
_anExistingEvent.SendEventMessage(_serverAddress);
```
In this case, we now need to be aware of two services to interact with events in our calendaring application: one for moving a date and one for sending an event. Not only that, they’re relatively hard to discover. A method on the domain object would be easily discoverable via IntelliSense, whereas it’s not clear how a user new to the system would know which classes were involved in which business interactions without sufficient documentation and/or assistance from the long-timers.
Now, I’ve intentionally left out the implementation of the Event class, and I’m not implying that it contains any less code (there may be more, and it may be far more complex, involving dependency injection or inversion of control). But if we consider this code as a public API intended for use by others, the domain model approach is clearly much more usable and approachable than a transaction script approach, where a user of the API has to know many more objects and understand how they are supposed to interact.
I would hardly call myself an expert on the subject (as I’ve written many an anemic domain model in my time). But to me, externally, the distinguishing feature of a domain model approach over a transaction script approach is that the intended usage of the codebase is more discoverable, even if the LOC in the actual implementation only differs by 1%.
Visually, I think of the difference like this:
Clearly, we can see that one of the key benefits of a domain model approach is that there is less coupling between your calling code and your business logic (making it somewhat less painful to change implementation). Note that there aren’t necessarily any fewer service classes in a domain model approach (although their APIs are likely dramatically different from the APIs of a transaction script model). We can also see that in a domain model approach, the caller or user of the API only has to know about the domain objects and may or may not know about the services in the background (if we were designing for testability, we’d have some overrides that allow us to pass the concrete service for purposes of mocking). Interacting primarily with the domain objects has the benefit of making it easier to think about the business scenario and the business problem that you’re trying to solve.
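To make this a bit more concrete, here is a minimal sketch of what might sit behind the `Event` class from the calendaring example, including a testability override. Only `Event`, `MoveEventDate`, and `SendEventMessage` come from the snippets above. The `IEventMessageSender` interface, both sender classes, and the `Title` and `Start` properties are hypothetical names invented for the sketch, and the default sender just writes to the console instead of talking to a mail server.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over the mail plumbing (not from any real library).
public interface IEventMessageSender
{
    void Send(string serverAddress, string subject, string body);
}

// Hypothetical default sender; a real one would connect to the mail server.
public sealed class ConsoleEventMessageSender : IEventMessageSender
{
    public void Send(string serverAddress, string subject, string body) =>
        Console.WriteLine($"[{serverAddress}] {subject}: {body}");
}

// Test double that records what would have been sent.
public sealed class RecordingSender : IEventMessageSender
{
    public List<string> Sent { get; } = new List<string>();

    public void Send(string serverAddress, string subject, string body) =>
        Sent.Add($"{serverAddress}|{subject}");
}

public class Event
{
    private readonly IEventMessageSender _sender;

    public string Title { get; }
    public DateTime Start { get; private set; }

    // Everyday callers use this constructor and never see the service.
    public Event(string title, DateTime start)
        : this(title, start, new ConsoleEventMessageSender()) { }

    // Override for tests: pass in a concrete (or fake) service.
    public Event(string title, DateTime start, IEventMessageSender sender)
    {
        Title = title ?? throw new ArgumentNullException(nameof(title));
        Start = start;
        _sender = sender ?? throw new ArgumentNullException(nameof(sender));
    }

    // State changes go through the object, so Start has a private setter.
    public void MoveEventDate(DateTime newStart) => Start = newStart;

    // The caller supplies only the server address; the service stays hidden.
    public void SendEventMessage(string serverAddress) =>
        _sender.Send(serverAddress, Title, $"Starts {Start:g}");
}
```

In a test you would construct the event with a `RecordingSender`, call `SendEventMessage`, and inspect `Sent`. Everyday calling code uses the two-argument constructor and only ever touches `Event`.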
Once I grasped this, I started to see the huge benefit that a domain model approach has over an anemic domain model, even on a two-person team. Nowadays, I strongly believe that an anemic domain model/transaction script approach is suitable for only the smallest of application development environments: a one-man team. As soon as you are expected to program different, interacting parts of a system in a team, class explosion becomes a real problem and a hindrance to usability (which leads to high ramp-up time) and discoverability (which leads to duplication and lots of “Oh, I didn’t know we had that” or “I already implemented that in that other service”). In such a scenario, documentation (which never exists) becomes even more important (and, if it exists, even more dense).
One very real concern is that then the domain object will become far too complex and bloated. Fowler addresses this:
A common concern with domain logic is bloated domain objects. As you build a screen to manipulate your orders you’ll notice that some of the order behavior is only needed for it. If you put these responsibilities on the order, the risk is that the Order class will become too big because it’s full of responsibilities that are only used in a single use case. This concern leads people to consider whether some responsibility is general, in which case it should sit in the order class, or specific, in which case it should sit in some usage-specific class, which might be a Transaction Script or perhaps the presentation itself.
(Incidentally, that last part regarding putting logic in the presentation is what makes ASP.NET webforms so crappy: the design of the framework encourages this, and it doesn’t help that most of the examples in books and on MSDN do the same.)
However, Fowler makes the point that:
The problem with separating usage-specific behavior is that it can lead to duplication. Behavior that’s separated from the order is harder to find, so people tend to not see it and duplicate it instead. Duplication can quickly lead to more complexity and inconsistency, but I’ve found that bloating occurs much less frequently than predicted. If it does occur, it’s relatively easy to see and not difficult to fix. My advice is to not separate usage-specific behavior. Put it all in the object that’s the natural fit. Fix the bloating when, and if, it becomes a problem.
This point is particularly important; I think most of us recognize this scenario: you change some logic in one place and close a bug ticket only to have another one opened up somewhere else regarding the same issue because you forgot to copy your fix over. Blech! If your validation code is external to your object, it’s easy to end up writing it in two use cases and only updating one (and this happens quite often, in my experience). Even if you move your validation code into a common “*Service” class, it is still less discoverable to a new user (well, even team members that have been on the project the whole time, actually) than a method on the class itself. Again, the point is that discoverability and reducing surface area can aid dramatically in terms of cutting down duplication of logic in your codebase.
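Here is a small sketch of keeping that validation on the object itself, so that two use cases (creating an event and changing its times) cannot drift apart. The `End` property, the `Reschedule` method, and the “must end after it starts” rule are all assumptions invented for illustration; the point is only that the rule lives in one place.

```csharp
using System;

public class Event
{
    public DateTime Start { get; private set; }
    public DateTime End { get; private set; }

    // Use case 1: creating an event.
    public Event(DateTime start, DateTime end)
    {
        EnsureValidRange(start, end);
        Start = start;
        End = end;
    }

    // Use case 2 (hypothetical): changing both times of an existing event.
    public void Reschedule(DateTime newStart, DateTime newEnd)
    {
        // Validate before assigning, so a rejected change leaves the event untouched.
        EnsureValidRange(newStart, newEnd);
        Start = newStart;
        End = newEnd;
    }

    // The one place the rule lives; fix a bug here and both use cases get it.
    private static void EnsureValidRange(DateTime start, DateTime end)
    {
        if (end <= start)
            throw new ArgumentException("An event must end after it starts.");
    }
}
```

Because the setters are private, there is no way to put an `Event` into an invalid state from outside, and there is no second copy of the check to forget about.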
IMO, a domain model style application design is the way to go. It’s hard to make a case for transaction scripts or anemic domain models. Granted, a domain model is no silver bullet and can add significantly to the initial difficulty of implementation. But I think that as your codebase matures and grows, the initial investment in setting up the framework (and mindset) to support a domain model more than pays for itself, even if you have to build one and throw it away to learn how.