Wednesday 27 August 2008

A Productivity Hint Explained

In an earlier post, I mentioned that writing code that looks like this:

myObject."$localVariable".attribute = 42

... is a bad idea for three reasons. 

Here are the three reasons I was looking for:

1 Violating encapsulation

You don't want to manipulate the guts of other objects directly. If we made a habit of reaching into whatever objects are near the current chunk of code and doing whatever we want, we'd soon have a codebase that is incredibly difficult to change, because any piece of code you're looking at could be affected by some other piece of code overwriting its local state.

Unit tests work by exercising the public interface of the object, i.e. those services that are made available to the community of other collaborating objects in the system. If any old thing can reach in and mess around with the internal state, there wouldn't be any point in having unit tests, because you couldn't rely on the assumptions on which those tests are based.
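To make that concrete, here's a minimal sketch of an object that keeps its state to itself and is tested only through its public interface. The Account class and its method names are made up for illustration; they don't come from the earlier post.

```groovy
// Hypothetical example: Account, deposit and canAfford are illustrative names.
class Account {
    private BigDecimal balance = 0.0

    // The only way to change the balance is to ask the account to do it,
    // so the rule below can't be bypassed by well-behaved callers.
    void deposit(BigDecimal amount) {
        if (amount <= 0) {
            throw new IllegalArgumentException("Deposit must be positive")
        }
        balance += amount
    }

    // Callers ask a question instead of reading the balance and deciding themselves.
    boolean canAfford(BigDecimal price) {
        balance >= price
    }
}

// A test only needs the public interface.
def account = new Account()
account.deposit(50.0)
assert account.canAfford(30.0)
assert !account.canAfford(80.0)
```

The assertions at the end only mean something as long as nothing else writes to balance behind the account's back.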

Every programmer learning his or her first object language used to be taught encapsulation, but I'm amazed how many really experienced people still don't seem to take this seriously. It's not an aesthetic thing. If you don't get this, please give up.

Incidentally, the Sharble and Cohen study from Boeing provides some evidence that designs which use lots of getters and setters to centralise design in a few classes also make the code slower, because you need to execute more instructions just to (needlessly) push all the data around.

2 It makes the object graph very fragile

This is about the Law of Demeter. Basically when you write the expression myObject.x.y.z.doWhatever() you're assuming that there's a whole bunch of things stuck together, at the end of which is an object that can doWhatever(). 

The code "myObject.x.y.z.doWhatever()" makes a lot of needless assumptions:

  • myObject knows x
  • x knows y
  • y knows z

... and all it really needs is to be given z as a parameter.

If any of the relationships between myObject, x, y, and z change, we get an entirely preventable breakage.

Some people might find it difficult to restructure this code so that you get access to z without rummaging through an object graph (what some people still seem to call a "data structure"). See what I've got to say below about responsibility-driven design to help with this.
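As a rough sketch of the alternative, the code that needs z can simply be handed z. The Z and Client classes here are placeholders invented for the example.

```groovy
// Hypothetical stand-ins: Z is the object at the end of the chain,
// Client is the code that wants to call doWhatever().
class Z {
    String doWhatever() { "done" }
}

class Client {
    // Before: String work(myObject) { myObject.x.y.z.doWhatever() }
    // After: the caller passes z in, so Client assumes nothing about x or y.
    String work(Z z) {
        z.doWhatever()
    }
}

assert new Client().work(new Z()) == "done"
```

Client now has one assumption left: that it has been given something that can doWhatever().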

3 It's a gratuitous abuse of metaprogramming

In other words, it's slower and far more complicated than it should be. Incidentally, the business with ."$whatever" hides $whatever from the compiler, because the property name has to be evaluated at runtime rather than compile time. It also makes things an order of magnitude more difficult for a refactoring IDE. I doubt that there's an easy way for the IDE to refactor things automatically when you rename an attribute, or even to notice that something's wrong when you move the attribute to another class.

anObject."$whatever" = 42

The metaprogramming here with ."$whatever" only became necessary because we're trying to poke away at the guts of another object, something we shouldn't be attempting to do in the first place. We should get anObject to do something, rather than just treat it as a passive store of data (what some people used to more honestly describe as a "record").
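Here's a small sketch of what "get the object to do something" might look like. The Thermostat class and warmTo method are invented for the example; the point is only that the caller names a behaviour, not an attribute.

```groovy
// Hypothetical example: Thermostat and warmTo are illustrative names.
class Thermostat {
    private int target = 18
    private boolean heating = false

    // The object decides what a new target means for its own state.
    void warmTo(int degrees, int roomTemperature) {
        target = degrees
        heating = roomTemperature < target
    }

    boolean isHeating() { heating }
}

def thermostat = new Thermostat()

// Instead of: thermostat."$whatever" = 42
thermostat.warmTo(21, 17)
assert thermostat.isHeating()
```

There's no string-built property name left, so this is an ordinary method call that a refactoring IDE has a much better chance of following.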



At least those are the three most pressing things that spring to mind.

A funnier treatment of points 1 and 2 can be found in The Tragic Tale of POTS and his friend SPOT. Grails developers who've read about POTS will appreciate the irony of having a DogWalkingService.


How to avoid writing this kind of thing / how to fix it

The best way to avoid writing code like this is to think in terms of "responsibility-driven design" - that's the fine art of putting the data right next to the code that needs it - in the same class. 

A long time ago, people doing extreme programming knew about this. In fact, CRC cards, the classic tool for doing RDD, came from Kent Beck and Ward Cunningham, the same people behind extreme programming. Shame it's not so widely used or publicised.

Doing CRC sessions is a very agile-friendly way to do RDD, and it was done on C3, the first ever XP project.


A lot of this stuff seems obvious (at least points 1 and 2). It's surprising how many people don't seem to get this. 

Extreme programming is founded on the concept of lowering the cost of change. The points I've made here aren't theoretical or style-related. This is vital. This is about the basic stuff that makes everything else we do possible.

2 comments:

Unknown said...

You know this isn't the bile blog right? ;-)

dafydd said...

... and I was thinking that I'd managed to be polite about it! Not a single swear-word in there ;-)