Archive for December 2008

Musings on Encapsulation vs. (De)composition

The standard dogma of OO design is that you should not allow the outside world access to the internals of an object. Instead, you are to expose operations on that type. In my current environment there seems to be a general consensus that getters and setters are evil, as they do indeed expose internal state. However, everyone is using them. So what is going on?

Recently functional programming has attracted a lot of interest (rightfully so), e.g. F#, Scala, and a modest comeback of Haskell (Curry on!). While in OO we always strive to hide information (which might be a bit of a misunderstanding anyway), in Haskell it is quite natural to decompose a piece of data and to define functions in terms of functions of its components.

Scala, for example, provides two alternatives for defining operations on data types: traditional polymorphism, and case classes that allow pattern-matching-style operations. Odersky sees both approaches as complementary, arguing that the traditional OO solution makes it easy to add a new type by just implementing its operations, while pattern matching allows for easy addition of new operations on given types.
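To make the trade-off concrete, here is a minimal sketch in Java rather than Scala. The shape types and the `ShapeOperations` class are hypothetical, and the `instanceof` chain is only a stand-in for Scala's pattern matching on case classes, not an equivalent of it.

```java
// Hypothetical shape types, used only for illustration.

// Style 1: traditional polymorphism. Each type carries its own operations,
// so adding a new type (say, a Square) touches no existing code.
interface Shape {
    double area();
}

final class Circle implements Shape {
    final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    public double area() {
        return Math.PI * radius * radius;
    }
}

final class Rectangle implements Shape {
    final double width;
    final double height;

    Rectangle(double width, double height) {
        this.width = width;
        this.height = height;
    }

    public double area() {
        return width * height;
    }
}

// Style 2: decomposition. The operation lives outside the types and looks at
// their components, so adding a new operation touches no existing type.
// The price: every such operation must be revisited when a type is added.
final class ShapeOperations {
    static double perimeter(Shape shape) {
        if (shape instanceof Circle) {
            Circle circle = (Circle) shape;
            return 2 * Math.PI * circle.radius;
        } else if (shape instanceof Rectangle) {
            Rectangle rectangle = (Rectangle) shape;
            return 2 * (rectangle.width + rectangle.height);
        }
        throw new IllegalArgumentException("Unknown shape: " + shape);
    }
}
```

Adding a `Square` only requires a new class for `area()`, but `perimeter` has to be edited; adding a third operation is the other way round.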

My current thinking is that there are certain areas, essentially wherever we externalize the state of objects (e.g. persistence, serialisation, or UI), where decomposing an object into its components and having generic operations performed on those seems quite beneficial. I am not sure how to allow for such legitimate uses while preventing all kinds of logic from accessing the state. Getters and setters are alright if you use them responsibly. However, one could also imagine a more generic access mechanism, perhaps passing in a callback that takes the components of the original object as parameters.
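One way such a callback-based mechanism might look in Java is sketched below. The `Customer` class, its fields, the `decompose` method and the `CustomerJson` helper are all assumptions made up for illustration, not an established pattern from any library.

```java
import java.util.function.BiFunction;

// Hypothetical domain object; the name and fields are assumptions.
final class Customer {
    private final String name;
    private final int age;

    Customer(String name, int age) {
        this.name = name;
        this.age = age;
    }

    // Domain logic stays an operation on the type.
    boolean isAdult() {
        return age >= 18;
    }

    // Generic access: all components are handed to the callback in one go.
    // There are no individual getters that business logic could pick at.
    <R> R decompose(BiFunction<String, Integer, R> callback) {
        return callback.apply(name, age);
    }
}

// A legitimate externalisation of state: serialisation.
final class CustomerJson {
    static String toJson(Customer customer) {
        // Naive string building without escaping, for illustration only.
        return customer.decompose((name, age) ->
                "{\"name\":\"" + name + "\",\"age\":" + age + "}");
    }
}
```

This does not prevent misuse, since any code can still call `decompose`, but a single, conspicuous entry point is easier to spot in a review than getters scattered through the code base.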

Embrace the Power of Unix

fortune | xargs cowsay

What Units to Test?

What to cover?

In my previous post I dismissed the notion that it is necessary to test every class of your system in isolation. I argued that this fixes the protocols at all layers of your application and thus makes refactorings that shuffle responsibilities around more expensive. Perryn rightly pointed out that the refactoring I proposed should have been covered by some kind of regression test. So I assume another layer of tests. It would look something like this, where the red circles mark objects that are tested together. While the lower right unit can be fully integration tested, the other units actually need stubbed or mocked-out collaborators. I think it is worthwhile in some cases to strive for components that have few external dependencies, pushing as much functionality towards the leaves as possible, as Alex recommended.

What is the problem with this approach? Well, you definitely should have an integration test at the highest possible level. In practice this can be difficult, as these things might run in different processes, on different machines, and be implemented in a variety of technologies (JavaScript on IE6, anyone?). Ideally, however, we have something along these lines:

Looking at the graph again we might also go for the following component, as it needs only one stubbed out dependency:

On the other hand, if we now look at the whole picture of tests that I mentioned (adding the naïve unit tests), we end up with this picture:

That is probably overkill. So what is the point of this post? There is a terribly high number of possible tests. As a developer you have to make a call which ones to go for. Everything else is too expensive and prevents change (Test Sclerosis!). The most confidence in the system is probably gained from system-level integration testing. It ensures the system works, and it is a level that can actually drive your design in a meaningful way (breaking a system into several components being the actual design effort). However, unit tests are quicker to execute and easier to write, even in a messy architecture (do you have an object representing that tool your user works with every day, or is it just a bunch of JavaScript, a bit of templating and some domainish code to pull things off a database?).

Another thing that these examples reveal is that the terms unit test and integration test are relative. This means we probably have to define, implicitly or explicitly, what we are referring to when talking about integration tests and unit tests.

Trivial unit tests

I get increasingly annoyed with what I call trivial unit tests. People are obviously writing tests for the sake of unit test coverage. I observed the following patterns, in which the unit under test was what I would call overly isolated from its collaborators:

  • Testing a factory building a composite decorator. The unit test did not test the behaviour of the composite decorator, but the fact that certain decorators have been composed:

    // Production code
    public class DecoratorFactory {
        public Decorator getDecoratorForRendering() {
            return new CompositeDecorator(
                new HtmlDecorator(),
                new ParagraphDecorator(),
                new ImageDecorator()
            );
        }
    }

    // Test code
    public void testShouldCreateDecoratorForRendering() throws Exception {
        DecoratorFactory factory = new DecoratorFactory();
        // The factory returns the Decorator interface, so the test has to
        // cast to the concrete type before it can pick the composite apart.
        CompositeDecorator compositeDecorator =
            (CompositeDecorator) factory.getDecoratorForRendering();
        Decorator[] decorators = getDecoratorsFromComposite(compositeDecorator);
        int i = 0;
        assertEquals(HtmlDecorator.class, decorators[i++].getClass());
        assertEquals(ParagraphDecorator.class, decorators[i++].getClass());
        assertEquals(ImageDecorator.class, decorators[i++].getClass());
    }

  • The mock setup mirrors the actual implementation code. I found one example where even a for-loop was mirrored in the expectation setup:

    // Production code
    public void reprice(List<Product> productList, PricingPolicy pricingPolicy) {
        for (Product product : productList) {
            productRepricingService.reprice(product, pricingPolicy);
        }
    }

    // Test code
    public void testShouldRepriceAllProducts() {
        // Expectation setup: the same loop as in the production code
        for (Product product : productList) {
            productRepricingServiceMock.reprice(product, pricingPolicy);
        }
        finishedMockSetup();
        batchRepricingService.reprice(productList, pricingPolicy);
    }

All these tests have a negative net value. Nothing is gained from them, while they add a burden in terms of maintenance, build times, and clarity.
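For contrast, a test of the decorator factory that checks behaviour rather than composition might look roughly like the sketch below. The original `Decorator` API is not shown in this post, so the `decorate(String)` method, the simplified decorator implementations and their order are all assumptions; the plain `AssertionError` stands in for a test framework to keep the sketch self-contained.

```java
import java.util.Arrays;
import java.util.List;

// Assumed interface: the real Decorator API is not shown in the post.
interface Decorator {
    String decorate(String text);
}

// Simplified stand-ins for the real decorators.
final class ParagraphDecorator implements Decorator {
    public String decorate(String text) {
        return "<p>" + text + "</p>";
    }
}

final class HtmlDecorator implements Decorator {
    public String decorate(String text) {
        return "<html>" + text + "</html>";
    }
}

final class CompositeDecorator implements Decorator {
    private final List<Decorator> decorators;

    CompositeDecorator(Decorator... decorators) {
        this.decorators = Arrays.asList(decorators);
    }

    // Applies the decorators in order, each to the result of the previous one.
    public String decorate(String text) {
        String result = text;
        for (Decorator decorator : decorators) {
            result = decorator.decorate(result);
        }
        return result;
    }
}

final class DecoratorFactory {
    Decorator getDecoratorForRendering() {
        return new CompositeDecorator(new ParagraphDecorator(), new HtmlDecorator());
    }
}

final class DecoratorFactoryBehaviourTest {
    // Asserts on what the decorator does to its input, not on which classes
    // the factory happened to compose.
    static void testShouldRenderTextAsHtmlParagraph() {
        Decorator decorator = new DecoratorFactory().getDecoratorForRendering();
        String rendered = decorator.decorate("hello");
        if (!"<html><p>hello</p></html>".equals(rendered)) {
            throw new AssertionError("Unexpected rendering: " + rendered);
        }
    }
}
```

A test like this keeps passing if the factory is refactored to build the same rendering from different pieces, and it fails when the rendering itself breaks, which is the only failure anyone cares about.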

Why test in isolation?

Some reasons I could come up with for testing things in isolation:

  • The unit under test solves a very well defined problem and can be reused in different contexts.
  • Reasoning about all kinds of boundary conditions is easier at the unit level. (Caveat: if there is no way to drive certain cases from the application level, they might not be needed.)
  • Performance of tests as well as debugging (I see tests and debugging as complementary rather than mutually exclusive things to do).
  • Limited availability of external systems. They might not be implemented yet, or simply not installed in the test environment.

In all cases the unit should be complex enough that the test case actually tells me something I can’t spot immediately by looking at the code.
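As a sketch of a unit that may be worth isolating, consider a small currency converter whose external rate source is not available in the test environment. Everything here (`ExchangeRateService`, `PriceConverter`, the rounding rule) is a hypothetical example and not taken from a real project; the point is only that the boundary conditions are easier to drive through a stub than through the whole application.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical external system that may not exist in the test environment.
interface ExchangeRateService {
    BigDecimal rate(String fromCurrency, String toCurrency);
}

// Hypothetical unit under test: a well-defined problem with boundary cases.
final class PriceConverter {
    private final ExchangeRateService rates;

    PriceConverter(ExchangeRateService rates) {
        this.rates = rates;
    }

    BigDecimal convert(BigDecimal amount, String fromCurrency, String toCurrency) {
        if (amount.signum() < 0) {
            throw new IllegalArgumentException("Negative amount: " + amount);
        }
        BigDecimal rate = rates.rate(fromCurrency, toCurrency);
        if (rate.signum() <= 0) {
            throw new IllegalStateException("Invalid rate: " + rate);
        }
        // Round to cents, half up (an assumed business rule).
        return amount.multiply(rate).setScale(2, RoundingMode.HALF_UP);
    }
}

final class PriceConverterTest {
    // The stub is a lambda, which works because the interface has one method.
    static void testRoundsHalfUpToCents() {
        PriceConverter converter =
                new PriceConverter((from, to) -> new BigDecimal("1.005"));
        BigDecimal converted = converter.convert(new BigDecimal("1.00"), "EUR", "USD");
        if (!new BigDecimal("1.01").equals(converted)) {
            throw new AssertionError("Unexpected amount: " + converted);
        }
    }

    static void testRejectsNonPositiveRate() {
        PriceConverter converter = new PriceConverter((from, to) -> BigDecimal.ZERO);
        try {
            converter.convert(BigDecimal.ONE, "EUR", "USD");
        } catch (IllegalStateException expected) {
            return;
        }
        throw new AssertionError("Expected the zero rate to be rejected");
    }
}
```

Neither the half-cent rounding nor the zero rate would be easy to provoke through the real rate service, and neither is obvious from a glance at the code.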

Bottom line: I have come to the conclusion that in enterprise applications the complexity arises from the interplay of all components. Also, the single responsibility principle is sometimes difficult to achieve, as there are some concepts that are central to the business and tend to be overstretched (especially when they are persisted to an RDBMS, which encourages reluctance towards fine-grained objects). Hence I think integration test automation is more important than unit testing. However, if complex functionality arises within a subsystem, it should be nicely isolated and unit tested.

Roommate

This is what I found among my laundry today:

From below it looked like this: