Schön Sperrige Wörter

Lautstand denotes, from a diachronic or diatopic perspective, the entire system of sounds of a language. When the Lautstand changes, usually over a longer period of time, one speaks of Lautwandel (sound change). Dialects of a language often differ in their Lautstand.

-Wikipedia

Working Effectively with Legacy Code

I have just finished reading Michael Feathers’ Working Effectively with Legacy Code. The book gives a lot of very concrete examples of how to improve code and make it testable. For my taste he overdoes the whole “code without tests is legacy code” thing, which might give a naïve reader the impression that tests are all it takes for code to be good.

As I am going to sell my copy at a favourable rate to one of my colleagues, I’ll just list a few things I found interesting:

  • Chapter 15 – My application is all API calls. He uses a very good example of a mailing list server to illustrate how to get around problems with external API dependencies. He distinguishes two approaches: “Skin and wrap the API” and “Responsibility-based extraction”. I strongly favour the second approach, as I find it leads to more relevant abstractions rather than wrapping for the sake of wrapping…
  • Another theme in the book is understanding code. He mentions a few nice techniques:
    • Effect analysis (dataflow analysis) using pen and paper
    • Mapping out your way through the code base on a piece of paper while reading
    • Printing the code and doing listing markup using marker pens.
    • Scratch refactorings while reading code: they are thrown away afterwards, and their sole purpose is to help you understand the codebase. On a similar note, delete unused code you find while reading.
    • Telling the story of the system. This is done at several levels, starting with a high-level description that contains a lot of generalisations. The practice helps people develop a shared vision and also highlights discrepancies between the system and the story.
    • Naked CRC cards. Somewhat poorly named, as they are really naked object cards: use white index cards on a table to represent objects, and tell the story as you place and rearrange them.

There is a nice example for separation of concerns (he uses the term SRP, which I don’t like) on page 247 describing an expression evaluator.

Adapt parameter refactoring. If the parameter type is too complicated to stub, look at how the method uses the parameter and introduce a wrapper around the original object that exposes only what the current method needs. Example: HttpServletRequest gets wrapped in ServletParameterSource, which implements a new ParameterSource interface.
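A minimal Java sketch of how I read this refactoring. ParameterSource and ServletParameterSource are the names mentioned above; everything else (the getParameterForName method name, the WebRequest stand-in, FakeParameterSource, PageSizeReader and the default of 15) is made up for illustration. WebRequest only stands in for HttpServletRequest so that the sketch compiles without the servlet API.

```java
import java.util.HashMap;
import java.util.Map;

// Narrow role interface: only what the method under test actually needs.
interface ParameterSource {
    String getParameterForName(String name);
}

// Stand-in for the wide framework type (HttpServletRequest in the text).
// The real type has many more methods that a stub would have to implement.
interface WebRequest {
    String getParameter(String name);
}

// Production adapter: wraps the wide type and exposes only the narrow role.
final class ServletParameterSource implements ParameterSource {
    private final WebRequest request;

    ServletParameterSource(WebRequest request) {
        this.request = request;
    }

    @Override
    public String getParameterForName(String name) {
        return request.getParameter(name);
    }
}

// Test double: trivial to write because the role interface is tiny.
final class FakeParameterSource implements ParameterSource {
    private final Map<String, String> values = new HashMap<>();

    FakeParameterSource with(String name, String value) {
        values.put(name, value);
        return this;
    }

    @Override
    public String getParameterForName(String name) {
        return values.get(name);
    }
}

// Hypothetical code under test: depends on the role, not on the framework type.
final class PageSizeReader {
    private PageSizeReader() {}

    static int pageSize(ParameterSource source) {
        String raw = source.getParameterForName("pageSize");
        return raw == null ? 15 : Integer.parseInt(raw);
    }
}
```

In a test the method is fed a FakeParameterSource; in production it gets a ServletParameterSource wrapped around the real request.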

On a similar note: primitivise parameter. Instead of passing in or wrapping an expensive abstraction, just pass in the required “values”. This is of course very dangerous, as it breaks encapsulation. On the other hand, it could lead to the extraction of role-based primitive interfaces (a bit like the adapt parameter pattern).
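A tiny Java sketch of the primitivised variant. The Invoicing class and the Order mentioned in the comment are hypothetical and not taken from the book.

```java
import java.util.List;

final class Invoicing {
    private Invoicing() {}

    // Before (hypothetical): total(Order order) forced every test to build an
    // expensive Order. The calculation only needs the line amounts, so the
    // primitivised version takes just those values.
    static long totalInCents(List<Long> lineAmountsInCents) {
        long total = 0;
        for (long amount : lineAmountsInCents) {
            total += amount;
        }
        return total;
    }
}
```

The price is visible in the signature: callers now have to know which values to pull out of the order, which is exactly the encapsulation leak mentioned above.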

Unconditional Success

Before I forget it completely, I have to post a link to Nick Williams’ Unconditional Success. This book is outrageously funny. We kept a copy on our desk and would open it at random pages to reinvigorate the team spirit.

Gog Magog Walk

Today Arne and I ventured out to explore rural England. We did the Gog Magog Walk from Arne’s “Walks around Cambridge” guide.

The first thing that caught my attention was this chair:

Another typical feature of the English landscape is the open fuse box:

Also we learnt what a copse is:

And more importantly that it actually deserves a name (we just saw Anne’s Copse):


And of course we need a picture of our heroes in the winter idyll. As you can clearly see, there is about half an inch of snow. This amounts to adverse weather conditions and hence treacherous roads. We felt like invading Russia…

Naked Objects Talk in London this Thursday

Dan Haywood will be coming to the ThoughtWorks London Office to talk about his new book Domain Driven Design using Naked Objects. I have been excited about the idea of NO for quite a while, though the actual implementations have been controversial. So I hope for interesting discussions.

Here are the details:

Thursday December 10, 2009 from 7:00pm – 10:00pm
ThoughtWorks UK Office
Berkshire House, 168-173 High Holborn
London, United Kingdom, England WC1V 7AA

If you are keen on the free pizza, sign up on our Upcoming Page.

Some interesting links:
Naked Objects Home of the original implementation.
JMatter A more beautiful implementation

Hope to see you there.

Programming Language du Jour

Today whilst researching how to implement a proper brainfuck interpreter, I stumbled across Babbage. I would say it’s love at first sight.

Using Extension Methods

I came up with the following extension method in C#:

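using System.Collections.Generic;

public static class NeatExtensions{
    // True if element equals any of the given elements (default equality comparer).
    public static bool In<T>(this T element, params T[] elements){
        return new HashSet<T>(elements).Contains(element);
    }
}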

It actually yields a nice syntax for checking enum values and the like:

enumValue.In(Enum.X, Enum.Z)

I am wondering whether there is actually a de facto standard library in the C# world that has extensions like this.

Objects as Functions in Java

Earlier this year I wrote a build tool in Java. The core idea at the time was to express the build in terms of functions and function composition. This is not exactly a good fit for Java. So last week I had some spare time and came up with this way of defining a function (application) in Java:

public static class FancyFunction extends FunctionBase {
    @In
    public final Str a;
    @In
    public final Str b;

    @Out
    public final Str c = null;
    @Out
    public final Str d = null;

    public FancyFunction(Str a, Str b) {
        this.a = a;
        this.b = b;
        super.reflect();
    }

    protected void evaluate() {
        setResult(c, new StrImpl(a.getValue() + "-" + b.getValue()));
        setResult(d, new StrImpl("XX"));
    }
}

There is obviously a lot of magic going on and I do feel a bit bad about setting final fields using reflection.

What it actually does is define two functions. In a friendlier syntax it would look somewhat like this:

fancy.c(a, b) = a + "-" + b;
fancy.d(a, b) = "XX";

But back to the Java example: you can now compose applications like this:

FancyFunction fun = new FancyFunction(new StrImpl("x"), new StrImpl("y"));
FancyFunction fun2 = new FancyFunction(fun.c, new StrImpl("z"));
assertEquals("x-y-z", fun2.c.getValue());

As this is all very reflective I could actually traverse the metadata and get dot to render this:

[Image: the function graph as rendered by dot]

There is no code, as it is all very dirty. But to me it looks like a viable syntax for specifying tasks and wrapping imperative code into a functional style. The current implementation relies heavily on interfaces, because it generates a lot of proxies that allow for lazy evaluation.
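To give at least a flavour of the proxy part, here is a rough sketch of a lazily evaluated value behind an interface, using the JDK's java.lang.reflect.Proxy. It is not the code of the build tool: Str and StrImpl are the names from the example above, with a guessed shape, and the Lazy helper is invented for this sketch.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

// Guessed shape of the value interface from the example above.
interface Str {
    String getValue();
}

final class StrImpl implements Str {
    private final String value;

    StrImpl(String value) {
        this.value = value;
    }

    @Override
    public String getValue() {
        return value;
    }
}

// Hypothetical helper: hands out a proxy that runs the computation only
// when a method is first called on it, and caches the result afterwards.
final class Lazy {
    private Lazy() {}

    @SuppressWarnings("unchecked")
    static <T> T of(Class<T> type, Supplier<? extends T> computation) {
        InvocationHandler handler = new InvocationHandler() {
            private T target;
            private boolean evaluated;

            @Override
            public synchronized Object invoke(Object proxy, Method method, Object[] args)
                    throws Throwable {
                if (!evaluated) {
                    target = computation.get();
                    evaluated = true;
                }
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause(); // surface the original exception
                }
            }
        };
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type}, handler);
    }
}
```

Proxy can only stand in for interfaces, which is why a design like this ends up leaning on them so heavily. A call such as Lazy.of(Str.class, () -> new StrImpl("x-y")) returns immediately; the supplier runs on the first getValue().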

Fighting the Fifty-Method-Repository using Specifications

Recently I have been working on a domain-specific content management system that, like most content management systems, lets the users filter information along several dimensions and even has a fancy full-text search.
It sports a web-based UI that is implemented using an MVC architecture. Essentially, requests are served by controller methods that pull the data from the content repository and throw it at a template.

When writing our content repository in the traditional way (adding a method per query), we realised that we were adding a lot of methods whose signatures were just different combinations of the same set of parameters, which felt wrong. The next observation was that finding meaningful names for these methods was also quite difficult. The first impulse was to describe the query, which is just paraphrasing in text what can be expressed more elegantly in a query language (on a similar note, my ex-colleague Jay Fields called test-method names a smell). If, on the other hand, you try to name your repository methods after the intent, the names become very similar to the controller actions. This also seems wrong, as it leaks responsibility from the controller (deciding what data to display for a given action) into the repository. Thirdly, we found that when we tried to review what content we display on the respective pages, we had to do a lot of drilling down into the repository to figure out what data gets retrieved.

So what was to be done about this? The first idea was to have SQL or a hibernate query straight in the controller, which actually makes it very obvious what data gets retrieved. At first glance the only thing that stands against this is agile folklore. But looking closer at our repository, we could of course identify a few genuine responsibilities for it. The simple and obvious ones were pagination and abstraction from the underlying data store. The third one was translating some of our domain notions into a set of conditions. For example, whether a piece of content is published depends on it not being marked as draft and on the current date being inside its publishing interval, i.e. each piece of content has a start date and an end date.
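Spelled out as code, the “published” notion is a one-liner worth keeping in exactly one place. A small Java sketch, assuming that both ends of the publishing interval are inclusive (the text does not say); the class and parameter names are made up.

```java
import java.time.LocalDate;

final class PublishingRule {
    private PublishingRule() {}

    // Assumption: start and end date both count as "inside" the interval.
    static boolean isPublished(boolean markedAsDraft,
                               LocalDate start,
                               LocalDate end,
                               LocalDate today) {
        boolean insideInterval = !today.isBefore(start) && !today.isAfter(end);
        return !markedAsDraft && insideInterval;
    }
}
```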

The solution we came up with was to turn our query into a proper object describing the objects we want back. Eric Evans calls this a specification. It is one of the DDD patterns that should be used more often, as opposed to the abominable service pattern. In C#, our query looks something like this:

public class ContentQuery {

    public ContentQuery(){
        Status = Status.All;
        Sort = new List<ContentSort>();
        Pagination = PaginationStrategy.ALL;
    }

    public long? Id { get; set; }
    public Status Status { get; set; }
    public string FullText { get; set; }
    public Author Author { get; set; }
    public ISet<Tag> Tags { get; set; }

    public IList<ContentSort> Sort { get; set; }

    public PaginationStrategy Pagination { get; set; }
}

We are using C# properties to get the syntactic sugar around object creation. Creating and executing a query comes down to:

var query = new ContentQuery{
                Status = Status.Published,
                FullText = "Java"
};
 
var content = contentRepository.Execute(query);

PaginationStrategy is a simple pair that takes the current page number and the number of items per page. The sort is for specifying a sort order. A typical query in our system looks like this:

var query = new ContentQuery {
         FullText = "No more .net",
         Status = Status.Published,

         Pagination = new PaginationStrategy(1, 15),

         Sort = {
                 new DescendingSort(ContentSortField.StartDate),
                 new AscendingSort(ContentSortField.Title)
         }
};

var content = repository.Execute(query);

Even though this is quite a bit of text, I do prefer it over something like this:

var content = repository.GetPublishedContentOrderedByStartDateAndTitle("No more .net", new PaginationStrategy(1, 15));

Also, as the query is now an object, you could introduce factory methods that populate the defaults. You could have two factories, one for admin queries and one for public front-end queries, that default to showing all content and published content respectively.
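A sketch of what such factories could look like, written in Java rather than C# and reduced to two fields. The method names forAdmin, forPublicSite and withFullText are invented here; we did not have them in the project.

```java
enum Status { ALL, PUBLISHED, DRAFT }

final class ContentQuery {
    Status status = Status.ALL;
    String fullText; // null means "no full-text filter"

    private ContentQuery() {}

    // Admin screens see everything unless told otherwise.
    static ContentQuery forAdmin() {
        ContentQuery query = new ContentQuery();
        query.status = Status.ALL;
        return query;
    }

    // The public front end only ever starts from published content.
    static ContentQuery forPublicSite() {
        ContentQuery query = new ContentQuery();
        query.status = Status.PUBLISHED;
        return query;
    }

    ContentQuery withFullText(String text) {
        this.fullText = text;
        return this;
    }
}
```

A front-end controller would then start from ContentQuery.forPublicSite().withFullText("Java") and could not accidentally forget the status filter.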

The first cut of the implementation of our repository then looks like this:

public IList<Content> Execute(ContentQuery query){
        var criteria = Session.CreateCriteria(typeof(Content));

        // Published: today lies inside the publishing interval.
        if (query.Status == Status.Published){
                criteria
                        .Add(Restrictions.Le("StartDate", DateTime.Today))
                        .Add(Restrictions.Ge("EndDate", DateTime.Today));
        }

        // Draft: today lies outside the publishing interval.
        if (query.Status == Status.Draft){
                criteria.Add(
                        Restrictions.Or(
                                Restrictions.Gt("StartDate", DateTime.Today),
                                Restrictions.Lt("EndDate", DateTime.Today)
                        ));
        }

        if (query.Author != null) {
                criteria.Add(Restrictions.Eq("Author", query.Author));
        }

        if (query.FullText != null) {
                criteria.Add(Restrictions.Like("Body", "%" + query.FullText + "%"));
        }

        ...

        return criteria.List<Content>();
}

There is a lot of conditional logic going on but, believe it or not, these are all “good ifs” (I should write a blog post about good and bad ifs at some point). Even though the approach is based on a simple conjunction of criteria, it covered all our use cases. More importantly, it also helped with optimisation. We actually switched from using hibernate and the database to using Lucene for searching content, and the only thing we had to change was this single method inside the repository. So all the special-case optimisations are hidden away in the repository and not exposed to the application (I recall a project that had a method along the lines of GetAllContentForTagOptimisedForTheFirstPage(Set<Tag> tags) that was called from application code).
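To illustrate the “only one method changes” point without a database or Lucene, here is a deliberately small in-memory variant in Java. Doc, DocQuery and InMemoryDocRepository are made-up stand-ins with two fields only. The repository builds the same kind of conjunction of criteria, just as predicates over a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class Doc {
    final String author;
    final String body;

    Doc(String author, String body) {
        this.author = author;
        this.body = body;
    }
}

final class DocQuery {
    String author;   // null means "do not filter on author"
    String fullText; // null means "do not filter on text"
}

final class InMemoryDocRepository {
    private final List<Doc> docs;

    InMemoryDocRepository(List<Doc> docs) {
        this.docs = new ArrayList<>(docs);
    }

    // Same shape as the criteria version: start with "everything" and
    // AND in one condition per field that is set on the query.
    List<Doc> execute(DocQuery query) {
        Predicate<Doc> criteria = doc -> true;
        if (query.author != null) {
            criteria = criteria.and(doc -> doc.author.equals(query.author));
        }
        if (query.fullText != null) {
            criteria = criteria.and(doc -> doc.body.contains(query.fullText));
        }

        List<Doc> result = new ArrayList<>();
        for (Doc candidate : docs) {
            if (criteria.test(candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

Callers only ever see the query object and execute, so whether the conditions end up as SQL, a search-engine query or plain predicates is the repository's private business.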

Another interesting observation is that the query object is actually a very good candidate for a model object behind a search view or UI, with each field of the query object matching one element in the UI. This reminded me that back in 2006 I did something similar, coming from a UI perspective.

Verdict:
The SoC score of this solution is pretty high. It decoupled our application from the database and allowed for low impact optimisations. It also made the application code a lot more readable.

Want to use a Mock?

You are wrong! Well, in most cases. It seems to be beyond the intellectual means of most agile developers (the non-agile developers don’t write tests) to realise that tests like the following are useless and even harmful. Such a test gives me nothing apart from test sclerosis.

        [Test]
        public void ShouldGetAllByAuthorId()
        {
            //given
            var mockRepository = MockRepository.GenerateMock<IContentRepository>();
            var authorRepository = MockRepository.GenerateMock<IAuthorRepository>();
            var subcategoryRepository = MockRepository.GenerateMock<ISubCategoryRepository>();
            var contentService = new ContentService(mockRepository, authorRepository, subcategoryRepository);

            //when
            contentService.GetContentByAuthorId(1);

            //then
            mockRepository.AssertWasCalled(repository => repository.GetContentByAuthorId(1));
        }

A healthy six lines of code to test this:

   public IList<Voucher> GetContentByAuthorId(long authorId)
        {
            return contentRepository.GetContentByAuthorId(authorId);
        }

Verdict: Waste in an agile disguise.