Gog Magog Walk

Today Arne and I ventured out to explore rural England. We did the Gog Magog Walk from Arne’s “Walks around Cambridge” guide.

The first thing that caught my attention was this chair:

Another typical feature of the English landscape is the open fuse box:

Also we learnt what a copse is:

And more importantly that it actually deserves a name (we just saw Anne’s Copse):


And of course we need a picture of our heroes in the winter idyll. As you can clearly see, there is about half an inch of snow. This amounts to adverse weather conditions and hence treacherous roads. We felt like invading Russia…

Naked Objects Talk in London this Thursday

Dan Haywood will be coming to the ThoughtWorks London Office to talk about his new book Domain Driven Design using Naked Objects. I have been excited about the idea of NO for quite a while, though the actual implementations have been controversial. So I hope for interesting discussions.

Here are the details:

Thursday December 10, 2009 from 7:00pm – 10:00pm
ThoughtWorks UK Office
Berkshire House, 168-173 High Holborn
London, United Kingdom, England WC1V 7AA

If you are keen on the free pizza, sign up on our Upcoming page.

Some interesting links:
Naked Objects: home of the original implementation.
JMatter: a more beautiful implementation.

Hope to see you there.

Programming Language du Jour

Today whilst researching how to implement a proper brainfuck interpreter, I stumbled across Babbage. I would say it’s love at first sight.

Using Extension Methods

I came up with the following extension method in C#:

using System.Collections.Generic;

public static class NeatExtensions{
    // True if element equals any of the given candidates.
    public static bool In<T>(this T element, params T[] elements){
        return new HashSet<T>(elements).Contains(element);
    }
}

It actually yields a nice syntax for checking enum values and the likes:

enumValue.In(Enum.X, Enum.Z)

I am wondering whether there is actually a de facto standard library in the C# world that has extensions like this.

Objects as Functions in Java

Earlier this year I wrote a build tool in Java. The core idea at the time was to express the build in terms of functions and function composition. This is not exactly a good fit with Java. So last week I had some spare time and came up with this way of defining a function (application) in Java:

public static class FancyFunction extends FunctionBase {
    @In
    public final Str a;
    @In
    public final Str b;

    @Out
    public final Str c = null;
    @Out
    public final Str d = null;

    public FancyFunction(Str a, Str b) {
        this.a = a;
        this.b = b;
        super.reflect();
    }

    protected void evaluate() {
        setResult(c, new StrImpl(a.getValue() + "-" + b.getValue()));
        setResult(d, new StrImpl("XX"));
    }
}

There is obviously a lot of magic going on and I do feel a bit bad about setting final fields using reflection.

What it actually does is define two functions. In a friendlier syntax it would look somewhat like this:

fancy.c(a, b) = a + "-" + b;
fancy.d(a, b) = "XX";

But back to the Java example: you can now compose applications like this:

FancyFunction fun = new FancyFunction(new StrImpl("x"), new StrImpl("y"));
FancyFunction fun2 = new FancyFunction(fun.c, new StrImpl("z"));
assertEquals("x-y-z", fun2.c.getValue());

As this is all very reflective I could actually traverse the metadata and get dot to render this:

test

There is no code, as it is all very dirty. But to me it looks like a viable syntax for specifying tasks and wrapping imperative code into a functional style. The current implementation relies heavily on interfaces, because it generates a lot of proxies that allow for lazy evaluation.
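The lazy evaluation idea can be illustrated without the reflection and proxy machinery. What follows is only a minimal sketch, not the implementation described above: it assumes Java 8 lambdas, uses plain String values instead of Str, and the names Lazy and FancyFunctionSketch are made up for illustration.

```java
import java.util.function.Supplier;

// Hypothetical helper (not part of the original tool): a value that is
// computed on first access and cached afterwards.
final class Lazy<T> implements Supplier<T> {
    private Supplier<? extends T> thunk;
    private T value;
    private boolean evaluated;

    Lazy(Supplier<? extends T> thunk) {
        this.thunk = thunk;
    }

    @Override
    public synchronized T get() {
        if (!evaluated) {
            value = thunk.get();
            evaluated = true;
            thunk = null; // release the computation once it has run
        }
        return value;
    }
}

// Mirrors the shape of FancyFunction: two inputs (a, b), two outputs (c, d).
final class FancyFunctionSketch {
    final Supplier<String> c;
    final Supplier<String> d;

    FancyFunctionSketch(Supplier<String> a, Supplier<String> b) {
        // Nothing is evaluated here; inputs are only read when c is requested.
        this.c = new Lazy<>(() -> a.get() + "-" + b.get());
        this.d = new Lazy<>(() -> "XX");
    }
}

final class LazyCompositionDemo {
    public static void main(String[] args) {
        FancyFunctionSketch fun = new FancyFunctionSketch(() -> "x", () -> "y");
        // An output of one application is passed as an input of the next.
        FancyFunctionSketch fun2 = new FancyFunctionSketch(fun.c, () -> "z");
        System.out.println(fun2.c.get()); // prints x-y-z
    }
}
```

Because outputs are suppliers, passing fun.c into a second application wires up the composition without computing anything until a result is actually requested.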

Fighting the Fifty-Method-Repository using Specifications

Recently I have been working on a domain-specific content management system that, like most content management systems, lets users filter information along several dimensions and even has a fancy full-text search.
It sports a web-based UI that is implemented using an MVC architecture. Essentially, requests are served by controller methods that pull the data from the content repository and throw it at a template.

When writing our content repository in the traditional way (adding a method per query) we realised that we were adding a lot of methods whose signatures were just different combinations of the same set of parameters, which felt wrong. The next observation was that finding meaningful names for these methods was also quite difficult. The first impulse was to describe the query, which is just paraphrasing in text what can be expressed more elegantly in a query language (on a similar note, my ex-colleague Jay Fields called test-method names a smell). If, on the other hand, you try to name your repository methods after the intent, these names become very similar to the controller actions. This also seems wrong, as it leaks responsibility from the controller (deciding what data to display for a given action) into the repository. Thirdly, we found that when we tried to review what content we display on the respective pages, we actually had to do a lot of drilling down into the repository to figure out what data gets retrieved.

So what was to be done about this? The first idea was to have SQL or a Hibernate query straight in the controller; this actually makes it very obvious what data gets retrieved. At first glance the only thing that stands against this is agile folklore. But then, looking closer at our repository, we could of course identify a few genuine responsibilities for the repository. The simple and obvious cases were pagination and abstraction from the underlying data store. The third one was actually translating some of our domain notions into a set of conditions. For example, whether a piece of content is published depends on it not being marked as draft and on the current date being inside the publishing interval of the particular piece of content, i.e. each piece of content has a start date and an end date.

The solution we came up with was turning our query into a proper object describing the objects we want back. Eric Evans calls this a specification. It is one of the DDD patterns that should be used more often, as opposed to the abominable service pattern. So using C# our query looks something like this:

public class ContentQuery {
 
    public ContentQuery(){
        Status = Status.All;            
        Sort = new List<ContentSort>();
        Pagination = PaginationStrategy.ALL;
    }
 
    public long? Id { get; set;}
    public Status Status { get; set; }
    public string FullText { get; set; }
    public Author Author { get; set; }
    public ISet<Tag> Tags { get; set; }
 
    public IList<ContentSort> Sort { get; set; }
 
    public PaginationStrategy Pagination { get; set; }
}

We are using C# properties to get the syntactic sugar around object creation. Creating and executing a query comes down to:

var query = new ContentQuery{
    Status = Status.Published,
    FullText = "Java"
};
 
var content = contentRepository.Execute(query);

PaginationStrategy is a simple pair of the current page number and the number of items per page. Sort specifies the sort order. A typical query in our system looks like this:

var query = new ContentQuery {
    FullText = "No more .net",
    Status = Status.Published,

    Pagination = new PaginationStrategy(1, 15),

    Sort = {
        new DescendingSort(ContentSortField.StartDate),
        new AscendingSort(ContentSortField.Title)
    }
};
 
var content = repository.Execute(query);

Even though this is quite a bit of text, I do prefer it over something like this:

var content = repository.GetPublishedContentOrderedByStartDateAndTitle("No more .net", new PaginationStrategy(1, 15));

Also, as the query is now an object, you could introduce factory methods that populate the defaults. You could have two factories, one for admin queries and one for public front end queries, that default to showing all content and published content respectively.
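As a rough illustration of such factories, here is a minimal sketch. It is written in Java rather than C#, and every name in it (ContentStatus, ContentQuerySketch, forAdmin, forFrontEnd, withFullText) is hypothetical and not taken from our code base.

```java
// Hypothetical Java rendering of the query object; all names are illustrative.
enum ContentStatus { ALL, PUBLISHED, DRAFT }

final class ContentQuerySketch {
    private final ContentStatus status;
    private final String fullText; // null means "no full-text filter"

    private ContentQuerySketch(ContentStatus status, String fullText) {
        this.status = status;
        this.fullText = fullText;
    }

    // Admin screens default to showing all content.
    static ContentQuerySketch forAdmin() {
        return new ContentQuerySketch(ContentStatus.ALL, null);
    }

    // The public front end defaults to published content only.
    static ContentQuerySketch forFrontEnd() {
        return new ContentQuerySketch(ContentStatus.PUBLISHED, null);
    }

    // Returns a copy with the full-text filter set; the status default is kept.
    ContentQuerySketch withFullText(String text) {
        return new ContentQuerySketch(status, text);
    }

    ContentStatus status() { return status; }

    String fullText() { return fullText; }
}

final class QueryFactoryDemo {
    public static void main(String[] args) {
        ContentQuerySketch query = ContentQuerySketch.forFrontEnd().withFullText("Java");
        System.out.println(query.status() + " / " + query.fullText()); // prints PUBLISHED / Java
    }
}
```

The private constructor forces callers through one of the two factories, so a front end query cannot accidentally start out with the admin default.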

The first cut of the implementation of our repository then looks like this:

public IList<Content> Execute(ContentQuery query){
    var criteria = Session.CreateCriteria(typeof(Content));

    // Published: today lies inside the publishing interval.
    if (query.Status == Status.Published){
        criteria
            .Add(Restrictions.Le("StartDate", DateTime.Today))
            .Add(Restrictions.Ge("EndDate", DateTime.Today));
    }

    // Draft: today lies outside the publishing interval.
    if (query.Status == Status.Draft){
        criteria.Add(
            Restrictions.Or(
                Restrictions.Gt("StartDate", DateTime.Today),
                Restrictions.Lt("EndDate", DateTime.Today)
            ));
    }

    if (query.Author != null) {
        criteria.Add(Restrictions.Eq("Author", query.Author));
    }

    if (query.FullText != null) {
        criteria.Add(Restrictions.Like("Body", "%" + query.FullText + "%"));
    }

    ...

    return criteria.List<Content>();
}

There is a lot of conditional logic going on, but believe it or not, they are all “good ifs” (I should write a blog post about good and bad ifs at some point). The approach, even though it is based on a simple conjunction of criteria, covered all our use cases. More importantly, it also helped with optimisation. We actually switched from using Hibernate and the database to using Lucene for searching content. The only thing we had to change was this single method inside the repository. So all the special-case optimisations are hidden away in the repository and not exposed to the application (I recall a project that had a method along the lines of GetAllContentForTagOptimisedForTheFirstPage(Set<Tag> tags), which was called from application code).
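To show why the backend can be swapped without touching application code, here is a minimal in-memory sketch. It is in Java, not C#, and nothing in it is taken from our code base: Article, ArticleQuery, ArticleRepository and InMemoryArticleRepository are made-up names, the published rule follows the prose description earlier in this post, and the current date is passed in explicitly to keep the example deterministic.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical content item; field names are illustrative.
final class Article {
    final String body;
    final boolean draft;
    final LocalDate start;
    final LocalDate end;

    Article(String body, boolean draft, LocalDate start, LocalDate end) {
        this.body = body;
        this.draft = draft;
        this.start = start;
        this.end = end;
    }

    // Published: not marked as draft and 'today' within [start, end].
    boolean isPublishedOn(LocalDate today) {
        return !draft && !today.isBefore(start) && !today.isAfter(end);
    }
}

// The specification: says what we want back, not how to fetch it.
final class ArticleQuery {
    boolean publishedOnly = false;
    String fullText = null; // null means "no full-text filter"
}

// Application code depends on this interface only.
interface ArticleRepository {
    List<Article> execute(ArticleQuery query, LocalDate today);
}

// One possible backend; a database or search-index backend would
// implement the same interface with the same query object.
final class InMemoryArticleRepository implements ArticleRepository {
    private final List<Article> articles;

    InMemoryArticleRepository(List<Article> articles) {
        this.articles = new ArrayList<>(articles);
    }

    @Override
    public List<Article> execute(ArticleQuery query, LocalDate today) {
        List<Article> result = new ArrayList<>();
        for (Article article : articles) {
            if (query.publishedOnly && !article.isPublishedOn(today)) continue;
            if (query.fullText != null && !article.body.contains(query.fullText)) continue;
            result.add(article);
        }
        return result;
    }
}

final class RepositoryDemo {
    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2009, 12, 10);
        ArticleRepository repository = new InMemoryArticleRepository(Arrays.asList(
            new Article("Java rocks", false, LocalDate.of(2009, 12, 1), LocalDate.of(2009, 12, 31)),
            new Article("Java draft", true, LocalDate.of(2009, 12, 1), LocalDate.of(2009, 12, 31))));
        ArticleQuery query = new ArticleQuery();
        query.publishedOnly = true;
        query.fullText = "Java";
        System.out.println(repository.execute(query, today).size()); // prints 1
    }
}
```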

Another interesting observation is that the query object is actually a very good candidate for a model object behind a search view/UI, with each field of the query object matching one element in the UI. This reminded me that back in 2006 I did something similar coming from a UI perspective.

Verdict:
The SoC score of this solution is pretty high. It decoupled our application from the database and allowed for low impact optimisations. It also made the application code a lot more readable.

Want to use a Mock?

You are wrong! Well, in most cases. It seems above the intellectual means of most agile developers (the non-agile developers don’t write tests) to realize that tests like the following are useless and even harmful. Such a test gives me nothing apart from test sclerosis.

[Test]
public void ShouldGetAllByAuthorId()
{
    //given
    var mockRepository = MockRepository.GenerateMock<IContentRepository>();
    var authorRepository = MockRepository.GenerateMock<IAuthorRepository>();
    var subcategoryRepository = MockRepository.GenerateMock<ISubCategoryRepository>();
    var contentService = new ContentService(mockRepository, authorRepository, subcategoryRepository);

    //when
    contentService.GetContentByAuthorId(1);

    //then
    mockRepository.AssertWasCalled(repository => repository.GetContentByAuthorId(1));
}

A healthy six lines of code to test this:

public IList<Voucher> GetContentByAuthorId(long authorId)
{
    return contentRepository.GetContentByAuthorId(authorId);
}

Verdict: Waste in an agile disguise.

Selenium

A warning to myself, as I forgot over the last year and might forget again.

  • Selenium is bloody slow.
  • Selenium produces crappy error messages.
  • Selenium sports a highly unintuitive, vaguely documented API (at least it always takes ages to find the right docs on the web).
  • Selenium is unreliable.

Use WebDriver!

Seaside Event at the London Office this Monday

If you are in London this Monday, don’t miss out on our Seaside-themed GeekNight at 7pm.

Seaside is a truly revolutionary web framework implemented in Smalltalk. We have Lukas Renggli representing the small but rapidly growing Seaside and Smalltalk open source community, as well as Michel Bany of Cincom talking about his commercial experience with Seaside and Smalltalk.

It’s at the ThoughtWorks London office.

For more information about Seaside and Smalltalk have a look at the links in my earlier post.

CATting multiple files

Quite often I want to pipe the content of multiple files into a command line utility. An example would be to count the lines of SQL in my project. This is another case where xargs comes in handy (the -print0 and -0 options keep file names containing spaces intact):

find . -name "*.sql" -print0 | xargs -0 cat | wc -l