Brexit Means Brexit

Posted in Allgemein | Leave a comment

New jproc Release

I have just released version 2.2.0 of jproc, a handy library for running external programs from Java. When things get a bit more complicated (set -eux anyone?), it is nice to use a serious programming language (whichever JVM language you consider serious) rather than just bash. This version comes with a couple of minor improvements. I also went back to generating the documentation from the source of the automated acceptance tests.

A few highlights of jproc:

  • Neat builder syntax.
  • Consumes STDOUT and STDERR automatically.
  • Converts a non-zero exit status into an exception.
  • Provides a timeout mechanism.
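
The points above are about removing boilerplate; for comparison, here is roughly what the same task looks like with the JDK's plain java.lang.ProcessBuilder (this is deliberately not jproc's API), where stream consumption, exit status checking and timeouts are all manual:

```java
import java.io.IOException;

public class RunProcess {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Start an external program the stdlib way.
        Process p = new ProcessBuilder("echo", "hello").start();
        // Consume STDOUT manually (a real wrapper must also drain STDERR).
        String output = new String(p.getInputStream().readAllBytes()).trim();
        // Check the exit status manually.
        int status = p.waitFor();
        if (status != 0)
            throw new IllegalStateException("echo failed with exit status " + status);
        System.out.println(output);
    }
}
```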

Feedback is always welcome, e.g. via the GitHub issue tracker or in the comments section below.

Posted in Software Development | Leave a comment

Wrapping Exceptions with Context

A good error / exception message should provide enough information to pinpoint a problem. In most modern languages there is a stack trace feature that will show how the program got to the point where it broke.

Unfortunately this information is very static in nature; all the dynamic context is lost. To solve this problem most environments provide exception chaining: low-level exceptions get wrapped into higher-level exceptions that provide context.

Here is an example:

/* Posts a comment for a user and returns an id */
def comment(userId: Id, commentBody: String): Id = {
    val user = userRepository.load(userId)
    val comment = new Comment(user, commentBody)
    commentRepository.save(comment.toJson)
}

/* A generic filesystem based json repository */
def save(json: JsonNode): Id = {
    val uuid = generateUuid
    val file = new File(storagePath + "/" + uuid)
    // ... write the json to the file and return the uuid
}

If this code breaks at the IO level, the information about which user tried to post a comment is no longer in scope. And rightly so, because it’s a separate concern.

The typical way to deal with this is to catch the exception in the comment method and wrap it into a new exception, like so:

def comment(userId: Id, commentBody: String): Id = {
    val user = userRepository.load(userId)
    try {
        val comment = new Comment(user, commentBody)
        commentRepository.save(comment.toJson)
    } catch {
        case t: Throwable =>
            throw new Exception(s"Exception trying to post comment for user '${user.name}'", t)
    }
}
This now produces a nice logical stack trace, but it is relatively verbose. Also, the text in the exception, which describes what the code block is doing, sits at the bottom rather than acting as a proper heading.

I recently debugged a piece of code that I hadn’t touched in two years. The first thing I did was to improve the feedback by wrapping exceptions.

The problem was annoying enough for me to come up with a helper function, aptly called scope, that would let me rewrite the above to:

def comment(userId: Id, commentBody: String): Id = {
    val user = userRepository.load(userId)
    scope(s"post comment for user '${user.name}'") {
        val comment = new Comment(user, commentBody)
        commentRepository.save(comment.toJson)
    }
}


I find that much more appealing, because the domain logic looks cleaner. This is what the scope function looks like:

def scope[T](scope: String)(expression: => T): T = {
    try {
        expression
    } catch {
        case t: Throwable =>
            throw new Exception("Exception trying to " + scope, t)
    }
}
Surely this can be improved. We could have a certain well-known exception (super)class that bubbles up the stack without being wrapped. This could be used in cases where an exception serves as a non-local return.
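
A sketch of this refinement, here in Java since the idea is language-independent; the PassThroughException name is made up for illustration:

```java
import java.util.function.Supplier;

public class Scope {
    /** Exceptions of this type carry their own meaning (e.g. a non-local
     *  return) and bubble up without being wrapped. Hypothetical name. */
    static class PassThroughException extends RuntimeException {
        PassThroughException(String message) { super(message); }
    }

    /** Java sketch of the `scope` helper with the pass-through refinement. */
    static <T> T scope(String description, Supplier<T> expression) {
        try {
            return expression.get();
        } catch (PassThroughException e) {
            throw e; // no extra context added
        } catch (RuntimeException e) {
            throw new RuntimeException("Exception trying to " + description, e);
        }
    }

    public static void main(String[] args) {
        try {
            scope("post comment for user 'alice'",
                  () -> { throw new IllegalStateException("disk full"); });
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
            System.out.println("caused by: " + e.getCause().getMessage());
        }
    }
}
```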

Also, it might be nice to use the same mechanism for logging, so that log messages get context. However, in Scala it seems we would have to do this with a side effect such as a thread local, or with an extra parameter, as there is no way to put something into the binding of the closure at evaluation time. Tangent: in Kotlin there is a type-safe way to do this using its “function type with a receiver”.

In Scala a similar function could be provided for the Try and the Future monads, because they lead to exceptions being passed around without leaving a “stack trace”, so some context would be very helpful indeed.

Posted in Scala, Software Development | Leave a comment

Shades of Green – The Release Candidate Report

On our current project we practise continuous integration and perform weekly releases.
To simplify the decision whether a particular build is fit for production, we introduced a release candidate report that contains all the information we usually take into account, as well as instructions on how to interpret it. In this article I describe how we came to this point.

The classical view of the build process is that it takes a defined version of the source code and ideally deterministically produces executable artifacts.

With the advent of pervasive automated testing the focus has shifted from only producing “binaries” to actually validating any given version of the software. There are static analysis steps that assert syntactic correctness (compiler) and adherence to coding standards (e.g. lint, checkstyle) and there is dynamic testing of units and the overall system for functional and non-functional quality.

Traditionally automated tests are self-validating, i.e. they evaluate the outcome to either passing or failing – green or red:


They should be passing or failing deterministically, because computers are deterministic machines after all. At least that’s the theory. Anyone who has done serious test automation knows that concurrency issues can lead to non-deterministic test results.

Bugs aside, we have however observed a class of tests that produce output that defies straightforward automatic validation. The first reason is that the test returns a metric with more than one dimension, e.g. precision and recall in an information retrieval system, or latency and throughput in load tests. Secondly, the tests may be based on current user behaviour, which makes them more realistic but less comparable. An example would be a load test that always uses the latest production traffic patterns, or a quality metric that incorporates user feedback. In a similar vein, “cloud-based” architectures with a lot of network connections and virtual machines have highly non-deterministic performance characteristics.

Also, there is the case of tests where the outcome is hard to evaluate for a machine, e.g. validating the layout of a web page by comparing it to the previous version. Here tools can assist, but as of today the human brain is much more powerful at understanding whether a layout has been broken.

Another interesting aspect is that in many cases it is useful to compare the results to a baseline. Assuming that the system is up and running, the current production build provides a good baseline for all sorts of metrics.

We implemented a number of such tests for our current product. As there were quite a few things to check, we came up with a release checklist. The items were mostly instructions on where to find and how to evaluate the output of a particular test. The build was structured using ThoughtWorks’ Go continuous delivery product. With Go we mapped the whole build, test and release process to two pipelines, each consisting of a number of stages, which in turn comprised a number of jobs. Finding all the results was cumbersome, so we decided to create a dynamically generated version of the checklist that included links to all the results.

This was still a bit cumbersome, so my colleague Ben Barnard implemented a release candidate report generator that pulled the full results into a single document. This proved especially useful for comparisons, because we presented the results of the current production build and the new build side by side, as illustrated in this diagram:


The current version of our release candidate report contains the following information:

  • Instructions on how to interpret the report and how to release.
  • A list of all our Jira tickets that had commits since the last release including their release notes (which we keep in Jira) as well as their status. For tickets that haven’t been accepted by the product owner yet, we also include a list of all commits. We added deep links to our Jira as well as to our git web-frontend, so that more information can be retrieved.
  • The summary statistics for our load test results for the current production build and the new release candidate
  • A graph of the average latency and error rates at different load levels for the current production version and the new release candidate
  • The output of a custom diff algorithm that we use to compare the responses of the new release candidate with the production system
  • A diff that shows how the configuration file changed

Here is a generic example for a release candidate report (full html):


The release process now consisted of reading and reviewing a single self-documenting document, and releasing became much simpler as a result. The release candidate report contains everything that is needed to decide whether to replace the current version with the new release candidate.

An important step forward was to accept the fact that some tests cannot be automatically validated (at a reasonable cost) and that for those we should make human validation as simple as possible, i.e. we don’t waste brain cycles on pulling together documents, but instead use them to perform complicated pattern-matching operations. Another way to look at this is that the release candidate report is a first-class artifact that summarises the build result in an actionable document.

Posted in Software Development | Leave a comment

Dealing with Flaky Tests using Data

I recently wrote an sbt plugin that produces tabular test reports. My motivation
was to get better insight into the reliability of our test suite. Due to good
discipline, most failures that occur are caused by flaky tests that
usually pass on developer machines but cause problems on the faster build machines.

We use GoCD as our build server. It doesn’t come with any
feature that helps with analysing build failures over time. However, with the
reports now being written as tabular text files, I was able to produce a list of
failures by running the following command, which just concatenates all the
test results and greps for failures:

find /go-server-home/artifacts/pipelines/my-pipeline -wholename  "*test-reports/test-results-*.txt" \
  | xargs cat \
  | grep -e 'FAILURE\|ERROR' \
  | awk '{print $1, $5"."$6}'  \
  | sort -r > failures.txt

I concatenated the last two fields, which are the test suite name and the test name, to get a fully qualified name. I also wrote a little script that uses SSH, so that I can trigger this search from my local machine. With the test names anonymised, the result looked something like this:

2015-04-24T12:25:46 A
2015-04-24T12:21:32 B
2015-04-24T10:42:35 C
2015-04-23T11:36:26 C
2015-04-23T11:36:26 D
2015-04-23T11:36:26 E
2015-04-22T12:38:54 B
2015-04-22T08:15:04 F
2015-04-22T08:15:04 G

This already gave me a good idea what test cases to look for. To get an even better impression I created a
simple visualisation using the gnuplot-based time-line tool:

time-line \
    -o failures.png \
    --dimensions 700,400 \
    --font-size 10 \
    < failures.txt

The resulting graph looked like this:

It was interesting to see that the number of tests that failed over three weeks (20) was quite low in comparison to the total number of tests (more than 2000). Also, it’s easy to spot the flaky ones versus the genuine build breakages, e.g. B, C and D versus A.
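
To spot the repeat offenders, the failure list can also be aggregated per test name. A small sketch in Java, using the sample data from above (the actual analysis just used shell tools; this is only an illustration):

```java
import java.util.*;
import java.util.stream.*;

public class FailureCounts {
    /** Count failures per test from "timestamp testname" lines,
     *  most frequent first. */
    static LinkedHashMap<String, Long> failureCounts(List<String> lines) {
        return lines.stream()
            .map(l -> l.trim().split("\\s+")[1])               // keep the test name
            .collect(Collectors.groupingBy(n -> n, Collectors.counting()))
            .entrySet().stream()
            .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                     (a, b) -> a, LinkedHashMap::new));
    }

    public static void main(String[] args) {
        List<String> failures = List.of(
            "2015-04-24T12:25:46 A",
            "2015-04-24T12:21:32 B",
            "2015-04-24T10:42:35 C",
            "2015-04-23T11:36:26 C",
            "2015-04-23T11:36:26 D",
            "2015-04-23T11:36:26 E",
            "2015-04-22T12:38:54 B",
            "2015-04-22T08:15:04 F",
            "2015-04-22T08:15:04 G");
        System.out.println(failureCounts(failures));
    }
}
```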

When I looked at the failures in detail, I found numerous places where we synchronously asserted that asynchronous actions had taken place.

To me this experience confirmed that it is a good habit to gather as much data as possible about a problem before making generalisations. It also showed that it is not necessary to buy specialised tools if the data is made available in a format that can easily be consumed by the standard tools of the trade.

Posted in Software Development | Leave a comment

Managing Configuration

The Problem

In almost every serious software development effort that entails integrating different systems, multiple environments are used throughout development, testing and the production use of the system. Each of these environments typically contains a number of systems that interact with each other. Each system needs to know how to talk to the systems it depends on in a particular environment. Also, there are other parameters, such as timeouts and feature flags, that might change across environments.

The classical solution to that problem is to externalise the configuration of upstream systems into a separate file, so that the installed artefact can be made to talk to different systems depending on where it is being installed.

Frameworks like Spring and Puppet allow the use of placeholders that get assigned different values for different environments as a means of parameterising these files.

A Stroll down the Garden Path

On a recent project I started going down this route. Spring would load a .properties file, which in turn contained placeholders that Puppet would populate differently for each environment. This was the approach all services in our department were using, so in the interest of consistency we just adopted it.

Let’s look at an example. Our system talks to a system called System A. The interaction is encapsulated in a client abstraction that takes the configuration as a constructor parameter:

class SystemAClient(@Value("${dependencies.system.a.url}") systemAUrl: String) {
    // ... implementation
}

Spring in turn used a .properties file to fill in the right-hand-side values. For day-to-day development on the developers’ workstations we used a file such as the following:

dependencies.system.a.url=http://localhost:8081
dependencies.system.b.url=http://localhost:8082
connect.timeout=1000

The trouble is that we needed different versions of this file for our acceptance test environment, our continuous integration build, our demo system and, most importantly, production.

This is where Puppet came into play. The operations team created a template for the configuration file as part of their scripting of the deployment:

dependencies.system.a.url=<%= @dependencies_system_a_url %>
dependencies.system.b.url=<%= @dependencies_system_b_url %>
connect.timeout=<%= @connect_timeout %>

Dealing with optional values was extremely awkward and involved an if statement inside the placeholder expression.

The values for these variables were defined in a YAML file that lived far away from the code, in a repository belonging to the operations team.

Here is an example for the testing environment:

        connect_timeout: 1000

Trouble Strikes

This is of course a massively complicated system. We ran into several issues:

  • Introducing new configuration parameters became hard, because
    • The new parameter had to be introduced at several layers.
    • The corresponding change to the Puppet config had to be made in lockstep.
  • Textual substitutions can easily lead to escaping and type errors.

  • Understanding the whole mechanism caused a lot of cognitive burden, as we quickly learnt when introducing new team members.

  • Releases could easily fail, because there was no easy way to test the set-up before the actual deployment: some of it relied on Puppet functionality and some on Spring’s configuration parameter resolution, and the two couldn’t easily be tested together without an actual deployment.

We actually experienced quite a few failed release attempts because of issues with the configuration.

Surely we could have introduced a complicated test harness for this set-up, but we felt that this would be throwing good money after bad, so we had a brainstorming session to come up with our procedural requirements for the way we handle configuration.

What we Needed

The following requirements emerged during our session:

  • A workflow that enables developers to do all the changes associated with a new parameter in a single commit, so that they don’t go out of sync.
  • Minimise the number of changes required to introduce a new configuration parameter.
  • Ensure that parameter values are provided for every environment.
  • Ensure that no values are provided for parameters that have been removed, to avoid zombie values that sometimes got updated for quite a while before everyone realised that they were no longer used.
  • Ensure that trivial validations for the values have been performed.
  • A central place that tells us what configuration parameters are available and whether they are optional.
  • A mechanism to change configuration values in an emergency without rebuilding the application.

The Solution

  • All configuration lives in the same repository as the code. There is a special place in the source directory where all the config files live; each file is named after the environment it describes.
  • The configuration was bundled with the application so that they would move together.
  • When configuring the application, puppet only told the application which file to pick up.
  • Instead of using two layers of templating we managed the actual files as deployed to the server in the source repository. To make sure this remained manageable we stripped everything out of the configuration that is not changing across environments.
  • For the actual config files as loaded by the app we moved from .properties files to .json files, which we mapped to a couple of classes that actually represented the configuration. Using json4s we could easily express constraints such as: this is a number, this is a URL, or this parameter is optional.
  • We wrote a unit test that tries to load up all files and check a number of properties
    • Whether the json is valid json.
    • Whether it can be deserialised into a configuration object. This in turn checked whether all non-optional values were supplied for every environment and whether URLs and numbers could be parsed as such.
    • Whether the file has the canonical format with defined indentation and field order. If it didn’t, a correctly formatted file was written to the target folder, along with an error message containing a command-line mv statement that would replace the file in question with the properly formatted version.
  • A little script was built that would allow the operations team to replace the config that came bundled with the application in an emergency.
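
The environment-coverage and zombie-value checks described above can be sketched roughly as follows. This uses java.util.Properties and invented key names for brevity, whereas the actual set-up used JSON files deserialised into config classes:

```java
import java.util.*;

public class ConfigCheck {
    /** Flag missing required keys and unknown ("zombie") keys
     *  for every environment. */
    static List<String> validate(Map<String, Properties> environments,
                                 Set<String> required, Set<String> optional) {
        List<String> errors = new ArrayList<>();
        environments.forEach((env, props) -> {
            Set<String> keys = props.stringPropertyNames();
            for (String key : required)
                if (!keys.contains(key))
                    errors.add(env + ": missing required key " + key);
            for (String key : keys)
                if (!required.contains(key) && !optional.contains(key))
                    errors.add(env + ": unknown key " + key);
        });
        return errors;
    }

    public static void main(String[] args) {
        Properties testing = new Properties();
        testing.setProperty("systemAUrl", "http://localhost:8081");
        Properties production = new Properties();
        production.setProperty("systemAUrl", "https://a.example.com");
        production.setProperty("legacyFlag", "true"); // a removed parameter

        List<String> errors = validate(
            Map.of("testing", testing, "production", production),
            Set.of("systemAUrl"), Set.of("systemBUrl"));
        errors.forEach(System.out::println);
    }
}
```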

The configuration had been turned into proper objects:

/** Environment specific configuration for my service. */
case class MyServiceConfig(
    /** Configuration of the upstream dependencies. */
    dependencies: DependenciesConfig,
    /** The connect timeout for all upstream systems in ms. */
    connectTimeout: Int
)

case class DependenciesConfig(
    /** We rely on Service A, which is maintained by the foo-group. */
    systemAUrl: Url,
    /** Service B is not required in all deployments, so it is optional. */
    systemBUrl: Option[Url]
)

Here is an example of the configuration for the testing environment:

{
  "dependencies": {
    "systemAUrl": ""
  },
  "connectTimeout": 1000
}

The configuration for a developer workstation on the other hand could look like this:

{
  "dependencies": {
    "systemAUrl": "",
    "systemBUrl": ""
  },
  "connectTimeout": 1000
}

This solution worked beautifully and suddenly introducing a new configuration parameter was no longer a dreaded task.

With the MyServiceConfig class there now also was a canonical place to see what can be configured and to elaborate a little on the parameters, which was a much better place than a separate wiki page.

Going further, we could also have written tests on top of the MyServiceConfig abstraction, e.g. to ensure that all the URLs point to our company’s domain or that certain patterns are followed across environments.

On my next project I will push hard for using this approach as soon as the problem of configuration comes up.

Posted in Software Development | Leave a comment

Go Defrustrator

At programmiersportgruppe I blogged about the go-defrustrator, a userscript that dynamically improves Go’s user interface.

Posted in Software Development | Leave a comment

Sad Truth

This Saturday I made a rather chilling discovery.
Most people in my generation know the ingenious prankster Emil i Lönneberga. And, as the Wikipedia article states, we were led to believe that he grew up to be a responsible man and indeed the chairman of the local council.

And then I saw this:

So not only did he pass away pretty early in life, no, he also fell prey to a terribly bad taxidermist.

Posted in Reading | Leave a comment


Incessant talk of strategy is the hallmark of the bad tactician.

Posted in Tiraden, Uncategorized | Leave a comment

Readability vs Runtime Feedback

In my Scala explorations I also came across the problem of testing with and verifying mocks. In Scala it is trivial to pass around predicates; however, these are function objects that can be applied but don’t know much about their own implementation. So while the code is readable, the feedback can be quite bad.
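
One way to narrow that gap is to pair a predicate with a human-readable description, so that a failed verification can explain what was expected instead of printing an opaque function object. A sketch in Java; this is not the API of any particular mocking library:

```java
import java.util.function.Predicate;

public class DescribedPredicate {
    /** A predicate that carries a description of itself, so failure
     *  messages can say what was expected. Hypothetical helper. */
    record Described<T>(String description, Predicate<T> p) implements Predicate<T> {
        public boolean test(T t) { return p.test(t); }
        @Override public String toString() { return description; }
    }

    public static void main(String[] args) {
        Described<String> startsWith10 =
            new Described<>("a string starting with \"10\"", s -> s.startsWith("10"));
        System.out.println(startsWith10.test("101"));
        System.out.println(startsWith10); // readable, unlike a bare lambda
    }
}
```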

Groovy’s assert statement is a shining example of how both goals can be attained. Just consider the following assertion:

assert ["100","Test", 11].contains("10"+"1")

It is beautiful to read (as opposed to, say, Hamcrest, which makes me use its own syntax to construct an expression tree). And when it comes to executing it, I also get really good feedback:

Posted in Software Development | Leave a comment