Managing Configuration

The Problem

In almost every serious software development effort that involves integrating different systems, multiple environments are used throughout development, testing and the production use of the system. Each of these environments typically contains a number of systems that interact with each other. Each system needs to know how to talk to the systems it depends on in a particular environment. There are also other parameters, such as timeouts and feature flags, that might change across environments.

The classical solution to that problem is to externalise the configuration of upstream systems into a separate file, so that the installed artefact can be made to talk to different systems depending on where it is being installed.

Frameworks like Spring and Puppet allow the use of placeholders that get assigned different values for different environments as a means to parameterise these files.

A Stroll down the Garden Path

On a recent project I started going down this route: Spring would load a .properties file, which in turn contained placeholders that Puppet would then populate differently for different environments. This was the approach all services in our department were using, so, in the interest of consistency, we just adopted it.

Let’s look at an example. Our system talks to a system called System A. The interaction is encapsulated in a client abstraction that takes the configuration as a constructor parameter:

@Component
class SystemAClient @Autowired() (@Value("${dependencies.system.a.url}") systemAUrl: String) {

    // ... implementation

}

Spring in turn used a .properties file to fill in the right-hand-side values. For day-to-day development on the developers’ workstations we used a file such as the following:

dependencies.system.a.url=http://a.development.envs.company.com
dependencies.system.b.url=http://b.development.envs.company.com
connect.timeout=1000
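
A minimal sketch of the kind of Spring wiring involved, for illustration only; the configuration class and the file name here are assumptions, not our actual set-up:

import org.springframework.context.annotation.{Bean, Configuration, PropertySource}
import org.springframework.context.support.PropertySourcesPlaceholderConfigurer

// Hypothetical wiring: loads the .properties file and registers the post-processor
// that resolves the ${...} placeholders used in @Value annotations.
@Configuration
@PropertySource(Array("classpath:service.properties"))
class PlaceholderConfig {

    @Bean
    def placeholderConfigurer(): PropertySourcesPlaceholderConfigurer =
        new PropertySourcesPlaceholderConfigurer()

}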

The trouble is that we needed a different version of this file for our acceptance test environment, for our continuous integration build, for our demo system and, most importantly, for production.

This is where Puppet came into play. The operations team created a template for the configuration file as part of their scripting of the deployment:

dependencies.system.a.url=<%= @dependencies_system_a_url %>
dependencies.system.b.url=<%= @dependencies_system_b_url %>
connect.timeout=<%= @connect_timeout %>

Dealing with optional values was extremely awkward and involved an if statement inside the placeholder expression.

The values for these variables were defined in a YAML file that lived far away from the code, in a repository that belonged to the operations team.

Here is an example for the testing environment:

classes:
    company::myservice:
        dependencies_system_a_url: http://a.testing.envs.company.com
        dependencies_system_b_url: http://b.testing.envs.company.com
        connect_timeout: 1000

Trouble Strikes

This is, of course, a massively complicated set-up. We ran into several issues:

  • Introducing new configuration parameters became hard, because
    • The new parameter had to be introduced at several layers.
    • The change to the Puppet config had to happen in lockstep.
  • Textual substitutions can easily lead to escaping and type errors.
  • Understanding the whole mechanism placed a considerable cognitive burden on the team, as we quickly learnt when introducing new team members.
  • Releases could easily fail, because there was no easy way to test the set-up before an actual deployment: some of it relied on Puppet functionality and some on Spring’s placeholder resolution, and the two could not easily be tested together without actually deploying.

We actually experienced quite a few failed release attempts because of issues with the configuration.

We could certainly have introduced a complicated test harness for this set-up, but we felt that this would be throwing good money after bad, so we held a brainstorming session to come up with our requirements for the way we handle configuration.

What We Needed

The following requirements emerged during our session:

  • A workflow that enables developers to do all the changes associated with a new parameter in a single commit, so that they don’t go out of sync.
  • Minimise the number of changes required to introduce a new configuration parameter.
  • Ensure that parameter values are provided for every environment.
  • Ensure that no values are provided for parameters that have been removed, to avoid zombie values that sometimes kept being updated for quite a while before everyone realised they were no longer used.
  • Ensure that trivial validations for the values have been performed.
  • A central place that tells us what configuration parameters are available and whether they are optional.
  • A mechanism to change configuration values in an emergency without rebuilding the application.

The Solution

  • All configuration lives in the same repository as the code. There is a special place in the source tree where all the config files live; each file is named after the environment it describes.
  • The configuration was bundled with the application so that code and configuration would move together.
  • When configuring the application, Puppet only told the application which file to pick up.
  • Instead of using two layers of templating we managed the actual files, as deployed to the server, in the source repository. To keep this manageable we stripped everything out of the configuration that does not change across environments.
  • For the actual config files as loaded by the app we moved from .properties files to .json files, which we mapped to a couple of case classes that actually represented the configuration. Using json4s we could easily express constraints such as: this is a number, this is a URL, or this parameter is optional. (A loading sketch follows the case classes below.)
  • We wrote a unit test that tries to load all of these files and checks a number of properties (see the test sketch after the example configurations below):
    • Whether the file contains valid JSON.
    • Whether it can be deserialised into a configuration object. This in turn checked whether all non-optional values were supplied for every environment and whether URLs and numbers could be parsed as such.
    • Whether the file has the canonical format with defined indentation and field order. If it didn’t, a correctly formatted file was written to the target folder along with an error message containing a command-line mv statement that would replace the file in question with the properly formatted version.
  • A little script was built that would allow the operations team to replace the config that came bundled with the application.

The configuration had been turned into proper objects:

/** Environment-specific configuration for my service. */
case class MyServiceConfig(
    /** The upstream systems this service talks to. */
    dependencies: DependenciesConfig,
    /** The connect timeout for all upstream systems in ms. */
    connectTimeout: Int
)

case class DependenciesConfig(
    /** We rely on System A, which is maintained by the foo-group. */
    systemAUrl: Url,
    /** System B is not required in all deployments, so it is optional. */
    systemBUrl: Option[Url]
)
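
A sketch of how these classes might be loaded with json4s. It assumes, for illustration, that Puppet hands the application the environment name via a system property and that Url is a thin wrapper around java.net.URI; the property name, the loader object and the Url definition are assumptions, as the original code is not shown here:

import org.json4s._
import org.json4s.native.JsonMethods._
import scala.io.Source

// Assumed stand-in for the Url type used in the case classes above.
case class Url(value: java.net.URI)

object ConfigLoader {

    // Rejects anything that is not a syntactically valid URL during deserialisation.
    object UrlSerializer extends CustomSerializer[Url](_ => (
        { case JString(s) => Url(java.net.URI.create(s)) },
        { case Url(uri)   => JString(uri.toString) }
    ))

    implicit val formats: Formats = DefaultFormats + UrlSerializer

    // Puppet only tells the application which environment it is running in, e.g. via a
    // system property such as -Dconfig.environment=testing (illustrative mechanism).
    // The matching JSON file is bundled with the application itself.
    def load(environment: String = sys.props.getOrElse("config.environment", "development")): MyServiceConfig = {
        val json = Source.fromInputStream(
            getClass.getResourceAsStream(s"/config/$environment.json")).mkString
        parse(json).extract[MyServiceConfig]
    }
}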

Here is an example of the configuration for the testing environment:

{
  "dependencies": {
    "systemAUrl": "http://a.testing.envs.company.com"
  },
  "connectTimeout": 1000
}

The configuration for a developer workstation, on the other hand, could look like this:

{
  "dependencies": {
    "systemAUrl": "http://a.development.envs.company.com",
    "systemBUrl": "http://b.development.envs.company.com"
  },
  "connectTimeout": 1000
}
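
The validation test could look roughly like the following sketch. It uses ScalaTest and reuses the hypothetical ConfigLoader from above, assuming the config files live under src/main/resources/config; the canonical-format check is reduced here to comparing against the pretty-printed form, whereas the real test also checked field order and produced the corrective mv command:

import java.io.File
import org.json4s._
import org.json4s.native.JsonMethods._
import org.scalatest.funsuite.AnyFunSuite
import scala.io.Source

class ConfigFilesSpec extends AnyFunSuite {

    implicit val formats: Formats = ConfigLoader.formats

    private val configDir = new File("src/main/resources/config")

    for (file <- configDir.listFiles().toSeq if file.getName.endsWith(".json")) {
        test(s"${file.getName} is a valid environment configuration") {
            val text = Source.fromFile(file).mkString

            // 1. The file must contain valid JSON.
            val json = parse(text)

            // 2. It must deserialise into the configuration object. This also ensures that
            //    all non-optional values are present and that URLs and numbers parse as such.
            json.extract[MyServiceConfig]

            // 3. It must be in the canonical format (simplified here to the pretty-printed form).
            assert(text.trim == pretty(render(json)))
        }
    }
}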

This solution worked beautifully, and suddenly introducing a new configuration parameter was no longer a dreaded task.

With the MyServiceConfig class there was now also a canonical place to see what can be configured and to elaborate a little on the individual parameters, which made it a much better home for this information than a separate wiki page.

Going further, we could also have written tests on top of the MyServiceConfig abstraction, e.g. to ensure that all the URLs point to our company’s domain or that certain patterns are followed across environments.
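
Such a test might, for instance, look like this sketch, again reusing the hypothetical ConfigLoader and directory layout from above together with the field names from MyServiceConfig:

import java.io.File
import org.scalatest.funsuite.AnyFunSuite

class ConfigConventionsSpec extends AnyFunSuite {

    private val environments =
        new File("src/main/resources/config").listFiles().toSeq
            .filter(_.getName.endsWith(".json"))
            .map(_.getName.stripSuffix(".json"))

    for (env <- environments) {
        test(s"all upstream URLs for $env stay within our company's domain") {
            val config = ConfigLoader.load(env)
            val urls = Seq(config.dependencies.systemAUrl) ++ config.dependencies.systemBUrl
            for (url <- urls)
                // The expected host suffix is taken from the example configurations above.
                assert(url.value.getHost.endsWith(".envs.company.com"))
        }
    }
}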

On my next project I will push hard for using this approach as soon as the problem of configuration comes up.

