Java Lib to Launch External Processes

I recently redesigned some of the code I tend to use to spawn external processes (pdflatex anyone?) in java. The implementation is still a bit buggy, but I am more interested in people’s opinions about the API (non-blocking killable invocations are not yet supported). The project on github is called jproc. Here is the cookbook so far:

Get the library from maven central:


To launch an external program we’ll use a ProcBuilder. The run method
builds and spawns the actual process and blocks until the process exits.
The process takes care of writing the output to a stream (as opposed to the standard
facilities in the JDK, which expect the client to actively consume the
output from an input stream):

ByteArrayOutputStream output = new ByteArrayOutputStream();
new ProcBuilder("echo")
        .withArg("Hello World!")
        .withOutputStream(output)
        .run();
assertEquals("Hello World!\n", output.toString());
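
For comparison, here is a minimal sketch of the plain JDK route, where the caller has to drain the process output itself. It only uses `java.lang.ProcessBuilder`; the names `PlainJdkCapture` and `runAndCapture` are made up for illustration, and it assumes Java 9+ and an `echo` binary on the path.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class PlainJdkCapture {
    /** Runs the command and returns everything it wrote to stdout and stderr. */
    static String runAndCapture(String... command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)   // merge stderr into stdout, one pipe to drain
                .start();
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        // The client has to pull the output out of an input stream itself.
        try (InputStream stdout = process.getInputStream()) {
            stdout.transferTo(captured);
        }
        process.waitFor();
        return captured.toString();
    }
}
```

Calling `PlainJdkCapture.runAndCapture("echo", "Hello World!")` should give the same string as the ProcBuilder example, at the price of handling the stream plumbing yourself.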

The input can be read from an arbitrary input stream, like this:

ByteArrayInputStream input = new ByteArrayInputStream("Hello cruel World".getBytes());
ProcResult result = new ProcBuilder("wc")
        .withArgs("-w")
        .withInputStream(input).run();
assertEquals("3", result.getOutputString().trim());

If all you want to get is the string that gets returned and if there
is not a lot of data, using streams is quite cumbersome. So for convenience,
if no stream is provided the output is captured by default and can be
obtained from the result:

ProcResult result = new ProcBuilder("echo")
                            .withArg("Hello World!").run();
assertEquals("Hello World!\n", result.getOutputString());
assertEquals(0, result.getExitValue());
assertEquals("echo \"Hello World!\"", result.getProcString());

For providing input there is a convenience method too:

ProcResult result = new ProcBuilder("cat")
   .withInput("This is a string").run();
assertEquals("This is a string", result.getOutputString());

Some external programs use environment variables. These can also
be set using the withVar method:

ProcResult result = new ProcBuilder("bash")
                            .withArgs("-c", "echo $MYVAR")
                            .withVar("MYVAR","my value").run();
assertEquals("my value\n", result.getOutputString());
assertEquals("bash -c \"echo $MYVAR\"", result.getProcString());

A common use case for external programs is batch processing of data.
These programs might always run into difficulties. Therefore a timeout can be
specified. There is a default timeout of 5000ms. If the program does not terminate
within the timeout interval it will be terminated and the failure is indicated through
an exception:

ProcBuilder builder = new ProcBuilder("sleep")
        .withArg("2")
        .withTimeoutMillis(1000);
try {
    builder.run();
    fail("Should time out");
} catch (TimeoutException ex) {
    assertEquals("Process 'sleep 2' timed out after 1000ms.", ex.getMessage());
}
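
To illustrate the idea of "terminate after the timeout and signal it through an exception", here is a rough sketch using only the JDK. This is not how jproc implements it; `TimeoutSketch` and `runWithTimeout` are hypothetical names, and the code assumes Java 9+ (for `Redirect.DISCARD`) and a `sleep` binary on the path.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimeoutSketch {
    /**
     * Runs the command and returns its exit value, or kills the process and
     * throws if it is still running after timeoutMillis.
     */
    static int runWithTimeout(long timeoutMillis, String... command)
            throws IOException, InterruptedException, TimeoutException {
        // Discard the output so a chatty child cannot block on a full pipe.
        Process process = new ProcessBuilder(command)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .redirectError(ProcessBuilder.Redirect.DISCARD)
                .start();
        if (!process.waitFor(timeoutMillis, TimeUnit.MILLISECONDS)) {
            process.destroyForcibly();
            process.waitFor();   // wait until the killed process is really gone
            throw new TimeoutException("Process '" + String.join(" ", command)
                    + "' timed out after " + timeoutMillis + "ms.");
        }
        return process.exitValue();
    }
}
```

With `runWithTimeout(200, "sleep", "2")` the call should end in a `TimeoutException` after roughly 200ms rather than blocking for two seconds.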

Even if the process does not time out, we might be interested in the
execution time. It is also available through the result:

ProcResult result = new ProcBuilder("sleep")
        .withArg("0.5")
        .run();
assertTrue(result.getExecutionTime() > 500 && result.getExecutionTime() < 1000);

By default the new program is spawned in the working directory of
the parent process. This can be overridden:

ProcResult result = new ProcBuilder("pwd")
        .withWorkingDirectory(new File("/")).run();
assertEquals("/\n", result.getOutputString());

It is a time-honoured tradition that programs signal a failure
by returning a non-zero exit value. However, in java failure is
signalled through exceptions. Non-zero exit values therefore
get translated into an exception that also grants access to
the output on standard error:

ProcBuilder builder = new ProcBuilder("ls")
        .withArg("xyz");
try {
    builder.run();
    fail("Should throw exception");
} catch (ExternalProcessFailureException ex) {
    assertEquals("ls: xyz: No such file or directory\n", ex.getStderr());
    assertEquals(1, ex.getExitValue());
    assertEquals("ls xyz", ex.getCommand());
    assertTrue(ex.getTime() > 0);
}
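
The translation of exit values into exceptions can be sketched with the plain JDK as follows. Again this is only an illustration under assumptions, not jproc's implementation: `ExitValueSketch`, `runOrThrow` and `ProcessFailedException` are invented names, stderr is parked in a temp file to keep the sketch free of threads, and Java 9+ is assumed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ExitValueSketch {
    /** Hypothetical unchecked exception carrying the exit value and stderr. */
    static class ProcessFailedException extends RuntimeException {
        final int exitValue;
        final String stderr;

        ProcessFailedException(String command, int exitValue, String stderr) {
            super("'" + command + "' exited with " + exitValue);
            this.exitValue = exitValue;
            this.stderr = stderr;
        }
    }

    /** Returns stdout on success, throws if the exit value is non-zero. */
    static String runOrThrow(String... command) throws IOException, InterruptedException {
        Path errFile = Files.createTempFile("stderr", ".txt");
        try {
            Process process = new ProcessBuilder(command)
                    .redirectError(errFile.toFile())   // stderr goes to the file
                    .start();
            String stdout = new String(process.getInputStream().readAllBytes(),
                    StandardCharsets.UTF_8);
            int exitValue = process.waitFor();
            if (exitValue != 0) {
                String stderr = new String(Files.readAllBytes(errFile), StandardCharsets.UTF_8);
                throw new ProcessFailedException(String.join(" ", command), exitValue, stderr);
            }
            return stdout;
        } finally {
            Files.deleteIfExists(errFile);
        }
    }
}
```

The caller then deals with failures in a `catch` block instead of checking return codes after every invocation.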

Input and output can also be provided as byte[].
ProcBuilder also copes with large amounts of data:

int MEGA = 1024 * 1024;
byte[] data = new byte[4 * MEGA];
for (int i = 0; i < data.length; i++) {
    data[i] = (byte) Math.round(Math.random() * 255 - 128);
}
ProcResult result = new ProcBuilder("gzip")
        .withInput(data)
        .run();
assertTrue(result.getOutputBytes().length > 2 * MEGA);

The builder can build and spawn several processes from
the same builder instance:

ProcBuilder builder = new ProcBuilder("uuidgen");
String uuid1 = builder.run().getOutputString();
String uuid2 = builder.run().getOutputString();

For convenience there is also a static method that just runs a
program and captures the output:

String output = ProcBuilder.run("echo", "Hello World!");
assertEquals("Hello World!\n", output);

Also there is a static method that filters a given string through
a program:

String output = ProcBuilder.filter("x y z","sed" ,"s/y/a/");
assertEquals("x a z\n", output);

Posted in Software Development | 4 Comments


Note to self: join works only on sorted text files.


Thoughts on handling Translations and Views on Source Code

One of my recent java projects was to be used by users with three different languages. We went with the standard java approach of using properties files for messages. In intellij there is decent tool support for that.
However it seems a bit odd to have a strongly typed language and then rely on string keys for text lookup. At some point we introduced enums representing the keys, but as they were not automatically generated they involved a lot of repetition. Also you are never quite sure how many arguments a message needs.

So I had this idea of using interfaces to represent resource bundles. Each message could be represented as a method, with parameters representing the arguments to the placeholders. It would look somewhat like this:

public interface ExampleMessagePool {
    String sayHello(String name);
    String bye(String name);
    String warning();
}

From the client’s point of view it would work as follows:

var factory = new MessagePoolFactory<ExampleMessagePool>(ExampleMessagePool.class);
ExampleMessagePool english = factory.getLanguageSource("en");
assertEquals("Hello Matthieu!", english.sayHello("Matthieu"));
assertEquals("Good bye Felix!", english.bye("Felix"));
assertEquals("Attention!", english.warning());
ExampleMessagePool french = factory.getLanguageSource("fr");
assertEquals("Bonjour Matthieu!", french.sayHello("Matthieu"));
assertEquals("Au revoir Felix!", french.bye("Felix"));
assertEquals("Attention!", french.warning());
ExampleMessagePool german = factory.getLanguageSource("de");
assertEquals("Guten Tag Matthieu!", german.sayHello("Matthieu"));
assertEquals("Auf Wiedersehen Felix!", german.bye("Felix"));
assertEquals("Achtung!", german.warning());

The interesting point here is that I get completion and also a hint as to which arguments a particular message takes. The MessagePoolFactory takes care of creating instances for the respective languages. The next question is obviously where these messages come from. One way would be to use annotations:

@MessagePool(languages={"en", "de", "fr"})
public interface ExampleMessagePool {
    // @Message is a placeholder name for the per-method annotation
    @Message(
            entries = {
                @Entry(key = "en", value = "Hello {0}!"),
                @Entry(key = "de", value = "Guten Tag {0}!"),
                @Entry(key = "fr", value = "Bonjour {0}!")
            })
    String sayHello(String name);

    @Message(
            entries = {
                @Entry(key = "en", value = "Good bye {0}!"),
                @Entry(key = "de", value = "Auf Wiedersehen {0}!"),
                @Entry(key = "fr", value = "Au revoir {0}!")
            })
    String bye(String name);

    @Message(
            entries = {
                @Entry(key = "en", value = "Attention!"),
                @Entry(key = "de", value = "Achtung!"),
                @Entry(key = "fr", value = "Attention!")
            })
    String warning();
}
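
To give an idea of how little code such a factory needs, here is a minimal sketch built on `java.lang.reflect.Proxy` and `java.text.MessageFormat`. It is not the actual MessagePoolFactory: the annotation names `Message` and `Entry`, the `Greetings` interface and the `forLanguage` method are all assumptions made up for this example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Proxy;
import java.text.MessageFormat;

class MessagePoolSketch {
    @Retention(RetentionPolicy.RUNTIME)
    @interface Entry { String key(); String value(); }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Message { Entry[] entries(); }

    interface Greetings {
        @Message(entries = {
                @Entry(key = "en", value = "Hello {0}!"),
                @Entry(key = "de", value = "Guten Tag {0}!")
        })
        String sayHello(String name);
    }

    /** Creates an instance of the pool interface that answers in the given language. */
    @SuppressWarnings("unchecked")
    static <T> T forLanguage(Class<T> pool, String language) {
        return (T) Proxy.newProxyInstance(pool.getClassLoader(), new Class<?>[] {pool},
                (proxy, method, args) -> {
                    Message message = method.getAnnotation(Message.class);
                    if (message == null) {
                        throw new IllegalStateException(method.getName() + " has no @Message");
                    }
                    for (Entry entry : message.entries()) {
                        if (entry.key().equals(language)) {
                            // Method arguments fill the {0}, {1}, ... placeholders.
                            return MessageFormat.format(entry.value(),
                                    args == null ? new Object[0] : args);
                        }
                    }
                    throw new IllegalStateException(
                            "No '" + language + "' text for " + method.getName());
                });
    }
}
```

`MessagePoolSketch.forLanguage(MessagePoolSketch.Greetings.class, "de").sayHello("Matthieu")` should then yield "Guten Tag Matthieu!". Note that the sketch deliberately ignores `toString`, `equals` and `hashCode` on the proxy.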

This approach is very sexy in so far as it allows method and parameter names to be refactored. Also the implementation is trivial, not least because java deals with unicode source files. On the other hand, making your translators edit the java sources is probably a bit of a challenge. However, you could provide a narrow view on that source code. The eclipse platform lends itself to that kind of experiment, so I actually did a bit of a spike, which I am demoing here:

Whilst this is very cool, it’s an entirely static approach. Especially with translations I find it beneficial if they are stored in a database so that they can be edited at run-time. Ideally some key users can then maintain translations, which are part of their domain anyway. Then the interface could be annotated with some sort of GUID, that could later be used at runtime to identify entries in the database:

public interface ExampleMessagePool {
    String sayHello(String name);
    String bye(String name);
    String warning();
}

This way the refactorability would be preserved. Also tool support could be provided. Annotating elements with GUIDs seems to be an interesting concept. Just imagine your database mapping being immune to name changes.
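
A rough sketch of the GUID idea, under heavy assumptions: the `MessageId` annotation, the GUID value and the in-memory `STORE` map (standing in for the database table) are all hypothetical. The point is only that the lookup key is the GUID, not the method name, and that texts are resolved on every call.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Proxy;
import java.text.MessageFormat;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class GuidMessagePoolSketch {
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface MessageId { String value(); }

    interface Greetings {
        // The method can be renamed freely, the GUID stays the same.
        @MessageId("3b0f6c1e-8a54-4c7b-9d2e-5a1f0c7e4b21")
        String sayHello(String name);
    }

    /** Stand-in for the database: "<guid>/<language>" maps to a message pattern. */
    static final Map<String, String> STORE = new ConcurrentHashMap<>();

    @SuppressWarnings("unchecked")
    static <T> T forLanguage(Class<T> pool, String language) {
        return (T) Proxy.newProxyInstance(pool.getClassLoader(), new Class<?>[] {pool},
                (proxy, method, args) -> {
                    MessageId id = method.getAnnotation(MessageId.class);
                    if (id == null) {
                        throw new IllegalStateException(method.getName() + " has no @MessageId");
                    }
                    // Looked up on every call, so edits to the store show up immediately.
                    String pattern = STORE.get(id.value() + "/" + language);
                    if (pattern == null) {
                        throw new IllegalStateException(
                                "No '" + language + "' text for " + method.getName());
                    }
                    return MessageFormat.format(pattern, args == null ? new Object[0] : args);
                });
    }
}
```

If a key user changes the stored pattern while the application is running, the next call to `sayHello` picks up the new text without a redeploy.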

There is definitely some more thinking required here, but it strikes me that the default mechanisms in java are very rudimentary, much less than a talented developer can dream up in a rainy afternoon.
Further developments could include static or, if you go for the runtime approach, dynamic tools to analyse whether translations are present or whether there are duplicates in message pools.


Don’t Play with Yourself

Some two years ago I had the pleasure of working with a code base that relied heavily on the spring SimpleForm framework. The thing I didn’t like about this framework was the whole controller class hierarchy. Essentially there are about ten superclasses and calls get delegated up the whole chain. In theory this is all good OO. So I struggled a bit understanding why I didn’t like it. And as so often when you are struggling, a bit of sketching goes a long way:

This sequence diagram degenerated into a christmas tree shape. What is the problem with that? Well, these methods operate on different layers. In the case of the SimpleFormController there are methods dealing with Requests and then there are methods dealing with FormBackingObjects. Different levels of abstraction get mixed into a single object (even if they are structured by the inheritance hierarchy). Also this is a point where I struggled a lot with idiomatic Smalltalk code. This kind of code might be fine if you are a genius. If you are just a simple developer like I am, something like this is probably much more intelligible to you:

Here each object lives only in a single layer. The same argument can also be used in the long-standing debate over private methods, because the receiver of private methods is also self. Now I think there is definitely a case for calling some methods on self, but having a single instance more than two or three times on your call stack is probably a bad thing.
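
In code, the "one object per layer" shape could look roughly like the sketch below. It has nothing to do with the real spring classes; all names (`GreetingRequestHandler`, `GreetingFormHandler`, `GreetingForm`) are invented for illustration. The request layer delegates to a separate object instead of calling further methods on itself through a superclass chain.

```java
import java.util.Map;

class LayeringSketch {
    /** The form backing object passed between the layers. */
    static class GreetingForm {
        final String name;

        GreetingForm(String name) {
            this.name = name;
        }
    }

    /** Form layer: only knows about form objects, never sees a request. */
    static class GreetingFormHandler {
        String submit(GreetingForm form) {
            return "Saved greeting for " + form.name;
        }
    }

    /** Request layer: turns raw parameters into a form object and hands it down. */
    static class GreetingRequestHandler {
        private final GreetingFormHandler formHandler;

        GreetingRequestHandler(GreetingFormHandler formHandler) {
            this.formHandler = formHandler;
        }

        String handle(Map<String, String> requestParameters) {
            GreetingForm form = new GreetingForm(requestParameters.getOrDefault("name", ""));
            return formHandler.submit(form);   // one call down, no calls on self
        }
    }
}
```

Each instance shows up once on the call stack, and each class can be read without knowing the abstraction level of the other.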


Getting Real with

Two weeks back I posted a video explaining how to get started with While this might have been enough information to start playing, there is a lot more to consider when going with for production use.

Over the past few weeks I went through that experience on my project and here are some of the practices that we found useful. Håkan also released version 0.3.0 that incorporates some of our learnings.

When working in the IDE we used the javaagent to weave the lambdas. However, there were a few caveats. In version 0.2.4 you could only blacklist packages to prevent them from being woven. In most cases it is easier to specify explicitly for which packages to enable weaving. Also, when using contemporary java goodness there are other parties doing byte code magic. We experienced these problems with spring’s scoped proxies, therefore we introduced another parameter to filter out classes by a regular expression applied to the full classname. Proxy classes usually come with a lot of dollar signs in their names.
So the vmargs we are using in development are:


If you are using eclipse you can specify these parameters as default vm-parameters in your JRE definition:

While using an agent in the IDE is acceptable, it gets messy when deploying the application. Essentially all the JVMs on the way to production, including your application server, would need the agent. What we did instead was to instrument our class files in our build using the AOT lambda compiler. So this is a single step after the compilation. If you are using ant it could look somewhat like this:

<target name="lambda" depends="compile">
        <java fork="true" classname="…">
              <sysproperty key="lambda.weaving.debug" value="true"/>
              <classpath>
                      <path refid="prod"/>
              </classpath>
              <arg value="prod.jar"/>
        </java>
</target>

Important: The lambda weaver needs debug information on local variables and also you need to encode your source files as utf-8 to get the fancy lambda letter, so perhaps you have to change your compile target like this:

<javac srcdir="src/java"
       debug="true" debuglevel="lines,vars,source" encoding="utf-8"/>

From a technical point of view this is all you need to do. However there is also the question of how to use the new feature “responsibly”. In our code base we used the generic CollectionUtils to filter and transform Collections. While they are elegant from a theoretical point of view, they are quite an insult to the eye, so our aim was to replace all these with Functional operations on collections are very well understood (the method names used in go at least back to Smalltalk-80, that’s thirty years).

So the advice is: introduce your team to the following methods and aim to restrict the use of lambda expressions to those:

  • select
  • collect
  • find
  • inject
  • groupBy
  • While there is remarkable support for arrays and primitives, try to stick to
    Collections and proper Objects.

I think these alone justify the investment.
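
For readers who have never met these five operations, here is a small sketch of what each one does. Note the swap: it is written with `java.util.stream` (Java 8+) as a stand-in, not with the library discussed in this post, and the class and method names are made up for the example. The stream equivalents are `filter`, `map`, `findFirst`, `reduce` and `Collectors.groupingBy`.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;
import java.util.stream.Collectors;

class CollectionOpsSketch {
    static final List<String> WORDS = List.of("select", "collect", "find", "inject", "groupBy");

    /** select: keep the elements that match a predicate. */
    static List<String> select() {
        return WORDS.stream().filter(w -> w.length() > 6).collect(Collectors.toList());
    }

    /** collect: transform every element. */
    static List<Integer> collect() {
        return WORDS.stream().map(String::length).collect(Collectors.toList());
    }

    /** find: the first element that matches a predicate, if any. */
    static Optional<String> find() {
        return WORDS.stream().filter(w -> w.startsWith("i")).findFirst();
    }

    /** inject: fold all elements into a single value, here the total length. */
    static int inject() {
        return WORDS.stream().map(String::length).reduce(0, Integer::sum);
    }

    /** groupBy: partition the elements by a key, here the word length. */
    static Map<Integer, List<String>> groupBy() {
        return WORDS.stream().collect(
                Collectors.groupingBy(String::length, TreeMap::new, Collectors.toList()));
    }
}
```

All five are free of side effects and take a single expression as their argument, which is exactly the restricted style recommended here.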
Things you should avoid or do only after proper consideration:

  • Do not declare parameters of function types
  • Do not declare local variables of function types
  • Do not declare fields of function types
  • Do not modify enclosed state from within the closure, i.e. local variables or fields
  • Generally avoid side effects

The typical transformation we did looked something like this:

    private Set<Order.Status> retrieveOrderStatuses() {
        Set<Order.Status> statuses = new HashSet<Order.Status>();
        if (orders == null) return statuses;
        for (Order order : orders) {
            statuses.add(order.getStatus());
        }
        return statuses;
    }


    @LambdaParameter private static Order o;
    private Set<Order.Status> retrieveOrderStatuses() {
        if (orders == null) return emptySet();
        return collect(orders, λ(o, o.getStatus())).toSet();
    }

Because getting the lambda character (you might also want to use the alias fn) is a bit tricky you might want to use this java editor template for eclipse that also sorts out the static import:


Sticking to these simple rules actually helped us get away from the anonymous inner classes and CollectionUtils, and that was an easy sell to the mostly mathematically inclined team. In my opinion the fact that lambdas are restricted to expressions is actually a good thing.


Get Closures for Java Now!

The general state of affairs at Sun/Oracle is very sad. If you are like me, you cannot take this much longer. Fortunately enough my esteemed colleague Håkan Råberg has invented what is at least a molotov cocktail, perhaps even a guillotine, to the revolution of java software development. Yes, I am talking about the modestly named, which brings the Ruby enumerable module and closures to java. Without a special compiler, just by pushing the boundaries and enhancing byte code.

I created this little screencast, that shows how to get up and running with

There is also a hi-res version of the videos on google docs (approx. 20M download size).

For more examples and explanation watch Håkan’s blog and read the source.

Now go, download, and enjoy!


Contemporary LaTeX Resource

I am currently using LaTeX to produce PDF output for a multilingual (Polish characters, anyone?) business application. I hope to write a proper post about this once I am done. But I have to post the url of this blog which deals with doing nice contemporary typesetting with LaTeX.


The Gospel

Today’s lesson is from the book of stackoverflow 65,35-21:

It’s a well known fact, that Oracle treats empty strings as null.

I knew about that, but somehow I forgot…


The Train Build Monitor

On our current project we came up with a model train build monitor. The objective was to have the train move while the build is green and to stop when it goes red. The whole thing looks somewhat like this:

As a USB interface we chose to go for the Velleman K8055, which is available for about forty pounds and provides eight digital and two analogue outputs as well as two analogue and five digital inputs. The analogue outputs are provided as PWM signals.
The outputs are all open-collectors. Being a good developer I did of course anticipate a few more use-cases and hence designed the controller so we could actually control the speed as well as the direction in which the train is moving. Essentially it uses a transistor to switch the train (using the PWM) and a relay to reverse the direction (yes, this is somewhat lame). Also there are two free-wheeling diodes to protect the electronics from the high inductance of the relay and the train’s motor. This is the circuit diagram of the controller that goes between the K8055 and the train:

The K8055 board ships with a DLL to control the IO. Apparently there is a linux driver, which is much better than the windows version, but we are in a bit of a windows shop, so we went with the DLL. It turned out wrapping a DLL in a ruby script is fairly trivial:

require 'Win32API'
open = Win32API.new("K8055D", "OpenDevice", ['L'], 'L')
outputAnalog = Win32API.new("K8055D", "OutputAnalogChannel", ['L','L'], 'V')
outputDigital = Win32API.new("K8055D", "WriteAllDigital", ['L'], 'V')
inputDigital = Win32API.new("K8055D", "ReadAllDigital", [], 'L')
outputAnalog.Call(1, 10) # setting DA channel 1 to 10
outputDigital.Call(128)    # reversing

Future plans include figuring out where the train currently is to stop in the station. The original plan was to use a reed switch, but that proved to be a bit unreliable, so the current thinking involves using a camera and something like hornetseye to get the exact position of the train.


Once Again It Is Deutsche Bahn…

…to whom we owe what I consider a small linguistic gem: the word “Unterwegsbahnhof” (roughly, “en-route station”).
