1 Exception To The Power of JDK 8 Collectors

I’ve been using JDK 8 for over two years now and have found the new functional style of programming it provides really powerful. The thing that really impresses me about lambda expressions and streams is how I keep finding places where I am just blown away by how simple they make things.

Let’s have a look at a recent example I found which nicely demonstrates the power of collectors, but also shows one of the limitations of streams.

My project was to build a very small web server that could be used to control certain things on a headless server via a web front-end. Clearly, there are many free and open source projects I could have used for this like Apache web server, Tomcat, etc., etc. However, what I wanted was a very small, very simple install on the server. Ideally, it would be just an executable jar file containing everything that was needed.

The JDK has included a simple web server package, com.sun.net.httpserver, since JDK 6 and that seemed like an excellent starting point.

NOTE: Some people think that anything in the JDK outside the java and javax packages should be avoided because it is a private API. The com.sun.net.httpserver package is a public API; it’s just not part of the Java SE specification, so it is quite legitimate to use in application code.

Writing the code to serve up web pages, images, etc. was pretty trivial, but things got a little more complex when I wanted to use POST from the web client to send form-like data. Unfortunately, the ability to extract POST parameters is not built into the API, which seems like a pretty big oversight, but where there’s a will, there’s a way.

Having searched the web, I found an example of how to process POST parameters. Essentially, what you need to do is create a filter that is applied when processing a given context, before your handler is called. The author chose to implement the doFilter() method to process both GET and POST parameters using the code below:

@Override
public void doFilter(HttpExchange exchange, Chain chain)
    throws IOException {
  parseGetParameters(exchange);
  parsePostParameters(exchange);
  chain.doFilter(exchange);
}

private void parseGetParameters(HttpExchange exchange)
    throws UnsupportedEncodingException {
  Map parameters = new HashMap();
  URI requestedUri = exchange.getRequestURI();
  String query = requestedUri.getRawQuery();
  parseQuery(query, parameters);
  exchange.setAttribute("parameters", parameters);
}

private void parsePostParameters(HttpExchange exchange)
    throws IOException {
  if ("post".equalsIgnoreCase(exchange.getRequestMethod())) {
    @SuppressWarnings("unchecked")
    Map parameters = (Map)exchange.getAttribute("parameters");
    InputStreamReader isr = new
      InputStreamReader(exchange.getRequestBody(),"utf-8");
    BufferedReader br = new BufferedReader(isr);
    String query = br.readLine();
    parseQuery(query, parameters);
  }
}

@SuppressWarnings("unchecked")
private void parseQuery(String query, Map parameters)
    throws UnsupportedEncodingException {
  if (query != null) {
    String pairs[] = query.split("[&]");

    for (String pair : pairs) {
      String param[] = pair.split("[=]");
      String key = null;
      String value = null;

      if (param.length > 0) {
        key = URLDecoder.decode(param[0],          
          System.getProperty("file.encoding"));
      }

      if (param.length > 1) {
        value = URLDecoder.decode(param[1],
          System.getProperty("file.encoding"));
      }

      if (parameters.containsKey(key)) {
        Object obj = parameters.get(key);

        if(obj instanceof List) {
          List values = (List)obj;
          values.add(value);
        } else if(obj instanceof String) {
          List values = new ArrayList();
          values.add((String)obj);
          values.add(value);
          parameters.put(key, values);
        }
      } else {
        parameters.put(key, value);
      }
    }
  }
}
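For context, a filter like the one above only runs once it has been attached to a context. The following is a minimal sketch of that wiring, not the author's setup: the class names, the "/echo" path and the "rawQuery" attribute are all made up for illustration, and the filter simply records the raw query string so the handler can echo it back.

```java
import com.sun.net.httpserver.Filter;
import com.sun.net.httpserver.HttpContext;
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URL;

import static java.nio.charset.StandardCharsets.UTF_8;

class FilterRegistrationSketch {
    // Hypothetical filter: stores the raw query string as an exchange attribute.
    static class QueryFilter extends Filter {
        @Override
        public void doFilter(HttpExchange exchange, Chain chain) throws IOException {
            String q = exchange.getRequestURI().getRawQuery();
            exchange.setAttribute("rawQuery", q == null ? "" : q);
            chain.doFilter(exchange); // pass control to the next filter or the handler
        }

        @Override
        public String description() {
            return "Stores the raw query string";
        }
    }

    // Starts a server on the loopback address; port 0 lets the OS pick a free port.
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", port), 0);
        HttpContext context = server.createContext("/echo", exchange -> {
            // The handler runs after the filter, so the attribute is already set.
            byte[] body = String.valueOf(exchange.getAttribute("rawQuery")).getBytes(UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        context.getFilters().add(new QueryFilter()); // attach the filter to this context
        server.start();
        return server;
    }

    // Sends one GET request to a throwaway server and returns the first line of the reply.
    static String roundTrip(String query) {
        HttpServer server = null;
        try {
            server = start(0);
            int port = server.getAddress().getPort();
            URL url = new URL("http://127.0.0.1:" + port + "/echo?" + query);
            try (InputStream in = url.openStream();
                 BufferedReader r = new BufferedReader(new InputStreamReader(in, UTF_8))) {
                return r.readLine();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("a=1&b=2")); // a=1&b=2
    }
}
```

Each context has its own filter list, so different paths can be given different filters (or none).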

Having looked through the original code, I thought I could tidy it up a bit and use streams to make it more succinct. Since streams are all about processing sets of data, this seemed like an excellent place to use them.

To start with, I tidied up the doFilter() method so it only did what was required (after all, you can’t do a GET and a POST at the same time):

@Override public void doFilter(HttpExchange exchange, Chain chain)
    throws IOException {
  this.exchange = exchange;
  String method = exchange.getRequestMethod();

  switch (method) {
    case "GET":
      parseGetParameters();
      break;
    case "POST":
      parsePostParameters();
      break;
    default:
      logger.warning("Filter not configured for method: " + method);
      break;
  }
  chain.doFilter(exchange);
}

Then I simplified the parseGetParameters() and parsePostParameters() thus:

private void parseGetParameters() throws UnsupportedEncodingException {
  exchange.setAttribute("parameters",
    parseQuery(exchange.getRequestURI().getRawQuery()));
}

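private void parsePostParameters() throws IOException {
  // Keep the explicit charset from the original code rather than
  // falling back to the platform default
  BufferedReader br = new BufferedReader(
    new InputStreamReader(exchange.getRequestBody(), "utf-8"));
  exchange.setAttribute("parameters", parseQuery(br.readLine()));
}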

Essentially all I did here was have the parseQuery() method return a Map rather than passing one to it as a parameter. This made more sense, as we’ll see in a moment, because of the way we use a Collector.

The original parseQuery() method certainly does what it needs to, but is a classic example of the use of external iteration, which streams are designed to eliminate. As we can see, first the parameter string is split into an array of strings that hold keys and values. The array is iterated over and each key-value pair is extracted as separate strings. Since keys do not need to be unique, the code needs to handle the same key appearing more than once, and creates a list to hold the associated values if necessary. This all requires 33 lines of code.

The way to solve this with streams is to start by creating a stream of the key-value pairs in the parameter string, thus:

Arrays.stream(query.split("[&]"))

In the author’s code, they decided to have a Map whose values are either a String (for keys with a single value) or a List (for keys with multiple values). This is a poor design decision because it means the user of the Map has to figure out the type of each value, String or List. It also means they can’t use a typed collection. In the new code, we’ll use a Map with generic type parameters.

Now what we need to do is to reduce our stream of key-value pairs into a Map<String, List<String>>. To quote from the API documentation, collect() “Performs a mutable reduction operation on the elements of this stream using a Collector”. Our code then becomes:

Map<String, List<String>> paramMap = Arrays.stream(query.split("[&]"))
    .collect(aCollector);

All we need to do is figure out what aCollector needs to be. Fortunately, the streams API comes with a handy utility class for this, unsurprisingly called Collectors. This provides us with some very useful methods, so if all we want to do is collect the elements of a stream into a List we use Collectors.toList(). If we want to collect into a CSV string we can use Collectors.joining(",") and so on.
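As a quick illustration of those two collectors, here is a small sketch (the class and method names are made up for this example):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CollectorBasics {
    // Collects the stream elements into a List, preserving encounter order.
    static List<String> asList(String... items) {
        return Stream.of(items).collect(Collectors.toList());
    }

    // Joins the stream elements into a single comma-separated string.
    static String asCsv(String... items) {
        return Stream.of(items).collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(asList("a", "b", "c")); // [a, b, c]
        System.out.println(asCsv("a", "b", "c"));  // a,b,c
    }
}
```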

What we need to do here though is a bit more complicated. We need to take each element of the input stream and split it into a key and a value string. Then we need to group the values into a List for each unique key. To do this we can use the Collectors.groupingBy() method. There are three of these to choose from (not including the groupingByConcurrent() methods), but which one to use?

The answer is the form that takes two arguments:

groupingBy(Function classifier, Collector downstream)

I’ve left out all the generic type parameters here to make things a little easier to see (look at the JDK 8 API documentation for the full method signature). What this method returns is a Collector that uses the classifier Function to create the keys from the elements on the input stream and the downstream collector to collect the value elements together.
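To get a feel for the two-argument form before applying it to query strings, here is a small sketch on unrelated data. The word list and the names are invented for illustration; the point is that the classifier stays the same while the downstream collector decides what is stored against each key.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupingBySketch {
    // Classifier: the first letter of each word (words are assumed non-empty).
    // Downstream: gather the words that share a key into a List.
    static Map<String, List<String>> byFirstLetter(List<String> words) {
        return words.stream()
            .collect(Collectors.groupingBy(w -> w.substring(0, 1), Collectors.toList()));
    }

    // Same classifier, different downstream collector: count instead of list.
    static Map<String, Long> countByFirstLetter(List<String> words) {
        return words.stream()
            .collect(Collectors.groupingBy(w -> w.substring(0, 1), Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "avocado", "banana");
        System.out.println(byFirstLetter(words).get("a"));      // [apple, avocado]
        System.out.println(countByFirstLetter(words).get("a")); // 2
    }
}
```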

Here we get to use the other really powerful feature of JDK 8: Lambda expressions, which give us a simple way of parameterising behaviour. The classifier is simple, because it’s just a function that maps from the key-value string to a key only, so:

s -> s.split("[=]")[0]

We split the string around the equal sign and take the first element of the resulting array, which is our key.

Now we need another Collector that will produce a List<String> of the values associated with each key provided by the classifier. Since we need to extract the values from the key-value pairs, we need to use the Collectors.mapping() method:

mapping(Function mapper, Collector downstream)

Again, the mapper Function is simple; we can repeat what we did for the key:

s -> s.split("[=]")[1]

this time taking the second string, which is the value.

Finally, for our downstream collector we need to put these values into a List, so we use Collectors.toList(). Putting this all together we get our new, improved parseQuery() method:

private Map<String, List<String>> parseQuery(String query)
    throws UnsupportedEncodingException {
  return (query == null) ? null : Arrays.stream(query.split("[&]"))
    .collect(groupingBy(s -> (s.split("[=]"))[0],
      mapping(s -> (s.split("[=]"))[1], toList())));
}

The original 33 lines become just 3! Almost. At this point, we can sit back and look at this and think about how great streams are.
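To see what the method returns, here is a self-contained sketch of the same pipeline with the Map fully typed. The wrapper class name and the sample query string are made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.mapping;
import static java.util.stream.Collectors.toList;

class ParseQuerySketch {
    // Splits on '&' into key=value pairs, then groups the values by key.
    // No URL decoding is done at this stage.
    static Map<String, List<String>> parseQuery(String query) {
        return (query == null) ? null : Arrays.stream(query.split("[&]"))
            .collect(groupingBy(s -> s.split("[=]")[0],
                mapping(s -> s.split("[=]")[1], toList())));
    }

    public static void main(String[] args) {
        Map<String, List<String>> params =
            parseQuery("colour=red&size=10&colour=blue");
        System.out.println(params.get("colour")); // [red, blue]
        System.out.println(params.get("size"));   // [10]
    }
}
```

One thing to be aware of: a parameter with no value (for example "flag" or "flag=") makes the second split()[1] throw an ArrayIndexOutOfBoundsException, whereas the original loop stored a null value in that case.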

Unfortunately, if you go back and look at the original code you will see that I’ve conveniently left out one step: using URLDecoder to convert the parameter strings to their original form. That should be easy, all we have to do is add the necessary method call.

private Map<String, List<String>> parseQuery(String query)
    throws UnsupportedEncodingException {
  String enc = System.getProperty("file.encoding");
  return (query == null) ? null : Arrays.stream(query.split("[&]"))
    .collect(groupingBy(s -> URLDecoder.decode((s.split("[=]"))[0], enc),
      mapping(s -> URLDecoder.decode((s.split("[=]"))[1], enc), toList())));
}

Now, however, we find ourselves faced with one of the limitations of streams, which is how they deal with exceptions. Having added the decode() call, the code will not compile because decode() can throw an UnsupportedEncodingException.

What would be really nice is simply to have the exception caught outside the whole stream but that’s not the way it works. The apply() method of the Function interface does not declare any thrown exceptions so it must be caught inside the Lambda expression. This is a real shame for our much-simplified code. We’re faced with two options:

    1. Give the lambda expression a body with braces and include a try-catch block. This becomes really ugly, as you can see: it adds a whole lot of braces and pushes our line count up to 14.

  return (query == null) ? null : Arrays.stream(query.split("[&]"))
    .collect(groupingBy(s -> {
      try {
        return URLDecoder.decode((s.split("[=]"))[0], enc);
      } catch (UnsupportedEncodingException ex) {
        throw new UncheckedIOException(ex); // groupingBy rejects null keys
      }
    }, mapping(s -> {
      try {
        return URLDecoder.decode((s.split("[=]"))[1], enc);
      } catch (UnsupportedEncodingException ex) {
        throw new UncheckedIOException(ex);
      }
    }, toList())));

    2. A better solution for the stream is to create two methods that handle the exception internally. We can use method references in our stream but this still adds a lot of code for the new methods.
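A rough sketch of that second option might look like the following. The helper names (decode, key, value) and the class name are invented for this example, and it assumes UTF-8 rather than the file.encoding property used above. The checked exception is rethrown as an UncheckedIOException, which is possible because UnsupportedEncodingException is an IOException.

```java
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.mapping;
import static java.util.stream.Collectors.toList;

class DecodingQueryParser {
    // Assumption for this sketch: parameters are encoded as UTF-8.
    private static final String ENC = "UTF-8";

    // Hypothetical helper: decodes s and converts the checked exception
    // into an unchecked one so it can be called from a lambda.
    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, ENC);
        } catch (UnsupportedEncodingException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Decoded key of a "key=value" pair.
    private static String key(String pair) {
        return decode(pair.split("[=]")[0]);
    }

    // Decoded value of a "key=value" pair (a value is assumed to be present).
    private static String value(String pair) {
        return decode(pair.split("[=]")[1]);
    }

    static Map<String, List<String>> parseQuery(String query) {
        return (query == null) ? null : Arrays.stream(query.split("[&]"))
            .collect(groupingBy(DecodingQueryParser::key,
                mapping(DecodingQueryParser::value, toList())));
    }

    public static void main(String[] args) {
        Map<String, List<String>> params = parseQuery("name=John+Smith&q=a%26b");
        System.out.println(params.get("name")); // [John Smith]
        System.out.println(params.get("q"));    // [a&b]
    }
}
```

The stream itself stays at three lines, but the helpers add another dozen or so. On Java 10 and later, URLDecoder.decode(String, Charset) does not declare a checked exception, which removes the need for the try-catch altogether.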

In the end, I decided to keep my first super-slim approach. Since my web front end wasn’t encoding anything in this case, my code would still work.

The takeaway from this is that, using streams and the flexibility of collectors, it is possible to greatly reduce the amount of code required for complex processing. The drawback is that this doesn’t work quite so well when those pesky exceptions rear their ugly heads.

© Azul Systems, Inc. 2017 All rights reserved.