Friday, January 31, 2014

The Power of the Arrow

If you haven't yet migrated a project to JDK 8 I recommend reading my post Migrating to JDK 8 with NetBeans first.

After converting projects to JDK 8 I've been opening my eyes to opportunities where I could simplify or better express a block of code by using lambdas and the new collections APIs. This post shares some of what I have done and how you might find the same sorts of elegant changes in your code.

Some of these samples come from other libraries, where I could see how the same result could be better achieved with JDK 8.

I started doing these by hand and then slapped myself: NetBeans' hints can do all this for me. With NetBeans a hint will generally appear, you smile and accept it, possibly modify the result, then another hint might appear, so you grin and accept that too; before you know it you've taken 5 or more lines down to 1.

Samples here are all shown in stages of conversion to help with understanding what each step is doing. Sometimes a step temporarily increases the complexity, which then pays off as simplifications in subsequent steps.

I've called this The Power of the Arrow in homage to the lambda symbol -> which looks to me like an arrow.

TIP: Many examples here make use of static methods which could reduce code by using a static import for the method, such as Collectors.groupingBy. I have chosen to explicitly include the class name so it is clear where the method is coming from.

Collection Contains Another Collection Entry


There are many different implementations out there; I had my own and only just realised there was one in spring-core that I could use as the source of the example. Here's the method we will be rewriting: org.springframework.util.CollectionUtils.containsAny(Collection<?>, Collection<?>)

Before

This is a fairly common problem: we first perform null checks, then loop over one collection testing whether each entry is within the other.

public static boolean containsAny(Collection<?> source,
                                  Collection<?> candidates) {
    if (isEmpty(source) || isEmpty(candidates)) {
        return false;
    }
    for (Object candidate : candidates) {
        if (source.contains(candidate)) {
            return true;
        }
    }
    return false;
}

If you don't want to download the source for spring-core, the following is the isEmpty method:

public static boolean isEmpty(Collection<?> collection) {
    return (collection == null || collection.isEmpty());
}

Step 1: Remove the For Loop

NetBeans will be hinting "Can use functional operations".



Accept the hint and you will end up with the following reduction:

public static boolean containsAny(Collection<?> source,
                                  Collection<?> candidates) {
    if (isEmpty(source) || isEmpty(candidates)) {
        return false;
    }
    if (candidates.stream()
            .anyMatch((candidate) -> (source.contains(candidate)))) {
        return true;
    }
    return false;
}

Step 2: Eliminate one if

If you are still on the same line, you will now have the "The if statement is redundant" hint.



Accept this hint for the following reduction:

public static boolean containsAny(Collection<?> source,
                                  Collection<?> candidates) {
    if (isEmpty(source) || isEmpty(candidates)) {
        return false;
    }
    return candidates.stream()
                .anyMatch((candidate) -> (source.contains(candidate)));
}

Step 3: Eliminate final if

You can now see how simple the method has become. The first if can be combined with the return statement; this won't show up as a hint, but the change is simple.

public static boolean containsAny(Collection<?> source,
                                  Collection<?> candidates) {
    return !isEmpty(source) &&
           !isEmpty(candidates) &&
           candidates.stream()
                   .anyMatch((candidate) -> (source.contains(candidate)));
}

Step 4: Pretty it Up

You can now further refine this as the parentheses are not required for the lambda argument, likewise the lambda body is not required to be parenthesised. I also shorten the lambda argument name.

public static boolean containsAny(Collection<?> source,
                                  Collection<?> candidates) {
    return !isEmpty(source) && !isEmpty(candidates) &&
           candidates.stream().anyMatch(x -> source.contains(x));
}

Summary

The method is greatly simplified by removing two if statements and a for loop. We have reduced the vertical footprint of the body from 9 lines to 2, a 78% reduction, and it could fit on a single line if you're that inclined. What's really cool about this function is that anyMatch can be changed to allMatch or noneMatch, although you may want to handle the isEmpty checks differently in those cases.
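To make that last point concrete, here's a minimal sketch of the allMatch and noneMatch variants. The method names containsAll and containsNone are my own for illustration (they are not taken from spring-core), and how each one treats null or empty input is an assumption you should adjust to your needs.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

final class MatchVariants {

    // Same helper as shown earlier in the post.
    static boolean isEmpty(Collection<?> collection) {
        return collection == null || collection.isEmpty();
    }

    // Hypothetical variant: true only when every candidate is in source.
    // Assumption: null or empty input counts as "no match", like containsAny.
    static boolean containsAll(Collection<?> source,
                               Collection<?> candidates) {
        return !isEmpty(source) && !isEmpty(candidates) &&
               candidates.stream().allMatch(x -> source.contains(x));
    }

    // Hypothetical variant: true when no candidate is in source.
    // Assumption: null or empty input means there is nothing to match,
    // so the isEmpty checks flip from "&& !" to "||".
    static boolean containsNone(Collection<?> source,
                                Collection<?> candidates) {
        return isEmpty(source) || isEmpty(candidates) ||
               candidates.stream().noneMatch(x -> source.contains(x));
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("a", "b", "c");
        System.out.println(containsAll(source, Arrays.asList("a", "b")));
        System.out.println(containsAll(source, Arrays.asList("a", "z")));
        System.out.println(containsNone(source, Arrays.asList("x", "y")));
    }
}
```

Note how the empty checks had to change for containsNone; that is the "handle isEmpty differently" part.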

Extract Key Value From Collection


I often need to take a collection of objects, let's say Widgets, each of which has an id property, and return a HashSet containing each id that can be used for quick O(1) contains(Object) operations.

Before

Let us take a simplified example where we pass a collection of Widget objects to a function that returns a Set<String> containing the id property of each one.

public Set<String> getKeySet(Collection<Widget> source) {
    Set<String> ids = new HashSet<>();
    for (Widget w : source) {
        ids.add(w.getId());
    }
    return ids;
}

Step 1: Remove for Loop

Within NetBeans you will see it has identified that the for-each loop can be converted into a functional stream operation.



Accept this to have the function transformed.

public Set<String> getKeySet(Collection<Widget> source) {
    Set<String> ids = new HashSet<>();
    source.stream().forEach((w) -> {
        ids.add(w.getId());
    });
    return ids;
}

It looks like it's complicated the operation, and yes, it has. But we have a plan to simplify this by not using the forEach and instead using a map and collect.

Step 2: Simplify Expression

While it looks busy, we can simplify the expression: as the lambda body contains only one statement we can remove the braces, and we can remove the parentheses around the lambda parameter too.

public Set<String> getKeySet(Collection<Widget> source) {
    Set<String> ids = new HashSet<>();
    source.stream().forEach(w -> ids.add(w.getId()));
    return ids;
}

Step 3: Map to Keys

Note that in the body of the forEach we are calling the getId method. What we can actually do is use the map function to convert the Widget into a String for the id property before it gets to the forEach.

public Set<String> getKeySet(Collection<Widget> source) {
    Set<String> ids = new HashSet<>();
    source.stream().map(Widget::getId).forEach(id -> ids.add(id));
    return ids;
}

That :: thing is a method reference. What it says to the map function is "Hey, treat this as a lambda expression: when you hand it a Widget object, the result is whatever calling this method on that object returns". This can be really handy as we no longer need to spell out the lambda arguments, which makes the code easier to read.

Okay, again, it looks busy! But notice that by the time the stream reaches the forEach lambda it is already in the correct format: a stream of String objects, one for each id property. Woohoo! We can really simplify this now.

Step 4: Reduce Complexity with a Collector

Now the really cool part happens. Since the stream is already in the right format, why not have the stream return the correct type? We do this with a collector by passing a constructor reference for the collection type to Collectors.toCollection.

public Set<String> getKeySet(Collection<Widget> source) {
    return source.stream().map(Widget::getId)
            .collect(Collectors.toCollection(HashSet::new));
}

Here, everything is controlled by the stream, we don't even have to construct an object!
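If you want to try this yourself, here's a small self-contained sketch. The Widget class is a hypothetical stand-in with only an id, and getKeySetAnySet is a name I made up to show Collectors.toSet(), which is shorter but does not promise a particular Set implementation.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class KeySetDemo {

    // Hypothetical stand-in for the Widget used in this post.
    static final class Widget {
        private final String id;

        Widget(String id) {
            this.id = id;
        }

        String getId() {
            return id;
        }
    }

    // The final version from Step 4. Widget::getId is equivalent to
    // the lambda w -> w.getId().
    static Set<String> getKeySet(Collection<Widget> source) {
        return source.stream().map(Widget::getId)
                .collect(Collectors.toCollection(HashSet::new));
    }

    // Alternative when the concrete Set type does not matter.
    static Set<String> getKeySetAnySet(Collection<Widget> source) {
        return source.stream().map(Widget::getId)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        List<Widget> widgets = Arrays.asList(
                new Widget("a"), new Widget("b"), new Widget("a"));
        Set<String> ids = getKeySet(widgets);
        // Duplicated ids collapse because the result is a Set.
        System.out.println(ids.size());
        System.out.println(ids.contains("b"));
    }
}
```

I'd stick with toCollection(HashSet::new) whenever the O(1) contains behaviour of a HashSet is the reason for building the set in the first place.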

Summary

We have performed a small simplification on the original source and a vertical reduction from 7 lines to 4 (43%), which might not seem like much, but the greater benefit here is that we have moved everything into the stream, which can now be read as it's written. Take a look at the final example and read it out loud the way it's written: "source stream, map to widget id and collect to hash set". This can be extremely handy when trying to read code later.

Using Method References with JdbcTemplate

While there are many persistence frameworks available I still use Spring's JdbcTemplate, as our environment has multiple databases within the one application, making managing a persistence context difficult.

Never mind me though, let me show you something cool you can do with method references.

Before

Here we have a situation where you have to implement a RowMapper instance. There are two ways this is normally done. The first, quick way is with an anonymous class within the call to the query function.

jdbcTemplate.query("select * from customers", new RowMapper<Customer> () {
    @Override
    public Customer mapRow(ResultSet rs, int i) throws SQLException {
        Customer cust = new Customer();
        cust.setId(rs.getString("id"));
        // ... set other fields.
        return cust;
    }
});

Usually though you find that you need to reuse the mapper, so you stick it in a dedicated class and create a DEFAULT static instance like this:

public final class CustomerRowMapper implements RowMapper<Customer> {

    public static final CustomerRowMapper DEFAULT
            = new CustomerRowMapper();

    @Override
    public Customer mapRow(ResultSet rs, int i) throws SQLException {
        Customer cust = new Customer();
        cust.setId(rs.getString("id"));
        // ... set other fields.
        return cust;
    }

}

And then use it in your query:

jdbcTemplate.query("select * from customers",
                   CustomerRowMapper.DEFAULT);

So the problem is solved. But one thing that would be cool is the ability to group all common domain mappers in one class, and simply have static methods that perform the mapping. Well, now with method references you can do exactly that.

The first thing we do is convert the prior mapper into a simple class with a static method that handles the mapping. We no longer need to implement the interface, but the method must still adhere to the contract: the same parameters, return type and exceptions as RowMapper.mapRow.

public final class SalesRowMapper {

    public static Customer mapCustomerRow(ResultSet rs, int i)
            throws SQLException {
        Customer cust = new Customer();
        cust.setId(rs.getString("id"));
        // ... set other fields.
        return cust;
    }

}

And then use this as follows:

jdbcTemplate.query("select * from customers",
                   SalesRowMapper::mapCustomerRow);

This might not look like a huge benefit over the dedicated RowMapper implementation, however you can now put all your domain mappers in one place, or have row mappers calling each other for base objects sharing similar structures.
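Why does a plain static method work here? An interface with a single abstract method can be satisfied by any method reference of the same shape. Here's a minimal sketch of that idea without Spring or a database: SimpleRowMapper and query are hypothetical stand-ins I wrote for RowMapper and jdbcTemplate.query, and a Map plays the role of the ResultSet, so treat it as an illustration of the mechanism rather than of Spring's actual implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MapperContractDemo {

    // Hypothetical stand-in for Spring's RowMapper<T>: one abstract
    // method, so a lambda or method reference can implement it.
    @FunctionalInterface
    interface SimpleRowMapper<T> {
        T mapRow(Map<String, String> row, int rowNum) throws Exception;
    }

    // Minimal illustrative domain object.
    static final class Customer {
        String id;
        String name;
    }

    // Does not implement SimpleRowMapper, but has a compatible shape.
    static Customer mapCustomerRow(Map<String, String> row, int rowNum) {
        Customer cust = new Customer();
        cust.id = row.get("id");
        cust.name = row.get("name");
        return cust;
    }

    // Hypothetical stand-in for jdbcTemplate.query: maps every row and
    // rethrows checked exceptions as unchecked ones.
    static <T> List<T> query(List<Map<String, String>> rows,
                             SimpleRowMapper<T> mapper) {
        List<T> results = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            try {
                results.add(mapper.mapRow(rows.get(i), i));
            } catch (Exception e) {
                throw new IllegalStateException("Row " + i + " failed", e);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, String> row = new HashMap<>();
        row.put("id", "42");
        row.put("name", "Ada");
        List<Customer> customers = query(Collections.singletonList(row),
                                         MapperContractDemo::mapCustomerRow);
        System.out.println(customers.get(0).name);
    }
}
```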

As Dan Ciborowski has pointed out, a builder pattern would make this even more expressive and the method for mapping customers very clean. Hypothetically the mapCustomerRow method might look something like this:

public static Customer mapCustomerRow(ResultSet rs, int i)
        throws SQLException {
    return Customer.builder()
            .id(rs.getString("id"))
            .name(rs.getString("name"))
            .build();
}
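Customer.builder() above is hypothetical, so here's one way such a builder could be sketched. The field names and the nested Builder class are my assumptions for illustration only; your Customer will likely have more fields.

```java
final class CustomerBuilderDemo {

    // Hypothetical immutable Customer with a fluent builder.
    static final class Customer {
        private final String id;
        private final String name;

        private Customer(Builder builder) {
            this.id = builder.id;
            this.name = builder.name;
        }

        static Builder builder() {
            return new Builder();
        }

        String getId() {
            return id;
        }

        String getName() {
            return name;
        }

        static final class Builder {
            private String id;
            private String name;

            // Each setter returns the builder so calls can be chained.
            Builder id(String id) {
                this.id = id;
                return this;
            }

            Builder name(String name) {
                this.name = name;
                return this;
            }

            Customer build() {
                return new Customer(this);
            }
        }
    }

    public static void main(String[] args) {
        Customer cust = Customer.builder().id("42").name("Ada").build();
        System.out.println(cust.getId() + " " + cust.getName());
    }
}
```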

Group Objects Into Map

Often you need to group objects by a particular property into a map whose values are lists of the objects sharing that property. This is usually a tedious task that can easily contain errors, e.g. forgetting to add a list to the map for a given key if it does not exist and then trying to add to it.

Before

One such implementation that groups customers by state might look like the following:

public static Map<String, List<Customer>> 
        groupByState(Collection<Customer> custs) {
    Map<String, List<Customer>> res = new HashMap<>();
    for (Customer cust : custs) {
        if (!res.containsKey(cust.getState())) {
            res.put(cust.getState(), new ArrayList<Customer>());
        }
        res.get(cust.getState()).add(cust);
    }
    return res;
}

TIP: with JDK 8 the compiler is now smart enough that the explicit type argument in new ArrayList<Customer>() is no longer required; new ArrayList<>() is enough. JDK 7 required that this be explicitly defined.

NetBeans will give you the "Use functional operation" hint. If you accept it, the result will look something like the following:

public static Map<String, List<Customer>> 
        groupByState(Collection<Customer> custs) {
    Map<String, List<Customer>> res = new HashMap<>();
    custs.stream().map((cust) -> {
        if (!res.containsKey(cust.getState())) {
            res.put(cust.getState(), new ArrayList<>());
        }
        return cust;
    }).forEach((cust) -> {
        res.get(cust.getState()).add(cust);
    });
    return res;
}

This really isn't what we want. To get to where we want to go we really need to think about the collect method.

Use a Collector to Group By

I'm not going to attempt to show how to refactor this as it's easier to just start from scratch, it's barely any code :)

As before, we can take a look at the Collectors methods to see what we could use. We find the method Collectors.groupingBy. Making use of this method really is simple.

public static Map<String, List<Customer>> 
        groupByState(Collection<Customer> custs) {
    return custs.stream()
            .collect(Collectors.groupingBy(Customer::getState));
}

You can simplify further by statically importing Collectors.groupingBy:

public static Map<String, List<Customer>> 
        groupByState(Collection<Customer> custs) {
    return custs.stream().collect(groupingBy(Customer::getState));
}

Summary

We have taken a complex 11 line block and transformed it into a 3 line block, giving a reduction of roughly 73%, which is super simple to read from left to right.
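Here's a runnable sketch of the grouping with some sample data. The Customer class and its values are made up for illustration; only getState is needed for this example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class GroupByStateDemo {

    // Hypothetical stand-in for the Customer used in this post.
    static final class Customer {
        private final String name;
        private final String state;

        Customer(String name, String state) {
            this.name = name;
            this.state = state;
        }

        String getName() {
            return name;
        }

        String getState() {
            return state;
        }
    }

    // Same method as above: one map entry per state, each holding the
    // list of customers in that state.
    static Map<String, List<Customer>>
            groupByState(Collection<Customer> custs) {
        return custs.stream()
                .collect(Collectors.groupingBy(Customer::getState));
    }

    public static void main(String[] args) {
        List<Customer> custs = Arrays.asList(
                new Customer("Ann", "NSW"),
                new Customer("Bob", "VIC"),
                new Customer("Cat", "NSW"));
        Map<String, List<Customer>> byState = groupByState(custs);
        System.out.println(byState.get("NSW").size());
        System.out.println(byState.get("VIC").get(0).getName());
    }
}
```

One difference from the loop version worth knowing: groupingBy throws a NullPointerException when the classifier returns null, whereas the HashMap loop would happily group customers under a null state key.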

Nested Grouping

Building on the prior example, say we wanted to further group our customers into state and city.

Before

As we are building on the prior example, we use it to represent an even more complicated method that must create nested Map structures.

public static Map<String, Map<String, List<Customer>>>
        groupByStateCity(Collection<Customer> customers) {
    Map<String, Map<String, List<Customer>>> res = new HashMap<>();
    for (Customer cust : customers) {
        if (!res.containsKey(cust.getState())) {
            res.put(cust.getState(),
                    new HashMap<String, List<Customer>>());
        }
        if (!res.get(cust.getState()).containsKey(cust.getCity())) {
            res.get(cust.getState()).put(cust.getCity(),
                                         new ArrayList<Customer>());
        }
        res.get(cust.getState()).get(cust.getCity()).add(cust);
    }
    return res;
}

After

The above is horrible and hard to maintain; lots of things can go wrong. We can greatly simplify this by passing another groupingBy collector to the collector.

Again, I'm not going to attempt to show how to refactor this.

public static Map<String, Map<String, List<Customer>>>
        groupByStateCity(Collection<Customer> customers) {
    return customers.stream()
            .collect(Collectors.groupingBy(Customer::getState,
                             Collectors.groupingBy(Customer::getCity)));
}

This is now super compact. It can again be further simplified if you statically import Collectors.groupingBy.

public static Map<String, Map<String, List<Customer>>>
        groupByStateCity(Collection<Customer> customers) {
    return customers.stream()
            .collect(groupingBy(Customer::getState,
                                groupingBy(Customer::getCity)));
}
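Here's a runnable sketch of the nested grouping with sample data. The Customer class is again a made-up stand-in, and countByState is an extra method I added to show that the second argument of groupingBy can be any collector, such as Collectors.counting().

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class NestedGroupingDemo {

    // Hypothetical stand-in for the Customer used in this post.
    static final class Customer {
        private final String name;
        private final String state;
        private final String city;

        Customer(String name, String state, String city) {
            this.name = name;
            this.state = state;
            this.city = city;
        }

        String getState() {
            return state;
        }

        String getCity() {
            return city;
        }
    }

    // Same method as above: state -> city -> customers.
    static Map<String, Map<String, List<Customer>>>
            groupByStateCity(Collection<Customer> customers) {
        return customers.stream()
                .collect(Collectors.groupingBy(Customer::getState,
                         Collectors.groupingBy(Customer::getCity)));
    }

    // Extra illustration: a different downstream collector that counts
    // customers per state instead of listing them.
    static Map<String, Long> countByState(Collection<Customer> customers) {
        return customers.stream()
                .collect(Collectors.groupingBy(Customer::getState,
                         Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Customer> customers = Arrays.asList(
                new Customer("Ann", "NSW", "Sydney"),
                new Customer("Bob", "NSW", "Newcastle"),
                new Customer("Cat", "NSW", "Sydney"),
                new Customer("Dan", "VIC", "Melbourne"));
        System.out.println(
                groupByStateCity(customers).get("NSW").get("Sydney").size());
        System.out.println(countByState(customers).get("NSW"));
    }
}
```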

Conclusion


I've given just 5 examples of situations I have found crop up, which you can reuse in your own code. There are many examples all over about lambdas in collections; I highly recommend reading Brian Goetz's article on The State of the Lambda: Libraries Edition.

There is enough interest in LINQ from Java developers that I'm thinking of writing Java 8 versions of the 101-LINQ Samples. Some may not be possible, such as those that make use of anonymous types, though I think the exercise would be useful.