Towards More Functional Java using Lambdas as Predicates

Previously I showed an example that transformed a map of query parameters into a SOLR search string. The pre-Java 8 code used a traditional for loop with a conditional, incrementally building the result in a StringBuilder. The Java 8 code streamed over the map entries, mapping (transforming) each entry to a string of the form "key:value", and finally used a Collector to join those query fragments together. This is a common pattern in functional-style code: a for loop transforms one collection of objects into a collection of different objects, optionally filters some of them out, and optionally reduces the collection to a single value. You can almost always replace a for loop containing transformation, conditional filtering, and reduction with a Java 8 stream using map, filter, and reduce (collect) operations.
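
To make the shape of that pattern concrete, here is a minimal sketch (the Person class and people list are hypothetical, purely for illustration):


// Hypothetical example: collect the names of all adults.
// filter drops elements, map transforms them, and collect reduces
// the stream to a single result.
List<String> adultNames = people.stream()
        .filter(person -> person.getAge() >= 18)
        .map(Person::getName)
        .collect(Collectors.toList());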

But in addition to the stream API, Java 8 also introduced some nice new API methods that make certain things much simpler. For example, suppose we have the following method to remove all map entries for a given set of keys. In the example code, dataCache is a ConcurrentMap and deleteKeys is the set of keys we want to remove from that cache. Here is the original code I came across:


public void deleteFromCache(Set<String> deleteKeys) {
    Iterator<Map.Entry<String, Object>> iterator = dataCache.entrySet().iterator();
    while (iterator.hasNext()) {
        Map.Entry<String, Object> entry = iterator.next();
        if (deleteKeys.contains(entry.getKey())) {
            iterator.remove();
        }
    }
}

Now, you could argue there are better ways to do this, e.g. iterate the delete keys and remove each mapping using the Map#remove(Object key) method. For example:


public void deleteFromCache(Set<String> deleteKeys) {
    for (String deleteKey : deleteKeys) {
        dataCache.remove(deleteKey);
    }
}

The code using the for loop certainly seems cleaner than using the Iterator in this case, though both are functionally equivalent. Can we do better? Java 8 introduced the removeIf method as a default method, not in Map but in the Collection interface. This new method "removes all of the elements of this collection that satisfy the given predicate", to quote the Javadocs. It accepts one argument, a Predicate, which is a functional interface introduced in Java 8 and can therefore be used in lambda expressions. Let's first implement this as a regular old anonymous inner class, which you can always do even in Java 8. It looks like:


public void deleteFromCache(Set<String> deleteKeys) {
    dataCache.entrySet().removeIf(new Predicate<Map.Entry<String, Object>>() {
        @Override
        public boolean test(Map.Entry<String, Object> entry) {
            return deleteKeys.contains(entry.getKey());
        }
    });
}

As you can see, we first get the map's entry set via the entrySet method and call removeIf on it, supplying a Predicate that tests whether the set of deleteKeys contains the entry key. If this test returns true, the entry is removed. Since Predicate is annotated with @FunctionalInterface, it can be supplied as a lambda expression, a method reference, or a constructor reference, according to the Javadoc. So let's take the first step and convert the anonymous inner class into a lambda expression:


public void deleteFromCache(Set<String> deleteKeys) {
    dataCache.entrySet().removeIf((Map.Entry<String, Object> entry) ->
        deleteKeys.contains(entry.getKey()));
}

In the above, we've replaced the anonymous class with a lambda expression that takes a single Map.Entry argument. But, Java 8 can infer the argument types of lambda expressions, so we can remove the explicit (and a bit noisy) type declarations, leaving us with the following cleaner code:


public void deleteFromCache(Set<String> deleteKeys) {
    dataCache.entrySet().removeIf(entry -> deleteKeys.contains(entry.getKey()));
}
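
As an aside, since the predicate merely tests key membership, we could operate on the map's key set instead and pass a method reference rather than a lambda - removing a key from the keySet view removes the corresponding mapping. That variation (my own aside, not from the original code) looks like:


public void deleteFromCache(Set<String> deleteKeys) {
    // keySet() is a live view of the map, so removing a key removes its mapping
    dataCache.keySet().removeIf(deleteKeys::contains);
}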

The removeIf version is quite a bit nicer than the original code using an explicit Iterator. But how does it compare to the second example, which looped through the keys with a simple for loop and called remove on each one? The line counts really aren't that different, so assuming they are functionally equivalent, perhaps it is just a style preference. The explicit for loop is traditional imperative style, whereas removeIf has a more functional flavor. If you look at the actual implementation of removeIf in the Collection interface, it uses an Iterator under the covers, just as the first example in this post does.

So practically there is no difference in functionality. But removeIf could theoretically be implemented for certain types of collections to perform the operation in parallel, perhaps only for collections over a certain size where parallelizing the operation can be shown to have benefits. Really, though, this simple example is about separation of concerns, i.e. separating the logic of traversing the collection from the logic that determines whether or not an element is removed.

For example, if a code base needs to remove elements from collections in many different places, chances are good that it will end up with similar traversal logic intertwined with removal logic throughout. In contrast, using removeIf leaves only the removal predicate in those locations - and that predicate is really your business logic. And if at some later point the traversal logic in the Java collections framework were improved, e.g. parallelized for large collections, then every location using removeIf would automatically receive the benefit, whereas code that combines traversal and removal logic using an explicit Iterator or loop would not.

In this case, and many others, I'd argue separation of concerns is a much better reason to prefer the functional style over the imperative style. Separation of concerns leads to better, cleaner code and easier code re-use precisely because those concerns can be implemented separately and tested separately, which results in not only cleaner production code but also cleaner test code. All of that leads to more maintainable code, which means new features and enhancements can be delivered faster and with less chance of breaking existing code. Until the next post in this ad-hoc series on Java 8 features and a functional style, happy coding!


Towards more functional Java using Streams and Lambdas

In the last post I showed how the Java 7 try-with-resources feature reduces boilerplate code and, more importantly, eliminates an entire class of errors resulting from unclosed resources. In this post, the first in an ad-hoc series on Java 8 features, I'll show how the stream API can reduce the lines of code and also make the code more readable, maintainable, and less error-prone.

The following code is from a simple back-end service that lets us query metadata about messages flowing through various systems. It takes a map of key-value pairs and creates a Lucene query that can be submitted to SOLR to obtain results. It is primarily used by developers to verify behavior in a distributed system, and it does not support very sophisticated queries, since it only ANDs the key-value pairs together to form the query. For example, given a parameter map containing the (key, value) pairs (lastName, Smith) and (firstName, Bob), the method would generate the query "lastName:Smith AND firstName:Bob". As I said, not very sophisticated.

Here is the original code (where AND, COLON, and DEFAULT_QUERY are constants):


public String buildQueryString(Map<String, String> parameters) {
    int count = 0;
    StringBuilder query = new StringBuilder();

    for (Map.Entry<String, String> entry : parameters.entrySet()) {
        if (count > 0) {
            query.append(AND);
        }
        query.append(entry.getKey());
        query.append(COLON);
        query.append(entry.getValue());
        count++;
    }

    if (parameters.size() == 0) {
        query.append(DEFAULT_QUERY);
    }

    return query.toString();
}

The core business logic should be very simple: we only need to iterate the parameter map, join each key and value with a colon, and join the resulting pairs together. But the code above, while not terribly hard to understand, has a lot of noise. First off, it uses two mutable variables (count and query) that are modified within the for loop. The first thing in the loop is a conditional needed to determine whether to append the AND constant, since we only want to do that after the first key-value pair has been added to the query. Next, joining the keys and values is done by concatenating them, one by one, onto the StringBuilder holding the query. Then the count must be incremented so that subsequent loop iterations properly include the AND delimiter. After the loop there is another conditional that appends DEFAULT_QUERY if there are no parameters, and finally we convert the StringBuilder to a String and return it.

Here is the buildQueryString method after refactoring it to use the Java 8 stream API:


public String buildQueryString(Map<String, String> parameters) {
    if (parameters.isEmpty()) {
        return DEFAULT_QUERY;
    }

    return parameters.entrySet().stream()
            .map(entry -> String.join(COLON, entry.getKey(), entry.getValue()))
            .collect(Collectors.joining(AND));
}

This code does the exact same thing, but in only 6 lines of code (counting the map and collect lines as separate even though technically they are part of the stream call chain) instead of 15. But just measuring lines of code isn't everything. The main difference here is the lack of mutable variables, no external iteration via explicit looping constructs, and no conditional statements other than the empty check which short circuits and returns DEFAULT_QUERY when there are no parameters. The code reads like a functional declaration of what we want to accomplish: stream over the parameters, convert each (key, value) to "key:value" and join them all together using the delimiter AND.

The specific Java 8 features we've used here start with the stream() method to convert the map entry set to a Java 8 java.util.stream.Stream. We then use the map operation on the stream, which applies a function (String.join) to each element (Map.Entry) in the stream. Finally, we use the collect method to reduce the elements using the joining collector into the resulting string that is the actual query we wanted to build. In the map method we've also made use of a lambda expression to specify exactly what transformation to perform on each map entry.

By removing explicit iteration and mutable variables, the code is more readable, in that a developer seeing it for the first time will have an easier and quicker time understanding what it does. Note that much of the how has been removed: the iteration is now implicit in the Stream, and the joining collector does the work of inserting a delimiter between elements. You're now declaring what you want to happen, instead of explicitly performing all the tedium yourself. This is more of a functional style than most Java developers are used to, and at first it can be a bit jarring, but as you practice and get used to it, you'll probably come to like it, and you'll find yourself able to read and write this style of code much more quickly than traditional code with lots of loops and conditionals. Generally there is also less code than with traditional looping and control structures, which is another benefit for maintenance. I won't go so far as to say Java 8 is a functional language like Clojure or Haskell - it isn't - but code like this has a more functional flavor to it.

There is now a metric ton of content on the internet related to Java 8 streams, but in case this is all new to you, or you're just looking for a decent place to begin learning more in-depth, the API documentation for the java.util.stream package is a good place to start. Venkat Subramaniam's Functional Programming in Java is another good resource, and at less than 200 pages can be digested pretty quickly. And for more on lambda expressions, the Lambda Expressions tutorial in the official Java Tutorials is a decent place to begin. In the next post, we'll see another example where a simple Java 8 API addition combined with a lambda expression simplifies code, making it more readable and maintainable.


Reduce Java boilerplate using try-with-resources

Java 8 has been out for a while, and Java 7 has been out even longer. But even so, many people still unfortunately are not taking advantage of some of the new features, many of which make reading and writing Java code much more pleasant. For example, Java 7 introduced some relatively simple things like strings in switch statements, underscores in numeric literals (e.g. 1_000_000 makes the magnitude much easier to see than 1000000), and the try-with-resources statement. Java 8 went a lot further and introduced lambda expressions, the streams API, a new date/time API based on the Joda-Time library, Optional, and more.
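
As a quick refresher, here is a trivial, made-up illustration of two of those Java 7 niceties together:


// Underscores in numeric literals and strings in switch (both Java 7+)
long oneMillion = 1_000_000;
String env = "production";
switch (env) {
    case "production":
        System.out.println("Be careful with " + oneMillion + " dollars!");
        break;
    case "development":
        System.out.println("Hack away...");
        break;
    default:
        System.out.println("Unknown environment: " + env);
}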

In this blog and in a few subsequent posts, I will take a simple snippet of code from a real project, and show what the code looked like originally and what it looked like after refactoring it to be more readable and maintainable. To start, this blog will actually tackle the try-with-resources statement introduced in Java 7. Many people even in 2016 still seem not to be aware of this statement, which not only makes the code less verbose, but also eliminates an entire class of errors resulting from failure to close I/O or other resources.

Without further ado (whatever ado actually means), here is a method that was used to check port availability when starting up services.


public boolean isPortAvailable(final int port) {
    ServerSocket serverSocket = null;
    DatagramSocket dataSocket = null;

    try {
        serverSocket = new ServerSocket(port);
        serverSocket.setReuseAddress(true);
        dataSocket = new DatagramSocket(port);
        dataSocket.setReuseAddress(true);
        return true;
    } catch (IOException e) {
        return false;
    } finally {
        if (dataSocket != null) {
            dataSocket.close();
        }

        if (serverSocket != null) {
            try {
                serverSocket.close();
            } catch (IOException e) {
                // ignored
            }
        }
    }
}

The core logic for the above code is pretty simple: open a ServerSocket and a DatagramSocket, and if both open without throwing an exception, the port is available. It's all the extra boilerplate and exception handling that makes the code so lengthy and error-prone: we need to make sure to close the sockets in the finally block, being careful to first check that they are not null. For good measure, the ServerSocket#close method throws yet another IOException, which we simply ignore but are required to catch nonetheless. That's a lot of extra code obscuring the simple core logic.

Here's the refactored version which makes use of the try-with-resources statement from Java 7.


public boolean isPortAvailable(final int port) {
    try (ServerSocket serverSocket = new ServerSocket(port);
         DatagramSocket dataSocket = new DatagramSocket(port)) {
        serverSocket.setReuseAddress(true);
        dataSocket.setReuseAddress(true);
        return true;
    } catch (IOException e) {
        return false;
    }
}

As you can hopefully see, this code has the same core logic, but much less of the boilerplate. There is not only less code (7 lines instead of 22), but the code is much more readable since only the core logic remains. We are still catching the IOException that can be thrown by the ServerSocket and DatagramSocket constructors, but we no longer need to deal with the routine closing of those socket resources. The try-with-resources statement does that task for us, automatically closing any resources opened in the declaration statement that immediately follows the try keyword.

The one catch is that the declared resources must implement the AutoCloseable interface. (The pre-existing Closeable interface was retrofitted in Java 7 to extend AutoCloseable.) Since the Java APIs make extensive use of Closeable and AutoCloseable, most things you'll want to use can be handled via try-with-resources. Classes that don't implement AutoCloseable cannot be used directly in try-with-resources statements. For example, if you are unfortunate enough to still need to deal with XML, say via the old-school XMLStreamReader, you are out of luck, since it implements neither Closeable nor AutoCloseable. I generally fix those types of things by creating a small wrapper/decorator class, e.g. CloseableXMLStreamReader, but sometimes it simply isn't worth the trouble unless you are using it in many different places.
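
Such a wrapper can be tiny. Here is a minimal sketch of what a CloseableXMLStreamReader might look like (an illustration of the idea, not the exact class I use):


import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Adapts XMLStreamReader to AutoCloseable so it works in try-with-resources
public class CloseableXMLStreamReader implements AutoCloseable {

    private final XMLStreamReader reader;

    public CloseableXMLStreamReader(XMLStreamReader reader) {
        this.reader = reader;
    }

    public XMLStreamReader getReader() {
        return reader;
    }

    @Override
    public void close() throws XMLStreamException {
        reader.close();
    }
}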

For more information on try-with-resources, the Java tutorials on Oracle's website has a more in-depth article here. In subsequent posts, I'll show some before/after code that makes use of Java 8 features such as the stream API and lambda expressions.


Serving Uncompressed Rails Assets Via a URL Parameter

Let's say you want to debug a javascript issue in your Rails application.  Easy enough.  You can use a browser tool such as Firebug to set a breakpoint and trace through the live code.  Now let's throw a wrench into the situation and say that you need to do this in your production environment, which is using the asset pipeline, which has compressed all of your beautifully crafted javascript into a single application.js file.  This just became a lot harder, because all of your javascript has been minified/compressed and no longer maintains the nice, human-readable structure that you coded it in.

Wouldn't it be nice to be able to set a url parameter (?debug=true) and immediately serve your uncompressed assets?  If that were the case, you'd be able to use your browser tools (Firebug) to set some breakpoints and trace through live code in your production environment.  Well, here's how I did it...

Create two javascript manifest files: one that will be compressed and one that will be left uncompressed.  I named mine application.js and application-debug.js.

// application.js
//= require extjs-all
//= require openlayers
//= require other_javascript_files

/**
* application-debug.js
* disable_asset_compression <== this is important later!
**/
//= require extjs-all-debug
//= require openlayers-debug
//= require other_javascript_files

Create a before_action in your application_controller.rb that checks for your "debug" url parameter.  If the "debug" url parameter exists, set a "debug" instance variable that can be read from your root layout file (application.html.erb).

class ApplicationController < ActionController::Base
  before_action :debug_assets

  def debug_assets
    if params[:debug]
      @debug = true
    end
  end
  # ...
end

Inside the head tags of your root layout file (application.html.erb), check for the "debug" instance variable.  Based on its existence, serve up the uncompressed or compressed version of your javascript.

<% if @debug %>
  <%= javascript_include_tag "application-debug" %>
<% else %>
  <%= javascript_include_tag "application" %>
<% end %>

Pretty simple so far, right?  What happens when you need to precompile the assets and you've set your js_compressor to Uglifier?  Uglifier will compress both versions of your javascript manifests.  To fix this, we'll need to override a couple of methods in Uglifier by defining our own js_compressor that inherits from Uglifier.

First, define ConditionalUglifier in your application's lib directory.  Notice that the override looks for the string "disable_asset_compression" in a comment block in your manifest file.  When it sees this string, it will disable compression for that manifest.

class ConditionalUglifier < Uglifier
  def initialize(options = {})
    # Note: merge returns a new hash (it does not mutate options),
    # so the merged result must be what gets passed to super
    super(options.merge(:comments => :all))
  end

  def compress(source)
    # Skip compression for any manifest containing the marker string
    if source =~ /disable_asset_compression/
      source
    else
      super(source)
    end
  end
end

Next, set your js_compressor in the appropriate environment files (production and wherever else you precompile assets).

# config/environments/production.rb
# ...
config.assets.js_compressor = ConditionalUglifier.new
# ...

That's it!  Precompile and deploy.  To serve the uncompressed versions of your javascript assets, simply put "debug=true" as a parameter in your url.  Happy debugging!


Conditionally Precompiling Assets with Rails and Capistrano

This week I 'capified' a Rails application I've been developing.  It's a basic Rails 4 app using Git for version control and is deployed across multiple environments: development-integration, test-integration, and production.  Along with Capistrano's documentation, there are quite a few Capistrano 3 setup/configuration tutorials on the web that are very helpful to get basic deployments working across multiple environments.

Basic Capistrano setup steps consist of:

  • Update Gemfile to include capistrano gems:
gem 'capistrano', '~> 3.1'
gem 'capistrano-rails', '~> 1.1'
  • Run Capistrano install task:
bundle exec cap install
  • Configure Capfile to require what you need (asset precompilation, migrations, etc):
require 'capistrano/bundler'
require 'capistrano/rails/assets'
require 'capistrano/rails/migrations'
  • Configure environment specific information in the deploy/<env>.rb files.

So far so good, right?  Now I want to throw a wrench in the spokes.  I don't want to precompile assets in the development-integration environment, but I would like to precompile assets in the other environments.  Why?  Well, the application is very javascript heavy.  It's useful to be able to use browser tools, such as Firebug, to set breakpoints in un-minified, uncompressed code in an integration environment.  (Yes, I'm aware that some bugs manifest themselves only after assets are precompiled, and this wouldn't solve that problem.)

Unfortunately, Capistrano doesn't provide a simple environment flag to turn asset precompilation on or off per environment.  Out of the box, it's all or nothing.  However, you can clear the existing 'deploy:compile_assets' task provided by Capistrano and then re-define it in deploy.rb.

# deploy.rb

namespace :deploy do
  Rake::Task['deploy:compile_assets'].clear
  desc 'Compile assets'
  task :compile_assets => [:set_rails_env] do
    unless fetch(:rails_env) == 'development-integration'
      invoke 'deploy:assets:precompile'
      invoke 'deploy:assets:backup_manifest'
    end
  end
end

Thankfully, the provided 'compile_assets' task is only ~4 lines of code, making it easy to re-define.  The conditional statement above is just a simple example of how to check the rails environment.  This could also be accomplished by setting a custom flag in each of the deploy/<env>.rb files and checking that flag in your custom 'compile_assets' task.


Code Quality Unravelled: Part 2 Maven Multimodule Support in Your Site

This is the second installment in a series of posts about code quality, the tools that are available to track quality, and how to configure various systems to get all of the reporting.  There are a lot of posts on the internet about all of the tools and configurations that I'm going to describe; however, I thought it would be useful to have it all in one place.


Code Quality Unravelled: Part 1


What are multi-module projects?

There are times when you have a project that is made up of many components that you want to build together. In this setup there is a parent pom.xml, and each module lives in a subdirectory with its own pom.xml.
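
For reference, a minimal parent pom declares pom packaging and lists its modules (the module names below are hypothetical):


<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>com.mycompany</groupId>
  <artifactId>test-java-parent</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>pom</packaging>

  <!-- Each module below is a subdirectory containing its own pom.xml -->
  <modules>
    <module>module-core</module>
    <module>module-web</module>
  </modules>
</project>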

Aggregating Results

Multi-module projects have the ability to aggregate the quality reports and project information into one site. The site will include links to drill down into each module as well as rolled-up reports.

JXR, Javadoc, Checkstyle, JavaNCSS

Some plugins know how to handle aggregation without any extra configuration. JXR, Javadoc, Checkstyle, and JavaNCSS support aggregation with the same configuration that I listed in Part 1.


<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-jxr-plugin</artifactId>
  <version>2.5</version>
</plugin>

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-javadoc-plugin</artifactId>
  <version>2.10.3</version>
  <configuration>
    <additionalparam>-Xdoclint:none</additionalparam>
  </configuration>
</plugin>

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-checkstyle-plugin</artifactId>
  <version>2.16</version>
</plugin>

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>javancss-maven-plugin</artifactId>
  <version>2.1</version>
</plugin>

Cobertura

Adding aggregation to Cobertura is as simple as adding an aggregate tag to the configuration section.


<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>cobertura-maven-plugin</artifactId>
  <version>2.7</version>
  <reportSets>
    <reportSet>
      <id>cobertura</id>
      <reports>
        <report>cobertura</report>
      </reports>
      <configuration>
        <aggregate>true</aggregate>
        <formats>
          <format>html</format>
          <format>xml</format>
        </formats>
      </configuration>
    </reportSet>
  </reportSets>
</plugin>

PMD/CPD, Surefire, and Taglist

In order to add aggregation for PMD, CPD, Surefire, and Taglist, you need to add separate reportSets: one for the individual module report and one for the aggregate.


<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-pmd-plugin</artifactId>
  <version>3.5</version>
  <reportSets>
    <reportSet>
      <id>pmd-report</id>
      <reports>
        <report>pmd</report>
      </reports>
      <configuration>
        <skipEmptyReport>false</skipEmptyReport>
      </configuration>
    </reportSet>

    <reportSet>
      <id>pmd-aggregate</id>
      <inherited>false</inherited>
      <reports>
        <report>pmd</report>
      </reports>
      <configuration>
        <aggregate>true</aggregate>
        <skipEmptyReport>false</skipEmptyReport>
      </configuration>
    </reportSet>

    <reportSet>
      <id>cpd-report</id>
      <reports>
        <report>cpd</report>
      </reports>
      <configuration>
        <skipEmptyReport>false</skipEmptyReport>
      </configuration>
    </reportSet>

    <reportSet>
      <id>cpd-aggregate</id>
      <inherited>false</inherited>
      <reports>
        <report>cpd</report>
      </reports>
      <configuration>
        <aggregate>true</aggregate>
        <skipEmptyReport>false</skipEmptyReport>
      </configuration>
    </reportSet>
  </reportSets>
</plugin>

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-report-plugin</artifactId>
  <version>2.18.1</version>
  <reportSets>
    <reportSet>
      <id>unit-tests</id>
      <reports>
        <report>report-only</report>
      </reports>
      <configuration>
        <linkXRef>true</linkXRef>
        <alwaysGenerateSurefireReport>true</alwaysGenerateSurefireReport>
      </configuration>
    </reportSet>

    <reportSet>
      <id>unit-tests-aggregate</id>
      <inherited>false</inherited>
      <reports>
        <report>report-only</report>
      </reports>
      <configuration>
        <aggregate>true</aggregate>
        <linkXRef>true</linkXRef>
        <alwaysGenerateSurefireReport>true</alwaysGenerateSurefireReport>
      </configuration>
    </reportSet>
  </reportSets>
</plugin>

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>taglist-maven-plugin</artifactId>
  <version>2.4</version>
  <reportSets>
    <reportSet>
      <id>taglist-report</id>
      <reports>
        <report>taglist</report>
      </reports>
      <configuration>
        <tagListOptions>
          <tagClasses>
            <tagClass>
              <displayName>Todo Work</displayName>
              <tags>
                <tag>
                  <matchString>todo</matchString>
                  <matchType>ignoreCase</matchType>
                </tag>
                <tag>
                  <matchString>FIXME</matchString>
                  <matchType>exact</matchType>
                </tag>
              </tags>
            </tagClass>
            <tagClass>
              <displayName>Architecture Review Needed</displayName>
              <tags>
                <tag>
                  <matchString>ARCH-REV</matchString>
                  <matchType>exact</matchType>
                </tag>
              </tags>
            </tagClass>
          </tagClasses>
        </tagListOptions>
      </configuration>
    </reportSet>

    <reportSet>
      <id>taglist-aggregate</id>
      <inherited>false</inherited>
      <reports>
        <report>taglist</report>
      </reports>
      <configuration>
        <aggregate>true</aggregate>
        <tagListOptions>
          <tagClasses>
            <tagClass>
              <displayName>Todo Work</displayName>
              <tags>
                <tag>
                  <matchString>todo</matchString>
                  <matchType>ignoreCase</matchType>
                </tag>
                <tag>
                  <matchString>FIXME</matchString>
                  <matchType>exact</matchType>
                </tag>
              </tags>
            </tagClass>
            <tagClass>
              <displayName>Architecture Review Needed</displayName>
              <tags>
                <tag>
                  <matchString>ARCH-REV</matchString>
                  <matchType>exact</matchType>
                </tag>
              </tags>
            </tagClass>
          </tagClasses>
        </tagListOptions>
      </configuration>
    </reportSet>
  </reportSets>
</plugin>

Not Everyone Plays Nicely

Unfortunately, not all of the reports we have talked about support aggregation. FindBugs and JDepend do not support aggregation, so when you run site you will have to look at these reports within each module instead of at the top level page. You can find more information about why FindBugs doesn't support this on the FindBugs site.

Generating the site

In addition to running the site report as I described in Part 1, there are a few other goals for the site plugin.

  • mvn site:site
    • Runs the site report putting the output in target/site
  • mvn site:stage
    • Runs the site report putting the output in target/staging
  • mvn site:deploy
    • Runs the site report and deploys the output to a location specified in a <distributionManagement> block

The generated reports contain deficiencies that you should be aware of:

  • Running maven site (or site:site) on a multi-module project will create links to the sub-modules that do not work. To get the links to work you will have to run site:stage or site:deploy after site:site.
  • For some reason when running site:stage, the Cobertura report is a blank page. You can find the aggregate report for this in the target/site location.
  • The Surefire Report shows zeros for everything in the rolled up report, but the module reports still contain the information.

For a sample project that includes all of these reporting sections, check out the repo on GitHub.


Code Quality Unravelled: Part 1



Code Quality Unravelled: Part 1 Building a Maven Site

This is the first installment in a series of posts about code quality, the tools that are available to track quality, and how to configure various systems to get all of the quality reporting.  There are a lot of posts on the internet about all of the tools and configurations that I'm going to describe; however, I thought it would be useful to have it all in one place.


Code Quality Unravelled: Part 2


Building a Maven Site

Apache Maven is a build tool for Java. There are a ton of plugins to have your build do all sorts of processing and reporting. One of the key components of Maven is the site report. The site report collects all sorts of information about your project including descriptive information, contact information, license information and also various reports.

The pom.xml file is the place where you can configure all of the build information, dependencies, and reporting tools. The Apache Maven site has a great reference to all of the various components that make up a pom file. For the purposes of this post, I am going to focus on the reporting section of the pom file.

NOTE: This post was written using Maven version 3.2.3.


<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.mycompany</groupId>
    <artifactId>test-java-build</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>jar</packaging>

    <name>test-java-build</name>

    <properties>
      <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <reporting>
      <plugins>
        <!-- This is where we will focus -->
      </plugins>
    </reporting>
</project>

There are a lot of reports that can be generated on the code base using Maven. I'm going to go over how to set up and configure the following tools: Cobertura, PMD/CPD, JXR, Javadoc, JavaNCSS, Surefire, FindBugs, JDepend, Taglist, and Checkstyle.

Cobertura

Cobertura is a tool to report on the amount of code being covered by automated tests.


<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>cobertura-maven-plugin</artifactId>
    <version>2.7</version>
    <reportSets>
      <reportSet>
        <id>cobertura</id>
        <reports>
          <report>cobertura</report>
        </reports>
        <configuration>
          <formats>
            <format>html</format>
            <format>xml</format>
          </formats>
        </configuration>
      </reportSet>
    </reportSets>
</plugin>

This block will cause Cobertura to generate the report in both xml and html format. You can find other configuration options on the plugin site. NOTE: If you are using Java 8 (specifically the new syntax) and start seeing stack traces referencing JavaNCSS, make sure you upgrade to version 2.7 of this plugin. This version brings in Cobertura 2.1.x, which supports Java 8 correctly.

PMD/CPD

PMD analyzes your source code to find common programming flaws. CPD (copy-paste detector), which comes with PMD, looks for duplicate code in your project.


<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-pmd-plugin</artifactId>
    <version>3.5</version>
    <reportSets>
      <reportSet>
        <id>pmd-report</id>
        <reports>
          <report>pmd</report>
        </reports>
        <configuration>
          <skipEmptyReport>false</skipEmptyReport>
        </configuration>
      </reportSet>

      <reportSet>
        <id>cpd-report</id>
        <reports>
          <report>cpd</report>
        </reports>
        <configuration>
          <skipEmptyReport>false</skipEmptyReport>
        </configuration>
      </reportSet>
    </reportSets>
</plugin>

This block will run both PMD and CPD and include the reports regardless of whether there are issues detected or not. Plugin configuration options can be found on the plugin site. One key section to take a look at in the documentation is the Rule Set section. This shows you how to define or customize the rules you want PMD to check against.

JXR

JXR is a plugin that produces an html view of all of the source in the project. This plugin is also used by other reports in order to link directly to the source.


<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-jxr-plugin</artifactId>
    <version>2.5</version>
</plugin>

Javadoc

The Javadoc plugin will generate the Javadoc for your project and include the documentation in the site output.


<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-javadoc-plugin</artifactId>
    <version>2.10.3</version>
</plugin>

Note: if using JDK 8 or newer, the javadoc tool includes a strict validator that checks for broken references and invalid HTML. This validator will fail the build if it finds anything it deems unacceptable. There is a good run-down of this behavior on this blog. To have Maven still present the warnings but not fail the build, a configuration section can be added to the plugin definition like so:


<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-javadoc-plugin</artifactId>
    <version>2.10.3</version>
    <configuration>
      <additionalparam>-Xdoclint:none</additionalparam>
    </configuration>
</plugin>

JavaNCSS

The JavaNCSS Plugin generates metrics for quality and complexity on the project.


<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>javancss-maven-plugin</artifactId>
  <version>2.1</version>
</plugin>

Surefire

The Surefire Report Plugin generates a nice, easy-to-read report of all of your test results.


<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-surefire-report-plugin</artifactId>
    <version>2.18.1</version>
    <reportSets>
      <reportSet>
        <id>unit-tests</id>
        <reports>
          <report>report-only</report>
        </reports>
        <configuration>
          <linkXRef>true</linkXRef>
          <alwaysGenerateSurefireReport>true</alwaysGenerateSurefireReport>
        </configuration>
      </reportSet>
    </reportSets>
</plugin>

This block will generate the surefire report without running the tests a second time (the default "report" goal runs the tests again). The linkXRef configuration option links line references in the tests to the actual source provided by JXR. The alwaysGenerateSurefireReport option generates the surefire report whether or not all of the tests pass.

FindBugs

FindBugs analyzes the code looking for potential bugs.


<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>findbugs-maven-plugin</artifactId>
    <version>3.0.2</version>
</plugin>

JDepend

JDepend looks through the source tree analyzing the quality of the code in terms of extensibility and reusability.


<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>jdepend-maven-plugin</artifactId>
    <version>2.0</version>
</plugin>

Taglist

The Taglist plugin searches the code for various defined tags that are used to highlight areas of code that need more attention (e.g. TODO and FIXME).


<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>taglist-maven-plugin</artifactId>
    <version>2.4</version>
    <reportSets>
      <reportSet>
        <id>taglist-report</id>
        <reports>
          <report>taglist</report>
        </reports>
        <configuration>
          <tagListOptions>
            <tagClasses>
              <tagClass>
                <displayName>Todo Work</displayName>
                <tags>
                  <tag>
                    <matchString>todo</matchString>
                    <matchType>ignoreCase</matchType>
                  </tag>
                  <tag>
                    <matchString>FIXME</matchString>
                    <matchType>exact</matchType>
                  </tag>
                </tags>
              </tagClass>
              <tagClass>
                <displayName>Architecture Review Needed</displayName>
                <tags>
                  <tag>
                    <matchString>ARCH-REV</matchString>
                    <matchType>exact</matchType>
                  </tag>
                </tags>
              </tagClass>
            </tagClasses>
          </tagListOptions>
        </configuration>
      </reportSet>
    </reportSets>
</plugin>

This block configures taglist to search for the TODO tag (regardless of case), FIXME tag, and a custom ARCH-REV tag.

Checkstyle

Checkstyle is similar to PMD and FindBugs, but instead of searching for potential bugs it focuses on keeping a consistent coding style across the project.


<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-checkstyle-plugin</artifactId>
    <version>2.16</version>
</plugin>

The Checkstyle Maven plugin has a lot of options to help configure the level of warnings and errors to include in the generated report. The plugin documentation has some good examples of adding extra configuration to customize the items that Checkstyle will check for you.

Generating the site

Now that we have all of our reports configured, we can easily generate our Maven site by running the following:

mvn clean test site

When the command finishes running, you should have a site directory under the target folder containing all of the files needed to display the html website.

For a sample project that includes all of these reporting sections, check out the repo on GitHub.


Code Quality Unravelled: Part 2



Why isn't Dropwizard validating objects in resource methods?

Dropwizard provides automatic validation of Jersey resource method parameters by simply adding the @Valid annotation. For example, in a method to save a new Person object, you might have code like:


@POST
public Response createPerson(@Valid Person person) {
    Person savedPerson = save(person);

    URI location = UriBuilder.fromResource(PersonResource.class)
            .path(savedPerson.getId().toString())
            .build();

    return Response.created(location).entity(savedPerson).build();
}

By adding @Valid to the person argument, Dropwizard ensures that the Person object will be validated using Hibernate Validator. The Person object will be validated according to the constraints defined on the Person class, for example maybe the @NotEmpty annotation is on first and last name properties. If the object passes validation, method execution continues and the logic to save the new person takes place. If validation fails, however, Dropwizard arranges for a 422 (Unprocessable Entity) response to be sent back to the client, and the resource method is never actually invoked. This is convenient, as it means you don't need any conditional logic in resource methods to manually check if an object is valid. Under the covers, Dropwizard registers its own custom provider class, JacksonMessageBodyProvider, which uses Jackson to parse request entities into objects and perform validation on the de-serialized entities.
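
For reference, a Person class along those lines might look like the following sketch (the original service's actual class is not shown in this post):


// A sketch of a Person with Hibernate Validator constraints (hypothetical;
// the real class from the service is not shown here)
public class Person {

    private Long id;

    @NotEmpty
    private String firstName;

    @NotEmpty
    private String lastName;

    // getters and setters omitted...
}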

Out of the Dropwizard box this automatic validation "just works" due to the above-mentioned JacksonMessageBodyProvider. (For this post, we are assuming Dropwizard 0.8.2, which uses Jersey 2.19.) It worked for us just fine, until on one service it simply stopped working entirely. In other words, no validation took place, and therefore any objects, valid or not, were being passed into resource methods. Since resource method code assumes objects have already been validated, this causes downstream problems. In our case, it manifested as HibernateExceptions thrown when data access code tried to persist the (not validated) objects.

This was quite perplexing, and to make a (very long) debugging story short, it turned out that someone had added one specific dependency in the Maven pom, which triggered auto-discovery of the JacksonFeature via the JacksonAutoDiscoverable. The dependency that had been added (indirectly, more on that later) was:


<dependency>
    <groupId>org.glassfish.jersey.media</groupId>
    <artifactId>jersey-media-json-jackson</artifactId>
    <version>2.19</version>
</dependency>

If you look in the jersey-media-json-jackson-2.19.jar file, there are only five classes. But the upshot is that this JAR specifies auto-discovery for Jackson via the Auto-Discoverable Features mechanism, which causes the JacksonFeature class to register JacksonJaxbJsonProvider as both a MessageBodyReader and a MessageBodyWriter. And due to the vagaries of the way Jersey orders the message body readers, that provider ends up as the first available MessageBodyReader when processing requests, which in turn means the Dropwizard JacksonMessageBodyProvider never gets executed, and as a result no validation is performed!

For some code spelunking, check out the WorkerComparator class in MessageBodyFactory (in Jersey) which is used when sorting readers and writers via a call to Collections.sort(). The Javadoc for the comparator states "Pairs are sorted by distance from required type, media type and custom/provided (provided goes first)." In particular the last bit of that sentence is key, provided goes first - this means the auto-discovered feature (JacksonJaxbJsonProvider) takes precedence over the custom provider registered by Dropwizard (JacksonMessageBodyProvider).

Of course now that we know what is going on, the solution is pretty easy:

Make sure you don't have the jersey-media-json-jackson dependency, either directly or via a transitive dependency.

In our case it had actually come in via a Maven transitive dependency, which made tracking it down harder. An easy way to see exactly what dependencies exist and where they come from is to run mvn dependency:tree, which displays the entire dependency tree for your application.
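
If the dependency is transitive, you can cut it off at the source with a Maven exclusion on the artifact that drags it in (the groupId, artifactId, and version of the offending dependency below are placeholders):


<dependency>
    <!-- Placeholder coordinates for whatever artifact pulls in the JAR -->
    <groupId>com.example</groupId>
    <artifactId>library-that-pulls-in-jackson</artifactId>
    <version>1.0</version>
    <exclusions>
        <exclusion>
            <groupId>org.glassfish.jersey.media</groupId>
            <artifactId>jersey-media-json-jackson</artifactId>
        </exclusion>
    </exclusions>
</dependency>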

Ultimately, while Jersey provides the auto-discovery mechanism, I still prefer explicit configuration so it is very clear exactly what features are present. Dropwizard abstracts some of this from us, i.e. it registers the JacksonMessageBodyProvider in the AbstractServerFactory class (in the createAppServlet method), but since Dropwizard is mostly "just code" it is much easier to deterministically know what it is and isn't doing. So if you suddenly experience a lack of validation in Dropwizard, make sure the jersey-media-json-jackson dependency is not present. If that doesn't work, then you need to figure out what other MessageBodyReader is taking precedence, determine its origin, and eliminate it!

Sample code for this post can be found on GitHub.


Easily integrate Spring into Dropwizard

Dropwizard is a really nice, simple framework for developing RESTful web services. Out of the box it comes with all kinds of nice features by leveraging a collection of mature libraries in a straightforward manner. The main libraries it uses are Jetty for web serving, Jersey for REST, Jackson for JSON processing, along with others such as Google Guava, Hibernate Validator, Liquibase for migrations, and more. It also comes with out of the box support for JDBI, a simple "SQL convenience library for Java", as well as Hibernate.

The JDBI and Hibernate support provide the basics needed to get going with either of those frameworks, and the Hibernate support provides the @UnitOfWork annotation for declarative transaction demarcation, which you can apply to Jersey resource methods. Dropwizard then takes care of automatically opening a Hibernate session, starting a transaction, committing or rolling back that transaction, and finally closing the session when the resource method completes. For many use cases this will be all you need or want.
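
For example, a resource method using that out-of-the-box support might look like the following (a sketch to illustrate the annotation, not code from an actual project):


@GET
@Path("/{id}")
@UnitOfWork
public Todo getTodo(@PathParam("id") Long id) {
    // Dropwizard opens a session and transaction around this whole method
    return todoDao.findById(id);
}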

In certain situations, though, you might want more control over transactions, and for this Spring is often a good choice. Specifically, you might want to take an existing Spring backend codebase and integrate it into a RESTful web service using Dropwizard. Or maybe you are using Hibernate in a Dropwizard application but you don't want the transaction scoped around an entire resource method (which is what the @UnitOfWork annotation gives you out of the box). Maybe you aren't using Hibernate and are instead using JDBI, Spring JDBC, or Spring JPA and want to use Spring's native @Transactional support. Or maybe you are not even using a relational database, and instead are using one of the Spring Data projects that support transactions in Neo4j, MongoDB, Redis, etc.

For situations like those described above, what you want is a simple way to use Spring in your Dropwizard application to manage transactions in DAOs or service classes, but not much else. In other words, you don't need or want all the many other features provided by Spring, but you do want to take advantage of its automatic connection and transaction management. For the Dropwizard projects I've been working on recently, we came up with a very simple pattern to integrate Spring, using a builder-style class to create application contexts. This makes the Dropwizard configuration and other objects like ManagedDataSources available to the Spring context.

For example, suppose you will be using Hibernate in a Dropwizard application but want to use Spring to manage transactions. Suppose also that you configure a DataSourceFactory via the normal Dropwizard configuration mechanism and want that to be used by Spring when creating the Hibernate session factory. Also suppose you want the Dropwizard configuration object to be available to the Spring context. Assuming you have a simple "todo" Dropwizard application with a TodoApplication class and a TodoConfiguration class, you can write code like the following in the TodoApplication class:


@Override
public void run(TodoConfiguration configuration, Environment environment) throws Exception {
    DataSourceFactory dataSourceFactory = configuration.getDataSourceFactory();
    ManagedDataSource dataSource = dataSourceFactory.build(environment.metrics(), "dataSource");
 
    ApplicationContext context = new SpringContextBuilder()
            .addParentContextBean("dataSource", dataSource)
            .addParentContextBean("configuration", configuration)
            .addAnnotationConfiguration(TodoSpringConfiguration.class)
            .build();

    TodoDao todoDao = context.getBean(TodoDao.class);
    TodoResource todoResource = new TodoResource(todoDao);
    environment.jersey().register(todoResource);
}

In the above code, SpringContextBuilder is a very simple builder-style class that lets you create Spring application contexts by specifying parent beans that should be available to other beans (e.g. the "dataSource" and "configuration" beans), and then adding either annotation-based configuration classes or XML configuration file locations. This class is available on GitHub here.

The above code creates a Spring context from which you can then extract the beans, such as TodoDao, that will be used by Jersey resource classes. Note that we're not using Spring for autowiring dependencies in the TodoResource class, and are simply passing the DAO to its constructor. The resource class has no idea that the DAO is actually a Spring-managed bean, nor does it need to. This also makes it very easy to inject a mock into the resource class for unit tests.
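
For completeness, a DAO in this arrangement is just an ordinary Spring-managed bean using @Transactional. A minimal sketch (just to show the shape; the real TodoDao lives in the sample project) might look like:


@Repository
public class TodoDao {

    @Autowired
    private SessionFactory sessionFactory;

    @Transactional
    public Todo save(Todo todo) {
        // Spring has already opened a session and transaction for us
        sessionFactory.getCurrentSession().saveOrUpdate(todo);
        return todo;
    }

    @Transactional(readOnly = true)
    public Todo findById(Long id) {
        return (Todo) sessionFactory.getCurrentSession().get(Todo.class, id);
    }
}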

So the only thing left to do is actually create the Spring application context. In the above code, we're using the TodoSpringConfiguration class which is a Java Config-based configuration class. The code below shows the basics, with a few details omitted:


@Configuration
@EnableTransactionManagement
@ComponentScan(basePackageClasses = TodoDao.class)
public class TodoSpringConfiguration {

    @Autowired
    private DataSource _dataSource;

    @Autowired
    private TodoConfiguration _configuration;

    @Bean
    public LocalSessionFactoryBean sessionFactory() {
        LocalSessionFactoryBean sessionFactory = new LocalSessionFactoryBean();
        sessionFactory.setDataSource(_dataSource);
        sessionFactory.setPackagesToScan(Todo.class.getPackage().getName());
        sessionFactory.setNamingStrategy(ImprovedNamingStrategy.INSTANCE);
        sessionFactory.setHibernateProperties(hibernateProperties());
        return sessionFactory;
    }

    @Bean
    @Autowired
    public HibernateTransactionManager transactionManager(SessionFactory sessionFactory) {
        return new HibernateTransactionManager(sessionFactory);
    }

    @Bean
    public PersistenceExceptionTranslationPostProcessor exceptionTranslation() {
        return new PersistenceExceptionTranslationPostProcessor();
    }

    private Properties hibernateProperties() {
        // details elided...
    }

}

As you can see, the TodoSpringConfiguration class is just a plain Spring Java-based configuration class. The only real noteworthy thing is how we used @Autowired to make the data source and Dropwizard configuration objects available. The sessionFactory() configuration method then simply uses the data source when constructing the session factory. Other beans could use the Dropwizard configuration object, for example to extract other configuration such as Hibernate-specific configuration properties.

That's really all there is to it. You just use SpringContextBuilder to create your Spring context, extract the beans you need, and pass them to Jersey resource classes or any other classes such as health checks. The classes using the Spring-managed beans can simply use them without needing to be aware of their Spring-managed nature. All of which helps keep your Dropwizard code clean while still gaining the advantage of powerful Spring features. The "todo" example application code is available on GitHub here along with instructions for building and running it.
