Roll your own end-to-end solution for hosting your next BIG Project

Recently, a great friend, who also happens to be a former business partner, contacted me to discuss how we could apply machine learning to reduce the costs of an activity handled by the financial department of every company here in Brazil (and worldwide, I guess). To be more precise, he wanted to understand whether a classification algorithm could automate part of this process.

After a few days of preliminary investigation we found the problem was worth pursuing and decided to move forward with a PoC. Even though he already had a few servers that could host our PoC, I decided not to risk interfering with his production environment and to set up a minimal server on a cheap hosting provider instead. Ideally I would have had the same infrastructure I currently work with: a Kubernetes cluster running on top of Rancher Server. But this would require, at the bare minimum, one server with more than 2GB of memory (with overlapping planes, which I personally don’t recommend). Since such a huge environment was out of the question for a PoC, I opted for a machine running Rancher OS; this way I would still have a minimal Linux capable of running Docker containers. One thing still surprised me: Rancher OS doesn’t run on machines with less than 2GB of memory. My first attempt was on a machine with only 1GB and it got stuck in an endless boot loop. It wasn’t a huge problem, though, since it only increased my monthly fee from U$5 to U$10, and I got twice the memory and more storage.


After setting up the Rancher OS instance I started to set up the stack for hosting the code and, after the first week of coding, for running the solution as well.

Git Hosting

I could have chosen Bitbucket for hosting my code, but I thought it would be better to run something on my own premises. Gitlab could have been an option, but its HUGE memory footprint is a no-go. Then I found a thread on reddit mentioning gitea as an interesting alternative to Gitlab. I decided to give it a try and it was a huge surprise! It has an impressively low memory footprint: 45MB when idle, with occasional spikes up to 60MB before going down again. Not to mention that it had everything I needed.

Reverse Proxy

Then it was time to set up the reverse proxy that would route every HTTP(S) request to one of my services (no Kubernetes, no ingress, remember?), be it part of the dev infrastructure or part of the solution itself. Nginx to the rescue! The simplest setup I could think of was a directory holding a shell script responsible for running the Nginx container with a bind mount for the default.conf file (from default.conf in the host’s working directory to /etc/nginx/conf.d/default.conf in the container). The file simply had a bunch of server sections with server_name and proxy_pass directives.

By that time, this was what the server looked like:

I was already using DuckDNS to avoid having to memorize the server’s IP address every time I had to SSH into it. Then I realized I needed HTTPS, and therefore host names. For a PoC at such an early stage, buying a domain would be overkill, so it was DuckDNS again, this time for the actual solution.

Free HTTPS certificates

If spending on a domain wasn’t being considered at that point, paying for an HTTPS certificate was also a no-go. But with the advent of Let’s Encrypt we can now have SSL on our solutions without spending a penny. And requesting the certificate is even easier if you can run CertBot’s publicly available Docker container. As I wasn’t willing to investigate how to use the certbot nginx plugin, I opted for its simplest mechanism for issuing a certificate (webroot): you give it a publicly reachable directory on your HTTP server, and it writes there the content it receives from the Let’s Encrypt service during the certificate granting process.

Lightweight CI/CD

After struggling to run git on RancherOS (I even tried running git as a container, but I had so many file permission issues that I soon gave up on this approach: repositories were cloned as root), I thought it was a good reason to bring forward the deployment of a CI/CD solution. I ended up ditching Jenkins for the same reason I had already dropped Gitlab: its impressive memory footprint. After some research I found Buildbot, a minimalist Python-based CI/CD solution. Apart from the stock Buildbot master container, I created a few worker containers: one based on stretch with support for PyEnv (in order to have good support for scikit), a similar one based on Alpine, an Alpine-based one with Node (for building the UI) and a few other base containers (check them out on my Dockerhub account).


Finally, I had to deploy MySQL and PostgreSQL, both for the dev stack and for the solution being developed. PostgreSQL was deployed as is, but for MySQL I opted to slim down its memory usage a little by following this post.

Wrap up

The project I am working on is based on Python and uses scikit-learn, Flask and SQLAlchemy with Alembic for the backend (running on Waitress) and Angular for the frontend.

The following picture provides an overview of the current containers and components running on my U$10 server:

The idea of this post was to give an overview of a recipe for building a cheap but comprehensive solution for hosting your next BIG idea. In the next post I’ll try to drill down into the details of setting up each of the components that were used. Feel free to comment if you have any questions.




Beginning on Node.js

Recently I started learning Node.JS as a result of a few projects developed at a startup I recently joined. We are using Parse.com on some of our projects and it happens to be based on Node.JS. Also, while learning Backbone.JS and searching for a way to unit test our Backbone.JS code, I stumbled upon Sails.JS. Sails.JS drew my attention immediately; its productivity was stunning. Suddenly Javascript, something I had only recently started to give some credit to (due to Backbone.JS and some other frameworks, but ALL focused on the UI), was looking like a promising solution for a large niche of applications.

While studying the Node.JS platform I soon realized that some of what can be said about the Java <-> Android relationship is also true for Javascript <-> Node.JS. Node.JS happens to run Javascript code, but it isn’t simply Javascript. The same is true for Android: we can’t say that Android is simply Java. Java presumes a lot of things that aren’t possible on Android, e.g. class loading on Android doesn’t work as it does on Java: it seems that you are limited to a DexClassLoader (and since we are on the Dex subject, it is THE sign that Android isn’t really Java, otherwise it would simply run Java bytecode).

But what about Node.JS (I’ll refer to it simply as Node from now on)? Although Node runs Javascript, it has some unique things that form the Node platform; one of them is the process global object. One of its best-known functions is process.nextTick(callback), which allows code to schedule callbacks to be executed on the next run of the event loop (not exactly the next run, but the callback is guaranteed to run before any I/O event). Another important characteristic of code running inside Node, this one taken ipsis litteris from Javascript, is run-to-completion. It allows Javascript code to place event listeners on an EventEmitter without risking the event firing before the listeners are set, since the I/O code will only be able to run on the next execution of the event loop.

I’ve talked a lot about the event loop, but I’ve not mentioned that Node runs on a single thread. This alleviates the issues related to thread contention, but we STILL have race conditions, since code runs concurrently between callbacks. For example, if you have code that checks whether a file exists (async code, since it is I/O based) and downloads that file if it does not exist, you risk having that file overwritten: two callbacks that check for the file may run one after the other, and then both downloads would proceed as if the file did not exist.

So how do you scale Node applications? Node provides a core module named cluster that allows for a model similar to Apache MPM Prefork.

Enough of Node for today… I’ll try to come back soon and talk about NPM (another node cornerstone) and Sails.JS since Sails.JS is a REALLY interesting framework that draws from a lot of best practices and concepts found on lots of other frameworks (and still produces really clean and organized code).



Designing an ASTERIX Protocol Parser

In the next weeks I’ll try to describe the steps taken to decompose ASTERIX, Eurocontrol’s variable-length binary protocol for radar message exchange. According to Eurocontrol, ASTERIX stands for All Purpose STructured Eurocontrol SuRveillance Information EXchange. Try not to confuse it with Asterisk (the open source telephony project).

One of the major challenges was to achieve good reuse of the domain classes designed to mirror the protocol fields, since ASTERIX employs a design where some packet types are simply extensions of previously released ones and therefore share a large number of fields. One example is the Cat034 and Cat002 packets. If you are unfamiliar with ASTERIX, the Eurocontrol site has a great deal of documentation on the ASTERIX format.
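
To give a feel for what "variable length" means here, below is a minimal sketch of the first decoding step, assuming the layout as I understand it from the Eurocontrol documents: a data block starts with a one-octet category (CAT) and a two-octet big-endian length (LEN), and each record starts with a field specification (FSPEC) whose octets use the least significant bit (FX) to signal that another FSPEC octet follows. The class and method names are hypothetical and are not the domain classes mentioned above.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class AsterixHeaderSketch {

    /** Category and declared length of one data block. */
    public static final class BlockHeader {
        public final int category;
        public final int length;

        BlockHeader(int category, int length) {
            this.category = category;
            this.length = length;
        }
    }

    /** Reads CAT (1 octet) and LEN (2 octets, big-endian). */
    public static BlockHeader readHeader(ByteBuffer buf) {
        int cat = buf.get() & 0xFF;          // unsigned octet
        int len = buf.getShort() & 0xFFFF;   // ByteBuffer is big-endian by default
        return new BlockHeader(cat, len);
    }

    /**
     * Reads one FSPEC and returns the 1-based field reference numbers
     * whose presence bit is set. Bits 7..1 of each octet flag a field,
     * bit 0 (FX) says whether one more FSPEC octet follows.
     */
    public static List<Integer> readFspec(ByteBuffer buf) {
        List<Integer> present = new ArrayList<Integer>();
        int frn = 1;
        boolean more = true;
        while (more) {
            int octet = buf.get() & 0xFF;
            for (int bit = 7; bit >= 1; bit--, frn++) {
                if ((octet & (1 << bit)) != 0) {
                    present.add(frn);
                }
            }
            more = (octet & 0x01) != 0;
        }
        return present;
    }
}
```

Once the list of present fields is known, each category only has to map a field reference number to the class that decodes that field, which is where sharing classes between related categories pays off.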


JSF Datatable with database sorting, filtering and pagination

When a database grows (or may grow) significantly, it is necessary to consider database pagination for loading data, since the application server’s memory isn’t sized for handling such a huge amount of data.

Luckily rich:extendedDataTable provides an easy way of delegating sorting, filtering and pagination to the underlying data store. I clearly remember seeing the basis for this idea somewhere in the Seam Framework forum (if anyone knows something similar please let me know) but I couldn’t find it again to reference it; anyway, it is now extended and more flexible. Another foundation of this technique is org.jboss.seam.framework.Query, since it provides a means of expressing filtering and sorting as a plain string, even with JSF variables (e.g. #{something.otherthing}).

Let’s start with the EntityExtendedTableDataModel, which is responsible for passing filtering and sorting data from the view layer to the data layer:

/*
 * EntityExtendedTableDataModel - Database pagination, sorting and filtering for richfaces datatables
 * Copyright (C) 2011 Rafael Ribeiro
 *
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this library; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
 */
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import javax.faces.component.html.HtmlInputText;
import javax.faces.context.FacesContext;
import javax.faces.convert.Converter;
import javax.faces.convert.ConverterException;

import org.apache.commons.lang.StringUtils;
import org.jboss.seam.core.Expressions;
import org.jboss.seam.core.Expressions.ValueExpression;
import org.jboss.seam.framework.Query;
import org.jboss.seam.ui.AbstractEntityLoader;
import org.richfaces.model.DataProvider;
import org.richfaces.model.ExtendedFilterField;
import org.richfaces.model.ExtendedTableDataModel;
import org.richfaces.model.FilterField;
import org.richfaces.model.Modifiable;
import org.richfaces.model.Ordering;
import org.richfaces.model.SortField2;

public class EntityExtendedTableDataModel extends ExtendedTableDataModel implements Modifiable {
    EntityDataProvider entityDataProvider;

    public EntityExtendedTableDataModel(Query query) {
        super(new EntityDataProvider(query));
        entityDataProvider = (EntityDataProvider) getDataProvider();
    }

    public void modify(List<FilterField> filterFields,
            List<SortField2> sortFields) {
        // assumed body: delegate to the two helpers below
        performSort(sortFields);
        performFilter(filterFields);
    }

    private void performSort(List<SortField2> sortFields) {
        StringBuilder order = new StringBuilder();
        Pattern p = Pattern.compile("\\#\\{(.+)\\}");
        for (SortField2 s: sortFields) {
            if (Ordering.UNSORTED.equals(s.getOrdering()) == false) {
                String expr = s.getExpression().getExpressionString();
                Matcher m = p.matcher(expr);
                if (m.matches()) { //remove the #{} otherwise richfaces wont trigger the sort event
                    order.append(m.group(1)); // assumed: keep only what is inside #{}
                }
                else {
                    order.append(expr); // assumed: use the expression as is
                }
                order.append(" ");
                order.append(Ordering.ASCENDING.equals(s.getOrdering()) ? "ASC" : "DESC");
                order.append(", ");
            }
        }
        if (order.length() > 0)
            // drop the trailing ", " before handing the order to the Query
            entityDataProvider.getQuery().setOrder(order.delete(order.length()-2, order.length()).toString());
    }

    //allows us to specify filter as o.name == #{exampleEntity.name}
    private void performFilter(List<FilterField> filterFields) {
        Expressions expressions = new org.jboss.seam.core.Expressions();
        Pattern p = Pattern.compile(".*(\\#\\{.+\\}).*");
        List<String> restrictions = new ArrayList<String>();
        for (FilterField f: filterFields) {
            ExtendedFilterField e = (ExtendedFilterField) f;
            if (StringUtils.isEmpty(e.getFilterValue()))
                continue; // assumed: nothing typed, no restriction
            String expr = e.getExpression().getExpressionString();
            Matcher m = p.matcher(expr);
            if (!m.matches())
                continue; // assumed: no #{} placeholder to receive the value
            ValueExpression ve = expressions.createValueExpression(m.group(1));
            FacesContext ctx = FacesContext.getCurrentInstance();
            Converter c = ctx.getApplication().createConverter(ve.getType());
            if (c == null)
                ve.setValue(e.getFilterValue()); // assumed: no converter, store the raw text
            else {
                try {
                    ve.setValue(c.getAsObject(ctx, new HtmlInputText(), e.getFilterValue()));
                } catch (ConverterException ce) {
                    continue; // assumed: ignore values that cannot be converted
                }
            }
            restrictions.add(expr); // assumed: the whole filterBy expression becomes a restriction
        }
        // assumed: hand the collected restrictions to the Query
        entityDataProvider.getQuery().setRestrictionExpressionStrings(restrictions);
    }
}

class EntityDataProvider implements DataProvider {

    private Query query;

    public EntityDataProvider(Query query) {
        this.query = query;
    }

    public Object getItemByKey(Object key) {
        return AbstractEntityLoader.instance().get(String.valueOf(key));
    }

    public List getItemsByRange(int firstRow, int endRow) {
        // assumed: restrict the Query to the requested page before loading
        query.setFirstResult(firstRow);
        query.setMaxResults(endRow - firstRow);
        return query.getResultList();
    }

    public Object getKey(Object item) {
        return AbstractEntityLoader.instance().put(item);
    }

    public int getRowCount() {
        return query.getResultCount().intValue();
    }

    protected Query getQuery() {
        return query;
    }
}
Now we have to replace Seam’s dataModels component with ours:

/*
 * EntityExtendedTableDataModel - Database pagination, sorting and filtering for richfaces datatables
 * Copyright (C) 2011 Rafael Ribeiro
 *
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this library; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
 */
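import static org.jboss.seam.ScopeType.STATELESS;
import static org.jboss.seam.annotations.Install.FRAMEWORK;

import javax.faces.model.DataModel;

import org.jboss.seam.annotations.Install;
import org.jboss.seam.annotations.Name;
import org.jboss.seam.annotations.Scope;
import org.jboss.seam.annotations.intercept.BypassInterceptors;
import org.jboss.seam.faces.DataModels;
import org.jboss.seam.framework.EntityQuery;
import org.jboss.seam.framework.Query;

// The annotations below are assumed from the imports: same component name as
// Seam's built-in DataModels, installed with FRAMEWORK precedence so it wins.
@Name("org.jboss.seam.faces.dataModels")
@Install(precedence = FRAMEWORK)
@Scope(STATELESS)
@BypassInterceptors
public class RichFacesModels extends DataModels
{
    @Override
    public DataModel getDataModel(Query query)
    {
        // EntityQuery instances get the database-backed model, everything else the default
        if (query instanceof EntityQuery)
            return new EntityExtendedTableDataModel((EntityQuery) query);
        return super.getDataModel(query);
    }
}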

Finally we specify the filtering in rich:column, remembering that the base object name used here must be in sync with the one used in the referenced query. Example:

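<rich:extendedDataTable id="listaPacientes" rows="10" value="#{queryAllEntities.dataModel}" var="obj">
    <rich:column sortable="true" sortBy="#{o.name}" selfSorted="false"
            filterBy="lower(o.name) like concat(lower(#{exampleEntity.name}),'%')" filterEvent="onkeyup">
        <!-- assumed cell content; any output component bound to obj works here -->
        <h:outputText value="#{obj.name}" />
    </rich:column>
</rich:extendedDataTable>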

The trick here is to specify selfSorted="false" and to add the JSF #{} part for the sorting to work; this way o.name will be appended to the Query ordering.

On the other hand, for filtering to work you only need to specify your query restriction; note that whatever you put between the #{} will hold the filter data handed to the Query.

Also note that you need to have a query and an entity as a component specified in components.xml, in this example, this could be the query and the example entity:

<component name="exampleEntity" class="br.com.rafaelri.MyEntity" scope="session" />
<framework:entity-query name="queryAllEntities" ejbql="select o from MyEntity o" />

This, combined with rich:dataScroller will result in sorting, pagination and filtering handled by the Database.


Enum backed h:selectOneRadio and h:selectOneMenu

Recently I needed to display an h:selectOneRadio and an h:selectOneMenu with values provided by a Java enum. After some research on the seamframework forum I found a post that clarified a lot about the path I should pursue, especially Pete’s comment. My greatest concern was the same as Pete’s: coupling the presentation and domain layers. But after some analysis I found a simple solution that does not couple the domain and presentation layers and also respects the DRY principle. The solution involved crafting a class that is declared as a Seam component in components.xml, so it can be instantiated for each and every enum that we want to expose to the view layer, and using the EnumSet.allOf method so we can automagically iterate over each of the enum values.

Below, you can find the EnumList class:

/*
 * EnumList Seam Component - Converts Java Enums to List.
 * Copyright (C) 2011 Rafael Ribeiro
 *
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this library; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
 */
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;

import org.apache.commons.lang.builder.ToStringBuilder;
import org.apache.commons.lang.builder.ToStringStyle;

public class EnumList<T extends Enum<T>> {
    private List<T> list;

    // Called by Seam when the enumClass property is set in components.xml
    public void setEnumClass(Class<T> c) {
        list = new ArrayList<T>();
        list.addAll(EnumSet.allOf(c)); // every constant, in declaration order
    }

    public List<T> getList() {
        return list;
    }

    public String toString() {
        return new ToStringBuilder(this, ToStringStyle.SHORT_PREFIX_STYLE)
                .append("list", list).toString();
    }
}

And then configure it in components.xml as follows:

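<!-- the class attribute assumes EnumList lives in the br.com.rafaelri package -->
<component name="myEnumComp" class="br.com.rafaelri.EnumList" scope="application" auto-create="true">
    <property name="enumClass">br.com.rafaelri.MyEnum</property>
</component>
<factory name="myEnum" value="#{myEnumComp.list}" scope="application" auto-create="true"/>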

Finally you’ll refer to it on your xhtml as follows:

<h:selectOneMenu id="myEnumSelect" value="#{instanceHome.instance.myEnum}">
    <s:selectItems var="enum" value="#{myEnum}" label="#{enum.description()}"/>
    <s:convertEnum />
</h:selectOneMenu>
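
The core of the trick can be tried outside Seam. The following is a minimal sketch, assuming a hypothetical Color enum standing in for br.com.rafaelri.MyEnum; it only shows that EnumSet.allOf needs nothing but the enum’s Class object to produce the full list of constants.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;

public class EnumListSketch {

    /** Hypothetical enum, standing in for the application's own. */
    public enum Color { RED, GREEN, BLUE }

    /** Returns every constant of the given enum class, in declaration order. */
    public static <T extends Enum<T>> List<T> valuesOf(Class<T> enumClass) {
        return new ArrayList<T>(EnumSet.allOf(enumClass));
    }

    public static void main(String[] args) {
        System.out.println(valuesOf(Color.class));
    }
}
```

Because the method is generic over the enum type, one class is enough for any number of enums, which is what keeps the components.xml approach DRY.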


JPA 2 on Seam 2

As intelligently pointed out by Thomas on the CTP Java blog, starting a Java web application from scratch today presents you with an interesting crossroads. If you stick with the proven Seam 2 option you also end up with its deficiencies, not to mention some outdated libraries. The other option is experimenting with Seam 3, but this on the other hand has little critical mass of knowledge around it. With a few changes, though, Seam 2 can still provide value, and at least the Hibernate/JPA version can be upgraded, bringing a bunch of bugfixes (e.g. proper schema generation, which was buggy in Hibernate 3.3: even though HHH1012 says it is fixed, I never saw this in any 3.3 release) and new features (e.g. the criteria API, as also pointed out by Thomas).

I’ve followed Thomas’s instructions, but instead of coding the proxy method by method I suggest using Java dynamic proxies:

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

import javax.persistence.EntityManager;

import org.jboss.seam.persistence.PersistenceProvider;

public class Jpa2EntityManagerProxy implements InvocationHandler {

    private EntityManager delegate;

    public Jpa2EntityManagerProxy(EntityManager delegate) {
        this.delegate = delegate;
    }

    public Object invoke(Object proxy, Method method, Object[] args)
            throws Throwable {
        if (method.getName().equals("getDelegate")) {
            // assumed argument: the underlying delegate, wrapped by Seam's provider
            return PersistenceProvider.instance().proxyDelegate(
                    delegate.getDelegate());
        } else {
            try {
                return method.invoke(delegate, args);
            } catch (InvocationTargetException e) {
                // rethrow the real exception instead of the reflection wrapper
                throw e.getTargetException();
            }
        }
    }

    public static EntityManager newInstance(EntityManager entityManager) {
        return (EntityManager) Proxy.newProxyInstance(
                // assumed class loader choice
                Jpa2EntityManagerProxy.class.getClassLoader(),
                new Class[] { EntityManager.class },
                new Jpa2EntityManagerProxy(entityManager));
    }
}

Apart from this change I followed Thomas’s instructions as-is and got JPA 2 with Hibernate 3.5 working in my project.
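
If you want to play with the mechanism without Seam or JPA on the classpath, here is a minimal sketch of the same pattern. The Greeter interface and its implementation are hypothetical stand-ins for EntityManager: one method gets special treatment, everything else is forwarded, and the InvocationTargetException wrapper is unwrapped so callers see the original exception.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class ProxySketch {

    /** Hypothetical interface playing the role of EntityManager. */
    public interface Greeter {
        String greet(String name);
        Object getDelegate();
    }

    static final class PlainGreeter implements Greeter {
        public String greet(String name) {
            if (name == null) {
                throw new IllegalArgumentException("name is required");
            }
            return "hello " + name;
        }

        public Object getDelegate() {
            return "raw";
        }
    }

    static final class Handler implements InvocationHandler {
        private final Greeter delegate;

        Handler(Greeter delegate) {
            this.delegate = delegate;
        }

        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            try {
                Object result = method.invoke(delegate, args);
                if (method.getName().equals("getDelegate")) {
                    // the one method that is intercepted instead of forwarded untouched
                    return "wrapped:" + result;
                }
                return result;
            } catch (InvocationTargetException e) {
                // surface the delegate's own exception, not the reflection wrapper
                throw e.getTargetException();
            }
        }
    }

    public static Greeter wrap() {
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                new Handler(new PlainGreeter()));
    }
}
```

The benefit over a hand-written proxy is that new methods added to the interface by a newer API version are forwarded automatically.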


Java IO fundamentals

I am often asked by friends about Java IO. The questions usually revolve around byte and char conversion and what determines the codepage used. My answer to them is always the same: if you are using Streams you are dealing with bytes, and if you are using Readers and Writers you are dealing with chars (either you specified the codepage to be used or the system default is being used). I usually tell them to distrust any class named SomethingStream that has any method operating on Strings (e.g. ServletOutputStream), because it will usually do some implicit conversion (either using the system codepage or something else out of your control).
I’ve prepared a diagram to clarify the main Java IO classes (and their main methods too):

Basic Java IO Classes


At the bottom you’ll find the Stream classes; they operate on bytes, so the codepage is irrelevant here. In the middle you’ll find the conversion classes: those are the ones that allow you to explicitly define a codepage. They are the bridge classes that allow you to plug a Writer onto an OutputStream or a Reader onto an InputStream. You may use them to specify, for example, a codepage for writing to a TCP socket. And lastly, on top, you’ll find the character (and therefore String) oriented classes: Reader and Writer.
You may have noticed that I have included the Buffered versions of the Stream classes. The output one provides better performance in certain scenarios. The BufferedInputStream, on the other hand, allows you to mark a position and later reset to it, something usually useful for implementing protocol interpreters.
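
Both points are easy to check with in-memory streams. In the sketch below (class and method names are mine), the same one-character string is pushed through an OutputStreamWriter with two different codepages, and a BufferedInputStream is used to peek at a byte and then go back to it.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

public class IoSketch {

    /** Encodes text through the Writer-to-OutputStream bridge with an explicit codepage. */
    public static byte[] encode(String text, String charsetName) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            Writer writer = new OutputStreamWriter(bytes, charsetName);
            writer.write(text);
            writer.close(); // flushes the encoder into the byte stream
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads the first byte, resets to the mark and reads it again. */
    public static int[] peekThenRead(byte[] data) {
        try {
            BufferedInputStream in = new BufferedInputStream(new ByteArrayInputStream(data));
            in.mark(16);            // remember this position (valid for the next 16 bytes)
            int peeked = in.read();
            in.reset();             // go back to the marked position
            int reread = in.read();
            return new int[] { peeked, reread };
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "\u00e9" is the letter e with an acute accent
        System.out.println(encode("\u00e9", "ISO-8859-1").length); // one byte
        System.out.println(encode("\u00e9", "UTF-8").length);      // two bytes
    }
}
```

The String is the same in both calls; only the bridge class decides how many bytes end up on the wire, which is why the codepage should be chosen there and not left to a default.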


Setting up Mercurial on Apache

Recently I started investigating the two major Distributed Version Control Systems (DVCS), mainly due to the historical SVN deficiency in handling renames. You may say that you don’t need a DVCS for tracking renames… Yes, in fact I know… it was only an excuse to start learning a DVCS; after all, there are plenty of differences between a regular VCS and a DVCS.

My first option

After analysing whether I should stick with Git or Hg, I decided to go with Hg, since I am wary of using, on Windows, native applications originally written for Linux. Not that I am a Windows-only user; in fact for a long time I used Linux as my desktop instead of Windows, but you can’t deny that there is still a huge crowd that won’t switch from Windows for anything. The problem with native Linux applications that depend heavily on a collection of shell scripts and other Linux-dependent solutions is that they usually perform suboptimally on Windows: either they miss some functionality or they depend on a myriad of rare libraries. Having said that, I went with Mercurial on my first attempt.

First attempt with Hg

I wasn’t really lucky on my first attempt to install Hg. My first mistake was to pay too much attention to python.org’s warning on the main downloads page:

If you don’t know which version to use, start with Python 2.7;

This warning is probably updated after each stable version is released but if I had seen the other advice on releases page I’d have thought twice:

Consider your needs carefully before using a version other than the current production version.

I chose to download the latest Python and build Hg myself, and obviously it proved to be not that smart, as it was my first experience with Hg.

Comes Git

As I gave up on Hg I decided to give Git a try. The first thing was to download msysGIT and, surprisingly enough (following this tutorial), it was rather easy to set up, but its drawbacks were related to its tooling. As soon as I set up Git and tried to clone a repository over HTTPS with authentication, I realized that JGIT does not support authentication over HTTP, which was exactly what I had planned to use (SSH on Windows is not very advisable, since I have never seen a good free port of an SSH server for Windows).
I had to get back to Hg, but I decided to check whether I had been taking an overly complex approach: since Git employs a similar approach and had been much easier, I used what I had learned from the tutorial used for the Git setup.

Second attempt on Hg

As already mentioned, I decided to do something similar to what I had done with Git, so I chose CGI. I’ll highlight the important points of the installation here:

  • The file to be downloaded is now named hgweb.cgi and not hgwebdir.cgi
  • Download python 2.5 as noted here
  • Unzip library.zip as noted here and edit the sys.path.insert line and the first line (the one with the #! (sha-bang) ) to point to python executable
  • Configure style and templates entries under [web] on hgweb.config
  • Configure an entry under [paths] for each repository (eg.: repository = c:/users/hg/repository)
  • Enable pushing for the configured repositories
  • Configure authorization on Apache, using either htpasswd or LDAP; authorization is really recommended.
  • Configure SSL on Apache (there is a short explanation of how to do this, in Portuguese, over here). The only catch is that SSLPassPhraseDialog builtin is not supported on Windows, so instead provide a .bat file with a simple @echo yourpassword and use exec instead of builtin (e.g. SSLPassPhraseDialog exec:C:/Progra~1/Apache~1/Apache2.2/bin/passphrase.bat)

Perform an hg init for each configured repository, start Apache and try cloning the repository over HTTPS (remember to provide your credentials if you configured any authentication method).


Android Adapters

I’ve been fooling around with Android for a couple of weeks. I bought a Motorola Milestone, downloaded the ADT plugin and a few device images, and started reading and coding…
I have to admit that the Android Architecture has some ingenious points, one example is the idea of having one process (a Dalvik VM instance) hosting both application activities and also its services but this will be subject for another post (android.os.Looper and Handler are worth mentioning too).
Getting back to the reason for the post…

Android Adapter vs Swing TableModel

Having once used the Java Swing toolkit, my first impression when I saw Android’s SimpleAdapter was that I was seeing a DefaultTableModel sibling, but the reality was that I couldn’t be more wrong!
Android Adapters and ListViews are more similar to the JSTL c:forEach tag (yeah, I know… you probably thought I would use a Swing metaphor, but I couldn’t think of one) since an Adapter does not impose a resulting widget: it is up to the Adapter’s getView method to determine which View instance will be rendered. Swing’s TableModel only determines the content that will be displayed; it can’t redefine the UI component JTable uses. In Swing this responsibility is delegated to JTable’s getDefaultRenderer method.

Android SimpleAdapter

Android’s SimpleAdapter employs a rather simple yet smart strategy: you hand it the ID of the composite component (the row layout) that will be displayed for each row, along with a List of HashMaps holding the data. The mapping onto each subcomponent of the composite component is performed by two arrays that are also passed at SimpleAdapter construction time. These arrays map the keys from the HashMap to the subcomponent IDs.

SimpleAdapter drawback

What I sincerely miss in the SimpleAdapter approach is a more direct mapping between domain objects and the ListView. The developer ends up writing plumbing code that moves data from domain objects into the HashMap, and this code easily violates the DRY principle.
Although reflection carries a performance penalty, I’ll try to see whether a ReflectionAdapter based on SimpleAdapter is a valid alternative.
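
As a starting point for that experiment, the plumbing code could be generated by reflection. The sketch below is plain Java with no Android dependency and only covers the bean-to-Map step; the Contact class and the toRow name are hypothetical, and feeding the resulting maps to a SimpleAdapter is left out.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Map;

public class BeanToMapSketch {

    /** Hypothetical domain object. */
    public static class Contact {
        private final String name;
        private final String phone;

        public Contact(String name, String phone) {
            this.name = name;
            this.phone = phone;
        }

        public String getName() { return name; }
        public String getPhone() { return phone; }
    }

    /** Copies the result of every public no-arg getXxx() into a map keyed by "xxx". */
    public static Map<String, Object> toRow(Object bean) {
        Map<String, Object> row = new HashMap<String, Object>();
        for (Method m : bean.getClass().getMethods()) {
            String n = m.getName();
            boolean getter = n.startsWith("get") && n.length() > 3
                    && m.getParameterTypes().length == 0
                    && !Modifier.isStatic(m.getModifiers())
                    && !n.equals("getClass");
            if (!getter) {
                continue;
            }
            // getName -> "name"
            String key = Character.toLowerCase(n.charAt(3)) + n.substring(4);
            try {
                row.put(key, m.invoke(bean));
            } catch (Exception e) {
                throw new IllegalStateException("cannot read " + n, e);
            }
        }
        return row;
    }
}
```

With something like this, the keys array passed to the adapter would simply list property names, and the domain class would remain the single place where those properties are declared.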


Fluent Interfaces for Searching Hibernate Entities

Recently I changed jobs; I am now back at my previous employer… the reasons are out of scope for this post, so let’s get back to the real point. While developing the software for this new project, a colleague suggested that we should adopt fluent interfaces for the creation of our entities. Honestly, the term fluent interfaces was new to me, but I’ve been a Hibernate user since two dot something (a long time ago) and, since Hibernate makes extensive use of fluent interfaces, it turned out not to be that much news. In fact I have to admit that fluent interfaces increase code readability a lot!

Fluent Interfaces for POJOs

The code for the POJOs’ fluent interfaces was based on a post on the Code Monkeyism blog. After adopting this code we started to think about further improvements to that class: we made it handle collections and use direct access to public fields whenever possible, and we crafted an extension to the JavaBeans style that let us hide the setter for collections, make the getter return an immutable list and add an addSomething method (all following further good practices that were also on Code Monkeyism).
Project development went on until we hit the point where we needed to develop our entity search infrastructure. Our first conclusion was that we would probably need a counting facility for our queries, and this would probably rule out NamedQueries, since we would need two for each search (one for counting and one for returning the data; anyway, we are still considering whether we’ll ever use counting queries for pagination). The next obvious point was pagination. But the key point for our search infrastructure was that we had DAOs, we weren’t willing to expose the Hibernate Session or EntityManagers, and we were going to have predefined searches that could take different parameters (some mandatory, some optional). This predefined search scenario is one of the ideal applications for something similar to the fluent interface idea for creating POJOs, except that if we resorted to dynamically building queries ourselves this could turn into a huge effort (not to mention a huge reinvention of the wheel, since Hibernate has the incredible Criteria API).

The Solution for Searching Hibernate Entities

The trick for having fluent interfaces on top of Hibernate criteria is based on two things:

  1. Basing the solution on the Criteria’s ability to express constraints and, most importantly, to create sub-criteria when navigating entity associations
  2. Providing a way of composing the fluent interfaces for expressing collections and integrating this with the concept of the sub-criteria

The first step is to implement an InvocationHandler class that holds a Criteria instance: the root Criteria for the top-level search, or a sub-criteria (see the Subcriteria class) when InvocationHandlers are created for the entity associations.

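public Object invoke(Object proxy, Method m, Object[] args) throws Throwable {
	String name = m.getName();
	if ("list".equals(name)) {
		return criteria.list();
	} else if ("count".equals(name)) {
		// run the count; rowCount() yields Integer or Long depending on the Hibernate version
		Number rows = (Number) criteria.setProjection(Projections.rowCount()).uniqueResult();
		return rows.longValue();
	} else if (args == null || args.length == 0) {
		// association navigation: wrap the newly created sub-criteria in another
		// handler and return a proxy for the interface the method returns
		Criteria subCriteria = criteria.createCriteria(name);
		return Proxy.newProxyInstance(m.getReturnType().getClassLoader(),
				new Class[] { m.getReturnType() },
				new FluentSearchHandler(subCriteria)); // FluentSearchHandler: hypothetical name of this handler class
	} else {
		// plain property: the method name is the property, the argument its expected value
		criteria.add(Restrictions.eq(name, args[0]));
		return proxy;
	}
}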

The next step is to develop the interfaces. To make the idea easier to understand, I’ll provide a sample domain model.

Sample Domain Model

We are going to use the well known Order, OrderItem, Product model in order to make things clearer. Below is a class diagram for our model:

Sample Domain Model


Our fluent interfaces for those classes would (hugely simplified) be:

public interface OrderFluentSearch {
	OrderFluentSearch code(String code);
	OrderItemFluentSearch items();
	List<Order> list();
	long count();
}

The OrderItemFluentSearch would be as follows:

public interface OrderItemFluentSearch {
	OrderItemFluentSearch quantity(int quantity);
	ProductFluentSearch product();
	List<OrderItem> list();
	long count();
}

And finally our ProductFluentSearch interface would be similar to the one below:

public interface ProductFluentSearch {
	ProductFluentSearch code(String code);
	List<Product> list();
	long count();
}

Suppose now that you had a with() method in your Order DAO, similar to the one in the Code Monkeyism blog post. A search for Orders with an item that references a product with code “10” would be as easy as:

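	OrderFluentSearch search = orderDAO.with();
	// assumed continuation: navigate to the product, constrain it, then list from the root
	search.items().product().code("10");
	List<Order> orders = search.list();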

Pretty readable, isn’t it? But it still has some drawbacks…
One obvious drawback is that it isn’t possible to express the search in a single line. The other is that we still have an extra interface that won’t get automatically renamed when we rename the entity class, but I feel there is a way of fixing this.
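
To see the proxy mechanics in isolation, here is a minimal sketch that runs without Hibernate. It is an illustration under my own assumptions, not the project’s code: a shared list of strings stands in for the Criteria, a dotted path stands in for the sub-criteria, and all interface and class names are hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

public class FluentSketch {

    public interface ProductSearch {
        ProductSearch code(String code);
    }

    public interface ItemSearch {
        ItemSearch quantity(int quantity);
        ProductSearch product();
    }

    public interface OrderSearch {
        OrderSearch code(String code);
        ItemSearch items();
        List<String> restrictions();
    }

    static final class Handler implements InvocationHandler {
        private final String path;               // association path, stands in for a sub-criteria
        private final List<String> restrictions; // shared by the root proxy and every nested proxy

        Handler(String path, List<String> restrictions) {
            this.path = path;
            this.restrictions = restrictions;
        }

        public Object invoke(Object proxy, Method m, Object[] args) throws Throwable {
            if (m.getDeclaringClass() == Object.class) {
                return m.invoke(this, args); // toString, hashCode, equals
            }
            String name = m.getName();
            if ("restrictions".equals(name)) {
                return restrictions;
            }
            if (args == null || args.length == 0) {
                // association navigation: new proxy of the returned interface, one level deeper
                return newProxy(m.getReturnType(), path + name + ".", restrictions);
            }
            // property constraint: record it and stay on the same proxy for chaining
            restrictions.add(path + name + " = " + args[0]);
            return proxy;
        }
    }

    static Object newProxy(Class<?> type, String path, List<String> restrictions) {
        return Proxy.newProxyInstance(type.getClassLoader(),
                new Class<?>[] { type }, new Handler(path, restrictions));
    }

    /** Entry point, playing the role of the DAO's with() method. */
    public static OrderSearch search() {
        return (OrderSearch) newProxy(OrderSearch.class, "", new ArrayList<String>());
    }
}
```

Calling search(), then code("A1"), then items().product().code("10") on the returned proxy records two restrictions, one on the root and one on the nested path, which mirrors how the real handler adds Restrictions to the root Criteria or to a sub-criteria.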

Future improvements

Although this idea greatly improves the readability of Criteria queries expressed in an application, it is still vulnerable to renames of class attributes. One possibility that I’ll try is a DSL similar to the one employed by JMock, which uses a combination of custom-crafted CGLIB classes based on the entity classes and a ThreadLocal composition process that allows methods with no link between them to share context.

