Posts Tagged ‘JavaEE’


Embedding Cassandra into your JavaEE AppServer

Recently I started running some tests with Apache Cassandra. Honestly, I was impressed by its aptitude for scaling horizontally and therefore handling load increases. Among its main virtues are:

  • Ability to replace failed nodes without downtime
  • Complete absence of a single point of failure: every node is functionally identical

not to mention a few other features.
But there is still one point that bothers me: I see Cassandra and its API as foundation-level infrastructure, something like what JGroups turned into. JGroups is widely used nowadays, but rarely by the application developer directly; you use it indirectly when you use a clustered JBoss, JBoss TreeCache or Infinispan.
One of the responsibilities that still falls on the application developer is failover. When your application connects to a Cassandra node, you still need to fetch, through its JMX API, the list of the other nodes in the cluster; otherwise, if that node fails, your application won't know how to reconnect, even though the Cassandra cluster itself is still up and healthy. Another possibility (and in fact a recommendation even if you do retrieve this node list) is to keep a list of reliable servers to serve as bootstrap connection servers. But remember Murphy's Law: even all the servers on that list may go down, so you still need to retrieve the whole node list in order to fail over.
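As a rough illustration of that failover loop, here is a minimal, stdlib-only sketch. The class and method names are hypothetical, and `connect` stands in for whatever actually opens a Thrift connection to one node:

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: walk an ordered bootstrap list and return the first
// connection attempt that succeeds, or null if every node is down.
public class BootstrapConnector {

    // 'connect' stands in for whatever opens a real connection to one host.
    public static <C> C connectToFirstAvailable(List<String> bootstrapHosts,
                                                Function<String, C> connect) {
        for (String host : bootstrapHosts) {
            try {
                return connect.apply(host);   // first healthy node wins
            } catch (RuntimeException e) {
                // node unreachable: fall through and try the next one
            }
        }
        return null;   // the whole bootstrap list is down
    }
}
```

In the real connector, the host list would be refreshed from the cluster itself (via JMX, as described above), so it does not go stale.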
So, in summary, we have something like what is depicted in the following picture:

Cassandra Regular usage thru JCA and Thrift


Note that there is an inherent complexity in this solution: the JCA connector will be responsible for keeping a list of failover nodes, something a Cassandra instance already does, so we end up violating the DRY principle.
But what alternatives do we have?

The StorageProxy API

It turns out that the CassandraDaemon class (the main class responsible for starting up a Cassandra node) does only a few things, which we can embed into our application, or rather into our JCA connector, since in a JavaEE environment that is the only place where you should be spawning threads and opening files directly from the filesystem.
In fact, those few steps are properly described on the Cassandra wiki.
If you take what we could call the regular approach, you would spawn the embedded Cassandra and connect to localhost in order to make the application talk to the Cassandra server; you would end up with an unnecessary network hop that could reduce performance by increasing latency. It turns out this can be avoided too, by using the (not so stable) StorageProxy API.
By taking all the steps described above, you'd end up with a much simpler architecture, like the one below:

Cassandra Embedded into JavaEE Server


With this architecture you are shielded from the complexity of failing over; Cassandra handles it automatically for you. You could then argue: what if I need to scale the Cassandra layer independently? No problem! You can resort to a hybrid architecture like the one below:
Cassandra Embedded with Extra Nodes


In order to achieve this, you only need to provide these extra nodes with the addresses of some of the JavaEE servers. This way, the extra standalone nodes can communicate with the Cassandra daemons inside the AppServer and become part of the Cassandra cluster.
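For example, on a standalone extra node running Cassandra 0.6, the seed list in storage-conf.xml would point at some of the JavaEE servers (the host names below are placeholders):

```xml
<!-- storage-conf.xml on a standalone node: seed it with the JavaEE servers -->
<Seeds>
    <Seed>appserver1.example.com</Seed>
    <Seed>appserver2.example.com</Seed>
</Seeds>
```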

Memory mapped TTransport

My first attempt at doing this with Cassandra involved implementing a TTransport class that would be responsible for sending commands over a byte buffer to the server worker thread and receiving responses in a similar fashion. I tried this first because I was completely unaware of the existence of the StorageProxy API, but later I thought it could also solve the stability issues that API has (as per the Apache wiki page). This turned out to be a not-so-easy task, though.
I thought of having a CassandraMMap class that would act like the CassandraDaemon class but differ in the TThreadPoolServer initialization, as below:

		this.serverTransp = new MMapServerTransport();
		TThreadPoolServer.Options options = new TThreadPoolServer.Options();
		options.minWorkerThreads = 64;
		this.serverEngine = new TThreadPoolServer(new TProcessorFactory(
				processor), serverTransp, inTransportFactory,
				outTransportFactory, tProtocolFactory, tProtocolFactory,
				options);

The same instance of MMapServerTransport would be handed to the client through a getter method in order to open client connections, as follows:

		TTransport tr = mmap.getServerTransp().getClientTransportFactory()
				.getTransport();
		TProtocol proto = new TBinaryProtocol(tr);
		Cassandra.Client client = new Cassandra.Client(proto);

Requests made through getTransport would be queued on the server using the class below, and a TTransport for the server side would be returned upon acceptImpl:

public class MMapTransportTuple {
	private TByteArrayOutputStream server2Client;
	private TByteArrayOutputStream client2Server;
	private TTransport clientTransport;
	private TTransport serverTransport;

	public MMapTransportTuple(int size) {
		server2Client = new TByteArrayOutputStream(size);
		client2Server = new TByteArrayOutputStream(size);
		clientTransport = new MMapTransport(server2Client, client2Server);
		serverTransport = new MMapTransport(client2Server, server2Client);
		// some code omitted for brevity
	}
}

This class would be responsible for binding the memory buffers from client to server and vice versa.
The last class involved in this implementation would be the MMapTransport:

public class MMapTransport extends TTransport {

	private TByteArrayOutputStream readFrom;

	private TByteArrayOutputStream writeTo;

	public MMapTransport(TByteArrayOutputStream readFrom,
			TByteArrayOutputStream writeTo) {
		this.readFrom = readFrom;
		this.writeTo = writeTo;
	}
	// read and write would operate on the respective buffer, and they would
	// point to different buffers on the client and server TTransport instances...
}
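For comparison, the cross-wired buffer idea itself is easy to demonstrate with plain java.io pipes. This is a simplified stand-in (no Thrift involved), just to show the client/server buffer pairing:

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

// Simplified stand-in for the MMap transport pair: two in-memory pipes wired
// crosswise, so the client's output is the server's input and vice versa.
// No sockets, no network hop.
public class InVmTransportPair {
    public final PipedOutputStream clientOut = new PipedOutputStream();
    public final PipedInputStream serverIn;
    public final PipedOutputStream serverOut = new PipedOutputStream();
    public final PipedInputStream clientIn;

    public InVmTransportPair() throws IOException {
        serverIn = new PipedInputStream(clientOut);   // client -> server
        clientIn = new PipedInputStream(serverOut);   // server -> client
    }
}
```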

But this turned out to be harder than I first thought, and as time became a scarce resource, I'll stick with the StorageProxy API approach for now.


Inbound JCA Connectors Introduction

After some posts about Outbound JCA connectors, let’s have a look at the concepts related to an Inbound JCA connector.

The first question I always hear is: “So, the server (legacy application) connects a client socket to a server socket on my J2EE server, right?”, and I always answer: “it depends”. People tend to think the terms inbound and outbound relate to the TCP/IP connection but, in fact, they relate to the flow of the process. In an outbound connector, our system is the actor: if you trace the flow of the call, you'll notice it leaves our system toward the legacy one; we could say it is outgoing. In an inbound connector, on the other hand, the actor is the legacy system, which triggers an incoming message that starts outside our system and goes all the way into it.

Let’s see a sequence diagram of an inbound connector to make things clearer:

JCA Inbound Connector Sequence Diagram


As you can see in the diagram, the component that ties the Application Server, the legacy system and the J2EE Application is the ActivationSpec.

Usually, instance configuration such as port, host and other instance-related data is stored on the ActivationSpec. You may also have configuration on the ResourceAdapter itself, but remember that any configuration placed on the ResourceAdapter will be shared across all the ActivationSpecs deployed on that ResourceAdapter.
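As a sketch, such an ActivationSpec is mostly a JavaBean holding those instance properties. The names here are illustrative; a real one also implements javax.resource.spi.ActivationSpec and checks its own configuration in validate():

```java
// Sketch of per-endpoint configuration held on an ActivationSpec.
// Property names are illustrative, not from any real connector.
public class FooActivationSpec {
    private String host;   // instance-specific: which legacy host to listen to
    private int port;      // instance-specific: which port

    public String getHost() { return host; }
    public void setHost(String host) { this.host = host; }
    public int getPort() { return port; }
    public void setPort(int port) { this.port = port; }
}
```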

The class actually responsible for handling enterprise events is ResourceAdapter dependent and will be initialized during the endpointActivation method call on the ResourceAdapter. These classes are usually implemented as a Thread, or they implement the Work interface and are submitted to the container for execution through the WorkManager instance. If you opt for a plain Thread, remember to daemonize it, otherwise your application server won't be able to shut down properly.
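The daemonizing advice boils down to something like this (the factory class and thread name are made up for the example):

```java
// If the inbound listener is a plain Thread rather than a Work instance,
// mark it as a daemon so it cannot block application server shutdown.
public class InboundListenerFactory {
    public static Thread newListener(Runnable endpointDriver) {
        Thread t = new Thread(endpointDriver, "foo-inbound-listener");
        t.setDaemon(true);   // the JVM may exit even if this thread is still running
        return t;
    }
}
```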

For the next weeks I’ll be posting some insights about how to implement an Inbound Connector using JCA.


Websphere PMI: enabling and viewing data

Anyone who has ever needed a deeper look at the application internals that may be impacting performance has probably had these impressions:

  • System.out.println with System.nanoTime (or currentTimeMillis) is tedious, error-prone and limited
  • A profiler is overkill, not to mention cumbersome (and unavailable on certain platforms [e.g.: TPTP on AIX]*)

This is the scenario where Websphere PMI is a killer feature.

Imagine that your application isn't performing as expected. There can be many reasons for poor performance. I have myself faced a scenario where the application was waiting a long time to get a JMS connection from the Websphere internal provider, since its default maximum of 10 connections isn't acceptable for any application with performance requirements of even 100 transactions per second.

Enabling PMI

By default, Websphere 6.1 ND comes with basic PMI metrics enabled. These include, for example:

  • Enterprise Beans.Create Count
  • JDBC Connection Pools.Wait Time
  • JDBC Connection Pools.Use Time

If you need anything beyond the defaults, you can change them under:

Monitoring and Tuning > Performance Monitoring Infrastructure (PMI)

then click on the desired server.

After you have chosen the desired metrics (remember that more metrics mean more CPU impact at runtime), go to the following menu:

Monitoring and Tuning > Performance Viewer > Current Activity

Now you need to check whether your server is in fact already collecting data: if PMI is enabled but the server is not collecting, the Collection Status column will show Available. In order to start collecting, check the desired server and click the Start Monitoring button. After clicking the button, the status column will show Monitored.

Now you can click on the desired server and tick, for example, one of your connection pools in the tree on the left; you should see a structure similar to the one below:

Performance Modules > JDBC Connection Pools > Oracle XA Provider > yourDataSource

After clicking the metric you'll have a graph display of the current data, and also a table with a snapshot of the indicator below it.

* note: Eclipse TPTP is said to be supported on AIX as of version 4.3.1, but I have not been able to make it work


Connection request flow on outbound JCA connector

Continuing with the posts about outbound JCA connectors, let's have a quick look at one of the main flows of an outbound JCA connector: the connection request flow.

By now you might be wondering: “What on earth is responsible for making my connectors poolable? Is there any dark magic involved?”.

First answer: “the ConnectionManager”; second answer: “yes, and it is container dependent!”. That's the reason you should keep an instance of the ConnectionManager inside your ConnectionFactory when it is created: that's how you delegate connection creation to the container's pooling implementation.
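To make this delegation concrete, here is a trimmed-down sketch. The interfaces are minimal stand-ins for the real ones in javax.resource.spi (reduced to the one method this flow needs), and the Foo names follow the naming used in these posts:

```java
// Stand-ins for the javax.resource.spi types, trimmed to what this flow needs.
interface ConnectionRequestInfo { }
interface ManagedConnectionFactory { }
interface ConnectionManager {
    Object allocateConnection(ManagedConnectionFactory mcf, ConnectionRequestInfo info);
}

// The factory the application sees. It never creates connections itself:
// it hands the request to the container's ConnectionManager, which pools.
public class FooConnectionFactory {
    private final ManagedConnectionFactory mcf;
    private final ConnectionManager cm;   // container-supplied, kept at creation time

    public FooConnectionFactory(ManagedConnectionFactory mcf, ConnectionManager cm) {
        this.mcf = mcf;
        this.cm = cm;
    }

    public Object getConnection(ConnectionRequestInfo info) {
        return cm.allocateConnection(mcf, info);   // pooling happens in the container
    }
}
```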

Enough explanation; let's have a look at a sequence diagram for a connection creation when there are no connections inside the pool.

First flow: connection request with no matching connections

First thing to note: it is up to you to implement matchManagedConnections. I've put a note on the next diagram that might help with this implementation: making the ManagedConnection able to compare itself to a ConnectionRequestInfo makes things easier. Also be aware that some application servers (e.g.: WebSphere) skip the matchManagedConnections step if they detect by themselves that the pool is empty or that there is no matching connection.

Sequence Diagram - No Matching Connection

Another point about the specification: it states that a single managed connection may be responsible for more than a single physical connection (FooConnectionImpl); this is required for scenarios where connection sharing is possible (refer to the specification).

Previously I tried an implementation using a single handle, but I noticed that it does not take much effort to make it fully adherent. In fact, it is only necessary to use a delegate (FooConnectionHandle) that implements the FooConnection interface and delegates almost all methods to the FooConnectionImpl instance held inside the FooManagedConnection instance (refer to section 6.8.1 of the specification). The exception to the delegation is the close method that your connection must provide: in the delegate, this method is responsible for raising the connection-closed event, which is how you signal the container that you are giving the connection back to the pool.

Second flow: connection request when there are matching connections inside the pool.

Sequence Diagram - Matching Connections

This flow is executed whenever the container guesses there are potential matching connections. The matchManagedConnections method is invoked with the minimal connection set the container can identify. The connection set is a key point: the specification states that the container is responsible for determining the minimal set of potential matches, to avoid degrading performance while looking for the connection. I have also noticed that some containers don't check the connections inside this Set before sending them for matching.

Implementation Tips

Here are some decisions that might help in the implementation of the ManagedConnection and ManagedConnectionFactory.

  • Store the ConnectionRequestInfo that was passed to the ManagedConnectionFactory on creation of the ManagedConnection as an attribute inside the ManagedConnection; this leads to the next tip
  • Use the ConnectionRequestInfo stored in the ManagedConnection as the argument of an overload of the equals method of the ManagedConnection class; this helps in the implementation of the matchManagedConnections method
  • Never, ever forget to also implement the regular equals(Object) method on your ManagedConnection. The tip above may lead you to forget this detail. Don't skip it under any circumstance, since the specification requires it, and some containers freak out after some usage time if this method is not implemented (connection borrow time goes all the way up)
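The second and third tips can be sketched as below. This is an illustrative stand-in, not the real javax.resource.spi types: a plain String plays the role of the ConnectionRequestInfo data, and the hypothetical match method shows where matchManagedConnections would use the overload:

```java
import java.util.Objects;
import java.util.Set;

// Illustrative sketch: the ManagedConnection keeps the request data it was
// created with, offers an equals overload for matching, and still implements
// the regular equals(Object)/hashCode pair.
public class FooManagedConnection {
    private final String userName;   // stands in for the stored ConnectionRequestInfo data

    public FooManagedConnection(String userName) { this.userName = userName; }

    // Overload used when matching against an incoming request
    public boolean equals(String requestedUserName) {
        return Objects.equals(userName, requestedUserName);
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof FooManagedConnection
                && Objects.equals(userName, ((FooManagedConnection) other).userName);
    }

    @Override
    public int hashCode() { return Objects.hashCode(userName); }

    // Where matchManagedConnections would use the overload
    public static FooManagedConnection match(Set<FooManagedConnection> candidates,
                                             String requestedUserName) {
        for (FooManagedConnection mc : candidates) {
            if (mc.equals(requestedUserName)) return mc;
        }
        return null;
    }
}
```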

Next post I'll focus on how to signal the container that something went wrong with the connection, and how to validate the connection before it is returned to the requestor.


Outbound JCA connectors introduction

Continuing with the posts about JCA, I'll start with the design of an outbound JCA connector that I'll call FooJCA. First of all, let me make this clear: I am not claiming this design is an optimal JCA connector design; what I can say is that I tried to make it as compliant as possible, from my understanding of the documentation.

Let's start with an overall view of the main classes and interfaces involved in a JCA connector implementation.

Foo Connector Class Diagram

The first class to take note of is FooResourceAdapter; that's one of the connector classes that can communicate directly with the container and use some privileged services (e.g.: scheduling Work instances or creating timers). This ability is provided by the BootstrapContext that the container passes as a parameter of the start method.

Another key class that also has the ability to interact with the container is the FooManagedConnectionFactory. As you may have noticed, this class isn't always container dependent, or at least it may not be (it is up to the developer to choose whether to provide the connector's functionality outside the container). The difference (between working inside the container or outside it) is usually detected when the createConnectionFactory method is called with or without a ConnectionManager. Speaking of the ConnectionManager, it is a container-implemented class responsible for pooling connections and related work. This container work relies on the matchManagedConnections method provided by the ManagedConnectionFactory, so pay close attention to that method's implementation.

Moving on, FooConnectionRequestInfo is an implementation of the ConnectionRequestInfo interface and is the class responsible for passing connector-dependent details about the request between the container and the JCA connector. One of the requirements of this class is that it implements equals and hashCode.

The FooManagedConnectionFactory, as you may guess, is the class responsible for creating the FooConnectionFactory instances, and the FooManagedConnection instances as well. It does not matter whether the FooConnectionFactoryImpl instance will be container dependent or not; it is the FooManagedConnectionFactory's role to create it.

Last but not least, FooConnectionFactoryImpl will be the class responsible for creating FooConnectionImpl instances when requested (this will be treated in a separate post, since it is a somewhat long topic).

And, to finish this post, FooManagedConnectionImpl is the class responsible for wrapping a FooConnectionImpl and returning delegates (the FooConnectionHandle class) that implement the FooConnection interface and notify the container about the status of the managed connection.

That's enough for this post; keep watching the blog for the rest of the implementation of the outbound JCA connector.

