Archive for April, 2010


Embedding Cassandra into your JavaEE AppServer

Recently I started running some tests with Apache Cassandra. I was honestly impressed by its ability to scale horizontally and therefore to handle increasing load. Among its main virtues are:

  • Ability to replace failed nodes without downtime
  • Complete absence of a single point of failure. Every node is identical in functionality

not to mention a few other features.
But one point still bothers me: I see Cassandra and its API as a foundation layer, much like what JGroups has become. JGroups is widely used nowadays, but rarely directly by the application developer. You use it indirectly when you use a clustered JBoss, JBoss TreeCache or Infinispan.
One responsibility that still lies with the application developer is failover. When your application connects to a Cassandra node, you still need to fetch, through its JMX API, the rest of the nodes that are part of the cluster; otherwise, if that node fails, your application won’t know where to connect, even though your Cassandra cluster is still up and healthy. Another possibility (and in fact a recommendation even if you retrieve the node list) is to keep a list of reliable servers to use as bootstrap connection servers. But remember Murphy’s Law: every server on that list may go down, so you still need to retrieve the whole node list for failover.
So, in summary, we have something like what is depicted in the following picture:

Cassandra Regular usage thru JCA and Thrift

Note that there is an inherent complexity in this solution: the JCA connector is responsible for keeping a list of failover nodes, something a Cassandra instance already does, so we end up violating the DRY principle.
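To make that duplicated bookkeeping concrete, here is a minimal sketch of the failover loop such a connector would have to carry. It assumes the node list was already obtained (from the bootstrap list or through JMX) and uses a plain java.net.Socket instead of a Thrift transport; the names FailoverConnector and connectToAny are hypothetical and not part of Cassandra or JCA.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.List;

// Hypothetical helper: not a Cassandra or JCA class.
final class FailoverConnector {

    /**
     * Tries each known node in order and returns the first socket that
     * connects. Throws only when every node on the list is unreachable.
     */
    static Socket connectToAny(List<InetSocketAddress> nodes, int timeoutMillis)
            throws IOException {
        IOException lastFailure = null;
        for (InetSocketAddress node : nodes) {
            Socket socket = new Socket();
            try {
                socket.connect(node, timeoutMillis);
                return socket; // first reachable node wins
            } catch (IOException e) {
                lastFailure = e; // remember why this node failed, try the next
                try {
                    socket.close();
                } catch (IOException ignored) {
                    // nothing useful to do if closing a failed socket fails
                }
            }
        }
        throw new IOException(
                "No node reachable out of " + nodes.size(), lastFailure);
    }
}
```

A connector would call connectToAny with the current node list every time a connection breaks, and it would also have to refresh that list periodically, which is exactly the state Cassandra already maintains on its own.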
But what alternatives do we have?

The StorageProxy API

It turns out that the CassandraDaemon class (the main one responsible for starting up a Cassandra node) does only a few things, and we can embed them into our application, or should I say into our JCA connector, since in a JavaEE server this is the only place where you should be spawning threads and opening files directly from the filesystem.
In fact those few steps are properly described on the Cassandra Wiki.
If you then take what we could call a regular approach, you’d spawn the embedded Cassandra and connect to localhost to make the application talk to the Cassandra server. You’d end up with unnecessary network communication that could reduce performance by increasing latency. It turns out that this can be avoided too, by using the (not so stable) StorageProxy API.
By taking all the steps described above you’d end up with a much simpler architecture, like the one below:

Cassandra Embedded into JavaEE Server

With this architecture you are shielded from the complexity of failing over; Cassandra handles it automatically for you. You could then argue: what if I need to scale the Cassandra layer independently? No problem! You can resort to a hybrid architecture like the one below:

Cassandra Embedded with Extra Nodes

In order to achieve this you only need to give these extra nodes the address of some of the JavaEE servers. This way, the extra standalone nodes can communicate with the Cassandra daemons inside the AppServer and become part of the Cassandra cluster.

Memory mapped TTransport

My first attempt at doing this with Cassandra involved implementing a TTransport class that would send commands to the server worker thread over a byte buffer and receive responses in a similar fashion. I tried this first because I was completely unaware of the StorageProxy API. Later I thought it could also work around that API’s lack of stability (as per the Apache Wiki page). But this turned out to be a not so easy task.
I thought of having a CassandraMMap class that would act like the CassandraDaemon class but differ in the TThreadPoolServer initialization, as below:

		this.serverTransp = new MMapServerTransport();
		TThreadPoolServer.Options options = new TThreadPoolServer.Options();
		options.minWorkerThreads = 64;
		this.serverEngine = new TThreadPoolServer(new TProcessorFactory(
				processor), serverTransp, inTransportFactory,
				outTransportFactory, tProtocolFactory, tProtocolFactory,
				options);

The same instance of MMapServerTransport would be handed to the client through a getter method in order to open client connections, as follows:

		TTransport tr = mmap.getServerTransp().getClientTransportFactory();
		TProtocol proto = new TBinaryProtocol(tr);
		Cassandra.Client client = new Cassandra.Client(proto);

Requests through getTransport would be queued on the server using the class below, and a TTransport for the server would be returned upon acceptImpl:

public class MMapTransportTuple {
	private TByteArrayOutputStream server2Client;
	private TByteArrayOutputStream client2Server;
	private TTransport clientTransport;
	private TTransport serverTransport;

	public MMapTransportTuple(int size) {
		server2Client = new TByteArrayOutputStream(size);
		client2Server = new TByteArrayOutputStream(size);
		clientTransport = new MMapTransport(server2Client, client2Server);
		serverTransport = new MMapTransport(client2Server, server2Client);
	}
	// some code omitted for brevity
}

This class would be responsible for binding the memory buffers from client to server and vice versa.
The last class involved in this implementation would be the MMapTransport:

public class MMapTransport extends TTransport {

	private TByteArrayOutputStream readFrom;

	private TByteArrayOutputStream writeTo;

	public MMapTransport(TByteArrayOutputStream readFrom,
			TByteArrayOutputStream writeTo) {
		this.readFrom = readFrom;
		this.writeTo = writeTo;
	}
	// read and write (omitted) would operate on the respective buffers,
	// which point to different buffers on the client and server TTransport instances...
}

But this turned out to be harder than I thought at first, and as time became a scarce resource I’ll stick with the StorageProxy API approach for now.
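For reference, the crossed-buffer idea behind MMapTransportTuple can be sketched with plain JDK classes. This is only an illustration under stated assumptions, not the implementation I attempted: it uses PipedInputStream and PipedOutputStream (which block the reader until data arrives) instead of TByteArrayOutputStream, and the names InMemoryDuplex and Endpoint are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for MMapTransportTuple built on JDK piped streams.
final class InMemoryDuplex {

    /** One side of the connection: what it reads from and what it writes to. */
    static final class Endpoint {
        final InputStream in;
        final OutputStream out;

        Endpoint(InputStream in, OutputStream out) {
            this.in = in;
            this.out = out;
        }
    }

    final Endpoint client;
    final Endpoint server;

    InMemoryDuplex(int bufferSize) throws IOException {
        // client -> server direction
        PipedInputStream serverIn = new PipedInputStream(bufferSize);
        PipedOutputStream clientOut = new PipedOutputStream(serverIn);
        // server -> client direction
        PipedInputStream clientIn = new PipedInputStream(bufferSize);
        PipedOutputStream serverOut = new PipedOutputStream(clientIn);
        // Each endpoint reads what the other one writes.
        client = new Endpoint(clientIn, clientOut);
        server = new Endpoint(serverIn, serverOut);
    }

    public static void main(String[] args) throws IOException {
        // Single-threaded demo with a small message; in real use the client
        // and the server worker would run on different threads.
        InMemoryDuplex duplex = new InMemoryDuplex(4096);
        duplex.client.out.write("ping".getBytes(StandardCharsets.UTF_8));
        duplex.client.out.flush();
        byte[] request = new byte[4];
        int read = duplex.server.in.read(request);
        System.out.println(new String(request, 0, read, StandardCharsets.UTF_8));
    }
}
```

Each pair of streams could presumably be wrapped in Thrift’s TIOStreamTransport to obtain a TTransport, but I have not tested that against Cassandra, so take it as a direction rather than a working recipe.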


Remapping keyboard keys on Windows

Looks like I have something against notebooks (or netbooks) with regular keyboards…
Recently I bought a second-hand Sony netbook that a friend had bought in Italy. Honestly, when I agreed to buy it I didn’t consider that it could come with a rather strange keyboard; we usually tend to think that every keyboard will work as a US-International one… What a mistake.
My first attempt was to map it as an ABNT2 keyboard and type blindly, but that wasn’t much good. Next I tried setting the language to Portuguese and the keyboard to Italian (hoping dead keys would work)… another failure… That was when I tried the old trick my Lenovo used on Windows (something similar to what I used on Linux… but the fact is Linux always has dead keys… no matter what keyboard language you are using…). But that trick can only swap keys: since my keyboard had no dead keys at all, I would never get dead keys with it.
Then I found the solution: the Microsoft Keyboard Layout Creator.
The tool is described as:

The Microsoft Keyboard Layout Creator (MSKLC) extends the international functionality of Windows 2000, Windows XP, Windows Server 2003, and Windows Vista systems by allowing users to:

  • Create new keyboard layouts from scratch
  • Base a new layout on an existing one
  • Modify an existing keyboard layout and build a new layout from it
  • Multilingual input locales within edit control fields
  • Build keyboard layout DLLs for x86, x64, and IA64 platforms
  • Package the resulting keyboard layouts for subsequent delivery and installation

With this tool you can even get around the “missing curly braces limitation” of Italian keyboards.
There is no secret to using it. First, pick a keyboard to base your new one on (or start from scratch, but I suggest picking one as a template) by choosing the “Load Existing Keyboard” option under the File menu. You’ll then be presented with a list similar to the one below:

Load Existing Keyboard

After you pick your template keyboard you’ll be presented with the screen where you can customize it. This screen presents you with a visual representation of the keyboard.

MSKLC Main Screen

When you click any of the keys you’ll be presented with a short popup and a button that opens the full customization screen.

Change Key Screen

This is where you provide new meanings for keyboard keys. In my particular case I had to change the circumflex key from a simple key to a dead key. To do this I had to enable the dead key view in the window above and then define all the dead key combinations.

Dead Key Mapping

Note that the convention is to have the symbol’s composition with white space as the last entry on the list (the tool will complain if it is not, and I honestly did not test without complying with this).
When you are done you only need to build the setup package and install it on the desired machine.

Swapping Keys

Perhaps, instead of such a powerful tool, you only need to swap one key for another, something like what Lenovo did on the Windows XP bundled with my notebook.
While searching the internet for documentation on this I found this site, which describes pretty well what you need to do, even with samples, but lacks the full scancode list. Then, after some googling (and a few outdated links on the MS site), I found this site that describes the remapping method and this one that has a Word document with scancodes. Another handy tool for this job is an old Visual Studio tool called Spy++, but it is only shipped with the paid Visual Studio versions.
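As an illustration of what such a remapping looks like, the sketch below builds the binary “Scancode Map” value that Windows reads from HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Keyboard Layout, which I assume is the mechanism those pages describe. The layout is two zeroed header DWORDs, an entry count that includes the terminator, one DWORD per mapping (scancode to send in the low word, physical key in the high word) and a zero terminator, all little-endian. The class ScancodeMap and its methods are hypothetical helpers; Caps Lock (0x3A) and Left Ctrl (0x1D) are only example keys.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper that only builds the value; it does not touch the registry.
final class ScancodeMap {

    /** Builds a "Scancode Map" value that swaps two keys. */
    static byte[] swap(int scancodeA, int scancodeB) {
        int entries = 3; // two mappings plus the null terminator
        ByteBuffer buf = ByteBuffer.allocate(8 + 4 + entries * 4)
                .order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(0);                         // header: version
        buf.putInt(0);                         // header: flags
        buf.putInt(entries);                   // number of entries
        putMapping(buf, scancodeB, scancodeA); // physical key A sends B
        putMapping(buf, scancodeA, scancodeB); // physical key B sends A
        buf.putInt(0);                         // null terminator
        return buf.array();
    }

    private static void putMapping(ByteBuffer buf, int sendAs, int physicalKey) {
        buf.putShort((short) sendAs);      // low word: scancode to send
        buf.putShort((short) physicalKey); // high word: key being pressed
    }

    /** Formats the value the way a .reg file expects a REG_BINARY. */
    static String toRegHex(byte[] value) {
        StringBuilder sb = new StringBuilder("hex:");
        for (int i = 0; i < value.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(String.format("%02x", value[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Swap Caps Lock (0x3A) and Left Ctrl (0x1D).
        System.out.println("\"Scancode Map\"=" + toRegHex(swap(0x3A, 0x1D)));
        // prints "Scancode Map"=hex:00,00,00,00,00,00,00,00,03,00,00,00,1d,00,3a,00,3a,00,1d,00,00,00,00,00
    }
}
```

The printed line can be pasted into a .reg file under that key; the new mapping only takes effect after a restart.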

