Archive for December, 2009

20
Dec
09

Hibernate SequenceGenerator with allocationSize=1 leads to huge contention

This post is a warning for Hibernate users.

I honestly don't know the reason, but someone annotated every entity in an application with:

@SequenceGenerator(allocationSize=1)

Brainstorming for a reason, we can speculate:

  1. To minimise sequence gaps caused by server reboots (unlikely)
  2. A misunderstanding of what the parameter really does. The developer probably thought it mapped to the INCREMENT BY parameter of the underlying database sequence (more likely)

Whatever the reason, the truth is that whenever you annotate an entity with @SequenceGenerator, Hibernate (at least up to 3.3.1.GA, as far as I know) delegates sequence generation to org.hibernate.id.SequenceHiLoGenerator, whose generate method is as follows:

public synchronized Serializable generate(SessionImplementor session, Object obj)
	throws HibernateException {
		if (maxLo < 1) {
			//keep the behavior consistent even for boundary usages
			long val = ( (Number) super.generate(session, obj) ).longValue();
			if (val == 0) val = ( (Number) super.generate(session, obj) ).longValue();
			return IdentifierGeneratorFactory.createNumber( val, returnClass );
		}
		if ( lo>maxLo ) {
			long hival = ( (Number) super.generate(session, obj) ).longValue();
			lo = (hival == 0) ? 1 : 0;
			hi = hival * ( maxLo+1 );
			if ( log.isDebugEnabled() )
				log.debug("new hi value: " + hival);
		}

		return IdentifierGeneratorFactory.createNumber( hi + lo++, returnClass );
	}

If you compare the generate method above with the one from org.hibernate.id.SequenceGenerator below, you'll notice that the latter is not synchronized:

public Serializable generate(SessionImplementor session, Object obj)
	throws HibernateException {

		try {

			PreparedStatement st = session.getBatcher().prepareSelectStatement(sql);
			try {
				ResultSet rs = st.executeQuery();
				try {
					rs.next();
					Serializable result = IdentifierGeneratorFactory.get(
							rs, identifierType
						);
					if ( log.isDebugEnabled() ) {
						log.debug("Sequence identifier generated: " + result);
					}
					return result;
				}
				finally {
					rs.close();
				}
			}
			finally {
				session.getBatcher().closeStatement(st);
			}

		}
		catch (SQLException sqle) {
			throw JDBCExceptionHelper.convert(
					session.getFactory().getSQLExceptionConverter(),
					sqle,
					"could not get next sequence value",
					sql
				);
		}

	}

The synchronized keyword on the former is no surprise, since SequenceHiLoGenerator does part of its sequence generation in memory and may be accessed concurrently by multiple threads.

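But the fact is that SequenceHiLoGenerator wasn't designed to be used with allocationSize=1 (strangely enough, it has a separate flow for this situation). In that flow (maxLo < 1) every call to generate goes to the database while holding the generator's lock, so all threads saving that entity queue up behind a single database round trip.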
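To make the cost visible, here is a minimal sketch of the hi/lo idea in plain Java. It is not Hibernate's class: HiLoSketch, nextSequenceValue and roundTrips are names I made up, the database sequence is replaced by an in-memory counter, and I'm assuming maxLo is allocationSize - 1 (which is consistent with the formula further down in this post).

```java
// Simplified hi/lo sketch; NOT Hibernate's actual SequenceHiLoGenerator.
class HiLoSketch {
    private final int maxLo;    // assumption: allocationSize - 1
    private long hi;
    private int lo;
    private long sequence = 0;  // in-memory stand-in for the database sequence
    long roundTrips = 0;        // how many times the "database" was asked for a value

    HiLoSketch(int allocationSize) {
        this.maxLo = allocationSize - 1;
        this.lo = maxLo + 1;    // forces a fetch on the first call
    }

    // Stand-in for "select nextval"; with a real database this is a JDBC round trip.
    private long nextSequenceValue() {
        roundTrips++;
        return ++sequence;
    }

    // synchronized because hi and lo are shared mutable state
    synchronized long generate() {
        if (maxLo < 1) {
            // allocationSize=1: every id costs a round trip while the lock is held
            return nextSequenceValue();
        }
        if (lo > maxLo) {
            // block exhausted: fetch a new hi value and start a fresh block
            long hival = nextSequenceValue();
            lo = 0;
            hi = hival * (maxLo + 1);
        }
        return hi + lo++;
    }

    public static void main(String[] args) {
        HiLoSketch one = new HiLoSketch(1);
        HiLoSketch fifty = new HiLoSketch(50);
        for (int i = 0; i < 100; i++) {
            one.generate();
            fifty.generate();
        }
        System.out.println("allocationSize=1  round trips: " + one.roundTrips);
        System.out.println("allocationSize=50 round trips: " + fifty.roundTrips);
    }
}
```

For 100 ids the allocationSize=1 instance asks the fake sequence 100 times, while the allocationSize=50 instance asks only twice. In the real generator each of those requests is a database call made inside the synchronized method.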

Reproducing this scenario seems to be very easy. Honestly, I have not tried a synthetic reproduction, but I guess that spawning a few threads with a simple entity configured with allocationSize=1 and having all of them call Session.persist might do the trick. All you need to do is take a thread dump after a few minutes (3 to 5) of execution and import it into a dump analyser (IBM Support Assistant if you are running on an IBM JVM). You'll notice that a great many threads are waiting on a lock (before entering the generate method, in AbstractSaveEventListener.saveWithGeneratedId).

But what are the possible alternatives?

If you want the entity IDs to reflect the value of the database sequence, use the following annotation:
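The lock pile-up can also be imitated without Hibernate or a database. The sketch below is only an approximation under my own assumptions: ContentionSketch is a made-up class, the database round trip is faked with a 1 ms sleep inside a synchronized method, and instead of a real thread dump it just samples the worker thread states and remembers how many were BLOCKED at once.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a generator configured with allocationSize=1.
class ContentionSketch {
    private long sequence = 0;

    // The simulated round trip happens while the monitor is held,
    // so every other caller has to wait for it to finish.
    synchronized long generate() {
        try {
            Thread.sleep(1); // assumed 1 ms of database latency
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ++sequence;
    }

    // Starts the workers, samples their states until all have finished,
    // and returns the set of ids they generated.
    static Set<Long> run(int threads, int idsPerThread, AtomicInteger maxBlocked) {
        ContentionSketch generator = new ContentionSketch();
        Set<Long> ids = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < idsPerThread; i++) {
                    ids.add(generator.generate());
                }
            }, "worker-" + t);
            workers[t].start();
        }
        try {
            boolean alive = true;
            while (alive) {
                int blocked = 0;
                alive = false;
                for (Thread w : workers) {
                    Thread.State state = w.getState();
                    if (state == Thread.State.BLOCKED) blocked++;
                    if (state != Thread.State.TERMINATED) alive = true;
                }
                maxBlocked.accumulateAndGet(blocked, Math::max);
                Thread.sleep(5); // sampling interval, chosen arbitrarily
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("sampling interrupted", e);
        }
        return ids;
    }

    public static void main(String[] args) {
        AtomicInteger maxBlocked = new AtomicInteger();
        Set<Long> ids = run(8, 50, maxBlocked);
        System.out.println("ids generated: " + ids.size());
        System.out.println("max workers seen BLOCKED at once: " + maxBlocked.get());
    }
}
```

The BLOCKED count depends on timing and on the machine, so treat it as an indication only. A real reproduction would still need the entity, Session.persist and a proper thread dump as described above.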


@GenericGenerator(strategy="sequence")

Another possibility is removing allocationSize (which falls back to the default value of 50) or configuring it with a greater value. In this case, however, you'll need to update the database sequence so that you don't end up with a gap in your IDs, since SequenceHiLoGenerator uses the following formula for sequence generation:

DBSequence*allocationSize <= IDs < (DBSequence+1)*allocationSize

And upon startup it immediately increments the sequence by one, acquiring a new slot of IDs.

So you’ll have to restart your sequence with the following value:


select round(max(id)/allocationSize) from table;
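Here is the arithmetic worked out in a small Java sketch. The numbers (a highest ID of 1234 and an allocationSize of 50) are hypothetical, BlockArithmetic and its methods are names I made up, and I'm assuming that the division yields a fractional result before rounding and that the first value the generator fetches after the restart is the restart value plus one.

```java
// Worked example of the block formula; all names and numbers are illustrative.
class BlockArithmetic {
    // Lowest id of the block that a given sequence value maps to.
    static long blockStart(long dbSequence, int allocationSize) {
        return dbSequence * allocationSize;
    }

    // One past the highest id of that block.
    static long blockEnd(long dbSequence, int allocationSize) {
        return (dbSequence + 1) * allocationSize;
    }

    // Mirrors: select round(max(id)/allocationSize) from table
    static long restartValue(long maxId, int allocationSize) {
        return Math.round((double) maxId / allocationSize);
    }

    public static void main(String[] args) {
        long maxId = 1234;        // hypothetical highest id already in the table
        int allocationSize = 50;

        long restart = restartValue(maxId, allocationSize);
        long next = restart + 1;  // assumption: first fetch returns restart + 1

        System.out.println("restart sequence with: " + restart);
        System.out.println("first block after restart: ["
                + blockStart(next, allocationSize) + ", "
                + blockEnd(next, allocationSize) + ")");
    }
}
```

With these numbers the sequence restarts at 25, the generator fetches 26, and the first block is 1300 to 1349, safely above the existing 1234 and with only a small gap.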

19
Dec
09

WebSphere eXtreme Scale 6 book

As already stated, I was invited to review the WebSphere eXtreme Scale book from Packt Publishing. The book is an excellent choice for anyone interested in adopting WebSphere eXtreme Scale in a solution.

[Image: WebSphere eXtreme Scale Book]

Chapter one starts with the basic concepts of a data grid and ends with a hello-world-style application.

Chapter two starts to dive deeper into WebSphere eXtreme Scale concepts. The first one explored is the ObjectMap API, which is composed of the ObjectMap and BackingMap objects.

Chapter three is the one that especially caught my attention, since it introduced me to a feature I had no clue eXtreme Scale had: a JPA-like implementation. This is one of the things that makes eXtreme Scale such a nice tool. By now you are probably thinking: “but what is the difference between this and eXtreme Scale acting as an ORM cache?”. In fact there are a great many differences. First, in an ORM solution you're caching data that becomes objects only when “hydrated” (as Hibernate calls it; this article explains a bit about the Hibernate 2nd level cache), so you still pay the time and memory cost of hydrating objects on each request. Another major advantage is that ORM caches are usually invalidated in some update scenarios, which leads to data being reloaded.

Chapter four covers how to integrate a data grid with a database store. As the book states, you'd need to integrate with a database backend to provide a richer and wider set of tools for report generation, to integrate with legacy applications that still interact with the database, etc.

Chapter five is dedicated to handling increased load. If you thought: “Do I still need to care about load increase?” the answer is “YES! For sure!”. Every piece of software needs to be properly configured for increased load, eXtreme Scale is no exception, and this chapter covers important topics in planning for increased load.

Chapter six is devoted to keeping data available, so it focuses on replication and other important points.

Chapter seven (which is also freely available online) presents a rather different pattern for performing tasks in a distributed application: pushing the operation instead of pulling the data.

Chapter eight presents some common patterns for using a data grid.

Chapter nine covers some facilities for Spring integration.

And finally, chapter ten covers a complete example of a project built upon eXtreme Scale.

One thing I'd like to see in the book's next edition is the ability to replace WebSphere Application Server's DynaCache with an eXtreme Scale cache implementation. But it is no surprise to me that this was missing from the book, since the book and this feature were released at almost the same time, which made it impossible to cover.

In summary: this is a book I really recommend to anyone interested in picking a data grid solution for Java. I recommend it even more if you are already inclined to use an IBM solution and stick with eXtreme Scale.

01
Dec
09

Mapping JavaEE and .NET stack components

Recently I started to research how to develop on .NET the way I've been developing on JavaEE.
The first thing that came to my mind as a JavaEE architect was: “Okay, IIS and ASP.NET are some of the presentation tier alternatives I have for .NET, but what about the business tier?” or, rephrasing as someone who has been using JavaEE for a long time: “What would be the .NET EJB?”.

[Image: JavaEE-DotNETComparison]

The first thing I missed was the concept of the MDB, which in the .NET stack seems to be replaced by an MSMQ message trigger.

Another major .NET difference is that, even though it seems to have the concept of a VM process (as Java does), it also has the concept of a shared library, the Global Assembly Cache. In Java terms, you have to either manually replicate the jar or share it through shared storage, and then put it on the classpath.

Since we are on the subject of the classpath, .NET and its CLR seem to avoid the concept of a ClassLoader (limiting it to J#). Although sometimes problematic, the Java ClassLoader allows very sophisticated scenarios of application composition and also hot code generation (by the application) at runtime.

I hope this post helps anyone in the situation I was in, and I also hope it does not sound like a comparison of which platform is better. As you might guess, I sincerely believe that each one has its applications, strengths and weaknesses.



