Monitoring Session Replication in J2EE Clusters
Subject:   Solving the Session Management challenges
Date:   2004-09-10 06:31:34
From:   cpurdy
(Disclaimer: I work for Tangosol, whose Coherence product includes a session management module, Coherence*Web, which addresses all of the problems discussed in the article, and is used for high-scale applications like the one discussed in this article.)

The mention of multicast should raise a red flag -- any server using multicast to manage session information will scale extremely poorly under load and probably cause all sorts of network issues. That's why the article has to suggest that you use small clusters and tiny sessions (3-5KB):

  • Reduce the number of replicating nodes in your cluster (some application servers allow you to isolate the nodes that work together in small groups).

  • Reduce the HTTP session object to the minimum amount of relevant information.

  • Save your session information to a database or file regularly, freeing the HTTP session object of your application.
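To see whether a given attribute fits in that 3-5KB budget, you can measure what it costs to serialize, since that is roughly what a replicating server has to ship to the other nodes. This is only a rough sketch using plain Java serialization; the class name, the "catalog" list and the "catalog:user-42" key are made up for illustration, and a real server may use a different wire format.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;

public class SessionSizeCheck {

    /** Returns the number of bytes standard Java serialization produces for the value. */
    static int serializedSize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        // Hypothetical "fat" attribute: a whole product list cached in the session.
        ArrayList<String> catalog = new ArrayList<>();
        for (int i = 0; i < 500; i++) {
            catalog.add("product-" + i + ": some cached description text");
        }

        // Hypothetical "thin" attribute: only a key; the data itself lives
        // in a database or file, as the third suggestion proposes.
        String catalogKey = "catalog:user-42";

        System.out.println("fat attribute:  " + serializedSize(catalog) + " bytes");
        System.out.println("thin attribute: " + serializedSize(catalogKey) + " bytes");
    }
}
```

The fat attribute lands well above 5KB while the key is a few dozen bytes, which is the trade the article is asking the developer to make by hand.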

The problem I have with these suggestions is that the server vendor is putting the onus on the application developer for a piece of functionality that should just work (tm).

The real problem, though, is this statement:

"This implies that whenever your servlet/JSP engine is using multicast for replicating objects across application server instances, the objects will not appear on other nodes before 300 msecs have passed. Consequently, if a failover happens and the load balancing takes place in less than 300 msecs (which you can expect from most application servers), your client request will find an older version of the object in the new node and the application will become inconsistent."

In other words, even if you follow all of the suggestions, the approach being used will leave a big window of opportunity for data loss. That might be acceptable as an option ("put a check in this checkbox to allow your app to lose data but it will make the server run faster"), but it shouldn't be the default behavior.
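The window is easy to picture with a toy model. The sketch below is not how any particular application server works; it uses a logical clock instead of real multicast, takes the 300 msec figure from the article, and all names (ReplicationWindowDemo, readAfterFailover, the "cart" key) are invented for illustration. The synchronous flag stands in for the "checkbox" described above.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

public class ReplicationWindowDemo {

    static final long REPLICATION_DELAY_MS = 300; // figure quoted from the article

    /** One session update that has not yet reached the backup node. */
    static final class Pending {
        final String key;
        final String value;
        final long deliverAtMs;

        Pending(String key, String value, long deliverAtMs) {
            this.key = key;
            this.value = value;
            this.deliverAtMs = deliverAtMs;
        }
    }

    private final boolean synchronous;
    private final Map<String, String> primary = new HashMap<>();
    private final Map<String, String> backup = new HashMap<>();
    private final Queue<Pending> inFlight = new ArrayDeque<>();

    ReplicationWindowDemo(boolean synchronous) {
        this.synchronous = synchronous;
    }

    /** Applies a session write on the primary node at logical time nowMs. */
    void write(String key, String value, long nowMs) {
        primary.put(key, value);
        if (synchronous) {
            backup.put(key, value); // backup is current before the write returns
        } else {
            // Asynchronous: the backup only sees the update after the delay.
            inFlight.add(new Pending(key, value, nowMs + REPLICATION_DELAY_MS));
        }
    }

    /** The primary fails at nowMs and the request is served by the backup. */
    String readAfterFailover(String key, long nowMs) {
        // Only updates whose delivery time has passed made it to the backup.
        while (!inFlight.isEmpty() && inFlight.peek().deliverAtMs <= nowMs) {
            Pending p = inFlight.poll();
            backup.put(p.key, p.value);
        }
        return backup.get(key);
    }

    public static void main(String[] args) {
        ReplicationWindowDemo async = new ReplicationWindowDemo(false);
        async.write("cart", "1 item", 0);
        async.write("cart", "2 items", 1000);
        // Failover 100 msecs after the second write: inside the window.
        System.out.println("async: " + async.readAfterFailover("cart", 1100));

        ReplicationWindowDemo sync = new ReplicationWindowDemo(true);
        sync.write("cart", "1 item", 0);
        sync.write("cart", "2 items", 1000);
        System.out.println("sync:  " + sync.readAfterFailover("cart", 1100));
    }
}
```

In the asynchronous case the failover request sees "1 item" even though the user already added a second one; in the synchronous case it sees "2 items". The cost of the synchronous path is that every write waits for the backup, which is why it makes sense as the safe default with the faster, lossy mode as the opt-in.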

I think a better suggestion is to improve the application server or, if that isn't an option, to use Coherence*Web.


Cameron Purdy
Tangosol, Inc.
Coherence: Shared Memories for J2EE Clusters