Measuring Java Object Sizes

Probably pure coincidence, but I stumbled upon this Estimating Java Object Sizes with Instrumentation blog. Given we had just released Ehcache 2.5 with Automatic Resource Control, aka ARC, I had to read through it. We do, amongst other things, use an instance of java.lang.instrument.Instrumentation to measure object sizes ourselves. Yet we found some shortcomings to that approach:

Getting a reference to an Instrumentation instance!

As the blog mentions, you need an agent to get to that instance. Yet forcing every Ehcache user who wants to use ARC to add a -javaagent: flag just didn’t feel like a great idea. Trying to work around this, it turns out Java 6 introduced the Attach API. Now we can try to attach to the VM while it’s running and load the agent.
And when I say try, I do mean try! This can fail for all kinds of reasons: we’re running on JDK 5, so there is no Attach API for us to use; or attaching to the VM itself fails, which again can happen for different reasons. One particularly weird one is due to a bug on OS X when the java.io.tmpdir system property is set! And even if we do get to attach and load the agent, we still need the Ehcache code to get a reference to that Instrumentation instance. The agent classes are loaded by the system class loader, but the Ehcache classes aren’t necessarily, and we might not get access to the system class loader directly. We don’t necessarily need to, but we try to avoid accessing another agent class instance loaded by some other class loader. This generally wouldn’t be possible anyway, as we hide the java agent jar within the ehcache-core jar, so the classes it contains can’t be present multiple times…
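For the curious, a minimal sketch of such a self-attach could look like the following, assuming the path to the agent jar is known (Ehcache’s actual implementation extracts its embedded agent jar and handles far more failure modes than this):

import java.lang.management.ManagementFactory;
import com.sun.tools.attach.VirtualMachine;

// Minimal self-attach sketch (Java 6+, requires the JDK's tools.jar).
public class SelfAttach {
    public static void loadAgent(String agentJarPath) throws Exception {
        // The RuntimeMXBean name is conventionally "pid@hostname".
        String vmName = ManagementFactory.getRuntimeMXBean().getName();
        String pid = vmName.substring(0, vmName.indexOf('@'));
        VirtualMachine vm = VirtualMachine.attach(pid);
        try {
            vm.loadAgent(agentJarPath); // invokes the agent's agentmain()
        } finally {
            vm.detach();
        }
    }
}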

What if we can’t access an Instrumentation instance?

Ehcache’s sizeOf engine, as we named it, falls back to other mechanisms to size POJOs. We’ve added two other methods, to which we fall back should we be unable to access the Instrumentation instance: the Unsafe-based one and, finally, the reflection-based one.
The UnsafeSizeOf will try to get a reference to sun.misc.Unsafe#theUnsafe. Using that reference, we can query for an object’s last non-static field offset in memory using Unsafe.objectFieldOffset and do some math to calculate the object’s size in memory. I’ll come back to the “some math” part later…
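A minimal sketch of the idea, hand-waving that “some math” by assuming 8-byte object alignment (the real engine derives alignment and the last field’s own size from VM information):

import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

import sun.misc.Unsafe;

// Sketch of Unsafe-based sizing: find the highest non-static field offset
// and round up to the assumed alignment. This ignores the last field's own
// size and the header-only case; that is part of the "some math".
public class UnsafeSizeOfSketch {

    private static final Unsafe UNSAFE = retrieveUnsafe();
    private static final long ASSUMED_ALIGNMENT = 8L; // assumption, VM-dependent

    public static long sizeOf(Object obj) {
        long maxOffset = 0L;
        for (Class<?> klazz = obj.getClass(); klazz != null; klazz = klazz.getSuperclass()) {
            for (Field field : klazz.getDeclaredFields()) {
                if (!Modifier.isStatic(field.getModifiers())) {
                    maxOffset = Math.max(maxOffset, UNSAFE.objectFieldOffset(field));
                }
            }
        }
        // Round the last field offset up to the next alignment boundary.
        return ((maxOffset + ASSUMED_ALIGNMENT - 1) / ASSUMED_ALIGNMENT) * ASSUMED_ALIGNMENT;
    }

    private static Unsafe retrieveUnsafe() {
        try {
            Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            return (Unsafe) theUnsafe.get(null);
        } catch (Exception e) {
            throw new IllegalStateException("Unable to access theUnsafe", e);
        }
    }
}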
And finally, should we be unable to gain access to theUnsafe, we use reflection-based sizing. This measures all primitives and references within an object and sums the memory they use. Dr. Heinz Kabutz published more details on that approach in his Java Specialists Newsletter #78: MemoryCounter for Java 1.4 back in 2003.
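In the spirit of that MemoryCounter, a shallow version could look like this, with header and reference sizes as loudly stated assumptions (a 64-bit VM without compressed oops), whereas the real engine detects them:

import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Sketch of reflection-based shallow sizing: sum assumed sizes of all
// primitive and reference fields, plus an assumed header size. Deep sizing
// additionally walks the reachable object graph.
public class ReflectionSizeOfSketch {

    private static final int ASSUMED_HEADER_SIZE = 16;   // assumption: 64-bit VM
    private static final int ASSUMED_REFERENCE_SIZE = 8; // assumption: no compressed oops

    public static long shallowSizeOf(Object obj) {
        long size = ASSUMED_HEADER_SIZE;
        for (Class<?> klazz = obj.getClass(); klazz != null; klazz = klazz.getSuperclass()) {
            for (Field field : klazz.getDeclaredFields()) {
                if (Modifier.isStatic(field.getModifiers())) {
                    continue;
                }
                Class<?> type = field.getType();
                if (type == long.class || type == double.class) {
                    size += 8;
                } else if (type == int.class || type == float.class) {
                    size += 4;
                } else if (type == char.class || type == short.class) {
                    size += 2;
                } else if (type == byte.class || type == boolean.class) {
                    size += 1;
                } else {
                    size += ASSUMED_REFERENCE_SIZE;
                }
            }
        }
        return size; // alignment rounding omitted for brevity
    }
}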

Now that’s all very simple, isn’t it?

Well… Sadly it isn’t. But luckily, we’ve mostly sorted it all out for you! We had just finished the agent-based implementation (which didn’t auto-attach yet) and started testing. Obviously, since this calls into the VM’s internals, it would all magically figure everything out. Well, no. CMS wasn’t properly accounted for: CMS needs a certain minimal amount of memory to store information when an object is garbage collected and its memory allocation is “freed”. That affects the minimal size an object will use on heap. And that was HotSpot only… We then moved on to testing on JRockit, which required some finer adjustments, but I won’t go into those here.
CMS, compressed oops and minimum object size were just some of the things we needed to account for in the “some math” of the other implementations: pointer sizes (32- vs. 64-bit VMs), object alignment, field offset adjustment (on JRockit) and “object header” size. All of these required us to gather information about the VM the sizing was happening on in order to properly measure object sizes, even when using the Instrumentation instance to measure.

Know what to measure!

As you could read in Heinz’s newsletter, there are some objects you probably don’t want to account for, especially when measuring the size cached entries are using on heap. There are all the obvious ones: statics, classes and other “flyweight type objects”. These can all automatically be discarded by the sizing engine. But at other times, you also don’t want every cached entry to account for a particular part of an object graph, simply because, for instance, every cached entry will reference that particular bit. Hibernate’s 2nd level cache is a good example of that. For that particular case, we’ve added a “resource” file that describes fields and types to be discarded when measuring a cache entry’s size on heap. For application types though (ones not going into the cache through Hibernate, but through applications using the Ehcache API directly), we’ve added the @IgnoreSizeOf annotation. Annotating a field, a type or even an entire package with it will result in the sizing engine skipping that part of the graph (those types, or the types in those packages, respectively) while doing the sizing.
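As a sketch, a cached value type might exclude a shared registry that every entry references (the types here are hypothetical; only the @IgnoreSizeOf annotation comes from the Ehcache jar):

import net.sf.ehcache.pool.sizeof.annotations.IgnoreSizeOf;

// Hypothetical cached value type: every entry references the same shared
// registry, so we exclude that field from each entry's measured size.
public class CachedValue {

    private final String payload;

    @IgnoreSizeOf
    private final SharedRegistry registry; // shared by all entries, not sized

    public CachedValue(String payload, SharedRegistry registry) {
        this.payload = payload;
        this.registry = registry;
    }
}

class SharedRegistry { } // hypothetical placeholder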

Try it now!

Ehcache 2.5 is out now and available for direct download or through Maven Central. Using Ehcache ARC, it enables you to size your caches simply using values in bytes; you can read more about cache sizing on the ehcache.org website.
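A minimal sketch of byte-based sizing through the fluent configuration API (class and method names as documented for Ehcache 2.5; double-check against your version):

import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.config.CacheConfiguration;
import net.sf.ehcache.config.MemoryUnit;

// Sketch: a cache capped at 512MB of local heap, letting ARC do the sizing.
public class ArcSizingSketch {
    public static void main(String[] args) {
        CacheManager manager = CacheManager.create();
        Cache cache = new Cache(
            new CacheConfiguration().name("byte-sized-cache")
                .maxBytesLocalHeap(512, MemoryUnit.MEGABYTES));
        manager.addCache(cache);
    }
}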


Asynchronous Job Execution in the Cloud @ JavaOne

I’ll be presenting “Asynchronous Job Execution in the Cloud” (Session ID: 24301) later this week at JavaOne in San Francisco. The session will be held at the Hotel Nikko (Carmel I/II) on Wednesday, Oct. 5, at 1pm.
I will cover how deployment topologies require us to rethink how we go about job execution. As we moved from one machine to multiple, and now to ever-changing environments, we need better tools to express the requirements of our jobs, so that we can leverage IT infrastructures optimally and keep the throughput of our applications as high as possible.
I plan on covering this generally and then illustrating how Quartz achieves those goals when clustered using Terracotta. I’m looking forward to the session and the discussions it might trigger. Also feel free to drop by one of our booths, either in the Hilton Hotel (Continental Ballroom / JavaOne exhibitor hall / booth #5201) or in the Moscone Center’s South Hall (booth #640), where I’ll also be hanging around!


Ehcache's Writer API Screencast

A common pattern for offloading the database is to use a cache in front of it. Yet, generally, it’s still application code that goes to the underlying system of record for writes. It then also becomes responsible for invalidating or updating cached entries.

Using cache writers not only automates that aspect, but also enables you to scale your writes. This 5-minute screencast explains how to use the Ehcache Writer API to achieve this and what it means for your application code: especially in distributed environments, when clustering your caches with Terracotta, the contract becomes looser than the usual “happens once and only once”. Indeed, updates to the database will happen at least once; in case of failure, entries could get updated more than once…
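As a rough sketch of what this looks like, a writer can extend Ehcache’s AbstractCacheWriter so the application only ever talks to the cache (the Dao type below is a hypothetical stand-in for your system of record; note how write() is kept replay-friendly):

import net.sf.ehcache.CacheEntry;
import net.sf.ehcache.Element;
import net.sf.ehcache.writer.AbstractCacheWriter;

// Sketch of a CacheWriter pushing mutations to the system of record.
public class DatabaseWriter extends AbstractCacheWriter {

    private final Dao dao; // hypothetical DAO for the underlying database

    public DatabaseWriter(Dao dao) {
        this.dao = dao;
    }

    @Override
    public void write(Element element) {
        // May be replayed after a failure ("at least once"): keep it idempotent.
        dao.upsert(element.getObjectKey(), element.getObjectValue());
    }

    @Override
    public void delete(CacheEntry entry) {
        dao.delete(entry.getKey());
    }
}

interface Dao { // hypothetical system-of-record interface
    void upsert(Object key, Object value);
    void delete(Object key);
}

The application then registers the writer with cache.registerCacheWriter(new DatabaseWriter(dao)) and mutates entries through cache.putWithWriter(…) and cache.removeWithWriter(…) instead of going to the database itself.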

You can also download the Ehcache Raffle application, which demonstrates Cache Writers and Cache Loaders, from github.com.


Quartz Where!

One week ago we released the first beta of our Fremantle release, which includes Ehcache 2.4, Terracotta Enterprise Suite 3.5 and, last but not least, Quartz 2.0. You have been able to cluster your Quartz Scheduler instances using Terracotta for a while already. Yet, as with a JDBC-backed store, you had no control over which node your job would be executed on. The only guarantee was that your job would be executed once within the cluster. Quartz Where aims at addressing exactly that, and is one of the many new features that are part of this new major release of our product line.

A popular demand from clustered Quartz Scheduler users was to be able to specify where a job would be executed: because the data for the job is known to be present on some machine (say, via NFS-like file sharing), or because the job requires a lot of processing power and memory. Controlling the locality of execution is now feasible. We have tried to make this a seamless addition to Quartz: you can configure jobs to be dispatched to node groups using a simple configuration file, or programmatically schedule LocalityJob or LocalityTrigger instances. Let’s first cover the configuration-based approach, which doesn’t require any code changes to an existing Quartz 2.0 application.

Configuration based locality of execution

Before getting started with this new feature, you will have to configure your Quartz Scheduler to use the Terracotta Enterprise Store by setting the property org.quartz.jobStore.class to org.terracotta.quartz.EnterpriseTerracottaJobStore. If you were not already using Terracotta to cluster Quartz, you will also have to set the org.quartz.jobStore.tcConfigUrl property to point to the Terracotta server. Here is a small example of a quartz.properties file:

org.quartz.scheduler.instanceName = QuartzWhereScheduler

org.quartz.scheduler.instanceId = AUTO
org.quartz.scheduler.instanceIdGenerator.class = org.terracotta.quartz.demo.locality.SystemPropertyIdGenerator

org.quartz.threadPool.class = org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount = 10
org.quartz.threadPool.threadPriority = 5
org.quartz.threadPool.threadsInheritContextClassLoaderOfInitializingThread = true

org.quartz.jobStore.class = org.terracotta.quartz.EnterpriseTerracottaJobStore
org.quartz.jobStore.tcConfigUrl = localhost:9510

Using the quartzLocality.properties configuration file, you can define node groups. A node group is composed of one or more Quartz Scheduler instance nodes (generally one per machine within your cluster). You define them as such:

org.quartz.locality.nodeGroup.slowNodes = tortoise, snail
org.quartz.locality.nodeGroup.fastNodes = hare, leopard
org.quartz.locality.nodeGroup.linuxNodes = tortoise

We have now defined three groups: slowNodes, fastNodes and linuxNodes. We can now use these node groups to have jobs or triggers executed by them, depending on their group. Quartz jobs and triggers were, and still are, uniquely identified by a name and group pair. Through the same configuration file, we can have all jobs (or triggers) of a certain group only get executed on a node of a given node group:

org.quartz.locality.nodeGroup.fastNodes.triggerGroups = bigJobGroup
org.quartz.locality.nodeGroup.linuxNodes.triggerGroups = reporting

Now all triggers from the group bigJobGroup will be executed by a scheduler from the fastNodes group, either the hare or the leopard scheduler. These scheduler nodes receive unique IDs, as before, by providing an org.quartz.spi.InstanceIdGenerator implementation to the scheduler at configuration time (don’t confuse this with the instanceName, which needs to be the same for all nodes for them to form a single clustered Scheduler). Triggers from the group reporting will always be executed on tortoise, as this is the only scheduler in the linuxNodes group.
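The SystemPropertyIdGenerator referenced in the quartz.properties above could, for example, look like the following sketch (the system property name is an assumption; any stable, per-node unique value works):

import org.quartz.SchedulerException;
import org.quartz.spi.InstanceIdGenerator;

// Sketch: derive the node name (tortoise, hare, ...) from a system property,
// so it matches the node names used in quartzLocality.properties.
public class SystemPropertyIdGenerator implements InstanceIdGenerator {

    public String generateInstanceId() throws SchedulerException {
        String nodeName = System.getProperty("quartz.node.name"); // assumed property name
        if (nodeName == null) {
            throw new SchedulerException("quartz.node.name system property not set");
        }
        return nodeName;
    }
}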

Programmatic locality of execution

Using the new locality API for Quartz that is part of our Terracotta Enterprise Suite 3.5, you can achieve even finer-grained control and express more complex constraints about where a job should be executed. The example below uses the new DSL-like builder API introduced with Quartz 2.0. Let’s see how that looks:

LocalityJobDetail jobDetail =
    localJob(
        newJob(ImportantJob.class)
            .withIdentity("importantJob")
            .build())
        .where(
            node()
                .is(partOfNodeGroup("fastNodes")))
        .build();

On line 3, we create a new JobDetail for the Job implementation ImportantJob. We then wrap it on line 2 as a localJob that needs to be executed on a node that is part of the group fastNodes. You might have noticed that creating the JobDetail is pretty straightforward with the new API. Adding the locality information isn’t much more work either. You can be much more precise about where the job should be executed though. Let’s have a look at this example:

scheduler.scheduleJob(
    localTrigger(
        newTrigger()
            .forJob("importantJob"))
        .where(node()
            .has(atLeastAvailable(512, MemoryConstraint.Unit.MB))
            .is(OsConstraint.LINUX))
        .build());

Here we schedule an immediate trigger for the importantJob we registered in the previous example. Line 2 creates the locality-aware trigger, defining it to require a node that is running Linux and has at least 512 MB of heap available. Using these constraints, here memory and OS, you can be much more explicit about what the characteristics of the node executing the job should be.

Locality constraints

The new Terracotta clustered JobStore we’ve introduced will evaluate the constraints expressed on a trigger and/or a job to decide where to dispatch the job for execution. We plan on providing implementations for expressing constraints on the CPU, memory and operating system characteristics of the node. I am still heavily working on these, but what is being shipped as part of this first beta should give you a good feel for where we are headed. To try it yourself today, go fetch Fremantle beta 1 from the Terracotta website now!
