Probably pure coincidence, but I stumbled upon this Estimating Java Object Sizes with Instrumentation blog. Given we had just released Ehcache 2.5 with Automatic Resource Control, aka ARC, I had to read through it. We do, amongst other things, use an instance of java.lang.instrument.Instrumentation to measure object sizes ourselves. Yet we found some shortcomings to that approach:
Getting a reference to an Instrumentation instance!
As the blog mentions, you need an agent to get to that instance. Yet it felt like requiring every Ehcache user wanting to use ARC to add a -javaagent: flag just wasn’t a great idea. Trying to work around this, it turns out Java 6 introduced the Attach API. Now we can try to attach to the VM while it’s running and load the agent.
And when I say try, I do mean try! This can fail for all kinds of reasons: we’re running JDK 5, so there is no Attach API for us to use; or attaching to the VM itself fails, again for a variety of reasons. One particularly weird one is due to a bug on OS X when the java.io.tmpdir system property is set! And even if we do get to attach and load the agent, we still need the Ehcache code to get a reference to that Instrumentation instance. The agent classes are loaded by the system class loader, but the Ehcache classes aren’t necessarily, and we might not get access to the system class loader directly. We don’t necessarily need to, but we try to avoid accessing another instance of the agent class loaded by some other class loader. This should generally not be possible anyway, as we hide the java agent jar within the ehcache-core jar, so the classes it contains can’t be present multiple times…
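To make the moving parts concrete, here is a minimal sketch of what such an agent entry point can look like; the class and method names are illustrative, not Ehcache’s actual implementation:

```java
import java.lang.instrument.Instrumentation;

// Hypothetical sketch of an agent entry point. The same class can serve both
// the -javaagent: startup path (premain) and the Attach API path (agentmain).
class SizeOfAgent {
    private static volatile Instrumentation instrumentation;

    // invoked by the VM when started with -javaagent:sizeof-agent.jar
    public static void premain(String options, Instrumentation inst) {
        instrumentation = inst;
    }

    // invoked when the agent jar is loaded into an already running VM
    // through the Attach API
    public static void agentmain(String options, Instrumentation inst) {
        instrumentation = inst;
    }

    public static long sizeOf(Object obj) {
        Instrumentation inst = instrumentation;
        if (inst == null) {
            // no agent loaded: this is where the engine would fall back to
            // Unsafe- or reflection-based sizing
            throw new IllegalStateException("Instrumentation not available");
        }
        return inst.getObjectSize(obj);
    }
}
```

Either way the Instrumentation reference gets captured in a static field, which is exactly why the class-loader question above matters: the sizing code must reach the very instance of this class that the agent mechanism populated.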
What if we can’t access an Instrumentation instance ?
Ehcache’s sizeOf engine, as we named it, falls back to other mechanisms to size POJOs. We’ve added two other methods, to which we fall back should we be unable to access the Instrumentation instance: the Unsafe-based one and, finally, the reflection-based one.
The UnsafeSizeOf will try to get a reference to sun.misc.Unsafe#theUnsafe. Using that reference, we can query for an object’s last non-static field offset in memory using Unsafe.objectFieldOffset and do some math to calculate the object’s size in memory. I’ll come back later to the some math part…
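A rough sketch of that idea follows; the alignment constant is an assumption here rather than a probed VM characteristic, and the class names are mine, not Ehcache’s:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import sun.misc.Unsafe;

// Hypothetical sketch of the Unsafe-based approach: find the greatest
// non-static field offset, leave room for that field, and pad to alignment.
class UnsafeSizeOfSketch {
    private static final long ALIGNMENT = 8; // assumed object alignment

    static long sizeOf(Object o) {
        try {
            // grab the sun.misc.Unsafe#theUnsafe singleton reflectively
            Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            Unsafe unsafe = (Unsafe) theUnsafe.get(null);

            // greatest offset of any non-static field in the class hierarchy
            long maxOffset = 0;
            for (Class<?> c = o.getClass(); c != null; c = c.getSuperclass()) {
                for (Field f : c.getDeclaredFields()) {
                    if (!Modifier.isStatic(f.getModifiers())) {
                        maxOffset = Math.max(maxOffset, unsafe.objectFieldOffset(f));
                    }
                }
            }
            // "some math": past the last field lies at most one field-sized
            // slot, then pad to the VM's object alignment
            return ((maxOffset + ALIGNMENT + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT;
        } catch (Exception e) {
            return -1; // Unsafe unavailable: fall back to reflection-based sizing
        }
    }
}

class SizedExample { // demo type, only here to have fields to measure
    int count;
    long total;
}
```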
And finally, should we be unable to gain access to theUnsafe, we use reflection-based sizing. This measures all primitives and references within an object and sums up the size these use in memory. Dr. Heinz Kabutz published more details on that approach in his Java Specialists Newsletter #78: MemoryCounter for Java 1.4, back in 2003.
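A stripped-down sketch of such a reflection-based sizer is below. Header size, reference size and alignment are assumed constants here (a 64-bit VM with compressed oops); the real engine has to determine them per VM, and arrays are left out for brevity:

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical reflection-based sizer in the spirit of Heinz's MemoryCounter.
class ReflectionSizeOf {
    private static final long HEADER = 12;   // assumed object header size
    private static final long REF = 4;       // assumed (compressed) reference size
    private static final long ALIGNMENT = 8; // assumed object alignment

    static long deepSizeOf(Object root) {
        // identity-based visited set, so shared objects are counted only once
        return sizeOf(root, Collections.newSetFromMap(new IdentityHashMap<>()));
    }

    private static long sizeOf(Object o, Set<Object> visited) {
        if (o == null || !visited.add(o)) {
            return 0; // null, or already counted elsewhere in the graph
        }
        long size = HEADER;
        for (Class<?> c = o.getClass(); c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) {
                    continue; // statics are shared, not part of the instance
                }
                if (f.getType().isPrimitive()) {
                    size += primitiveSize(f.getType());
                } else {
                    size += REF;
                    try {
                        f.setAccessible(true);
                        size += sizeOf(f.get(o), visited); // follow the reference
                    } catch (Exception e) {
                        // inaccessible (e.g. JDK internals): count the reference only
                    }
                }
            }
        }
        return align(size);
    }

    private static long primitiveSize(Class<?> t) {
        if (t == long.class || t == double.class) return 8;
        if (t == int.class || t == float.class) return 4;
        if (t == short.class || t == char.class) return 2;
        return 1; // byte, boolean
    }

    private static long align(long size) {
        return ((size + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT;
    }
}
```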
Now that’s all very simple, isn’t it?
Well… sadly, it isn’t. But luckily, we’ve mostly sorted it all out for you! We had just finished the agent-based implementation (which didn’t auto-attach yet) and started testing. Obviously, since this calls into the VM’s internals, it would all magically figure everything out. Well, no. CMS wasn’t properly accounted for: CMS needs a certain minimal amount of memory to store information when an object is garbage collected and its memory allocation is “freed”. That affects the minimal size an object will use on heap. And that was Hotspot only… We then moved on to test on JRockit, which required some finer adjustments, but I won’t go into these here.
CMS, compressed oops and minimum object size were just some of the things we needed to account for in the some math of the other implementations: pointer sizes (32- vs. 64-bit VMs), object alignment, field offset adjustment (on JRockit) and “object header” size. All of these required us to gather information about the VM the sizing was happening on in order to properly measure object sizes, even when using the Instrumentation instance to measure.
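As a toy illustration of that math, the adjustment boils down to something like the following, where every parameter is a VM characteristic that has to be probed from the running VM rather than hard-coded (the values in the comments are merely plausible examples):

```java
// Hypothetical illustration of the "some math" above: pad a raw size to the
// VM's object alignment and enforce a minimum object size (as CMS does).
class ObjectSizeMath {
    static long adjust(long fieldsSize, long headerSize, long alignment, long minObjectSize) {
        long raw = headerSize + fieldsSize;                   // header + fields
        long aligned = ((raw + alignment - 1) / alignment) * alignment; // pad to alignment
        return Math.max(aligned, minObjectSize);              // e.g. CMS imposes a floor
    }
}
```

For instance, 4 bytes of fields behind a 12-byte header align to 16 bytes on an 8-byte-aligned VM, but a collector-imposed 24-byte minimum would bump that same object to 24.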
Know what to measure!
As you could read in Heinz’s newsletter, there are some objects you probably don’t want to account for, especially while measuring the size cached entries are using on heap. There are the obvious ones: statics, classes and other “flyweight type” objects. These can all automatically be discarded by the sizing engine. But sometimes you also don’t want every cached entry to account for a particular part of an object graph, simply because, for instance, every cached entry will reference that particular bit. Hibernate’s 2nd level cache is a good example of that. For that particular case, we’ve added a “resource” file that describes fields and types to be discarded when measuring a cache entry’s size on heap. For application types though (ones not going into the cache through Hibernate, but through applications using the Ehcache API directly), we’ve added the @IgnoreSizeOf annotation. Annotating a field, a type or even an entire package with it will result in the sizing engine skipping that part of the graph (those types, or the types in those packages, respectively) while doing the sizing.
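A sketch of how a sizing engine can honor such an annotation; the annotation and filter below are redefined here for illustration and are not Ehcache’s actual classes:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Illustrative stand-in for an @IgnoreSizeOf-style annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.TYPE})
@interface IgnoreSizeOf {}

// Demo cached type: every entry references the same shared metadata,
// so counting it per entry would wildly overstate the cache's footprint.
class CachedEntry {
    String key;
    @IgnoreSizeOf
    Object sharedMetadata; // skipped while sizing
}

class SizeOfFilter {
    // true when the field should be walked and counted by the sizing engine
    static boolean shouldSize(Class<?> type, String fieldName) {
        try {
            Field f = type.getDeclaredField(fieldName);
            return !f.isAnnotationPresent(IgnoreSizeOf.class)
                && !f.getType().isAnnotationPresent(IgnoreSizeOf.class);
        } catch (NoSuchFieldException e) {
            return false;
        }
    }
}
```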
Ehcache 2.5 is out now and available for direct download or through Maven Central. It enables you to size your caches simply using values in bytes with Ehcache ARC; you can read more about cache sizing on the ehcache.org website.
I’ll be presenting “Asynchronous Job Execution in the Cloud” (Session ID: 24301) later this week at JavaOne in San Francisco. The session will be held in the Hotel Nikko (Carmel I/II) on Wednesday Oct. 5, at 1pm.
I will cover how deployment topologies require us to rethink how we go about job execution. As we moved from one machine to multiple, and now even to ever-changing environments, we need better tools to express the requirements of our jobs, so that we can leverage IT infrastructures optimally and keep the throughput of our applications as high as possible.
I plan on covering this generally and then illustrating how Quartz achieves those goals when clustered using Terracotta. I’m looking forward to the session and the discussions it might trigger. Also, feel free to drop by one of our booths, either in the Hilton Hotel (Continental Ballroom / JavaOne exhibitor hall / booth #5201) or in the Moscone Center’s South Hall (booth #640), where I will also be hanging around!
Here is a small introduction to the new Quartz Where feature that was just released as part of the Terracotta 3.5 Enterprise Edition.
I’ll be posting another screencast with a demo application benefiting from the feature, including the couple of steps required to migrate the application to use Where…
A common pattern for offloading the database is to use a cache in front of it. Yet, generally, it’s still application code that goes to the underlying system of record for writes. That code then also becomes responsible for invalidating or updating cached entries.
Using cache writers not only automates that aspect, but also enables you to scale your writes. This 5-minute screencast explains how to use the Ehcache Writer API to achieve this and what it means for your application code: especially in distributed environments, when clustering your caches with Terracotta, the contract becomes looser than the usual “happens once and only once”. Indeed, updates to the database will happen at least once, and in case of failure entries could get updated more than once…
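The write-behind idea can be sketched in a few lines of plain Java; this illustrates the pattern, not the Ehcache Writer API itself, and the class names are made up:

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical write-behind sketch: puts hit the cache immediately,
// while the write to the backing store happens asynchronously.
class WriteBehindCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Map<K, V> store; // stands in for the database
    private final BlockingQueue<K> pending = new LinkedBlockingQueue<>();
    private final ExecutorService flusher = Executors.newSingleThreadExecutor();

    WriteBehindCache(Map<K, V> store) {
        this.store = store;
    }

    V get(K key) {
        return cache.get(key);
    }

    void put(K key, V value) {
        cache.put(key, value); // the caller returns immediately
        pending.add(key);      // the store is updated later: at least once,
        flusher.submit(() -> { // possibly more than once after a failure/retry
            K k = pending.poll();
            if (k != null) {
                store.put(k, cache.get(k));
            }
        });
    }

    void close() {
        flusher.shutdown(); // drain outstanding writes before shutting down
        try {
            flusher.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Note how the looser contract shows up even in this toy: should the flusher die and be replaced mid-write, a key could be flushed twice, which is exactly the “at least once” semantics described above.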
You can also download the Ehcache Raffle application, which demonstrates Cache Writers and Cache Loaders, from github.com.
One week ago we released the first beta of our Fremantle release, which includes Ehcache 2.4, Terracotta Enterprise Suite 3.5 and, last but not least, Quartz 2.0. You have been able to cluster your Quartz Scheduler instances using Terracotta for a while already. Yet, as with JDBC-backed storage, you had no control over what node your job would be executed on. The only guarantee was that your job would be executed once within the cluster. Quartz Where aims at addressing exactly that, and is one of the many new features that are part of this new major release of our product line.
A popular demand from clustered Quartz Scheduler users was to be able to specify where a job would be executed: because data for the job is known to be present on some machine (like when using NFS-like file sharing), or because the job requires much processing power and memory. Controlling the locality of execution is now feasible. We have tried to make this a seamless addition to Quartz: you can configure jobs to be dispatched to node groups using a simple configuration file, or programmatically schedule LocalityTrigger instances. Let’s first cover the configuration-based approach, which doesn’t require any code changes to an existing Quartz 2.0 application.
Configuration-based locality of execution
Before getting started with this new feature, you will have to configure your Quartz scheduler to use the Terracotta Enterprise Store by setting the org.quartz.jobStore.class property to org.terracotta.quartz.EnterpriseTerracottaJobStore. If you were not already using Terracotta to cluster Quartz, you will also have to set the org.quartz.jobStore.tcConfigUrl property to point to the Terracotta server. Here is a small example of a quartz.properties:
org.quartz.scheduler.instanceName = QuartzWhereScheduler
org.quartz.scheduler.instanceId = AUTO
org.quartz.scheduler.instanceIdGenerator.class = org.terracotta.quartz.demo.locality.SystemPropertyIdGenerator
org.quartz.threadPool.class = org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount = 10
org.quartz.threadPool.threadPriority = 5
org.quartz.threadPool.threadsInheritContextClassLoaderOfInitializingThread = true
org.quartz.jobStore.class = org.terracotta.quartz.EnterpriseTerracottaJobStore
org.quartz.jobStore.tcConfigUrl = localhost:9510
Through a quartzLocality.properties configuration file, you can define node groups. A node group is composed of one or more Quartz Scheduler instance nodes (generally one per machine within your cluster). You define them as such:
org.quartz.locality.nodeGroup.slowNodes = tortoise, snail
org.quartz.locality.nodeGroup.fastNodes = hare, leopard
org.quartz.locality.nodeGroup.linuxNodes = tortoise
We have now defined three groups: slowNodes, fastNodes and linuxNodes. We can now use these node groups to have jobs or triggers executed by them, depending on their group. Quartz jobs and triggers were, and still are, uniquely identified by a name and group pair. Through the same configuration file, we can now have all jobs (or triggers) of a certain group only get executed on a node of a given node group:
org.quartz.locality.nodeGroup.fastNodes.triggerGroups = bigJobGroup
org.quartz.locality.nodeGroup.linuxNodes.triggerGroups = reporting
Now all triggers from the group bigJobGroup will be executed by a scheduler from the group fastNodes: either the hare or the leopard scheduler. These scheduler nodes receive unique ids as before, by providing an org.quartz.spi.InstanceIdGenerator implementation to the scheduler at configuration time (don’t mix this up with the instanceName, which needs to be the same for all nodes for them to form a single clustered scheduler). Triggers from the group reporting will always be executed on tortoise, as this is the only scheduler in the linuxNodes group.
Programmatic locality of execution
Using the new locality API for Quartz that is part of our Terracotta Enterprise Suite 3.5, you can achieve even finer-grained control and express more complex constraints about where a job should be executed. The example below uses the new DSL-like builder API introduced with Quartz 2.0. Let’s see how that looks:
LocalityJobDetail jobDetail =
    localJob(
        newJob(ImportantJob.class)
            .withIdentity("importantJob")
            .build())
        .where(
            node()
                .is(partOfNodeGroup("fastNodes")))
        .build();
On line 3, we create a new JobDetail for the ImportantJob. We then wrap it, on line 2, as a localJob that needs to be executed on a node that is part of the group fastNodes. You might have noticed that creating the JobDetail is pretty straightforward with the new API. Adding the locality information isn’t much more work either. You can be much more precise about where the job should be executed though. Let’s have a look at this example:
scheduler.scheduleJob(
    localTrigger(
        newTrigger()
            .forJob("importantJob"))
        .where(node()
            .has(atLeastAvailable(512, MemoryConstraint.Unit.MB))
            .is(OsConstraint.LINUX))
        .build());
Here we schedule an immediate trigger for the importantJob we registered in the previous example. Line 2 creates the locality-aware trigger, defining it to require a node that is running Linux and has at least 512 MB of heap available. Using these constraints, here memory and OS, you can be much more explicit about the characteristics of the node executing the job.
The new Terracotta clustered JobStore we’ve introduced will evaluate the constraints expressed on a Trigger and/or a Job to decide where to dispatch the Job for execution. We plan on providing implementations for expressing constraints on the CPU, memory and operating system characteristics of the node. I am still heavily working on these, but what is being shipped as part of this first beta should give you a good feel for where we are headed. To test it yourself today, go fetch the Fremantle beta 1 from the Terracotta website now!
Despite not having posted here for almost two years now, I’m not dead yet…
It’s been a year now since I joined Terracotta to work on their Hibernate second level cache implementation. A lot has happened since: Terracotta acquired Ehcache and Quartz, which gave the transparency team I joined some work…
My latest work there has been on Ehcache and its JTA support. Ehcache 2.1.0 beta has just been released with a whole bunch of new features. I’m planning on putting together a post on using JTA (and the coherent methods added in this release) with Ehcache in the very near future…
I’ve presented at this year’s Jazoon conference on Hölchoko, a framework for distributing the persistence of your JPA domain models to multiple clients. While the project itself is not where I would have expected it to be by now, I’ve finished enough of the demos to show what the idea is all about.
I’ve uploaded the presentation here. I know the demo still has the encumbrance of IntelliJ IDEA forms for parts of the UI. I hope to get enough time to remove these very soon.
I plan on closing the remaining tickets and releasing version 0.9.1 sometime next week. I know the type system abstraction still requires quite some work. Finishing the remote filter for sending entities back to the server is just a matter of back-porting from the in-production systems and developing the dynamic proxy I show in the presentation. Code cleanup is still an issue, and so is JavaDoc! HMVC is pretty much there already…
I’ve just quickly hacked together a fullscreen monitor for our continuous integration server at work. As for my personal projects, we use Hudson, which has a nice RESTful XML API. This small utility asks for the URL of your Hudson server and retrieves all the jobs and their status. You can also provide the URL of a view, should you want to only partially monitor your jobs (such as only the continuous integration builds, ignoring nightly builds, or the other way around).
When in fullscreen mode, the tool will poll the server every 15 seconds for job updates and render a fully red, yellow or blue (and yes, I kept the blue!) screen displaying the worst status of your currently monitored jobs. Should the screen not be blue (all monitored builds stable), it will also display the list of unstable or broken builds.
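The status-aggregation step of such a monitor boils down to ranking Hudson’s per-job colors and keeping the worst one. A sketch (the color values are Hudson’s, the class and method names are hypothetical):

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical sketch: reduce the job colors reported by Hudson's /api/xml
// (blue = stable, yellow = unstable, red = broken) to the single worst status.
class HudsonStatus {
    private static final List<String> SEVERITY = Arrays.asList("blue", "yellow", "red");

    static String worst(Collection<String> colors) {
        String worst = "blue";
        for (String color : colors) {
            // jobs that are currently building blink, e.g. "red_anime"
            String base = color.replace("_anime", "");
            if (SEVERITY.indexOf(base) > SEVERITY.indexOf(worst)) {
                worst = base;
            }
        }
        return worst;
    }
}
```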
Everything is pretty much hacked together, with way too many inner classes (lazy me!). This is what I call a “one movie hack”, as I quickly wrote it with the computer on my lap next to my wife while we were watching a movie… or at least she was! Besides, it uses the IntelliJ IDEA layout manager for the main screen. I’ll try to clean this all up. In the meantime, the code is available through a:
svn co http://www.codespot.net/svn/repos/HudsonMonitor/trunk HudsonMonitor
and the executable jar is available here.