It’s been 4 years since we released Ehcache 2.0. While it was the second major release in almost 6 years, it didn’t break backward compatibility, but was “simply” the first release with a tight integration with the Terracotta stack, just 6 months after the acquisition. With JSR-107 close to being final, Java 8 lurking just around the corner and Ehcache having this close to a 10 year old API, it does seem like the perfect time to revamp the API, break free from the past and start working on Ehcache 3.0!
Ehcache is one of the most feature rich caching APIs out there. Yet it’s been growing “organically”, all founded on the very early API as designed some 10 years ago. We’ve learned a lot in the meantime and there have been compromises made along the way. But “fixing” these isn’t an easy task, often even impossible without breaking backward compatibility. In the meantime, while some caching solutions took very different approaches on it all, the expert group on JSR-107, Terracotta included, put great efforts in trying to come up with a standard API. We feel the new major version of the API should be based on a standardized API. JSR-107 is an option and a likely choice that will serve as a basis for the new version but Ehcache 3.0 will likely “extend” the specification in many aspects.
While JSR-107 lays a foundation to this new API, we also need to address some Ehcache specifics early on: One of these is the resource constraints user may wish to put on a cache, and the other is the exception type system. Let me try and expand a bit on these here.
Ehcache users always had the possibility to constrain caches based on “element counts”, whether on-heap or on-disk. Over time we’ve introduced more tiers to cache data in (off-heap, Terracotta clustered) and more ways to constraint the caches in size (Automatic Resource Control, which let you constraint memory used by caches and cache managers in bytes). When capacity is reached, the cache will evict cached entries based on some eviction algorithm and fire events about these.
The javax.caching API doesn’t address this at all. Yet as this being a core concept in Ehcache and one that we want to keep, it seems only sensible we extend the API to support that.
We at Terracotta actually were never really fond of the exception types and their usages in Ehcache. Mainly because of the way Ehcache evolved to support more and more advanced topologies, without wanting to break backward compatibility. When it moved to support distributed caches, an entirely new set of problems emerged and failure became an inherent part of the system. Yet nothing on the API really allowed for that. JSR-107 seems to come with the same limitation.
While for simple, memory-only use-cases the API and its exceptions seem good enough, distributed caches do indeed need more. This is something we want address early on in the design of the new Ehcache 3.0 API.
Ehcache 1.0 came out just a couple of days before Java 5 was released. While it was obvious that not everyone would jump on the boat right away, the enhancements that Java 5 brought were non-neglectable nonetheless, most noticeably generics. Yet, Ehcache never managed to integrate these in its own API. The net result being that 10 years later, Ehcache still lacks generics and leaves type safety entirely to the user to deal with. Lesson learned… This time around, we want to make sure the API is ready for the immediate future that is Java 8.
Like 10 years ago with Java 5, not everyone will migrate to Java 8 when it comes out. Actually probably only a very few reckless ones will do. But eventually, more and more people will and many of these will embrace its feature set. Lambdas are certainly a language feature that fits a caching API in many of its usages. As such, we want to make sure the Ehcache 3.0 API is ready for that. Foolishly enough, I believe it’s a task we can succeed in… hopefully without too much trouble!
Going about getting there
Having covered these two main “external” drivers (JSR-107 & Java 8) that are going to impact this Ehcache 3.0 API efforts, it would also be worthwhile to discuss the “internal” ones… Following are the very deep & ugly secrets of why we believe Ehcache 3.0 is a necessity!
As mentioned earlier, Ehcache grew its feature set organically. Most importantly it grew from a heap only caching solution to the feature richest API in the caching landscape, certainly in the open-source one. But some of it hadn’t been accounted for, or simply turned out to be harder to fit in. There are mainly two things we believe need fundamental fixing in order to make the life of users much simpler…
Interestingly enough this is something the JSR-107 expert group also left out. Not really surprising, as it probably would be really hard to have all vendors implement such a standardized lifecycle, let alone agree on one!
Ehcache has a much simpler task to address in that regard. The set of deployments and topologies Ehcache support is pretty much nailed down now. So “all” there is to do is clearly define a lifecycle that encompasses all these and lets a users easily move from one deployment to another while minimizing the set of new problems they have to think about. Whether we’re talking about moving a memory only cache to a “restartable one” that outlives the lifecycle of the virtual machine, yet keeps it all type safe; or move to a clustered deployment, over WAN and all the implication that brings along… It’ll probably never be entirely transparent, but we probably can make it as smooth as possible…
With the plethora of features, came the ever growing set of configuration options to tune it all for the millions of deployment scenarios out there… The current ehcache.xsd is as scary as it gets! And that’s not even addressing any of many conflicting configurations one can end up with.
Addressing this would require a much better isolation between these features and their configuration. As part of a prototype I did with Chris Dennis a while back now, we came up with a much more modular approach for it. An approach where “modules” don’t know about each other and can function without each other entirely. While also tying into the previous point on lifecycle, nailing this aspect right seems an absolute requirement for the user experience to be as smooth as possible.
Last but not least, the API. I’m talking about the main API here, the one that expands on the JSR-107 one. The one that users will use the most. Covering configuration, lifecycle and “everyday” cache operations. The code that makes all of the above concrete to engineers and developers.
And yet there is much more than that… There is this list of features Ehcache grew to have over the last 10 years! While they are not necessarily all to be ported to this new version, most of them are. Deciding how and when to port them is going to be the next step when done with the above. But there is also a caveat to this… We need to make sure that whatever is done during the foundation work phase, nothing rules out or impedes with these features that we know are going to be part of this new API. I will not even try to come up with an exhaustive list, but here are some of the main features to keep in mind:
- BootstrapCacheLoader (sync/async)
- Transaction (JTA-y, XA, Local, …)
- Write Through (explicit)
- Write Behind (explicit)
- Read Through (explicit, with multiples Loaders)
- Refresh Ahead (Inline & Scheduled)
- … and more!
So what now?
So why this blog now? Well, I know there are many users out there. What’s great about users of a library such as Ehcache is that all these users are developers. Many of them actually seem to love to complain about the tools and libraries they use! Now is the time, more than ever: come and complain! But come and complain in a positive way is all we ask. And if you feel like doing more than complaining, you might even participate! We’ve decided to start spec’ing this new API all in the open. The discussions will start happening on our Google Group ehcache-dev and the prototypes to API and code to be all on github.com/ehcache. So we need you. We need you to make your issues known. We need you to make your solutions known and we need up to start debating with everyone on how to make Ehcache 3.0 the caching API you always wanted. We already have the knowledge and expertise to implement it all, based on all we already have in the 2.0 line and all the cool stuff we’ve been hacking on lately… But now is the time to let us know what is to be done!
We’ll be starting this exercise as of today: both discussing the above, starting with foundation work on top of the javax.caching API, as well as pushing the experimental code we have been hacking on to github. API work will be happening both on github, through code reviews, as well as on the group for more “general” discussions. All I can do now, is hope to see you there!