Oct. 19, 2006, 4:43 p.m.

Java Singletons Without Locks

As a wonderful follow-up to my micro-optimizations post, I figured I'd write a little about a common micro-optimization and how much additional code people write to avoid a cost that isn't measurably significant.

I Want my Double Checked Lock

We all know double-checked locking is bad. We also know why. It's a sort of natural progression for Java developers who want to create singletons (often too many of them), but don't want every access to take a lock for the sake of the one time in the application's lifetime when the value isn't initialized.

The alternatives people offer aren't too bad. You can either construct the singleton eagerly in a static field of the class, or you can use the Initialization on Demand Holder Idiom to instantiate the singleton lazily on first use. In either case, the value is loaded permanently and expected never to change.
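For reference, the holder idiom looks something like this (the class name here is just illustrative):

```java
public class Whatever {
    private Whatever() {}

    // The nested class isn't loaded until getInstance() first
    // references it, and the JVM's class-initialization guarantees
    // make that one-time construction thread-safe with no explicit
    // locking in our code at all.
    private static class Holder {
        static final Whatever INSTANCE = new Whatever();
    }

    public static Whatever getInstance() {
        return Holder.INSTANCE;
    }
}
```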

These options make plenty of sense at first, but they don't allow much in the way of testing. A good test starts in a known state. If it needs to use a singleton, it's important that the singleton not carry leftover state from a previous test. If it operates near a singleton but doesn't actually care what that singleton does, then the ability to mock it is even more important.

But, These Singletons Don't Mutate

So how do we get singletons that mutate madly during testing, but are only initialized once during a deployed application lifetime?

The short answer is to synchronize; it's easy, and you probably won't notice any efficiency problems:

private static Whatever instance=null;

public static synchronized Whatever getInstance() {
  if(instance == null) {
    instance=new Whatever();
  }
  return instance;
}

public static synchronized void setInstance(Whatever to) {
  instance=to;
}

Synchronized? That's Cheating

It is possible, however, to use an AtomicReference to seemingly get the best of both worlds, though it comes at a cost in code complexity.

private static AtomicReference<Whatever> instanceRef=
  new AtomicReference<Whatever>(null);

public static Whatever getInstance() {
  Whatever rv=instanceRef.get();
  if(rv == null) { 
    synchronized(Whatever.class) {
      rv=instanceRef.get();
      if(rv == null) {
        rv=new Whatever();
        boolean changed=instanceRef.compareAndSet(null, rv);
        // Just a reminder, assert means stating
        // something that isn't supposed to be possible.
        assert changed : "Race condition updating singleton";
      }     
    }   
  } 
  return rv;
}

public static void setInstance(Whatever to) {
  instanceRef.set(to);
}
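Incidentally, under the Java 5 memory model the same double-check can be written with a plain volatile field instead of an AtomicReference. A sketch of that variant, equivalent in spirit to the version above:

```java
public class Whatever {
    // volatile supplies the visibility guarantee whose absence made
    // the pre-Java-5 form of this idiom broken.
    private static volatile Whatever instance = null;

    public static Whatever getInstance() {
        Whatever rv = instance;
        if (rv == null) {
            synchronized (Whatever.class) {
                // Re-check under the lock: another thread may have
                // initialized it while we were waiting.
                rv = instance;
                if (rv == null) {
                    rv = new Whatever();
                    instance = rv;
                }
            }
        }
        return rv;
    }

    public static void setInstance(Whatever to) {
        instance = to;
    }
}
```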

Writing concurrent code correctly is difficult, and most people get it wrong (I expect to have a comment pointing out some technical problem with the above at some point). It's generally much easier to prove things wrong than right. I have a testing tool (available in spy.jar) that allows me to simultaneously execute a piece of code and return its value for inspection; by simultaneously, I mean as close as the JVM on which I'm testing will allow. This technique does not prove code correct, but I've used it to prove code incorrect before, and to have it report to me when the immediate error is gone. Keeping these tests around means that I have something to look at if they occasionally fail, or begin to fail under a particular JVM.
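The general shape of that kind of test, sketched here with plain java.util.concurrent rather than the spy.jar API, is to line threads up at a barrier so they all call getInstance() at the same moment, then verify that every thread saw the same instance:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class RaceCheck {
    // Minimal lazily-initialized singleton to exercise.
    static class Whatever {
        private static Whatever instance = null;
        public static synchronized Whatever getInstance() {
            if (instance == null) {
                instance = new Whatever();
            }
            return instance;
        }
    }

    public static void main(String[] args) throws Exception {
        final int threads = 8;
        // Every worker parks at the barrier, then they all call
        // getInstance() as close to simultaneously as the JVM allows.
        final CyclicBarrier barrier = new CyclicBarrier(threads);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<Whatever>> results = new ArrayList<Future<Whatever>>();
        for (int i = 0; i < threads; i++) {
            results.add(pool.submit(new Callable<Whatever>() {
                public Whatever call() throws Exception {
                    barrier.await();
                    return Whatever.getInstance();
                }
            }));
        }
        // All threads must have seen the same instance; a second
        // distinct value would prove the initialization raced.
        Whatever first = results.get(0).get();
        for (Future<Whatever> f : results) {
            if (f.get() != first) {
                throw new AssertionError("saw two singleton instances");
            }
        }
        pool.shutdown();
        System.out.println("ok");
    }
}
```

As noted above, a clean run doesn't prove the code correct; a failing run proves it incorrect.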

How Much Faster Is the AtomicReference Implementation?

Good question. Most would look at that and go, "Hey look, it doesn't say synchronized, it must be fast!" Some of us would measure (eventually). Below you will find the results from my test scenarios. You can check out the code I used to run the test if you want to see how it behaves on your VM.

dustintmb:/tmp 593% java Whatever 10000000
10000000 total requests, 4 threads for MT test
+....+....+....+....+....
Results:
 Atomics.mt
        [4783, 4706, 4668, 4793, 4734]
 Atomics.st
        [557, 2732, 2746, 2737, 2739]
 Synchronized.mt
        [1692, 4739, 4787, 4716, 4557]
 Synchronized.st
        [507, 2657, 2673, 2670, 2664]
101.864u 5.417s 1:06.98 160.1%  0+0k 0+14io 0pf+0w

It appears to be about the same or slower much of the time, which pretty much makes it not worth it to me. I'd be interested in seeing whether atomics eventually improve enough to make a difference, though.

One thing you can clearly see from these examples is the way the VM adapts to the need for synchronization when multiple threads actually want access at the same time.

For fun, I did another run that included a singleton accessor with no synchronization whatsoever and found it to be roughly 5x faster than the synchronized one before the first MT invocation, and that it remained roughly constant throughout the test sequence. That's not included in the results or the test program because it's a terrible thing to do.

In Conclusion

Just synchronize it. It's OK, really.
