Tuesday, July 7, 2009

Pimp My Lock: A case study in Scala API design

Josh Cough had an amusing article on pimping vs fixing libraries. He ran into an intolerable method in the Java standard libraries and decided to fix it with the Pimp My Library pattern. Today I want to pimp a different Java library: Lock. Scala has higher-level ways of dealing with concurrency than Locks, but every now and then you need to get down with the nitty-gritty and do some manual locking. By Java standards, Lock is a pretty good library, but I wanted to see if Scala could do better.

Basic Locks

Here's what the Lock interface looks like in Java:

  interface Lock {
void lock();
void lockInterruptibly();
Condition newCondition();
boolean tryLock();
boolean tryLock(long time, TimeUnit unit);
void unlock();
}

As a first approximation, let's distill the basic features (lock, unlock) into a Scala trait:

  trait Lock {
def lock(): Unit
def unlock(): Unit
}

Lock was introduced as a better alternative to Java's synchronized keyword, but in the simple case it's actually worse. The recommended idiom for using Lock (straight from the JavaDocs) looks like:

  Lock l = ...
l.lock();
try {
// access the resource protected by this lock
} finally {
l.unlock();
}

Compare this to using synchronized directly in Java:

  Object lock = ...
synchronized(lock) {
// access the resource protected by this lock
}

What can go wrong when using lock/unlock? You might forget to release a lock, causing deadlock. Or you might forget to release a lock inside a finally clause, causing deadlock in when your code throws exceptions. Neither of these can happen when using synchronized, since the lock is automatically released at the end of the synchronized block, and Java makes sure it's released even if an exception is thrown. With a little bit of Scala, we can fix these potential errors:

  trait Lock {
def lock(): Unit
def unlock(): Unit

def apply[T](block: => T): T = {
this.lock()
try {
block
} finally {
this.unlock()
}
}
}

In Scala, traits (unlike Java interfaces) can define methods. This means ANY Lock can take advantage of the method we just added, without having to reimplement it. Our apply method is parametrized on type T, which is just whatever type our block returns. The little arrow => next to our block parameter tells Scala to delay evaluation of the block until it is actually used. (This is called a "by-name parameter".) This is critical, because it makes sure the block is only evaluated once we've acquired the lock. We've called our method apply because that's the method Scala looks for when you try to use an object as a function. (In Scala, functions are objects and objects are functions.) Now the recommended usage idiom for Scala looks like this:

  val lock: Lock = ...
lock {
// access the resource protected by this lock
}

This looks a lot like Java's baked-in support for synchronization, except we didn't need any language features built specifically for locks. Everything is "Just A Library". We might want to go a little further and add map and foreach methods. Having these methods lets us use locks inside of for-comprehensions, which is useful when we want to compose locks with other classes than can also be used inside for-comprehensions. It turns out that map is slightly more general than apply, so we'll reimplement apply in terms of map:

  trait Lock {
def lock(): Unit
def unlock(): Unit

def map[T](f: this.type => T): T = {
this.lock()
try {
f(this)
finally {
this.unlock()
}
}
def foreach(f: this.type => Unit): Unit = map(f)
def apply[T](block: => T): T = map(_ => block)
}

Now we can use Locks inside for-comprehensions:

  val lock: Lock = ...
for(l <- lock) {
// access the resource protected by this lock
}

Advanced Locks

If Java's Lock interface (added in Java 1.5) is worse than the built-in synchronized (since Java 1.0?), then why was it added? The answer is that we've forgotten about all the other methods on Lock, namely:

    void lockInterruptibly();
boolean tryLock();
boolean tryLock(long time, TimeUnit unit);

It turns out the synchronized keyword only has one way to acquire a lock: block the thread until you have the lock. This is a pretty forceful way to go about things, since the thread will block forever if the lock can't be acquired. The Lock interface adds three more ways to acquire a lock: non-blocking (tryLock), with a timeout (tryLock(long,TimeUnit)), or interruptible (lockInterruptibly). With all of these methods, lock acquisition can fail. Two of them return false if the lock can't be acquired, and the third throws an InterruptedException.

How can we support these different modes in our Scala Lock trait? We could certainly add the same three methods, just like Java. But then map/foreach/apply are all fixed to use only one kind of acquisition strategy. One approach is to add lots more methods: mapInterruptibly, foreachInterruptibly, applyInterruptibly, mapTry, foreachTry, applyTry, etc. This quickly becomes a headache, and it defeats much of the point of adding these methods in the first place, which was to integrate with Scala's language features for map/foreach/apply.

Another approach is to allow "toggling" between the different modes. Let's start with lockInterruptibly, which has the same signature as lock. (In Java, the signatures are actually different, because lockInterruptibly throws a checked InterruptedException. In Scala, all exceptions are unchecked so the signatures are actually the same.) We can add a couple of methods to toggle between "interruptible" and "uninterruptible" modes:

  trait Lock {
def lock(): Unit
def unlock(): Unit

def map[T](f: this.type => T): T = ...
def foreach(f: this.type => Unit): Unit = ...
def apply[T](block: => T): T = ...

def interruptible: Lock
def uninterruptible: Lock
}

Calling interruptible returns a view of the same underlying Lock, except that lock behaves like lockInterruptibly. Calling uninterruptible returns a Lock with the original default semantics. The benefit of reusing the Lock interface to mean different things (either interruptible or uninterruptible semantics) is that we can reuse the map/foreach/apply methods that we've already implemented for Lock. (Of course, the user still has to deal with catching any exceptions that might be thrown by an interruptible Lock... The Scala compiler won't complain when you've forgotten to catch an Exception, like the Java compiler does.)

But what do we do about the two tryLock methods? They return Boolean, not Unit, so we can't use them as drop-in replacements for lock. One approach is to add an attempt method to Lock that uses tryLock instead of lock:

    def attempt[T](block: => T): ??? = ???

The problem here is... what do we return? We'd like to return T, but we can't guarantee that. If tryLock fails, we don't have a T to return since we won't have run the block. We could try returning Unit:

    def attempt(block: => Unit): Unit = ...

If the lock is acquired successfully, the block is run, otherwise it isn't run. But then we have no idea whether the block ran or not, and no way to run a different block in the case of failure to grab the lock. We could try propagating the Boolean:

    def attempt(block: => Unit): Boolean = ...

Then we can check the result of attempt to see if the block succeeded. If it didn't, we can execute another block. However, this means wrapping our call to attempt inside an if...else statement, which seems ugly. We also have no way to return a result from the block. We could try passing two different blocks:

    def attempt[T](block: => T)(orElse: => T): T = ...

Now we can pass two blocks to attempt, one that gets called if the lock is acquired successfully, and one that gets called if the lock can't be acquired. Now we can always return a T. This is still kind of ugly though. Passing two different blocks side-by-side doesn't give us any indication of which is which. Also, every time we wanted to do nothing if the lock can't be acquired, we'd have to explicitly pass an empty block. This doesn't seem very clean.

The best approach is to return Option. Option is a class in the Scala standard library. It's commonly used in situation where some action might succeed and return a value, or fail and return nothing. It has two subclasses, Some (which wraps the value when it exists) or None (which indicates the absence of a value). Conveniently, it also has a getOrElse method which returns the wrapped value (if it exists) or executes an alternative block of code (if the value doesn't exist). We can define attempt in terms of Option:

    def attempt[T](block: => T): Option[T] = ...

And use it like so:

  val lock: Lock = ...
lock.attempt {
// access the resource protected by this lock
} getOrElse {
// execute some block if the lock can't be acquired
}

This neatly solves most of the concerns we had. If we don't care about the return value we can ignore it, if we do care we can access it. If we don't care about the "or else" condition, we don't have to include it, if we do care we can. However, we still face the explosion of methods for all the different locking modes. At the very least we'd need an attemptFor method which also takes a duration of time to wait to acquire the lock (using the other version of tryLock). Also, we wouldn't be able to use these locking modes inside for-comprehensions, because map/foreach are hard-wired to use lock (interruptible or uninterruptible versions).

Better Abstractions

The frustrating thing is that lock and tryLock are so similar, yet so different. They both try to acquire a lock. One can fail (return false), one can't (always returns (or throws an exception...)). For one, higher-order functions return T. For another, they return Option[T]. They're similar enough that we want the same kinds of abstractions for both, but different enough that we can't use the same abstraction. So why not use a different abstraction?

  trait TryingLock {
def lock(): Boolean
def unlock(): Unit

def map[T](f: this.type => T): Option[T] = ...
def foreach(f: this.type => Unit): Unit = ...
def apply[T](block: => T): Option[T] = ...
}

The differences between Lock and TryingLock reflect the different behaviors that lock modes might have. But they share fundamentally very similar abstractions. Now our attempt and attemptFor methods can both return TryingLocks:

    def attempt: TryingLock
def attemptFor(duration: Duration): TryingLock

As with our interruptible and uninterruptible toggles, it's critically important that these methods return just views of the same underlying lock. They should merely "toggle" the mode we're using to access the underlying lock. Conveniently, our code example using attempt above works as-is with this newer version of attempt, thanks to TryingLock's apply method.

The End Result

  val lock: Lock = ...

lock {
// uninterruptible access to shared resources
}

lock.interruptible {
// interruptible access to shared resources
}

lock.attempt {
// non-blocking attempt at access to shared resources
}

lock.attemptFor(10.minutes) {
// blocking (with timeout) attempt at access to shared resources
} getOrElse {
// code to execute if attempt fails
}

You can get the source on Github. Enjoy!

Update:

As Miles Sabin points out, the abstractions I devoloped for Lock are all nested, yet one of the major motivations for Lock was non-nested locking disciplines like hand-over-hand locking and lock downgrading. I didn't want to neglect these use cases when I wrote my Lock trait, which is why the lock and unlock primitives are still available instead of buried as private methods. However, the majority of uses of Lock are probably nested, so in the spirit of making easy things easy and hard things possible, I thought it would be useful to provide nested access to Locks, as well as access to the non-nested lock and unlock primitives.
blog comments powered by Disqus