tag:blogger.com,1999:blog-34852707333475289872024-03-19T05:29:24.881-07:00Uncountably ManyJorge Ortizhttp://www.blogger.com/profile/14454965475839432618noreply@blogger.comBlogger3125tag:blogger.com,1999:blog-3485270733347528987.post-77353176138368093822009-07-24T12:19:00.000-07:002009-11-14T11:11:01.056-08:00Does pattern matching break encapsulation?<div>Scala attempts to unify functional and object-oriented programming. One of the concepts that Scala borrows from functional programming is <span style="font-style: italic;">pattern matching</span>.</div><br /><div>Pattern matching can be used anywhere you might use <code>switch/case</code>:</div><br /><div><pre><blockquote><code>def fibonnaci(n: Int): Int = n match {<br /> case 0 => 0<br /> case 1 => 1<br /> case n => f(n-1) + f(n-2)<br />}</code><blockquote></blockquote></blockquote></pre></div><br /><div>You can also use pattern matching to "break open" case classes and access the parameters that were used to construct it:</div><br /><div><pre><blockquote><code>trait Point<br />case class CartesianPoint(x: Double, y: Double) extends Point<br /><br />def isSpecial(point: Point): Boolean = point match {<br /> case CartesianPoint(x, y) if x*x + y*y > 25.0 => true<br /> case _ => false<br />}</code></blockquote></pre></div><br /><div>One criticism of pattern matching is that it violates <span style="font-style: italic;">encapsulation</span>, the object-oriented principle that says the interface of a class should be independent of its implementation details. If we change the implementation details of <code>CartesianPoint</code>, we'll break the pattern matching statements that depended on those details.</div><br /><span style="font-size:180%;">Extractors</span><br /><br /><div>Scala relies on extractors to make pattern matching more object-oriented. Extractors define a "view" on a particular kind of object. This allows Scala code that uses pattern matching to maintain encapsulation: just provide a "view" into your new implementation that matches the old one.</div><br /><div>Suppose we wanted to implement Points in terms of polar coordinates instead of Cartesian coordinates:</div><br /><div><pre><blockquote><code>trait Point<br />case class PolarPoint(radius: Double, theta: Double) extends Point<br /><br />object CartesianPoint {<br /> def apply(x: Double, y: Double): Point = {<br /> val radius = Math.sqrt(x*x + y*y)<br /> val theta = <br /> if (x == 0.0 && y == 0.0) 0.0<br /> else if (x == 0.0 && y > 0.0) Math.Pi/2<br /> else if (x == 0.0 && y < 0.0) 3*Math.Pi/2<br /> else if (x > 0 && y >= 0.0) Math.atan(y/x)<br /> else if (x > 0 && y < 0.0) Math.atan(y/x) + 2*Math.Pi<br /> else Math.atan(y/x) + Math.Pi<br /> PolarPoint(radius, theta)<br /> }<br /> def unapply(point: Point) = point match {<br /> case PolarPoint(r, th) =><br /> Some((r*Math.cos(th), r*Math.sin(th)))<br /> case _ =><br /> None<br /> }<br />}<br /></code></blockquote></pre></div><br /><div>The <code>apply</code> method on <code>CartesianPoint</code> specifies how to construct a <code>PolarPoint</code> from a pair of (x, y) coordinates. Likewise, the <code>unapply</code> method specifies how to construct a pair of (x, y) coordinates from a <code>PolarPoint</code>. It is this second method which is the <span style="font-style:italic;">extractor</span>. It lets us pattern match on <code>PolarPoint</code>s as if they were <code>CartesianPoint</code>s. In particular, the <code>isSpecial</code> method, defined above, can be used unchanged with our new, polar, implementation of points. Extractors let us keep encapsulation even when using pattern matching.</div><br /><span style="font-size:180%;">A catch (or two)</span><br /><br /><div>Is that the end of the story of pattern matching and encapsulation? Unfortunately, no. There are two ways in which pattern matching can break encapsulation. The first is through <code>sealed</code> classes, the second is through singletons.</div><br /><div>A sealed class can only be subclassed within the same file, so the compiler knows statically all the possible subclasses of that class. This lets the Scala compiler check whether a pattern match on a sealed type is <span style="font-style:italic;">exhaustive</span>. If you forget to check one of the possible subclasses, the Scala compiler will warn you that your match is not exhaustive. Consider the <code>List</code> class. <code>List</code> has two subclasses, <code>::</code> (cons) and <code>Nil</code>. Cons represents the non-empty <code>List</code>, with a head and a tail, while <code>Nil</code> represents the empty List, with no head and no tail. If we try to define a method that matches on a <code>List</code> but forgets to match the <code>Nil</code> case:</div><br /><div><pre><blockquote><code>def head[T](xs: List[T]): T = xs match {<br /> case hd :: tl => hd<br />}</code></blockquote></pre></div><br /><div>Then we get a reprimand from the compiler: <code>warning: match is not exhaustive! missing combination Nil</code>.</div><br /><div>If we define our own "cons" extractor, we don't get the same warning:</div><br /><div><pre><blockquote><code>object :/: {<br /> def unapply[T](xs: List[T]) =<br /> if (xs.isEmpty) None<br /> else Some(xs.head, xs.tail)<br />}<br /><br />def customHead[T](xs: List[T]): T = xs match {<br /> case hd :/: tl => hd<br />}</code></blockquote></pre></div><br /><div>Now, this is a relatively minor problem. The warning is nice to have, but it's not essential. The bigger problem is matching on singletons. The empty <code>List</code>, <code>Nil</code>, is an implementation detail we can never change. We can simulate the <code>::</code> case class with an extractor (as above with <code>:/:</code>, but there is simply no way we can simulate <code>Nil</code> with an extractor. The best we can do requires us to match on <code>MyNil()</code>, because matching on <code>MyNil</code> would match on the object, not it's unapply method.</div><br /><div><pre><blockquote><code>object MyNil {<br /> def unapply[T](xs: List[T]): Boolean =<br /> if (xs.isEmpty) true<br /> else false<br />}<br /><br />def empty[T](xs: List[T]): Boolean = xs match {<br /> case MyNil() => true<br /> case _ => false<br />}<br /></code></blockquote></pre></div><br /><span style="font-size:180%;">Conclusion</span><br /><br /><div>I don't want people to get me wrong: I love pattern matching. Practically every day I'm grateful that Scala has it. It makes certain kinds of programming problems much easier to solve. (Click <a href="http://stackoverflow.com/questions/563369/does-scalas-pattern-matching-violate-the-open-closed-principle">here</a> for a StackOverflow question where I discuss the kinds of problems that benefit from pattern matching vs OO-style inheritance and virtual method dispatch.) However, library designers should be conscious of the implementation details they might be exposing if they're letting their users pattern match on sealed classes or singletons. Pattern matching can be very poweful, but as with many things in Scala: with great power comes great responsibility.</div>Jorge Ortizhttp://www.blogger.com/profile/14454965475839432618noreply@blogger.comtag:blogger.com,1999:blog-3485270733347528987.post-2992034185985663702009-07-07T11:27:00.000-07:002009-11-14T11:11:12.723-08:00Pimp My Lock: A case study in Scala API design<div>Josh Cough had an amusing article on <a href="http://jackcoughonsoftware.blogspot.com/2009/06/pimp-vs-just-plain-fix-my-library.html">pimping vs fixing libraries</a>. He ran into an intolerable method in the Java standard libraries and decided to fix it with the <a href="http://www.artima.com/weblogs/viewpost.jsp?thread=179766">Pimp My Library</a> pattern. Today I want to pimp a different Java library: Lock. Scala has higher-level ways of dealing with concurrency than Locks, but every now and then you need to get down with the nitty-gritty and do some manual locking. By Java standards, Lock is a pretty good library, but I wanted to see if Scala could do better.</div><br /><div><span class="Apple-style-span" style="font-size:x-large;">Basic Locks</span></div><br /><div>Here's what the Lock interface looks like in Java:</div><br /><div><pre><code> interface Lock {<br /> void lock();<br /> void lockInterruptibly();<br /> Condition newCondition();<br /> boolean tryLock();<br /> boolean tryLock(long time, TimeUnit unit);<br /> void unlock();<br /> }</code></pre></div><br /><div>As a first approximation, let's distill the basic features (<code>lock</code>, <code>unlock</code>) into a Scala trait:</div><br /><div><pre><code> trait Lock {<br /> def lock(): Unit<br /> def unlock(): Unit<br /> }</code></pre></div><br /><div>Lock was introduced as a better alternative to Java's <code>synchronized</code> keyword, but in the simple case it's actually worse. The recommended idiom for using Lock (straight from the <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/locks/Lock.html">JavaDocs</a>) looks like:</div><br /><div><pre><code> Lock l = ...<br /> l.lock();<br /> try {<br /> // access the resource protected by this lock<br /> } finally {<br /> l.unlock();<br /> }</code></pre></div><br /><div>Compare this to using <code>synchronized</code> directly in Java:</div><br /><div><pre><code> Object lock = ...<br /> synchronized(lock) {<br /> // access the resource protected by this lock<br /> }</code></pre></div><br /><div>What can go wrong when using <code>lock/unlock</code>? You might forget to release a lock, causing deadlock. Or you might forget to release a lock inside a <code>finally</code> clause, causing deadlock in when your code throws exceptions. Neither of these can happen when using <code>synchronized</code>, since the lock is automatically released at the end of the synchronized block, and Java makes sure it's released even if an exception is thrown. With a little bit of Scala, we can fix these potential errors:</div><br /><div><pre><code> trait Lock {<br /> def lock(): Unit<br /> def unlock(): Unit<br /><br /> def apply[T](block: => T): T = {<br /> this.lock()<br /> try {<br /> block<br /> } finally {<br /> this.unlock()<br /> }<br /> }<br /> }</code></pre></div><br /><div>In Scala, traits (unlike Java interfaces) can define methods. This means ANY Lock can take advantage of the method we just added, without having to reimplement it. Our <code>apply</code> method is parametrized on type <code>T</code>, which is just whatever type our block returns. The little arrow <code>=></code> next to our <code>block</code> parameter tells Scala to delay evaluation of the block until it is actually used. (This is called a "by-name parameter".) This is critical, because it makes sure the block is only evaluated once we've acquired the lock. We've called our method <code>apply</code> because that's the method Scala looks for when you try to use an object as a function. (In Scala, functions are objects and objects are functions.) Now the recommended usage idiom for Scala looks like this:</div><br /><div><pre><code> val lock: Lock = ...<br /> lock {<br /> // access the resource protected by this lock<br /> }</code></pre></div><br /><div>This looks a lot like Java's baked-in support for synchronization, except we didn't need any language features built specifically for locks. Everything is "Just A Library". We might want to go a little further and add <code>map</code> and <code>foreach</code> methods. Having these methods lets us use locks inside of for-comprehensions, which is useful when we want to compose locks with other classes than can also be used inside for-comprehensions. It turns out that <code>map</code> is slightly more general than <code>apply</code>, so we'll reimplement <code>apply</code> in terms of <code>map</code>:</div><br /><div><pre><code> trait Lock {<br /> def lock(): Unit<br /> def unlock(): Unit<br /><br /> def map[T](f: this.type => T): T = {<br /> this.lock()<br /> try {<br /> f(this)<br /> finally {<br /> this.unlock()<br /> }<br /> }<br /> def foreach(f: this.type => Unit): Unit = map(f)<br /> def apply[T](block: => T): T = map(_ => block)<br /> }</code></pre></div><br /><div>Now we can use Locks inside for-comprehensions:</div><br /><div><pre><code> val lock: Lock = ...<br /> for(l <- lock) {<br /> // access the resource protected by this lock<br /> }</code></pre></div><br /><div><span class="Apple-style-span" style="font-size:x-large;">Advanced Locks</span></div><br /><div>If Java's Lock interface (added in Java 1.5) is worse than the built-in <code>synchronized</code> (since Java 1.0?), then why was it added? The answer is that we've forgotten about all the other methods on Lock, namely:</div><br /><div><pre><code> void lockInterruptibly();<br /> boolean tryLock();<br /> boolean tryLock(long time, TimeUnit unit);</code></pre></div><br /><div>It turns out the <code>synchronized</code> keyword only has one way to acquire a lock: block the thread until you have the lock. This is a pretty forceful way to go about things, since the thread will block forever if the lock can't be acquired. The Lock interface adds three more ways to acquire a lock: non-blocking (tryLock), with a timeout (tryLock(long,TimeUnit)), or interruptible (lockInterruptibly). With all of these methods, lock acquisition can fail. Two of them return false if the lock can't be acquired, and the third throws an InterruptedException. </div><br /><div>How can we support these different modes in our Scala Lock trait? We could certainly add the same three methods, just like Java. But then <code>map/foreach/apply</code> are all fixed to use only one kind of acquisition strategy. One approach is to add lots more methods: <code>mapInterruptibly, foreachInterruptibly, applyInterruptibly, mapTry, foreachTry, applyTry,</code> etc. This quickly becomes a headache, and it defeats much of the point of adding these methods in the first place, which was to integrate with Scala's language features for <code>map/foreach/apply</code>.</div><br /><div>Another approach is to allow "toggling" between the different modes. Let's start with <code>lockInterruptibly</code>, which has the same signature as <code>lock</code>. (In Java, the signatures are actually different, because <code>lockInterruptibly</code> throws a checked InterruptedException. In Scala, all exceptions are unchecked so the signatures are actually the same.) We can add a couple of methods to toggle between "interruptible" and "uninterruptible" modes:</div><br /><div><pre><code> trait Lock {<br /> def lock(): Unit<br /> def unlock(): Unit<br /><br /> def map[T](f: this.type => T): T = ...<br /> def foreach(f: this.type => Unit): Unit = ...<br /> def apply[T](block: => T): T = ...<br /><br /> def interruptible: Lock<br /> def uninterruptible: Lock<br /> }</code></pre></div><br /><div>Calling <code>interruptible</code> returns a view of the same underlying Lock, except that <code>lock</code> behaves like <code>lockInterruptibly</code>. Calling <code>uninterruptible</code> returns a Lock with the original default semantics. The benefit of reusing the Lock interface to mean different things (either interruptible or uninterruptible semantics) is that we can reuse the <code>map/foreach/apply</code> methods that we've already implemented for Lock. (Of course, the user still has to deal with catching any exceptions that might be thrown by an interruptible Lock... The Scala compiler won't complain when you've forgotten to catch an Exception, like the Java compiler does.)</div><br /><div>But what do we do about the two <code>tryLock</code> methods? They return Boolean, not Unit, so we can't use them as drop-in replacements for <code>lock</code>. One approach is to add an <code>attempt</code> method to Lock that uses <code>tryLock</code> instead of <code>lock</code>:</div><br /><div><pre><code> def attempt[T](block: => T): ??? = ???</code></pre></div><br /><div>The problem here is... what do we return? We'd like to return T, but we can't guarantee that. If <code>tryLock</code> fails, we don't have a T to return since we won't have run the block. We could try returning Unit:</div><br /><div><pre><code> def attempt(block: => Unit): Unit = ...</code></pre></div><br /><div>If the lock is acquired successfully, the block is run, otherwise it isn't run. But then we have no idea whether the block ran or not, and no way to run a different block in the case of failure to grab the lock. We could try propagating the Boolean:</div><br /><div><pre><code> def attempt(block: => Unit): Boolean = ...</code></pre></div><br /><div>Then we can check the result of <code>attempt</code> to see if the block succeeded. If it didn't, we can execute another block. However, this means wrapping our call to <code>attempt</code> inside an if...else statement, which seems ugly. We also have no way to return a result from the block. We could try passing two different blocks:</div><br /><div><pre><code> def attempt[T](block: => T)(orElse: => T): T = ...</code></pre></div><br /><div>Now we can pass two blocks to <code>attempt</code>, one that gets called if the lock is acquired successfully, and one that gets called if the lock can't be acquired. Now we can always return a <code>T</code>. This is still kind of ugly though. Passing two different blocks side-by-side doesn't give us any indication of which is which. Also, every time we wanted to do nothing if the lock can't be acquired, we'd have to explicitly pass an empty block. This doesn't seem very clean.</div><br /><div>The best approach is to return <a href="http://www.scala-lang.org/docu/files/api/scala/Option.html">Option</a>. Option is a class in the Scala standard library. It's commonly used in situation where some action might succeed and return a value, or fail and return nothing. It has two subclasses, Some (which wraps the value when it exists) or None (which indicates the absence of a value). Conveniently, it also has a <code>getOrElse</code> method which returns the wrapped value (if it exists) or executes an alternative block of code (if the value doesn't exist). We can define <code>attempt</code> in terms of Option:</div><br /><div><pre><code> def attempt[T](block: => T): Option[T] = ...</code></pre></div><br /><div>And use it like so:</div><br /><div><pre><code> val lock: Lock = ...<br /> lock.attempt {<br /> // access the resource protected by this lock<br /> } getOrElse {<br /> // execute some block if the lock can't be acquired<br /> }</code></pre></div><br /><div>This neatly solves most of the concerns we had. If we don't care about the return value we can ignore it, if we do care we can access it. If we don't care about the "or else" condition, we don't have to include it, if we do care we can. However, we still face the explosion of methods for all the different locking modes. At the very least we'd need an <code>attemptFor</code> method which also takes a duration of time to wait to acquire the lock (using the other version of <code>tryLock</code>). Also, we wouldn't be able to use these locking modes inside for-comprehensions, because <code>map/foreach</code> are hard-wired to use <code>lock</code> (interruptible or uninterruptible versions).</div><br /><div><span class="Apple-style-span" style="font-size:x-large;">Better Abstractions</span></div><br /><div>The frustrating thing is that <code>lock</code> and <code>tryLock</code> are so similar, yet so different. They both try to acquire a lock. One can fail (return false), one can't (always returns (or throws an exception...)). For one, higher-order functions return <code>T</code>. For another, they return <code>Option[T]</code>. They're similar enough that we want the same kinds of abstractions for both, but different enough that we can't use the same abstraction. So why not use a different abstraction?</div><br /><div><pre><code> trait TryingLock {<br /> def lock(): Boolean<br /> def unlock(): Unit<br /><br /> def map[T](f: this.type => T): Option[T] = ...<br /> def foreach(f: this.type => Unit): Unit = ...<br /> def apply[T](block: => T): Option[T] = ...<br /> }</code></pre></div><br /><div>The differences between Lock and TryingLock reflect the different behaviors that lock modes might have. But they share fundamentally very similar abstractions. Now our <code>attempt</code> and <code>attemptFor</code> methods can both return TryingLocks:</div><br /><div><pre><code> def attempt: TryingLock<br /> def attemptFor(duration: Duration): TryingLock</code></pre></div><br /><div>As with our <code>interruptible</code> and <code>uninterruptible</code> toggles, it's critically important that these methods return just views of the same underlying lock. They should merely "toggle" the mode we're using to access the underlying lock. Conveniently, our code example using <code>attempt</code> above works as-is with this newer version of <code>attempt</code>, thanks to TryingLock's <code>apply</code> method.</div><br /><div><span class="Apple-style-span" style="font-size:x-large;">The End Result</span></div><br /><div><pre><code> val lock: Lock = ...<br /><br /> lock {<br /> // uninterruptible access to shared resources<br /> }<br /><br /> lock.interruptible {<br /> // interruptible access to shared resources<br /> }<br /><br /> lock.attempt {<br /> // non-blocking attempt at access to shared resources<br /> }<br /><br /> lock.attemptFor(10.minutes) {<br /> // blocking (with timeout) attempt at access to shared resources<br /> } getOrElse {<br /> // code to execute if attempt fails<br /> }</code></pre></div><br /><div>You can get the <a href="http://github.com/joshcough/scala-parallel/tree/master">source</a> on Github. Enjoy!</div><br /><div><span class="Apple-style-span" style="font-size: large;"><b>Update:</b></span></div><br /><div>As Miles Sabin <a href="http://twitter.com/milessabin/status/2523324147">points</a> <a href="http://twitter.com/milessabin/status/2523671487">out</a>, the abstractions I devoloped for Lock are all nested, yet one of the major motivations for Lock was non-nested locking disciplines like <a href="http://cs.oswego.edu/pipermail/concurrency-interest/2007-April/004076.html">hand-over-hand locking</a> and <a href="http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/locks/ReentrantReadWriteLock.html">lock downgrading</a>. I didn't want to neglect these use cases when I wrote my Lock trait, which is why the <code>lock</code> and <code>unlock</code> primitives are <a href="http://github.com/joshcough/scala-parallel/blob/f38ced2d482440ad424783c4515054bc30d8dce8/src/main/scala/scala/util/concurrent/locks/Lock.scala">still available</a> instead of buried as private methods. However, the majority of uses of Lock are probably nested, so in the spirit of making easy things easy and hard things possible, I thought it would be useful to provide nested access to Locks, as well as access to the non-nested <code>lock</code> and <code>unlock</code> primitives.</div>Jorge Ortizhttp://www.blogger.com/profile/14454965475839432618noreply@blogger.comtag:blogger.com,1999:blog-3485270733347528987.post-73631250941026093892009-01-01T16:30:00.000-08:002009-11-14T11:11:25.853-08:00Facebook Ads: FailOne of Facbeook's monetizations strategies has been to imitate Google's very successful AdWords program, which places small, targeted, and unobtrusive text ads next to Google's search results. AdWords accounts for a large part of Google's considerable revenues, but Facebook has not enjoyed similar success with their own ads. There are many reasons why Google's ads work and Facebook's ads fail, but today I just want to mention one particular failure on Facebook's part: Facebook shows me ads when I don't want to see them, and when I do want to see ads Facebook makes it impossible for me to do so. <br /><br />First, consider Google. When I don't want to see ads, Google makes it easy for me to ignore them. Google's ads are small, text-only, set to one side, and are clearly labeled as advertisements. This empowerment to easily ignore ads keeps me happy as a user and coming back to Google for search. However, for certain search terms (like "flowers" or "divorce lawyer") it's likely that I'm looking for a certain product or service and actually want to see ads. When I want to see ads, Google makes it easy for me to do so: they're right there.<br /><br />Now consider Facebook. When I don't want to see ads, Facebook makes it just as easy as Google for me to ignore them. Facebook's ads are similarly small, set to one side, and clearly labeled as advertisements. What about when I want to see ads? Facebook doesn't run a general purpose search engine, so they don't know when I'm looking to buy flowers or get a divorce lawyer. (Maybe some day they'll figure out that my mom's birthday is coming up or that my spouse just cheated on me and posted video evidence of the infidelity to Facebook, but they're not quite there yet.) However, there are times when I find myself wanting to look at ads on Facebook: in the split-second after I click a link and I'm waiting for the next page to load, I have nothing better to do so my eyes wander and I glance at the ads that I had previously ignored. Every once in a while I skim an ad that actually piques my interest. I want to know more, but by then it's too late. A new page has loaded, and with it a new ad. Perplexingly, hitting the 'Back' button on my browser shows me the page I was previously on, but not the same ad that was previously on it. Yes, there have been Facebook ads compelling enough to make me interrupt my normal web browsing flow and go back a page on my browser just to give the ad a second look, and maybe even click on it, but Facebook chooses to show me a different ad instead.<br /><br />Major Fail.Jorge Ortizhttp://www.blogger.com/profile/14454965475839432618noreply@blogger.com