A recent thread on the scala-users list discussed a nifty technique for working efficiently with a common sequence-processing use case. The technique was dubbed view-map-find. Here I describe what I think is a slightly better tweak on the idea, which I’ll give the daunting name view-map-filter*-find. A good Scala programmer should have this in their toolbelt.
The Problem
You have a sequence, and you want to find the first (or any) element that matches some criteria. Of course your first thought should be find. Often, however, the predicate passed to find would be pretty complex, perhaps too complex to make for meaningful code. In these cases you often get much more readable, understandable code by breaking the predicate into two parts: the first is a mapping, and the second is a predicate test on the mapped value.
For example, you have a list of file names, and you want the first name that references a directory that contains at least one JPG image file. Using just find, you get a pretty hairy predicate function:
val fileNames: Seq[String] = ...
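val dirWithJPGs = fileNames find { name =>
  val file = new java.io.File(name)
  if (file.isDirectory) {
    (
      file.list.map { fn => new java.io.File(file, fn) }
        // true for each entry that is a regular file whose name ends in "jpg"
        .map { f => f.isFile && f.getName.endsWith("jpg") }
    ).foldLeft(false)(_ || _) // foldLeft, because reduceLeft throws on an empty directory
  } else { false }
}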
It’s pretty hard to see what’s going on there without staring at it for a bit. But the English description is pretty clear: each name in the fileNames list is mapped to a boolean value indicating whether or not the File is a directory AND contains at least one file ending in “jpg”.
Using maps and filters, we can get something a lot easier to understand in code:
val fileNames: Seq[String] = ...
val dirWithJPGs = fileNames map { new java.io.File(_) } find { dir =>
  dir.isDirectory &&
  (for (file <- dir.list.map { n => new java.io.File(dir, n) })
    yield file.getName.endsWith("jpg")
  ).foldLeft(false)(_ || _) // false for an empty directory
}
I find that a bit easier to read. An additional filter call is going to make it even easier to read:
val fileNames: Seq[String] = ...
val dirWithJPGs = fileNames map { new java.io.File(_) } filter {_.isDirectory} find { dir =>
  // only directories reach this point, so dir.list is the directory's entries
  (for (file <- dir.list.map { n => new java.io.File(dir, n) })
    yield file.getName.endsWith("jpg")
  ).foldLeft(false)(_ || _)
}
First the map call translates the initial String values into something suitable for testing (java.io.File instances). Then any complicated test is cracked into a series of one or more filters, with the final predicate applied with a find.
The final trick, which is really necessary to make this work, is to use a non-strict view of the original sequence, not the original sequence itself. If you map the original (strict) sequence, you end up mapping all elements before testing any of them. And if you’ve cracked your predicate into intermediate filters, then each filter also has to be applied to every element that survives the previous step. But if you map a view, then each element in the view is mapped, filtered, and finally tested individually before moving on to the next.
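To make the strict-versus-view difference concrete, here is a small sketch that counts how often the mapping function runs in each pipeline. The object name ViewDemo, the counters, and the toy doubling function are all made up for illustration, and I am assuming the Scala 2.13 (or later) collections.

```scala
object ViewDemo {
  // Counters recording how many times the mapping function is invoked.
  var strictCalls = 0
  var lazyCalls = 0

  val numbers: Seq[Int] = 1 to 100000

  // Strict: map builds a full intermediate collection, then filter builds another,
  // and only then does find start looking.
  val strictResult: Option[Int] =
    numbers.map { n => strictCalls += 1; n * 2 }
      .filter(_ % 3 == 0)
      .find(_ > 10)

  // View: each element flows through map, filter, and find before the next is touched,
  // so the pipeline stops as soon as find succeeds.
  val lazyResult: Option[Int] =
    numbers.view.map { n => lazyCalls += 1; n * 2 }
      .filter(_ % 3 == 0)
      .find(_ > 10)
}
```

Both pipelines find Some(12), but the strict one calls the mapping function once for every one of the 100,000 elements, while the view stops after the sixth.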
So, say you’re looking for the first item in a 100,000-element sequence that satisfies some predicate function. You’re going to want to map and test the items individually. Making a view of a sequence is as easy as using the sequence’s view method.
val fileNames: Seq[String] = ...
val dirWithJPGs = (fileNames.view) map { new java.io.File(_) } filter {_.isDirectory} find { dir =>
  // each name is mapped, filtered, and tested here before the next one is touched
  (for (file <- dir.list.map { n => new java.io.File(dir, n) })
    yield file.getName.endsWith("jpg")
  ).foldLeft(false)(_ || _)
}
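For what it’s worth, here is one way the whole thing might be packaged up as a self-contained sketch. The object FirstDirWithJpgs and the helper containsJpg are hypothetical names of my own, and I have swapped the boolean fold for exists, which stops at the first matching file. I also use listFiles wrapped in Option, on the assumption that you want a null result (an I/O error, say) to count as "no JPGs" rather than blow up.

```scala
import java.io.File

object FirstDirWithJpgs {
  // Hypothetical helper: true if dir holds at least one regular file whose name ends in "jpg".
  // Option(...) guards against listFiles returning null.
  def containsJpg(dir: File): Boolean =
    Option(dir.listFiles).exists(_.exists(f => f.isFile && f.getName.endsWith("jpg")))

  // view-map-filter*-find: map names to Files, drop non-directories,
  // and stop at the first directory that passes the final test.
  def firstDirWithJpgs(fileNames: Seq[String]): Option[File] =
    fileNames.view
      .map(new File(_))
      .filter(_.isDirectory)
      .find(containsJpg)
}
```

Note that, as with the pipelines above, the result is an Option[File] rather than the original name. If you need the String back, something like firstDirWithJpgs(names).map(_.getPath) should do.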