SIDEBAR
»
S
I
D
E
B
A
R
«
Scala Word of the Day: Catamorphism
January 20th, 2010 by Brian Maso

Catamorphism is the $20 word for a function that condenses a set of values down to a single value. For example, in SQL the terms grouping function or aggregator function refer to catamorphic functions, which includes SQL functions AVERAGE(), SUM(), and MEDIAN(). All of these SQL functions have in common a pattern that operates on a set of numeric values, and returns a single numeric value.

In functional programming, catamorphisms are embodied in the folds functions. I like Scala, so I’ll discuss some catamorphic function examples in found in the standard Scala library’s Iterable trait.

The Iterable trait in Scala represents the concept of an ordered series of same-typed values. This trait is common to all the collection types (Lists, Seqs, and Streams, and many other “series like” things). The Iterable trait defines several catamorphic functions — functions which reduce the series of values down to a single value.

Perhaps the simplest to start with is the forall() and exists() methods. These methods each take a Boolean test operation, applies it to each member of the Iterable, and returns the logical AND (forall()), or OR (exists) of the test values. Despite the fact that they act differently, both methods have the same type signature, because both are Boolean catamorphisms:

trait Iterable[A] {
  ...
  def forall(test: (A) => Boolean): Boolean = ...
  def find(test: (A) => Boolean): Boolean = ...
  ...
}

Event more generic are the foldLeft() and foldRight() methods in Iterable. foldRight() is really just a slight tweak on foldLeft(), so I’ll just explain left-hand version and for the right-hand version I’ll just do some… erm… hand-waving.

The foldLeft() method receives an aggregator function used to combine each member of the Iterable successively in to an aggregated value. foldLeft() starts with an initial value, which it combines with the first Iterable value using the aggregator function. The result is then aggregated with the second Iterable value; and then the third, and so on, until the entire Iterable has been reduce down to a single aggregated value. Thus foldLeft() is a catamorphism, as it combines and reduces the series of values represented by an Iterable down to a single value.

foldRight() is the same as foldLeft() — the only difference being which end of the series each starts with. The foldLeft() function starts with the head of the series (traditionally drawn to the left of the tail — hence the name) and proceeds to the coda; foldRight() starts with the series coda and proceeds to the head.

trait Iterable[A] {
  ...
  def foldLeft[B] (initialValue: B)(aggregatorFunc: (B, A) => B): B = ...
  def foldRight[B](initialValue: B)(aggregatorFunc: (B, A) => B): B = ...
  ...
}

If you think for a second, you can see how forall() and exists() could be implemented using foldLeft():

trait Iterable[A] {
  ...
  def foldLeft[B] (initialValue: B)(aggregatorFunc: (B, A) => B): B = ...
  ...
  def forall(test: (A) => Boolean): Boolean =
      foldLeft(true)(_ && test(_))
 
  def exists(test: (A) => Boolean): Boolean =
      foldLeft(false)(_ || test(_))
  ...
}

In addition to forall() and exists() there are other catamorphic convenience functions in Iterable, and as it turns out each of them could be re-implemented using foldLeft() (or foldRight()) as well. For example, mkString() creates a string representation of the Iterable series by combining each member’s toString() value, along with a prefix, a suffix, and a list separator string.

trait Iterable[A] {
  ...
  def foldLeft[B] (initialValue: B)(aggregatorFunc: (B, A) => B): B = ...
  ...
  def mkString(prefix: String, separator: String, suffix: String): String =
      foldLeft(prefix)(_ + separator + _) + suffix
  ...
}

Leave a Reply

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

»  Substance:WordPress   »  Style:Ahren Ahimsa