SIDEBAR
»
S
I
D
E
B
A
R
«
Non-Strict + Zip = Fab Fib!
Apr 7th, 2010 by Brian Maso

Just reviewed a gem from Programming Scala. Of all the Fibonacci implementations I’ve seen, my new favorite is below. Its 1 statement long, and there’s not a recursive function in sight:

lazy val fib: Stream[Int] =
 Stream.cons(0, Stream.cons(1, fib.zip(fib.tail).map(p => p._1 + p._2)))

If you have not seen that zip trick before, follow me on a little explanation. The code defines a Stream — a non-strict iterable — that begins with two literal values “0″ and “1″. The stream then continues the Fibonacci sequence by zipping the sequence to itself. More specifically, to its own tail sequence.

This figure illustrates what the zipper is creating.

Zipping a stream to itself to generate Fibonacci sequence

The tail value of the initial sequence is just the sequence starting with the literal value “1″. The zipper creates Pairs out of the each member of the sequence pairwise joined with the next member of the sequence.

The first pair is (0, 1).

The second pair is then (1, 0 + 1) = (1, 1).

The third pair is then (1, 1 + 1) = (1, 2).

The forth pair is then (2, 1 + 2) = (2, 3). And so on.

The coolest part is of course the complete lack of apparent recursion. The whole sequence is lazily evaluated, so the Stream takes up little space — one new Stream object per element in the sequence.

Generalizing

We can generalize this stream-zipping technique. When the value of a Stream element n can be calculated from the previous k sequence members, we can use a k-ary version of this technique. That is, if the stream theStream member n can be defined by some function s:

def theSeq(n) = s(theSeq(n-1), theSeq(n-2), …, theSeq(n-k))

We can define the stream in a single statement thus:

  • Explicitly define the first k-1 stream members
  • For all other members, perform k-1 zips to create a TupleK of the previous k sequence elements.
  • A single closure then defines the next element from this Tuple.

Here, for example, is a rolling average of the last 4 members of a Stream[Double]:

val data: Stream[Double] = ...
val padded_data = Stream.fill(4)(0.0) ++ data ++ Stream.fill(4)(0.0)
  // Note: tail padding not a problem even if data is infinite.
 
/* Here's where the stream is joined to itself. Also,
   mapping the (((Int,Int),Int),Int) to (Int,Int,Int,Int)
   for readability. Can't be recursively defined   */
def zip4[A](str: Stream[A]): (A,A,A,A) =
 (str zip str.tail zip str.tail.tail zip str.tail.tail.tail) map { p=>
     (p._1._1._1, p._1._1._2, p._1._2, p._2)}
 
/* Could be recursively defined in terms of
   base type Product */
def avg4(p: (Double,Double,Double,Double)): Double =
 (p._1, p._2, p._3, p._4) / 4
 
/* Finally, generating the rolling-average stream */
lazy val avg_rolling: Stream[Double] =
 zip4(padded_data) map (avg4)

You can use Iterator.sliding(n) to get a similar effect. And that does work on infinite, non-strict streams. Personally, I just though this technique was so cool, and it does have the benefit of strongly-typed tuples. (Iterator.sliding() simple provides more Streams. Try it out if you’re curious.)

Scala Word of the Day: For Loops (All 4 Kinds!)
Jan 22nd, 2010 by Brian Maso

You think you know how to define a for loop, do you? Scala has 4 different kinds.


1. Traditional for loop. Loops through the members of an Iteratable. The result of the loop block is Unit.

for(mbr <- collection) {
  println(mbr)
}

Not surprisingly, this loop prints out the members of the collection, in order that they are returned by the collection’s iterator. In fact, this loop construct is translated exactly to

collection.foreach(mbr => println(mbr))

2. For…Yield Loop. Use the yield keyword within the body of a for loop. The loop block has a result which is an iterator of the yielded results. Thus the for loop can act as the RHS of an assignment.

val transformed = for(mbr <- collection) {
  yield transform(mbr)
}

Each time through this example loop, the result of the loop is yielded out. All the yielded results are collected in to a single iterator. This type of for loop is syntactic sugar for the following statement

val transformed = collection.map(mbr => transform(mbr))

That’s right, a simple for…yield loop is just syntactic sugar for Iteratable.map().


Interlude: Watch Out for Infinite Iterables

Infinite (or extremely large) iterables, usually implemented as Streams or Ranges, are a bit tricky. For example, it probably wouldn’t surprise you to learn that the following traditional for loop will not terminate (at least not in your lifetime):

for(i <- 0 to 100000000000) println(i)

But executing this next for…yield loop does not cause an indefinite wait — it works fine! (Go ahead, try it out!)

val numberStrings = for(i <- 0 to 100000000000) yield ("" + i)

How can this be so? You need to know that o to 10000000000 results in a Range object, which is a non-strict iterator. That is, it is an iterator that calculates the “next” value as you iterate through it, rather than pre-computing all members up front and storing them in memory when the Range is created.

Iteratable functions that yield a scalar result, like foldLeft() or length() or any catamorphic function, are not safe to use with non-strict iterators, because these methods must iterate over the entire iteratable’s contents to produce a result. The foreach() method falls in to this unsafe category — foreach()‘s return type is Unit, which is considered scalar.

Remember the traditional for loop is just syntactic sugar for a foreach() call, and this explains why looping over the range with a traditional loop causes an indefinite wait.

Iteratable functions that yield results of similar size to the input, such as map() and flatMap(), are perfectly safe to use with non-strict iterators. These methods themselves yield non-strict iterators, and don’t need to iterate over the source iterator’s contents to yield a result.

The for…yield loop is just syntactic sugar for a map() call, and this explains why looping over the Range with a for…yield loop doesn’t cause an indefinite wait.

End of Interlude


3. Guarded For…Yield Loop. Throw an if guard in to a for..yield loop, and its translated to a filter-map pipeline (perhaps I should say filter |> map, using the pipe operator).

// Using the guarded for...yield loop syntactic sugar
val oddSquares =
  for(i <- 0 to 10000000000 if i % 2 == 1) yield (i*i)
 
// Exactly the same thing as
val oddSquares = (0 to 10000000000).filter(i=>i % 2 == 1).map(i=>i*i)

And as the example implies, this for loop style is non-strict collection safe (because both filter() and map() are non-strict safe).


4. Nested For Loop. Nesting iteration over iterators is syntactic sugar for a nested flatMap() call. Stare at the loop below and the form it gets translated in to by the Scala compiler for a little bit, you don’t need my explanation:

// Long-winded nested for loop
val pairsThatSumTo100 =
  for(i <- 0 to 100;
      j <- i to 100 if i + j == 100)
    yield Pair(i, j)
 
// Slightly shorter but harder to read raw form that gets compiled
val pairsThatSumTo100 =
  (0 to 100).flatMap(i=>(i to 100).filter(j=>i+j==100).map(j=>Pair(i,j)))

Note that nested for loops are also non-strict safe, because filter(), map() and flatMap() are all non-strict safe. Only the traditional for loop is not safe for non-strict use.

»  Substance:WordPress   »  Style:Ahren Ahimsa