Scala Word of the Day: For Loops (All 4 Kinds!)
Jan 22nd, 2010 by Brian Maso

You think you know how to define a for loop, do you? Scala has 4 different kinds.


1. Traditional for loop. Loops through the members of an Iterable. The result of the loop block is Unit.

for(mbr <- collection) {
  println(mbr)
}

Not surprisingly, this loop prints out the members of the collection, in the order in which they are returned by the collection’s iterator. In fact, this loop construct is translated exactly to

collection.foreach(mbr => println(mbr))
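To convince yourself the two forms really do the same thing, here is a minimal sketch that records what each one visits. The sample list and the names ForeachSketch, viaFor and viaForeach are made up for illustration, and I'm assuming a reasonably current Scala (2.13 or 3).

```scala
object ForeachSketch {
  // Hypothetical sample data, not from the original example.
  val collection = List("a", "b", "c")

  // Traditional for loop: record each member as it is visited.
  def viaFor: List[String] = {
    val seen = scala.collection.mutable.ListBuffer.empty[String]
    for (mbr <- collection) { seen += mbr }
    seen.toList
  }

  // The foreach() call that the loop above is sugar for.
  def viaForeach: List[String] = {
    val seen = scala.collection.mutable.ListBuffer.empty[String]
    collection.foreach(mbr => seen += mbr)
    seen.toList
  }
}
```

Both methods should hand back the members in the same order, which is all the "translated exactly" claim amounts to.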

2. For…Yield Loop. Put the yield keyword between the generator clause and the body of a for loop. The loop then has a result, which is a collection of the yielded values. Thus the for loop can act as the RHS of an assignment.

val transformed = for(mbr <- collection) yield {
  transform(mbr)
}

Each time through this example loop, the value of the loop body is yielded out. All the yielded results are collected into a single collection. This type of for loop is syntactic sugar for the following statement

val transformed = collection.map(mbr => transform(mbr))

That’s right, a simple for…yield loop is just syntactic sugar for Iterable.map().
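A quick way to check the equivalence is to run both forms side by side. This is only a sketch: the sample list and the transform function here are hypothetical stand-ins, not anything from a real codebase.

```scala
object YieldSketch {
  val collection = List(1, 2, 3)          // hypothetical sample data
  def transform(n: Int): Int = n * 10     // hypothetical transform

  // The for...yield form.
  val viaFor: List[Int] = for (mbr <- collection) yield transform(mbr)

  // The map() call it is sugar for.
  val viaMap: List[Int] = collection.map(mbr => transform(mbr))
}
```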


Interlude: Watch Out for Infinite Iterables

Infinite (or extremely large) iterables, usually implemented as Streams or Ranges, are a bit tricky. For example, it probably wouldn’t surprise you to learn that the following traditional for loop will not terminate (at least not in your lifetime):

for(i <- 0 until Int.MaxValue) println(i)

But executing this next for…yield loop does not cause an indefinite wait — it works fine! (Go ahead, try it out!)

val numberStrings = for(i <- 0 until Int.MaxValue) yield ("" + i)

How can this be so? You need to know that an expression like 0 to n or 0 until n results in a Range object, which (in Scala 2.7, at least) is a non-strict collection. That is, it calculates the “next” value as you iterate through it, rather than pre-computing all members up front and storing them in memory when the Range is created.

Iterable functions that yield a scalar result, like foldLeft() or length() or any catamorphic function, are not safe to use with non-strict iterators, because these methods must iterate over the iterable’s entire contents to produce a result. The foreach() method falls into this unsafe category: foreach()’s return type is Unit, which is considered scalar.

Remember the traditional for loop is just syntactic sugar for a foreach() call, and this explains why looping over the range with a traditional loop causes an indefinite wait.

Iterable functions that yield results of similar size to the input, such as map() and flatMap(), are perfectly safe to use with non-strict iterators. These methods themselves yield non-strict iterators, and don’t need to iterate over the source iterator’s contents to yield a result.

The for…yield loop is just syntactic sugar for a map() call, and this explains why looping over the Range with a for…yield loop doesn’t cause an indefinite wait.
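One caveat if you try this on a newer Scala: as far as I know, from 2.8 onward map() on a Range builds its whole result eagerly, so the non-strict behavior has to be asked for explicitly. The sketch below assumes Scala 2.13 or later and uses a view and a LazyList to get it; the names LazySketch, lazyStrings, firstThree and evens are mine.

```scala
object LazySketch {
  // A view defers map() until elements are actually requested.
  val lazyStrings = (0 until Int.MaxValue).view.map(i => "" + i)

  // Only the first three elements are ever computed here.
  val firstThree: List[String] = lazyStrings.take(3).toList

  // LazyList (Scala 2.13+) models a genuinely unbounded sequence.
  val evens: List[Int] = LazyList.from(0).map(_ * 2).take(4).toList
}
```

Calling a scalar-result method such as foldLeft() on lazyStrings would still walk the whole range, which is the same trap the traditional for loop falls into.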

End of Interlude


3. Guarded For…Yield Loop. Throw an if guard into a for…yield loop, and it’s translated to a filter-map pipeline (perhaps I should say filter |> map, using the pipe operator).

// Using the guarded for...yield loop syntactic sugar
val oddSquares =
  for(i <- 0 until Int.MaxValue if i % 2 == 1) yield (i.toLong * i)

// Exactly the same thing as
val oddSquares = (0 until Int.MaxValue).filter(i => i % 2 == 1).map(i => i.toLong * i)

And as the example implies, this for loop style is non-strict collection safe (because both filter() and map() are non-strict safe).
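Here is a small-range version of the same comparison that is cheap to run on any Scala version. It is a sketch with made-up names (GuardSketch, viaFor, viaPipeline). Note that current compilers translate the guard to withFilter() rather than filter(), which avoids building an intermediate collection but gives the same elements.

```scala
object GuardSketch {
  // Guarded for...yield over a deliberately small range.
  val viaFor = for (i <- 0 to 10 if i % 2 == 1) yield i * i

  // The hand-written filter-then-map pipeline.
  val viaPipeline = (0 to 10).filter(i => i % 2 == 1).map(i => i * i)
}
```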


4. Nested For Loop. Nesting iteration over iterators is syntactic sugar for nested flatMap() calls. Stare for a little bit at the loop below and the form it gets translated into by the Scala compiler; you don’t need my explanation:

// Long-winded nested for loop
val pairsThatSumTo100 =
  for(i <- 0 to 100;
      j <- i to 100 if i + j == 100)
    yield (i, j)

// Slightly shorter but harder to read raw form that gets compiled
val pairsThatSumTo100 =
  (0 to 100).flatMap(i=>(i to 100).filter(j=>i+j==100).map(j=>(i,j)))

Note that nested for loops are also non-strict safe, because filter(), map() and flatMap() are all non-strict safe. Only the traditional for loop is not safe for non-strict use.
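If staring didn't do it, running both forms on a smaller target might. This sketch uses a sum of 10 instead of 100 and plain tuples; the names NestedSketch, viaFor and viaFlatMap are hypothetical.

```scala
object NestedSketch {
  // Nested generators with a guard on the inner one.
  val viaFor =
    for (i <- 0 to 10; j <- i to 10 if i + j == 10) yield (i, j)

  // Outer generator becomes flatMap(); inner becomes filter() then map().
  val viaFlatMap =
    (0 to 10).flatMap(i => (i to 10).filter(j => i + j == 10).map(j => (i, j)))
}
```

Only the outermost generator turns into flatMap(); the innermost one is always the plain map() that carries the yield expression.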

Pipe Operator for Scala
Oct 2nd, 2009 by Brian Maso

My first not-completely-trivial task in Scala is to implement the equivalent of the F# pipe operator.

Quick Explanation of the Pipe Operator

The pipe operator is syntactic sugar that makes certain patterns of code easier to read. When generating a final result by a sequence of value calculations, it’s easier to read the sequence from left to right, but Scala only supports passing values to functions in function(value) form, which basically forces the eyes to read the sequence in right-to-left order, which is rather unnatural.

Here is one way of expressing a sequence of calculations, relying on local vals to hold intermediate sequence values:

def shuffle(str: String): String = . . .
def randomize(str: String): String = . . .
def camelCase(str: String): String = . . .
def futzWith(str: String): String = . . .
def applySeveralStringTransforms(str: String): String = {
  val shuffled = shuffle(str)
  val randomized = randomize(shuffled)
  val camelCased = camelCase(randomized)
  futzWith(camelCased)
}

Which reads easily enough: the result of the function is the result of shuffling, randomizing, camel-casing and finally futzing with the original String. And here’s the equivalent calculation as a single expression, without local vals. Note how the sequence of calculations now reads from right to left: shuffle, the first step, is the innermost call, then randomize, etc.

def applySeveralStringTransforms(str: String): String = {
  futzWith(camelCase(randomize(shuffle(str))))
}

A pipe operator allows you to pass the result of an expression to the right, preserving left-to-right visual order while still being semantically equivalent to either form above:

def applySeveralStringTransforms(str: String): String = {
  str |> shuffle |> randomize |> camelCase |> futzWith
}

Attempt #1: Right-Associative Operator

My first attempt has me defining a right-associative operator, using Odersky’s pimp-my-library technique, to generate a temporary PipeFunc object which has a “|>:” operator. That is, I’m using a couple tricks to get the compiler to automatically rewrite this

x |>: func

like this

(new PipeFunc(func)).|>:(x)

Since my operator “|>:” ends with a colon, Scala treats it as a “right-associative” function — a function of the right-operand “func”, rather than a function of the left-operand “x”. A very short-lived PipeFunc object gets created by an implicit conversion, and this object has a method named “|>:” that gets applied.

Here’s the definition of my operator:

object Pipeline {
  implicit def toPipeFunc[X, Y](func: X => Y) = new PipeFunc(func)
  class PipeFunc[X, Y](func: X => Y) {
    def |>:(value: X): Y = func(value)
  }
}
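For what it's worth, here is a sketch of the same definition used with an explicitly typed function value, which sidesteps any question of how a bare method name gets converted to a function. It assumes a current Scala (2.13 or 3); the wrapper object RightPipeSketch and the names shout and result are made up for the example.

```scala
import scala.language.implicitConversions

object RightPipeSketch {
  // Same shape as the definition above, with an explicit result type
  // on the implicit conversion.
  implicit def toPipeFunc[X, Y](func: X => Y): PipeFunc[X, Y] = new PipeFunc(func)

  class PipeFunc[X, Y](func: X => Y) {
    // Ends in a colon, so it is invoked on the right-hand operand.
    def |>:(value: X): Y = func(value)
  }

  // Hypothetical function value with an explicit function type.
  val shout: String => String = _.toUpperCase

  // Rewritten by the compiler to toPipeFunc(shout).|>:("hello").
  val result: String = "hello" |>: shout
}
```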

There are immediate problems with this approach. Because of how Scala 2.7.5 (the version I am using) parses this code, it is apparently unable to recognize that the right-hand side of |>: is a function reference. So the following actually causes a compiler error:

// complains "|>:" is not a function of type Unit
"Hello, World!" |>: println

I’m not sure why this is. Using parens fixes the problem, but makes the expression too surprising and ugly for my taste. I’m going to have to go back to the drawing board…

Attempt #2: Left-Associative Operator

Instead of augmenting the function argument on the right, I’ll try augmenting the value argument on the left with a “|>” operator. I’ll use the same pimp-my-library technique to use an implicit conversion of the left-hand object to a temporary object that has a “|>” operator, which takes a function as its right-hand argument. That is, I’m using a couple tricks to get the compiler to automatically convert this

x |> func

to this

(new PipeLink(x)).|>(func)

Here’s my code for the operator definition:

object Pipeline {
  implicit def toPipeLink[X](v: X): PipeLink[X] = new PipeLink[X](v)
  class PipeLink[X](value: X) {
    def |>[Y](func: X => Y): Y = func(value)
  }
}

OK, works correctly, and doesn’t have any of the problems that the right-associative operator had.

I do have a hidden performance issue: creation of a very short-lived temporary object. Consider how the following single expression gets rewritten by the Scala compiler:

str |> shuffle |> randomize |> camelCase |> futzWith

to this:

(new PipeLink((new PipeLink((new PipeLink((new PipeLink(str)).
    |>(shuffle))).|>(randomize))).|>(camelCase))).|>(futzWith)

Ugly – obviously; wastefully creates too many objects – also true. 4 temporary PipeLinks created and thrown away. The JVM GC is pretty good at handling short-lived objects, but this is terribly wasteful. I can’t think of a way of getting rid of temporary objects — any suggestions greatly appreciated!
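One possible answer, on Scala versions much newer than the one used here: value classes (Scala 2.10+) let the wrapper extend AnyVal, so the compiler can usually call the method without allocating the wrapper at all, though it may still box in some generic contexts. Scala 2.13 also ships a pipe method in scala.util.chaining. The sketch below assumes Scala 2.13 or later; the shuffle and camelCase bodies are hypothetical stand-ins, not the real transforms.

```scala
import scala.util.chaining._

object ValueClassPipeSketch {
  // Extending AnyVal makes PipeLink a value class, so in most cases
  // no temporary wrapper object is allocated per use of |>.
  implicit class PipeLink[X](val value: X) extends AnyVal {
    def |>[Y](func: X => Y): Y = func(value)
  }

  // Hypothetical stand-ins for the string transforms.
  def shuffle(str: String): String = str.reverse
  def camelCase(str: String): String = str.capitalize

  // The custom operator.
  val viaOperator: String = "abc" |> shuffle |> camelCase

  // The standard library's pipe, available since Scala 2.13.
  val viaStdlib: String = "abc".pipe(shuffle).pipe(camelCase)
}
```

The operator form keeps the F# look, while pipe needs no definitions of your own; which one reads better is mostly a matter of taste.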

Smarter Than Me

Only after implementing did I find that Steve Gilham had already posted his implementation of the pipe operator. He immediately went with a left-associative operator implementation. Must be a clever guy! Both his solution and mine suffer from creating a temp object with each invocation of the pipe operator.
