Programmation appliquée en Scala

Copyright © Cay S. Horstmann 2015 Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License

Handling Nested Sequences

We can extend the usage of higher order functions on sequences to many
calculations which are usually expressed using nested loops.

Example: Given a positive integer n, find all pairs of positive integers i and j, with 1 <= j < i < n such that i + j is prime.

For example, if n = 7, the sought pairs are

Algorithm

A natural way to do this is to:

One natural way to generate the sequence of pairs is to:

This can be achieved by combining until and map:

  (1 until n) map (i =>
     (1 until i) map (j => (i, j)))

Question

What is the type of the result?

  (1 until n) map (i =>
     (1 until i) map (j => (i, j)))

Generate Pairs

The previous step gave a sequence of sequences, let's call it xss.

We can combine all the sub-sequences using foldRight with ++:

  (xss foldRight Seq[Int]())(_ ++ _)

Or, equivalently, we use the built-in method flatten

  xss.flatten

This gives:

  ((1 until n) map (i =>
      (1 until i) map (j => (i, j)))).flatten

Generate Pairs (2)

Here's a useful law:

 xs flatMap f  =  (xs map f).flatten

Hence, the above expression can be simplified to

(1 until n) flatMap (i =>
   (1 until i) map (j => (i, j)))

Assembling the pieces

By reassembling the pieces, we obtain the following expression:

(1 until n) flatMap (i =>
   (1 until i) map (j => (i, j))) filter ( pair =>
      isPrime(pair._1 + pair._2))

This works, but makes most people's head hurt.

Is there a simpler way?

For-Expressions

Higher-order functions such as map, flatMap or filter provide powerful constructs for manipulating lists.

But sometimes the level of abstraction required by these function make the program difficult to understand.

In this case, Scala's for expression notation can help.

For-Expression Example

Let persons be a list of elements of class Person, with fields name and age.

  case class Person(name: String, age: Int)

To obtain the names of persons over 20 years old, you can write:

  for ( p <- persons if p.age > 20 ) yield p.name

which is equivalent to:

  persons filter (p => p.age > 20) map (p => p.name)

The for-expression is similar to loops in imperative languages, except
that it builds a list of the results of all iterations.

Syntax of For

A for-expression is of the form

  for ( s ) yield e

where s is a sequence of generators and filters,
and e is an expression whose value is returned by an iteration.

Instead of ( s ), braces { s } can also be used, and then the
sequence of generators and filters can be written on multiple lines
without requiring semicolons.

Use of For

Here are two examples which were previously solved with higher-order functions:

Given a positive integer n, find all the pairs of positive integers
(i, j) such that 1 <= j < i < n, and i + j is prime.

   for {
      i <- 1 until n
      j <- 1 until i
      if isPrime(i + j)
   } yield (i, j)

Exercise

Write a version of scalarProduct (see last session) that makes use of
a for:

  def scalarProduct(xs: List[Double], ys: List[Double]) : Double =

Combinatorial Search Example

In the following we are going to see Sets which are another fundamental collection type.

We then combine Sets and For Expressions in a classical combinatorial search problem, namely the N-Queens problem.

Sets

Sets are another basic abstraction in the Scala collections.

So far, all the collections we have seen were sequences. But there are also two other two other fundamental class of collections: Setsand Maps

A set is written analogously to a sequence:

  val fruit = Set("apple", "banana", "pear")
      //> fruit  : scala.collection.immutable.Set[String] = Set(apple, banana, pear)
  val s = (1 to 6).toSet
      //> s  : scala.collection.immutable.Set[Int] = Set(5, 1, 6, 2, 3, 4)

Most operations on sequences are also available on sets:

  s map (_ + 2)
      //> res1: scala.collection.immutable.Set[Int] = Set(5, 6, 7, 3, 8, 4)
  fruit filter (_.startsWith("app"))              
      //> res2: scala.collection.immutable.Set[String] = Set(apple)
  s.nonEmpty
      //> res3: Boolean = true

(see Iterables Scaladoc for a list of all supported operations)

Sets vs Sequences

The principal differences between sets and sequences are:

Sets are unordered; the elements of a set do not have a predefined order in which they appear in the set

sets do not have duplicate elements:

  s map (_ / 2)      // Set(2, 0, 3, 1)

The fundamental operation on sets is contains:

  s contains 5       // true

Example: N-Queens

The eight queens problem is to place eight queens on a chessboard so that no queen is threatened by another.

In other words, there can’t be two queens in the same row, column, or diagonal.

We now develop a solution for a chessboard of any size, not just 8.

One way to solve the problem is to place a queen on each row.

Once we have placed k - 1 queens, one must place the kth queen in a column where it’s not “in check” with any other queen on the board.

Algorithm

We can solve this problem with a recursive algorithm:

Implementation

  def queens(n: Int) = {
   def placeQueens(k: Int): Set[List[Int]] = {
      if (k == 0) Set(List())
      else
         for {
            queens <- placeQueens(k - 1)
            col <- 0 until n
            if isSafe(col, queens)
         } yield col :: queens
   }
      placeQueens(n)
   }

Exercise

Write a function

   def isSafe(col: Int, queens: List[Int]): Boolean

which tests if a queen placed in an indicated column col is secure amongst the other placed queens.

It is assumed that the new queen is placed in the next availabale row after the other placed queens (in other words: in row queens.length).

Steps:

Queries with for

The for notation is essentially equivalent to the common operations of
query languages for databases.

Example: Suppose that we have a database books, represented as a list of books.

  case class Book(title: String, authors: List[String])

A Mini-Database

  val books: List[Book] = List(
    Book(title   = "Structure and Interpretation of Computer Programs",
	 authors = List("Abelson, Harald", "Sussman, Gerald J.")),
    Book(title   = "Introduction to Functional Programming",
	 authors = List("Bird, Richard", "Wadler, Phil")),
    Book(title   = "Effective Java",
	 authors = List("Bloch, Joshua")),
    Book(title   = "Java Puzzlers",
	 authors = List("Bloch, Joshua", "Gafter, Neal")),
    Book(title   = "Programming in Scala",
	 authors = List("Odersky, Martin", "Spoon, Lex", "Venners, Bill")))

Some Queries

To find the titles of books whose author’s name is “Bird”:

  for (b <- books; a <- b.authors if a startsWith "Bird,")
  yield b.title

To find all the books which have the word “Program” in the title:

  for (b <- books if b.title indexOf "Program" >= 0)
  yield b.title

Another Query

To find the names of all authors who have written at least two books present in the database.

  for {
    b1 <- books
    b2 <- books
    if b1 != b2
    a1 <- b1.authors
    a2 <- b2.authors
    if a1 == a2
  } yield a1

Why do solutions show up twice?

How can we avoid this?

Modified Query

To find the names of all authors who have written at least two books present in the database.

  for {
    b1 <- books
    b2 <- books
    if b1.title < b2.title 
    a1 <- b1.authors
    a2 <- b2.authors
    if a1 == a2
  } yield a1

Problem

What happens if an author has published three books?

  1. The author is printed once
  2. The author is printed twice
  3. The author is printed three times
  4. The author is not printed at all

For-Expressions and Higher-Order Functions

The syntax of for is closely related to the higher-order functions map, flatMap and filter.

First of all, these functions can all be defined in terms of for:

  def mapFun[T, U](xs: List[T], f: T => U): List[U] = 
    for (x <- xs) yield f(x)

  def flatMap[T, U](xs: List[T], f: T => Iterable[U]): List[U] = 
    for (x <- xs; y <- f(x)) yield y

  def filter[T](xs: List[T], p: T => Boolean): List[T] = 
    for (x <- xs if p(x)) yield x

Translation of For (1)

In reality, the Scala compiler expresses for-expressions in terms of map, flatMap and a lazy variant of filter.

Here is the translation scheme used by the compiler (we limit ourselves here to simple variables in generators)

A simple for-expression

  for (x <- e1) yield e2

is translated to

  e1.map(x => e2)

Translation of For (2)

A for-expression

  for (x <- e1 if f; s) yield e2

where f is a filter and s is a (potentially empty) sequence of generators and filters, is translated to

  for (x <- e1.withFilter(x => f); s) yield e2

(and the translation continues with the new expression)

You can think of withFilter as a variant of filter that does not
produce an intermediate list, but instead filters the following map or flatMap function application.

Translation of For (3)

A for-expression

  for (x <- e1; y <- e2; s) yield e3

where s is a (potentially empty) sequence of generators and filters, is translated into

  e1.flatMap(x => for (y <- e2; s) yield e3)

(and the translation continues with the new expression)

Example

Take the for-expression that computed pairs whose sum is prime:

   for {
     i <- 1 until n
     j <- 1 until i
     if isPrime(i + j)
   } yield (i, j)

Applying the translation scheme to this expression gives:

   (1 until n).flatMap(i => 
     (1 until i).withFilter(j => isPrime(i+j))
       .map(j => (i, j)))

This is almost exactly the expression which we came up with first!

Exercise

Translate

  for (b <- books; a <- b.authors if a startsWith "Bird") 
  yield b.title

into higher-order functions.

Generalization of for

Interestingly, the translation of for is not limited to lists or
sequences, or even collections;

It is based solely on the presence of
the methods map, flatMap and withFilter.

This lets you use the for syntax for your own types as well – you must only define map, flatMap and withFilter for these types.

There are many types for which this is useful: arrays, iterators, databases, XML data, optional values, parsers, etc.

For and Databases

For example, books might not be a list, but a database stored on some server.

As long as the client interface to the database defines the methods map, flatMap and withFilter, we can use the for syntax for querying the database.

This is the basis of the Scala data base connection frameworks
ScalaQuery and Slick.

Similar ideas underly Microsoft’s LINQ.

Map

Another fundamental collection type is the map.

A map of type Map[Key, Value] is a data structure
that associates keys of type Key with values of type Value.

Examples:

  val romanNumerals = Map("I" -> 1, "V" -> 5, "X" -> 10)
  val capitalOfCountry = Map("US" -> "Washington", "Switzerland" -> "Bern")

Maps are Iterables

Class Map[Key, Value] extends the collection type
Iterable[(Key, Value)].

Therefore, maps support the same collection operations as other iterables do.

Example:

  val countryOfCapital = capitalOfCountry map {
    case(x, y) => (y, x)
  }                      // Map("Washington" -> "US", "Bern" -> "Switzerland")

Note that maps extend iterables of key/value pairs.

In fact, the syntax key -> value is just an alternative way to write the pair (key, value).

Maps are Functions

Class Map[Key, Value] also extends the function type Key => Value, so maps can be used everywhere functions can.

In particular, maps can be applied to key arguments:

  capitalOfCountry("US")         // "Washington"

Querying Map

Applying a map to a non-existing key gives an error:

  capitalOfCountry("Andorra")
    // java.util.NoSuchElementException: key not found: Andorra

To query a map without knowing beforehand whether it contains a given key, you can use the get operation:

  capitalOfCountry get "US"      // Some("Washington")
  capitalOfCountry get "Andorra" // None

The result of a get operation is an Option value.

The Option Type

The Option type is defined as:

  trait Option[+A]
  case class Some[+A](value: A) extends Option[A]
  object None extends Option[Nothing]

The expression map get key returns

Decomposing Option

Since options are defined as case classes, they can be decomposed
using pattern matching:

  def showCapital(country: String) = capitalOfCountry.get(country) match {
    case Some(capital) => capital
    case None => "missing data"
  }

  showCapital("US")      // "Washington"
  showCapital("Andorra") // "missing data"

Options also support quite a few operations of the other collections, in particular, map, flatMap and filter, so you can run For Expressions on them!

Alternative Ways of Querying Map

There is an alternative way to tackle with the missing key exception when querying a Map

If we do not wish to use pattern matching on the Option result of the get methode, we can use the getOrElse methode.

So we can specify the returned value if the key is missing

capitalOfCountry getOrElse("Andorra", "<unknown>")//> res1: String = "<unknown>"
capitalOfCountry getOrElse("US", "<unknown>")     //> res2: String = Washington

Another possibility is to use the operation withDefaultValue that turns a map into a
total function:

  val cap1 = capitalOfCountry withDefaultValue "<unknown>"
  cap1("Andorra")             // "<unknown>"

You can take look at different operations in class Map here.

Sorted and GroupBy

Two useful operation of SQL queries in addition to for-expressions are
groupBy and orderBy.

orderBy on a collection can be expressed by sortWith and sorted.

  val fruit = List("apple", "pear", "orange", "pineapple")
  fruit sortWith (_.length < _.length)  // List("pear", "apple", "orange", "pineapple")
  fruit.sorted      // List("apple", "orange", "pear", "pineapple")

groupBy is available on Scala collections. It partitions a collection
into a map of collections according to a discriminator function f.

Example:

  fruit groupBy (_.head)  //> Map(p -> List(pear, pineapple),
			  //|     a -> List(apple),
			  //|     o -> List(orange))

Map Example

A polynomial can be seen as a map from exponents to coefficients.

For instance, can be represented with the map.

  Map(0 -> 5, 1 -> -2, 3 -> 1)

Based on this observation, let’s design a class Polynom that represents polynomials as maps.

Implementation of Polynom

class Poly(val terms0: Map[Int, Double]) {
  val terms = terms0 withDefaultValue 0.0
  
  def + (other: Poly) : Poly = {
    val newKeySet = terms.keySet ++ other.terms.keySet
	val setOfTuples = for(x <- newKeySet) yield
      ( x -> (terms(x) + other.terms(x)))
	new Poly(setOfTuples.toMap)
  } 
}

Variable Length Argument Lists

It’s quite inconvenient to have to write

  Polynom(Map(1 -> 2.0, 3 -> 4.0, 5 -> 6.2))

Can one do without the Map(...)?

Problem: The number of key -> value pairs passed to Map can vary.

We can accommodate this pattern using a repeated parameter:

   def Polynom(bindings: (Int, Double)*) =
     new Polynom(bindings.toMap withDefaultValue 0)
   
   Polynom(1 -> 2.0, 3 -> 4.0, 5 -> 6.2)

Inside the Polynom function, bindings is seen as a
Seq[(Int, Double)].

Final Implementation of Polynom

class Poly(val terms0: Map[Int, Double]) {
  def this(bindings: (Int, Double)*) = this(bindings.toMap)
  val terms = terms0 withDefaultValue 0.0
  
  def + (other: Poly) : Poly = {
    val newKeySet = terms.keySet ++ other.terms.keySet
	val setOfTuples = for(x <- newKeySet) yield
      ( x -> (terms(x) + other.terms(x)))
	new Poly(setOfTuples.toMap)
  } 
  override def toString =
  (for ((exp, coeff) <- terms.toList.sorted.reverse)
    yield coeff+"x^"+exp) mkString " + "
}

Last exercice for this unit!

Using for expressions write a function to find the numbers x,y,z below n such that

Homework

Do this as individual work, not with your partner

When all done, email the signed zip files to Fatemeh.Borran@heig-vd.ch