
15-150 Spring 2017

Lab 11
5 April 2017

1 Introduction
In this lab, we will talk about what sequences are, and you'll get some practice using them.
Please take advantage of this opportunity to practice writing functions with the assistance
of the TAs and your classmates. You are encouraged to collaborate with your classmates
and to ask the TAs for help.

1.1 Getting Started

Update your clone of the git repository to get the files for this week's lab as usual by running
git pull
from the top level directory (probably named 15150).

1.2 Methodology
You should practice writing requires and ensures specifications, and tests on the functions
you write on this assignment. In particular, every function you write should have both specs
and tests.

1.3 Compiling This Lab

As is common with modular code, this lab is distributed across many files and relies on the
SML/NJ compilation manager to introduce structures into the environment at the right time.
The files that contain relevant code are listed in the file, and the compilation
manager takes it from there. When you want to run your code for this lab, at the REPL,
you will enter
CM.make "";
As you progress through the lab, you'll have to edit the file to uncomment the
files you've filled in. Make sure you're comfortable with this process! The current homework
is organized in the same way, so ask your TA or a neighbour if you can't get this to work.

2 It's dangerous to go alone! Take this.
For your convenience, a brief description of some of the functions on sequences is given
here. There is a more comprehensive version of this information in your repository at
src/sequence/sequencereference.pdf.

Seq.map : ('a -> 'b) -> 'a Seq.seq -> 'b Seq.seq, which takes a function and
a sequence and returns a sequence whose elements are the results of applying the given
function to the corresponding elements of the given sequence.

Seq.reduce : ('a * 'a -> 'a) -> 'a -> 'a Seq.seq -> 'a, which combines all
the elements of a sequence using a particular function and base case.

Seq.mapreduce : ('a -> 'b) -> 'b -> ('b * 'b -> 'b) -> 'a Seq.seq -> 'b, which
combines the operations of Seq.map and Seq.reduce by applying the given function
of type 'a -> 'b to each element of the sequence before combining them as in
Seq.reduce.

Seq.length : 'a Seq.seq -> int, which returns the number of elements in the
given sequence.

Seq.nth : 'a Seq.seq -> int -> 'a, which returns the element of the given sequence
at the indicated index, assuming it is in bounds. Otherwise, it raises the
exception Range.

Seq.tabulate : (int -> 'a) -> int -> 'a Seq.seq, which computes a sequence
of the given length such that the value of each element of the sequence is the result of
applying the function to its index.

Seq.singleton : 'a -> 'a Seq.seq, which takes a value and produces a sequence
containing only that value.

Seq.append : 'a Seq.seq * 'a Seq.seq -> 'a Seq.seq, which takes two sequences
and appends the second to the end of the first.
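To get a feel for these signatures before touching the lab files, here is a small sketch. It does not use the course library: the Seq structure below is a hypothetical list-backed stand-in (an assumption made so the snippet runs on its own) that copies only the types listed above and has none of the real library's parallel cost bounds. The names s and t are illustrative.

```sml
(* ASSUMPTION: a list-backed stand-in for the course's Seq structure.
   Only the types match the reference; the costs do not. *)
structure Seq =
struct
  type 'a seq = 'a list
  exception Range
  fun map f s = List.map f s
  fun reduce g z s = List.foldr g z s
  fun length s = List.length s
  fun nth s i = List.nth (s, i) handle Subscript => raise Range
  fun tabulate f n = List.tabulate (n, f)
  fun singleton x = [x]
  fun append (s1, s2) = s1 @ s2
end

(* tabulate applies the function to each index 0 .. 4 *)
val s = Seq.tabulate (fn i => i * i) 5        (* <0, 1, 4, 9, 16> *)
val 5 = Seq.length s
val 9 = Seq.nth s 3
val 30 = Seq.reduce (op +) 0 s

(* singleton and append build a longer sequence; map transforms each element *)
val t = Seq.append (Seq.singleton 100, Seq.map (fn x => x + 1) s)
val 6 = Seq.length t
val 17 = Seq.nth t 5
```

The val bindings against constants follow the testing style used later in this handout: they raise Bind if the result is not what was expected.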

3 Mapreduce All Day
mapreduce is intended to be used with a function f of type t1 -> t2 for some types t1 and
t2, an associative binary function g of type t2 * t2 -> t2, a value z:t2 which is a zero
for g, and a sequence of items of type t1.
(Recall: g is associative when, for all a,b,c, we have g(a,g(b,c)) = g(g(a,b),c). Also,
z is a zero for g when, for all x, we have g(x,z) = x.)
The behavior of mapreduce f z g <x1, ..., xn> (assuming that g is associative) is
given by the following, where g is written infix:

mapreduce f z g <x1, ..., xn> = (f x1) g (f x2) g ... g (f xn) g z

The implementation of mapreduce uses a balanced parenthesization for pairwise
combinations, as with reduce. Its work and span are the same as for reduce when f and g are
constant time.
Although mapping and then reducing has the same asymptotic complexity as mapreduce,
mapreduce is the more efficient of the two in practice because it doesn't create an
intermediate sequence.
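As an illustration of the equation for mapreduce, the sketch below totals the lengths of some strings in two ways: once with mapreduce and once by mapping and then reducing. The Seq structure here is a hypothetical list-backed stand-in (an assumption so the snippet is self-contained), and it combines right to left rather than in the balanced order of the real library; for an associative g with zero z the result is the same. The names pages, onePass and twoPass are illustrative.

```sml
(* ASSUMPTION: list-backed stand-in for the course's Seq structure. *)
structure Seq =
struct
  type 'a seq = 'a list
  fun map f s = List.map f s
  fun reduce g z s = List.foldr g z s
  (* applies f to each element as it is combined; no intermediate sequence *)
  fun mapreduce f z g s = List.foldr (fn (x, acc) => g (f x, acc)) z s
end

val pages = ["to be", "or not", "to be"]

(* one pass: (size x1) + (size x2) + (size x3) + 0 *)
val onePass = Seq.mapreduce String.size 0 (op +) pages

(* two passes: builds the sequence of sizes first, then sums it *)
val twoPass = Seq.reduce (op +) 0 (Seq.map String.size pages)

val 16 = onePass
val true = (onePass = twoPass)
```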

Let's learn how to use mapreduce.

Task 3.1 In seqadd.sml write the functions

seqFromList : 'a list -> 'a Seq.seq

and

seqToList : 'a Seq.seq -> 'a list

Hint: You should use mapreduce for seqToList.

4 Sequence Puzzles
The following functions ask you to become familiar with Seq.tabulate, Seq.length, and
Seq.nth. Add your functions to seqadd.sml.

4.1 Filter
We've previously used the function filter, which takes a predicate and a list and evaluates
to a list containing only the items that satisfy the given predicate. One of your tasks for
this lab is to write an analogous function for sequences.

Task 4.1 Write

fun filter (p : 'a -> bool) (s : 'a Seq.seq) : 'a Seq.seq
such that filter p s evaluates to a sequence that includes only the elements x of s for
which p x = true. Your implementation must not use Seq.filter.

4.2 Reduce
Contraction is an algorithmic technique in which we take a problem, reduce it to a smaller
problem and then recur (aka recurse) on the smaller problem. It's similar to Divide and
Conquer (the technique behind merge sort) but differs in a key way. Divide and Conquer
is based around the idea that we take our problem, divide it up, perform a recursive call
on each part, then combine the results together. In contraction, we take the input, make
it smaller, and then perform a single recursive call on the smaller input. Note that this
difference can result in contraction algorithms having much better runtime than divide and
conquer algorithms. Contraction is a very powerful technique that you will explore more in
15-210 (if you go on to take it). For our purposes we will use contraction to implement
the sequence function reduce with O(n) work and O(log n) span assuming f has constant
work and span. The idea is to do pairwise reduction, as illustrated by the following example:
reduce op+ 0 <1,2,3,4,4,3,2,1>
==> reduce op+ 0 <3, 7, 7, 3 >
==> reduce op+ 0 <10, 10 >
==> reduce op+ 0 <20>
==> 20
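The stated bounds can be checked with a short recurrence argument. This is a sketch under two assumptions that the handout does not state: that each contraction round builds the half-length sequence with O(n) work and O(1) span (for example via Seq.tabulate with a constant-time function), and that Seq.nth is constant time. The constants c_0, c_1, c_2 are illustrative, and n is a power of 2.

```latex
\begin{align*}
W(1) &= c_0, \qquad W(n) = W(n/2) + c_1 n \\
W(n) &= c_0 + c_1\left(n + \tfrac{n}{2} + \tfrac{n}{4} + \cdots + 2\right)
      \le c_0 + 2 c_1 n \in O(n) \\[4pt]
S(1) &= c_0, \qquad S(n) = S(n/2) + c_2 \\
S(n) &= c_0 + c_2 \log_2 n \in O(\log n)
\end{align*}
```

The single recursive call is what keeps the work linear: the per-round costs form a geometric series instead of being repeated at every level.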

Task 4.2 Write

fun reduce (f : 'a * 'a -> 'a) (b : 'a) (s : 'a Seq.seq) : 'a
such that reduce f b s functions the same as Seq.reduce and runs in O(n) work and
O(log n) span assuming f has constant work and span. You may also assume that Seq.length
s is a power of 2. Your implementation must not use Seq.reduce or Seq.mapreduce.

4.3 Rake
In a more general sense, sometimes you may want a subsequence corresponding to some
stride along the sequence. Write the function

fun rake (s : 'a Seq.seq) (start : int, skip : int, last : int) : 'a Seq.seq

such that rake s (i,j,k) results in the sequence consisting of the elements of s at positions
i, then i + j, and so on, up to the last index of the form i + nj that is less than k.

Have a TA check your code before proceeding!

5 Finitely Branching Parallel Trees
A Finitely Branching Parallel Tree is similar to the binary trees introduced earlier in the
class, with the exception that each node can now have an arbitrary number of children
(rather than exactly 2). To represent this, we define each node to be a sequence of fbtrees
(this allows us to evaluate children of nodes in parallel). The datatype for these fbtrees is
as follows:

datatype 'a fbtree = Leaf of 'a | Node of 'a fbtree seq

All of the functions we've previously defined for binary trees have analogs for fbtrees as
well. For example, here is the size function on fbtrees.

fun size (Leaf x) = 1
  | size (Node s) = Seq.reduce (op +) 0 (Seq.map size s);
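To see size evaluate on a concrete tree, here is a self-contained sketch. The Seq structure is a hypothetical list-backed stand-in (an assumption so the snippet runs without the course library), which is why the children of each Node are written as list literals; the datatype is spelled with Seq.seq for the same reason. The tree t is illustrative.

```sml
(* ASSUMPTION: list-backed stand-in for the course's Seq structure. *)
structure Seq =
struct
  type 'a seq = 'a list
  fun map f s = List.map f s
  fun reduce g z s = List.foldr g z s
end

datatype 'a fbtree = Leaf of 'a | Node of 'a fbtree Seq.seq

(* size counts the leaves: each child is sized, then the counts are summed *)
fun size (Leaf _) = 1
  | size (Node s) = Seq.reduce (op +) 0 (Seq.map size s)

(* a root with three children: a leaf, a node with three leaves,
   and a node with no children at all *)
val t = Node [Leaf 1, Node [Leaf 2, Leaf 3, Leaf 4], Node []]

val 4 = size t          (* 1 + 3 + 0 *)
val 0 = size (Node [])  (* reduce on the empty sequence yields the base case *)
```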

You will complete the following functions in fbtrees.sml.

Task 5.1 Write the function

depth : 'a fbtree -> int

Depth for fbtrees is defined as the longest path from the root to a leaf in the tree. A leaf
should have depth 1.

Task 5.2 Write the function

trav : 'a fbtree -> 'a list

such that trav T will evaluate to the inorder traversal of the tree T. This function might be
useful when testing your code. Hint: the function @ may be useful.
Higher order functions, like map and reduce, can be implemented for fbtrees as well.
Here is the implementation for map on fbtrees:

(* map : ('a -> 'b) -> 'a fbtree -> 'b fbtree *)
(* REQUIRES: f is a total function *)
(* ENSURES: the output of map f T is equivalent to applying f
   to all the leaves of T *)
fun map f (Leaf x) = Leaf (f x)
  | map f (Node s) = Node (Seq.map (map f) s)
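The following sketch runs this map on a small tree to show that the shape is preserved and only the leaves change. As before, the Seq structure is a hypothetical list-backed stand-in (an assumption for self-containment), so children are written as list literals. The names t and doubled are illustrative.

```sml
(* ASSUMPTION: list-backed stand-in for the course's Seq structure. *)
structure Seq =
struct
  type 'a seq = 'a list
  fun map f s = List.map f s
end

datatype 'a fbtree = Leaf of 'a | Node of 'a fbtree Seq.seq

(* map on fbtrees: Seq.map handles the children, the recursive call
   handles each subtree *)
fun map f (Leaf x) = Leaf (f x)
  | map f (Node s) = Node (Seq.map (map f) s)

val t = Node [Leaf 1, Node [Leaf 2, Leaf 3]]
val doubled = map (fn x => 2 * x) t

(* same branching structure, every leaf doubled *)
val true = (doubled = Node [Leaf 2, Node [Leaf 4, Leaf 6]])
```

Comparing trees with = works here only because the stand-in sequence type is a list; the course's Seq.seq may not support equality.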

Task 5.3 Write the function

reduce : ('a * 'a -> 'a) -> 'a -> 'a fbtree -> 'a

When g is associative and z is a zero for g, reduce g z T will evaluate to the result of
pairwise combining the items in trav T using g. You may use the Sequence functions for
this task.
(Recall: g is associative when, for all a,b,c, we have g(a,g(b,c)) = g(g(a,b),c). Also,
z is a zero for g when, for all x, we have g(x,z) = x.)

6 Problem of Whacks
A Google Whack is a two-word search phrase that returns exactly one result [1]. Past
Google Whacks include "ambidextrous scallywags" and "disenthralled nimrod", though,
unfortunately, as soon as something is known to be a Google Whack, it appears on pages
talking about Google Whacks, and then it no longer is.
In this problem, you will write a function whack to determine whether a pair of words
is a Google Whack. We represent the internet (the WWW) as a sequence of strings, where
each string represents the text of a page; e.g.

val theWWW = seqFromList [

"Ethers are a class of organic compounds that contain an ether group -- an",
"hI thEres do you livEs at three maiN streEt?",
"Doctor Who is a British science fiction television programme produced by the",

6.1 Helpers
We have provided the following helpers:

(* Compute a sequence of all the words (separated by spaces)
   in the given page *)
val words : string -> string Seq.seq =
    seqFromList o (String.tokens (fn s => s = #" "))

(* Converts a bool to either 1 for true or 0 for false *)

fun boolToInt b = case b of true => 1 | false => 0

(* Determine if the given int is 1 *)

fun isOne n = case n of 1 => true | _ => false
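A quick way to check your understanding of these helpers is to run them on a short page. In the sketch below, Seq and seqFromList are hypothetical list-backed stand-ins (assumptions so the snippet runs on its own; in the lab, seqFromList is the function you write in Task 3.1). The value ws is illustrative.

```sml
(* ASSUMPTION: list-backed stand-ins for Seq and seqFromList. *)
structure Seq =
struct
  type 'a seq = 'a list
  fun length s = List.length s
  fun nth s i = List.nth (s, i)
end
fun seqFromList (l : 'a list) : 'a Seq.seq = l

(* the helpers, as provided above *)
val words : string -> string Seq.seq =
    seqFromList o (String.tokens (fn s => s = #" "))
fun boolToInt b = case b of true => 1 | false => 0
fun isOne n = case n of 1 => true | _ => false

(* String.tokens skips runs of spaces, so no empty words are produced *)
val ws = words "hi  there how are you"
val 5 = Seq.length ws
val "there" = Seq.nth ws 1

val 1 = boolToInt (Seq.nth ws 0 = "hi")
val 0 = boolToInt (Seq.nth ws 0 = "Hi")   (* comparison is case-sensitive *)
val true = isOne 1
val false = isOne 2
```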

Also note that we have put the collection of web pages in WWW.sml. It can be accessed
via WWW.theWWW at the REPL, and has type string Seq.seq. Keep in mind that these
pages are quite large.

6.2 BothHit

Task 6.1 Write the function

fun bothHit (w1 : string, w2 : string) : string Seq.seq -> bool

[1] People sometimes also ask that each word return more than one hit individually.

For any sequence of words ws, bothHit (w1,w2) ws should return true iff both w1 and w2
occur in ws. You should use the seqExists function from the homework which determines
whether an element of a sequence satisfies the given predicate. Not all of you have turned in
the homework yet, so for the time being we've provided slowSeqExists which does not meet
the cost bounds requested on the homework. You may substitute your own implementation
from the homework if you desire.
For example:
val true = bothHit ("hi", "there") (words "hi there how are you")
val false = bothHit ("hi", "there") (words "hi how are you")

6.3 SearchBoth

Task 6.2 Using bothHit, write the function

fun searchBoth (w1 : string, w2 : string) : string Seq.seq -> string Seq.seq
Given a collection of Web pages pages, searchBoth (w1,w2) pages should compute the
sequence of all pages in which both words occur. You should use the Seq.filter function.
For example,
val 2 = Seq.length (searchBoth ("Doctor","Who") theWWW)
for theWWW defined in the SML file.

6.4 CountBoth

Task 6.3 Using bothHit, write the function

fun countBoth (w1 : string, w2 : string) : string Seq.seq -> int
Given a collection of Web pages pages, countBoth (w1,w2) pages should compute the
number of pages in which both words occur.
One solution would be Seq.length o searchBoth(w1,w2). Try to find a solution that
runs in 1 pass and does not generate any intermediate sequences.

6.5 Whack

Task 6.4 Using countBoth, write the function

fun whack (w1 : string, w2 : string) : string Seq.seq -> bool
Given a collection of Web pages pages, whack (w1,w2) pages should return true if the
words are a Google Whack, and return false otherwise.
Have a TA check your code before proceeding!

6.6 BONUS TASK: All Whacks
If you manage to finish all of that, here is a more free-form bonus problem:

Task 6.5 Write a function

fun allWhacks (www : string Seq.seq) : (string * string) Seq.seq

that finds all of the Google Whacks on the www. Don't worry about efficiency; the point is
to play with sequence functions. You may want to write a helper or two.