Haskell

Grappling with Monads

March 31st, 2006

In learning Haskell, I have got to the point where I am trying to understand Monads.

Haskell is a lazy functional language, which means that almost everything in Haskell is expressed as a function, and that those functions are evaluated on an as-needed basis only. This approach means that you never quite know when or in what order your functions are going to be evaluated. It only becomes a problem when your functions have side-effects, such as interacting with the world outside Haskell. Input and output are the cannonical examples of such side effects.

Monads are Haskell’s way of controlling the order of evaluation. At least, thats the way they look to me right now.

So how do they work? Well, I only have a vague idea at the moment, but my current understanding is that they leverage Continuation Passing Style. Heres an example of CPS:

cps_add x y next = next x+y
-- here we pass in a function to be called with the result of our computation

Now, lets say we had a sequence of functions, f1, f2, f3, and we wanted them to be evaluated in that order. Normally, Haskell would evaluate them in whatever order it felt like, but if we made each function depend on the result of the previous one, then they would have to execute in sequence.

Monads seem to wrap up two facets of this requirement; the transformation of a sequence of functions into a CPS-like form such that each function depends on the result of the previous one, as well as defining the opaque data/class thingy that is passed between the transformed functions.

That was my insight for today, rough and unpolished as it may be. Am I on the right track? Anyone? Anyone? Bergman?

Functional Queues

March 22nd, 2006

Came across a very interesting queue implementation today in my Functional Programming class.

In an imperative language, such as C#, you would normally leverage arrays to create a resizeable circular buffer or somesuch. In functional programming languages, all datatypes are immutable, which tends to encourage one to avoid arrays and instead prefer linked lists.

How can you create a queue from linked lists of immutable nodes? You cant use a single linked list, because whilst you have easy access to the head, accessing the tail requires traversing the whole list (well, you can use a singly linked list, but its terribly inefficient).

The trick is to use two linked lists, called front and rear. The rear list holds its elements in reverse order:

For example, a queue containing items [1,2,3,4,5,6] might look like this: (front=[1,2,3], rear=[6,5,4])

Enqueuing an item inserts it at the head of the rear list, e.g. enqueuing 7 results in rear=[7,6,5,4]

Dequeuing an item removes the head of the front list, e.g. dequeuing from (front=[1,2,3], rear=[7,6,5,4]) results in (front=[2,3], rear=[7,6,5,4])

Thats all fine and good, but after dequeing all of the elements fron the front list, what do you do?

Well, you reverse the rear list and make it the front list, e.g. after dequeing two more items, we have (front=[], rear=[7,6,5,4]) and one more dequeue operations results in (front=[5,6,7], rear=[])

Ah, you might say, but reversing a list costs the same as traversing a whole list. And you would be right.

But not in an amortised sense.

Lets consider the behavior of this queue over time, as we add a few items, then remove a few items, and so on and so forth. Lets be explicit here, we add, say N items, then remove, say N items, and repeat.

Now at the point where the front list is empty and we want to remove an item from the queue, we have to reverse the rear list, which takes N steps. But we only got to that point by taking N steps, over time, adding items to the queue. The pattern of costs for adding and removing items would look something like this [1,1,1,1,1,5,1,1,1,1,1,5,…]. Once you smooth out the costs, you get an average of 2, no matter what N is.

The most common data structure that leverages this amortised behavior is the resizeable array, which, when its gets full, doubles in size. The pattern of costs for a resizeable array looks like this [1,1,2,1,1,4,1,1,1,1,8,1,1,1,1,1,1,1,1,16,…], where the numbers that arent 1 are the costs of doubling the array size. Smooth out the costs, and the average is also 2.

For those that are interested the Haskell code for the two-list Queue is here:
data Queue a = Q [a] [a]
isEmpty (Q front rear) = null front
fillf (Q [] rear) = Q (reverse rear) []
fillf q = q
tail (Q (x:front) rear) = fillf (Q front rear)
tail (Q [] _) = error "Empty Queue"
snoc (Q front rear) x = fillf (Q front (x:rear))