GIML: Lesson Seven

Creating your own data types

As one would expect from a modern programming language it is possible to create new data types in ML. Having created the datatypes we can create functions using pattern matching just as with the in built type list.

Enumerated Types

Perhaps the simplest example is akin to the enumerated type in C or Pascal. datatype direction = north | east | south | west; Four constructors are created by this declaration they can be used as patterns in defining functions. For example right turns 90 it takes a direction and returns a new one. fun right north = east | right east = south | right south = west | right west = north; As we might expect these functions can be treated as any other. For example val aboutface = right o right; val left = ...

Data types which carry data

We can construct data types which carry data - these are akin to variant record types in Pascal. Each variant consists of a constructor and various components. Example: the type money can be either cash or cheque. datatype money = cash of int | cheque of string * real; The int associated with cash is the amount in pennies, a cheque carries the name of the bank and the amount in pounds. For example: val guardian = cash 45; val flat = cheque("Abbey National", 36500.00); val busfare = cash 50; Pattern matching on such items may be used in defining functions: fun worth(cash x) = x | worth(cheque("BCCI",_)) = 0 | worth(cheque("Baring",_)) = 0 | worth(cheque(_,amount)) = floor(100.0*amount); floor is a standard function to truncate and convert a real to an integer.

Polymorphism and syntactic sugar

We can create a more general list by referring to 'a as a general type in place of the specific type int. We can also do away with the brackets by making the cons operator infix, the keyword infixr ensures that the association is to the right. In ML we use :: for cons. infixr ::; datatype 'a list = nil | :: of 'a * 'a list; This gives use the normal definition of lists. Note that the [1,2,3] notation is an additional facility.

Example: Queues

We wish to represent a first-in first-out queue. The data structure is similar to lists in that there is a "empty" constructor similar to nil and an "add" constructor which corresponds to cons. The front of the queue is at the right, nearest the bus stop, items are added to the left. Consider the bus queue shown, boris is at the front of the queue, ivan is last.:

This will be represented as "ivan" ++ "tanya" ++ "boris" ++ P This object is a queue containing ivan added to tanya added to boris added to the empty queue. "ivan" ++ ("tanya" ++ ("boris" ++ P)) The empty queue is P, chosen for its uncanny similarity to a bus stop, it indicates the position of the front of the queue. ++ is the add operator, it associates to the right like :: The ML code allowing such a declaration is as follows: datatype 'a queue = P | ++ of 'a * 'a queue; infixr ++; The operations on a queue are front and remove. front returns the element at the front of the queue (without altering the queue), remove returns the rest of the queue with the first element removed. Note that both of these are strictly functions not procedures, remove does not change an existing queue it simply returns part of the queue, the original is still intact after the function call.

front() =

The function remove applied to the above queue returns the queue consisting of ivan and tanya:

remove() =

The following equations for front and remove may regarded as axiomatic - that is they serve as definitions of what a queue is, as well as providing a means of calculating expressions. We would normally start by considering the simplest patterns then move on to more complicated patterns. For example front P however in this case the front of an empty queue has no meaning, instead we consider the queue containing exactly one item as our simplest or base case. The queue containing one item has the pattern lonely++P where lonely is an item front(lonely++P) = lonely

front( ++ ) =

A more general queue consists of one item at the back (muggins) added on to the rest of the queue (everyOneElse) this queue has the pattern muggins++everyOneElse. If everyOneElse is not empty then the front of the whole thing is the same as the front of just everyOneElse without muggins. front(muggins++everyOneElse) = front everyOneElse

front( ++ ) = front()

Similarly removing one item from a queue gives the empty queue: remove(lonely ++ P) = P

remove( ++ ) =

and remove(muggins ++ everyOneElse) has muggins as the last element together with what is left of everyOneElse when one item is removed, hence remove(muggins++everyOneElse) = muggins++(remove everyOneElse)

remove( ++ ) = ++ remove()

This translates into ML with just a few keywords thrown in: fun front(x++P) = x | front(x++q) = front q; fun remove(x++P) = P | remove(x++q) = x++(remove q); When we enter this into ML we are kindly reminded that we have non-exhaustive patterns, that is both front and remove are only partial functions. Note that P is a specific queue (the empty one) whereas q stands for any queue and x stands for any item.

But what is the subtext?