
You should have seen the idea of abstract data types in at least a few classes so far. The key idea behind abstract data types is that the interface and the implementation are separate. The interface defines the set of operations available for that type. The implementation is the code that makes that work. For a single interface, there might be many different ways of implementing it.
So why separate the interface and the implementation? Because as long as the interface is implemented properly, any code that uses that interface to access the data doesn't have to care about how its implemented. That implementation can change at any time as needed without breaking anyone else's code.
I this lecture we'll talk about a specific abstraction that's useful for implementing the inductive types we've talked about. Don't worry about how it's implemented--just use the abstraction.
| Note: the methods defined here for defining and using these types are not a standard part of Scheme. If you're using DrScheme, a set of compatibility routines are provided. If you're using something else, the code from the textbook authors' web site has similar compatibility routines for other versions of Scheme. |
In languages like C++ you're used to using data structures like struct (or classes) that let you store multiple "fields" of information. Many languages support unions, which are aggregate structures that can contain different sets of fields (but only one set at at time). Unions with their different variants work somewhat like classes and subclasses.
The operator we'll use to define these unions and their variants is define-datatype:
(define-datatype type-name type-predicate-name
{ ( variant-name { ( field-name predicate ) }* ) }* )
As part of the definition, you specify the name of the type you're creating (the type-name), the name of the predicate that will be created to test for that type (the type-predicate-name), and each variant that this type can have (the variant-names). For each variant, you define the data fields (the field-names) and predicates that test for valid types for those fields (the predicates).
So, for example, a binary tree has either interior nodes or leaf nodes. Suppose that for our application we want the leaf nodes to store numbers and the interior nodes to have a symbol tag (as well as the two subtrees).
You can define a binary tree and these two variants with the following:
(define-datatype bintree bintree?
(leaf-node
(datum number?))
(interior-node
(key symbol?)
(left bintree?)
(right bintree?)))
This creates a datatype called bintree and a bintree?
predicate to test for that type. This type has two variants: leaf-node
and interior-node. leaf-nodes have a single field called datum
that must satisfy the number? predicate. interior-nodes
have three fields: a key that must satisfy the symbol?
predicate and left and right subtrees that must each
satisfy the bintree? predicate. (Note the recursive definition/use
of bintree?.)
This is much the way you'd define a similar structure in C, but notice that
instead of using a type name to define the fields you specify a predicate to
test for valid entries in those fields. Not only is this function-based approach
in keeping with the functional paradigm of Scheme, but it's also more flexible.
If you wanted a field to contain numbers or strings (but no other types), you
could use (lambda (x) (or (number? x) (string? x))) for the field's
predicate. Or, if you planned to use that repeatedly you could give that function
and name and use that name when you use define-datatype.
Once you define this type, it automatically defines two constructors (one for each variant) that have the same name as the variant and take parameters in the same order as the fields:
(leaf-node 5)
(interior-node 'a (leaf-node 1) (leaf-node 2))
The first expression returns a leaf node containing 5. The second expression
returns an interior node with the symbol a for the key, a leaf
node containing 1 for the left tree, and a leaf node containing 2 for the right
tree.
As another example, we can also use define-datatype to define
an s-list:
(define-datatype s-list s-list?
(empty-s-list)
(non-empty-s-list
(first symbol-exp?)
(rest s-list?)))
(define-datatype symbol-exp symbol-exp?
(symbol-symbol-exp
(data symbol?))
(s-list-symbol-exp
(data s-list?)))
Notice that because there are two non-terminals in the original definition, we define two types.
bintree given here does not allow for empty
trees (or subtrees). How might you change the definition here to allow them?
How would you use it to construct a right-skewed tree containing 'a, 'b, and
1?s-list type to define
an s-list equivalent to (a (b c) d)?Quite often, we might want to use lists as fields in a variant. The list? predicate tests only for lists,
not lists of a specific type. To facilitate this, the define-datatype package also includes a useful
list-of function:
(define list-of
(lambda (pred)
(lambda (val)
(or (null? val)
(and (pair? val)
(pred (car val))
((list-of pred) (cdr val)))))))
You don't have to type this into your own code--if the define-datatype
package is loaded, this function is part of it.
The function list-of is basically a curried function that takes
predicate and returns a new function that applies that predicate to each element
of a list. Here's how we might use it:
(define list-of-numbers? (list-of number?))
(define list-of-symbols? (list-of symbol?))
(See how useful currying turns out to be. You only have to code the list verification and traversal once.)
You can even use the predicate you get back from list-of as the
parameter to another call to list-of:
(define list-of-lists-of-numbers? (list-of (list-of number?)))
Now that we can define different variants, we need some way of processing the
different variants a particular type can have. We do this with a cases
constuct:
(cases type-name expression
{ (variant-name ( { field-name }* ) consequent) *
(else default))
This takes an expression of type type-name and tests to see which of the variant-names it matches. If it matches one of the listed variants, each of the corresponding field-names is bound to a local variable by that name and the consequent is evaluated and returned.
The following example uses the cases constuct to define a leaf-sum function that
recursively traverses the tree and adds up the leaves:
(define leaf-sum
(lambda (tree)
(cases bintree tree
(leaf-node (datum) datum)
(interior-node (key left right)
(+ (leaf-sum left) (leaf-sum right))))))
If tree evaluates to a leaf-node, the datum field
is extracted and returned. If tree evaluates to an interior-node,
the key, left, and right fields are extracted,
and the sum of recursing on the left subtree and recursing on the
right subtree is returned.
leaf-sum here to accomodate the case
where a tree is empty? (Use your modified definition asked about an earlier
question.)subst to use the new slist type defined
here.

Last updated at 9:54 am on Friday, August 27, 2004.