In the book Concepts, Techniques and Models of Computer Programming (CTM), Peter van Roy and Seif Haridi build a programming language based on a very small kernel language. The kernel language is extended as the models needs become more complex but it remains simple and understandable. The language used in the book is called Oz and it is implemented by the Mozart Programming System.
The models are described by three different views:- The computation model; the language and how sentences of the language are executed by the abstract machine.
- The programming model; the techniques and design principles used to write programs.
- Reasoning techniques; techniques for reasoning about the program to increase confidence that they work as expected.
The Kernel Language
The full kernel language, supporting the models is shown below.
<Statement> ::= % Statements needed in declarative model <Statement1> <Statement2> % Statement sequence | X = <number>|<atom>|<boolean> % Variable and value | X = f(I1:Y1 ... In:Yn)| % Record and tuple | X = Y % Variable variable binding | local X1 ... Xn in S1 end % Variable declaration | proc {X Y1 ... Yn} S1 end % Procedure declaration | {X Y1 ... Yn} % Procedure application (call) | if B then S1 else S2 end % Conditional | case X of pat then S1 else S2 end % Pattern matching | {NewName X} % Creates a new locally unique name % Statements needed in declarative model with exceptions | try S1 catch X then S2 end % Try a statement and catch the possible exception | raise X end % Raise an exception % Statements needed in concurrent declarative model | thread S1 end % Run statement in new thread. % Statements needed in stateful model | {NewCell I C} % Creates a cell C with content I | {Exchange C Old New} % Replaces the value of C with New; Old is returned
The Practical Language
Apart from extending the kernel language, the practical language of Oz is also extended by adding linguistic abstractions and syntactic sugar.
A linguistic abstraction is a new linguistic construct such as a function fun instead of a proc. This construct is both an abstraction and an addition to the language.
fun {Max X Y} if X>=Y then X else Y end end % Translates into: proc {Max X Y ?R} R = if X>=Y then X else Y end end
Syntactic sugar on the other hand does not involve a new abstraction. It simply allows us to write the same statements in a more convenient and readable way An example is:
if N==1 then [1] else local L in ... end end % Translates into: if N==1 then [1] else L in ... end
Both are added by means of transformations into the kernel language. While building the language all the transformations are shown in detail making it very clear what is going on.
The Declarative Model
The basic model of CTM is called the declarative model. Van Roy and Haridi defines a declarative model as evaluating functions over partial data structures. It covers both functional programming as in Scheme without state and logic programming as in Prolog without search.
The Abstract Machine
The abstract machine of the declarative model consists of: A single-assignment store, a value-store extended with single-assignment or dataflow variables. An environment, a mapping from variable identifiers to entities in the store. A semantic stack, containing pairs of statements in execution and the associated environment.
When a program is executing different values are added to the stack, the store and the environment according to the specified semantics of the statement. A thorough walk-through is done in the book.
Dataflow Variables
The single-assignment store may contain variables that are unbound, dataflow variables. This means that they can be used as out-parameters from procedures and partial values in records. It is also possible to bind two variables to each other. When one becomes bound then all variables see that binding.
When an unbound variable is accessed before it is bound execution waits until the variable is bound and then continues. This will enable both concurrent declarative programming and relational programming later in the book.
Practical Language
To make the language practical several syntactic conveniences are added to the kernel language:- Nested partial values:
person(name:"George" age:25)
instead oflocal A B in A="George" B=25 X=person(name:A age:B) end
expr1
andthenexpr2
translates into ifexpr1
thenexpr2
else false endexpr1
orelseexpr2
translates into ifexpr1
then true elseexpr2
end- if and case statements can be nested concisely.
- Statements can be turned into expressions using a nesting marker
$
: proc{name}
skip end translates into name = proc{$}
skip end - Records are written animal(name:”Tapir” age:77)
- Tuples are records with integer values animal(1:”Tapir” 2:77) may be written as animal(“Tapir” 77)
- Anonymous tuples are commonly written with infix notation with # as the separator
- ’#’ (“Tapir” 77) equals “Tapir”#77
- Lists are either the atom nil or a tuple ’|’ (H T).
- In infix that is H|T.
- 1 | 2 | 3 | nil may be abbreviated [1 2 3]
Procedures are selected over functions since they are more flexible. They allow multiple outputs or none. In combination with the strong support for records, procedures will later enable modules and object oriented programming through simple transformations.
When adding functions as linguistic abstractions, notice how the dataflow variable R
is unbound until
inside the procedure. This is impossible if using a normal value store where the variables are bound
at definition time.
% Functions as linguistic abstractions fun {Max X Y} if X>=Y then X else Y end end % Translates into: proc {Max X Y ?R} R = if X>=Y then X else Y end end
Loops as linguistic abstractions (definitions of the procedures inf Programming Techniques below):
% The integer loops for I in A..B do statements end % and for I in A..B;S do statements end % translates into (with S=1 in the first case) {For A B S proc {$ I} statements end} % The list loop for X in L do statements end % translates into {ForAll L proc {$ X} statements end}
Programming Techniques
Since the declarative model is similar to the functional (stateless) model, standard functional programming techniques such as higher order functions and recursion is used.
Iteration
Iteration is done with recursion:
% Standard iteration fun {Iterate S IsDone Transform} if {IsDone S} then S else S1 in S1={Transform S} {Iterate S1 IsDone Transform} end end % A list loop proc {ForAll L P} case L of nil then skip [] X|L2 then {P X} {ForAll L2 P} end end % An integer loop proc {For A B S P} proc {LoopUp C} if C=<B then {P C} {LoopUp C+S} end end proc {LoopDown C} if C>=B then {P C} {LoopDown C+S} end end in if S>0 then {LoopUp A} end if S<0 then {LoopDown A} end end
Difference Lists
Another technique that is enabled by choosing single-assignment variables is programming with difference lists. A difference list is a 2-tuple where the first element is a list and the second element is a tail of the first list.
nil#nil % Represents the empty list
[a]#[a] % so does this
X#X % and this
(a|b|c|X)#X % Represents [a b c]
(a|b|c|d|X)#(d|X) % so does this
[a b c d]#[d] % and this
Notice how unbound variables can be placed at the end of the lists. This enables some very cool
features. To append (a|b|c|X)#X
and (d|e|f|Y)#Y
, just bind X
to (d|e|f|Y)
. This creates the
difference list (a|b|c|d|e|f|Y)#Y
. We have just appended the lists [a b c]
and [d e f]
with a
single binding. Here is a function that appends any two difference lists:
% Function to append to difference lists fun {AppendD D1 D2} S1#E1=D1 S2#E2=D2 in E1=S2 S1#E2 end % It can be used like this (Browse means evaluate in Oz) local X Y in {Browse {AppendD (1|2|3|X)#X (4|5|Y)#Y}} end
Constant time append versus linear time of the first list with a value store. Nice!
Abstract Data Types
Simple implementations of ADT leak implementation details to the user. This makes it possible to preform unsafe operations.
% Simple implementation of a stack. fun {NewStack} nil end fun {Push S E} E|S end fun {Pop S E} case S of X|S1 then E=X S1 end end fun {IsEmpty S} S==nil end
One way to avoid this is by using keys for accessing the ADT. Oz introduces
the concept of names, created with {NewName}
, to create tokens that are unique within the system.
{NewName}
is not declarative, since it returns a new value every time it is called, but programs
using it may still behave declaratively.
% A procedure for creating a wrap/unwrap pair, for protecting values. proc {NewWrapper ?Wrap ?Unwrap} Key={NewName} in fun {Wrap X} fun {$ K} if K==Key then X end end end fun {Unwrap W} {W Key} end end % Protect the value ProtectedValue={Wrap Value} % Retrieve the value Value={Unwrap ProtectedValue} % A secure version of the stack local Wrap Unwrap in {NewWrapper Wrap Unwrap} fun {NewStack} {Wrap nil} end fun {Push S E} {Wrap E|{Unwrap S}} end fun {Pop S E} case {Unwrap S} of X|S1 then E=X {Wrap S1} end end fun {IsEmpty S} {Unwrap S}==nil end end
Programming in the large
Oz supports structuring programs by modules and functors. A module is implemented as a record with procedures. Only the interface is put into the export record.
declare MyList in local proc {Append ... } ... end proc {MergeSort ...} ... end proc {Sort ... } ... {MergeSort ...} ... end proc {Member ...} ... end in MyList=´export´(append:Append sort:Sort member:Member) end
Using procedural abstraction this module can be turned into a software component or functor.
Every timefun {MyListFunctor} proc {Append ... } ... end proc {MergeSort ...} ... end proc {Sort ... } ... {MergeSort ...} ... end proc {Member ...} ... end in ´export´(append:Append sort:Sort member:Member) end
MyListFunctor
is called it creates and returns a new MyList module.
- A functor definition can be evaluated at runtime, giving a functor.
- A functor can have external references to other entities through closures.
- A functor can have arguments, which are the other modules needed.
- A functor is lightweight; it can be used to encapsulate a single entity.
A linguistic abstraction is added to the practical language.
functor export append:Append sort:Sort member:Member define proc {Append ... } ... end proc {MergeSort ...} ... end proc {Sort ... } ... {MergeSort ...} ... end proc {Member ...} ... end end
Apart from {NewName}
, everything up to now has been declarative, that is without side-effects.
To enable programming with functors we need to be able to load functors from multiple files. To
do this we need IO.
Values in Oz can be stored and loaded with the Pickle
module. Note that loading takes a
URL and hence can be loaded over the network.
{Pickle.save X FN} % Save X in file FN {Pickle.load FNURL ?X} % Load X from file (or URL) FNURL
And since procedures are values and functors procedures, they to can be saved and loaded. Since
modules are so common additional functions exists in the Module
module.
% Loads the MyList module into the variable MyListModule MyListModule = Module.link [´MyList.ozf´]} % System modules are loaded from a known path while others may be loaded explicitly. functor import Browser FO at ´file:///home/mydir/FileOps.ozf´ define {Browser.browse {FO.countLines ´/etc/passwd´}} end
The Pickle
module is also used when implementing distributed computing in a very elegant way in
the chapter about distribution.
The Other Models
Van Roy and Haridi investigates many models apart from the declarative model: The concurrent declarative model, the concurrent message-passing model, the stateful model, the object-oriented model, the stateful concurrent model, the relational model, the distributed model and the constraint programming model. Here is quick overview of some of the other models:
The Concurrent Declarative Model
The abstract machine of the declarative model is extended to include multiple semantic stacks
instead of just one. Nothing else is changed. The kernel language is extended with the statement:
thread S1
end.
thread proc {Count N} if N>0 then {Count N-1} end end in {Count 1000000} end
Streams are also implemented in the concurrent declarative model with the use of flow variables.
fun {Generate N Limit} if N<Limit then N|{Generate N+1 Limit} else nil end end fun {Sum Xs A} case Xs of X|Xr then {Sum Xr A+X} [] nil then A end end local Xs S in thread Xs={Generate 0 150000} end % Producer thread thread S={Sum Xs 0} end % Consumer thread {Browse S} end
Laziness cab be added to the model by one more instruction, ByNeed
, to the kernel language.
The execution state is extended with a by-need trigger, a pair trig(x, y)
consisting of a
dataflow variable y and a one-argument procedure x. Next to the single-assignment
store, a new store called the trigger store is added to the abstract machine.
The trigger store contains all the by-need triggers and is initially empty.
lazy
is a linguistic abstraction that is defined in terms of ByNeed
.
fun lazy {Generate N} N|{Generate N+1} end % Is transformed into fun {Generate N} {ByNeed fun {$} N|{Generate N+1} end} end
The Stateful Model
Explicit state is added as one new basic type to the computation model. We call the type a cell. A cell is a pair of a constant, which is a name value, and a reference into the single-assignment store. The set of all cells lives in the mutable store, which is added to the abstract machine.
Compared to the declarative model, it adds just two new statements, the cell operations
NewCell
and Exchange
. Syntactic sugar is added with: X=@C
, bind X to the content of cell C
and C:=X
: set x to be the content of cell C.
The Object Oriented Model
Object oriented programming is supported by adding linguistic abstractions for class, attr and meth.
% A class class Counter attr val meth init(Value) val:=Value end meth browse {Browse @val} end meth inc(Value) val:=@val+Value end end % A class without semantic support local proc {Init M S} init(Value)=M in (S.val):=Value end proc {Browse2 M S} {Browse @(S.val)} end proc {Inc M S} inc(Value)=M in (S.val):=@(S.val)+Value end in Counter=c(attrs:[val] methods:m(init:Init browse:Browse2 inc:Inc)) end
The methods of Oz supports variable length argument lists, optional arguments and an otherwise method that will be called if no other method is appropriate. This gives strong support for delegation and meta-programming.
As can be seen above, classes are implemented as a set of methods and a set of attributes.
An object can be implemented like this:
fun {New WClass InitialMethod} State Obj Class={Unwrap WClass} in State = {MakeRecord s Class.attrs} {Record.forAll State proc {$ A} {NewCell _ A} end} proc {Obj M} {Class.methods.{Label M} M State Obj} end {Obj InitialMethod} Obj end
Inheritance is implemented by calculating a new class record starting from existing class records. They are combined according to the preferred inheritance rules.
Summary
Van Roy and Haridi has produced an impressive book in the spirit of the Structure and Interpretation of Computer Programs. The book is notably free from critique of other languages. The authors let their impressive work speak for itself.
This short walk-through of the book, with most of the models and details left out, cannot do justice to this incredible work. A must read for anyone interested in programming language design.
Five out of five!
No comments:
Post a Comment