4 Application Programmer's Interface

The compiler is available to Mozart applications through the module Compiler. This chapter describes the functionality provided by that module and its classes.

First, Section 4.1 introduces a number of additional secondary type names used in this description; the Compiler module itself is then described in Section 4.2. The material in that section should suffice for most cases. The remainder of the chapter is intended for advanced uses.

An arbitrary number of compilers may be instantiated, each with its own internal state, and used concurrently. We distinguish between compiler engines, described in Section 4.3, which store the state of a compiler and perform the compilation proper, and compiler interfaces, described in Section 4.4, which make it possible to observe the activities of compiler engines and to react to them. Both of these use the narrator/listener mechanism described in Appendix B; familiarity with this is assumed.

Finally, examples are presented in Section 4.5; in particular, the provided abstractions are explained in terms of compiler engines and interfaces.

4.1 Additional Secondary Types

This section describes additional secondary types used in the descriptions in this chapter. The conventions defined in Section 2.3 of ``The Oz Base Environment'' are followed.

Coord

stands for information about source coordinates. This is either unit if no information is known or a tuple pos(FileName Line Column), where FileName is represented as an atom ('' meaning `unknown') and Line and Column are integers. Line numbering begins at 1 and column numbering at 0; a column number of ~1 means `unknown'.

SwitchName

is an atom which must be a valid switch name (see Appendix A).

PrintName

is an atom which must be a valid variable print name.

Env

stands for an environment, given as a record whose features are valid print names.

4.2 The Compiler Module

evalExpression

{Compiler.evalExpression +V +Env ?KillP X}

evaluates an expression, given as a virtual string V, in a base environment enriched by the bindings given by Env, either returning the result X of the evaluation or raising an exception. Furthermore, the variable KillP is bound to a nullary procedure which, when applied, interrupts compilation.
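
As an illustration of the calling convention, the following is a minimal sketch (the expression text and the binding for A are made up for this example). It evaluates an expression that refers to a variable supplied through Env and ignores the kill procedure.

functor
import
   Compiler(evalExpression)
   System(show)
define
   % Env maps the print name 'A' to the value 41 (illustrative binding).
   X = {Compiler.evalExpression "A + 1" env('A': 41) _}
   {System.show X}
end

If the expression does not compile, the call raises an exception instead of binding X.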

virtualStringToValue

{Compiler.virtualStringToValue +V X}

is a replacement for System.virtualStringToValue, which was available in Mozart's predecessor DFKI Oz.

Note that you are discouraged from using this for large data structures: Because it is much more powerful than System.virtualStringToValue, it can also be much less efficient. Rather, you should use pickling and unpickling of data structures (see Chapter 22 of ``System Modules'').

engine

Compiler.engine

is the final class from which compiler engines can be instantiated. This is described in detail in Section 4.3.

interface

Compiler.interface

is a class providing a simple mechanism to create compiler interfaces. It is described in detail in Section 4.4.

parseOzFile

{Compiler.parseOzFile +V +O +P +Dictionary ?T}

parses the Oz source file named V, returning an abstract syntax tree as defined in Appendix C in T. O is an instance of the PrivateNarrator class described in Appendix B; its methods are invoked, for example, to report compilation errors. P is a unary procedure expecting a switch name as described in Appendix A and returning a boolean value indicating the switch's state; in the current implementation, only the settings of gump, allowdeprecated and showinsert are requested. Finally, Dictionary represents the set of defined macro names: its keys are the defined macro names and its items should always be true. As a side effect, Dictionary is modified according to \define and \undef macro directives.

parseOzVirtualString

{Compiler.parseOzVirtualString +V +O +P +Dictionary ?T}

is similar to parseOzFile, except that V denotes the source text itself instead of a source file name.
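
The following sketch, modeled on the example in Section 4.5.1, parses a source text given directly as a virtual string. The source text is made up, and the switch procedure simply reports every switch as off; both are assumptions of this example.

functor
import
   Narrator('class')
   ErrorListener('class')
   Compiler(parseOzVirtualString)
   System(show printInfo)
define
   PrivateNarratorO
   NarratorO = {New Narrator.'class' init(?PrivateNarratorO)}
   ListenerO = {New ErrorListener.'class' init(NarratorO)}
   % Every requested switch is reported as off; no macros are defined.
   T = {Compiler.parseOzVirtualString "local X in X = 1 end"
        PrivateNarratorO
        fun {$ _} false end
        {NewDictionary}}
   {System.show T}
   if {ListenerO hasErrors($)} then
      {System.printInfo {ListenerO getVS($)}}
   end
end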

assemble

{Compiler.assemble +Ts +Xs +SwitchR ?P ?V}

takes a list of bytecode instructions Ts for the Mozart virtual machine (see Appendix D), assembles them and returns the result in P, a nullary procedure which causes the code to be executed when applied. Xs is a list of global variables (the closure of P), the first element corresponding to register g(0). SwitchR is a record whose features are switch names and whose values are booleans. In the current implementation, the switches profile, controlflowinfo, verify, and peephole are used. All features of SwitchR are optional (default values are substituted). V is a lazily computed virtual string containing an external representation of the assembled code after peephole optimization.

4.3 Compiler Engines

Instances of the Compiler.engine class are active objects called compiler engines. Each object's thread processes all queries inside its query queue sequentially.

The final class Compiler.engine inherits from Narrator.'class', described in Appendix B.

4.3.1 Methods of the Compiler.engine Class

enqueue

enqueue(+T ?I <= _)

enqueue(+Ts ?Is <= _)

appends a new query T to the query queue. If T is an unknown query, an exception is raised immediately. All of the query's input arguments (the subtrees of T) are type-checked before it is enqueued.

Internally, each enqueued query is assigned a unique identification number I. This may be used later to remove the query from the queue (unless its execution has already begun or terminated).

The argument to enqueue may also be a list of queries: These are guaranteed to be executed in sequence without other queries interleaving. The second argument then returns a list of identification numbers.

dequeue

dequeue(+I)

dequeues the query with identification number I, if that query is still waiting in the query queue for execution, else does nothing.
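
A minimal sketch of enqueueing and dequeueing, using only queries documented in Section 4.3.2 (the particular queries chosen are arbitrary):

functor
import
   Compiler(engine)
define
   E = {New Compiler.engine init()}
   I
   Is
   % A single query; I receives its identification number.
   {E enqueue(setMaxNumberOfErrors(10) ?I)}
   % A list of queries, executed in sequence without interleaving;
   % Is receives the list of their identification numbers.
   {E enqueue([pushSwitches() setSwitch(expression true)] ?Is)}
   % Has an effect only if the query is still waiting in the queue.
   {E dequeue(I)}
end

Since the engine processes its queue concurrently, the first query may already have been executed when dequeue is invoked, in which case dequeue does nothing.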

interrupt

interrupt()

interrupts execution of the current query. Does not affect the remaining queue.

clearQueue

clearQueue()

flushes the whole remaining queue. Does not affect the currently processed query (if any).

4.3.2 Queries

This section documents the queries understood by the Mozart Compiler.

Some queries request state information from the compiler engine. The following description annotates the corresponding output variables with a question mark, although they only become bound when the query is actually executed. If binding an output variable raises an exception, an error is reported through the registered listeners (see Appendix B).

Macro Definitions

macroDefine(+V)

Add V to the set of defined macro names.

macroUndef(+V)

Remove V from the set of defined macro names.

getDefines(?PrintNames)

Return all currently defined macro names as a list, in no particular order.

Compiler Switches

setSwitch(+SwitchName +B)

Set the state of the given switch to either `on', if B == true, or to `off', if B == false.

getSwitch(+SwitchName ?B)

Return the state of the given switch.

pushSwitches()

Save the current settings of all switches onto the internal switch state stack.

popSwitches()

Restore all switch settings from the topmost element of the internal switch state stack, provided it is not empty, else do nothing.
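
The switch queries can be combined as in the following sketch, which temporarily turns on the expression switch and then restores the previous settings. The variable names are chosen for this example only.

functor
import
   Compiler(engine)
   System(show)
define
   E = {New Compiler.engine init()}
   Before
   During
   After
   {E enqueue([getSwitch(expression Before)
               pushSwitches()
               setSwitch(expression true)
               getSwitch(expression During)
               popSwitches()
               getSwitch(expression After)])}
   % Output variables only become bound when the queries are executed.
   {Wait Before}
   {Wait During}
   {Wait After}
   {System.show switches(before: Before during: During after: After)}
end

According to the descriptions above, During should be true and After should equal Before.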

Compiler Options

setMaxNumberOfErrors(+I)

Set to I the maximal number of errors to report for any one compilation before aborting it. A negative value means never to abort.

getMaxNumberOfErrors(?I)

Return the maximal number of errors to report for any one compilation before aborting it.

setBaseURL(+VU)

Set the base URL relative to which the require clause of computed functors is resolved. A value of unit means to resolve the imports relative to the location of the file in which the functor keyword appeared.

getBaseURL(?AU)

Return the base URL relative to which the require clause of computed functors is resolved.

setGumpDirectory(+VU)

Set the directory in which Gump output files are created. Can be relative. unit means the current working directory.

getGumpDirectory(?VU)

Return the directory in which Gump output files are created.

The Environment

addToEnv(+PrintName X)

Add a binding for a variable with the given print name, and bound to X, to the environment.

lookupInEnv(+PrintName X)

Look up the binding for the variable with the given print name in the environment and bind X to its value. If it does not exist, report an error.

removeFromEnv(+PrintName)

Remove the binding for the variable with the given print name from the environment if it exists.

putEnv(+Env)

Replace the current environment by the one given by Env.

mergeEnv(+Env)

Adjoin Env to the current environment, overwriting already existing bindings.

getEnv(?Env)

Return the current environment.
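
A minimal sketch of the environment queries (the print name 'Answer' and its value are made up for this example):

functor
import
   Compiler(engine)
   System(show)
define
   E = {New Compiler.engine init()}
   V
   Env
   {E enqueue([addToEnv('Answer' 42)
               lookupInEnv('Answer' V)
               removeFromEnv('Answer')
               getEnv(Env)])}
   % Wait until the queries have bound their output variables.
   {Wait V}
   {Wait Env}
   {System.show V}
   % Env was retrieved after the removal of 'Answer'.
   {System.show {HasFeature Env 'Answer'}}
end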

Feeding Source Text

feedVirtualString(+V)

Evaluate the Oz source code given by the virtual string V.

feedVirtualString(+V +R)

Evaluate the Oz source code given by the virtual string V, returning the resulting value in R.result (if the \switch +expression switch is set and R has the feature result).

feedFile(+V)

Evaluate the Oz source code contained in the file with name V.

feedFile(+V +R)

Evaluate the Oz source code contained in the file with name V, returning the resulting value in R.result (if the \switch +expression switch is set and R has the feature result).

Synchronization

ping(?U)

Bind the variable U to unit on execution of this query. This makes it possible to synchronize on the work of the compiler, e. g., to be informed when a compilation is finished.

ping(?U X)

Works like the ping(_) query, except that it additionally takes a value X which will reappear in the response notification sent to interfaces. This makes it possible to match the ping query with its pong notification.
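
The following sketch feeds an expression and uses ping to wait for the engine. It follows the switch settings used by the implementation of evalExpression shown in Section 4.5.2; the expression text is made up.

functor
import
   Compiler(engine)
   System(show)
define
   E = {New Compiler.engine init()}
   Result
   Done
   {E enqueue(setSwitch(expression true))}
   % As in Section 4.5.2, request synchronous execution of the compiled code.
   {E enqueue(setSwitch(threadedqueries false))}
   {E enqueue(feedVirtualString("6 * 7" return(result: Result)))}
   {E enqueue(ping(Done))}
   % Done is bound to unit once the ping query is executed.
   {Wait Done}
   {System.show Result}
end

No listener is attached in this sketch, so compilation errors would go unreported and Result would remain unbound; Section 4.4 describes how to observe them.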

Custom Front-Ends

setFrontEnd(+ParseFileP +ParseVirtualStringP)

Replace the front-end used by the compiler by a custom front-end implemented by procedures ParseFileP and ParseVirtualStringP. These procedures have the same signature as Compiler.parseOzFile and Compiler.parseOzVirtualString, documented above. Indeed, these procedures implement the default Oz front-end.
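
As a sketch of a custom front-end, the hypothetical procedure LoggingParseFile below prints the name of each file before delegating to the default parser. Only the wrapper is new; the parser for virtual strings is passed through unchanged.

functor
import
   Compiler(engine parseOzFile parseOzVirtualString)
   System(showInfo)
define
   % Hypothetical wrapper with the same signature as Compiler.parseOzFile.
   fun {LoggingParseFile V O P Dictionary}
      {System.showInfo "parsing file "#V}
      {Compiler.parseOzFile V O P Dictionary}
   end
   E = {New Compiler.engine init()}
   {E enqueue(setFrontEnd(LoggingParseFile Compiler.parseOzVirtualString))}
end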

4.4 Compiler Interfaces

As noted above, compiler engines are narrators. The term ``compiler interface'' simply denotes a standard listener attached to a compiler engine. This section presents what is required to implement a compiler interface.

First the notifications sent by compiler engines are documented. These include normal compiler output and information about compiler state changes. Then a specific compiler interface is described that makes many compilation tasks easy to control.

4.4.1 Sent Notifications

Query Queue

newQuery(I T)

A new query T with identification I has been enqueued.

runQuery(I T)

The query T with identification I is now being executed.

removeQuery(I)

The query with identification I has been removed from the query queue, either because it finished executing or because it was dequeued by a user program.

Compiler Activity

busy()

The compiler is currently busy (i. e., executing a query).

idle()

The compiler is currently idle (i. e., waiting for a query to be enqueued).

State Change

switch(SwitchName B)

The given switch has been set to B.

switches(R)

The settings of all switches are transmitted as a record mapping each switch name to its setting.

maxNumberOfErrors(I)

The maximum number of errors after which to abort compilation has been set to I.

baseURL(AU)

The base URL relative to which the require clause of computed functors is resolved has been set to AU.

env(Env)

The environment has been set to Env.

Output

info(V)

An information message V is to be printed out.

info(V Coord)

An information message V, related to the source coordinates Coord, is to be printed out.

message(R Coord)

An error or warning message R, related to the source coordinates Coord, is to be printed out. R has the standard error message format, described in Chapter 24 of ``System Modules''.

insert(V Coord)

During parsing, the file named V has been read. The corresponding \insert directive (if any) was at source coordinates Coord.

displaySource(TitleV ExtV V)

A source text V with title TitleV is to be displayed; its format is the one for which the file extension ExtV is typically used (such as oz or ozm).

attention()

The error output buffer should be raised with the cursor at the current output coordinates (an error message should follow).

Synchronization

pong(X)

This is sent in response to a ping(_) or ping(_ X) query (see Section 4.3). In the first case, unit is returned in X.

4.4.2 The Compiler.interface Class

The Compiler.interface class is a subclass of the error listener class described in Appendix B. Its purpose is to provide a standard listener powerful enough to serve many purposes, sparing the user from having to define a listener of their own.

Methods

In addition to the standard error listener interface, it supports the following methods.

init(+EngineO +VerboseL <= false)

initializes a new compiler interface, attaching it to the compiler engine EngineO. VerboseL can be one of true, false, or auto: If true, all messages, including the compiler's banner, will be output. If false, no messages will be output. If auto, the interface will remain silent unless an error or warning message arrives, in which case it will become verbose.

sync()

waits until the compiler engine becomes idle.

getInsertedFiles(?Vs)

returns a list of the names of the files that have been inserted during compilation so far, in order of appearance.

getSource(?V)

returns the source that has last been displayed by the compiler (typically some intermediate representation), or the empty string if none.

reset()

clears the internal lists of inserted files and the displayed source.

clear()

is the same as reset.
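
A minimal sketch of using an interface in auto mode follows. The source text is deliberately malformed so that an error message arrives; hasErrors belongs to the error listener interface and is used in the same way in Section 4.5.2.

functor
import
   Compiler(engine interface)
   System(printInfo)
define
   E = {New Compiler.engine init()}
   % With auto, the interface stays silent until an error or warning arrives.
   I = {New Compiler.interface init(E auto)}
   {E enqueue(setSwitch(threadedqueries false))}
   % Deliberately malformed source text (the right-hand side is missing).
   {E enqueue(feedVirtualString("local X in X = end"))}
   {I sync()}
   if {I hasErrors($)} then
      {System.printInfo "compilation reported errors\n"}
   end
   {I reset()}
end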

4.5 Examples

4.5.1 Inspecting the Parse Tree

As a first example, here is an application that expects the name of an Oz source file as its single argument. It parses the file, accepting Gump syntax, and displays the parse tree in the Inspector (or parseError if parsing failed). If there was a parse error, it extracts the error messages and prints them to standard output.

functor 
import 
   Narrator('class')
   ErrorListener('class')
   Application(getArgs)
   Compiler(parseOzFile)
   Inspector(inspect)
   System(printInfo)
define 
   PrivateNarratorO
   NarratorO = {New Narrator.'class' init(?PrivateNarratorO)}
   ListenerO = {New ErrorListener.'class' init(NarratorO)}
 
   case {Application.getArgs plain} of [FileName] then 
      {Inspector.inspect
       {Compiler.parseOzFile FileName PrivateNarratorO
        fun {$ Switch} Switch == gump end {NewDictionary}}}
      if {ListenerO hasErrors($)} then 
         {System.printInfo {ListenerO getVS($)}}
         {ListenerO reset()}
      end 
   end 
end 

4.5.2 A Look into the Provided Abstractions

The implementation of the Compiler.evalExpression procedure is a good example of how to use compiler engines and interfaces. evalExpression causes compilation of an expression within a specified environment. It is synchronous, i. e., only returns after the compilation has finished. Compiler error messages are raised as exceptions, and the compilation may be interrupted using the nullary procedure returned in Kill.

Startup

Since we both want to control a compilation (done by a new compiler engine) and to observe the compilation process (to synchronize and to determine whether it produced errors), we first instantiate both an engine and an interface which we register with the engine. A number of queries are enqueued to the engine: We need to set the environment and appropriate compiler switches for compilation of an expression and to cause synchronous execution of the compiled program. When we're done configuring the compiler, we can start compilation of the source proper, expecting a result to be returned in variable Result.

Killing

We then define the Kill procedure. The rest of the observation is performed in a new thread, because we want to kill the observation as well when Kill is invoked. Kill clears any unprocessed queries from the queue and interrupts the current one, then kills the observation thread (unless it is already dead).

Observing

Next we'll observe the running compiler, and for this we need to make use of the interface we created earlier. When the compiler becomes idle, we check whether it has output any error messages, in which case we record the faulty condition, else we report success. The main thread waits until the condition becomes known and reacts upon it.

proc {Compiler.evalExpression VS Env ?Kill ?Result} E I S in 
   E = {New Compiler.engine init()}
   I = {New Compiler.interface init(E)}
   {E enqueue(mergeEnv(Env))}
   {E enqueue(setSwitch(expression true))}
   {E enqueue(setSwitch(threadedqueries false))}
   {E enqueue(feedVirtualString(VS return(result: ?Result)))}
   thread T in 
      T = {Thread.this}
      proc {Kill}
         {E clearQueue()}
         {E interrupt()}
         try 
            {Thread.terminate T}
            S = killed
         catch _ then skip   % already dead
         end 
      end 
      {I sync()}
      if {I hasErrors($)} then Ms in 
         {I getMessages(?Ms)}
         S = error(compiler(evalExpression VS Ms))
      else 
         S = success
      end 
   end 
   case S of error(M) then 
      {Exception.raiseError M}
   [] success then skip 
   [] killed then skip 
   end 
end 

virtualStringToValue

The Compiler.virtualStringToValue procedure is trivial to implement on top of the functionality provided by evalExpression.

fun {Compiler.virtualStringToValue VS}
   {Compiler.evalExpression VS env() _}
end 


Leif Kornstaedt
Version 1.4.0 (20080702)