Run-Time System

The Type System

The Info Database

The info database provides a functional interface to global information about named things in CMUCL. Information is considered to be global if it must persist between invocations of the compiler. The use of a functional interface eliminates the need for the compiler to worry about the details of the representation. The info database also handles the need for multiple "global" environments, which makes it possible to change something in the compiler without trashing the running Lisp environment.

The info database contains arbitrary lisp values, addressed by a combination of name, class and type. The Name is an EQUAL-thing which is the name of the thing that we are recording information about. Class is the kind of object involved: typical classes are Function, Variable, Type. A type names a particular piece of information within a given class. Class and Type are symbols, but are compared with STRING=.
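
As a rough sketch of such a functional interface (the names here are made up for illustration and this is not CMUCL's actual INFO interface), the database could be layered on an EQUAL hash table keyed by the name together with the string forms of the class and type:

    ;;; Sketch only: GLOBAL-INFO, SET-GLOBAL-INFO and *INFO-ENVIRONMENT* are
    ;;; hypothetical names, and a real implementation would support several
    ;;; distinct environments rather than one global table.
    (defvar *info-environment* (make-hash-table :test #'equal))

    (defun info-key (name class type)
      ;; Class and Type are compared with STRING=, so key on their names.
      ;; Name is an EQUAL-thing and is used directly.
      (list name (string class) (string type)))

    (defun global-info (name class type &optional default)
      "Return the value recorded for NAME under CLASS and TYPE."
      (gethash (info-key name class type) *info-environment* default))

    (defun set-global-info (name class type value)
      (setf (gethash (info-key name class type) *info-environment*) value))

A client would then ask, say, (global-info 'foo 'function 'inlinep) without knowing how or where the answer is stored; supporting multiple "global" environments would amount to keeping several such tables.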

The IR1 Interpreter

May be worth having a byte-code representation for interpreted code. This way, an entire system could be compiled into byte-code for debugging (the "check-out" compiler?).

Given our current inclination for using a stack machine to interpret IR1, it would be straightforward to layer a byte-code interpreter on top of this.

Instead of having no interpreter, or a more-or-less conventional interpreter, or a byte-code interpreter, how about directly executing IR1?

We run through the IR1 passes, possibly skipping optional ones, until we get through environment analysis. Then we run a post-pass that annotates IR1 with information about where values are kept, i.e. the stack slot.

We can lazily convert functions by having FUNCTION make an interpreted function object that holds the code (really a closure over the interpreter). The first time that we try to call the function, we do the conversion and processing. Also, we can easily keep track of which interpreted functions we have expanded macros in, so that macro redefinition automatically invalidates the old expansion, causing lazy reconversion.
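
A minimal sketch of this lazy-conversion idea follows; MAKE-INTERPRETED-FUNCTION, CONVERT-AND-PROCESS and INTERPRET-IR1 are hypothetical names, and the stand-in bodies just compile the lambda so that the example is self-contained:

    (defun convert-and-process (lambda-form env)
      ;; Stand-in for IR1 conversion plus the annotation post-pass; here we
      ;; simply produce a callable function so the sketch runs.
      (declare (ignore env))
      (coerce lambda-form 'function))

    (defun interpret-ir1 (converted &rest args)
      ;; Stand-in for running the annotated IR1.
      (apply converted args))

    (defun make-interpreted-function (lambda-form env)
      "Return a closure that converts LAMBDA-FORM on its first call and caches
       the result; macro redefinition could later clear the cache to force
       lazy reconversion."
      (let ((converted nil))
        (lambda (&rest args)
          (unless converted
            (setf converted (convert-and-process lambda-form env)))
          (apply #'interpret-ir1 converted args))))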

Probably the interpreter will want to represent MVs by a recognizable structure that is always heap-allocated. This way, we can punt the stack issues involved in trying to spread MVs. So a continuation value can always be kept in a single cell.

The compiler can have some special frobs for making the interpreter efficient, such as a call operation that extracts arguments from the stack slots designated by a continuation list. Perhaps

    (values-mapcar fun . lists)
<==>
    (values-list (mapcar fun . lists))

This would be used with MV-CALL.
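
Written out as a portable macro (in the real system this would presumably be a compiler-known operation rather than a macro), the equivalence above is simply:

    (defmacro values-mapcar (fun &rest lists)
      ;; (values-mapcar fun . lists) == (values-list (mapcar fun . lists))
      `(values-list (mapcar ,fun ,@lists)))

    ;; For example, with MULTIPLE-VALUE-CALL:
    ;;   (multiple-value-call #'+ (values-mapcar #'1+ '(1 2 3)))  =>  9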

This scheme seems to provide nearly all of the advantages of both the compiler and conventional interpretation. The only significant disadvantage with respect to a conventional interpreter is that there is the one-time overhead of conversion, but doing this lazily should make this quite acceptable.

With respect to a conventional interpreter, we have major advantages:
+ Full syntax checking: safety comparable to compiled code.
+ Semantics similar to compiled code due to code sharing. Similar diagnostic messages, etc. Reduction of error-prone code duplication.
+ Potential for full type checking according to declarations (would require running IR1 optimize?)
+ Simplifies debugger interface, since interpreted code can look more like compiled code: source paths, edit definition, etc.

For all non-run-time symbol annotations (anything other than SYMBOL-FUNCTION and SYMBOL-VALUE), we use the compiler's global database. MACRO-FUNCTION will use INFO, rather than vice-versa.

When doing the IR1 phases for the interpreter, we probably want to suppress optimizations that change user-visible function calls:
- Don't do local call conversion of any named functions (even lexical ones). This is so that a call will appear on the stack that looks like the call in the original source. The keyword and optional argument transformations done by local call mangle things quite a bit. Also, note that local-call converting prevents unreferenced arguments from being deleted, which is another non-obvious transformation.
- Don't run source-transforms, IR1 transforms and IR1 optimizers. This way, TRACE and BACKTRACE will show calls with the original arguments, rather than the "optimized" form, etc. Also, for the interpreter it will actually be faster to call the original function (which is compiled) than to "inline expand" it. Also, this allows implementation-dependent transforms to expand into %PRIMITIVE uses.

There are some problems with stepping, due to our non-syntactic IR1 representation. The source path information is the key that makes this conceivable. We can skip over the stepping of a subform by quietly evaluating nodes whose source path lies within the form being skipped.

One problem is determining what value has been returned by a form. With a function call, it is theoretically possible to precisely determine this, since if we complete evaluation of the arguments, then we arrive at the Combination node whose value is synonymous with the value of the form. We can even detect this case, since the Node-Source will be EQ to the form. And we can also detect when we unwind out of the evaluation, since we will leave the form without having ever reached this node.

But with macros and special-forms, there is no node whose value is the value of the form, and no node whose source is the macro call or special form. We can still detect when we leave the form, but we can't be sure whether this was a normal evaluation result or an explicit RETURN-FROM.

But does this really matter? It seems that we can print the value returned (if any), then just print the next form to step. In the rare case where we did unwind, the user should be able to figure it out.

[We can look at this as a side-effect of CPS: there isn't any difference between a "normal" return and a non-local one.]

[Note that in any control transfer (normal or otherwise), the stepper may need to unwind out of an arbitrary number of levels of stepping. This is because a form in a TR position may yield its value to a node arbitrarily far out.]

Another problem is with deciding what form is being stepped. When we start evaluating a node, we dive into code that is nested somewhere down inside that form. So we actually have to do a loop of asking questions before we do any evaluation. But what do we ask about?

If we ask about the outermost enclosing form that is a subform of the last form that the user said to execute, then we might offer a form that isn't really evaluated, such as a LET binding list.

But once again, is this really a problem? It is certainly different from a conventional stepper, but a pretty good argument could be made that it is superior. Haven't you ever wanted to skip the evaluation of all the LET bindings, but not the body? Wouldn't it be useful to be able to skip the DO step forms?

All of this assumes that nobody ever wants to step through the guts of a macroexpansion. This seems reasonable, since steppers are for weenies, and weenies don't define macros (hence don't debug them). But there are probably some weenies who don't know that they shouldn't be writing macros.

We could handle this by finding the "source paths" in the expansion of each macro by sticking some special frob in the source path marking the place where the expansion happened. When we hit code again that is in the source, then we revert to the normal source path. Something along these lines might be a good idea anyway (for compiler error messages, for example).

The source path hack isn't guaranteed to work quite so well in generated code, though, since macros return stuff that isn't freshly consed. But we could probably arrange to win as long as any given expansion doesn't return two EQ forms.

It might be nice to have a command that skipped stepping of the form, but printed the results of each outermost enclosed evaluated subform, i.e. if you used this on the DO step-list, it would print the result of each new-value form. I think this is implementable. I guess what you would do is print each value delivered to a DEST whose source form is the current or an enclosing form. Along with the value, you would print the source form for the node that is computing the value.

The stepper can also have a "back" command that "unskips" or "unsteps". This would allow the evaluation of forms that are pure (modulo lexical variable setting) to be undone. This is useful, since in stepping it is common that you skip a form that you shouldn't have, or get confused and want to restart at some earlier point.

What we would do is remember the current node and copy the values of all local variables to the heap before doing each step or skip action. We can then back up the state of all lexical variables and the "program counter". To make this work right with set closure variables, we would copy the cell's value, rather than the value cell itself.

[To be fair, note that this could easily be done with our current interpreter: the stepper could copy the environment alists.]

We can't back up the "program counter" when a control transfer leaves the current function, since this state is implicitly represented in the interpreter's state, and is discarded when we exit. We probably want to ask for confirmation before leaving the function to give users a chance to "unskip" the forms in a TR position.

Another question is whether the conventional stepper is really a good thing to imitate... How about an editor-based mouse-driven interface? Instead of "skipping" and "stepping", you would just designate the next form that you wanted to stop at. Instead of displaying return values, you replace the source text with the printed representation of the value.

It would show the "program counter" by highlighting the *innermost* form that we are about to evaluate, i.e. the source form for the node that we are stopped at. It would probably also be useful to display the start of the form that was used to designate the next stopping point, although I guess this could be implied by the mouse position.

Such an interface would be a little harder to implement than a dumb stepper, but it would be much easier to use. [It would be impossible for an evalhook stepper to do this.]

Use of %PRIMITIVE

Note: %PRIMITIVE can only be used in compiled code. It is a trapdoor into the compiler, not a general syntax for accessing "sub-primitives". Its main use is in implementation-dependent compiler transforms. It saves us the effort of defining a "phony function" (that is not really defined), and also allows direct communication with the code generator through codegen-info arguments.

Some primitives may be exported from the VM so that %PRIMITIVE can be used to make it explicit that an escape routine or interpreter stub is assuming an operation is implemented by the compiler.


Debugger Information

Although the compiler's great freedom in choice of function call conventions and variable representations has major efficiency advantages, it also has unfortunate consequences for the debugger. The debug information that we need is even more elaborate than for conventional "compiled" languages, since we cannot even do a simple backtrace without some debug information. However, once having gone this far, it is not that difficult to go the extra distance, and provide full source level debugging of compiled code.

Full debug information has a substantial space penalty, so we allow different levels of debug information to be specified. In the extreme case, we can totally omit debug information.


The Debug-Info Structure

The Debug-Info structure directly represents information about the source code, and points to other structures that describe the layout of run-time data structures.

Make some sort of minimal debug-info format that would support at least the common cases of level 1 (since that is what we would release), and perhaps level 0. Actually, it seems it wouldn't be hard to crunch nearly all of the debug-function structure and debug-info function map into a single byte-vector. We could have an uncrunch function that restored the current format. This would be used by the debugger, and also could be used by purify to delete parts of the debug-info even when the compiler dumps it in crunched form. [Note that this isn't terribly important if purify is smart about debug-info...]

Compiled source map representation:

[### store in debug-function PC at which env is properly initialized, i.e. args (and return-pc, etc.) in internal locations. This is where a :function-start breakpoint would break.]

[### Note that we can easily cache the form-number => source-path or form-number => form translation using a vector indexed by form numbers that we build during a walk.]

Instead of using source paths in the debug-info, use "form numbers". The form number of a form is the number of forms that we walk to reach that form when doing a pre-order walk of the source form. [Might want to use a post-order walk, as that would more closely approximate evaluation order.]

We probably want to continue using source-paths in the compiler, since they are quick to compute and to get you to a particular form. [### But actually, I guess we don't have to precompute the source paths and annotate nodes with them: instead we could annotate the nodes with the actual original source form. Then if we wanted to find the location of that form, we could walk the root source form, looking for that original form. But we might still need to enter all the forms in a hashtable so that we can tell during IR1 conversion that a given form appeared in the original source.]

Note that form numbers have an interesting property: it is quite efficient to determine whether an arbitrary form is a subform of some other form, since the form number of B will be greater than A's number and less than A's next sibling's number iff B is a subform of A.

This should be quite useful for doing the source=>pc mapping in the debugger, since that problem reduces to finding the subset of the known locations that are for subforms of the specified form.
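
As a sketch of the numbering itself (illustrative code, not the compiler's), a pre-order walk just records how many forms have been visited when each form is reached:

    (defun number-forms (form)
      "Return an alist of (form . form-number) from a pre-order walk of FORM.
       Proper lists are assumed; duplicated leaf forms simply get several
       entries."
      (let ((number -1)
            (result '()))
        (labels ((walk (form)
                   (push (cons form (incf number)) result)
                   (when (consp form)
                     (mapc #'walk form))))
          (walk form)
          (nreverse result))))

    ;; With this numbering, B is a subform of A exactly when
    ;;   number(A) < number(B) < number(A's next sibling),
    ;; since the walk numbers everything inside A before moving past it.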

Assume a byte vector with a standard variable-length integer format, something like this:

    0..253 => the integer
    254 => read next two bytes for integer
    255 => read next four bytes for integer
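
A decoder for this format might look like the following sketch; little-endian byte order is an assumption here:

    (defun read-var-integer (buffer index)
      "Read one variable-length integer from BUFFER, a vector of
       (unsigned-byte 8), starting at INDEX.  Returns the integer and the
       index just past it."
      (let ((byte (aref buffer index)))
        (case byte
          (254 (values (logior (aref buffer (+ index 1))
                               (ash (aref buffer (+ index 2)) 8))
                       (+ index 3)))
          (255 (values (logior (aref buffer (+ index 1))
                               (ash (aref buffer (+ index 2)) 8)
                               (ash (aref buffer (+ index 3)) 16)
                               (ash (aref buffer (+ index 4)) 24))
                       (+ index 5)))
          (t   (values byte (1+ index))))))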

Then a compiled debug block is just a sequence of variable-length integers in a particular order, something like this:

    number of successors
    ...offsets of each successor in the function's blocks vector...
    first PC
    [offset of first top-level form (in forms) (only if not component default)]
    form number of first source form
    first live mask (length in bytes determined by number of VARIABLES)
    ...more <PC, top-level form offset, form-number, live-set> tuples...

We determine the number of locations recorded in a block by finding the start of the next compiled debug block in the blocks vector.

[### Actually, only need 2 bits for number of successors 0,1,2. We might want to use other bits in the first byte to indicate the kind of location.] [### We could support local packing by having a general concept of "alternate locations" instead of just regular and save locations. The location would have a bit indicating that there are alternate locations, in which case we read the number of alternate locations and then that many more SC-OFFSETs. In the debug-block, we would have a second bit mask with bits set for TNs that are in an alternate location. We then read a number for each such TN, with the value being interpreted as an index into the Location's alternate locations.]

It looks like using structures for the compiled-location-info is too bulky. Instead we need some packed binary representation.

First, let's represent an SC/offset pair with an "SC-Offset", which is an integer with the SC in the low 5 bits and the offset in the remaining bits:

    ----------------------------------------------------
    | Offset (as many bits as necessary) | SC (5 bits) |
    ----------------------------------------------------
Probably the result should be constrained to fit in a fixnum, since it will be more efficient and gives more than enough possible offsets.
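
Concretely, packing and unpacking an SC-Offset are a couple of LOGIOR, LDB and ASH operations (a sketch; the names are illustrative):

    (defconstant sc-bits 5)

    (defun make-sc-offset (sc offset)
      ;; SC in the low 5 bits, offset in the bits above them.
      (logior sc (ash offset sc-bits)))

    (defun sc-offset-sc (sc-offset)
      (ldb (byte sc-bits 0) sc-offset))

    (defun sc-offset-offset (sc-offset)
      (ash sc-offset (- sc-bits)))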

We can then represent a compiled location like this:

    single byte of boolean flags:
	uninterned name
	packaged name
	environment-live
	has distinct save location
        has ID (name not unique in this fun)
    name length in bytes (as var-length integer)
    ...name bytes...
    [if packaged, var-length integer that is package name length]
     ...package name bytes...]
    [If has ID, ID as var-length integer]
    SC-Offset of primary location (as var-length integer)
    [If has save SC, SC-Offset of save location (as var-length integer)]

But for a whizzy breakpoint facility, we would need a good source=>code map. Dumping a complete code=>source map might be as good a way as any to represent this, due to the one-to-many relationship between source and code locations.

We might be able to get away with just storing the source locations for the beginnings of blocks and maintaining a mapping from code ranges to blocks. This would be fine both for the profiler and for the "where am I running now" indication. Users might also be convinced that it was most interesting to break at block starts, but I don't really know how easily people could develop an understanding of basic blocks.

It could also be a bit tricky to map an arbitrary user-designated source location to some "closest" source location actually in the debug info. This problem probably exists to some degree even with a full source map, since some forms will never appear as the source of any node. It seems you might have to negotiate with the user. He would mouse something, and then you would highlight some source form that has a common prefix (i.e. is a prefix of the user path, or vice-versa.) If they aren't happy with the result, they could try something else. In some cases, the designated path might be a prefix of several paths. This ambiguity might be resolved by picking the shortest path or letting the user choose.

At the primitive level, I guess what this means is that the structure of source locations (i.e. source paths) must be known, and the source=>code operation should return a list of <source,code> pairs, rather than just a list of code locations. This allows the debugger to resolve the ambiguity however it wants.

I guess the formal definition of which source paths we would return is: All source paths in the debug info that have a maximal common prefix with the specified path. i.e. if several paths have the complete specified path as a prefix, we return them all. Otherwise, all paths with an equally large common prefix are returned: if the path with the most in common matches only the first three elements, then we return all paths that match in the first three elements. As a degenerate case (which probably shouldn't happen), if there is no path with anything in common, then we return *all* of the paths.
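
Written out as a sketch (paths here are plain lists of integers; this is not the debugger's real interface), the rule might look like:

    (defun matching-source-paths (specified known-paths)
      "Return the members of KNOWN-PATHS whose common prefix with SPECIFIED
       is maximal.  If nothing matches at all, every path is returned."
      (flet ((common-prefix-length (a b)
               (loop for x in a for y in b
                     while (eql x y)
                     count t)))
        (let* ((lengths (mapcar (lambda (path)
                                  (common-prefix-length specified path))
                                known-paths))
               (best (if lengths (reduce #'max lengths) 0)))
          (loop for path in known-paths
                for len in lengths
                when (= len best)
                  collect path))))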

In the DEBUG-SOURCE structure we may ultimately want a vector of the start positions of each source form, since that would make it easier for the debugger to locate the source. It could just open the file, FILE-POSITION to the form, do a READ, then loop down the source path. Of course, it could read each form starting from the beginning, but that might be too slow.

Do XEPs really need Debug-Functions? The only time that we will commonly end up in the debugger on an XEP is when an argument type check fails. But I suppose it would be nice to be able to print the arguments passed...

Note that assembler-level code motion such as pipeline reorganization can cause problems with our PC maps. The assembler needs to know that debug info markers are different from real labels anyway, so I suppose it could inhibit motion across debug markers conditional on policy. It seems unworthwhile to remember the node for each individual instruction.

For tracing block-compiled calls: Info about return value passing locations? Info about where all the returns are?

We definitely need the return-value passing locations for debug-return. The question is what the interface should be. We don't really want to have a visible debug-function-return-locations operation, since there are various value passing conventions, and we want to paper over the differences.

Probably should be a compiler option to initialize stack frame to a special uninitialized object (some random immediate type). This would aid debugging, and would also help GC problems. For the latter reason especially, this should be locally-turn-onable (off of policy? the new debug-info quality?).

What about the interface between the evaluator and the debugger? (i.e. what happens on an error, etc.) Compiler error handling should be integrated with run-time error handling. Ideally the error messages should look the same. Practically, in some cases the run-time errors will have less information. But the error should look the same to the debugger (or at least similar).

Debugger Interface

How does the debugger interface to the "evaluator" (where the evaluator means all of native code, byte-code and interpreted IR1)? It seems that it would be much more straightforward to have a consistent user interface to debugging all code representations if there were a uniform debugger interface to the underlying stuff, and vice-versa.

Of course, some operations might not be supported by some representations, etc. For example, fine-control stepping might not be available in native code. In other cases, we might reduce an operation to the lowest common denominator, for example fetching lexical variables by string and admitting the possibility of ambiguous matches. [Actually, it would probably be a good idea to store the package if we are going to allow variables to be closed over.]

Some objects we would need:

Location:
	The constant information about the place where a value is stored,
        everything but which particular frame it is in.  Operations:
        location name, type, etc.
        location-value frame location (setf'able)
	monitor-location location function
            Function is called whenever location is set with the location,
            frame and old value.  If active values aren't supported, then we
            dummy the effect using breakpoints, in which case the change won't
            be noticed until the end of the block (and intermediate changes
            will be lost.)
debug info:
        All the debug information for a component.
Frame:
	frame-changed-locations frame => location*
            Return a list of the locations in frame that were changed since the
            last time this function was called.  Or something.  This is for
            displaying interesting state changes at breakpoints.
	save-frame-state frame => frame-state
	restore-frame-state frame frame-state
	    These operations allow the debugger to back up evaluation, modulo
	    side-effects and non-local control transfers.  This copies and
	    restores all variables, temporaries, etc, local to the frame, and
	    also the current PC and dynamic environment (current catch, etc.)

	    At the time of the save, the frame must be for the running function
	    (not waiting for a call to return.)  When we restore, the frame
	    becomes current again, effectively exiting from any frames on top.
	    (Of course, frame must not already be exited.)
       
Thread:
        Representation of which stack to use, etc.
Block:
        What successors the block has, what calls there are in the block.
        (Don't need to know where calls are as long as we know called function,
        since can breakpoint at the function.)  Whether code in this block is
        wildly out of order due to being the result of loop-invariant
        optimization, etc.  Operations:
        block-successors block => code-location*
        block-forms block => (source-location code-location)*
            Return the corresponding source locations and code locations for
            all forms (and form fragments) in the block.

Variable maps

There are about five things that the debugger might want to know about a variable:

Name:
Although a lexical variable's name is "really" a symbol (package and all), in practice it doesn't seem worthwhile to require all the symbols for local variable names to be retained. There is much less VM and GC overhead for a constant string than for a symbol. (Also it is useful to be able to access gensyms in the debugger, even though they are theoretically ineffable).

ID:
Which variable with the specified name is this? It is possible to have multiple variables with the same name in a given function. The ID is something that makes Name unique, probably a small integer. When variables aren't unique, we could make this be part of the name, e.g. "FOO#1", "FOO#2". But there are advantages to keeping this separate, since in many cases lifetime information can be used to disambiguate, making qualification unnecessary.

SC:
When unboxed representations are in use, we must have type information to properly read and write a location. We only need to know the SC for this, which would be amenable to a space-saving numeric encoding.

Location:
Simple: the offset in SC. [Actually, we need the save location too.]

Lifetime:
In what parts of the program does this variable hold a meaningful value? It seems prohibitive to record precise lifetime information, both in space and compiler effort, so we will have to settle for some sort of approximation.

The finest granularity at which it is easy to determine liveness is the block: we can regard the variable lifetime as the set of blocks that the variable is live in. Of course, the variable may be dead (and thus contain meaningless garbage) during arbitrarily large portions of the block.

Note that this subsumes the notion of which function a variable belongs to. A given block is only in one function, so the function is implicit.

The variable map should represent this information space-efficiently and with adequate computational efficiency.

The SC and ID can be represented as small integers. Although the ID can in principle be arbitrarily large, it should be less than 100 in practice. The location can be represented by just the offset (a moderately small integer), since the SB is implicit in the SC.

The lifetime info can be represented either as a bit-vector indexed by block numbers, or by a list of block numbers. Which is more compact depends both on the size of the component and on the number of blocks the variable is live in. In the limit of large component size, the sparse representation will be more compact, but it isn't clear where this crossover occurs. Of course, it would be possible to use both representations, choosing the more compact one on a per-variable basis. Another interesting special case is when the variable is live in only one block: this may be common enough to be worth picking off, although it is probably rarer for named variables than for TNs in general.

If we dump the type, then a normal list-style type descriptor is fine: the space overhead is small, since the shareability is high.

We could probably save some space by cleverly representing the var-info as parallel vectors of different types, but this would be more painful in use. It seems better to just use a structure, encoding the unboxed fields in a fixnum. This way, we can pass around the structure in the debugger, perhaps even exporting it from the low-level debugger interface.

[### We need the save location too. This probably means that we need two slots of bits, since we need the save offset and save SC. Actually, we could let the save SC be implied by the normal SC, since at least currently, we always choose the same save SC for a given SC. But even so, we probably can't fit all that stuff in one fixnum without squeezing a lot, so we might as well split and record both SCs.

In a localized packing scheme, we would have to dump a different var-info whenever either the main location or the save location changes. As a practical matter, the save location is less likely to change than the main location, and should never change without the main location changing.

One can conceive of localized packing schemes that do saving as a special case of localized packing. If we did this, then the concept of a save location might be eliminated, but this would require major changes in the IR2 representation for call and/or lifetime info. Probably we will want saving to continue to be somewhat magical.]

How about:

(defstruct var-info
  ;;
  ;; This variable's name. (symbol-name of the symbol)
  (name nil :type simple-string)
  ;;
  ;; The SC, ID and offset, encoded as bit-fields.
  (bits nil :type fixnum)
  ;;
  ;; The set of blocks this variable is live in.  If a bit-vector, then it has
  ;; a 1 when indexed by the number of a block that it is live in.  If an
  ;; I-vector, then it lists the live block numbers.  If a fixnum, then that is
  ;; the number of the sole live block.
  (lifetime nil :type (or vector fixnum))
  ;;
  ;; The variable's type, represented as list-style type descriptor.
  type)

Then the debug-info holds a simple-vector of all the var-info structures for that component. We might as well make it sorted alphabetically by name, so that we can binary-search to find the variable corresponding to a particular name.
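
Given the var-info structure above, the lookup is an ordinary binary search on the name (a sketch; when several variables share a name, the caller would scan the neighboring entries and use the ID to disambiguate):

    (defun find-var-info (name var-infos)
      "Binary-search VAR-INFOS, a simple-vector of var-info structures sorted
       by name, for an entry whose name is NAME.  Returns NIL if there is
       none."
      (let ((lo 0)
            (hi (length var-infos)))
        (loop while (< lo hi)
              do (let* ((mid (floor (+ lo hi) 2))
                        (mid-name (var-info-name (svref var-infos mid))))
                   (cond ((string= name mid-name)
                          (return (svref var-infos mid)))
                         ((string< name mid-name)
                          (setf hi mid))
                         (t
                          (setf lo (1+ mid))))))))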

We need to be able to translate PCs to block numbers. This can be done by an I-Vector in the component that contains the start location of each block. The block number is the index at which we find the correct PC range. This requires that we use an emit-order block numbering distinct from the IR2-Block-Number, but that isn't any big deal. This seems space-expensive, but it isn't too bad, since it would only be a fraction of the code size if the average block length is a few words or more.
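
In outline (illustrative code), the translation is a search of that vector for the last block whose start PC is not beyond the given PC; a real implementation would binary-search, but the idea is just:

    (defun pc-block-number (pc block-starts)
      "BLOCK-STARTS is a vector of each block's starting PC in emit order.
       Return the number of the block whose PC range contains PC."
      (loop for i from (1- (length block-starts)) downto 0
            when (<= (aref block-starts i) pc)
              return i))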

An advantage of our per-block lifetime representation is that it directly supports keeping a variable in different locations when in different blocks, i.e. multi-location packing. We use a different var-info for each different packing, since the SC and offset are potentially different. The Name and ID are the same, representing the fact that it is the same variable. It is here that the ID is most significant, since the debugger could otherwise make same-name variables unique all by itself.

Stack parsing

[### Probably not worth trying to make the stack parseable from the bottom up. There are too many complications when we start having variable sized stuff on the stack. It seems more profitable to work on making top-down parsing robust. Since we are now planning to wire the bottom-up linkage info, scanning from the bottom to find the top frame shouldn't be too inefficient, even when there was a runaway recursion. If we somehow jump into hyperspace, then the debugger may get confused, but we can debug this sort of low-level system lossage using ADB.]

There are currently three relevant context pointers:
- The PC. The current PC is wired (implicit in the machine). A saved PC (RETURN-PC) may be anywhere in the current frame.
- The current stack context (CONT). The current CONT is wired. A saved CONT (OLD-CONT) may be anywhere in the current frame.
- The current code object (ENV). The current ENV is wired. When saved, this is extra-difficult to locate, since it is saved by the caller, and is thus at an unknown offset in OLD-CONT, rather than anywhere in the current frame.

We must have all of these to parse the stack.

With the proposed Debug-Function, we parse the stack (starting at the top) like this:
1] Use ENV to locate the current Debug-Info.
2] Use the Debug-Info and PC to determine the current Debug-Function.
3] Use the Debug-Function to find the OLD-CONT and RETURN-PC.
4] Find the old ENV by searching up the stack for a saved code object containing the RETURN-PC.
5] Assign old ENV to ENV, OLD-CONT to CONT, RETURN-PC to PC and goto 1.

If we changed the function representation so that the code and environment were a single object, then the location of the old ENV would be simplified. But we still need to represent ENV as separate from PC, since interrupts and errors can happen when the current PC isn't positioned at a valid return PC.

It seems like it might be a good idea to save OLD-CONT, RETURN-PC and ENV at the beginning of the frame (before any stack arguments). Then we wouldn't have to search to locate ENV, and we also have a hope of parsing the stack even if it is damaged. As long as we can locate the start of some frame, we can trace the stack above that frame. We can recognize a probable frame start by scanning the stack for a code object (presumably a saved ENV).

Probably we want some fairly general mechanism for specifying that a TN should be considered to be live for the duration of a specified environment. It would be somewhat easier to specify that the TN is live for all time, but this would become very space-inefficient in large block compilations.

This mechanism could be quite useful for other debugger-related things. For example, when debuggability is important, we could make the TNs holding arguments live for the entire environment. This would guarantee that a backtrace would always get the right value (modulo setqs).

Note that in this context, "environment" means the Environment structure (one per non-let function). At least according to current plans, even when we do inter-routine register allocation, the different functions will have different environments: we just "equate" the environments. So the number of live per-environment TNs is bounded by the size of a "function", and doesn't blow up in block compilation.

The implementation is simple: per-environment TNs are flagged by the :Environment kind. :Environment TNs are treated the same as :Normal TNs by everyone except for lifetime/conflict analysis. An environment's TNs are also stashed in a list in the IR2-Environment structure. During the conflict analysis post-pass, we look at each block's environment, and make all the environment's TNs always-live in that block.

We can implement the "fixed save location" concept needed for lazy frame creation by allocating the save TNs as wired TNs at IR2 conversion time. We would use the new "environment lifetime" concept to specify the lifetimes of the save locations. There isn't any run-time overhead if we never get around to using the save TNs. [Pack would also have to notice TNs with pre-allocated save TNs, packing the original TN in the stack location if its FSC is the stack.]

We want a standard (recognizable) format for an "escape" frame. We must make an escape frame whenever we start running another function without the current function getting a chance to save its registers. This may be due either to a truly asynchronous event such as a software interrupt, or due to an "escape" from a miscop. An escape frame marks a brief conversion to a callee-saves convention.

Whenever a miscop saves registers, it should make an escape frame. This ensures that the "current" register contents can always be located by the debugger. In this case, it may be desirable to be able to indicate that only partial saving has been done. For example, we don't want to have to save all the FP registers just so that we can use a couple extra general registers.

When the debugger sees an escape frame, it knows that register values are located in the escape frame's "register save" area, rather than in the normal save locations.

It would be nice if there was a better solution to this internal error concept. One problem is that it seems there is a substantial space penalty for emitting all that error code, especially now that we don't share error code between errors because we want to preserve the source context in the PC. But this probably isn't really all that bad when considered as a fraction of the code. For example, the check part of a type check is 12 bytes, whereas the error part is usually only 6. In this case, we could never reduce the space overhead for type checks by more than 1/3, thus the total code size reduction would be small. This will be made even less important when we do type check optimizations to reduce the number of type checks.

Probably we should stick to the same general internal error mechanism, but make it interact with the debugger better by allocating linkage registers and allowing proceedable errors. We could support shared error calls and non-proceedable errors when space is more important than debuggability, but this is probably more complexity than is worthwhile.

We jump or trap to a routine that saves the context (allocating at most the return PC register). We then encode the error and context in the code immediately following the jump/trap. (On the MIPS, the error code can be encoded in the trap itself.) The error arguments would be encoded as SC-offsets relative to the saved context. This could solve both the arg-trashing problem and save space, since we could encode the SC-offsets more tersely than the corresponding move instructions.

Object Format

Tagging

The following is a key to the three-bit low-tagging scheme:

    000   even fixnum
    001   function pointer
    010   even other-immediate (header-words, characters, symbol-value trap value, etc.)
    011   list pointer
    100   odd fixnum
    101   structure pointer
    110   odd other-immediate
    111   other-pointer to data-blocks (other than conses, structures, and functions)

This tagging scheme forces a dual-word alignment of data-blocks on the heap, but this is pretty negligible: everything else is vector-like, including code, so these data-blocks probably take up so many words that one extra one doesn't matter.

GC Comments

Data-Blocks comprise only descriptors, or they contain immediate data and raw bits interpreted by the system. GC must skip the latter when scanning the heap, so it does not look at a word of raw bits and interpret it as a pointer descriptor. These data-blocks require headers for GC as well as for operations that need to know how to interpret the raw bits. When GC is scanning, and it sees a header-word, then it can determine how to skip that data-block if necessary. Header-Words are tagged as other-immediates. See the sections "Other-Immediates" and "Data-Blocks and Header-Words" for comments on distinguishing header-words from other-immediate data. This distinction is necessary since we scan through data-blocks containing only descriptors just as we scan through the heap looking for header-words introducing data-blocks.

Data-Blocks containing only descriptors do not require header-words for GC since the entire data-block can be scanned by GC a word at a time, taking whatever action is necessary or appropriate for the data in that slot. For example, a cons is referenced by a descriptor with a specific tag, and the system always knows the size of this data-block. When GC encounters a pointer to a cons, it can transport it into the new space, and when scanning, it can simply scan the two words manifesting the cons interpreting each word as a descriptor. Actually there is no cons tag, but a list tag, so we make sure the cons is not nil when appropriate. A header may still be desired if the pointer to the data-block does not contain enough information to adequately maintain the data-block. An example of this is a simple-vector containing only descriptor slots, and we attach a header-word because the descriptor pointing to the vector lacks necessary information - the type of the vector's elements, its length, etc.

There is no need for a major tag for GC forwarding pointers. Since the tag bits are in the low end of the word, a range check on the start and end of old space tells you if you need to move the thing. This is all GC overhead.

Structures

A structure descriptor has the structure lowtag type code, making structurep a fast operation. A structure data-block has the following format:

    -------------------------------------------------------
    |   length (24 bits) | Structure header type (8 bits) |
    -------------------------------------------------------
    |   structure type name (a symbol)                    |
    -------------------------------------------------------
    |   structure slot 0                                  |
    -------------------------------------------------------
    |   ... structure slot length - 2                     |
    -------------------------------------------------------

The header word contains the structure length, which is the number of words (other than the header word.) The length is always at least one, since the first word of the structure data is the structure type name.

Fixnums

A fixnum has one of the following formats in 32 bits:

    -------------------------------------------------------
    |        30 bit 2's complement even integer   | 0 0 0 |
    -------------------------------------------------------
or
    -------------------------------------------------------
    |        30 bit 2's complement odd integer    | 1 0 0 |
    -------------------------------------------------------

Effectively, there is one tag for immediate integers, two zeros. This buys one more bit for fixnums, and now when these numbers index into simple-vectors or offset into memory, they point to word boundaries on 32-bit, byte-addressable machines. That is, no shifting need occur to use the number directly as an offset.

This format has another advantage on byte-addressable machines when fixnums are offsets into vector-like data-blocks, including structures. Even though we previously mentioned data-blocks are dual-word aligned, most indexing and slot accessing is word aligned, and so are fixnums with effectively two tag bits.

Two tags also allow better usage of special instructions on some machines that can deal with two low-tag bits but not three.

Since the two bits are zeros, we avoid having to mask them off before using the words for arithmetic, but division and multiplication require special shifting.
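
Numerically, the representation is just the integer shifted left past the two zero tag bits, which is what lets a fixnum index be used directly as a byte offset on a 32-bit, byte-addressable machine (a small illustration, not system code):

    (defun fixnum-rep (n)
      ;; The tagged machine word: the integer shifted past the two zero tag bits.
      (ash n 2))

    (defun fixnum-value (word)
      (ash word -2))

    ;; Element 3 of a vector of 32-bit words lives at byte offset 12, and
    ;; (fixnum-rep 3) => 12, so the fixnum index needs no shifting before use.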

Other-immediates

As for fixnums, there are two different three-bit lowtag codes for other-immediate, allowing 64 other-immediate types:

----------------------------------------------------------------
|   Data (24 bits)        | Type (8 bits with low-tag)   | 1 0 |
----------------------------------------------------------------

The type-code for an other-immediate type is considered to include the two lowtag bits. This supports the concept of a single "type code" namespace for all descriptors, since the normal lowtag codes are disjoint from the other-immediate codes.

For other-pointer objects, the full eight bits of the header type code are used as the type code for that kind of object. This is why we use two lowtag codes for other-immediate types: each other-pointer object needs a distinct other-immediate type to mark its header.

The system uses the other-immediate format for characters, the symbol-value unbound trap value, and header-words for data-blocks on the heap. The type codes are laid out to facilitate range checks for common subtypes; for example, all numbers will have contiguous type codes which are distinct from the contiguous array type codes. See section 38.7 for details.

Data-Blocks and Header-Word Format

Pointers to data-blocks have the following format:

----------------------------------------------------------------
|      Dual-word address of data-block (29 bits)       | 1 1 1 |
----------------------------------------------------------------

The word pointed to by the above descriptor is a header-word, and it has the same format as an other-immediate:

----------------------------------------------------------------
|   Data (24 bits)        | Type (8 bits with low-tag) | 0 1 0 |
----------------------------------------------------------------
This is convenient for scanning the heap when GC'ing, but it does mean that whenever GC encounters an other-immediate word, it has to do a range check on the low byte to see if it is a header-word or just a character (for example). This is an easily acceptable performance hit for scanning.

The system interprets the data portion of the header-word for non-vector data-blocks as the word length excluding the header-word. For example, the data field of the header for ratio and complex numbers is two, one word each for the numerator and denominator or for the real and imaginary parts.

For vectors and data-blocks representing Lisp objects stored like vectors, the system ignores the data portion of the header-word:

----------------------------------------------------------------
| Unused Data (24 bits)   | Type (8 bits with low-tag) | 0 1 0 |
----------------------------------------------------------------
|           Element Length of Vector (30 bits)           | 0 0 | 
----------------------------------------------------------------

Using a separate word allows for much larger vectors, and it allows length to simply access a single word without masking or shifting. Similarly, the header for complex arrays and vectors has a second word, following the header-word, that the system uses for the fill pointer, so computing the length of any array uses the same code sequence.

Data-Blocks and Other-immediates Typing

These are the other-immediate types. We specify them including all low eight bits, including the other-immediate tag, so we can think of the type bits as one type - not an other-immediate major type and a subtype. Also, fetching a byte and comparing it against a constant is more efficient than wasting even a small amount of time shifting out the other-immediate tag to compare against a five bit constant.

Number   (code < 36)
  bignum                                           10
    ratio                                          14
    single-float                                   18
    double-float                                   22
    complex                                        26
    (complex single-float)                         30
    (complex double-float)                         34

Array   (38 <= code <= 118)
   Simple-Array   (38 <= code <= 102)
         simple-array                              38
      Vector  (42 <= code <= 114)
         simple-string                             42
         simple-bit-vector                         46
         simple-vector                             50
         (simple-array (unsigned-byte 2) (*))      54
         (simple-array (unsigned-byte 4) (*))      58
         (simple-array (unsigned-byte 8) (*))      62
         (simple-array (unsigned-byte 16) (*))     66
         (simple-array (unsigned-byte 32) (*))     70
         (simple-array (signed-byte 8) (*))        74
         (simple-array (signed-byte 16) (*))       78
         (simple-array (signed-byte 30) (*))       82
         (simple-array (signed-byte 32) (*))       86
         (simple-array single-float (*))           90
         (simple-array double-float (*))           94
         (simple-array (complex single-float) (*))   98
         (simple-array (complex double-float) (*))   102
      complex-string                               106
      complex-bit-vector                           110
      (array * (*))   -- general complex vector.   114
   complex-array                                   118

code-header-type                                   122
function-header-type                               126
closure-header-type                                130
funcallable-instance-header-type                   134
unused-function-header-1-type                      138
unused-function-header-2-type                      142
unused-function-header-3-type                      146
closure-function-header-type                       150
return-pc-header-type (a.k.a LRA)                  154
value-cell-header-type                             158
symbol-header-type                                 162
base-character-type                                166
system-area-pointer-type (header type)             170
unbound-marker                                     174
weak-pointer-type                                  178
structure-header-type                              182
fdefn-type                                         186
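
As a small illustration of the range-check idea mentioned earlier (the function names are made up), the low eight bits of a header-word are the type code, and the supertype tests reduce to comparisons against the boundaries listed above:

    (defun header-type-code (header-word)
      ;; The type code is the low byte, including the other-immediate lowtag.
      (ldb (byte 8 0) header-word))

    (defun number-type-code-p (code)
      (< code 36))                      ; bignum (10) through (complex double-float) (34)

    (defun array-type-code-p (code)
      (<= 38 code 118))                 ; simple-array through complex-array

    (defun simple-array-type-code-p (code)
      (<= 38 code 102))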

Strings

All strings in the system are C-null terminated. This saves copying the bytes when calling out to C. The only time this wastes memory is when the string contains a multiple of eight characters, and then the system allocates two more words (since Lisp objects are dual-word aligned) to hold the C-null byte. Since the system will make heavy use of C routines for system calls and for libraries that save reimplementation of higher-level operating system functionality (such as pathname resolution or current directory computation), saving on copying strings should make calling out to C more efficient.

The length word in a string header, see section "Data-Blocks and Header-Word Format", counts only the characters truly in the Common Lisp string. Allocation and GC will have to know to handle the extra C-null byte, and GC already has to deal with rounding up various objects to dual-word alignment.

Symbols and NIL

A symbol data-block has the following format:

-------------------------------------------------------
|     7 (data-block words)     | Symbol Type (8 bits) |
-------------------------------------------------------
|               Value Descriptor                      |
-------------------------------------------------------
|                       Function Pointer              |
-------------------------------------------------------
|                     Raw Function Address            |
-------------------------------------------------------
|                        Setf Function                |
-------------------------------------------------------
|                        Property List                |
-------------------------------------------------------
|                          Print Name                 |
-------------------------------------------------------
|                           Package                   |
-------------------------------------------------------

Most of these slots are self-explanatory given what symbols must do in Common Lisp, but a couple require comments. We added the Raw Function Address slot to speed up named call, which is the most common calling convention. This is a non-descriptor slot, but since objects are dual-word aligned, the value inherently has fixnum low-tag bits. The GC method for symbols must know to update this slot. The Setf Function slot is currently unused, but we had an extra slot available after adding Raw Function Address, since objects must be dual-word aligned.

The issues with nil are that we want it to act like a symbol, and we need list operations such as CAR and CDR to be fast on it. CMU Common Lisp solves this by putting nil as the first object in static space, where other global values reside, so it has a known address in the system:

-------------------------------------------------------  <-- space
|                               0                     |      start
-------------------------------------------------------
|     7 (data-block words)     | Symbol Type (8 bits) |
-------------------------------------------------------  <-- nil
|                           Value/CAR                 |
-------------------------------------------------------
|                         Definition/CDR              |
-------------------------------------------------------
|                      Raw Function Address           |
-------------------------------------------------------
|                         Setf Function               |
-------------------------------------------------------
|                         Property List               |
-------------------------------------------------------
|                           Print Name                |
-------------------------------------------------------
|                            Package                  |
-------------------------------------------------------
|                              ...                    |
-------------------------------------------------------
In addition, we make the list-typed pointer to nil actually point past the header word of the nil symbol data-block. The usefulness of this is explained below. The value and definition of nil are nil. Therefore, any reference to nil used as a list has quick list type checking, and CAR and CDR can go right through the first and second words as if nil were a cons object.

When there is a reference to nil used as a symbol, the system adds offsets to the address the same as it does for any symbol. This works due to a combination of nil pointing past the symbol header-word and the chosen list and other-pointer type tags. The list type tag is four less than the other-pointer type tag, but nil points four additional bytes into its symbol data-block.
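
A worked example of this address arithmetic, using a made-up header address:

    ;;; Suppose NIL's symbol header-word lives at address #x1000 (hypothetical).
    (let* ((header    #x1000)
           (as-symbol (+ header 7))        ; other-pointer lowtag 111 on the header
           (as-list   (+ (+ header 4) 3))) ; list lowtag 011, four bytes past the header
      (= as-symbol as-list))               ; => T: one bit pattern serves both views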

Array Headers

The array-header data-block has the following format:

----------------------------------------------------------------
| Header Len (24 bits) = Array Rank +5   | Array Type (8 bits) |
----------------------------------------------------------------
|               Fill Pointer (30 bits)                   | 0 0 | 
----------------------------------------------------------------
|               Fill Pointer p (29 bits) -- t or nil   | 1 1 1 |
----------------------------------------------------------------
|               Available Elements (30 bits)             | 0 0 | 
----------------------------------------------------------------
|               Data Vector (29 bits)                  | 1 1 1 | 
----------------------------------------------------------------
|               Displacement (30 bits)                   | 0 0 | 
----------------------------------------------------------------
|               Displacedp (29 bits) -- t or nil       | 1 1 1 | 
----------------------------------------------------------------
|               Range of First Index (30 bits)           | 0 0 | 
----------------------------------------------------------------
                              .
                              .
                              .
The array type in the header-word is one of the eight-bit patterns from section "Data-Blocks and Other-immediates Typing", indicating that this is a complex string, complex vector, complex bit-vector, or a multi-dimensional array. The data portion of the other-immediate word is the length of the array header data-block. Due to its format, its length is always five greater than the array's number of dimensions. The following words have the following interpretations and types:
Fill Pointer:
This is a fixnum indicating the number of elements in the data vector actually in use. This is the logical length of the array, and it is typically the same value as the next slot. This is the second word, so LENGTH of any array, with or without an array header, is just four bytes off the pointer to it.
Fill Pointer P:
This is either T or NIL and indicates whether the array uses the fill-pointer or not.
Available Elements:
This is a fixnum indicating the number of elements for which there is space in the data vector. This is greater than or equal to the logical length of the array when it is a vector having a fill pointer.
Data Vector:
This is a pointer descriptor referencing the actual data of the array. This a data-block whose first word is a header-word with an array type as described in sections "Data-Blocks and Header-Word Format" and "Data-Blocks and Other-immediates Typing"
Displacement:
This is a fixnum added to the computed row-major index for any array. This is typically zero.
Displacedp:
This is either t or nil. This is separate from the displacement slot, so most array accesses can simply add in the displacement slot. The rare need to know if an array is displaced costs one extra word in array headers which probably aren't very frequent anyway.
Range of First Index:
This is a fixnum indicating the number of elements in the first dimension of the array. Legal index values are zero to one less than this number inclusively. If the array is zero-dimensional, this slot is non-existent.
... (remaining slots):
There is an additional slot in the header for each dimension of the array. These are the same as the Range of First Index slot.

Bignums

Bignum data-blocks have the following format:

-------------------------------------------------------
|      Length (24 bits)        | Bignum Type (8 bits) |
-------------------------------------------------------
|             least significant bits                  |
-------------------------------------------------------
                            .
                            .
                            .
The elements contain the two's-complement representation of the integer, with the least significant bits in the first element (the one closest to the header). The sign information is in the high end of the last element.
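
As a hedged illustration of this layout (not CMUCL's bignum code), the digits of the two's-complement representation can be computed in portable Common Lisp; the 32-bit digit size and the function name are assumptions made for the example:

    ;; Produce the 32-bit two's-complement digits of an integer, least
    ;; significant digit first, as in the bignum layout above.  LDB on a
    ;; negative integer yields its two's-complement bits, so the sign ends
    ;; up in the high end of the last digit.
    (defun bignum-style-digits (n &optional (digit-bits 32))
      (let ((len (max 1 (ceiling (1+ (integer-length n)) digit-bits)))
            (digits '()))
        (dotimes (i len (nreverse digits))
          (push (ldb (byte digit-bits (* i digit-bits)) n) digits))))

    ;; (bignum-style-digits -1)           =>  (4294967295)
    ;; (bignum-style-digits (expt 2 32))  =>  (0 1)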

Code Data-Blocks

A code data-block is the run-time representation of a "component". A component is a connected portion of a program's flow graph that is compiled as a single unit, and it contains code for many functions. Some of these functions are callable from outside of the component, and these are termed "entry points".

Each entry point has an associated user-visible function data-block (of type function). The full call convention provides for calling an entry point specified by a function object.

Although all of the function data-blocks for a component's entry points appear to the user as distinct objects, the system keeps all of the code in a single code data-block. The user-visible function object is actually a pointer into the middle of a code data-block. This allows any control transfer within a component to be done using a relative branch.

Besides a function object, there are other kinds of references into the middle of a code data-block. Control transfer into a function also occurs at the return-PC for a call. The system represents a return-PC somewhat similarly to a function, so GC can also recognize a return-PC as a reference to a code data-block. This representation is known as a Lisp Return Address (LRA).

It is incorrect to think of a code data-block as a concatenation of "function data-blocks". Code for a function is not emitted in any particular order with respect to that function's function-header (if any). The code following a function-header may be nothing more than a branch to some other location where the function's "real" definition is.

The following are the three kinds of pointers to code data-blocks:

Code pointer (labeled A below):
A code pointer is a descriptor, with other-pointer low-tag bits, pointing to the beginning of the code data-block. The code pointer for the currently running function is always kept in a register (CODE). In addition to allowing loading of non-immediate constants, this also serves to represent the currently running function to the debugger.
LRA (labeled B below):
The LRA is a descriptor, with other-pointer low-tag bits, pointing to a location for a function call. Note that this location contains no descriptors other than the one word of immediate data, so GC can treat LRA locations the same as instructions.
Function (labeled C below):
A function is a descriptor, with function low-tag bits, that is user callable. When a function header is referenced from a closure or from the function header's self-pointer, the pointer has other-pointer low-tag bits, instead of function low-tag bits. This ensures that the internal function data-block associated with a closure appears to be uncallable (although users should never see such an object anyway).

Information about functions that is only useful for entry points is kept in some descriptors following the function's self-pointer descriptor. All of these together with the function's header-word are known as the "function header". GC must be able to locate the function header. We provide for this by chaining together the function headers in a NIL terminated list kept in a known slot in the code data-block.
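
A hedged sketch of what this chaining implies, in portable Common Lisp with hypothetical accessor names (the real slot accessors are internal to the system): given the first-entry slot of a code data-block and the next-function slot of each function header, every entry point can be visited by following the NIL-terminated chain.

    ;; Illustration only; FIRST-ENTRY-FN and NEXT-ENTRY-FN stand in for the
    ;; real slot accessors.
    (defun map-entry-points (function code-block first-entry-fn next-entry-fn)
      (do ((entry (funcall first-entry-fn code-block)
                  (funcall next-entry-fn entry)))
          ((null entry))
        (funcall function entry)))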

A code data-block has the following format:

A -->
****************************************************************
|  Header-Word count (24 bits)    |   Code-Type (8 bits)       |
----------------------------------------------------------------
|  Number of code words (fixnum tag)                           |
----------------------------------------------------------------
|  Pointer to first function header (other-pointer tag)        |
----------------------------------------------------------------
|  Debug information (structure tag)                           |
----------------------------------------------------------------
|  First constant (a descriptor)                               |
----------------------------------------------------------------
|  ...                                                         |
----------------------------------------------------------------
|  Last constant (and last word of code header)                |
----------------------------------------------------------------
|  Some instructions (non-descriptor)                          |
----------------------------------------------------------------
|     (pad to dual-word boundary if necessary)                 |

B -->
****************************************************************
|  Word offset from code header (24)   |   Return-PC-Type (8)  |
----------------------------------------------------------------
|  First instruction after return                              |
----------------------------------------------------------------
|  ... more code and LRA header-words                          |
----------------------------------------------------------------
|     (pad to dual-word boundary if necessary)                 |

C -->
****************************************************************
|  Offset from code header (24)  |   Function-Header-Type (8)  |
----------------------------------------------------------------
|  Self-pointer back to previous word (with other-pointer tag) |
----------------------------------------------------------------
|  Pointer to next function (other-pointer low-tag) or NIL     |
----------------------------------------------------------------
|  Function name (a string or a symbol)                        |
----------------------------------------------------------------
|  Function debug arglist (a string)                           |
----------------------------------------------------------------
|  Function type (a list-style function type specifier)        |
----------------------------------------------------------------
|  Start of instructions for function (non-descriptor)         |
----------------------------------------------------------------
|  More function headers and instructions and return PCs,      |
|  until we reach the total size of header-words + code        |
|  words.                                                      |
----------------------------------------------------------------

The following are detailed slot descriptions:

Code data-block header-word:
The immediate data in the code data-block's header-word is the number of leading descriptors in the code data-block: the fixed overhead words plus the number of constants. The first non-descriptor word (the start of the code) appears at this word offset from the header.
Number of code words:
The total number of non-header words in the code data-block. The total word size of the code data-block is the sum of this slot and the immediate data of the previous header-word.
Pointer to first function header:
A NIL-terminated list of the function headers for all entry points to this component.
Debug information:
The DEBUG-INFO structure describing this component. All information that the debugger wants to get from a running function is kept in this structure. Since there are many functions, the current PC is used to locate the appropriate debug information. The system keeps the debug information separate from the function data-block, since the currently running function may not be an entry point. There is no way to recover the function object for the currently running function, since this data-block may not exist.
First constant ... last constant:
These are the constants referenced by the component, if there are any.

LRA header word:
The immediate header-word data is the word offset from the enclosing code data-block's header-word to this word. This allows GC and the debugger to easily recover the code data-block from an LRA. The code at the return point restores the current code pointer using a subtract immediate of the offset, which is known at compile time.

Function entry point header-word:
The immediate header-word data is the word offset from the enclosing code data-block's header-word to this word. This is the same as for the return-PC header-word.
Self-pointer back to header-word:
In a non-closure function, this self-pointer to the previous header-word allows the call sequence to always indirect through the second word in a user callable function. See section "Closure Format". With a closure, indirecting through the second word gets you a function header-word. The system ignores this slot in the function header for a closure, since it has already indirected once, and this slot could be some random thing that causes an error if you jump to it. This pointer has an other-pointer tag instead of a function pointer tag, indicating it is not a user callable Lisp object.
Pointer to next function:
This is the next link in the thread of entry point functions found in this component. This value is NIL when the current header is the last entry point in the component.
Function name:
This function's name (for printing). If the user defined this function with DEFUN, then this is the defined symbol, otherwise it is a descriptive string.
Function debug arglist:
A printed string representing the function's argument list, for human readability. If it is a macroexpansion function, then this is the original DEFMACRO arglist, not the actual expander function arglist.
Function type:
A list-style function type specifier representing the argument signature and return types for this function. For example,
(function (fixnum fixnum fixnum) fixnum)
or
(function (string &key (:start unsigned-byte)) string)
This information is intended for machine readability, such as use by the compiler.
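
The word offsets stored in the LRA and function-entry header-words make recovering the enclosing code data-block simple address arithmetic. A minimal sketch, ignoring the other-pointer low-tag adjustments and assuming a 4-byte word:

    ;; Illustrative arithmetic only; real code must also account for the
    ;; low-tag bits on both addresses.
    (defun header-offset-to-code-address (header-address word-offset
                                          &optional (word-bytes 4))
      ;; The header-word stores its own word offset from the code header,
      ;; so subtracting that many words recovers the code data-block.
      (- header-address (* word-offset word-bytes)))

    ;; (header-offset-to-code-address #x1000F0 60)  =>  #x100000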

Closure Format

A closure data-block has the following format:

----------------------------------------------------------------
|  Word size (24 bits)           |  Closure-Type (8 bits)      |
----------------------------------------------------------------
|  Pointer to function header (other-pointer low-tag)          |
----------------------------------------------------------------
|                                 .                            |
|                      Environment information                 |
|                                 .                            |
----------------------------------------------------------------

A closure descriptor has function low-tag bits. This means that a descriptor with function low-tag bits may point to either a function header or to a closure. The idea is that any callable Lisp object has function low-tag bits. Insofar as call is concerned, we make the format of closures and non-closure functions compatible. This is the reason for the self-pointer in a function header. Whenever you have a callable object, you just jump through the second word, offset some bytes, and go.

Function call

Due to alignment requirements and low-tag codes, it is not possible to use a hardware call instruction to compute the LRA. Instead the LRA for a call is computed by doing an add-immediate to the start of the code data-block.

An advantage of using a single data-block to represent both the descriptor and non-descriptor parts of a function is that both can be represented by a single pointer. This reduces the number of memory accesses that have to be done in a full call. For example, since the constant pool is implicit in an LRA, a call need only save the LRA, rather than saving both the return PC and the constant pool.

Memory Layout

CMUCL has four spaces: read-only, static, dynamic-0, and dynamic-1. Read-only space contains objects that the system never modifies, moves, or reclaims. Static space contains some global objects necessary for the system's runtime or performance (they are located at known, fixed addresses), and the system never moves or reclaims these. However, GC does need to scan static space for references to moved objects. Dynamic-0 and dynamic-1 are the two heap areas for the stop-and-copy GC algorithm.

The following global objects are at the head of static space:

   NIL
   eval::*top-of-stack*
   lisp::*current-catch-block*
   lisp::*current-unwind-protect*
   FLAGS (RT only)
   BSP (RT only)
   HEAP (RT only)

In addition to the above spaces, the system has a control stack, binding stack, and a number stack. The binding stack contains pairs of descriptors, a symbol and its previous value. The number stack is the same as the C stack, and the system uses it for non-Lisp objects such as raw system pointers, saving non-Lisp registers, parts of bignum computations, etc.
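
The binding-stack discipline can be modeled in a few lines of portable Common Lisp. This is a conceptual sketch only (the variable and function names are invented for the example); the real binding stack is managed by compiled code and the runtime, not by Lisp-level pushes:

    ;; Shallow binding with an explicit binding stack: each binding pushes
    ;; the symbol and its previous value; unbinding pops the pair and
    ;; restores the old value.
    (defvar *model-binding-stack* '())

    (defun model-bind (symbol new-value)
      (push (cons symbol (symbol-value symbol)) *model-binding-stack*)
      (setf (symbol-value symbol) new-value))

    (defun model-unbind ()
      (destructuring-bind (symbol . old-value) (pop *model-binding-stack*)
        (setf (symbol-value symbol) old-value)))

    ;; (defvar *x* 1)
    ;; (model-bind '*x* 2)   ; *x* is now 2
    ;; (model-unbind)        ; *x* is back to 1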

System Pointers

System pointers reference raw allocated memory, data returned by foreign function calls, etc. The system uses these when you need a pointer to a non-Lisp block of memory, using an other-pointer. This provides the greatest flexibility by relieving the constraints that more direct references would impose, since those would require descriptor type tags.

A system area pointer data-block has the following format:

-------------------------------------------------------
|     1 (data-block words)        | SAP Type (8 bits) |
-------------------------------------------------------
|             system area pointer                     |
-------------------------------------------------------

"SAP" means "system area pointer", and much of our code contains this naming scheme. We don't currently restrict system pointers to one area of memory, but if they do point onto the heap, it is up to the user to prevent being screwed by GC or whatever.

Memory Management

Stacks and Globals

Heap Layout

Garbage Collection

Interface to C and Assembler

Linkage Table

The linkage table feature is modeled on the way dynamic libraries dispatch: a table of entries is filled in with code that jumps to the correct address.

For CMUCL, this table is stored at target-foreign-linkage-space-start. Each entry is target-foreign-linkage-entry-size bytes long.

At startup, the table is initialized with default values in os_foreign_linkage_init. On x86 platforms, the first entry is code to call the routine resolve_linkage_tramp. All other entries jump to the first entry. The function resolve_linkage_tramp looks at where it was called from to figure out which entry in the table was used. It calls lazy_resolve_linkage with the address of the linkage entry. This routine then fills in the appropriate linkage entry with code to jump to where the real routine is located, and returns the address of the entry. On return, resolve_linkage_tramp then just jumps to the returned address to call the desired function. On all subsequent calls, the entry no longer points to resolve_linkage_tramp but to the real function.

This describes how function calls are made. For foreign data, lazy_resolve_linkage stuffs the address of the actual foreign data into the linkage table. The lisp code then just loads the address from there to get the actual address of the foreign data.

For sparc, the linkage table is slightly different. The first entry is the entry for call_into_c so we never have to look this up. All other entries are for resolve_linkage_tramp. This has the advantage that resolve_linkage_tramp can be much simpler since all calls to foreign code go through call_into_c anyway, and that means all live Lisp registers have already been saved. Also, to make life simpler, we lie about closure_tramp and undefined_tramp in the Lisp code. These are really functions, but we treat them as foreign data since these two routines are only used as addresses in the Lisp code to stuff into a lisp function header.

On the Lisp side, there are two supporting data structures for the linkage table: *linkage-table-data* and *foreign-linkage-symbols*. The latter is a hash table whose key is the foreign symbol (a string) and whose value is an index into *linkage-table-data*.

*linkage-table-data* is a vector with an unlispy layout. Each entry has 3 parts:

Whenever a new foreign symbol is defined, a new *linkage-table-data* entry is created. *foreign-linkage-symbols* is updated with the symbol and the entry number into *linkage-table-data*.

The *linkage-table-data* is accessed from C (hence the unlispy layout), to figure out the symbol name and the type so that the address of the symbol can be determined. The type tells the C code how to fill in the entry in the linkage-table itself.
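
A self-contained model of these two structures and the lazy-resolution step, in portable Common Lisp. The names and the three-part entry layout here are simplifications for illustration, not the actual CMUCL code:

    ;; A hash table maps foreign symbol names to indices; a vector holds one
    ;; entry per foreign symbol (here: name, type, resolved address).
    (defvar *model-linkage-symbols* (make-hash-table :test #'equal))
    (defvar *model-linkage-data*
      (make-array 0 :adjustable t :fill-pointer 0))

    (defun model-ensure-linkage (name type)
      "Return the linkage index for NAME, creating an entry if necessary."
      (or (gethash name *model-linkage-symbols*)
          (let ((index (fill-pointer *model-linkage-data*)))
            (vector-push-extend (list name type nil) *model-linkage-data*)
            (setf (gethash name *model-linkage-symbols*) index))))

    (defun model-resolve-linkage (index lookup-fn)
      "Fill in the entry's address on first use, mimicking the lazy
    resolution done by resolve_linkage_tramp / lazy_resolve_linkage."
      (let ((entry (aref *model-linkage-data* index)))
        (or (third entry)
            (setf (third entry) (funcall lookup-fn (first entry))))))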

Low-level debugging

Core File Format

Fasload File Format

General

The purpose of Fasload files is to allow concise storage and rapid loading of Lisp data, particularly function definitions. The intent is that loading a Fasload file has the same effect as loading the source file from which the Fasload file was compiled, but accomplishes the tasks more efficiently. One noticeable difference, of course, is that function definitions may be in compiled form rather than S-expression form. Another is that Fasload files may specify in what parts of memory the Lisp data should be allocated. For example, constant lists used by compiled code may be regarded as read-only.

In some Lisp implementations, Fasload file formats are designed to allow sharing of code parts of the file, possibly by direct mapping of pages of the file into the address space of a process. This technique produces great performance improvements in a paged time-sharing system. Since the Mach project is to produce a distributed personal-computer network system rather than a time-sharing system, efficiencies of this type are explicitly not a goal for the CMU Common Lisp Fasload file format.

On the other hand, CMU Common Lisp is intended to be portable, as it will eventually run on a variety of machines. Therefore an explicit goal is that Fasload files shall be transportable among various implementations, to permit efficient distribution of programs in compiled form. The representations of data objects in Fasload files shall be relatively independent of such considerations as word length, number of type bits, and so on. If two implementations interpret the same macrocode (compiled code format), then Fasload files should be completely compatible. If they do not, then files not containing compiled code (so-called "Fasdump" data files) should still be compatible. While this may lead to a format which is not maximally efficient for a particular implementation, the sacrifice of a small amount of performance is deemed a worthwhile price to pay to achieve portability.

The primary assumption about data format compatibility is that all implementations can support I/O on finite streams of eight-bit bytes. By "finite" we mean that a definite end-of-file point can be detected irrespective of the content of the data stream. A Fasload file will be regarded as such a byte stream.

Strategy

A Fasload file may be regarded as a human-readable prefix followed by code in a funny little language. When interpreted, this code will cause the construction of the encoded data structures. The virtual machine which interprets this code has a stack and a table, both initially empty. The table may be thought of as an expandable register file; it is used to remember quantities which are needed more than once. The elements of both the stack and the table are Lisp data objects. Operators of the funny language may take as operands following bytes of the data stream, or items popped from the stack. Results may be pushed back onto the stack or pushed onto the table. The table is an indexable stack that is never popped; it is indexed relative to the base, not the top, so that an item once pushed always has the same index.
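
A minimal sketch of this virtual machine in portable Common Lisp, with a tiny, hypothetical opcode set (the real operations are described in the next section): operators pop operands from the stack, and the table is pushed but never popped.

    ;; Model of the Fasload VM: a stack, a table, and a dispatch loop.
    (defun run-fasl-ops (ops)
      (let ((stack '())
            (table (make-array 0 :adjustable t :fill-pointer 0)))
        (dolist (op ops (values stack table))
          (ecase (first op)
            (:push-constant   (push (second op) stack))
            (:pop-to-table    (vector-push-extend (pop stack) table))
            (:push-from-table (push (aref table (second op)) stack))))))

    ;; (run-fasl-ops '((:push-constant 10) (:pop-to-table)
    ;;                 (:push-from-table 0) (:push-from-table 0)))
    ;;   =>  (10 10), #(10)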

More precisely, a Fasload file has the following macroscopic organization. It is a sequence of zero or more groups concatenated together. End-of-file must occur at the end of the last group. Each group begins with a series of seven-bit ASCII characters terminated by one or more bytes of all ones #xFF; this is called the header. Following the bytes which terminate the header is the body, a stream of bytes in the funny binary language. The body of necessity begins with a byte other than #xFF. The body is terminated by the operation FOP-END-GROUP.

The first nine characters of the header must be FASL FILE in upper-case letters. The rest may be any ASCII text, but by convention it is formatted in a certain way. The header is divided into lines, which are grouped into paragraphs. A paragraph begins with a line which does not begin with a space or tab character, and contains all lines up to, but not including, the next such line. The first word of a paragraph, defined to be all characters up to but not including the first space, tab, or end-of-line character, is the name of the paragraph. A Fasload file header might look something like this:

FASL FILE >SteelesPerq>User>Guy>IoHacks>Pretty-Print.Slisp
Package Pretty-Print
Compiled 31-Mar-1988 09:01:32 by some random luser
Compiler Version 1.6, Lisp Version 3.0.
Functions: INITIALIZE DRIVER HACK HACK1 MUNGE MUNGE1 GAZORCH
	   MINGLE MUDDLE PERTURB OVERDRIVE GOBBLE-KEYBOARD
	   FRY-USER DROP-DEAD HELP CLEAR-MICROCODE
	    %AOS-TRIANGLE %HARASS-READTABLE-MAYBE
Macros:    PUSH POP FROB TWIDDLE
one or more bytes of #xFF

The particular paragraph names and contents shown here are only intended as suggestions.

Fasload Language

Each operation in the binary Fasload language is an eight-bit (one-byte) opcode. Each has a name beginning with "FOP-". In the following descriptions, the name is followed by operand descriptors. Each descriptor denotes operands that follow the opcode in the input stream. A quantity in parentheses indicates the number of bytes of data from the stream making up the operand. Operands which implicitly come from the stack are noted in the text. A trailing "stack" in an operation's heading means that the result is pushed onto the stack; a trailing "table" similarly means that the result is added to the table. A construction like "n(1) value(n)" means that first a single byte n is read from the input stream, and this byte specifies how many bytes to read as the operand named value. All numeric values are unsigned binary integers unless otherwise specified. Values described as "signed" are in two's-complement form unless otherwise specified. When an integer read from the stream occupies more than one byte, the first byte read is the least significant byte, and the last byte read is the most significant (and contains the sign bit as its high-order bit if the entire integer is signed).
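
The operand-encoding rules above can be made concrete with a small portable sketch (the function name is illustrative, not the loader's actual reader):

    ;; Read an N-byte little-endian integer from a binary stream, optionally
    ;; interpreting it as a signed two's-complement value.
    (defun read-fasl-operand (stream n &optional signed)
      (let ((value 0))
        (dotimes (i n)
          (setf (ldb (byte 8 (* 8 i)) value) (read-byte stream)))
        (if (and signed (logbitp (1- (* 8 n)) value))
            (- value (ash 1 (* 8 n)))   ; sign-extend from the last byte read
            value)))

    ;; Bytes #x34 #x12 as a 2-byte unsigned operand  =>  #x1234
    ;; Bytes #xFF #xFF as a 2-byte signed operand    =>  -1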

Some of the operations are not necessary, but are rather special cases of or combinations of others. These are included to reduce the size of the file or to speed up important cases. As an example, nearly all strings are less than 256 bytes long, and so a special form of string operation might take a one-byte length rather than a four-byte length. As another example, some implementations may choose to store bits in an array in a left-to-right format within each word, rather than right-to-left. The Fasload file format may support both formats, with one being significantly more efficient than the other for a given implementation. The compiler for any implementation may generate the more efficient form for that implementation, and yet compatibility can be maintained by requiring all implementations to support both formats in Fasload files.

Measurements are to be made to determine which operation codes are worthwhile; little-used operations may be discarded and new ones added. After a point the definition will be "frozen", meaning that existing operations may not be deleted (though new ones may be added; some operations codes will be reserved for that purpose).

0:
FOP-NOP
No operation. (This is included because it is recognized that some implementations may benefit from alignment of operands to some operations, for example to 32-bit boundaries. This operation can be used to pad the instruction stream to a desired boundary.)

1:
FOP-POP table
One item is popped from the stack and added to the table.

2:
FOP-PUSH index(4) stack
Item number index of the table is pushed onto the stack. The first element of the table is item number zero.

3:
FOP-BYTE-PUSH index(1) stack
Item number index of the table is pushed onto the stack. The first element of the table is item number zero.

4:
FOP-EMPTY-LIST stack
The empty list (()) is pushed onto the stack.

5:
FOP-TRUTH stack
The standard truth value (T) is pushed onto the stack.

6:
FOP-SYMBOL-SAVE n(4) name(n) stack & table
The four-byte operand n specifies the length of the print name of a symbol. The name follows, one character per byte, with the first byte of the print name being the first read. The name is interned in the default package, and the resulting symbol is both pushed onto the stack and added to the table.

7:
FOP-SMALL-SYMBOL-SAVE n(1) name(n) stack & table
The one-byte operand n specifies the length of the print name of a symbol. The name follows, one character per byte, with the first byte of the print name being the first read. The name is interned in the default package, and the resulting symbol is both pushed onto the stack and added to the table.

8:
FOP-SYMBOL-IN-PACKAGE-SAVE index(4) n(4) name(n) stack & table
The four-byte index specifies a package stored in the table. The four-byte operand n specifies the length of the print name of a symbol. The name follows, one character per byte, with the first byte of the print name being the first read. The name is interned in the specified package, and the resulting symbol is both pushed onto the stack and added to the table.

9:
FOP-SMALL-SYMBOL-IN-PACKAGE-SAVE index(4) n(1) name(n) stack & table
The four-byte index specifies a package stored in the table. The one-byte operand n specifies the length of the print name of a symbol. The name follows, one character per byte, with the first byte of the print name being the first read. The name is interned in the specified package, and the resulting symbol is both pushed onto the stack and added to the table.

10:
FOP-SYMBOL-IN-BYTE-PACKAGE-SAVE index(1) n(4) name(n) stack & table
The one-byte index specifies a package stored in the table. The four-byte operand n specifies the length of the print name of a symbol. The name follows, one character per byte, with the first byte of the print name being the first read. The name is interned in the specified package, and the resulting symbol is both pushed onto the stack and added to the table.

11:
FOP-SMALL-SYMBOL-IN-BYTE-PACKAGE-SAVE index(1) n(1) name(n) stack & table
The one-byte index specifies a package stored in the table. The one-byte operand n specifies the length of the print name of a symbol. The name follows, one character per byte, with the first byte of the print name being the first read. The name is interned in the specified package, and the resulting symbol is both pushed onto the stack and added to the table.

12:
FOP-UNINTERNED-SYMBOL-SAVE n(4) name(n) stack & table
Like FOP-SYMBOL-SAVE, except that it creates an uninterned symbol.

13:
FOP-UNINTERNED-SMALL-SYMBOL-SAVE n(1) name(n) stack & table
Like FOP-SMALL-SYMBOL-SAVE, except that it creates an uninterned symbol.

14:
FOP-PACKAGE table
An item is popped from the stack; it must be a symbol. The package of that name is located and pushed onto the table.

15:
FOP-LIST length(1) stack
The unsigned operand length specifies a number of operands to be popped from the stack. These are made into a list of that length, and the list is pushed onto the stack. The first item popped from the stack becomes the last element of the list, and so on. Hence an iterative loop can start with the empty list and perform "pop an item and cons it onto the list" length times. (Lists of length greater than 255 can be made by using FOP-LIST* repeatedly.)

16:
FOP-LIST* length(1) stack
This is like FOP-LIST except that the constructed list is terminated not by () (the empty list), but by an item popped from the stack before any others are. Therefore length+1 items are popped in all. Hence an iterative loop can start with a popped item and perform "pop an item and cons it onto the list" length+1 times.
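
The "pop an item and cons it onto the list" loop for FOP-LIST and FOP-LIST* can be sketched directly in portable Common Lisp, operating on an explicit list used as the stack (its car is the top):

    ;; Returns the new stack with the constructed list pushed onto it.
    ;; For FOP-LIST* (TERMINATED-P true) the terminator is popped first;
    ;; for FOP-LIST the terminator is ().
    (defun do-fop-list (stack length &optional terminated-p)
      (let ((result (if terminated-p (pop stack) '())))
        (dotimes (i length)
          (setf result (cons (pop stack) result)))
        (cons result stack)))

    ;; (do-fop-list '(3 2 1) 3)      =>  ((1 2 3))
    ;; (do-fop-list '(x 3 2 1) 3 t)  =>  ((1 2 3 . X))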

17-24:
FOP-LIST-1, FOP-LIST-2, ..., FOP-LIST-8
FOP-LIST-k is like FOP-LIST with a byte containing k following it. These exist purely to reduce the size of Fasload files. Measurements need to be made to determine the useful values of k.

25-32:
FOP-LIST*-1, FOP-LIST*-2, ..., FOP-LIST*-8
FOP-LIST*-k is like FOP-LIST* with a byte containing k following it. These exist purely to reduce the size of Fasload files. Measurements need to be made to determine the useful values of k.

33:
FOP-INTEGER n(4) value(n) stack
A four-byte unsigned operand specifies the number of following bytes. These bytes define the value of a signed integer in two's-complement form. The first byte of the value is the least significant byte.

34:
FOP-SMALL-INTEGER n(1) value(n) stack
A one-byte unsigned operand specifies the number of following bytes. These bytes define the value of a signed integer in two's-complement form. The first byte of the value is the least significant byte.

35:
FOP-WORD-INTEGER value(4) stack
A four-byte signed integer (in the range -2^31 to 2^31 - 1) follows the operation code. A LISP integer (fixnum or bignum) with that value is constructed and pushed onto the stack.

36:
FOP-BYTE-INTEGER value(1) stack
A one-byte signed integer (in the range -128 to 127) follows the operation code. A LISP integer (fixnum or bignum) with that value is constructed and pushed onto the stack.

37:
FOP-STRING n(4) name(n) stack
The four-byte operand n specifies the length of a string to construct. The characters of the string follow, one per byte. The constructed string is pushed onto the stack.

38:
FOP-SMALL-STRING n(1) name(n) stack
The one-byte operand n specifies the length of a string to construct. The characters of the string follow, one per byte. The constructed string is pushed onto the stack.

39:
FOP-VECTOR n(4) stack
The four-byte operand n specifies the length of a vector of LISP objects to construct. The elements of the vector are popped off the stack; the first one popped becomes the last element of the vector. The constructed vector is pushed onto the stack.

40:
FOP-SMALL-VECTOR n(1) stack
The one-byte operand n specifies the length of a vector of LISP objects to construct. The elements of the vector are popped off the stack; the first one popped becomes the last element of the vector. The constructed vector is pushed onto the stack.

41:
FOP-UNIFORM-VECTOR n(4) stack
The four-byte operand n specifies the length of a vector of LISP objects to construct. A single item is popped from the stack and used to initialize all elements of the vector. The constructed vector is pushed onto the stack.

42:
FOP-SMALL-UNIFORM-VECTOR n(1) stack
The one-byte operand n specifies the length of a vector of LISP objects to construct. A single item is popped from the stack and used to initialize all elements of the vector. The constructed vector is pushed onto the stack.

43:
FOP-INT-VECTOR len(4) size(1) data( ) stack
The four-byte operand len specifies the length of a vector of unsigned integers to be constructed. Each integer is size bits long, and is packed according to the machine's native byte ordering. size must be a directly supported i-vector element size; the currently supported values are 1, 2, 4, 8, 16, and 32.

44:
FOP-UNIFORM-INT-VECTOR n(4) size(1) value(ceiling(size/8)) stack
The four-byte operand n specifies the length of a vector of unsigned integers to construct. Each integer is size bits long, and is initialized to the value of the operand value. The constructed vector is pushed onto the stack.

45:
Unused

46:
FOP-SINGLE-FLOAT data(4) stack
The data bytes are read as an integer, then turned into an IEEE single float (as though by make-single-float).

47:
FOP-DOUBLE-FLOAT data(8) stack
The data bytes are read as an integer, then turned into an IEEE double float (as though by make-double-float).

48:
FOP-STRUCT n(4) stack
The four-byte operand n specifies the length of the structure to construct. The elements of the structure are popped off the stack; the first one popped becomes the last element of the structure. The constructed structure is pushed onto the stack.

49:
FOP-SMALL-STRUCT n(1) stack
The one-byte operand n specifies the length of the structure to construct. The elements of the structure are popped off the stack; the first one popped becomes the last element of the structure. The constructed structure is pushed onto the stack.

50-52:
Unused

53:
FOP-EVAL stack
Pop an item from the stack and evaluate it (give it to EVAL). Push the result back onto the stack.

54:
FOP-EVAL-FOR-EFFECT
Pop an item from the stack and evaluate it (give it to EVAL). The result is ignored.

55:
FOP-FUNCALL nargs(1) stack
Pop nargs+1 items from the stack and apply the last one popped as a function to all the rest as arguments (the first one popped being the last argument). Push the result back onto the stack.

56:
FOP-FUNCALL-FOR-EFFECT nargs(1)
Pop nargs+1 items from the stack and apply the last one popped as a function to all the rest as arguments (the first one popped being the last argument). The result is ignored.

57:
FOP-CODE-FORMAT implementation(1) version(1)
This FOP specifies the code format for following code objects. FOP-CODE and its relatives may not occur in a group until after FOP-CODE-FORMAT has appeared; there is no default format. The implementation is an integer indicating the target hardware and environment; see compiler/generic/vm-macs.lisp for the currently defined implementations. The version for an implementation is increased whenever there is a change that renders old fasl files unusable.

58:
FOP-CODE nitems(4) size(4) code(size) stack
A compiled function is constructed and pushed onto the stack. This object is in the format specified by the most recent occurrence of FOP-CODE-FORMAT. The operand nitems specifies a number of items to pop off the stack to use in the "boxed storage" section. The operand code is a string of bytes constituting the compiled executable code.

59:
FOP-SMALL-CODE nitems(1) size(2) code(size) stack
A compiled function is constructed and pushed onto the stack. This object is in the format specified by the most recent occurrence of FOP-CODE-FORMAT. The operand nitems specifies a number of items to pop off the stack to use in the "boxed storage" section. The operand code is a string of bytes constituting the compiled executable code.

60-61:
Unused

62:
FOP-VERIFY-TABLE-SIZE size(4)
If the current size of the table is not equal to size, then an inconsistency has been detected. This operation is inserted into a Fasload file purely for error-checking purposes. It is good practice for a compiler to output this at least at the end of every group, if not more often.

63:
FOP-VERIFY-EMPTY-STACK
If the stack is not currently empty, then an inconsistency has been detected. This operation is inserted into a Fasload file purely for error-checking purposes. It is good practice for a compiler to output this at least at the end of every group, if not more often.

64:
FOP-END-GROUP
This is the last operation of a group. If this is not the last byte of the file, then a new group follows; the next nine bytes must be "FASL FILE".

65:
FOP-POP-FOR-EFFECT stack
One item is popped from the stack.

66:
FOP-MISC-TRAP stack
A trap object is pushed onto the stack.

67:
Unused

68:
FOP-CHARACTER character(3) stack
The three bytes are read as an integer then converted to a character. This FOP is currently rather useless, as extended characters are not supported.

69:
FOP-SHORT-CHARACTER character(1) stack
The one byte specifies the code of a Common Lisp character object. A character is constructed and pushed onto the stack.

70:
FOP-RATIO stack
Creates a ratio from two integers popped from the stack. The denominator is popped first, the numerator second.

71:
FOP-COMPLEX stack
Creates a complex number from two numbers popped from the stack. The imaginary part is popped first, the real part second.

72-73:
Unused

74:
FOP-FSET
Except in the cold loader (Genesis), this is a no-op with two stack arguments. In the initial core, this is used to install functions defined with DEFUN at cold-load time, so that global functions can be called before the top-level forms that would normally install their definitions are run. Genesis pops the top two things off of the stack and effectively does (SETF SYMBOL-FUNCTION).

75:
FOP-LISP-SYMBOL-SAVE n(4) name(n) stack & table
Like FOP-SYMBOL-SAVE, except that it creates a symbol in the LISP package.

76:
FOP-LISP-SMALL-SYMBOL-SAVE n(1) name(n) stack & table
Like FOP-SMALL-SYMBOL-SAVE, except that it creates a symbol in the LISP package.

77:
FOP-KEYWORD-SYMBOL-SAVE n(4) name(n) stack & table
Like FOP-SYMBOL-SAVE, except that it creates a symbol in the KEYWORD package.

78:
FOP-KEYWORD-SMALL-SYMBOL-SAVE n(1) name(n) stack & table
Like FOP-SMALL-SYMBOL-SAVE, except that it creates a symbol in the KEYWORD package.

79-80:
Unused

81:
FOP-NORMAL-LOAD
This FOP is used in conjunction with the cold loader (Genesis) to read top-level package manipulation forms. These forms are to be read as though by the normal loader, so that they can be evaluated at cold load time, instead of being dumped into the initial core image. A no-op in normal loading.

82:
FOP-MAYBE-COLD-LOAD
Undoes the effect of FOP-NORMAL-LOAD.

83:
FOP-ARRAY rank(4) stack
This operation creates a simple array header (used for simple-arrays with rank /= 1). The data vector is popped off of the stack, and then rank dimensions are popped off of the stack (the highest dimension is on top.)

84-139:
Unused

140:
FOP-ALTER-CODE index(4)
This operation modifies the constants part of a code object (necessary for creating certain circular function references.) It pops the new value and the code object off of the stack, storing the new value at the specified index.

141:
FOP-BYTE-ALTER-CODE index(1)
Like FOP-ALTER-CODE, but takes a one-byte index.

142:
FOP-FUNCTION-ENTRY index(4) stack
Initializes a function-entry header inside of a pre-existing code object, and returns the corresponding function descriptor. index is the byte offset inside of the code object where the header should be plunked down. The stack arguments to this operation are the code object, function name, function debug arglist and function type.

143:
Unused

144:
FOP-ASSEMBLER-CODE length(4) stack
This operation creates a code object holding assembly routines. length bytes of code are read and placed in the code object, and the code object descriptor is pushed on the stack. This FOP is only recognized by the cold loader (Genesis.)

145:
FOP-ASSEMBLER-ROUTINE offset(4) stack
This operation records an entry point into an assembler code object (for use with FOP-ASSEMBLER-FIXUP). The routine name (a symbol) is on stack top. The code object is underneath. The entry point is defined at offset bytes inside the code area of the code object, and the code object is left on stack top (allowing multiple uses of this FOP to be chained.) This FOP is only recognized by the cold loader (Genesis.)

146:
Unused

147:
FOP-FOREIGN-FIXUP len(1) name(len) offset(4) stack
This operation resolves a reference to a foreign (C) symbol. len bytes are read and interpreted as the symbol name. First the kind and the code-object to patch are popped from the stack. The kind is a target-dependent symbol indicating the instruction format of the patch target (at offset bytes from the start of the code area.) The code object is left on stack top (allowing multiple uses of this FOP to be chained.)

148:
FOP-ASSEMBLER-FIXUP offset(4) stack
This operation resolves a reference to an assembler routine. The stack args are (routine-name, kind and code-object). The kind is a target-dependent symbol indicating the instruction format of the patch target (at offset bytes from the start of the code area.) The code object is left on stack top (allowing multiple uses of this FOP to be chained.)

149-199:
Unused

200:
FOP-RPLACA table-idx(4) cdr-offset(4)

201:
FOP-RPLACD table-idx(4) cdr-offset(4)
These operations destructively modify a list entered in the table. table-idx is the table entry holding the list, and cdr-offset designates the cons in the list to modify (like the argument to nthcdr.) The new value is popped off of the stack, and stored in the car or cdr, respectively.
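
Ignoring the table and stack plumbing, the effect of these two operations in ordinary Common Lisp terms is simply:

    ;; LIST is the list stored at the table entry; NEW-VALUE is the popped
    ;; value; CDR-OFFSET selects the cons to modify.
    (defun model-fop-rplaca (list cdr-offset new-value)
      (setf (car (nthcdr cdr-offset list)) new-value)
      list)

    (defun model-fop-rplacd (list cdr-offset new-value)
      (setf (cdr (nthcdr cdr-offset list)) new-value)
      list)

    ;; (model-fop-rplaca (list 'a 'b 'c) 1 'x)  =>  (A X C)
    ;; (model-fop-rplacd (list 'a 'b 'c) 1 'x)  =>  (A B . X)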

202:
FOP-SVSET table-idx(4) vector-idx(4)
Destructively modifies a simple-vector entered in the table. Pops the new value off of the stack, and stores it in the vector-idx element of the contents of the table entry table-idx.

203:
FOP-NTHCDR cdr-offset(4) stack
Performs an nthcdr on the top of the stack, leaving the result there.

204:
FOP-STRUCTSET table-idx(4) vector-idx(4)
Like FOP-SVSET, except it alters structure slots.

255:
FOP-END-HEADER
Indicates the end of a group header, as described above.

Glossary

assert (a type)
In Python, all type checking is done via a general type assertion mechanism. Explicit declarations and implicit assertions (e.g. that the argument to + is a number) are recorded in the front-end (implicit continuation) representation. Type assertions (and thus type-checking) are "unbundled" from the operations that are affected by the assertion; this has two major advantages. See also restrict.

back end
The back end is the part of the compiler that operates on the virtual machine intermediate representation. Also included are the compiler phases involved in the conversion from the front end representation (or ICR).

bind node
This is the node type that marks the start of a lambda body in ICR. It serves as a placeholder for environment manipulation code.

IR1
The first intermediate representation, also known as ICR, or the Implicit Continuation Representation.

IR2
The second intermediate representation, also known as VMR, or the Virtual Machine Representation.

basic block
A basic block (or simply "block") has pretty much the usual meaning of representing a straight-line sequence of code. However, the code sequence ultimately generated for a block might contain internal branches that were hidden inside the implementation of a particular operation. The type of a block is actually cblock. The block-info slot holds a VMR-block containing back-end information.

block compilation
Block compilation is a term commonly used to describe the compile-time resolution of function names. This enables many optimizations.

call graph
Each node in the call graph is a function (represented by a flow graph.) The arcs in the call graph represent a possible call from one function to another. See also tail set.

cleanup
A cleanup is the part of the implicit continuation representation that retains information about scoping relationships. For indefinite extent bindings (variables and functions), we can abandon scoping information after ICR conversion, recovering the lifetime information using flow analysis. But dynamic bindings (special values, catch, unwind protect, etc.) must be removed at a precise time (whenever the scope is exited.) Cleanup structures form a hierarchy that represents the static nesting of dynamic binding structures. When the compiler does a control transfer, it can use the cleanup information to determine what cleanup code needs to be emitted.

closure variable
A closure variable is any lexical variable that has references outside of its home environment. See also indirect value cell.

closed continuation
A closed continuation represents a tagbody tag or block name that is closed over. These two cases are mostly indistinguishable in ICR.

home
Home is a term used to describe various back-pointers. A lambda variable's "home" is the lambda that the variable belongs to. A lambda's "home environment" is the environment in which that lambda's variables are allocated.

indirect value cell
Any closure variable that has assignments (setqs) will be allocated in an indirect value cell. This is necessary to ensure that all references to the variable will see assigned values, since the compiler normally freely copies values when creating a closure.
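
The need for the indirection can be seen with a short portable example: both closures below must observe assignments to the same variable, so its value cannot simply be copied into each closure; conceptually it lives in one shared cell.

    (defun make-counter ()
      (let ((count 0))
        (values (lambda () (incf count))   ; assigns the closed-over variable
                (lambda () count))))       ; must see those assignments

    ;; (multiple-value-bind (bump peek) (make-counter)
    ;;   (funcall bump) (funcall bump)
    ;;   (funcall peek))   =>  2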

set variable
Any variable that is assigned to is called a "set variable". Several optimizations must special-case set variables, and set closure variables must have an indirect value cell.

code generator
The code generator for a VOP is a potentially arbitrary Lisp code fragment which is responsible for emitting assembly code to implement that VOP.

constant pool
The part of a compiled code object that holds pointers to non-immediate constants.

constant TN
A constant TN is the VMR of a compile-time constant value. A constant may be immediate, or may be allocated in the constant pool.

constant leaf
A constant leaf is the ICR of a compile-time constant value.

combination
A combination node is the ICR of any fixed-argument function call (not apply or multiple-value-call.)

top-level component
A top-level component is any component whose only entry points are top-level lambdas.

top-level lambda
A top-level lambda represents the execution of the outermost form on which the compiler was invoked. In the case of compile-file, this is often a truly top-level form in the source file, but the compiler can recursively descend into some forms (eval-when, etc.) breaking them into separate compilations.

component
A component is basically a sequence of blocks. Each component is compiled into a separate code object. With block compilation or local functions, a component will contain the code for more than one function. This is called a component because it represents a connected portion of the call graph. Normally the blocks are in depth-first order (DFO).

component, initial
During ICR conversion, blocks are temporarily assigned to initial components. The "flow graph canonicalization" phase determines the true component structure.

component, head and tail
The head and tail of a component are dummy blocks that mark the start and end of the DFO sequence. The component head and tail double as the root and finish node of the component's flow graph.

local function (call)
A local function call is a call to a function known at compile time to be in the same component. Local call allows compile time resolution of the target address and calling conventions. See block compilation.

conflict (of TNs, set)
Register allocation terminology. Two TNs conflict if they could ever be live simultaneously. The conflict set of a TN is all TNs that it conflicts with.

continuation
The ICR data structure which represents both the receiver of a node's value(s) and the place where control proceeds next (see cont and dest). In the Implicit Continuation Representation, the environment is implicit in the continuation's BLOCK (hence the name.) The ICR continuation is very similar to a CPS continuation in its use, but its representation doesn't much resemble (and is not interchangeable with) a lambda.

cont
A slot in the node holding the continuation which receives the node's value(s). Unless the node ends a block, this also implicitly indicates which node should be evaluated next.

cost
Approximations of the run-time costs of operations are widely used in the back end. By convention, the unit is generally machine cycles, but the values are only used for comparison between alternatives. For example, the VOP cost is used to determine the preferred order in which to try possible implementations.

CSP, CFP
See control stack pointer and control frame pointer.

Control stack
The main call stack, which holds function stack frames. All words on the control stack are tagged descriptors. In all ports done so far, the control stack grows from low memory to high memory. The most recent call frames are considered to be ``on top'' of earlier call frames.

Control stack pointer
The allocation pointer for the control stack. Generally this points to the first free word at the top of the stack.

Control frame pointer
The pointer to the base of the control stack frame for a particular function invocation. The CFP for the running function must be in a register.

Number stack
The auxiliary stack used to hold any non-descriptor (untagged) objects. This is generally the same as the C call stack, and thus typically grows down.

Number stack pointer
The allocation pointer for the number stack. This is typically the C stack pointer, and is thus kept in a register.

NSP, NFP
See number stack pointer, number frame pointer.

Number frame pointer
The pointer to the base of the number stack frame for a particular function invocation. Functions that don't use the number stack won't have an NFP, but if an NFP is allocated, it is always allocated in a particular register. If there is no variable-size data on the number stack, then the NFP will generally be identical to the NSP.

Lisp return address
The name of the descriptor encoding the "return pc" for a function call.

LRA
See lisp return address. Also, the name of the register where the LRA is passed.

Code pointer
A pointer to the header of a code object. The code pointer for the currently running function is stored in the code register.

Interior pointer
A pointer into the inside of some heap-allocated object. Interior pointers confuse the garbage collector, so their use is highly constrained. Typically there is a single register dedicated to holding interior pointers.

dest
A slot in the continuation which points to the node that receives this value. Null if this value is not received by anyone.

DFN, DFO
See Depth First Number, Depth First Order.

Depth first number
Blocks are numbered according to their appearance in the depth-first ordering (the block-number slot.) The numbering actually increases from the component tail, so earlier blocks have larger numbers.

Depth first order
This is a linearization of the flow graph, obtained by a depth-first walk. Iterative flow analysis algorithms work better when blocks are processed in DFO (or reverse DFO.)

Object
In low-level design discussions, an object is a single-word quantity that either holds immediate data or points to a data-block; these are tagged with three low-tag bits as described in the section on tagging. This is synonymous with descriptor. In other parts of the documentation, the term may be used more loosely to refer to a Lisp object.

Lisp object
A Lisp object is a high-level object discussed as a data type in the Common Lisp definition.

Data-block
A data-block is a dual-word aligned block of memory that either manifests a Lisp object (vectors, code, symbols, etc.) or helps manage a Lisp object on the heap (array header, function header, etc.).

Descriptor
A descriptor is a tagged, single-word object. It either contains immediate data or a pointer to data. This is synonymous with object. Storage locations that must contain descriptors are referred to as descriptor locations.

Pointer descriptor
A descriptor that points to a data block in memory (i.e. not an immediate object.)

Immediate descriptor
A descriptor that encodes the object value in the descriptor itself; used for characters, fixnums, etc.

Word
A word is a 32-bit quantity.

Non-descriptor
Any chunk of bits that isn't a valid tagged descriptor. For example, a double-float on the number stack. Storage locations that are not scanned by the garbage collector (and thus cannot contain pointer descriptors) are called non-descriptor locations. Immediate descriptors can be stored in non-descriptor locations.

Entry point
An entry point is a function that may be subject to ``unpredictable'' control transfers. All entry points are linked to the root of the flow graph (the component head.) The only functions that aren't entry points are let functions. When complex lambda-list syntax is used, multiple entry points may be created for a single lisp-level function. See external entry point.

External entry point
A function that serves as a ``trampoline'' to intercept function calls coming in from outside of the component. The XEP does argument syntax and type checking, and may also translate the arguments and return values for a locally specialized calling convention.

XEP
An external entry point.

lexical environment
A lexical environment is a structure that is used during VMR conversion to represent all lexically scoped bindings (variables, functions, declarations, etc.) Each node is annotated with its lexical environment, primarily for use by the debugger and other user interfaces. This structure is also the environment object passed to macroexpand.

environment
The environment is part of the ICR, created during environment analysis. Environment analysis apportions code to disjoint environments, with all code in the same environment sharing the same stack frame. Each environment has a ``real'' function that allocates it, and some collection of let functions. Although environment analysis is the last ICR phase, in earlier phases code is sometimes said to be ``in the same/different environment(s)''. This means either that the code will definitely be in the same environment (because it is in the same real function), or that it might not be in the same environment, because it is not in the same function.

fixup
Some sort of back-patching annotation. The main sort encountered is the load-time assembler fixup, which is a linkage annotation mechanism.

flow graph
A flow graph is a directed graph of basic blocks, where each arc represents a possible control transfer. The flow graph is the basic data structure used to represent code, and provides direct support for data flow analysis. See component and ICR.

foldable
An attribute of known functions. A function is foldable if calls may be constant folded whenever the arguments are compile-time constant. Generally this means that it is a pure function with no side effects.

The remaining glossary terms exist only as notes, without definitions:

FSC; full call; function attribute; function ``real'' (allocates environment) meaning, function-entry more vague (any lambda?); funny function; GEN (kill and...); global TN, conflicts, preference; GTN (number); IR, ICR, VMR; ICR conversion, VMR conversion (translation); inline expansion, call; kill (to make dead); known function; LAMBDA; leaf; let call; lifetime analysis, live (TN, variable); load TN; LOCS (passing, return locations); local call; local TN, conflicts (or just used in one block); location (selection); LTN (number); main entry; mess-up (for cleanup); more arg (entry); MV; non-local exit; non-packed SC, TN; non-set variable; operand (to VOP); optimizer (in ICR optimize); optional-dispatch; pack, packing, packed; pass (in a transform); passing locations (value), conventions (known, unknown); policy (safe, fast, small, ...); predecessor block; primitive-type; reaching definition; REF; representation selection for value; result continuation (for function); result type assertion (for template) (or is it restriction); restrict: a TN to finite SBs, a template operand to a primitive type (boxed...), a tn-ref to particular SCs; return (node, VOPs); safe, safety; saving (of registers, costs); SB; SC (restriction); semi-inline; side-effect (in ICR, in VMR); sparse set; splitting (of VMR blocks); SSET; SUBPRIMITIVE; successor block; tail recursion, tail recursive, tail recursive loop, user tail recursion; template; TN; TNBIND; TN-REF; transform (source, ICR); type assertion, inference (top-down, bottom-up), assertion propagation, derived, asserted; type descriptor, specifier, intersection, union, member; type check; type-check (in continuation); UNBOXED (boxed) descriptor; unknown values continuation; unset variable; unwind-block, unwinding; used value (dest); value passing; VAR; VM; VOP; XEP

