Building a Runnable Program

Prev: Scripting Languages Next: Run-time Program Management

15.1-15.3 Check Your Understanding

  1. What is a code generator generator? Why might it be useful?
  2. What is a basic block? A control flow graph?
  3. What are virtual registers? What purpose do they serve?
  4. What is the difference between local and global code improvement?
  5. What is register spilling?
  6. Explain what is meant by the “level” of an intermediate form (IF). What are the comparative advantages and disadvantages of high-, medium-, and low-level IFs?
  7. What is the IF most commonly used in Ada compilers?
  8. Name two advantages of a stack-based IF. Name one disadvantage.
  9. Explain the rationale for basing a family of compilers, several languages and several target machines, on a single IF.
  10. Why might a compiler employ more than one IF?
  11. Outline some of the major design alternatives for back-end compiler organization and structure.
  12. What is sometimes called the “middle end” of a compiler?
  13. Why is management of a limited set of physical registers usually deferred until late in the compilation process?

15.4-15.7 Check Your Understanding

  1. What are the distinguishing characteristics of a relocatable object file? An executable object file?
  2. Why do operating systems typically zero-fill pages used for uninitialized data?
  3. List four tasks commonly performed by an assembler.
  4. Summarize the comparative advantages of assembly language and object code as the output of a compiler.
  5. Give three examples of pseudoinstructions and three examples of directives that an assembler might be likely to provide.
  6. Why might an assembler perform its own final pass of instruction scheduling?
  7. Explain the distinction between absolute and relocatable words in an object file. Why is the notion of relocatability more complicated than it used to be?
  8. What is the difference between linking and loading?
  9. What are the principal tasks of a linker?
  10. How can a linker enforce type checking across compilation units?
  11. What is the motivation for dynamic linking?

15.9 Exercises

  1. If you were writing a two-pass compiler, why might you choose a high-level IF as the link between the front end and the back end? Why might you choose a medium-level IF?
  2. Consider a language like Ada or Modula-2, in which a module M can be divided into a specification, header, file and an implementation, body, file for the purpose of separate compilation. Should M’s specification itself be separately compiled, or should the compiler simply read it in the process of compiling M’s body and the bodies of other modules that use abstractions defined in M? If the specification is compiled, what should the output consist of?
  3. Many research compilers, for example for SR, Cedar, Lynx, and Modula-3, have used C as their IF. C is well documented and mostly machine independent, and C compilers are much more widely available than alternative back ends. What are the disadvantages of generating C, and how might they be overcome?
  4. List as many ways as you can think of in which the back end of a just-in-time compiler might differ from that of a more conventional compiler. What design goals dictate the differences?
  5. Suppose that k, the number of temporary registers in Figure 15.6, is 4. This is an artificially small number for modern machines. Give an example of an expression that will lead to register spilling under our naive register allocation algorithm.
  6. Modify the attribute grammar of Figure 15.6 in such a way that it will generate the control flow graph of Figure 15.3 instead of the linear assembly code of Figure 15.7.
  7. Add productions and attribute rules to the grammar of Figure 15.6 to handle Ada-style for loops, described in Section 6.5.1. Using your modified grammar, hand-translate the syntax tree of Figure 15.10 into pseudo-assembly notation. Keep the index variable and the upper loop bound in registers.
  8. One problem, of many, with the code we generated in Section 15.3 is that it computes at run time the value of expressions that could have been computed at compile time. Modify the grammar of Figure 15.6 to perform a simple form of constant folding: whenever both operands of an operator are compile-time constants, we should compute the value at compile time and then generate code that uses the value directly. Be sure to consider how to handle overflow.
  9. Modify the grammar of Figure 15.6 to generate jump code for Boolean expressions, as described in Section 6.4.1. You should assume short-circuit evaluation.
  10. Our GCD program did not employ subroutines. Extend the grammar of Figure 15.6 to handle procedures without parameters; feel free to adopt any reasonable conventions on the structure of the syntax tree. Be sure to generate appropriate prologue and epilogue code for each subroutine, and to save and restore any needed temporary registers.
  11. The grammar of Figure 15.6 assumes that all variables are global. In the presence of subroutines, we should need to generate different code, with fp-relative displacement mode addressing, to access local variables and parameters. In a language with nested scopes we should need to dereference the static chain, or index into the display, to access objects that are neither local nor global. Suppose that we are compiling a language with nested subroutines, and are using a static chain. Modify the grammar of Figure 15.6 to generate code to access objects correctly, regardless of scope. You may find it useful to define a to_register subroutine that generates the code to load a given object. Be sure to consider both l-values and r-values, and parameters passed by both value and result.
  12. 15.12-15.15 In More Depth.