Compilers Priciples, Techniques, & Tools Notes
Chapter 1 Introduction
1.1 Language Processors
1.1.1 Exercises for Section 1.1
编译器和解释器的区别:
1 |
|
1.2 The Structure of a Compiler
Two parts to this mapping: analysis and synthesis.
The analysis part also collects information about the source program and stores it in a data structure called a symbol table
.
The analysis part is often called the front end of the compiler, the synthesis part is the back end.
The analysis part breaks up the source program into constituent pieces and imposes a grammatical structure on them.
The synthesis part constructs the desired target program.
1.2.1 Lexical Analysis
For each lexeme, the lexical analyzer produces as output a token of the form:
1 |
|
Lexical analysis divides program text into “words” or “tokens”.
1.2.2 Syntax Analysis
A typical representation is a syntax tree in which each interior node represents an operation and the children of the node represent the arguments of the operation.
The tree shows the order in which the operations in the assignment.
1.2.3 Semantic Analysis
An important part of semantic snalysis is type checking, where the compiler checks that each operator has matching operands.
The operator inttofloat, which explicity converts its integer argument into a floating-point number.
1.2.4 Intermediate Code Generation
Syntax trees are a form of intermediate representation.
The intermediate representation should have two important properties:
- it should be easy to produce
- it should be easy to translate into the target machine
1.2.5 Code Optimization
To improve the intermediate code so that better target code will result.
1.2.6 Code Generation
The code generator takes as input an intermediate representation of the source program and maps it into the target language.
The F
in each instruction tells us that it deals with floating-point numbers.
1.2.7 Symbol-Table Management
An essential function of a compiler is to record the variable names used in the source program and collect information about the storage allocated for a name.
The symbol table is a data structure containing a record for each variable name, with field for the attributes of the name.
1.2.8 The Grouping of Phases into Passes
1.2.9 Compiler-Construction Tools
1.3 The Evolution of Programming Languages
1.3.1 The Move to Higher-level Language
Initially, The instructions in an assembly language were just mnemonic representations of machine instructions. Later, macro instructions were added to assembly languages.
First-generation languages are the machine languages.
Second-generation the assembly languages.
Third-generation the higher-level languages.
Fourth-generation are languages designed for special application.
1.3.2 Impact on Compilers
1.4 The Science of Building a Compiler
1.4.2 The Science of Code Optimization
1.5 Applications of Compiler Technology
1.5.1 Implementation of High-Level Programming Languages
Optimizing compilers include techniques to improve the performance of generated code, thus offsetting the inefficiency introduced by high-level abstractions.
The key ideas behind object orientation are:
1. Data abstraction
2. Inheritance of properties
Java has a built-in garbage-collection facility that automatically frees the memory of variable that are no longer in use.
1.5.2 Optimizations for Computer Architectures
Two basic techniques:
- parallelism
- memory hierarchies
1.5.4 Program Translations
1.5.5 Software Productivity Tools
1.6 Programming Language Basics
1.6.1 The Static/Dynamic Distinction
If a language uses a policy that allow the compiler to decide an issue, then we say that the language uses a static policy or that the issue can be decided at compile time. 意思是需要手动设置吗。
The scope of declarations. A language uses static scope or lexical scope if it is possible to determine the scope of a declaration by looking only at the program.
Otherwise, the language uses dynamic scope.
The compiler to determine the location in memory where the declared variable can be found.
1.6.2 Environments and States
The association of names with locations in memory and then with values can be described by two mappings that change as the program runs.
The environment is a mapping from names to locations in the store.
The state is a mapping from locations in store to their values.
1.6.4 Explicit Access Control
Classes and structures introduce a new scope for their members.
Declarations and Definitions
Declarations tell us about the types of things, while definitions tell us about their values.
1.6.6 Parameter Passing Mechanisms
The strict call-by-value requires that the caller copy the entire actual parameter into the soace belonging to the corresponding formal parameter.
Chapter 2 A Simple Symtax-Directed Translator
2.1 Introduction
A notation in which operators appear after their operands.
Two forms of intermediate code are:
- abstract syntax trees
- simply syntax trees
represents the hierarchical syntactic structure of the source program.
Three-address code
The form:
1 |
|
op
is a binary operator, y and z are addressed for operands, and x is the address for the result of the operation.
2.2 Syntax Definition
A grammar naturally describes the hierarchical structure of most programming language constructs.
Such a rule is called a production. In a production, lexical elements like the keyword if
and the parentheses are called terminals. Variable like expr
and stmt
represent sequences of terminals and are called nonterminals.
2.2.3 Parse Trees
A parse tree pictorially shows how the start symble of a grammar derives a string in the language.