Brief Contents1 Introduction 12 A Simple Compiler 313 Scanning—Theory and Practice 574 Grammars and Parsing 1135 Top-Down Parsing 1436 Bottom-Up Parsing 1797 Syntax-Directed Translation 2358 Symbol Tables and Declaration Processing 2799 Semantic Analysis 34310 Intermediate Representations Editor-in-Chief: Michael hirschAcquisitions Editor: Matt GoldsteinEditorial assistant: Chelsea bellManaging editor: Jeff holcombDirector of Marketing: Margaret WaplesMarketing Manager: Erin DavisMarketing Coordinator: Kathryn FerrantiMedia Producer: Katelyn BollerSenior Manufacturing Buyer: Carol MelvilleSenior Media Buyer: Ginny MichaudArt Director: Linda KnowlesCover Designer: Elena sidorovaPrinter/Binder: Hamilton Printing CoCover Printer: Lehigh Phoenix HagerstownMany of the designations used by manufacturers and sellers to distinguish their products are claimed astrademarks. Where those designations appear in this book, and Addison-Wesley was aware of atrademark claim, the designations have been printed in initial caps or all capsThe programs and applications presented in this book have been included for their instructional valueThey have been tested with care, but are not guaranteed for any particular purpose. The publisher doesnot offer any warranties or representations, nor does it accept any liabilities with respect to theprograms or applicationsLibrary of Congress Cataloging-in-Publication DataFischer. Charles NCrafting a compiler/Charles N Fischer, Ron K Cytron, Richard J. LeBlanc, Jrp cm.--( Crafting a compiler with C)Includes bibliographicalreferences and indexISBN978-0-13-606705-4(alk. paper)1. Compilers( Computer programs)ICytron, Ron K (Ronald Kaplan), 1958-Il LeBlanc, Richard J(Richard Joseph),1950-Ill TitleQA76.76C65F572009005453-dc222009038265Copyright o 2010 Pearson Education, Inc, publishing as Addison-Wesley. All rights reservedManufactured in the United States of America. This publication is protected by Copyright, andpermission should be obtained from the publisher prior to any prohibited reproduction, storage in aretrieval system, or transmission in any form or by any means, electronic, mechanical, photocopyingrecording, or likewise. To obtain permission(s)to use material from this work, please submit a writtenrequest to Pearson Education, Inc, Permissions Department, 501 Boylston Street, Suite 900, BostonMassachusetts. 0211610987654321-HA-1312111009Addison-WesleyISPEARSONISBN10:0-13-606705-0w. pearsonhighered comISBN13:978-0-13-606705-4PrefaceMuch has changed since Crafting a Compiler, by Fischer and Leblanc, waspublished in 1988. While instructors may remember the 54-inch floppy disk ofsoftware that accompanied that text, most students today have neither seen norheld such a disk. Many changes have occurred in the programming languagesthat students experience in class and in the marketplace. In 1991 the bookwas available in two forms, with algorithms presented in either C or AdaWhile c remains a popular language ada has become relatively obscure anddid not achieve its predicted popularity. The C++ language evolved fromC with the addition of object-oriented features. java M was developed as asimpler object-oriented language, gaining popularity becaof its securityPlacement curriculum moved from Pascal to C++ to ava e Board Advancedand ability to be run within a Web browser. The College board advancedWhile much has changed, students and faculty alike continue to study andteach the subject of compiler construction. Research in the area of compilersand programing language translation continues at a brisk pace, as compilersare tasked with accommodating an increasing diversity of architectures andprogramming languages. Software development environments depend oncompilers interacting successfully with a variety of software toolchain compo-nents such as syntax-informed editors, performance profilers, and debuggersAll modern software efforts rely on their comiilers to check vigorously forerrors and to translate programs faithfullySome texts experience relatively minor changes over time, acquiring per-haps some new exercises or examples. This book reflects a substantive revisionof the material from 1988 and 1991. While the focus of this text remains onteaching the fundamentals of compiler construction, the algorithms and approaches have been brought into modern practiceCoverage of topics that have faded from practical use (e.g,attributegrammars)has been minimized or removed altogetherAlgorithms are presented in a pseudocode style that should be familiar tostudents who have studied the fundamental algorithms of our disciplinePrefacePseudocode enables a concise formulation of an algorithm and a rationaldiscussion of the algorithm's purpose and constructionThe details of implementation in a particular language have been relegated to the Crafting a Compiler Supplement which is available onlinehttp://www.pearsonhighered.com/fischer/Parsing theory and practice are organized to facilitate a variety of pedagogical approachesstudy the material at a high level to gain a broadof todown and bottom-up parsing. Others may study a particular approachter detailThe front- and back-end phases of a compiler are connected by the abtract syntax tree(ast), which is created as theing. Most compilers build an AST, but relatively few texts articulate itsconstruction and useThe visitor pattern is introduced for traversing the ast during semanticanalysis and code generationLaboratory and studio exercises are available to instructorsInstructors can assign some components as exercises for the studentswhile other components are supplied from our course-support Web siteSome texts undergo revision by the addition of more graduate-level materialWhile such information may be useful in an advanced course, the focus ofCrafting a Compiler remains on the undergraduate-level study of compiler con-struction. a graduate course could be offered using Chapters 13 and 14, withthe earlier portions of the text serving as reference materialText and referenceAs a classroom text, this book is oriented toward a curriculum that we havedeveloped over the past 25 years. The book is very flexible and has beenadopted for courses ranging from a three-credit upper-level course taught ina ten-week quarter to a six-credit semester-long graduate course. The textis accessible to any student who has a basic background in programmingalgorithms, and data structures. The text is well suited to a single semester orquarter offering because its flexibility allows an instructor to craft a syllabusaccording to his or her interests. Author-sponsored solutions are available forthose components that are not studied in detail. It is feasible to write portionsof a compiler from parsing to code generation in a single semesterPrefaceThis book is also a valuable professional reference because of its complecoverage of techniques that are of practical importance to compiler construction. Many of our students have reported, even some years after their grad-uation, of their successful application of these techniques to problems theyencounter in their workInstructor resourcesTheWebsiteforthisbookcanbefoundathttp://www.pearsonhigheredcom/fischer/. The material posted for qualified instructors includes samplelaboratory and project assignments, studio(active-learning) sessions, librariesof code that can be used as class-furnished solutions and solutions to selectedexercisesFor access to these materials, qualified instructors should contact theirlocalPearsonRepresentativebyvisitinghttp://www.peaRsonhighered.combysendingemailtocomputing@aw.com,orbyvisitingthePearsonInstructorResourceCenterathttp://www.pearsonhighered.com/irc/.Student resourcesThebooksWebsiteathttp://www.pearsonhighered.com/fischer/containsworking code for examples used throughout the book, including code for thetoy language ac that is introduced in Chapter 2. The site also contains tutorialnotes and a page with links to various compiler-construction toolsaccess to these materials may be guarded by a password that is distributedwith the book or obtained from an instructorProject ApproachThis book offers a comprehensive coverage of relevant theoretical topics incompiler construction. However, a cohesive implementation project is typically an important aspect of planning a curriculum in compiler constructionThus, the book and the online materials are biased in favor of a sequence ofexploratory exercises, culminating in a project, to support learning this mateLab exercises, studio sessions, and course projects appear in the Crafting aCompiler supplement, and readers are invited to send us other materials or linksfor posting at our Web site. The exercises parallel the chapters and progressionof material presented in the text. For example, chapter 2 introduces the toPrefacelanguage ac to give an overview of the compilation process The Web sitecontains full, working versions of the scanner, parser, semantics analyzer, andcode generator for that language. These components will be available in avariety of source programming languagesThe Web site also offers material in supporof developing a workingcompiler for a simple language modeled after Java. This allows instructors toassign some components as exercises while other components are providedto fill in any gaps. Some instructors may provide the entire compiler and askstudents to implement extensions. Polishing and refining existing componentscan also be the basis of class projectsPseudocode and guidesA significant change from the Fischer and Leblanc text is that algorithmsare no longer presented in any specific programming language such as Cor Ada. Instead algorithms are presented in pseudocode using a style thatshould be familiar to those who have studied even the most fundamentalgorithms [CLRSo1]. Pseudocode simplifies the exposition of an algorithmy omitting unnecessary detail. However, the pseudocode is suggestive ofconstructs used in real programming languages, so implementation should bestraightforward. An index of all pseudocode methods is provided as a guideat the end of this bookThe text makes extensive use of abbreviations(including acronyms)tosimplify exposition and to help readers acquire the terminology used in compiler construction. Each abbreviation is fully defined automatically at its firstreference in each chapter. For example, AST has already been used in this pref-ace, as an abbreviation of abstract syntax tree but context-free grammar(Cfghas not. For further help, an index of all abbreviations appears as a guide atthe end of the book. The full index contains abbreviations and indicates wherethey are referenced throughout the book. Terms such as guide are shown inboldface. Each reference to such terms is included in the full indexUsing this bookAn introductory course on compiler construction could begin with Chapters 12, and 3. For parsing technique, either top-down( Chapter 5)or bottom-upChapter 6) could be chosen but some instructors will choose to cover bothMaterial from Chapter 4 can be covered as necessary to support the parsingtechniques that will be studied. Chapter 7 articulates the ast and presents thevisitor pattern forits traversal. Some instructors may assign AST-managementutilities as a lab exercise, while others may use the utilities provided by thePrefaceWeb site. Various aspects of semantic analysis can then be covered at theinstructor's discretion in Chapters 8 and 9. A quarter-based course could endhere, with another quarter continuing with the study of code generation, asdescribed nextChapter 10 provides an overview of the Java Virtual Machine (VM),which should be covered if students will generate JVM code in their projectCode generation for such virtual machines is covered in Chapter 11. Instructorswho prefer students to generate machine code could skip Chapters 10 and 11and cover Chapters 12 and 13 instead. An introductory course could includematerial from the beginning of Chapter 14 on automatic program optimizationFurther study could include more detail of the parsing techniques coveredChapters 4, 5, and 6. Semantic analysis and type checking could be studiedin greater breadth and depth in Chapters 8 and 9. Advanced concepts such asstatic single assignment (SSA) Form could be introduced from Chapters 10and 14. Advanced topics in program analysis and transformation, including data flow frameworks, could be drawn from Chapter 14. Chapters 13and 14 could be the basis for a gradute compiler course, with earlier chaptersproviding useful reference materialChapter descriptionsChapter 1 IntroductionThe text begins with an overview of the compilation process. The conceptsof constructing a compiler from a collection of components are emphasizedAn overview of the history of compilers is presented and the use of tools forgenerating compiler components is introducedChapter 2 A Simple compilerThe simple language ac is presented, and each of the compiler's componentsis discussed with respect to translating ac to another language dc. Thesecomponents are presented in pseudocode, and complete code can be found inthe Crafting a Compiler SupplementChapter 3 Scanning-Theory and practiceThe basic concepts and techniques for building the lexical analysis componentsof a compiler are presented This discussion includes the development of handcoded scanners as well as the use of scanner-generation tools for implementingtable-driven lexical analyzersChapter 4 grammars and ParsingThis chapter covers the fundamentals of formal language concepts, including context-free grammars, grammar notation, derivations, and parse treesGrammar-analvsis algorithms are introduced that are used in Chapters 5 and 6PrefaceChapter 5 Top-Down ParsingTop-down parsing is a popular technique for constructing relatively simpleparsers. This chapter shows how such parsers can be written using explicitcode or by constructing a table for use by a generic top-down parsing engineSyntactic error diagnosis, recovery, and repair are discussedChapter 6 Bottom-up ParsingMost compilers for modern programming languages use one of the bottom-up parsing techniques presented in this chapter. Tools for generating suchparsers automatically from a context-free grammar are widely available. Thechapter describes the thhich such tools are built, includingaof increasingly sophisticated approaches to resolving conflicts that hampeparser construction for some grammars. Grammar and language ambiguityare thoroughly discussed, and heuristics are presented for understanding andresolving ambiguous grammarsChapter 7 Syntax-Directed TranslationThis marks the mid-point of the book in terms of a compilers componentsPrior chapters have considered the lexical and syntactic analysis of programsa goal of those chapters is the construction of an AST. In this chapter, theAST is introduced and an interface is articulated for constructing, managing,and traversing the AST. This chapter is pivotal in the sense that subsequentchapters depend on understanding both the Ast and the visitor pattern thatfacilitates traversal and processing of the AST. The Crafting a Compiler supplement contains a tutorial on the visitor pattern, including examples drawn fromChapter8 Symbol Tables and Declaration ProcessingThis chapter emphasizes the use of a symbol table as an abstract componenthat can be utilized throughout the compilation process. A precise interface isdefined for the symbol table and various implementation issues and ideas arepresented. This discussion includes a study of the implementation of nestedscopesThe semantic analysis necessary for processing symbol declarations is introduced, including types, variables, arrays, structures, and enumerations. Ansubclasses,and superclasses g is presented, including object-oriented classes,introduction to type checkinChapter9 Semantic AnalysisAdditional semantic analysis is required for language specifications that arenot easily checked while parsing. Various control structures are examinedincluding conditional branches and loops. The chapter includes a discussionof exceptions and the semantic analysis they require at compile-timePrefaceChapter 10 Intermediate representationsThis chapter considers two intermediate representations are are widely usedby compilers. The first is the jvM instruction set and bytecode format, whichhas become the standard format for representing compiled java programsFor readers who are interested in targeting the JVM in a compiler project,Chapters 10 and 11 provide the necessary background and techniques. Theother representation is SSA Form, which is used by many optimizing compilers. This chapter defines SSA Form, but its construction is delayed untilChapter 14, where some requisite definitions and algorithms are presentedChapter 11 Code Generation for a Virtual machineThis chapter considers code generation for a virtual machine(VM). The advantages of considering such a target is that many of the details of runtimesupport are subsumed by the VM. For example, most VMs offer an unlimitednumber of registers, so that the issue of register allocation, albeit interesting,can be postponed until the fundamentals of code generation are masteredThe VM's instruction set is typically at a higher level than machine code. Forexample, a method call is often supported by a single VM instruction, whilethe same call would require many more instructions in machine codeWhile an eager reader interested in generating machine code may betempted to skip Chapter 11, we recommend studying this chapter beforeattempting code generation at the machine-code level. The ideas from thishapter are easily applied to Chapters 12 and 13, but they are easier to under-stand from the perspective of a VMChapter 12 Runtime SupportMuch of the functionality embedded in a VM is its runtime support (e.gIts support for managing storage). This chapter discusses various conceptsand implementation strategies for providing the runtime support needed formodern programming languages. Study of this material can provide an understanding of the construction of a VM. For those who write code generatorsfor a target architecture( Chapter 13), runtime support must be provided, sothe study of this material is essential to creating a working compilerThe chapter includes discussion of storage that is statically allocated, stackllocated, and heap allocated. References to nonlocal storage are considered,along with implementation structures such as frames and displays to supportsuch referencesChapter 13 Target Code generationThis chapter is similar to Chapter 11, except that the target of code generationis a relatively low-level instruction set when compared with a VM. The chapterincludes a thorough discussion of topics that arise in such code generation,including register allocation, management of temporaries, code schedulinginstruction selection, and some basic peephole optimization