FACTOID # 94: In pure number terms, more crimes are committed in America than in any other nation. The same goes for burglaries, car thefts, rapes and assaults.
 
 Home   Encyclopedia   Statistics   Countries A-Z   Flags   Maps   Education   Forum   FAQ   About 
 
 
 
WHAT'S NEW
RECENT ARTICLES
More Recent Articles »
 

SEARCH ALL

FACTS & STATISTICS    Advanced view

Search encyclopedia, statistics and forums:

 

 

(* = Graphable)

 

 


Encyclopedia > Code generation (compiler)
This article is about machine code generation with a compiler. For other uses, see Code generation.

In computer science, code generation is the process by which a compiler's code generator converts a syntactically correct program into a linear sequence of instructions that could be executed by a machine. This article or section does not cite its references or sources. ... Computer science, or computing science, is the study of the theoretical foundations of information and computation and their implementation and application in computer systems. ... This article is about the computing term. ...


Sophisticated compilers typically use several cascaded code generation stages to compile code fully. Compilers use this multi-stage compilation process partly because many algorithms for code optimization are easier to apply to an intermediate code form, and partly because it facilitates the creation of a single compiler that can target multiple architectures, as only the last of the code generation stages (the backend) needs to change from target to target. (For more information on compiler design, see Compiler.) In computing, optimization is the process of modifying a system to improve its efficiency. ... This article is about the computing term. ...


The input to the code generator typically consists of a parse tree or an abstract syntax tree. The tree is converted into a linear sequence of instructions, usually in an intermediate language such as three address code. Further stages of compilation may or may not be referred to as "code generation", depending on whether they involve a significant change in the representation of the program. (For example, a peephole optimization pass would not likely be called "code generation", although a code generator might incorporate a peephole optimization pass.) A parse tree is a tree that represents the syntactic structure of a string according to some formal grammar. ... In computer science, an abstract syntax tree (AST) is a finite, labeled, directed tree, where the internal nodes are labeled by operators, and the leaf nodes represent the operands of the node operators. ... In computer science, an intermediate language is the language of an abstract machine designed to aid in the analysis of computer programs. ... In computer science, three address code is a form of representing intermediate code used by compilers to aid in the implementation of code-improving transformations. ... In compiler theory, peephole optimization is a kind of optimization performed over a very small set of instructions in a segment of generated code. ...

Contents

Major tasks in code generation

In addition to the basic conversion from an intermediate representation into a linear sequence of machine instructions, a typical code generator tries to optimize the generated code in some way. The generator may try to use faster instructions, use fewer instructions, exploit available registers, and avoid redundant computations. Register or registration may mean: Registration (or licensing) is required of a number of occupations and professions where maintenance of standards is required to protect public safety. ... Partial redundancy elimination (PRE) is a compiler optimization that eliminates expressions that are redundant on some but not necessarily all paths through a program. ...


Tasks which are typically part of a sophisticated compiler's "code generation" phase include:

Instruction selection is typically carried out by doing a recursive postorder traversal on the abstract syntax tree, matching particular tree configurations against templates; for example, the tree W := ADD(X,MUL(Y,Z)) might be transformed into a linear sequence of instructions by recursively generating the sequences for t1 := X and t2 := MUL(Y,Z), and then emitting the instruction ADD W, t1, t2. Instruction selection is a compiler optimization that transforms an intermediate representation of a program into the final compiled code, either in binary or assembly format. ... In computer science, instruction scheduling is a compiler optimization used to improve instruction-level parallelism, which improves performance on machines with instruction pipelines. ... In compiler optimization, register allocation is the process of multiplexing a large number of target program variables onto a small number of CPU registers. ... In computer science and mathematics, a variable (IPA pronunciation: ) (sometimes called a pronumeral) is a symbolic representation denoting a quantity or expression. ... In computer architecture, a processor register is a small amount of very fast computer memory used to speed the execution of computer programs by providing quick access to commonly used values—typically, the values being in the midst of a calculation at a given point in time. ... See also Recursion. ... In computer science, tree traversal is the process of visiting each node in a tree data structure. ...


In a compiler that uses an intermediate language, there may be two instruction selection stages — one to convert the parse tree into intermediate code, and a second phase much later to convert the intermediate code into instructions in the ISA of the target machine. This second phase does not require a tree traversal; it can be done linearly, and typically involves a simple replacement of intermediate-language operations with their corresponding opcodes. However, if the compiler is actually a language translator (for example, one that converts Eiffel to C), then the second code-generation phase may involve building a tree from the linear intermediate code! An instruction set, or instruction set architecture (ISA), describes the aspects of a computer architecture visible to a programmer, including the native datatypes, instructions, registers, addressing modes, memory architecture, interrupt and exception handling, and external I/O (if any). ... Microprocessors perform operations using binary bits (on/off/1or0). ... Eiffel is an ISO-standardized object-oriented programming language designed for extensibility, reusability, reliability and programmer productivity. ... C is a general-purpose, procedural, imperative computer programming language developed in 1972 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system. ...


Runtime code generation

When code generation occurs at runtime, as in just-in-time compilation (JIT), it is important that the entire process be efficient with respect to space and time. For example, when regular expressions are interpreted and used to generate code at runtime, a non-determistic finite state machine is often generated instead of a deterministic one, because usually the former can be created more quickly and occupies less memory space than the latter. Despite its generally generating less efficient code, JIT code generation can take advantage of profiling information that is available only at runtime. In computer science, run time (with a space, though often its spelled without one) describes the operation of a computer program, the duration of its execution, from beginning to termination (compare compile time). ... In computing, just-in-time compilation (JIT), also known as dynamic translation, is a technique for improving the performance of bytecode-compiled programming systems, by translating bytecode into native machine code at runtime. ... In computing, a regular expression (abbreviated as regexp or regex, with plural forms regexps, regexes, or regexen) is a string that describes or matches a set of strings, according to certain syntax rules. ... Fig. ... In software engineering, performance analysis (also known as dynamic program analysis) is the investigation of a programs behavior using information gathered as the program runs, as opposed to static code analysis. ...


Related concepts

The fundamental task of taking input in one language and producing output in a non-trivially different language can be understood in terms of the core transformational operations of formal language theory. Consequently, some techniques that were originally developed for use in compilers have come to be employed in other ways as well. For example, yacc (Yet Another Compiler Compiler) takes input in Backus-Naur form and converts it to a parser in C. Though it was originally created for automatic generation of a parser for a compiler, yacc is also often used to automate writing code that needs to be modified each time specifications are changed. (For example, see [1].) Transformational grammar is a broad term describing grammars (almost exclusively those of natural languages) which have been developed in a Chomskian tradition. ... In mathematics, logic and computer science, a formal language is a set of finite-length words (i. ... Yacc is a piece of computer software that serves as the standard parser generator on Unix systems. ... A compiler-compiler or compiler generator is a program that generates the source code of a parser, interpreter, or compiler from a programming language description. ... The Backus-Naur form (BNF) (also known as Backus normal form) is a metasyntax used to express context-free grammars: that is, a formal way to describe formal languages. ... C is a general-purpose, procedural, imperative computer programming language developed in 1972 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system. ...


Many integrated development environments (IDEs) support some form of automatic source code generation, often using algorithms in common with compiler code generators, although commonly less complicated. (See also: Program transformation, Data transformation.) An integrated development environment (IDE), also known as integrated design environment and integrated debugging environment, is a type of computer software that assists computer programmers in developing software. ... Source code generation is the act of generating source code basing on an ontological model such as a template and is accomplished with a programming tool such as a template processor or an IDE. These tools allow the generation of source code through any of various means. ... ÁInsert non-formatted text here ... In metadata, a data transformation converts data from a source data format into destination data. ...


Reflection

In general, a syntax and semantic analyzer tries to retrieve the structure of the program from the source code, while a code generator uses this structural information (e.g., data types) to produce code. In other words, the former adds information while the latter loses some of the information. One consequence of this information loss is that reflection becomes difficult or even impossible. To counter this problem, code generators often embed syntactic and semantic information in addition to the code necessary for execution. A data type is a constraint placed upon the interpretation of data in a type system in computer programming. ... In computer science, reflection is the process by which a computer program of the appropriate type can be modified in the process of being executed, in a manner that depends on abstract features of its code and its runtime behavior. ...



 
 

COMMENTARY     


Share your thoughts, questions and commentary here
Your name
Your comments

Want to know more?
Search encyclopedia, statistics and forums:

 


Lesson Plans | Student Area | Student FAQ | Reviews | Press Releases |  Feeds | Contact
The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.
Usage implies agreement with terms, 1022, m