FACTOID # 152: Of the eight countries which include the word "democratic" in their conventional long form name, three are dictatorships: North Korea (Democratic People's Republic of Korea), Laos (Lao People's Democratic Republic) and the Democratic republic of the Congo.
 
 Home   Encyclopedia   Statistics   Countries A-Z   Flags   Maps   Education   Forum   FAQ   About 
 
 
 
WHAT'S NEW
RECENT ARTICLES
More Recent Articles »
 

SEARCH ALL

FACTS & STATISTICS    Advanced view

Search encyclopedia, statistics and forums:

 

 

(* = Graphable)

 

 


Encyclopedia > Data flow analysis

A simple way to perform dataflow analysis (DFA) of programs is to set up dataflow equations for each node of the control flow graph (CFG) and solve them by repeatedly calculating the output from the input locally at each node until the whole system stabilizes, i.e., it reaches a fixpoint. To guarantee termination it is required that the data flow equations form a fixed-point data-flow analysis, i.e., the local updates by the equations are monotone. A node is a basic unit used to build data structures, such as linked lists and tree data structures. ... A control flow graph (CFG) is an abstract data structure used in compilers. ... In mathematics, functions between ordered sets are monotonic (or monotone) if they preserve the given order. ...


A canonical example of a data-flow analysis is reaching definitions. In compiler theory, reaching definitions is a data-flow analysis which statically determines which definitions may reach a given point in the code. ...

Contents


An iterative algorithm

The fixpoint of the dataflow equations can be calculated using the round-robin iterative algorithm:

 texttt{for} i leftarrow 1 texttt{to} N mathit{initialize node} i
texttt{while} (mathit{sets are still changing}) texttt{for} i leftarrow 1 texttt{to} N mathit{recompute sets at node} i

The order matters

The efficiency of iteratively solving data-flow equations is influenced by the order at which local nodes are visited. For the above round-robin iterative algorithm it depends in which order nodes are selected from the set of nodes. Furthermore, it depends, whether the data-flow equations are used for forward or backward data-flow analysis over the CFG. In the following, a few iteration orders for solving data-flow equations are discussed (a related concept to iteration order of a CFG is tree traversal of a tree). A control flow graph (CFG) is an abstract data structure used in compilers. ... In computer science, tree traversal is the process of visiting each node in a tree data structure. ... A labeled tree with 6 vertices and 5 edges In graph theory, a tree is a graph in which any two vertices are connected by exactly one path. ...

  • random order This iteration order is not aware whether the DFA solves a forward or backward data-flow problem. Therefore, the performance is relatively poor compared to specialized iteration orders.
  • postorder This is a typical iteration order for backward data-flow problems. In postorder iteration a node is visited after all its successor nodes have been visited. Typically, the postorder iteration is implemented with the depth-first strategy.
  • reverse-postorder This is a typical iteration order for forward data-flow problems. In reverse-postorder iteration a node is visited before all its successor nodes have been visited. A more natural name for reverse-postorder iteration would be "preorder iteration", but this would be confusing with the mathematical definition of preorder.

Postorder is a possible order for visiting nodes in a control flow graph (CFG) when iteratively solving dataflow analysis problems. ... This article is about the mathematics concept. ...

Examples

The following are examples of properties of computer programs that can be calculated by data-flow analysis. Note that the properties calculated by data-flow analysis are typically only approximations of the real properties. This is because data-flow analysis operates on the syntactical structure of the CFG without simulating the exact control flow of the program. However, to be still useful in practice, a data-flow analysis algorithm is typically designed to calculate an upper respectively lower approximation of the real program properties.


Forward Analysis

  • reaching definitions

The reaching definition analysis calculates for each program point the set of variable values that may potentially reach this program point.

 1: if b==4 then 2: a = 5; 3: else 4: a = 3; 5: endif 6: 7: if a < 4 then 8: ... 

The reaching definition of variable "a" at line 7 is the set {3,5}.

Backward Analysis

  • live variables

The live variable analysis calculates for each program point the variables that may be potentially read afterwards before their next write update.

 1: a = 3; 2: c = 5; 3: a = b + c; 

The set of live variables at line 2 is {b,c}, but the set of live variables at line 1 is only {b} since variable "c" is updated in line 2.

Sensitivities

Modern Research on Data Flow Analysis frequently talk about different sensitivities. It is not quite clear whether all authors intend the same when using these terms. The information here might be utterly wrong, so please correct it and then remove this sentence, because I have looked everywhere for proper definitions of these terms.

  • Flow Sensitivity generally means to respect the flow of information and not merge it prematurely. In the intraprocedural case it is to merge over all paths (such as by performing an indexed depth first search over the CFG) as opposed to joining information at each merge-point (as when calculating the fixpoint). For example, a flow-insensitive alias analysis might conclude that "x may alias y in procedure q". A flow-sensitive analysis could conclude "x may alias y after statement 20 of procedure q", and conclude otherwise for statement 10.
  • context sensitive analyses take into account the context of the call. When this is not done, an interprocedural analysis can consider impossible paths, such as the the path leading to one call, through the procedure, and exiting from a different call.
  • path sensitive analyses attempt to only take into account valid paths. If a certain operation is guarded by a condition and much later in the program another operation is guarded by the same condition, you have only two valid paths: either do both operations or none.

Further Reading

Aho, Alfred V. Sethi, Ravi. Ullman, Jeffrey D. Compilers: Principles, Techniques and Tools. Addison Wesley. 1986.


Appel, Andrew W. Modern Compiler Implementation in ML. Cambridge University Press. 1999.


Cooper, Keith D. and Torczon, Linda. Engineering a Compiler. Morgan Kaufmann. 2005.


Muchnick, Steven S. Advanced Compiler Design and Implementation. Morgan Kaufmann. 1997.


Hecht, Matthew S. Flow Analysis of Computer Programs. Elsevier North-Holland Inc. 1977.



 
 

COMMENTARY     


Share your thoughts, questions and commentary here
Your name
Your comments

Want to know more?
Search encyclopedia, statistics and forums:

 


Lesson Plans | Student Area | Student FAQ | Reviews | Press Releases |  Feeds | Contact
The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.
Usage implies agreement with terms, 1022, m