Automatic parallelization

Automatic parallelization (also known as auto parallelization or autoparallelization) refers to the use of a modern optimizing parallelizing compiler to convert sequential code into multi-threaded or vectorized code (or both) in order to use multiple processors simultaneously in a shared-memory multiprocessor (SMP) machine. The goal of automatic parallelization is to relieve programmers of the tedious and error-prone process of manual parallelization. Although the technique has improved considerably over the past several decades, fully automatic parallelization of sequential programs by compilers remains a grand challenge, because it requires complex program analysis and because some factors (such as the input data range) are unknown at compile time.


The control structures on which auto parallelization focuses most are loops because, in general, most of a program's execution time is spent inside some form of loop. An auto-parallelizing compiler tries to split up a loop so that its iterations can be executed concurrently on separate processors.
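The idea of splitting a loop's iterations across processors can be sketched by hand. The following is a minimal illustration, not what any particular compiler generates: the helper names `chunk_bounds` and `parallel_add` are hypothetical, and the choice of contiguous, nearly equal chunks is an assumption made for simplicity. It shows only the partitioning; in standard CPython builds the global interpreter lock keeps the threads from running this loop body truly in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(n, workers):
    """Split iterations 0..n-1 into at most `workers` contiguous chunks."""
    size, extra = divmod(n, workers)
    bounds = []
    start = 0
    for w in range(workers):
        # The first `extra` chunks take one additional iteration each.
        stop = start + size + (1 if w < extra else 0)
        if stop > start:
            bounds.append((start, stop))
        start = stop
    return bounds


def parallel_add(x, y, workers=4):
    """Compute z[i] = x[i] + y[i], handing one chunk of iterations to each worker."""
    n = len(x)
    z = [0] * n

    def run_chunk(bounds):
        start, stop = bounds
        for i in range(start, stop):
            z[i] = x[i] + y[i]  # every iteration writes a distinct element

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces all chunks to finish and re-raises any worker error.
        list(pool.map(run_chunk, chunk_bounds(n, workers)))
    return z
```

Because no iteration reads an element that another iteration writes, the chunks can run in any order and on any number of workers without changing the result.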


Compiler Parallelization Analysis

The compiler usually conducts two passes of analysis before actual parallelization in order to determine the following:

  • Is it safe to parallelize the loop? Answering this question requires accurate dependence analysis and alias analysis.
  • Is it worthwhile to parallelize the loop? Answering this question requires a reliable estimate of the program's workload and of the capacity of the parallel system.

The first pass of the compiler performs a data dependence analysis of the loop to determine whether each iteration of the loop can be executed independently of the others. Data dependences can sometimes be handled, but doing so may incur additional overhead in the form of message passing, synchronization of shared memory, or some other method of processor communication.
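A toy version of such a dependence check can be written for one very restricted loop shape. The sketch below assumes a loop whose body is `a(i + write_offset) = f(a(i + read_offset))` on a single array, with constant offsets and a known trip count; the function name `loop_carried_dependence` is hypothetical. Real compilers use more general techniques (for example the GCD test or the Banerjee test) and must also rely on alias analysis to know which references can touch the same storage.

```python
def loop_carried_dependence(write_offset, read_offset, trip_count):
    """Classify the dependence between iterations of
    a(i + write_offset) = f(a(i + read_offset)) over `trip_count` iterations.

    Returns "flow" if a later iteration reads what an earlier one wrote,
    "anti" if a later iteration overwrites what an earlier one read,
    or None if no two different iterations touch the same element.
    """
    distance = write_offset - read_offset
    # distance == 0: each iteration only touches its own element.
    # |distance| >= trip_count: the conflicting iteration lies outside the loop.
    if distance == 0 or abs(distance) >= trip_count:
        return None
    return "flow" if distance > 0 else "anti"
```

For example, a body of the form `z(i) = z(i-1)*2` gives `loop_carried_dependence(0, -1, n)`, which reports a flow dependence for any `n` greater than 1, so its iterations cannot simply be run concurrently.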


The second pass attempts to justify the parallelization effort by comparing the theoretical execution time of the code after parallelization to the code's sequential execution time. Somewhat counterintuitively, code does not always benefit from parallel execution. The extra overhead that can be associated with using multiple processors can eat into the potential speedup of parallelized code.
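The profitability comparison can be illustrated with a deliberately simple cost model. This sketch assumes that the work divides evenly across processors and that parallel execution adds a fixed overhead (thread startup, synchronization, communication); both the model and the names `estimated_parallel_time` and `worth_parallelizing` are assumptions for illustration, and production compilers use considerably more detailed estimates.

```python
def estimated_parallel_time(sequential_time, processors, overhead):
    """Estimate run time after parallelization under an idealized model:
    perfectly divisible work plus a fixed parallelization overhead."""
    return sequential_time / processors + overhead


def worth_parallelizing(sequential_time, processors, overhead):
    """Return True only if the estimated parallel time beats the sequential time."""
    return estimated_parallel_time(sequential_time, processors, overhead) < sequential_time
```

Under this model a loop that takes 100 time units sequentially is worth running on 4 processors with an overhead of 10 units (estimated 35 units), whereas a loop that takes 1 unit is not worth parallelizing if the overhead is 5 units.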


A brief example of Auto-Parallelization

Code 1 below can be auto-parallelized by a compiler because each iteration is independent of the others, and the final contents of array z will be correct regardless of the order in which the iterations execute.

 !code 1
 do i=1,n
   z(i) = x(i) + y(i)
 enddo

On the other hand, code 2 below cannot be auto-parallelized, because the value of z(i) depends on the result of the previous iteration z(i-1).

 !code 2
 do i=1,n
   z(i) = z(i-1)*2
 enddo

This does not mean that the code cannot be parallelized. However, current parallelizing compilers are not capable of exposing this parallelism automatically, and it is questionable whether this code would benefit from parallelization in the first place.
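One way to see that code 2 is parallelizable in principle is that its recurrence has a closed form: assuming z(0) is initialized before the loop, z(i) = z(0) * 2**i. The sketch below is a hand-written transformation with hypothetical function names, not something the article claims a compiler performs. It compares the original recurrence with the closed form, running the latter in reverse order to show that its iterations no longer depend on one another.

```python
def sequential_recurrence(z0, n):
    """Code 2 as written: each iteration needs the previous iteration's result."""
    z = [z0] + [0] * n
    for i in range(1, n + 1):
        z[i] = z[i - 1] * 2
    return z


def closed_form(z0, n):
    """Equivalent loop in which every iteration depends only on z0 and i."""
    z = [z0] + [0] * n
    # Deliberately run backwards: the order of iterations does not matter here.
    for i in range(n, 0, -1):
        z[i] = z0 * 2 ** i
    return z
```

Whether such a rewrite pays off is a separate question, since each iteration does very little work compared with the cost of distributing it.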


Workaround

Because fully automatic parallelization is inherently difficult, several easier approaches exist for obtaining a higher-quality parallel program. They are:

  • Let programmers add "hints" to their programs to guide compiler parallelization, such as HPF (High Performance Fortran) for distributed-memory systems and OpenMP for shared-memory systems.
  • Build an interactive system between programmers and parallelizing tools or compilers. Notable examples are SUIF explorer, the Polaris compiler, and CAPTOOLS (now Parawise).
  • Use hardware-supported speculative threading.


Historic auto parallelizing compilers

Most research compilers for automatic parallelization consider only Fortran programs, because Fortran programs are simpler to analyze than C programs. Typical examples are:

  • Vienna Fortran compiler
  • Paradigm compiler
  • Polaris compiler
  • SUIF compiler

The Wikipedia article included on this page is licensed under the GFDL.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.