Procedure call

In computer science, a subroutine (also called a function, procedure, or subprogram) is a sequence of code that performs a specific task as part of a larger program and is grouped as one or more statement blocks; such code is sometimes collected into software libraries. Subroutines can be "called", allowing a program to run the same code repeatedly without that code having been written more than once. In many programming languages the word function is reserved for subroutines that return a value; however, the commonly used C language, and thus its programmers, do not make this distinction.

History

The first use of subprograms was in assembly languages for computers that did not have a call instruction. On these computers, subroutines had to be called by a sequence of lower-level instructions, possibly implemented as a macro. These instructions typically modified the program code, rather like the infamous COBOL ALTER statement, changing the address of a branch at a standard location so that it behaved like an explicit return instruction. Even with this cumbersome approach, subroutines proved so useful that most architectures soon provided instructions to help with subroutine calls, up to and including explicit call instructions.


When an assembly language program executes a call, program flow jumps to another location, but the address of the next instruction (i.e. the instruction that follows the call instruction in memory) is kept somewhere for use when returning [unless an execution stack or similar is being used, as once proposed by Henry Baker]. The IBM System/360 range saved this address in a register, relying on macros to save and restore deeper return addresses in memory associated with individual subroutines, and then branched to the address held in the register to accomplish a return. However, stacks have proved a very useful approach and are typically used in more modern architectures. On these, the call instruction 'pushes' the return address onto the stack. The subroutine exits by 'popping' that return address off the top of the stack and jumping to it, so program flow continues right after the call instruction.
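The push-and-pop discipline described above can be sketched in C as a toy model. This is only an illustration under stated assumptions: "addresses" are plain instruction indices, and the names return_stack, do_call, and do_return are hypothetical rather than part of any real instruction set.

```c
#include <assert.h>
#include <stddef.h>

#define RETURN_STACK_DEPTH 16

/* Hypothetical model: an "address" is just an instruction index. */
size_t return_stack[RETURN_STACK_DEPTH];
size_t stack_top = 0;           /* number of saved return addresses */

/* CALL: save the address of the instruction after the call, then jump. */
size_t do_call(size_t call_site, size_t target)
{
    assert(stack_top < RETURN_STACK_DEPTH);
    return_stack[stack_top++] = call_site + 1;
    return target;              /* the new program counter */
}

/* RETURN: pop the most recently saved address and jump to it. */
size_t do_return(void)
{
    assert(stack_top > 0);
    return return_stack[--stack_top];
}
```

In this model, a call from index 10 followed by a nested call from index 105 leaves two saved addresses; the two returns then yield 106 and 11 in that order, which is why nested calls unwind correctly.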


Because a stack is used, a subroutine can call itself (see recursion) or other subroutines (nested calls), and the same subroutine can be called from several distinct places. Assembly languages generally do not provide programmers with such conveniences as local variables or subroutine parameters. These are implemented by passing values in registers or by pushing them onto the stack [or another stack, if there is more than one]. When there is just one stack, the typical layout when function1 calls function2 reads, from the oldest entry to the newest:

  • ...
  • function1 local variables
  • parameters for function2
  • function1 return address [of the call instruction in question]
  • function2 local variables

This is with a forwards-growing stack, while on many architectures the stack grows backwards in memory. On the other hand, it is quite practical to have two stacks growing towards each other in a common scratch space, using one mainly for control information such as return addresses and loop counters and the other for data. (This is what Forth does.)


If the procedure or function itself uses stack handling commands, e.g. to store intermediate calculation values, the programmer needs to keep the number of 'push' and 'pop' instructions balanced so as not to corrupt the original return address.
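A minimal sketch of that bookkeeping, written in C rather than assembly. It assumes a single shared stack modeled as an array; data_stack, push_value, pop_value, and sum_of_squares are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DATA_STACK_SIZE 32

long data_stack[DATA_STACK_SIZE];
size_t sp = 0;                  /* index of the next free slot */

void push_value(long v) { assert(sp < DATA_STACK_SIZE); data_stack[sp++] = v; }
long pop_value(void)    { assert(sp > 0); return data_stack[--sp]; }

/* Hypothetical subroutine: computes a*a + b*b, keeping its
   intermediate values on the same stack that (in this model)
   holds the caller's return address just below them. */
long sum_of_squares(long a, long b)
{
    size_t entry_sp = sp;       /* stack depth on entry */

    push_value(a * a);
    push_value(b * b);
    long second = pop_value();  /* two pushes ...        */
    long first = pop_value();   /* ... matched by two pops */

    assert(sp == entry_sp);     /* balanced: the return address is on top again */
    return first + second;
}
```

If one of the pops were omitted, the value left on top of the stack would be an intermediate result instead of the return address, and a real return instruction would jump to a meaningless location.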


Technical overview

A subprogram, as its name suggests, behaves in some ways like a computer program. Typically, the caller waits for a subprogram to finish and continues execution only after the subprogram "returns". Subroutines are often given parameters to refine their behavior or to perform a certain computation with given [variable] values. Generally, subprograms execute their statements from top to bottom.


In most imperative programming languages, subprograms may have so-called side-effects; that is, they may cause changes that remain after the subprogram has returned. Usually, compilers cannot predict whether a subprogram has a side-effect or not, but can determine whether a subprogram calls no other subprograms, or at least no other subprograms that have side-effects. In imperative programming, compilers usually assume every subprogram has a side-effect to avoid complex analysis of execution paths. Because of its side-effects, a subprogram may return different results each time it is called, even if it is called with the same arguments. A simple example is a subprogram that implements a pseudorandom number generator, i.e. a subprogram that returns a different pseudorandom number each time it is called.
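The pseudorandom generator mentioned above can be sketched in C as follows. This is an illustrative linear congruential generator, not a recommendation for real use; the names lcg_state, seed_random, next_random, random_below, and square are hypothetical, and the multiplier and increment are the widely published "Numerical Recipes" constants.

```c
#include <assert.h>
#include <stdint.h>

/* Hidden state that persists between calls; updating it is the side-effect. */
static uint32_t lcg_state = 1;

void seed_random(uint32_t seed) { lcg_state = seed; }

/* Same (empty) argument list every time, different result every time. */
uint32_t next_random(void)
{
    lcg_state = lcg_state * 1664525u + 1013904223u;  /* wraps modulo 2^32 */
    return lcg_state;
}

/* Convenience wrapper; the modulo is adequate only for illustration. */
uint32_t random_below(uint32_t bound)
{
    assert(bound > 0);
    return next_random() % bound;
}

/* By contrast, this subprogram has no side-effects: its result
   depends only on its argument. */
uint32_t square(uint32_t x) { return x * x; }
```

Two consecutive calls to next_random() return different values because each call changes lcg_state, whereas square(7) yields 49 no matter how often it is called.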


A subprogram that behaves this way is not a function in the strict mathematical sense. An exception to this common behaviour is found in purely functional programming languages, where subprograms have no side-effects and always return the same result when called repeatedly with the same arguments. [Note that subprograms are referred to as functions in these languages.]


C and C++ examples

In the C and C++ programming languages, subprograms are referred to as "functions". Note that these languages use the special keyword void to indicate that a function does not return any value; that is, it has only side-effects.


Below are three such functions. The first does absolutely nothing and is called with "function1();". The second returns the number 5 and can be called with "function2();". The third returns one of the values 1-5, selected from a fixed table by a zero-based index (0-4), and is called with "function3(number);".

 void function1(void) { }

 int function2(void) { return 5; }

 /* number is a zero-based index and must be in the range 0-4 */
 int function3(int number)
 {
     int selection[] = {5, 1, 3, 2, 4};
     return selection[number];
 }

Why use subprograms?

There are numerous motivations for the use of subprograms:

  • to reduce redundancy in a program,
  • to enable reuse of code across multiple programs,
  • to decompose complex problems into simpler pieces,
  • to improve readability of a program,
  • to replicate useful mathematical functions,
  • to hide or regulate part of the program (see Information hiding),
  • to improve maintainability and reduce risk of errors,
  • to improve ease of extension.

Generally, to make use of a subprogram, a programmer places some form of call instruction, which constitutes a call site, into an instruction sequence. When the call site is encountered, the instruction sequence is temporarily suspended, and the subprogram itself executes until it completes, at which time the original instruction sequence resumes.


Local variables, recursion and re-entrancy

A subprogram may find it useful to make use of a certain amount of "scratch" space; that is, memory used during the execution of that subprogram to hold intermediate results. Variables stored in this scratch space are referred to as local variables, and the scratch space itself is referred to as an activation record. An activation record typically has a return address that tells it where to pass control back to when the subprogram finishes.


A subprogram may have any number and nature of call sites; in fact, a subprogram may even call itself, causing its execution to suspend while another nested execution of the same subprogram occurs. This is referred to as recursion, and is a useful technique for making some complex algorithms more comprehensible. However, recursion poses a problem if every execution of the subprogram shares a single scratch space and the recursive execution modifies any local variables, because when the suspended execution resumes, it will find that the data stored in its local variables have been overwritten.


Early languages such as the first versions of Fortran simply did not support recursion for this reason. Modern languages almost invariably provide a fresh activation record for every execution of a subprogram; that way, the nested execution is free to modify its local variables without concern for the effect on other suspended executions in progress. As nested calls accumulate, a call stack structure is formed, consisting of one activation record for each suspended subprogram. In fact, this stack structure is virtually ubiquitous, and so activation records are commonly referred to as stack frames.
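The difference a fresh activation record makes can be seen in a small C sketch. The function names are hypothetical; factorial_shared deliberately keeps its working value in one static slot to imitate a language without per-call activation records.

```c
#include <assert.h>

/* Each call gets its own copy of n in a fresh stack frame. */
unsigned long factorial(unsigned int n)
{
    assert(n <= 12);             /* 12! still fits in 32 bits */
    if (n <= 1)
        return 1;
    return n * factorial(n - 1);
}

/* Deliberately broken variant: the "local" lives in a single static
   slot that all nested executions share. */
unsigned long factorial_shared(unsigned int n)
{
    static unsigned int saved;   /* one slot for every nested call */

    saved = n;
    if (saved <= 1)
        return 1;
    unsigned long rest = factorial_shared(saved - 1);
    return saved * rest;         /* saved was overwritten by the nested call */
}
```

Here factorial(5) returns 120, while factorial_shared(5) returns 1: by the time each suspended execution resumes, the innermost call has already set the shared slot to 1.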


If a subprogram can function properly even when called while another execution is already in progress, that subprogram is said to be re-entrant. A recursive subprogram must be re-entrant. Re-entrant subprograms are also useful in multi-threaded situations, since multiple threads can call the same subprogram without fear of interfering with each other.
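As a rough illustration of re-entrancy in C, compare a subprogram that writes into a static buffer with one that writes into storage supplied by the caller. Both function names are hypothetical, and the example assumes the formatted text fits in the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Not re-entrant: every execution writes into the same static buffer,
   so a second call (or another thread) clobbers the first result. */
const char *format_id_shared(int id)
{
    static char buffer[32];
    snprintf(buffer, sizeof buffer, "id-%d", id);
    return buffer;
}

/* Re-entrant: all working storage is supplied by the caller, so
   concurrent or nested executions cannot interfere with each other. */
const char *format_id(int id, char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    snprintf(out, out_size, "id-%d", id);
    return out;
}
```

Calling format_id_shared(1) and then format_id_shared(2) leaves both returned pointers referring to the text "id-2", whereas two calls to format_id with separate buffers keep "id-1" and "id-2" apart.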


In a multi-threaded environment, there is generally more than one stack. An environment that fully supports coroutines or lazy evaluation may use data structures other than stacks to store its activation records.


Conventions

A number of conventions for the coding of subprograms have been developed. A common preference is that the name of a subprogram should be a verb when it performs a certain task, an adjective when it makes some inquiry, and a noun when it is used to substitute variables and such.


Experienced programmers recommend that a subprogram perform only one task. If a subprogram performs more than one task, it should be split up into more subprograms. They argue that subprograms are key components in maintaining code and their roles in the program must be distinct.


Some advocate that each subprogram should have minimal dependency on other pieces of code. For example, they regard the use of global variables as harmful because it tightly couples subprograms to those variables; where such coupling is unnecessary, they advise refactoring subprograms to take parameters instead. This practice is controversial because it tends to increase the number of parameters passed to subprograms.
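The refactoring in question might look like the following C sketch. The tax calculation, the global g_rate, and both function names are hypothetical, chosen only to show the two styles side by side.

```c
#include <assert.h>

/* Coupled version: the result silently depends on a global that
   any other code may change. */
double g_rate = 0.25;

double tax_coupled(double amount)
{
    return amount * g_rate;
}

/* Refactored version: the dependency is an explicit parameter,
   at the cost of a longer argument list. */
double tax(double amount, double rate)
{
    assert(rate >= 0.0 && rate <= 1.0);
    return amount * rate;
}
```

With the coupled version, tax_coupled(100.0) changes from 25.0 to 50.0 as soon as some other code sets g_rate to 0.5; with the refactored version, the caller can see every input at the call site.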


See programming practice for a more detailed discussion of programming disciplines.


Related terms and clarification

Different programming languages and methodologies possess notions and mechanisms related to subprograms:

  • Subroutine is practically synonymous with "subprogram." The former term may derive from the terminology of assembly languages and Fortran.
  • Function and procedure often denote a subprogram that takes parameters and may or may not have a return value. Many make the distinction between "functions", that possess return values and appear in expressions, versus "procedures", that possess no return values and appear in statements [though this is not a distinction found in the C programming language]. (See also Command-Query Separation.)
  • Method is a special kind of subprogram used in object-oriented programming that describes some behaviour of an object.
  • Closure is a subprogram together with the values of some of its variables captured from the environment in which it was created.
  • Coroutine is a subprogram that can return to its caller before completing and later be resumed from the point where it left off.
  • Event handler, or simply "handler," is a subprogram that is called in response to an "event", such as a computer user moving the mouse or typing on the keyboard. The AppleScript scripting language simply uses the term "handler" as a synonym for subprogram.
  • Threaded code is a technique for making code more compact. It uses a small interpreter to execute subroutines that consist of lists of subroutine addresses. Only the lowest-level subroutines are written in machine language.
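As one illustration of the terms above, the threaded-code idea can be approximated in C with an array of function pointers and a very small interpreter loop. This is a simplified sketch: the names primitive, accumulator, add_one, double_it, and run_thread are hypothetical, and unlike real threaded code the list here contains only primitives, not further lists.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*primitive)(void);

long accumulator = 0;

/* Lowest-level subroutines: the only ones containing "real" code. */
void add_one(void)   { accumulator += 1; }
void double_it(void) { accumulator *= 2; }

/* A higher-level routine is just a NULL-terminated list of addresses. */
const primitive add_one_then_quadruple[] = { add_one, double_it, double_it, NULL };

/* The small interpreter: call each address in the list in turn. */
void run_thread(const primitive *thread)
{
    assert(thread != NULL);
    for (; *thread != NULL; ++thread)
        (*thread)();
}
```

Starting with accumulator set to 2, run_thread(add_one_then_quadruple) leaves it at 12, since (2 + 1) * 2 * 2 = 12.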

See also

  • Function (mathematics)

The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.
Usage implies agreement with terms.