Encyclopedia > C preprocessor

The C preprocessor (cpp) is the preprocessor for the C programming language. It is invoked by the compiler to handle directives such as #include, #define, and #if. Since the language of such directives is not strictly specific to the grammar of C, the preprocessor can also be invoked independently to process another type of file.


The transformations it makes on its input form the first four so-called Phases of Translation. Though an implementation may choose to perform some or all phases simultaneously, it must behave as if it performed them one-by-one in order.


Phases

  1. Trigraph Replacement - The preprocessor replaces trigraph sequences with the characters they represent.
  2. Line Splicing - Physical source lines that are continued with escaped newline sequences are spliced to form logical lines.
  3. Tokenization - The preprocessor breaks the result into preprocessing tokens and whitespace. It replaces comments with whitespace.
  4. Macro Expansion and Directive Handling - Preprocessing directive lines, including file inclusion and conditional compilation, are executed. The preprocessor simultaneously expands macros and, in the 1999 version of the C standard, handles _Pragma operators.
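
The effect of phases 2 and 3 can be seen in a small sketch. The names ADD_THREE, spliced and sum are hypothetical and chosen only for illustration; the example assumes the continuation line of the string literal starts in the first column, since any leading spaces there would become part of the string.

```c
/* A backslash immediately followed by a newline is deleted in phase 2,
   so this directive forms one logical line. */
#define ADD_THREE(a, b, c) \
    ((a) + (b) + \
     (c))

/* Splicing happens before tokenization, so it also works inside a
   string literal. The next line deliberately starts in column one. */
static const char spliced[] = "abc\
def";

/* Phase 3 replaces the comment with whitespace before the macro
   arguments are collected in phase 4. */
static int sum(void)
{
    return ADD_THREE(1, /* two */ 2, 3);
}
```

Under these assumptions spliced holds the six characters of "abcdef" and sum() adds 1, 2 and 3.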


Examples

This section goes into some detail about C preprocessor usage. Good programming practice when writing C macros is crucial, particularly in a collaborative setting, so notes on this have been included. Of course, it is possible to abuse these features, but this is not recommended in a production environment.


The most common use of the preprocessor is to include another file:

 #include <stdio.h>

 int main (void)
 {
     printf("Hello, world!\n");
     return 0;
 }

The preprocessor replaces the line #include <stdio.h> with the system header file of that name, which declares the printf() function amongst other things. More precisely, the entire text of the file 'stdio.h' replaces the #include directive.


This can also be written using double quotes, e.g. #include "stdio.h". The angle brackets were originally used to indicate 'system' include files, and double quotes user-written include files, and it is good practice to retain this distinction. C compilers and programming environments all have a facility which allows the programmer to define where include files can be found. This can be introduced through a command line flag, which can be parameterized using a makefile, so that a different set of include files can be swapped in for different operating systems, for instance.


Conventionally include files are given a .h extension, and files not included by others are given a .c extension. However, there is no requirement that this be observed. Occasionally you will see files with other extensions included from a .c file, in particular others with a .c extension.


The #ifdef, #ifndef, #else, #elif and #endif directives can be used for conditional compilation.

 #define __WINDOWS__
 #ifdef __WINDOWS__
 #include <windows.h>
 #else
 #include <unistd.h>
 #endif

The first line defines a macro __WINDOWS__. The macro could instead be defined from the compiler's command line, perhaps to control compilation of the program from a makefile.


The subsequent code tests if a macro __WINDOWS__ is defined. If it is, as in this example, the file <windows.h> is included, otherwise <unistd.h>.
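
Conditional compilation can select definitions as well as header files. The following is a minimal sketch in which DEMO_TARGET_WINDOWS, DEMO_TARGET_UNIX, PATH_SEPARATOR and path_separator are all hypothetical names invented for the example. The selecting macro is defined in the source here only to keep the example self-contained; in practice it would more likely come from the compiler's command line.

```c
/* Hypothetical selection macro, normally supplied by the build
   rather than written into the source file. */
#define DEMO_TARGET_UNIX

#if defined(DEMO_TARGET_WINDOWS)
#define PATH_SEPARATOR '\\'
#elif defined(DEMO_TARGET_UNIX)
#define PATH_SEPARATOR '/'
#else
#define PATH_SEPARATOR ':'
#endif

/* Returns the separator that was chosen at preprocessing time. */
static char path_separator(void)
{
    return PATH_SEPARATOR;
}
```

Only one of the three #define lines survives preprocessing, so the compiler proper never sees the alternatives.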


Macro definition and expansion

There are two types of macros, object-like and function-like. Function-like macros take parameters; object-like macros don't. The generic syntax for declaring an identifier as a macro of each type is, respectively,

 #define <identifier> <replacement token list>
 #define <identifier>(<parameter list>) <replacement token list>

Wherever the identifier appears in the source code it is replaced with the replacement token list, which can be empty. For an identifier declared to be a function-like macro, it is only replaced when the following token is also a left parenthesis that begins the argument list of the macro invocation. The exact procedure followed for expansion of function-like macros with arguments is subtle.
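
The rule about the left parenthesis can be demonstrated with a function and a function-like macro that share a name. This is only an illustrative sketch: twice, via_macro and via_function are hypothetical names, and the macro deliberately computes a different value so the two paths can be told apart.

```c
/* An ordinary function, defined before the macro of the same name
   so that its own definition is not macro-expanded. */
static int twice(int x)
{
    return x + x;
}

/* A function-like macro with the same name and a different result. */
#define twice(x) ((x) * 2 + 1000)

/* The name is followed by '(' so the macro is expanded. */
static int via_macro(void)
{
    return twice(3);
}

/* The name is followed by ')' so no expansion takes place and the
   real function is called. */
static int via_function(void)
{
    return (twice)(3);
}
```

With these definitions via_macro() yields 1006 and via_function() yields 6.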


Object-like macros are conventionally used as part of good programming practice to create symbolic names for constants, e.g.

 #define PI 3.14159 

instead of hard-coding those numbers throughout one's code.


An example of a function-like macro is:

 #define RADTODEG(x) ((x) * 57.29578) 

This defines a radians-to-degrees conversion which can be used subsequently, e.g. RADTODEG(34). This is expanded in-place, so the caller does not need to litter copies of the multiplication constant all over the code. The macro here is written in all uppercase to emphasize that it is a macro, not a compiled function.


Note that the macro uses parentheses both around the argument and around the entire expression. Omitting either of these can lead to unexpected results. For example:

  • Without parentheses around the argument:
      • Macro defined as #define RADTODEG(x) (x * 57.29578)
      • RADTODEG(a + b) expands to (a + b * 57.29578)
  • Without parentheses around the whole expression:
      • Macro defined as #define RADTODEG(x) (x) * 57.29578
      • 1 / RADTODEG(a) expands to 1 / (a) * 57.29578

neither of which gives the probably intended result.
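
The two pitfalls above can be checked numerically. In this sketch the names RADTODEG_BAD_ARG and RADTODEG_BAD_OUTER, and the four helper functions, are hypothetical and exist only to place the broken variants next to the correct macro.

```c
#define RADTODEG_BAD_ARG(x)   (x * 57.29578)    /* argument not parenthesized */
#define RADTODEG_BAD_OUTER(x) (x) * 57.29578    /* result not parenthesized */
#define RADTODEG(x)           ((x) * 57.29578)  /* fully parenthesized */

/* Expands to (a + b * 57.29578): only b is converted. */
static double sum_bad(double a, double b)  { return RADTODEG_BAD_ARG(a + b); }

/* Expands to ((a + b) * 57.29578): the whole sum is converted. */
static double sum_good(double a, double b) { return RADTODEG(a + b); }

/* Expands to 1 / (a) * 57.29578, which groups as (1 / a) * 57.29578. */
static double inv_bad(double a)  { return 1 / RADTODEG_BAD_OUTER(a); }

/* Expands to 1 / ((a) * 57.29578), the reciprocal of the conversion. */
static double inv_good(double a) { return 1 / RADTODEG(a); }
```

For instance, with a = 1 and b = 2 the broken sum is about 115.59 while the correct one is about 171.89.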


Another example of a function-like macro is:

 #define MIN(a,b) ((a)>(b)?(b):(a)) 

This macro illustrates one of the dangers of using function-like macros. One of the arguments, a or b, will be evaluated twice when this "function" is called. So, if the expression MIN(++firstnum,secondnum) is evaluated, then firstnum may be incremented twice, not once as would be expected.
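
A short sketch makes the double evaluation visible by returning the incremented variable. The function min_int is a hypothetical replacement introduced only for comparison; it is not part of the original text.

```c
#define MIN(a,b) ((a)>(b)?(b):(a))

/* An ordinary function evaluates each argument exactly once. */
static int min_int(int a, int b)
{
    return a < b ? a : b;
}

/* MIN(++firstnum, secondnum) expands to
   ((++firstnum)>(secondnum)?(secondnum):(++firstnum)),
   so firstnum is incremented in the test and again in the result. */
static int increments_with_macro(void)
{
    int firstnum = 1, secondnum = 10;
    (void)MIN(++firstnum, secondnum);
    return firstnum;
}

/* The same call through the function increments firstnum once. */
static int increments_with_function(void)
{
    int firstnum = 1, secondnum = 10;
    (void)min_int(++firstnum, secondnum);
    return firstnum;
}
```

Starting from 1, the macro version leaves firstnum at 3 and the function version leaves it at 2.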


One of the most subtle and easily abused features of the C preprocessor is token concatenation (often called token pasting). Within a macro definition, two tokens can be 'glued' together into a single token using the ## preprocessor operator. Note that this joins tokens such as identifiers in the preprocessed code; it does not concatenate string literals. It can be used to construct elaborate macros which act much like C++ templates (without many of their benefits).


For instance:

 #define MYCASE(_item,_id) \
     case _id: \
         _item##_##_id=_id; \
         break

 switch(x) {
     MYCASE(widget,23);
 }

The line MYCASE(widget,23) gets expanded here into case 23: widget_23=23; break. (The semicolon after the right parenthesis is not part of the expansion, but becomes the semicolon that completes the break statement.)


Note that the _ between the ## is 'literal' whereas the _id and _item arguments are 'arguments' to the function-style macro.
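
Token pasting can also generate whole families of related declarations. The following sketch uses a hypothetical DEFINE_COUNTER macro, invented for this example, to create a variable and an accessor function per name.

```c
/* Pastes the given name onto fixed suffixes to form two identifiers:
   name_count (a variable) and name_next (a function). */
#define DEFINE_COUNTER(name) \
    static int name##_count = 0; \
    static int name##_next(void) { return ++name##_count; }

DEFINE_COUNTER(widget)   /* defines widget_count and widget_next() */
DEFINE_COUNTER(gadget)   /* defines gadget_count and gadget_next() */
```

Each invocation produces an independent counter, so widget_next() and gadget_next() do not interfere with one another.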


One stylistic note about this macro is that the semicolon on the last line of the macro definition is omitted so that the macro looks 'natural' when written. It could be included in the macro definition, but then there would be lines in the code without semicolons at the end which would throw off the casual reader.


The macro can be extended over as many lines as required using a backslash escape at the end of the line. The macro ends on the last line which does not end in a backslash.


Properly used, multi-line macros can greatly reduce the size and complexity of a C program and enhance its readability and maintainability.
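
One widely used way to write a multi-line macro that still reads 'naturally' with a trailing semicolon is to wrap its body in do { ... } while (0). This is a sketch of that idiom rather than something taken from the text above; SWAP_INT and larger_first are hypothetical names.

```c
/* The do/while (0) wrapper turns the body into a single statement that
   is completed by the semicolon the caller writes after the macro. */
#define SWAP_INT(a, b) \
    do { \
        int swap_tmp = (a); \
        (a) = (b); \
        (b) = swap_tmp; \
    } while (0)

/* Orders two values so the larger comes first. Returns 1 if a swap was
   needed and 0 otherwise. With a plain { ... } body in the macro, the
   semicolon before else would be a syntax error. */
static int larger_first(int *x, int *y)
{
    if (*x < *y)
        SWAP_INT(*x, *y);
    else
        return 0;
    return 1;
}
```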


Quoting the Macro Arguments

Although macro expansion does not occur within a quoted string, the text of the macro arguments can be quoted and treated as a string literal by using the "#" operator. For example, with the macro

 #define QUOTEME(x) #x 

the code

 printf("%s\n", QUOTEME(1+2));

will expand to

 printf("%s\n", "1+2");
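
An argument of the "#" operator is not macro-expanded before it is quoted. When the expanded value is wanted instead, a common approach is to go through a second macro. In this sketch QUOTEME_EXPANDED, ANSWER and the two helper functions are hypothetical names added for illustration.

```c
#define QUOTEME(x) #x
/* The extra level lets the argument be expanded before QUOTEME
   applies the # operator to it. */
#define QUOTEME_EXPANDED(x) QUOTEME(x)

#define ANSWER 42

/* Quotes the macro name itself: "ANSWER". */
static const char *quoted_name(void)
{
    return QUOTEME(ANSWER);
}

/* Quotes the replacement text of the macro: "42". */
static const char *quoted_value(void)
{
    return QUOTEME_EXPANDED(ANSWER);
}
```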

This capability can be used with automatic string concatenation to make debugging macros. For example, the macro in

 #define dumpme(x, fmt) printf("%s:%d: %s=" fmt, __FILE__, __LINE__, #x, x)

 int some_function()
 {
     int foo;
     /* [a lot of complicated code goes here] */
     dumpme(foo, "%d");
     /* [more complicated code goes here] */
 }

would print the name of an expression and its value, along with the file name and the line number.


X-Macros

One little-known use pattern of the C preprocessor is known by the name X-Macros. X-Macros are the practice of using the #include directive multiple times on the same source header file, each time in a different environment of defined macros.

 File: commands.x.h

 COMMAND(ADD, "Addition command")
 COMMAND(SUB, "Subtraction command")
 COMMAND(XOR, "Exclusive-or command")

 enum command_indices {
 #define COMMAND(name, description) COMMAND_##name ,
 #include "commands.x.h"
 #undef COMMAND
     COMMAND_COUNT /* The number of existing commands */
 };

 typedef result_t (*command_handler_t)(state_t *);

 command_handler_t command_handlers[] = {
 #define COMMAND(name, description) &handler_##name ,
 #include "commands.x.h"
 #undef COMMAND
     NULL
 };

 char *command_descriptions[] = {
 #define COMMAND(name, description) description ,
 #include "commands.x.h"
 #undef COMMAND
     NULL
 };

The above allows for definition of new commands to consist of changing the command list in the X-Macro header file, and defining a new command handler of the proper name. The command descriptions list, handler list, and enumeration are updated automatically by the preprocessor.
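
The same idea can be tried in a single file by passing the per-entry macro as an argument to a list macro instead of re-including a header. This is a variant of the pattern, not the header-based form shown above; COMMAND_LIST, AS_ENUM, AS_NAME and AS_DESCRIPTION are hypothetical names, and the handler table is left out to keep the sketch self-contained.

```c
/* The single list of commands; X is applied to every entry. */
#define COMMAND_LIST(X) \
    X(ADD, "Addition command") \
    X(SUB, "Subtraction command") \
    X(XOR, "Exclusive-or command")

/* First expansion: one enumerator per command. */
#define AS_ENUM(name, description) COMMAND_##name,
enum command_index {
    COMMAND_LIST(AS_ENUM)
    COMMAND_COUNT /* The number of existing commands */
};
#undef AS_ENUM

/* Second expansion: the command names as strings. */
#define AS_NAME(name, description) #name,
static const char *const command_names[] = { COMMAND_LIST(AS_NAME) };
#undef AS_NAME

/* Third expansion: the descriptions, in the same order. */
#define AS_DESCRIPTION(name, description) description,
static const char *const command_descriptions[] = { COMMAND_LIST(AS_DESCRIPTION) };
#undef AS_DESCRIPTION
```

Adding a fourth line to COMMAND_LIST would extend the enumeration and both tables together, which is the point of the technique.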


User-defined compilation errors

The #error directive inserts an error message into the compiler output.

 #error "Gosh!" 

This prints Gosh! in the compiler output and halts the compilation at that point. This is extremely useful if you aren't sure whether a given line is being compiled or not. It is also useful if you have a heavily parameterized body of code and want to make sure a particular #define has been introduced from the makefile, e.g.:

 #ifdef WINDOWS
 ... /* windows specific code */
 #elif defined(UNIX)
 ... /* unix specific code */
 #else
 #error "What's your operating system?"
 #endif
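
The #error directive is also handy for validating a configuration value before it is used. In the sketch below BUFFER_SIZE, buffer and buffer_size are hypothetical; the macro is defined in the source only so that the example compiles on its own, whereas the checks are aimed at a value supplied by the build.

```c
/* Hypothetical setting, normally supplied from the makefile. */
#define BUFFER_SIZE 256

#ifndef BUFFER_SIZE
#error "BUFFER_SIZE must be defined"
#elif BUFFER_SIZE < 16
#error "BUFFER_SIZE must be at least 16"
#endif

/* Only reached by the compiler when both checks have passed. */
static char buffer[BUFFER_SIZE];

static unsigned long buffer_size(void)
{
    return (unsigned long)sizeof buffer;
}
```

Removing the #define, or lowering it below 16, stops the build with the corresponding message instead of producing a subtly broken program.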

Compiler-specific preprocessor features

The #pragma directive is a compiler-specific directive which compiler vendors may use for their own purposes. For instance, a #pragma is often used to allow suppression of specific error messages, manage heap and stack debugging, etc.


Standard positioning macros

Certain symbols are predefined in ANSI C. Two useful ones are __FILE__ and __LINE__, which expand into the current file and line number. For instance:

 // debugging macros so we can pin down message provenance at a glance
 #define WHERESTR "[file %s, line %d] "
 #define WHEREARG __FILE__,__LINE__

 printf(WHERESTR ": hey, x=%d\n", WHEREARG, x);

This prints the value of x, preceded by the file and line number, allowing quick access to which line the message was produced on. Note that the WHERESTR argument is concatenated with the following string.
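
The same pair of macros can be combined with snprintf when the message should go into a buffer rather than straight to standard output. This is a sketch only; WHERE_FMT, WHERE_ARGS and format_trace are hypothetical names modelled on the example above.

```c
#include <stdio.h>

#define WHERE_FMT  "[file %s, line %d] "
#define WHERE_ARGS __FILE__, __LINE__

/* Writes a location-prefixed message into buf. __FILE__ and __LINE__
   are expanded where WHERE_ARGS is used, that is, inside this function.
   Returns the value that snprintf returns. */
static int format_trace(char *buf, size_t size, int x)
{
    /* Adjacent string literals are concatenated by the compiler. */
    return snprintf(buf, size, WHERE_FMT "x=%d", WHERE_ARGS, x);
}
```

Because the location macros expand at the point of use, wrapping them in a function reports the function's own line; a macro wrapper would be needed to report the caller's line.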


The Wikipedia article included on this page is licensed under the GFDL.