AWK | Paradigm: | scripting language, procedural, event-driven | | Appeared in: | 1977, last revised 1985, current POSIX edition is IEEE Std 1003.1-2004 | | Designed by: | Alfred Aho, Peter Weinberger, and Brian Kernighan | | Typing discipline: | none; can handle strings, integers and floating point numbers; regular expressions | | Major implementations: | awk, GNU Awk, mawk, nawk, MKS AWK, Thompson AWK (compiler), Awka (compiler) | | Dialects: | old awk oawk 1977, new awk nawk 1985, GNU Awk | | Influenced by: | C, SNOBOL4, Bourne shell | | Influenced: | Perl, Korn Shell (ksh93, dtksh, tksh), JavaScript | | OS: | Cross-platform | | Website: | [1] | | AWK is a general purpose computer language that is designed for processing text-based data, either in files or data streams. The name AWK is derived from the surnames of its authors — Alfred Aho, Peter Weinberger, and Brian Kernighan; however, it is commonly pronounced "awk" and not as a string of separate letters. awk, when written in all lowercase letters, refers to the Unix or Plan 9 program that runs other programs written in the AWK programming language. A programming paradigm is a paradigmatic style of programming (compare with a methodology, which is a paradigmatic style of doing software engineering). ...
Scripting languages (commonly called scripting programming languages or script languages) are computer programming languages created to shorten the traditional edit-compile-link-run process. ...
This article or section does not cite its references or sources. ...
Event-driven programming is a computer programming paradigm. ...
Dr. Alfred V. Aho is a computer scientist. ...
Peter J. Weinberger is a computer scientist who worked at AT&T Bell Labs and contributed to the design of the pioneering AWK programming language (he is the W in AWK). ...
Brian Wilson Kernighan (pronounced Ker-ni-han; the g is silent; born 1942) is a computer scientist who worked at the Bell Labs and contributed to the design of the pioneering AWK and AMPL programming languages. ...
In computer science, a type system defines how a programming language classifies values and expressions into types, how it can manipulate those types and how they interact. ...
Wikibooks has a book on the topic of C Programming The C programming language (often, just C) is a general-purpose, procedural, imperative computer programming language developed in the early 1970s by Dennis Ritchie for use on the Unix operating system. ...
SNOBOL (StriNg Oriented symBOlic Language) is a computer programming language developed between 1962 and 1967 at AT&T Bell Laboratories by David J. Farber, Ralph E. Griswold and Ivan P. Polonsky. ...
The Bourne shell, or sh, was the default Unix shell of Unix Version 7, and replaced the Thompson shell, whose executable file had the same name, sh. ...
Perl, also Practical Extraction and Report Language (a backronym, see below) is a dynamic procedural programming language designed by Larry Wall and first released in 1987. ...
Korn shell logo. ...
The syntax of JavaScript is a set of rules that defines how a JavaScript program will be written and interpreted. ...
To meet Wikipedias quality standards, this article or section may require cleanup. ...
A cross-platform (or platform independent) programming language, software application or hardware device works on more than one system platform (e. ...
Website - Wikipedia, the free encyclopedia /**/ @import /skins-1. ...
A computer language is a language used by, or in association with, computers. ...
A family name, or surname, is that part of a persons name that indicates to what family he or she belongs. ...
Dr. Alfred V. Aho is a computer scientist. ...
Peter J. Weinberger is a computer scientist who worked at AT&T Bell Labs and contributed to the design of the pioneering AWK programming language (he is the W in AWK). ...
Brian Wilson Kernighan (pronounced Ker-ni-han; the g is silent; born 1942) is a computer scientist who worked at the Bell Labs and contributed to the design of the pioneering AWK and AMPL programming languages. ...
Unix or UNIX is a computer operating system originally developed in the 1960s and 1970s by a group of AT&T Bell Labs employees including Ken Thompson, Dennis Ritchie, and Douglas McIlroy. ...
To meet Wikipedias quality standards, this article or section may require cleanup. ...
AWK is an example of a programming language that extensively uses the string datatype, associative arrays (that is, arrays indexed by key strings), and regular expressions. The power, terseness, and limitations of AWK programs and sed scripts inspired Larry Wall to write Perl. Because of their dense notation, all these languages are often used for writing one-liner programs. A programming language is an artificial language that can be used to control the behavior of a machine (often a computer). ...
In computer programming and some branches of mathematics, strings are sequences of various simple objects. ...
In computer science, a datatype or data type (often simply a type) is a name or label for a set of values and some operations which one can perform on that set of values. ...
It has been suggested that Map (C++ container) be merged into this article or section. ...
A regular expression (abbreviated as regexp or regex, with plural forms regexps, regexes, or regexen) is a string that describes or matches a set of strings, according to certain syntax rules. ...
The title of this article should be sed. ...
Larry Wall (b. ...
Perl, also Practical Extraction and Report Language (a backronym, see below) is a dynamic procedural programming language designed by Larry Wall and first released in 1987. ...
A one-liner is a computer program or expression that takes no more than a single line. ...
AWK is one of the early tools to appear in Version 7 Unix and gained popularity as a way to add computational features to a Unix pipeline. A version of the AWK language is a standard feature of nearly every modern Unix-like operating system available today. AWK is mentioned in the Single UNIX Specification as one of the mandatory utilities of a Unix operating system. Besides the Bourne shell, AWK is the only other scripting language available in a standard Unix environment. Implementations of AWK exist as installed software for almost all other operating systems. Seventh Edition Unix, also called Version 7 Unix, Version 7 or just V7, was an important early release of the Unix operating system. ...
A Unix-like operating system is one that behaves in a manner similar to a Unix system, while not necessarily conforming to or being certified to any version of the Single UNIX Specification. ...
To meet Wikipedias quality standards, this article or section may require cleanup. ...
The Single UNIX Specification (SUS) is the collective name of a family of standards for computer operating systems to qualify for the name Unix. The SUS is developed and maintained by the Austin Group, based on earlier work by the IEEE and The Open Group. ...
Unix or UNIX is a computer operating system originally developed in the 1960s and 1970s by a group of AT&T Bell Labs employees including Ken Thompson, Dennis Ritchie, and Douglas McIlroy. ...
To meet Wikipedias quality standards, this article or section may require cleanup. ...
The Bourne shell, or sh, was the default Unix shell of Unix Version 7, and replaced the Thompson shell, whose executable file had the same name, sh. ...
Structure of AWK programs Generally speaking, two pieces of data are given to AWK: a command file and a primary input file. A command file (which can be an actual file, or can be included in the command line invocation of awk) contains a series of commands which tell AWK how to process the input file. The primary input file is typically text that is formatted in some way; it can be an actual file, or it can be read by awk from the standard input. A typical AWK program consists of a series of lines, each of the form A command line interface or CLI is a method of interacting with a computer by giving it lines of textual commands (that is, a sequence of characters) either from keyboard input or from a script. ...
/pattern/ { action } where pattern is a regular expression and action is a command. Most implementations of AWK use extended regular expressions by default. AWK looks through the input file; when it finds a line that matches pattern, it executes the command(s) specified in action. Alternate line forms include: A regular expression (abbreviated as regexp or regex, with plural forms regexps, regexes, or regexen) is a string that describes or matches a set of strings, according to certain syntax rules. ...
- BEGIN { action }
- Executes action commands at the beginning of the script execution, i.e. before any of the lines are processed.
- END { action }
- Similar to the previous form, but executes action after the end of input.
- /pattern/
- Prints any lines matching pattern.
- { action }
- Executes action for each line in the input.
Each of these forms can be included multiple times in the command file. Lines in the command file are executed in order, so if there are two "BEGIN" statements, the first is executed, then the second, and then the rest of the lines. BEGIN and END statements do not have to be located before and after (respectively) the other lines in the command file. AWK was created as a broadbased replacement to C algorithmic approaches developed to integrate text parsing methods. Wikibooks has a book on the topic of C Programming The C programming language (often, just C) is a general-purpose, procedural, imperative computer programming language developed in the early 1970s by Dennis Ritchie for use on the Unix operating system. ...
AWK commands AWK commands are the statement that is substituted for action in the examples above. AWK commands can include function calls, variable assignments, calculations, or any combination thereof. AWK contains built-in support for many functions; many more are provided by the various flavors of AWK. Also, some flavors support the inclusion of dynamically linked libraries, which can also provide more functions. In computer science, a library is a collection of subprograms used to develop software. ...
For brevity, the enclosing curly braces ( { } ) will be omitted from these examples.
The print command The print command is used to output text. The simplest form of this command is print This displays the contents of the current line. In AWK, lines are broken down into fields, and these can be displayed separately: - print $1
- Displays the first field of the current line
- print $1, $3
- Displays the first and third fields of the current line, separated by a predefined string called the output field separator (OFS) whose default value is a single space character
Although these fields ($X) may bear resemblance to variables (the $ symbol indicates variables in perl), they actually refer to the fields of the current line. A special case, $0, refers to the entire line. In fact, the commands "print" and "print $0" are identical in functionality. The print command can also display the results of calculations and/or function calls: print 3+2 print foobar(3) print foobar(variable) print sin(3-2) Output may be sent to a file: print "expression" > "file name" Variables, et cetera Variable names can use any of the characters [A-Za-z0-9_], with the exception of language keywords. The operators + - * / are addition, subtraction, multiplication, and division, respectively. For string concatenation, simply place two variables (or string constants) next to each other, optionally with a space in between. String constants are delimited by double quotes. Statements need not end with semicolons. Finally, comments can be added to programs by using # as the first character on a line. Delimited data uses specific characters (delimiters) to separate its values. ...
User-defined functions In a format similar to C, function definitions consist of the keyword function, the function name, argument names and the function body. Here is an example function: Wikibooks has a book on the topic of C Programming The C programming language (often, just C) is a general-purpose, procedural, imperative computer programming language developed in the early 1970s by Dennis Ritchie for use on the Unix operating system. ...
function add_three (number, temp) { temp = number + 3 return temp } This statement can be invoked as follows: print add_three(36) # prints 39 Functions can have variables that are in the local scope. The names of these are added to the end of the argument list, though values for these should be omitted when calling the function. It is convention to add some whitespace in the argument list before the local variables, in order to indicate where the parameters end and the local variables begin. For information on the programming language Whitespace, see Whitespace programming language. ...
Sample applications Hello World Here is the ubiquitous "Hello world program" program written in AWK: A hello world program is a software program that prints out Hello, world! on a display device. ...
BEGIN { print "Hello, world!"; exit } Print lines longer than 80 characters Print all lines longer than 80 characters. Note that the default action is to print the current line. length > 80 Print a count of words Count words in the input, and print lines, words, and characters (like wc) wc (short for word count) is a command in Unix-like operating systems. ...
{ w += NF; c += length} END { print NR, w, c } Sum first column Sum first column of input { s += $1 } END { print s } Calculate word frequencies Word frequency, (uses associative arrays) It has been suggested that Map (C++ container) be merged into this article or section. ...
BEGIN { FS="[^a-zA-Z]+"} { for (i=1; i<=NF; i++) words[tolower($i)]++ } END { for (i in words) print i, words[i] } Self-contained AWK scripts As with many other programming languages, self-contained AWK script can be constructed using the so-called "shebang" syntax. In computing, a shebang is a specific pair of characters used in a special line that begins a text file (commonly called a script) causing Unix-like operating systems to execute the commands in the text file using a specified interpreter (program) when executed. ...
For example, a UNIX command called hello.awk that prints the string "Hello, world!" may be built by going first creating a file named hello.awk containing the following lines: #!/usr/bin/awk -f BEGIN { print "Hello, world!"; exit } AWK versions and implementations AWK was originally written in 1977, and distributed with Version 7 Unix. For the album by Ash, see 1977 (album). ...
Seventh Edition Unix, also called Version 7 Unix, Version 7 or just V7, was an important early release of the Unix operating system. ...
In 1985 its authors started expanding the language, most significantly by adding user-defined functions. The language is described in the book The AWK Programming Language, published 1988, and its implementation was made available in releases of UNIX System V. To avoid confusion with the incompatible older version, this version was sometimes known as "new awk" or nawk. This implementation was released under a free software license in 1996, and is still maintained by Brian Kernighan. (see external links below) 1985 (MCMLXXXV) was a common year starting on Tuesday of the Gregorian calendar. ...
1988 (MCMLXXXVIII) was a leap year starting on Friday of the Gregorian calendar. ...
It has been suggested that Traditional Unix be merged into this article or section. ...
Generally speaking, free software license is a phrase used by the free software movement to mean any software license that meets the free software definition of the Free Software Foundation (FSF). ...
1996 (MCMXCVI) was a leap year starting on Monday of the Gregorian calendar, and was designated the International Year for the Eradication of Poverty. ...
GNU awk, or gawk, is another free software implementation. It was written before the original implementation became freely available, and is still widely used. Almost every Linux distribution comes with a recent version of gawk and gawk is widely recognized as the de-facto standard implementation in the Linux world; gawk version 3.0 was included as awk in FreeBSD up to version 5.0. Subsequent versions of FreeBSD use nawk in order to avoid the GPL, a more restrictive (in the sense that GPL licensed code cannot be modified to become proprietary software) license than the BSD license. GNU (pronounced ) is a free operating system consisting of a kernel, libraries, system utilities, compilers, and end-user applications. ...
A Linux distribution is a Unix-like operating system comprising the Linux kernel and other assorted free software/open-source software, and possibly proprietary software. ...
Linux (also known as GNU/Linux) is a Unix-like computer operating system. ...
FreeBSD is a Unix-like free operating system descended from AT&T UNIX via the Berkeley Software Distribution (BSD) branch through 386BSD and 4. ...
The GNU logo Wikisource has original text related to this article: GNU General Public License The GNU General Public License (GNU GPL or simply GPL) is a widely used free software license, originally written by Richard Stallman for the GNU project. ...
Two of the most common free software licenses are the BSD license and the GNU General Public License license (GPL). ...
The BSD license is a permissive license and is one of the most widely used free software licenses. ...
xgawk is a SourceForge project based on gawk. It extends gawk with dynamically loadable libraries. mawk is a very fast AWK implementation by Mike Brennan based on a byte code interpreter. It has been suggested that this article or section be merged with Managed code. ...
Downloads and further information about these versions are available from the sites listed below. Thompson AWK or TAWK is an AWK compiler for DOS and Windows, previously sold by Thompson Automation Software (which has ceased its activities). A diagram of the operation of a typical multi-language compiler. ...
Microsofts disk operating system, MS-DOS, was Microsofts implementation of DOS, which was the first popular operating system for the IBM PC, and until recently, was widely used on the PC compatible platform. ...
Microsoft Windows is a family of operating systems by Microsoft. ...
Jawk is a SourceForge project to implement AWK in Java. Extensions to the language are added to provide access to Java features within AWK scripts (i.e., Java threads, sockets, Collections, etc). It has been suggested that this article or section be merged with Java (Sun). ...
It has been suggested that this article or section be merged with Java (Sun). ...
Digression - The bird emblematic of AWK (a.o. on The AWK Programming Language book cover) is the Auk.
Genera Alle Uria Alca Pinguinus Cepphus Brachyramphus Synthliboramphus Ptychoramphus Cyclorrhynchus Aethia Cerorhinca Fratercula This article is about a family of birds. ...
Books Dr. Alfred V. Aho is a computer scientist. ...
Brian Kernighan¹ (born 1942) is a computer scientist who worked at the Bell Labs and contributed to the design of the pioneering AWK and AMPL programming languages. ...
Peter J. Weinberger is a computer scientist who worked at AT&T Bell Labs and contributed to the design of the pioneering AWK programming language (he is the W in AWK). ...
Dale Dougherty is one of the co-founders (with Tim OReilly) of OReilly Media. ...
Programming Perl is a classic OReilly book. ...
See also The title of this article should be sed. ...
This is a list of Unix programs. ...
External links |