|
In computer science, the Kolmogorov complexity (also known as descriptive complexity, Kolmogorov-Chaitin complexity, stochastic complexity, algorithmic entropy, or program-size complexity) of an object such as a piece of text is a measure of the computational resources needed to specify the object. For example consider the following two strings of length 64 Computer science, or computing science, is the study of the theoretical foundations of information and computation and their implementation and application in computer systems. ...
Andrey Nikolaevich Kolmogorov (Russian: ÐндÑеÌй ÐиколаÌÐµÐ²Ð¸Ñ ÐолмогоÌÑов) (April 25, 1903 - October 20, 1987) was a Soviet mathematician who made major advances in different academic fields (among them probability theory, topology, intuitionistic logic, turbulence, classical mechanics and computational complexity). ...
Look up computation in Wiktionary, the free dictionary. ...
In computer programming and formal language theory, (and other branches of mathematics), a string is an ordered sequence of symbols. ...
0101010101010101010101010101010101010101010101010101010101010101 1100100001100001110111101110110011111010010000100101011110010110 The first string admits a short English language description, namely "32 repetitions of '01'", which consists of 20 characters. The second one has no obvious simple description other than writing down the string itself, which has 64 characters. It is certainly possible that there is a more succinct descriptions, but they may be arbitrarily difficult to discover. The English language is a West Germanic language that originates in England. ...
This image illustrates part of the Mandelbrot set fractal. The size of the JPEG file encoding the bitmap of this image is more than 17 kilobytes (approximately 140000 bits). The same file can be generated by a computer program much shorter than 140000 bits, however. Thus, the Kolmogorov complexity of the JPEG file encoding the bitmap is much less than 140000. More formally, the complexity of a string is the length of the string's shortest description in some fixed description language. The sensitivity of complexity relative to the choice of description language is discussed below. It can be shown that the Kolmogorov complexity of any string cannot be too much larger than the length of the string itself. Strings whose Kolmogorov complexity is small relative to the string's size are not considered to be complex. The notion of Kolmogorov complexity is surprisingly deep and can be used to state and prove impossibility results akin to Gödel's incompleteness theorem and Turing's halting problem. Download high resolution version (1024x768, 189 KB)This is a file from the Wikimedia Commons, a repository of free content hosted by the Wikimedia Foundation. ...
Download high resolution version (1024x768, 189 KB)This is a file from the Wikimedia Commons, a repository of free content hosted by the Wikimedia Foundation. ...
Initial image of a Mandelbrot set zoom sequence with continuously coloured environment The Mandelbrot set is a set of points in the complex plane that forms a fractal. ...
The boundary of the Mandelbrot set is a famous example of a fractal. ...
JPG redirects here. ...
Complexity in general usage is the opposite of simplicity. ...
Description language may refer generally a kind of language which is used to describe an interface to another. ...
In mathematical logic, Gödels incompleteness theorems are two celebrated theorems proven by Kurt Gödel in 1931. ...
In computability theory the halting problem is a decision problem which can be stated as follows: Given a description of a program and a finite input, decide whether the program finishes running or will run forever, given that input. ...
Definition To define Kolmogorov complexity, we must first specify a description language for strings. Such a description language can be based on a programming language such as Lisp, Pascal, or Java Virtual Machine bytecode. If P is a program which outputs a string x, then P is a description of x. The length of the description is just the length of P as a character string. In determining the length of P, the lengths of any subroutines used in P must be accounted for. The length of any integer constant n which occurs in the program P is the number of bits required to represent n, that is (roughly) log2n. Lisp is a family of computer programming languages with a long history and a distinctive fully-parenthesized syntax. ...
Pascal is a structured imperative computer programming language, developed in 1970 by Niklaus Wirth as a language particularly suitable for structured programming. ...
A Java Virtual Machine (JVM) is a set of computer software programs and data structures which implements a specific virtual machine model. ...
We could alternatively choose an encoding for Turing machines, where an encoding is a function which associates to each Turing Machine M a bitstring <M>. If M is a Turing Machine which on input w outputs string x, then the concatenated string <M> w is a description of x. For theoretical analysis, this approach is more suited for constructing detailed formal proofs and is generally preferred in the research literature. In this article we will use an informal approach. For the test of artificial intelligence, see Turing test. ...
Any string s has at least one description, namely the program function GenerateFixedString() return s Among all the descriptions of s, there is one with shortest length denoted d(s). In case there is more than one program of the same minimal length, choose one arbitrarily, for example selecting the lexicographically first among them. d(s) is the minimal description of s. The Kolmogorov complexity of s, written K(s), is  In the other words, K(s) is the length of the minimal description of s. We now consider how the choice of description language affects the value of K and show that the effect of changing the description language is bounded. Theorem. If K1 and K2 are the complexity functions relative to description languages L1 and L2, then there is a constant c (which depends only on the languages L1 and L2) such that  Proof. By symmetry, it suffices to prove that there is some constant c such that for all bitstrings s,  To see why this is so, there is a program in the language L1 which acts as an interpreter for L2: In computer science, an interpreter is a computer program that executes, or performs, instructions written in a computer programming language. ...
function InterpretLanguage(string p) where p is a program in L2. The interpreter is characterized by the following property: - Running InterpretLanguage on input p returns the result of running p.
Thus if P is a program in L2 which is a minimal description of s, then InterpretLanguage(P) returns the string s. The length of this description of s is the sum of - The length of the program InterpretLanguage, which we can take to be the constant c.
- The length of P which by definition is K2(s).
This proves the desired upper bound. See also invariance theorem. In algorithmic information theory, the invariance theorem, originally proved by Ray Solomonoff, states that a universal Turing machine provides an optimal means of description, up to a constant. ...
History and context Algorithmic information theory is the area of computer science that studies Kolmogorov complexity and other complexity measures on strings (or other data structures). The field was developed by Andrey Kolmogorov, Ray Solomonoff and Gregory Chaitin starting in the late 1960s. There are several variants of Kolmogorov complexity or algorithmic information. The most widely used one is based on self-delimiting programs and is mainly due to Leonid Levin (1974). This article or section is in need of attention from an expert on the subject. ...
A binary tree, a simple type of branching linked data structure. ...
Andrey Nikolaevich Kolmogorov (Russian: ÐндÑеÌй ÐиколаÌÐµÐ²Ð¸Ñ ÐолмогоÌÑов) (April 25, 1903 - October 20, 1987) was a Soviet mathematician who made major advances in different academic fields (among them probability theory, topology, intuitionistic logic, turbulence, classical mechanics and computational complexity). ...
Ray Solomonoff (born 1926) invented the concept of algorithmic probability around 1960. ...
Gregory John Chaitin (born 1947) is an Argentine-American mathematician and computer scientist. ...
Leonid Levin (born November 2, 1948, USSR) is a computer scientist. ...
Naming this concept "Kolmogorov complexity" is an example of the Matthew effect. The term Matthew effect may refer to a number of ideas all centrally related to a parable in the Gospel of Matthew, depending on context: // Matthew effect derives its name from a line spoken by the Master in Jesuss parable of the talents in the Christian Bibles book...
Basic results In the following, we will fix one definition and simply write K(s) for the complexity of the string s. It is not hard to see that the minimal description of a string cannot be too much larger than the string itself: the program GenerateFixedString above that outputs s is a fixed amount larger than s. Theorem. There is a constant c such that  Uncomputability of Kolmogorov complexity The first surprising result is that there is no way to effectively compute K. Theorem. K is not a computable function. Computable functions (or Turing-computable functions) are the basic objects of study in computability theory. ...
In other words, there is no program which takes a string s as input and produces the integer K(s) as output. We show this by contradiction. Suppose there is a program function KolmogorovComplexity(string s) that takes as input a string s and returns K(s). Now consider the program function GenerateComplexString(int n) for i = 1 to infinity: for each string s of length exactly i if KolmogorovComplexity(s) >= n return s quit This program calls KolmogorovComplexity as a subroutine. This program tries every string, starting with the shortest, until it finds a string with complexity at least n, then returns that string. Therefore, given any positive integer n, it produces a string with Kolmogorov complexity at least as great as n. The program itself has a fixed length U. The input to the program GenerateComplexString is an integer n; here, the size of n is measured by the number of bits required to represent n which is log2(n). Now consider the following program: function GenerateParadoxicalString() return GenerateComplexString(n0) This program calls GenerateComplexString as a subroutine and also has a free parameter n0. This program outputs a string s whose complexity is at least n0. By an auspicious choice of the parameter n0 we will arrive at a contradiction. To choose this value, note s is described by the program GenerateParadoxicalString whose length is at most  where C is the "overhead" added by the program GenerateParadoxicalString. Since n grows faster than log2(n), there exists a value n0 such that  But this contradicts the definition of having a complexity at least n0. Thus the program named "KolmogorovComplexity" cannot actually computably find the complexity of arbitrary strings. This is proof by contradiction where the contradiction is similar to the Berry paradox: "Let n be the smallest positive integer that cannot be defined in fewer than twenty English words." It is also possible to show the uncomputability of K by reduction from the uncomputability of the halting problem H, since K and H are turing-equivalent. [1] The Berry paradox is the apparent contradiction that arises from expressions such as the following: The smallest positive integer not nameable in under eleven words. ...
In computability theory the halting problem is a decision problem which can be stated as follows: Given a description of a program and a finite input, decide whether the program finishes running or will run forever, given that input. ...
In computability theory, the Turing degree of a subset of the natural numbers, , is the equivalence class of all subsets of equivalent to under Turing reducibility. ...
Chain rule for Kolmogorov complexity -
The chain rule for Kolmogorov complexity states that The chain rule for Kolmogorov complexity is an analogue of the chain rule for Information entropy, which states: That is, the combined randomness of two sequences X and Y is the sum of the randomness of X plus whatever randomness is left in Y once we know X. This follows...
 It states that the shortest program that reproduces X and Y is no more than a logarithmic factor larger than a program to reproduce X and a program to reproduce Y given X. Using this statement one can define an analogue of mutual information for Kolmogorov complexity. The Big O notation is a mathematical notation used to describe the asymptotic behavior of functions. ...
In probability theory and, in particular, information theory, the mutual information, or transinformation, of two random variables is a quantity that measures the mutual dependence of the two variables. ...
Compression It is however straightforward to compute upper bounds for K(s): simply compress the string s with some method, implement the corresponding decompressor in the chosen language, concatenate the decompressor to the compressed string, and measure the resulting string's length. âSource codingâ redirects here. ...
A string s is compressible by a number c if it has a description whose length does not exceed |s| − c. This is equivalent to saying K(s) ≤ |s| − c. Otherwise s is incompressible by c. A string incompressible by 1 is said to be simply incompressible; by the pigeonhole principle, incompressible strings must exist, since there are 2n bit strings of length n but only 2n−2 shorter strings, that is strings of length n − 1 or less. The inspiration for the name of the principle: pigeons in holes. ...
For the same reason, "most" strings are complex in the sense that they cannot be significantly compressed: K(s) is not much smaller than |s|, the length of s in bits. To make this precise, fix a value of n. There are 2n bitstrings of length n. The uniform probability distribution on the space of these bitstrings assigns to each string of length exactly n equal weight 2−n. In probability theory and statistics, the discrete uniform distribution is a discrete probability distribution that can be characterized by saying that all values of a finite set of possible values are equally probable. ...
Probability is the likelihood that something is the case or will happen. ...
Theorem. With the uniform probability distribution on the space of bitstrings of length n, the probability that a string is incompressible by c is at least 1 − 2−c+1 + 2−n. To prove the theorem, note that the number of descriptions of length not exceeding n − c is given by the geometric series: In mathematics, a geometric progression is a sequence of numbers such that the quotient of any two successive members of the sequence is a constant called the common ratio of the sequence. ...
 There remain at least  many bitstrings of length n that are incompressible by c. To determine the probability divide by 2n. This theorem is the justification for various challenges in comp.compression FAQ. Despite this result, it is sometimes claimed by certain individuals (considered cranks) that they have produced algorithms which uniformly compress data without loss. See lossless data compression. Crank is a pejorative term for a person who holds some belief which the vast majority of his contemporaries would consider false, clings to this belief in the face of all counterarguments or evidence presented to him. ...
Lossless data compression is a class of data compression algorithms that allows the exact original data to be reconstructed from the compressed data. ...
Chaitin's incompleteness theorem We know that most strings are complex in the sense that they cannot be described in any significantly "compressed" way. However, it turns out that the fact that a specific string is complex cannot be formally proved, if the string's length is above a certain threshold. The precise formalization is as follows. First fix a particular axiomatic system S for the natural numbers. The axiomatic system has to be powerful enough so that to certain assertions A about complexity of strings one can associate a formula FA in S. This association must have the following property: if FA is provable from the axioms of S, then the corresponding assertion A is true. This "formalization" can be achieved either by an artificial encoding such as a Gödel numbering or by a formalization which more clearly respects the intended interpretation of S. In mathematics, an axiomatic system is any set of axioms from which some or all axioms can be used in conjunction to logically derive theorems. ...
In mathematics, a natural number can mean either an element of the set {1, 2, 3, ...} (i. ...
In formal number theory a Gödel numbering is a function which assigns to each symbol and formula of some formal language a unique natural number called a Gödel number (GN). ...
Theorem. There exists a constant L (which only depends on the particular axiomatic system and the choice of description language) such that there does not exist a string s for which the statement  (as formalized in S) can be proven within the axiomatic system S. Note that by the abundance of nearly incompressible strings, the vast majority of those statements must be true. The proof of this result is modeled on a self-referential construction used in Berry's paradox. The proof is by contradiction. If the theorem were false, then The Berry paradox is the apparent contradiction that arises from expressions such as the following: The smallest positive integer not nameable in under eleven words. ...
- Assumption (X): For any integer n there exists a string s for which there is a proof in S of the formula "K(s) ≥ n" (which we assume can be formalized in S).
We can find an effective enumeration of all the formal proofs in S by some procedure function NthProof(int n) which takes as input n and outputs some proof. This function enumerates all proofs. Some of these are proofs for formulas we do not care about here (examples of proofs which will be listed by the procedure NthProof are the various known proofs of the law of quadratic reciprocity, those of Fermat's little theorem or the proof of Fermat's last theorem all translated into the formal language of S). Some of these are complexity formulas of the form K(s) ≥ n where s and n constants in the language of S. There is a program In mathematics, in number theory, the law of quadratic reciprocity, conjectured by Euler and Legendre and first satisfactorily proved by Gauss, connects the solvability of two related quadratic equations in modular arithmetic. ...
Fermats little theorem (not to be confused with Fermats last theorem) states that if p is a prime number, then for any integer a, This means that if you start with a number, initialized to 1, and repeatedly multiply, for a total of p multiplications, that number by...
Pierre de Fermats conjecture written in the margin of his copy of Arithmetica proved to be one of the most intriguing and enigmatic mathematical problems ever devised. ...
function NthProofProvesComplexityFormula(int n) which determines whether the nth proof actually proves a complexity formula K(s) ≥ L. The strings s and the integer L in turn are computable by programs: function StringNthProof(int n) function ComplexityLowerBoundNthProof(int n) Consider the following program function GenerateProvablyComplexString(int n) for i = 1 to infinity: if NthProofProvesComplexityFormula(i) and ComplexityLowerBoundNthProof(i) >= n return StringNthProof(i) quit Given an n, this program tries every proof until it finds a string and a proof in the formal system S of the formula K(s) ≥ n. The program terminates by our Assumption (X). Now this program has a length U. There is an integer n0 such that U + log2(n0) + C < n0, where C is the overhead cost of function GenerateProvablyParadoxicalString() return GenerateProvablyComplexString(n0) quit The program GenerateProvablyParadoxicalString outputs a string s for which K(s) ≥ n0 can be formally proved in S. In particular K(s) ≥ n0 is true. However, s is also described by a program of length U+log2(n0)+C so its complexity is less than n0. This contradiction proves Assumption (X) cannot hold. Similar ideas are used to prove the properties of Chaitin's constant. In the computer science subfield of algorithmic information theory the Chaitin constant or halting probability is a construction by Gregory Chaitin which describes the probability that a randomly generated program for a given model of computation or programming language will halt. ...
The minimum message length principle of statistical and inductive inference and machine learning was developed by C.S. Wallace and D.M. Boulton in 1968. MML is Bayesian (it incorporates prior beliefs) and information-theoretic. It has the desirable properties of statistical invariance (the inference transforms with a re-parametrisation, such as from polar coordinates to Cartesian coordinates), statistical consistency (even for very hard problems, MML will converge to any underlying model) and efficiency (the MML model will converge to any true underlying model about as quickly as is possible). C.S. Wallace and D.L. Dowe showed a formal connection between MML and algorithmic information theory (or Kolmogorov complexity) in 1999. Minimum message length (MML) is a formal information theory restatement of Occams Razor: even when models are not equal in goodness of fit accuracy to the observed data, the one generating the shortest overall message is more likely to be correct (where the message consists of a statement of...
Professor Christopher Stewart Wallace (26 October 1933â7 August 2004) was an Australian computer scientist (and physicist, etc. ...
Bayesian probability is an interpretation of probability suggested by Bayesian theory, which holds that the concept of probability can be defined as the degree to which a person believes a proposition. ...
Kolmogorov randomness Kolmogorov randomness (also called algorithmic randomness) defines a string (usually of bits) as being random if and only if it is shorter than any computer program that can produce that string. This definition of randomness is critically dependent on the definition of Kolmogorov complexity. To make this definition complete, a computer has to be specified, usually a Turing machine. According to the above definition of randomness, a random string is also an "incompressible" string, in the sense that it is impossible to give a representation of the string using a program whose length is shorter than the length of the string itself. However, it must also be noted that according to this definition, most strings shorter than a certain length end up to be (Chaitin-Kolmogorovically) random because the best one can do with very small strings is to write a program which simply prints these strings. In computer programming and formal language theory, (and other branches of mathematics), a string is an ordered sequence of symbols. ...
BIT is an acronym for: Bannari amman Institute of Technology Bangalore Institute of Technology Beijing Institute of Technology Benzisothiazolinone Bilateral Investment Treaty Bhilai Institute of Technology - Durg Birla Institute of Technology - Mesra Battles in Time (Doctor Who magazine) BIT International College, formerly the Bohol Institute of Technology in Bohol, Philippines...
âRandomâ redirects here. ...
A computer program is a collection of instructions that describe a task, or set of tasks, to be carried out by a computer. ...
For the test of artificial intelligence, see Turing test. ...
References - Thomas M. Cover, Joy A. Thomas. Elements of information theory, 1st Edition. New York: Wiley-Interscience, 1991. ISBN 0-471-06259-6.
- 2nd Edition. New York: Wiley-Interscience, 2006. ISBN 0-471-24195-4.
- Rónyai Lajos, Ivanyos Gábor, Szabó Réka, Algoritmusok. TypoTeX, 1999.
- Ming Li and Paul Vitányi, An Introduction to Kolmogorov Complexity and Its Applications, Springer, 1997. Introduction chapter full-text.
- Yu Manin, A Course in Mathematical Logic, Springer-Verlag, 1977.
- Michael Sipser, Introduction to the Theory of Computation, PWS Publishing Company, 1997.
See also The Berlekamp-Massey algorithm is a algorithm for finding the shortest linear feedback shift register (LFSR) for a given output sequence. ...
Chaitin-Kolmogorov randomness (also called algorithmic randomness) defines a string (usually of bits) as being random if and only if it is shorter than any computer program that can produce that string. ...
âSource codingâ redirects here. ...
This is a list of important publications in computer science, organized by field. ...
In information theory and computer science, the Levenshtein distance is a string metric which is one way to measure edit distance. ...
Fractal compression is a lossy compression method used to compress images using fractals. ...
External links - The Legacy of Andrei Nikolaevich Kolmogorov
- Chaitin's online publications
- Solomonoff's IDSIA page
- Generalizations of algorithmic information by J. Schmidhuber
- Ming Li and Paul Vitanyi, An Introduction to Kolmogorov Complexity and Its Applications, 2nd Edition, Springer Verlag, 1997.
- Tromp's lambda calculus computer model offers a concrete definition of K()
- Universal AI based on Kolmogorov Complexity ISBN 3-540-22139-5 by M. Hutter: ISBN 3-540-22139-5
- Minimum Message Length and Kolmogorov Complexity (by C.S. Wallace and D.L. Dowe, Computer Journal, Vol. 42, No. 4, 1999).
- David Dowe's Minimum Message Length (MML) and Occam's razor pages.
- P. Grunwald, M. A. Pitt and I. J. Myung (ed.), Advances in Minimum Description Length: Theory and Applications, M.I.T. Press, April 2005, ISBN 0-262-07262-9.
|