Algorithmic learning theory (or inductive inference) is a framework for machine learning.
The framework was introduced in E. Mark Gold's seminal paper "Language identification in the limit". The objective of language identification is for a machine running one program to be capable of developing another program by which any given sentence can be tested to determine whether it is "grammatical" or "ungrammatical". The language being learned need not be English or any other natural language; in fact, the definition of "grammatical" can be absolutely anything known to the tester.
In the framework of algorithmic learning theory, the tester gives the learner an example sentence at each step, and the learner responds with a hypothesis, which is a suggested program to determine grammatical correctness. The tester must ensure that every possible sentence (grammatical or not) eventually appears in the list, but no particular order is required. The learner's hypothesis at each step must be correct for all the sentences seen so far.
A particular learner is said to be able to "learn a language in the limit" if there is a certain number of steps beyond which its hypothesis no longer changes. At this point it has indeed learned the language, because every possible sentence appears somewhere in the sequence of inputs (past or future), and the hypothesis is correct for all inputs (past or future), so the hypothesis is correct for every sentence. The learner is not required to be able to tell when it has reached a correct hypothesis; all that is required is that it be true. Gold showed that any language which is defined by a Turing machine program can be learned in the limit by another Turing-complete machine using enumeration. The learner tests all possible Turing machine programs in turn until it finds one that is correct so far, and this program forms the hypothesis for the current step. Eventually, the correct program will be reached, after which the hypothesis will never change again (but note that the learner does not know that it won't need to change).
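The enumeration strategy described above can be illustrated with a small sketch. It is not Gold's construction: instead of enumerating Turing machine programs, it assumes a hypothetical toy class of "programs" indexed by k = 1, 2, 3, ..., where program k accepts exactly the strings whose length is a multiple of k. The names `make_hypothesis`, `first_consistent` and `learn` are invented for this illustration.

```python
from itertools import count


def make_hypothesis(k):
    """Hypothetical toy 'program' number k: accepts strings whose length is a multiple of k."""
    return lambda sentence: len(sentence) % k == 0


def first_consistent(seen):
    """Return the index of the first enumerated hypothesis that agrees with every labelled example.

    Assumption: the examples come from some hypothesis in the class;
    otherwise this search would never terminate.
    """
    for k in count(1):
        hypothesis = make_hypothesis(k)
        if all(hypothesis(sentence) == label for sentence, label in seen):
            return k


def learn(presentation):
    """Record the learner's guess after each labelled sentence from the tester."""
    seen = []
    guesses = []
    for sentence, label in presentation:
        seen.append((sentence, label))
        guesses.append(first_consistent(seen))
    return guesses


# The tester's hidden language is program 3; it labels "", "a", "aa", ... in turn.
target = make_hypothesis(3)
presentation = [("a" * n, target("a" * n)) for n in range(6)]
guesses = learn(presentation)
print(guesses)  # [1, 2, 3, 3, 3, 3]
```

In this run the guess stops changing from the third step onward, yet nothing inside `learn` signals that convergence has happened, which mirrors the point that the learner cannot tell when it has reached the correct hypothesis.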
Gold also showed that if the learner is given only positive examples (that is, only grammatical sentences appear in the input, not ungrammatical sentences), then the language can only be guaranteed to be learned in the limit if there are only a finite number of possible sentences in the language (this is possible if, for example, sentences are known to be of limited length).
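For the finite case, one simple learner (a sketch for illustration, not taken from Gold's paper) guesses that the language consists of exactly the sentences seen so far. The example language, the input sequence and the function name `learn_from_positive` are all assumptions made up for this sketch.

```python
def learn_from_positive(text):
    """After each positive example, yield the hypothesis 'exactly the sentences seen so far'."""
    seen = set()
    for sentence in text:
        seen.add(sentence)
        yield frozenset(seen)


# Hypothetical finite language and a positive-only presentation of it (repeats allowed).
language = {"ab", "ba", "abab"}
text = ["ab", "ba", "ab", "abab", "ba", "ab"]

hypotheses = list(learn_from_positive(text))

# Index of the first step at which the guess equals the whole language.
converged_at = next(i for i, h in enumerate(hypotheses) if h == language)
print(converged_at)  # 3
```

Once every sentence of a finite language has appeared, this guess never changes again. If the language were infinite, the same learner's guess would keep growing and would never settle.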
Language identification in the limit is a very theoretical model. It does not allow for limits on runtime or computer memory, which can occur in practice, and the enumeration method may fail if there are errors in the input. However, the framework is very powerful because, if these strict conditions are maintained, it allows the learning of any program known to be computable. This is because a Turing machine program can be written to mimic any program in any conventional programming language (see the Church-Turing thesis).
Other frameworks of learning consider a much more restricted class of functions than Turing machines, but complete the learning more quickly (in polynomial time). An example of such a framework is probably approximately correct learning (PAC learning), which was proposed by Leslie Valiant in his paper "A theory of the learnable".