|
Type inference is a feature present in some strongly statically typed programming languages. It is often characteristic of — but not limited to — functional programming languages in general. Some languages that include type inference are: Haskell, Cayenne, Clean, ML, OCaml, Epigram, Scala, Nemerle, D, Chrome and Boo. This feature is planned for Fortress, C# 3.0, C++0x and Perl 6. The ability to infer types automatically makes many programming tasks easier, leaving the programmer free to omit type annotations while maintaining type safety. In computer science and computer programming, the term strong typing is used to describe those situations where programming languages specify one or more restrictions on how operations involving values having different datatypes can be intermixed. ...
On computer science, a datatype (often simply type) is a name or label for a set of values and some operations which can be performed on that set of values. ...
A programming language is an artificial language that can be used to control the behavior of a machine, particularly a computer. ...
Functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions. ...
Haskell is a standardized purely functional programming language with non-strict semantics, named after the logician Haskell Curry. ...
Cayenne is a functional programming language with dependent types. ...
In computer science, Clean is a general-purpose purely functional computer programming language. ...
ML is a general-purpose functional programming language developed by Robin Milner and others in the late 1970s at the University of Edinburgh, whose syntax is inspired by ISWIM. Historically, ML stands for metalanguage as it was conceived to develop proof tactics in the LCF theorem prover (the language of...
Objective Caml (OCaml) is a general-purpose programming language descended from the ML family, created by Xavier Leroy, Jérôme Vouillon, Damien Doligez, Didier Rémy and others in 1996. ...
monogram is the name of a functional programming language with dependent types and of the IDE usually packaged with it. ...
Scala is a multi-paradigm programming language designed to express common programming patterns in a concise, elegant, and type-safe way. ...
Nemerle logo Nemerle is a high-level statically-typed programming language for the . ...
D is an object-oriented, imperative, multiparadigm system programming language designed by Walter Bright of Digital Mars as a re-engineering of C++. This was done by re-designing many C++ features, and borrowing ideas from other programming languages. ...
Chrome is a programming language for the Common Language Infrastructure developed by RemObjects Software. ...
Boo is an object oriented, statically typed programming language developed starting in 2003, which seeks to make use of the Common Language Infrastructure support for Unicode, internationalization and web style applications, while using a Python-inspired syntax and a special focus on language and compiler extensibility. ...
Fortress is a draft specification for a new programming language currently developed by Sun Microsystems as part of a DARPA-funded supercomputing initiative. ...
The title given to this article is incorrect due to technical limitations. ...
Perl 6 is a planned major revision to the Perl programming language. ...
A type signature defines the inputs and outputs for a function or method. ...
Nontechnical explanation In most programming languages, all values have a type which describes the kind of data a particular value describes. In some languages, the type is known only at runtime; these languages are dynamically typed. In other languages, the type is known at compile time; these languages are statically typed. In statically typed languages, the input and output types of functions and local variables ordinarily must be explicitly provided by type annotations. For example, in C: A data type is a constraint placed upon the interpretation of data in a type system in computer programming. ...
In computer science, runtime or run time describes the operation of a computer program, the duration of its execution, from beginning to termination (compare compile time). ...
One major distinction made in the nature and behaviour of programming languages is that of its typing. ...
In computer science, compile time, as opposed to runtime, is the time when a compiler compiles code written in a programming language into an executable form. ...
In computer science, a datatype (often simply type) is a name or label for a set of values and some operations which can be performed on that set of values. ...
In computer science, a local variable is a variable that is given local scope. ...
C is a general-purpose, block structured, procedural, imperative computer programming language developed in 1972 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system. ...
int addone(int x) { int result; result = x+1; return result; } The beginning of this function definition, int addone(int x) declares that addone is a function which takes one argument, an integer, and returns an integer. int result; declares that the local variable result is an integer. In a language where type inference is available, the code might be written like this instead: addone(x) { val result; result = x+1; return result; } This looks very similar to how code is written in a dynamically typed language, yet all types are known at compile time. In the imaginary language in which the last example is written, + always takes two integers and returns one integer. (This is how it works in, for example, OCaml.) From this, the type inferencer can infer that the value of x+1 is an integer, therefore result is an integer, therefore the return value of addone is an integer. Similarly, since + requires that both of its arguments be integers, x must be an integer, and therefore addone accepts one integer as an argument. Objective Caml (OCaml) is a general-purpose programming language descended from the ML family, created by Xavier Leroy, Jérôme Vouillon, Damien Doligez, Didier Rémy and others in 1996. ...
Technical description Type inference refers to the ability to deduce automatically, either partially or fully, the type of the value derived from the eventual evaluation of an expression. As this process is systematically performed at compile time, the compiler is often able to infer the type of a variable or the type signature of a function, without explicit type annotations having been given. In many cases, it is possible to omit type annotations from a program completely if the type inference system is robust enough, or the program or language simple enough. A type signature defines the inputs and outputs for a function or method. ...
To obtain the information required to infer correctly the type of an expression lacking an explicit type annotation, the compiler either gathers this information as an aggregate and subsequent reduction of the type annotations given for its subexpressions (which may themselves be variables or functions), or through an implicit understanding of the type of various atomic values (e.g., () : Unit; true : Bool; 42 : Integer; 3.14159 : Real; etc.). It is through recognition of the eventual reduction of expressions to implicitly typed atomic values that the compiler for a type inferring language is able to compile a program completely without type annotations. In the case of highly complex forms of higher order programming and polymorphism, it is not always possible for the compiler to infer as much, however, and type annotations are occasionally necessary for disambiguation. A unit type is a mathematical type that allows only one value. ...
Higher Order Programming is programming that exploits the ability to use functions as values; it is usually borrowed from models of computation like the lambda calculus which make heavy use of Higher-order functions. ...
In computer science, polymorphism means allowing a single definition to be used with different types of data (specifically, different classes of objects). ...
Example For example, let us consider the Haskell function map, which applies a procedure to each element of a list, and may be defined as: Haskell is a standardized purely functional programming language with non-strict semantics, named after the logician Haskell Curry. ...
map f [] = [] map f (first:rest) = f first : map f rest From this, it is evident that the function map takes a list as its second argument, that its first argument f is a function that can be applied to the type of elements of that list, and that the result of map is constructed as a list with elements that are results of f. So assuming that a list contains elements of the same type, we can reliably construct a type signature map :: (a -> b) -> [a] -> [b] where the syntax "a->b" denotes a procedure that takes an a as its parameter and produces a b. "a->b->c" is equivalent to "a->(b->c)". Note that the inferred type of map is parametrically polymorphic: The type of the arguments and results of f are not inferred, but left as type variables, and so map can be applied to functions and lists of various types, as long as the actual types match in each invocation. In computer science, polymorphism is the use of a general interface to manipulate things of various specialized types. ...
Hindley–Milner type inference algorithm The common algorithm used to perform the type inference is the one now commonly referred to as Hindley–Milner or Damas–Milner algorithm. The origin of this algorithm is the type inference algorithm for the simply typed lambda calculus, which was devised by Haskell B. Curry and Robert Feys in 1958. In 1969 Roger Hindley extended this work and proved that their algorithm always inferred the most general type. In 1978 Robin Milner, independently of Hindley's work, provided an equivalent algorithm. In 1985 Luis Damas finally proved that Milner's algorithm is complete and extended it to support systems with polymorphic references. A typed lambda calculus is a typed formalism which uses the lambda-symbol () to denote anonymous function abstraction. ...
Haskell Brooks Curry (September 12, 1900 - September 1, 1982) was an American mathematician and logician. ...
Robert Feys (1889 â 1961) was a logician and philosopher. ...
Robin Milner is a prominent British computer scientist. ...
The Algorithm The algorithm proceeds in two steps. First, we need to generate a number of equations to solve, then we need to solve them.
Generating the equations The algorithm used for generating the equations is similar to a regular type checker, so let's consider first a regular type checker for the typed lambda calculus given by Typed versions of the lambda calculus extend the standard lambda calculus with types. ...
 and
 where E is a primitive expression (such as "3") and T is a primitive type (such as "Integer"). We want to construct a function f of type , where ε is a type environment and t is a term. We assume that this function is already defined on primitives. The other cases are: - f(ε,v) = τ where (v,τ) is in ε
where and f(ε,e) = τ1 where  Note that so far we do not specify what to do when we fail to meet the various conditions. This is because, in the simple type checking algorithm, the check simply fails whenever anything goes wrong. Now, we develop a more sophisticated algorithm that can deal with type variables and constraints on them. Therefore, we extend the set T of primitive types to include an infinite supply of variables, denoted by lowercase Greek letters α,β,... See [1] for more details.
Solving the equations Solving the equations proceeds by unification. This is - maybe surprisingly - a rather simple algorithm. The function u operates on type equations and returns a structure called a "substitution". A substitution is simply a mapping from type variables to types. Substitutions can be composed and extended in the obvious ways. For the idea of global unification, see globalization. ...
Unifying the empty set of equations is easy enough: , where is the identity substitution. Unifying a variable with a type goes this way: , where is the substitution composition operator, and C' is the set of remaining constraints C with the new substitution applied to it. Of course, . The interesting case remains as .
References Benjamin C. Pierce is a professor of computer science at the University of Pennsylvania. ...
External links |