FACTOID # 106: United we stand? The United Kingdom and United States are both in the top ten for Gross Domestic Product - and for child poverty.
 
 Home   Encyclopedia   Statistics   Countries A-Z   Flags   Maps   Education   Forum   FAQ   About 
 
WHAT'S NEW
RECENT ARTICLES
More Recent Articles »
 

SEARCH ALL

FACTS & STATISTICS   

Search encyclopedia, statistics and forums:

 

 

(* = Graphable)

 

 


Encyclopedia > Type inference

Type inference is a feature present in some strongly statically typed programming languages. It is often characteristic of — but not limited to — functional programming languages in general. Some languages that include type inference are: Haskell, Cayenne, Clean, ML, OCaml, Epigram, Scala, Nemerle, D, Chrome and Boo. This feature is planned for Fortress, C# 3.0, C++0x and Perl 6. The ability to infer types automatically makes many programming tasks easier, leaving the programmer free to omit type annotations while maintaining type safety. In computer science and computer programming, the term strong typing is used to describe those situations where programming languages specify one or more restrictions on how operations involving values having different datatypes can be intermixed. ... On computer science, a datatype (often simply type) is a name or label for a set of values and some operations which can be performed on that set of values. ... A programming language is an artificial language that can be used to control the behavior of a machine, particularly a computer. ... Functional programming is a programming paradigm that treats computation as the evaluation of mathematical functions. ... Haskell is a standardized purely functional programming language with non-strict semantics, named after the logician Haskell Curry. ... Cayenne is a functional programming language with dependent types. ... In computer science, Clean is a general-purpose purely functional computer programming language. ... ML is a general-purpose functional programming language developed by Robin Milner and others in the late 1970s at the University of Edinburgh, whose syntax is inspired by ISWIM. Historically, ML stands for metalanguage as it was conceived to develop proof tactics in the LCF theorem prover (the language of... Objective Caml (OCaml) is a general-purpose programming language descended from the ML family, created by Xavier Leroy, Jérôme Vouillon, Damien Doligez, Didier Rémy and others in 1996. ... monogram is the name of a functional programming language with dependent types and of the IDE usually packaged with it. ... Scala is a multi-paradigm programming language designed to express common programming patterns in a concise, elegant, and type-safe way. ... Nemerle logo Nemerle is a high-level statically-typed programming language for the . ... D is an object-oriented, imperative, multiparadigm system programming language designed by Walter Bright of Digital Mars as a re-engineering of C++. This was done by re-designing many C++ features, and borrowing ideas from other programming languages. ... Chrome is a programming language for the Common Language Infrastructure developed by RemObjects Software. ... Boo is an object oriented, statically typed programming language developed starting in 2003, which seeks to make use of the Common Language Infrastructure support for Unicode, internationalization and web style applications, while using a Python-inspired syntax and a special focus on language and compiler extensibility. ... Fortress is a draft specification for a new programming language currently developed by Sun Microsystems as part of a DARPA-funded supercomputing initiative. ... The title given to this article is incorrect due to technical limitations. ... Perl 6 is a planned major revision to the Perl programming language. ... A type signature defines the inputs and outputs for a function or method. ...

Contents

Nontechnical explanation

In most programming languages, all values have a type which describes the kind of data a particular value describes. In some languages, the type is known only at runtime; these languages are dynamically typed. In other languages, the type is known at compile time; these languages are statically typed. In statically typed languages, the input and output types of functions and local variables ordinarily must be explicitly provided by type annotations. For example, in C: A data type is a constraint placed upon the interpretation of data in a type system in computer programming. ... In computer science, runtime or run time describes the operation of a computer program, the duration of its execution, from beginning to termination (compare compile time). ... One major distinction made in the nature and behaviour of programming languages is that of its typing. ... In computer science, compile time, as opposed to runtime, is the time when a compiler compiles code written in a programming language into an executable form. ... In computer science, a datatype (often simply type) is a name or label for a set of values and some operations which can be performed on that set of values. ... In computer science, a local variable is a variable that is given local scope. ... C is a general-purpose, block structured, procedural, imperative computer programming language developed in 1972 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system. ...

 int addone(int x) { int result; result = x+1; return result; } 

The beginning of this function definition, int addone(int x) declares that addone is a function which takes one argument, an integer, and returns an integer. int result; declares that the local variable result is an integer. In a language where type inference is available, the code might be written like this instead:

 addone(x) { val result; result = x+1; return result; } 

This looks very similar to how code is written in a dynamically typed language, yet all types are known at compile time. In the imaginary language in which the last example is written, + always takes two integers and returns one integer. (This is how it works in, for example, OCaml.) From this, the type inferencer can infer that the value of x+1 is an integer, therefore result is an integer, therefore the return value of addone is an integer. Similarly, since + requires that both of its arguments be integers, x must be an integer, and therefore addone accepts one integer as an argument. Objective Caml (OCaml) is a general-purpose programming language descended from the ML family, created by Xavier Leroy, Jérôme Vouillon, Damien Doligez, Didier Rémy and others in 1996. ...


Technical description

Type inference refers to the ability to deduce automatically, either partially or fully, the type of the value derived from the eventual evaluation of an expression. As this process is systematically performed at compile time, the compiler is often able to infer the type of a variable or the type signature of a function, without explicit type annotations having been given. In many cases, it is possible to omit type annotations from a program completely if the type inference system is robust enough, or the program or language simple enough. A type signature defines the inputs and outputs for a function or method. ...


To obtain the information required to infer correctly the type of an expression lacking an explicit type annotation, the compiler either gathers this information as an aggregate and subsequent reduction of the type annotations given for its subexpressions (which may themselves be variables or functions), or through an implicit understanding of the type of various atomic values (e.g., () : Unit; true : Bool; 42 : Integer; 3.14159 : Real; etc.). It is through recognition of the eventual reduction of expressions to implicitly typed atomic values that the compiler for a type inferring language is able to compile a program completely without type annotations. In the case of highly complex forms of higher order programming and polymorphism, it is not always possible for the compiler to infer as much, however, and type annotations are occasionally necessary for disambiguation. A unit type is a mathematical type that allows only one value. ... Higher Order Programming is programming that exploits the ability to use functions as values; it is usually borrowed from models of computation like the lambda calculus which make heavy use of Higher-order functions. ... In computer science, polymorphism means allowing a single definition to be used with different types of data (specifically, different classes of objects). ...


Example

For example, let us consider the Haskell function map, which applies a procedure to each element of a list, and may be defined as: Haskell is a standardized purely functional programming language with non-strict semantics, named after the logician Haskell Curry. ...

 map f [] = [] map f (first:rest) = f first : map f rest 

From this, it is evident that the function map takes a list as its second argument, that its first argument f is a function that can be applied to the type of elements of that list, and that the result of map is constructed as a list with elements that are results of f. So assuming that a list contains elements of the same type, we can reliably construct a type signature

 map :: (a -> b) -> [a] -> [b] 

where the syntax "a->b" denotes a procedure that takes an a as its parameter and produces a b. "a->b->c" is equivalent to "a->(b->c)".


Note that the inferred type of map is parametrically polymorphic: The type of the arguments and results of f are not inferred, but left as type variables, and so map can be applied to functions and lists of various types, as long as the actual types match in each invocation. In computer science, polymorphism is the use of a general interface to manipulate things of various specialized types. ...


Hindley–Milner type inference algorithm

The common algorithm used to perform the type inference is the one now commonly referred to as Hindley–Milner or Damas–Milner algorithm.


The origin of this algorithm is the type inference algorithm for the simply typed lambda calculus, which was devised by Haskell B. Curry and Robert Feys in 1958. In 1969 Roger Hindley extended this work and proved that their algorithm always inferred the most general type. In 1978 Robin Milner, independently of Hindley's work, provided an equivalent algorithm. In 1985 Luis Damas finally proved that Milner's algorithm is complete and extended it to support systems with polymorphic references. A typed lambda calculus is a typed formalism which uses the lambda-symbol () to denote anonymous function abstraction. ... Haskell Brooks Curry (September 12, 1900 - September 1, 1982) was an American mathematician and logician. ... Robert Feys (1889 — 1961) was a logician and philosopher. ... Robin Milner is a prominent British computer scientist. ...


The Algorithm

The algorithm proceeds in two steps. First, we need to generate a number of equations to solve, then we need to solve them.


Generating the equations

The algorithm used for generating the equations is similar to a regular type checker, so let's consider first a regular type checker for the typed lambda calculus given by Typed versions of the lambda calculus extend the standard lambda calculus with types. ...


e , ::= E mid v mid (lambda v:tau. e) mid (e, e)


and


tau , ::= T mid tau to tau


where E is a primitive expression (such as "3") and T is a primitive type (such as "Integer").


We want to construct a function f of type (epsilon, t) to tau, where ε is a type environment and t is a term. We assume that this function is already defined on primitives. The other cases are:

  • f(ε,v) = τ where (v,τ) is in ε
  • f(epsilon, g, e) = tau where f(epsilon, g) = tau_1 to tau and f(ε,e) = τ1
  • f(epsilon, lambda v:tau. e) = tau to f(epsilon_1, e) where epsilon_1 = epsilon cup (v, tau)

Note that so far we do not specify what to do when we fail to meet the various conditions. This is because, in the simple type checking algorithm, the check simply fails whenever anything goes wrong.


Now, we develop a more sophisticated algorithm that can deal with type variables and constraints on them. Therefore, we extend the set T of primitive types to include an infinite supply of variables, denoted by lowercase Greek letters α,β,...


See [1] for more details.


Solving the equations

Solving the equations proceeds by unification. This is - maybe surprisingly - a rather simple algorithm. The function u operates on type equations and returns a structure called a "substitution". A substitution is simply a mapping from type variables to types. Substitutions can be composed and extended in the obvious ways. For the idea of global unification, see globalization. ...


Unifying the empty set of equations is easy enough: u, emptyset = mathbf{i}, where mathbf{i} is the identity substitution.


Unifying a variable with a type goes this way: u, ([alpha = T] cup C) = u, (C') cdot (alpha mapsto T), where cdot is the substitution composition operator, and C' is the set of remaining constraints C with the new substitution alpha mapsto T applied to it.


Of course, u, ([T = alpha] cup C) = u ([alpha = T] cup C).


The interesting case remains as u, ([S to S' = T to T']cup C) = u , ({[S = T], [S' = T']}cup C).


References

  1. ^ Pierce, Benjamin C. (2002). Types and Programming Languages. MIT Press. ISBN 0-262-16209-1. 

Benjamin C. Pierce is a professor of computer science at the University of Pennsylvania. ...

External links


  Results from FactBites:
 
Type inference for Python | Lambda the Ultimate (1783 words)
A question was raised in that thread having to do with why static type inference in these languages is difficult.
In another discussion, I suggested type feedback as a way to collect otherwise undeclared type info to augment a DT (dynamically typed) language when an IDE execution environment can be assumed as context for collecting and storing info, and presenting some UI to a developer for using that info.
The whole point of "soft" type inference approaches is to be able to get some of the benefits of static type analysis without the requirement that the entire program be explicitly well-typed.
Type inference and union types | Lambda the Ultimate (1918 words)
At the time of type checking we may freely propogate information about the type of a union without worrying about what its compiled representation will be, so it doesn't make sense to me to say that the inference algorithm can not cope with untagged union types.
Type inference with untagged unions is not neccessarily completely useless.
These would present somewhat of a problem for a type inference algorithm, since the invariants may be of arbitrary complexity and thus cannot necessarily be evaluated at compile time.
  More results at FactBites »

 

COMMENTARY     


Share your thoughts, questions and commentary here
Your name
Your location
Your comments
Please enter the 5-letter protection code


Lesson Plans | Student Area | Student FAQ | Reviews | Press Releases |  Feeds | Contact
The Wikipedia article included on this page is licensed under the GFDL.
Images may be subject to relevant owners' copyright.
All other elements are (c) copyright NationMaster.com 2003-5. All Rights Reserved.
Usage implies agreement with terms.