Object recognition

Computer vision is the study of methods which allow computers to "understand" images, or multidimensional data in general. The term "understand" means here that specific information is extracted from the image data for a specific purpose: either for presenting it to a human operator (e.g., if cancerous cells have been detected in a microscopy image), or for controlling some process (e.g., an industrial robot or an autonomous vehicle). The image data that is fed into a computer vision system is often a digital gray-scale or colour image, but can also be in the form of two or more such images (e.g., from a stereo camera pair), a video sequence, or a 3D volume (e.g., from a tomography device). In most practical computer vision applications, the computers are pre-programmed to solve a particular task, but methods based on learning are now becoming increasingly common.


The field of computer vision can be characterized as immature and diverse. Even though earlier work exists, it was not until the late 1970s that a more focused study of the field started, when computers could manage the processing of large data sets such as images. However, these studies usually originated from various other fields, and consequently there is no standard formulation of the "computer vision problem". Also, and to an even larger extent, there is no standard formulation of how computer vision problems should be solved. Instead, there exists an abundance of methods for solving various well-defined computer vision tasks, where the methods are often very task specific and can seldom be generalized over a wide range of applications. Many of the methods and applications are still in the state of basic research, but more and more methods have found their way into commercial products, where they often constitute a part of a larger system which can solve complex tasks (e.g., in the area of medical images, or quality control and measurements in industrial processes).


Computer vision can be seen as a subfield of artificial intelligence where image data is fed into a system as an alternative to text-based input for controlling the behaviour of a system. Also, some of the learning methods used in computer vision are based on learning techniques developed within artificial intelligence.


Since a camera can be seen as a light sensor, there are various methods in computer vision based on correspondences between a physical phenomenon related to light and images of that phenomenon. For example, it is possible to extract information about motion in fluids and waves by analyzing images of these phenomena. Also, a subfield within computer vision deals with the physical process which, given a scene of objects, light sources, and camera lenses, forms the image in a camera. Consequently, computer vision can also be seen as an extension of physics.


A third field which plays an important role is biology, specifically the study of the biological vision system. Over the last century, there has been an extensive study of eyes, neurons, and the brain structures devoted to processing of visual stimuli in both humans and various animals. This has led to a coarse, yet complicated, description of how "real" vision systems operate in order to solve certain vision-related tasks. These results have led to a subfield within computer vision where artificial systems are designed to mimic the processing and behaviour of biological systems, at different levels of complexity. Also, some of the learning-based methods developed within computer vision have their background in biology.


Yet another field related to computer vision is signal processing. Many existing methods for processing of one-variable signals, typically temporal signals, can be extended in a natural way to processing of two-variable signals or multi-variable signals in computer vision. However, because of the specific nature of images, there are many methods developed within computer vision which have no counterpart in the processing of one-variable signals. A distinct character of these methods is the fact that they are non-linear which, together with the multi-dimensionality of the signal, defines a subfield in signal processing as a part of computer vision.
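
As a rough illustration of how a one-variable method extends to a two-variable signal, the sketch below applies the same 1D smoothing filter first along the rows and then along the columns of a small image. The function names `smooth_1d` and `smooth_2d`, the clamped border handling, and the choice of a symmetric kernel are assumptions made for this example; they are not taken from the article.

```python
def smooth_1d(signal, kernel):
    """Filter a 1D signal with an odd-length, symmetric kernel.

    Samples outside the signal are replaced by the nearest edge sample.
    """
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - half, 0), n - 1)  # clamp index to the signal
            acc += weight * signal[j]
        out.append(acc)
    return out


def smooth_2d(image, kernel):
    """Extend the 1D filter to 2D: filter every row, then every column."""
    rows = [smooth_1d(row, kernel) for row in image]
    cols = [smooth_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row order


if __name__ == "__main__":
    impulse = [[0.0, 0.0, 0.0],
               [0.0, 16.0, 0.0],
               [0.0, 0.0, 0.0]]
    for row in smooth_2d(impulse, [0.25, 0.5, 0.25]):
        print(row)
```

This row-then-column scheme only covers separable linear filters; the non-linear methods mentioned above have no such direct 1D counterpart.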


Besides the above-mentioned views on computer vision, many of the related research topics can also be studied from a purely mathematical point of view. For example, a lot of methods in computer vision are based on statistics or optimization. Finally, a significant part of the field is devoted to the implementation aspect of computer vision: how existing methods can be realized in various combinations of software and hardware, or how these methods can be modified in order to gain processing speed without losing too much performance.


Computer vision and (digital) image processing are related fields. The distinction between the two is not very clear; e.g., computer vision uses many methods which traditionally belong to image processing. One formal distinction would be to say that image processing deals with transforming images, producing one image from another, or with producing low-level information about an image, such as edges or lines. Neither of these tasks provides, or requires, an interpretation of what the image contains in terms of objects or events. Computer vision, on the other hand, uses models and assumptions about the real world depicted in the images to extract information which, e.g., can be used to control actions on objects in a scene. In more advanced systems, these models can be learned rather than programmed.



Examples of applications for computer vision

Another way to describe computer vision is in terms of application areas. One of the most prominent application fields is medical computer vision or medical image processing. This area is characterized by the extraction of information from image data for the purpose of making a medical diagnosis of a patient. Typical image data is in the form of microscopy images, X-ray images, angiography images, ultrasonic images, and tomography images. Examples of information which can be extracted from such image data are the detection of tumours, arteriosclerosis, or other malign changes. It can also be measurements of, e.g., organ dimensions or blood flow. This application area also supports medical research by providing new information, e.g., about the structure of the brain, or about the quality of medical treatments.


A second application area in computer vision is in industry. Here, information is extracted for the purpose of supporting a manufacturing process. One example is quality control, where components or final products are automatically inspected in order to find defects. Another example is measurement of the position and orientation of components to be picked up by a robot arm.


Military applications are probably one of the largest areas for computer vision, even though only a small part of this work is open to the public. The obvious examples are detection of enemy soldiers or vehicles and guidance of missiles to a designated target. More advanced systems for missile guidance send the missile to an area rather than a specific target, and target selection is made when the missile reaches the area, based on image data acquired there. Modern military concepts, such as "battlefield awareness", imply that various sensors, including image sensors, provide a rich set of information about a combat scene which can be used to support strategic decisions. In this case, automatic processing of the data is used to reduce complexity and to fuse information from multiple sensors to increase reliability.


One of the newer application areas is autonomous vehicles, which range from submersibles and land-based vehicles (small robots with wheels, cars or trucks) to aerial vehicles. An unmanned aerial vehicle is often denoted UAV. The level of autonomy ranges from fully autonomous (unmanned) vehicles to vehicles where computer-vision-based systems support a driver or a pilot in various situations. Fully autonomous vehicles typically use computer vision for navigation, i.e., for knowing where they are, or for producing a map of their environment, and for detecting obstacles. It can also be used for detecting certain task-specific events, e.g., a UAV looking for forest fires. Examples of supporting systems are obstacle warning systems in cars and systems for autonomous landing of aircraft. Several car manufacturers have demonstrated systems for autonomous driving of cars, but this technology has still not reached a level where it can be put on the market. There are ample examples of military autonomous vehicles, ranging from advanced missiles to UAVs for reconnaissance missions or missile guidance. Space exploration is already being carried out with autonomous vehicles using computer vision, e.g., NASA's Mars Exploration Rover.


Typical tasks of computer vision

Object Recognition

Detecting the presence and/or pose of known objects in an image.


Examples:

  • Content-based image retrieval: searching for digital images with specific content in large databases.

Tracking

Tracking known objects through an image sequence.


Examples:

  • Tracking a single person walking through a shopping center.

Scene interpretation

Creating a model from an image/video.


Examples:

  • Creating a model of the surrounding terrain from images taken by a robot-mounted camera.


Ego positioning

Determining position and motion of the camera itself.


Examples:

  • Navigation of a robot.

Computer Vision Systems

A typical computer vision system can be divided into the following subsystems:


Image acquisition

The image or image sequence is acquired with an imaging system (camera, radar, lidar, tomography system). Often the imaging system has to be calibrated before being used.


Preprocessing

In the preprocessing step, the image is treated with "low-level" operations. The aim of this step is to reduce noise in the image (i.e., to separate the signal from the noise) and to reduce the overall amount of data. This is typically done by employing different (digital) image processing methods such as:

  • Sampling the image.
  • Applying digital filters: convolution, correlation, linear shift-invariant (LSI) filters, the Sobel operator, gradient computation.
  • Segmenting the image, e.g., by thresholding.
  • Computing the Fourier transform of the image.
  • Estimating motion in the image (optic flow).
  • Estimating disparity in stereo images.
  • Multiresolution analysis.
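
Two of the operations named in this step, Sobel filtering and thresholding, can be sketched in a few lines. This is a minimal illustration using plain Python lists as images; the helper names `correlate3x3` and `edge_mask`, the zero-valued border, and the threshold value in the demo are assumptions chosen for the example rather than part of any particular system.

```python
# Standard 3x3 Sobel kernels for the horizontal and vertical first derivative.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def correlate3x3(image, kernel):
    """Slide a 3x3 kernel over the interior pixels; border pixels stay 0."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(
                kernel[dy + 1][dx + 1] * image[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            )
    return out


def edge_mask(image, threshold):
    """Return a binary image: 1 where the gradient magnitude exceeds threshold."""
    gx = correlate3x3(image, SOBEL_X)
    gy = correlate3x3(image, SOBEL_Y)
    return [
        [1 if (a * a + b * b) ** 0.5 > threshold else 0 for a, b in zip(row_x, row_y)]
        for row_x, row_y in zip(gx, gy)
    ]


if __name__ == "__main__":
    # A dark left half and a bright right half: one vertical step edge.
    image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
    for row in edge_mask(image, threshold=20):
        print(row)
```

The result is much smaller in information content than the input (one bit per pixel), which matches the stated aim of reducing the overall amount of data.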

Feature extraction

The aim of feature extraction is to further reduce the data to a set of features, which ought to be invariant to disturbances such as lighting conditions, camera position, noise and distortion. Examples of feature extraction are:

  • Edge detection: marking the points in an image at which the intensity changes sharply.
  • Corner detection.
  • Contour extraction.
  • Curvature estimation.
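
To make the link between contours, curvature and corners concrete, the sketch below treats a closed contour as a list of (x, y) points and uses the turning angle at each point as a crude discrete stand-in for curvature; points with a large turning angle are reported as corners. The function names `turning_angles` and `corner_indices` and the default angle threshold are hypothetical choices for this example, and practical detectors work on noisy image data rather than on clean point lists.

```python
import math


def turning_angles(contour):
    """Absolute turning angle (radians) at each point of a closed contour."""
    n = len(contour)
    angles = []
    for i in range(n):
        x0, y0 = contour[i - 1]          # previous point (wraps around)
        x1, y1 = contour[i]
        x2, y2 = contour[(i + 1) % n]    # next point (wraps around)
        heading_in = math.atan2(y1 - y0, x1 - x0)
        heading_out = math.atan2(y2 - y1, x2 - x1)
        # Wrap the heading difference into the interval [-pi, pi).
        delta = (heading_out - heading_in + math.pi) % (2 * math.pi) - math.pi
        angles.append(abs(delta))
    return angles


def corner_indices(contour, min_angle=math.pi / 4):
    """Indices of contour points whose turning angle is at least min_angle."""
    return [i for i, a in enumerate(turning_angles(contour)) if a >= min_angle]


if __name__ == "__main__":
    # Outline of a square, sampled at its corners and edge midpoints.
    square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
    print(corner_indices(square))
```

The turning angle does not change when the contour is translated or rotated, which is the kind of invariance to camera position that the paragraph above asks of a feature.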

Registration

The aim of the registration step is to establish correspondence between the features in the acquired set and the features of known objects in a model database and/or the features of the preceding image. The registration step has to produce a final hypothesis. To name a few methods:

  • Least squares estimation.
  • Hough transform.
  • Geometric hashing.
  • Particle filtering.
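
As one possible illustration of the least squares approach, the sketch below estimates the affine transformation that best maps a set of model feature points onto their corresponding image feature points. It assumes the correspondences are already known and the points are not all collinear; the names `solve3` and `fit_affine` are hypothetical helpers for this example, and the normal equations are solved with a small hand-written elimination routine rather than a numerical library.

```python
def solve3(a, b):
    """Solve the 3x3 linear system a * x = b by Gauss-Jordan elimination."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(3):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def fit_affine(src, dst):
    """Least-squares affine map from src points to dst points.

    Returns ((a, b, c), (d, e, f)) such that
    x' = a*x + b*y + c and y' = d*x + e*y + f.
    """
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations (M^T M) p = M^T t, built once and solved per coordinate.
    mtm = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    mtx = [sum(r[i] * t[0] for r, t in zip(rows, dst)) for i in range(3)]
    mty = [sum(r[i] * t[1] for r, t in zip(rows, dst)) for i in range(3)]
    return tuple(solve3(mtm, mtx)), tuple(solve3(mtm, mty))


if __name__ == "__main__":
    model = [(0, 0), (1, 0), (0, 1), (1, 1)]
    observed = [(2, 3), (4, 3), (2, 5), (4, 5)]  # model scaled by 2, shifted by (2, 3)
    print(fit_affine(model, observed))
```

With more than three correspondences the system is overdetermined, and the fit minimizes the sum of squared distances between the transformed model points and the observed points.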

Related Fields

Advanced systems often borrow from many different fields, such as pattern recognition, statistical learning, projective geometry, image processing, graph theory and others.


Cognitive computer vision is strongly related to cognitive psychology and biological computation.


A University Video Communication on Model-Based Computer Vision

Joseph Mundy in a University Video Communication on Model-Based Computer Vision (1987):


"What do students need to learn to be prepared to meet the challenges?" -


"I would like to comment on the necessary courses a student should take to really be prepared to carry out research in model-based vision. As we can see the geometry of image projection and the mathematics of transformation is a very key element in studying this field, but there are many other issues, the student has to be prepared for. If we are going to talk about segmenting images and getting good geometric clues, we have to understand the relationship between the intensity of image data and its underlying geometry. And this would lead the student into such areas as optics, illumination theory, theory of shadows and the like. And also the mathematics underlying this kind of computations would of course require signal processing theory, Fourier transform theory and the like. And in dealing with algebraic surfaces such as this curved surfaces as we talked about here, courses in algebraic geometry and higher pure forms of algebra will prove to be necessary in order to make any kind of progress in research to handle curved surfaces. So, I guess the bottom line of what I'm saying is: math courses, particularly those associated with geometric aspects will be key in all of this."


Applications

In the related fields of machine vision and medical imaging, systems using computer vision techniques are sold in markets worth billions of US dollars per year.


One interesting application of computer vision, commonly used in the creation of visual effects for cinema and broadcast, is camera tracking or matchmoving. Computer vision also finds applications in medicine, the military industry, security and surveillance, quality inspection, robotics, the automotive industry and many other fields.


See also

  • Machine learning
  • Image processing
  • Digital image processing
  • Machine vision
  • Medical imaging
  • Morphological image processing
  • VXL (open source C++ libraries for computer vision)
  • Herbert Freeman
  • David Marr
  • Jerome H. Lemelson
  • Ron Kimmel
  • Affective computing
  • Computer graphics
  • List of important publications in computer science
