10-03-2012, 10:17 PM
Computational visual recognition concerns identifying what is in an image, video, or other visual data, enabling applications such as measuring location, pose, size, activity, and identity, as well as indexing for search by content. Recent progress in economical sensors, together with improvements in network, storage, and computational power, has made visual recognition practical and relevant in almost all experimental sciences and in commercial applications such as image search.
All of this work is part of an attempt to understand the structure of visual data and to build better systems for extracting information from visual signals. Such systems are useful in practice because, although human perceptual abilities far outstrip those of computational systems in many application areas, automated systems already have the upper hand in running constantly over vast amounts of data (e.g. surveillance systems and process monitoring) and in making metric decisions about specific quantities such as size, distance, or orientation, where humans have difficulty. Surveillance illustrates how recognition can increase performance. From watching cells under a microscope, to observing research mice in their habitats, to guarding national borders, surveillance systems are limited by false detections triggered by spurious and unimportant activity. This cost can be reduced by visual recognition algorithms that identify either the activities of interest or the commonly occurring unimportant activity.
In spatial object surveillance systems, moving objects must be detected quickly and accurately. Because the background changes slowly in surveillance footage, only detected objects are usually considered to be moving, so a background-model algorithm is commonly used to detect them. The principle of a background-model algorithm is to build a statistical model of the background and then compute the difference image between the current image and the background image to extract the moving foreground. Anurag Mittal et al. proposed an adaptive kernel density estimation algorithm to build the background model. This method achieves good detection results, but it requires a large amount of memory and complex computation, giving poor real-time performance. Haritaoglu et al. proposed the W4 algorithm, which describes each pixel's probability distribution by a bimodal distribution; the background model is built from the minimum and maximum intensities and the maximum inter-frame change. It runs fast but uses only greyscale information. Stauffer et al. used a Mixture of Gaussians (MOG) as the statistical background model, in which the parameters of every Gaussian distribution adapt continuously to gradual changes in the background. The algorithm adapts well to partially dynamic backgrounds. The drawback of MOG is that when the foreground texture and color are homogeneous and have low contrast with the background, the detected foreground is not intact. This is because MOG treats pixels as independent and identically distributed, ignoring their correlation, so a background model based on individual pixels is sensitive to noise. This paper exploits the correlation among a pixel's local neighbors: a pixel and its neighbors are taken as an image vector representing that pixel, and each chrominance component is modeled as a mixture of Gaussians. To make full use of spatial information, color image segmentation and the background-model algorithm are combined to obtain an intact moving object.
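As a concrete illustration of the per-pixel statistical background model described above, the sketch below maintains a single running Gaussian (mean and variance) per pixel and flags pixels that deviate by more than 2.5 standard deviations as foreground. This is a minimal stand-in for the full MOG approach (one Gaussian per pixel rather than a mixture, and greyscale rather than per-chrominance-component modeling); the function name and the learning rate `alpha` are illustrative choices, not taken from the paper.

```python
import numpy as np

def update_background(mean, var, frame, alpha=0.05):
    """One step of a per-pixel single-Gaussian background model.

    mean, var : float arrays, the current background model per pixel
    frame     : float array, the incoming greyscale frame
    Returns the updated (mean, var) and a boolean foreground mask.
    """
    diff = frame - mean
    # A pixel is foreground if it lies more than 2.5 sigma from the model
    # (2.5**2 = 6.25), the threshold commonly used with Gaussian models.
    fg = diff ** 2 > 6.25 * var
    bg = ~fg
    # Update the model only at background pixels, so moving objects
    # do not pollute the background statistics.
    mean = np.where(bg, (1 - alpha) * mean + alpha * frame, mean)
    var = np.where(bg, (1 - alpha) * var + alpha * diff ** 2, var)
    return mean, var, fg
```

The full MOG model keeps several (mean, variance, weight) triples per pixel and matches each incoming pixel against all of them, which is what lets it absorb repetitive background motion; the paper's extension additionally feeds neighborhood vectors rather than single pixels into the model.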
2. MACHINE VISION
Machine vision is the science and technology of machines that see. Here, "seeing" means that the machine is able to extract information from an image in order to solve some task, or perhaps to "understand" the scene in either a broad or a limited sense. Applications range from (relatively) simple tasks, such as industrial machine vision systems that count bottles speeding by on a production line, to research into artificial intelligence and computers or robots that can comprehend the world around them. As a scientific discipline, computer vision is concerned with the theory behind artificial systems that extract information from images. The image data can take many forms, such as video sequences, views from multiple cameras, or multi-dimensional data from a medical scanner. Today in the 2010s, machine vision is a rapidly developing field, both scientifically and industrially, and now even in video games, with the first machine vision computer game systems appearing as worldwide commercial products. As a technological discipline, computer vision seeks to apply its theories and models to the construction of computer vision systems. Examples of applications of computer vision include systems for:
Controlling processes (e.g., an industrial robot or an autonomous vehicle).
Detecting events (e.g., for visual surveillance or people counting).
Organizing information (e.g., for indexing databases of images and image sequences).
Modeling objects or environments (e.g., industrial inspection, medical image analysis or topographical modeling).
Interaction (e.g., as the input to a device for computer-human interaction).
Some strands of computer vision research are closely related to the study of biological vision, just as many strands of AI research are closely tied to research into human consciousness. The field of biological vision studies and models the physiological processes behind visual perception in humans and other animals.
Computer vision, on the other hand, studies and describes the processes implemented in software and hardware behind artificial vision systems. Interdisciplinary exchange between biological and computer vision has proven fruitful for both fields. Computer vision is, in some ways, the inverse of computer graphics. While computer graphics produces image data from 3D models, computer vision often produces 3D models from image data. There is also a trend towards a combination of the two disciplines, e.g., as explored in augmented reality. Sub-domains of computer vision include scene reconstruction, event detection, video tracking, object recognition, learning, indexing, motion estimation, and image restoration.
FIG. 1: Relation between computer vision and various other fields
Much of artificial intelligence deals with autonomous planning or deliberation for robotic systems to navigate through an environment. A detailed understanding of these environments is required to navigate through them. Information about the environment could be provided by a computer vision system, acting as a vision sensor and providing high-level information about the environment and the robot. Artificial intelligence and computer vision share other topics such as pattern recognition and learning techniques. Consequently, computer vision is sometimes seen as a part of the artificial intelligence field or the computer science field in general. Physics is another field that is closely related to computer vision. Computer vision systems rely on image sensors, which detect electromagnetic radiation which is typically in the form of either visible or infra-red light. The sensors are designed using solid-state physics. The process by which light propagates and reflects off surfaces is explained using optics. Sophisticated image sensors even require quantum mechanics to provide a complete understanding of the image formation process. Also, various measurement problems in physics can be addressed using computer vision, for example motion in fluids.
A third field which plays an important role is neurobiology, specifically the study of the biological vision system. Over the last century, there has been extensive study of the eyes, neurons, and brain structures devoted to the processing of visual stimuli in both humans and various animals. This has led to a coarse, yet complicated, description of how "real" vision systems operate in order to solve certain vision-related tasks. These results have led to a subfield within computer vision in which artificial systems are designed to mimic the processing and behavior of biological systems, at different levels of complexity. Also, some of the learning-based methods developed within computer vision have their background in biology.
Yet another field related to computer vision is signal processing. Many methods for processing one-variable signals, typically temporal signals, can be extended in a natural way to the processing of two-variable or multi-variable signals in computer vision. However, because of the specific nature of images, many methods developed within computer vision have no counterpart in the processing of one-variable signals. A distinct character of these methods is that they are non-linear, which, together with the multi-dimensionality of the signal, defines a subfield of signal processing within computer vision. Besides the above-mentioned views on computer vision, many of the related research topics can also be studied from a purely mathematical point of view. For example, many methods in computer vision are based on statistics, optimization, or geometry. Finally, a significant part of the field is devoted to the implementation aspect of computer vision: how existing methods can be realized in various combinations of software and hardware, or how these methods can be modified to gain processing speed without losing too much performance.
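The natural 1-D-to-2-D extension mentioned above can be made concrete with a moving-average filter: the one-variable version smooths a temporal signal, and applying it first along the rows and then along the columns of an image yields a separable 2-D box filter. This is a minimal sketch assuming NumPy; the function names are illustrative.

```python
import numpy as np

def smooth1d(x, k=3):
    """Simple 1-D moving-average filter for a temporal signal."""
    kernel = np.ones(k) / k
    return np.convolve(x, kernel, mode="same")

def smooth2d(img, k=3):
    """The same filter extended to images: filter along each row,
    then along each column (a separable 2-D box filter)."""
    rows = np.apply_along_axis(smooth1d, 1, img, k)
    return np.apply_along_axis(smooth1d, 0, rows, k)
```

Separability is exactly the kind of structure that carries over from 1-D signal processing; the genuinely new, image-specific methods the text refers to (e.g. non-linear morphological or edge-preserving filters) have no such one-variable counterpart.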
The fields most closely related to computer vision are image processing, image analysis, and machine vision. There is significant overlap in the range of techniques and applications that these cover, which implies that the basic techniques used and developed in these fields are more or less identical; this could be interpreted as meaning there is only one field with different names. On the other hand, it appears to be necessary for research groups, scientific journals, conferences, and companies to present or market themselves as belonging specifically to one of these fields, and hence various characterizations distinguishing each field from the others have been put forward.
3. OBJECT RECOGNITION (COMPUTER VISION)
Object recognition in computer vision is the task of finding a given object in an image or video sequence. Humans recognize a multitude of objects in images with little effort, even though the appearance of an object may vary with viewpoint, size, and scale, or when the object is translated or rotated. Objects can even be recognized when they are partially occluded. This task is still a challenge for computer vision systems in general.
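A toy sketch of why this remains hard: the brute-force normalized cross-correlation below (assuming NumPy; the function name is illustrative) reliably locates an exact copy of a template in an image, but its score degrades as soon as the object is rotated, rescaled, or partially occluded, which is precisely where human recognition still outperforms such simple approaches.

```python
import numpy as np

def match_template(image, template):
    """Find the (row, col) of the best normalized cross-correlation
    match of `template` inside `image`. A toy, appearance-based
    recognizer: it is NOT invariant to rotation, scale, or occlusion."""
    th, tw = template.shape
    t = template - template.mean()
    best_score, best_pos = -np.inf, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            w = image[r:r + th, c:c + tw]
            w = w - w.mean()
            denom = np.sqrt((w ** 2).sum() * (t ** 2).sum())
            # Guard against flat windows (zero variance).
            score = (w * t).sum() / denom if denom > 0 else 0.0
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos
```

Practical recognition systems instead match local features designed to be invariant to such transformations, which is why general-purpose recognition is treated as a research problem rather than a solved template-matching exercise.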
OBJT RECGTN REPort.pdf (Size: 973.19 KB / Downloads: 235)