Computer Vision
Overall Goal of this Field of Research
There are two overall goals of Computer Vision. Many of the methods and techniques applied in artificial vision systems are motivated by biological vision systems. The parts of human and animal brains that process visual information are probably the parts best understood today, which makes vision the example par excellence of biological cognition. The first goal of Computer Vision is therefore to gain deeper insight into the nature of cognition, and with it into an integral part of consciousness. This can be achieved by proposing computational models of visual information processing, implementing them on a computer system, running simulations, and comparing the capabilities of the artificial system with those of biological systems. The results are then used to adapt the theory of processing where appropriate, or to inspire new investigations of the biological systems.
The second goal of Computer Vision is the analysis of information extracted from images with the purpose of understanding their content. As explained under Interactive Systems, one of the demands made on interactive systems is their ability to perceive and understand their environment. Thus, Computer Vision provides the visual modality of automatic scene understanding. Typical technologies comprise the visual acquisition, recognition, and tracking of objects in videos and their localization in the environment.
Major Challenges
The major challenges of Computer Vision are:
- object recognition and classification: Recognition means deciding which individual objects are present in an image (for example, my husband and our cat). Classification means assigning an object to a category (for example, a man or a cat). Often, these tasks are combined with determining the objects' positions.
- the handling of huge and high-dimensional input data spaces: To manage the continuous video streams of an interactive system, for instance, the huge amount of data provided by the environment must be processed faster and more intelligently than current methods allow.
Unsolved Problems
To master the challenges of interactive systems, such as big data and real-time processing, learning and recognition should not be regarded as two separate, sequential processes. Rather, learning to recognize or classify objects should be incremental and should continue while the system is already deployed in the field.
- An unsolved problem in the field of classification is how the continuously changing context of the environment can be taken into account while object categories are being learned. A more recent development in object classification is the use of 3D information in addition to 2D information. Thus, the question of fusing techniques from Computer Vision and Computer Graphics, already mentioned under Interactive Systems, also arises for the task of 3D classification.
- An open subject of investigation in the field of object recognition is recognition by interaction. Interactive systems should be able to improve their recognition abilities by actively exploring their environment. For example, a system could rotate unfamiliar objects with a manipulator, or move a sensor around them, until it is able to recognize them. Interaction will therefore be a key issue in Computer Vision as well.
- As to the major challenge of huge input spaces, two unsolved problems are of special interest. The first concerns appropriate object descriptions: how should object representations be parameterized so that an interactive system can meet its requirements? For dynamic environments, object descriptions should probably be dynamic as well. The second problem concerns the acquisition of these object descriptions. It calls for intelligent acquisition strategies that can separate the data relevant to the overall goal of the system from the huge amount of irrelevant data.
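The idea of incremental learning raised at the start of this section, where learning continues while the system is already recognizing objects, can be illustrated with a minimal sketch. It assumes a nearest-class-mean classifier over fixed-length feature vectors; the class name IncrementalNearestMean, the feature values, and the labels are hypothetical and chosen only for illustration. It is not the method used in any of the projects described here.

```python
from math import dist


class IncrementalNearestMean:
    """Nearest-class-mean classifier that learns one example at a time.

    Illustrative assumption: objects are described by fixed-length
    numeric feature vectors.
    """

    def __init__(self):
        self.means = {}   # label -> running mean feature vector
        self.counts = {}  # label -> number of examples seen so far

    def learn(self, features, label):
        """Update the running mean of `label` with one new example."""
        n = self.counts.get(label, 0) + 1
        mean = self.means.get(label, [0.0] * len(features))
        # Running-mean update: m_new = m_old + (x - m_old) / n
        self.means[label] = [m + (x - m) / n for m, x in zip(mean, features)]
        self.counts[label] = n

    def classify(self, features):
        """Return the label with the closest mean, or None if untrained."""
        if not self.means:
            return None
        return min(self.means, key=lambda lbl: dist(self.means[lbl], features))


clf = IncrementalNearestMean()
clf.learn([0.0, 0.0], "cat")
clf.learn([10.0, 10.0], "man")
first_guess = clf.classify([1.0, 2.0])

# Learning continues after the first prediction; no retraining from scratch.
clf.learn([2.0, 2.0], "cat")
second_guess = clf.classify([1.0, 2.0])
```

Because each update touches only the mean of a single class, learning and recognition can be interleaved freely. A real system would additionally need a way to forget or re-weight old examples when the context of the environment changes.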
Our Expertise
We have a strong background in classical Computer Vision topics, with experience in low-level image processing, image segmentation, object recognition, and person tracking. In the context of the DFG-funded project "Dynamic Learning for Geometric and Graphic Object Acquisition" (2005-2012), we developed a system that handles a huge input data space by intelligently selecting only the data relevant for a specific task. Furthermore, we proposed a method for object recognition by interaction, in which a robotic arm rotates objects to views that are advantageous for recognition.
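The general principle of recognition by interaction can be sketched as a simple loop: look at the object, and if the classifier is not confident, rotate the object and look again. The sketch below is a toy illustration under stated assumptions, not the method developed in the project. The function names, the fixed rotation step, the confidence threshold, and the simulated views are all hypothetical.

```python
def recognize_by_rotation(classify_view, step_deg=30, threshold=0.8):
    """Rotate the object until one view is recognized confidently.

    classify_view(angle) is a hypothetical callable that returns
    (label, confidence) for the view at `angle` degrees.
    Returns (label, confidence, angle) of the first confident view,
    or of the best view seen after one full turn.
    """
    best = (None, 0.0, 0)
    for angle in range(0, 360, step_deg):
        label, confidence = classify_view(angle)
        if confidence > best[1]:
            best = (label, confidence, angle)
        if confidence >= threshold:
            break  # confident enough: stop exploring
    return best


def simulated_view(angle):
    """Toy stand-in for camera plus classifier.

    Assumption: the handle that identifies the mug is only visible
    between 60 and 120 degrees.
    """
    if 60 <= angle <= 120:
        return "mug", 0.95
    return "cylinder", 0.4


result = recognize_by_rotation(simulated_view)
print(result)
```

A fixed rotation step is the simplest possible exploration strategy. Choosing the next view so that it is expected to be most informative is the harder problem that the term "views advantageous for recognition" points to.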
More recent activities include the fusion of techniques from Computer Vision, Computer Graphics, and Machine Learning, so that object classification can use 3D information in addition to 2D information. Here we also address the question of appropriate object descriptors and their dynamic, context-dependent selection.
Images: Human-Computer Interaction