Yu, Yuanlong (2010) Cognitive visual perception mechanism for robots using object-based visual attention. Doctoral (PhD) thesis, Memorial University of Newfoundland.
- Accepted Version
Available under License - The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.
Based on the psychological and physiological fact that humans employ a visual attention mechanism to connect perception and action, selecting the relevant parts of the environment unconsciously or consciously and using them to produce an appropriate action, this thesis presents a cognitive visual perception paradigm that determines how visual inputs reach awareness and guide actions.
Based on the idea that a general way of organizing the visual scene is to parcel it into discrete objects, object-based visual attention theory is employed in the proposed paradigm. The paradigm models robotic visual perception as a three-stage process: pre-attentive processing, attentional selection and post-attentive perception. It indicates that robotic visual perception starts from a low-level cognitive attentional selection procedure that guides attention to the relevant object in the scene, followed by a high-level post-attentive analysis procedure that analyzes the attended object and formulates it into an internal mental representation used for further cognitive behaviors.
The pre-attentive processing stage extracts pre-attentive features and divides the input scene into uniform proto-objects by using an irregular-pyramid-based segmentation method. The attentional selection stage guides attention to one proto-object of interest by means of unconscious bottom-up competition and conscious top-down biasing. The bottom-up competition is modeled by estimating the saliency of each proto-object. The top-down biasing is modeled by using the integrated competition hypothesis: by directing attention to a task-relevant feature of an object, a competitive advantage over the whole object is produced. Furthermore, this thesis asserts that the task-relevant feature can be autonomously deduced from the internal representation of the task-relevant object that is specified by or inferred from the current task.
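As an illustration only, the attentional selection stage described above can be sketched in a few lines of Python. This is not the thesis's model: the feature matrix, the contrast-based saliency measure, the similarity function and the `weight` parameter are all assumptions chosen to show how a bottom-up saliency score and a top-down bias on a single task-relevant feature could jointly pick one proto-object.

```python
import numpy as np

def bottom_up_saliency(features):
    """Hypothetical saliency: contrast of each proto-object against the others.

    features: array of shape (n_objects, n_features), one row per proto-object.
    """
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    # Mean feature vector of all *other* proto-objects, for each proto-object.
    others_mean = (features.sum(axis=0) - features) / (n - 1)
    contrast = np.abs(features - others_mean).sum(axis=1)
    peak = contrast.max()
    return contrast / peak if peak > 0 else np.zeros(n)

def top_down_bias(features, feature_index, target_value):
    """Hypothetical bias: similarity to the target in ONE task-relevant feature.

    The resulting score is applied to the whole proto-object, loosely
    mirroring the integrated competition idea in the abstract.
    """
    features = np.asarray(features, dtype=float)
    distance = np.abs(features[:, feature_index] - target_value)
    return 1.0 / (1.0 + distance)

def select_proto_object(features, task=None, weight=0.7):
    """Return the index of the attended proto-object.

    task: None for pure bottom-up selection, or (feature_index, target_value).
    weight: assumed mixing factor between top-down bias and bottom-up saliency.
    """
    saliency = bottom_up_saliency(features)
    if task is None:
        return int(np.argmax(saliency))
    feature_index, target_value = task
    bias = top_down_bias(features, feature_index, target_value)
    combined = (1.0 - weight) * saliency + weight * bias
    return int(np.argmax(combined))

# Four proto-objects described by two made-up features (e.g. hue, orientation).
protos = np.array([[0.10, 0.20],
                   [0.90, 0.80],
                   [0.15, 0.25],
                   [0.10, 0.30]])
attended_bottom_up = select_proto_object(protos)
attended_top_down = select_proto_object(protos, task=(1, 0.30), weight=0.9)
```

With no task, the proto-object that differs most from the rest wins; with a task-relevant feature value supplied, a proto-object matching that value can win even though it is less salient.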
Once a proto-object is selected by attention, it proceeds to the post-attentive perception stage, which includes perceptual completion processing, extraction of post-attentive features, object recognition, and development of the internal representation of the attended object in long-term memory. The internal representation is autonomously organized and learned under the framework of probabilistic neural networks in the sense that an object is modeled as a hierarchical cluster. Thus, each instance in the cluster can be abstracted as a mental state that can be used for high-level cognitive behaviors, such as attentional prediction and action determination.
The proposed cognitive visual perception paradigm is applied to distinct robotic tasks, including detection of salient objects, detection of task-relevant objects and target tracking. Experimental results under different conditions are shown to validate this paradigm.
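For readers unfamiliar with probabilistic neural networks, the following is a minimal sketch of the generic technique (a Gaussian-kernel pattern layer, a per-class summation layer, and an arg-max decision). It is a flat classifier, not the hierarchical cluster representation developed in the thesis; the exemplar data, the labels, the function name `pnn_classify` and the smoothing width `sigma` are assumptions made for illustration.

```python
import numpy as np

def pnn_classify(x, exemplars, labels, sigma=0.5):
    """Classify feature vector x with a basic probabilistic neural network.

    exemplars: (n_samples, n_features) stored instances of known objects.
    labels:    length-n_samples sequence of object labels.
    sigma:     assumed Gaussian kernel width (smoothing parameter).
    """
    x = np.asarray(x, dtype=float)
    exemplars = np.asarray(exemplars, dtype=float)
    labels = np.asarray(labels)
    # Pattern layer: one Gaussian kernel activation per stored exemplar.
    sq_dist = np.sum((exemplars - x) ** 2, axis=1)
    activations = np.exp(-sq_dist / (2.0 * sigma ** 2))
    # Summation layer: average activation per object class.
    classes = np.unique(labels)
    scores = np.array([activations[labels == c].mean() for c in classes])
    # Decision layer: the class with the highest score wins.
    best = classes[int(np.argmax(scores))].item()
    return best, dict(zip(classes.tolist(), scores.tolist()))

# Made-up post-attentive feature vectors for two remembered objects.
memory = [[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [0.9, 1.1]]
names = ["cup", "cup", "ball", "ball"]
recognized, class_scores = pnn_classify([0.05, 0.0], memory, names)
```

Each stored exemplar plays the role of one instance in an object's cluster; adding a new instance to memory is all that learning requires in this simplified form.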
|Item Type:||Thesis (Doctoral (PhD))|
|Additional Information:||Bibliography: leaves 220-244.|
|Department(s):||Engineering and Applied Science, Faculty of|
|Library of Congress Subject Heading:||Robot vision; Robots--Control systems; Cognitive science|