Learning-based stereo matching for 3D reconstruction

Mao, Wendong (2019) Learning-based stereo matching for 3D reconstruction. Doctoral (PhD) thesis, Memorial University of Newfoundland.

PDF - Accepted Version (English)
Available under License - The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

Stereo matching is widely used for 3D reconstruction of real-world scenes and has numerous applications in computer graphics, computer vision, and robotics. Because the problem is ill-posed, estimating accurate disparity maps is challenging. Humans, however, rely on binocular vision to perceive 3D environments and can estimate 3D information more rapidly and robustly than many of the active and passive sensors developed so far. One reason is that the human brain can use prior knowledge to understand a scene and infer the most reasonable depth hypothesis even when visual cues are lacking. Recent advances in machine learning have shown that the brain's discrimination power can be mimicked using deep convolutional neural networks. Hence, it is worth investigating how learning-based techniques can enhance stereo matching for 3D reconstruction. Toward this goal, this thesis develops a sequence of techniques: a novel disparity filtering approach that selects accurate disparity values by analyzing the corresponding cost volumes with 3D neural networks; a robust semi-dense stereo matching algorithm that uses two neural networks, one for computing matching cost and one for confidence-based filtering; a novel network structure that learns global smoothness constraints and directly performs multi-view stereo matching based on global information; and finally a point cloud consolidation method that uses a neural network to reproject noisy data generated by multi-view stereo matching under different viewpoints. Qualitative and quantitative comparisons with existing work demonstrate the merits of each of the presented techniques.
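As a rough illustration of the cost-volume and confidence-based filtering ideas named in the abstract, the following Python sketch builds a per-pixel absolute-difference cost volume for a rectified grayscale pair, picks the lowest-cost disparity, and keeps only pixels whose best cost is clearly separated from the runner-up. It is not the thesis method: the thesis uses neural networks for both matching cost and confidence, whereas this sketch substitutes simple hand-crafted stand-ins. The function names, the margin-based confidence score, and the threshold are all assumptions made for illustration.

```python
import numpy as np


def abs_diff_cost_volume(left, right, max_disp):
    """Cost volume C[d, y, x] = |left[y, x] - right[y, x - d]|.

    Hand-crafted stand-in for a learned matching cost (assumption).
    Entries where x - d falls outside the right image stay at infinity.
    """
    h, w = left.shape
    cost = np.full((max_disp, h, w), np.inf, dtype=np.float64)
    for d in range(max_disp):
        # Pixel x in the left image is compared with pixel x - d in the right image.
        cost[d, :, d:] = np.abs(left[:, d:] - right[:, : w - d])
    return cost


def semi_dense_disparity(cost, min_confidence):
    """Winner-takes-all disparity, with low-confidence pixels set to -1.

    Confidence is the relative margin between the two lowest costs,
    a hypothetical substitute for a learned confidence measure.
    """
    best = np.argmin(cost, axis=0)
    two_lowest = np.sort(cost, axis=0)[:2]
    c1, c2 = two_lowest[0], two_lowest[1]

    # Pixels with fewer than two valid disparity candidates get zero confidence.
    valid = np.isfinite(c2)
    confidence = np.zeros_like(c1)
    confidence[valid] = (c2[valid] - c1[valid]) / (c2[valid] + 1e-8)

    disparity = np.where(confidence >= min_confidence, best, -1)
    return disparity, confidence


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    left = rng.random((16, 32))
    # Synthetic right view: every left pixel x matches right pixel x - 3.
    right = np.roll(left, -3, axis=1)

    cost = abs_diff_cost_volume(left, right, max_disp=8)
    disparity, confidence = semi_dense_disparity(cost, min_confidence=0.5)
    print(disparity[0])
```

Raising `min_confidence` trades density for reliability: fewer pixels survive, but those that do have a less ambiguous cost minimum.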

Item Type: Thesis (Doctoral (PhD))
URI: http://research.library.mun.ca/id/eprint/14152
Item ID: 14152
Additional Information: Includes bibliographical references (pages 86-101).
Keywords: depth map, confidence measure, neural networks, multi-view stereo, point cloud consolidation
Department(s): Science, Faculty of > Computer Science
Date: September 2019
Date Type: Submission
Library of Congress Subject Heading: Three-dimensional modeling; Neural networks (Computer science); Depth perception--Computer simulation.
