The Rapid Recovery of
Three-Dimensional Structure From Line Drawings
Ronald A. Rensink, Dept. of Computer Science, University of British Columbia, Vancouver, Canada.
PhD Dissertation, Computer Science Department, University of British Columbia. 1992. [pdf] [UBC CS Technical Report 92-25 (September 1992)]
A computational theory is developed that explains how line drawings of polyhedral objects can be interpreted rapidly and in parallel at early levels of human vision. The key idea is that a time-limited process can correctly recover much of the three-dimensional structure of these objects when split into concurrent streams, each concerned with a single aspect of scene structure.
The work proceeds in five stages. The first extends the framework of Marr to allow a process to be analyzed in terms of resource limitations. Two main concerns are identified: (i) reducing the amount of nonlocal information needed, and (ii) making effective use of whatever information is obtained. The second stage traces the difficulty of line interpretation to a small set of constraints. When these are removed, the remaining constraints can be grouped into several relatively independent sets. It is shown that each set can be rapidly solved by a separate processing stream, and that co-ordinating these streams can yield a low-complexity ``approximation'' that captures much of the structure of the original constraints. In particular, complete recovery is possible in logarithmic time when objects have rectangular corners and the scene-to-image projection is orthographic. The third stage is concerned with making good use of the available information when a fixed time limit exists. This limit is motivated by the need to obtain results within a time independent of image content, and by the need to limit the propagation of inconsistencies. A minimal architecture is assumed, viz., a spatiotopic mesh of simple processors. Constraints are developed to guide the course of the process itself, so that candidate interpretations are considered in order of their likelihood. The fourth stage provides a specific algorithm for the recovery process, showing how it can be implemented on a cellular automaton. Finally, the theory itself is tested on various line drawings. It is shown that much of the three-dimensional structure of a polyhedral scene can indeed by recovered in very little time. It is also shown that the theory can explain the rapid interpretation of line drawings at early levels of human vision.