Abstract |
---|
We present a framework for efficient inference in structured image models that explicitly reason about objects. We achieve this by performing probabilistic inference using a recurrent neural network that attends to scene elements and processes them one at a time. Crucially, the model itself learns to choose the appropriate number of inference steps. We use this scheme to learn to perform inference in partially specified 2D models (variable-sized variational auto-encoders) and fully specified 3D models (probabilistic renderers). We show that such models learn to identify multiple objects - counting, locating and classifying the elements of a scene - without any supervision, e.g., decomposing 3D images with various numbers of objects in a single forward pass of a neural network at unprecedented speed. We further show that the networks produce accurate inferences when compared to supervised counterparts, and that their structure leads to improved generalization. |
Year | Venue | DocType |
---|---|---|
2016 | Advances in Neural Information Processing Systems 29 (NIPS 2016) | Conference |
Volume | ISSN | Citations |
---|---|---|
29 | 1049-5258 | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 0 | 7 |
Name | Order | Citations | PageRank |
---|---|---|---|
S. M. Ali Eslami | 1 | 90 | 4.78 |
Nicolas Heess | 2 | 1762 | 94.77 |
Theophane Weber | 3 | 159 | 16.79 |
Yuval Tassa | 4 | 1097 | 52.33 |
David Szepesvari | 5 | 0 | 0.34 |
Koray Kavukcuoglu | 6 | 10189 | 504.11 |
Geoffrey E. Hinton | 7 | 40435 | 4751.69 |