RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation

Anciukevicius, T., Xu, Z., Fisher, M., Henderson, P., Bilen, H., Mitra, N. J. and Guerrero, P. (2023) RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation. In: 2023 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, Canada, 18-22 Jun 2023, pp. 12608-12618. ISBN 9798350301298 (doi: 10.1109/CVPR52729.2023.01213)

Text: 295178.pdf - Accepted Version (10MB)

Abstract

Diffusion models currently achieve state-of-the-art performance for both conditional and unconditional image generation. However, so far, image diffusion models do not support tasks required for 3D understanding, such as view-consistent 3D generation or single-view object reconstruction. In this paper, we present RenderDiffusion, the first diffusion model for 3D generation and inference, trained using only monocular 2D supervision. Central to our method is a novel image denoising architecture that generates and renders an intermediate three-dimensional representation of a scene in each denoising step. This enforces a strong inductive structure within the diffusion process, providing a 3D consistent representation while only requiring 2D supervision. The resulting 3D representation can be rendered from any view. We evaluate RenderDiffusion on FFHQ, AFHQ, ShapeNet and CLEVR datasets, showing competitive performance for generation of 3D scenes and inference of 3D scenes from 2D images. Additionally, our diffusion-based approach allows us to use 2D inpainting to edit 3D scenes.

Item Type: Conference Proceedings
Additional Information: We would like to thank Yilun Du, Christopher K. I. Williams, Noam Aigerman, Julien Philip and Valentin Deschaintre for valuable discussions. HB was supported by the EPSRC Visual AI grant EP/T028572/1; NM was partially supported by the UCL AI Centre.
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Henderson, Dr Paul
Authors: Anciukevicius, T., Xu, Z., Fisher, M., Henderson, P., Bilen, H., Mitra, N. J., and Guerrero, P.
College/School: College of Science and Engineering > School of Computing Science
Research Centre: College of Science and Engineering > School of Computing Science > IDA Section > GPU Cluster
ISSN: 2575-7075
ISBN: 9798350301298
Copyright Holders: Copyright © 2023 IEEE
First Published: First published in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Publisher Policy: Reproduced in accordance with the publisher copyright policy
