Surface Reconstruction from Gaussian Splatting via Novel Stereo Views

Yaniv Wolf*, Amit Bracha*, Ron Kimmel
* Indicates equal contribution
[Teaser figure: a reference image compared with reconstructions by SuGaR, BakedSDF, and our method]

TL;DR: We present a fast and accurate method for reconstructing smooth, geometrically consistent surfaces from Gaussian Splatting models. For a typical in-the-wild scene, our method requires only about 5 minutes of additional compute time on top of the 3DGS scene capture.

[Figure: overview of the proposed pipeline]

Abstract

Gaussian splatting for radiance field rendering has recently emerged as an efficient approach for accurate scene representation. It optimizes the location, size, color, and shape of a cloud of 3D Gaussian elements so that, after projection (splatting), they visually match a given set of images taken from various viewing directions. Yet, despite the proximity of the Gaussian elements to the shape boundaries, direct surface reconstruction of objects in the scene remains a challenge.
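For readers unfamiliar with the representation, below is a minimal sketch of the per-element parameters such a model optimizes. The field names are illustrative rather than taken from the 3DGS codebase; only the covariance factorization Sigma = R S S^T R^T follows the 3DGS paper.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Gaussian3D:
    """One element of the optimized Gaussian cloud (illustrative field names)."""
    mean: np.ndarray       # (3,) location in world coordinates
    scale: np.ndarray      # (3,) per-axis extent; with rotation, defines the covariance
    rotation: np.ndarray   # (4,) unit quaternion (w, x, y, z) orienting the Gaussian
    sh_coeffs: np.ndarray  # (K, 3) spherical-harmonic coefficients for view-dependent color
    opacity: float         # blending weight used during splatting

    def covariance(self) -> np.ndarray:
        """Sigma = R S S^T R^T, the anisotropic 3D covariance of the element."""
        w, x, y, z = self.rotation
        R = np.array([
            [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
            [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
            [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
        ])
        S = np.diag(self.scale)
        return R @ S @ S.T @ R.T
```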

We propose a novel approach for surface reconstruction from Gaussian splatting models. Rather than relying on the Gaussian elements' locations as a prior for surface reconstruction, we leverage the superior novel-view synthesis capabilities of 3DGS. To that end, we use the Gaussian splatting model to render pairs of stereo-calibrated novel views, from which we extract depth profiles using a stereo matching method. We then combine the extracted RGB-D images into a geometrically consistent surface. The resulting reconstruction is more accurate and shows finer details than other methods for surface reconstruction from Gaussian splatting models, while requiring significantly less compute time than competing surface reconstruction approaches.
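To make the data flow concrete, here is a minimal sketch of the three stages described above, under stated assumptions: `render_view(model, pose)` is a hypothetical stand-in for whichever 3DGS renderer is used, OpenCV's semi-global block matching stands in for the learned stereo matcher, and Open3D's TSDF fusion plays the role of the final depth-map integration. None of these names come from the released implementation.

```python
import numpy as np
import cv2
import open3d as o3d

# `render_view(model, cam_to_world)` is a hypothetical stand-in for any 3DGS
# renderer returning an HxWx3 uint8 RGB image for the given camera pose.

def stereo_pair(cam_to_world: np.ndarray, baseline: float):
    """The right camera is the left one translated by `baseline` along its
    local x-axis, giving a horizontally calibrated stereo pair."""
    right = cam_to_world.copy()
    right[:3, 3] += cam_to_world[:3, 0] * baseline  # camera x-axis in world coords
    return cam_to_world, right

def depth_from_stereo(left, right, fx: float, baseline: float):
    """Classical SGBM here is a stand-in for a learned stereo matcher;
    depth = fx * baseline / disparity for valid (positive) disparities."""
    sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
    disp = sgbm.compute(cv2.cvtColor(left, cv2.COLOR_RGB2GRAY),
                        cv2.cvtColor(right, cv2.COLOR_RGB2GRAY)).astype(np.float32) / 16.0
    return np.where(disp > 0, fx * baseline / np.maximum(disp, 1e-6), 0.0)

def reconstruct(model, poses, intrinsic: o3d.camera.PinholeCameraIntrinsic,
                fx: float, baseline: float = 0.1):
    """Render stereo pairs from the 3DGS model, extract depth, and fuse the
    resulting RGB-D images into one geometrically consistent mesh."""
    volume = o3d.pipelines.integration.ScalableTSDFVolume(
        voxel_length=0.01, sdf_trunc=0.04,
        color_type=o3d.pipelines.integration.TSDFVolumeColorType.RGB8)
    for pose in poses:
        left_pose, right_pose = stereo_pair(pose, baseline)
        left, right = render_view(model, left_pose), render_view(model, right_pose)
        depth = depth_from_stereo(left, right, fx, baseline)
        rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
            o3d.geometry.Image(left), o3d.geometry.Image(depth.astype(np.float32)),
            depth_scale=1.0, depth_trunc=5.0, convert_rgb_to_intensity=False)
        volume.integrate(rgbd, intrinsic, np.linalg.inv(pose))  # expects world-to-camera
    return volume.extract_triangle_mesh()
```

In practice the baseline is chosen relative to the scene scale, and a learned stereo network produces considerably sharper depth than block matching; the sketch only pins down the data flow from rendered stereo pairs to a fused mesh.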

We performed extensive testing of the proposed method on in-the-wild scenes captured with a smartphone, showcasing its superior reconstruction abilities. Additionally, we evaluated the proposed method on the Tanks and Temples benchmark, where it surpasses the current leading method for surface reconstruction from Gaussian splatting models.

In-the-Wild Scenes from Uncontrolled Smartphone Videos - Comparison with SuGaR

[Image grid: four in-the-wild scenes, each comparing the reference image with SuGaR and our reconstruction]

Semi-Automatic Masking using SAM and Depth Projections

[Image grid: four scenes, each showing the reference image, our full reconstruction, and our reconstruction after masking]
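As a rough illustration of the masking step, the sketch below uses the Segment Anything predictor to turn a single foreground click on one rendered view into a 2D mask, which is then applied to the matching depth map before fusion. The checkpoint filename and the click coordinates are placeholders, and the depth-projection logic that carries the mask to the remaining views is omitted.

```python
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

def mask_rendered_view(rgb: np.ndarray, depth: np.ndarray,
                       click_xy: tuple, checkpoint: str = "sam_vit_h.pth"):
    """Segment the object under a user click with SAM, then zero out depth
    everywhere else so that TSDF fusion only integrates the object."""
    sam = sam_model_registry["vit_h"](checkpoint=checkpoint)  # placeholder path
    predictor = SamPredictor(sam)
    predictor.set_image(rgb)  # expects an HxWx3 uint8 RGB image
    masks, scores, _ = predictor.predict(
        point_coords=np.array([click_xy], dtype=np.float32),
        point_labels=np.array([1]),  # 1 = foreground click
        multimask_output=True)
    mask = masks[np.argmax(scores)]    # keep the highest-scoring mask
    return np.where(mask, depth, 0.0)  # masked-out pixels get zero depth
```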

MobileBrick Dataset - Comparison with MVSFormer

[Image grid: four MobileBrick objects, each comparing the ground-truth model with MVSFormer and our reconstruction]

Interactive 3D Meshes (simplified/compressed to reduce size)

BibTeX

@article{wolf2024surface,
  title={Surface Reconstruction from Gaussian Splatting via Novel Stereo Views},
  author={Wolf, Yaniv and Bracha, Amit and Kimmel, Ron},
  journal={arXiv preprint arXiv:2404.01810},
  year={2024}
}

References

Barron, Jonathan T., et al. "Mip-NeRF 360: Unbounded anti-aliased neural radiance fields." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.

Cao, Chenjie, Xinlin Ren, and Yanwei Fu. "MVSFormer: Multi-view stereo by learning robust image features and temperature-based depth." arXiv preprint arXiv:2208.02541 (2022).

Guédon, Antoine, and Vincent Lepetit. "SuGaR: Surface-aligned Gaussian splatting for efficient 3D mesh reconstruction and high-quality mesh rendering." arXiv preprint arXiv:2311.12775 (2023).

Kerbl, Bernhard, et al. "3D Gaussian splatting for real-time radiance field rendering." ACM Transactions on Graphics 42.4 (2023): 1-14.

Kirillov, Alexander, et al. "Segment Anything." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023.

Knapitsch, Arno, et al. "Tanks and Temples: Benchmarking large-scale scene reconstruction." ACM Transactions on Graphics (ToG) 36.4 (2017): 1-13.

Li, Kejie, et al. "MobileBrick: Building LEGO for 3D reconstruction on mobile devices." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.

Yariv, Lior, et al. "BakedSDF: Meshing neural SDFs for real-time view synthesis." ACM SIGGRAPH 2023 Conference Proceedings. 2023.