Surface Reconstruction from Gaussian Splatting via Novel Stereo Views

Yaniv Wolf*, Amit Bracha*, Ron Kimmel
* Indicates equal contribution
[Teaser figure: a reference image compared with reconstructions by SuGaR, BakedSDF, and our method]

TL;DR: We present a fast and accurate method for reconstructing smooth, geometrically consistent surfaces from Gaussian Splatting models. For a typical in-the-wild scene, our method requires only about 5 minutes of additional compute time on top of the 3DGS scene capture.

[Figure: overview of the proposed pipeline]

Abstract

Gaussian splatting for radiance field rendering has recently emerged as an efficient approach for accurate scene representation. It optimizes the location, size, color, and shape of a cloud of 3D Gaussian elements so that, after projection (splatting), they visually match a given set of images taken from various viewing directions. Yet, despite the proximity of the Gaussian elements to the shape boundaries, direct surface reconstruction of objects in the scene remains a challenge.
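For readers unfamiliar with the representation, below is a minimal sketch of the per-element parameters such a model optimizes. The field names are illustrative rather than taken from the 3DGS codebase; only the covariance factorization Sigma = R S S^T R^T follows the 3DGS paper.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Gaussian3D:
    """One element of the optimized Gaussian cloud (illustrative field names)."""
    mean: np.ndarray       # (3,) location in world coordinates
    scale: np.ndarray      # (3,) per-axis extent; with rotation, defines the covariance
    rotation: np.ndarray   # (4,) unit quaternion (w, x, y, z) orienting the Gaussian
    sh_coeffs: np.ndarray  # (K, 3) spherical-harmonic coefficients for view-dependent color
    opacity: float         # blending weight used during splatting

    def covariance(self) -> np.ndarray:
        """Sigma = R S S^T R^T, the anisotropic 3D covariance of the element."""
        w, x, y, z = self.rotation
        R = np.array([
            [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
            [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
            [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
        ])
        S = np.diag(self.scale)
        return R @ S @ S.T @ R.T
```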

We propose a novel approach for surface reconstruction from Gaussian splatting models. Rather than relying on the Gaussian elements' locations as a prior for surface reconstruction, we leverage the superior novel-view synthesis capabilities of 3DGS. To that end, we use the Gaussian splatting model to render pairs of stereo-calibrated novel views, from which we extract depth profiles using a stereo matching method. We then combine the extracted RGB-D images into a geometrically consistent surface. The resulting reconstruction is more accurate and shows finer details than other methods for surface reconstruction from Gaussian splatting models, while requiring significantly less compute time than competing surface reconstruction approaches.
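To make the data flow concrete, here is a minimal sketch of the three stages described above, under stated assumptions: `render_view(model, pose)` is a hypothetical stand-in for whichever 3DGS renderer is used, OpenCV's semi-global block matching stands in for the learned stereo matcher, and Open3D's TSDF fusion plays the role of the final depth-map integration. None of these names come from the released implementation.

```python
import numpy as np
import cv2
import open3d as o3d

# `render_view(model, cam_to_world)` is a hypothetical stand-in for any 3DGS
# renderer returning an HxWx3 uint8 RGB image for the given camera pose.

def stereo_pair(cam_to_world: np.ndarray, baseline: float):
    """The right camera is the left one translated by `baseline` along its
    local x-axis, giving a horizontally calibrated stereo pair."""
    right = cam_to_world.copy()
    right[:3, 3] += cam_to_world[:3, 0] * baseline  # camera x-axis in world coords
    return cam_to_world, right

def depth_from_stereo(left, right, fx: float, baseline: float):
    """Classical SGBM here is a stand-in for a learned stereo matcher;
    depth = fx * baseline / disparity for valid (positive) disparities."""
    sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
    disp = sgbm.compute(cv2.cvtColor(left, cv2.COLOR_RGB2GRAY),
                        cv2.cvtColor(right, cv2.COLOR_RGB2GRAY)).astype(np.float32) / 16.0
    return np.where(disp > 0, fx * baseline / np.maximum(disp, 1e-6), 0.0)

def reconstruct(model, poses, intrinsic: o3d.camera.PinholeCameraIntrinsic,
                fx: float, baseline: float = 0.1):
    """Render stereo pairs from the 3DGS model, extract depth, and fuse the
    resulting RGB-D images into one geometrically consistent mesh."""
    volume = o3d.pipelines.integration.ScalableTSDFVolume(
        voxel_length=0.01, sdf_trunc=0.04,
        color_type=o3d.pipelines.integration.TSDFVolumeColorType.RGB8)
    for pose in poses:
        left_pose, right_pose = stereo_pair(pose, baseline)
        left, right = render_view(model, left_pose), render_view(model, right_pose)
        depth = depth_from_stereo(left, right, fx, baseline)
        rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
            o3d.geometry.Image(left), o3d.geometry.Image(depth.astype(np.float32)),
            depth_scale=1.0, depth_trunc=5.0, convert_rgb_to_intensity=False)
        volume.integrate(rgbd, intrinsic, np.linalg.inv(pose))  # expects world-to-camera
    return volume.extract_triangle_mesh()
```

In practice the baseline is chosen relative to the scene scale, and a learned stereo network produces considerably sharper depth than block matching; the sketch only pins down the data flow from rendered stereo pairs to a fused mesh.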

We performed extensive testing of the proposed method on in-the-wild scenes captured with a smartphone, showcasing its superior reconstruction abilities. Additionally, we evaluated the proposed method on the Tanks and Temples benchmark, where it surpasses the current leading method for surface reconstruction from Gaussian splatting models.

In-the-Wild Scenes from Uncontrolled Smartphone Videos - Comparison with SuGaR

[Image grid: four in-the-wild scenes, each comparing the reference image with SuGaR and our reconstruction]

Semi-Automatic Masking using SAM and Depth Projections

[Image grid: four scenes, each showing the reference image, our full reconstruction, and our reconstruction after masking]
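As a rough illustration of the masking step, the sketch below uses the Segment Anything predictor to turn a single foreground click on one rendered view into a 2D mask, which is then applied to the matching depth map before fusion. The checkpoint filename and the click coordinates are placeholders, and the depth-projection logic that carries the mask to the remaining views is omitted.

```python
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

def mask_rendered_view(rgb: np.ndarray, depth: np.ndarray,
                       click_xy: tuple, checkpoint: str = "sam_vit_h.pth"):
    """Segment the object under a user click with SAM, then zero out depth
    everywhere else so that TSDF fusion only integrates the object."""
    sam = sam_model_registry["vit_h"](checkpoint=checkpoint)  # placeholder path
    predictor = SamPredictor(sam)
    predictor.set_image(rgb)  # expects an HxWx3 uint8 RGB image
    masks, scores, _ = predictor.predict(
        point_coords=np.array([click_xy], dtype=np.float32),
        point_labels=np.array([1]),  # 1 = foreground click
        multimask_output=True)
    mask = masks[np.argmax(scores)]    # keep the highest-scoring mask
    return np.where(mask, depth, 0.0)  # masked-out pixels get zero depth
```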

MobileBrick Dataset - Comparison with MVSFormer

[Image grid: four MobileBrick objects, each comparing the ground-truth model with MVSFormer and our reconstruction]

Interactive 3D Meshes (simplified/compressed to reduce size)

BibTeX

@article{wolf2024surface,
  title={Surface Reconstruction from Gaussian Splatting via Novel Stereo Views},
  author={Wolf, Yaniv and Bracha, Amit and Kimmel, Ron},
  journal={arXiv preprint arXiv:2404.01810},
  year={2024}
}

References

Barron, Jonathan T., et al. "Mip-NeRF 360: Unbounded anti-aliased neural radiance fields." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.

Cao, Chenjie, Xinlin Ren, and Yanwei Fu. "MVSFormer: Multi-view stereo by learning robust image features and temperature-based depth." arXiv preprint arXiv:2208.02541 (2022).

Guédon, Antoine, and Vincent Lepetit. "SuGaR: Surface-aligned Gaussian splatting for efficient 3D mesh reconstruction and high-quality mesh rendering." arXiv preprint arXiv:2311.12775 (2023).

Kerbl, Bernhard, et al. "3D Gaussian splatting for real-time radiance field rendering." ACM Transactions on Graphics 42.4 (2023): 1-14.

Kirillov, Alexander, et al. "Segment Anything." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023.

Knapitsch, Arno, et al. "Tanks and Temples: Benchmarking large-scale scene reconstruction." ACM Transactions on Graphics (ToG) 36.4 (2017): 1-13.

Li, Kejie, et al. "MobileBrick: Building LEGO for 3D reconstruction on mobile devices." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.

Yariv, Lior, et al. "BakedSDF: Meshing neural SDFs for real-time view synthesis." ACM SIGGRAPH 2023 Conference Proceedings. 2023.