



We propose an adaptive sampling framework for 3D Gaussian Splatting (3DGS) that leverages comprehensive multi-view photometric error signals within a unified Metropolis-Hastings approach. Traditional 3DGS methods rely heavily on heuristic density-control mechanisms (e.g., cloning, splitting, and pruning), which can lead to redundant computations or the premature removal of beneficial Gaussians. Our framework overcomes these limitations by reformulating densification and pruning as a probabilistic sampling process, dynamically inserting and relocating Gaussians based on aggregated multi-view errors and opacity scores. Guided by Bayesian acceptance tests derived from these error-based importance scores, our method substantially reduces reliance on heuristics, offers greater flexibility, and adaptively infers Gaussian distributions without requiring predefined scene complexity. Experiments on benchmark datasets, including Mip-NeRF360, Tanks and Temples, and Deep Blending, show that our approach reduces the number of Gaussians needed, enhancing computational efficiency while matching or modestly surpassing the view-synthesis quality of state-of-the-art models.
Our framework reframes 3D Gaussian Splatting (3DGS) densification as a principled Metropolis-Hastings (MH) sampling process. We aim to draw Gaussian configurations \(\Theta\) from a target distribution proportional to \(e^{-\mathcal{E}(\Theta)}\), defined by our energy function: \[\mathcal{E}(\Theta) = \mathcal{L}(\Theta) + \lambda_v \sum_{v \in \mathcal{V}} \ln (1 + c_\Theta(v))\] This balances the reconstruction loss \(\mathcal{L}(\Theta)\) against a novel logarithmic sparsity prior (the second term). The prior, with \(c_\Theta(v)\) denoting the number of Gaussians in voxel \(v\), penalizes overcrowding and promotes an efficient scene representation.
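To make the sparsity term concrete, the short Python sketch below evaluates \(\lambda_v \sum_{v} \ln(1 + c_\Theta(v))\) by bucketing Gaussian centers into a uniform voxel grid and counting occupancy. This is a minimal illustration under our own assumptions, not the paper's implementation: the array layout, the `voxel_size` and `lambda_v` defaults, and the uniform-grid voxelization are all placeholders.

```python
import numpy as np

def sparsity_prior(centers, voxel_size=0.5, lambda_v=0.1):
    """Logarithmic sparsity prior: lambda_v * sum_v ln(1 + c(v)).

    centers: (N, 3) array of Gaussian means (assumed layout); each center
    is assigned to the uniform voxel containing it, and c(v) is the
    per-voxel Gaussian count.
    """
    voxels = np.floor(centers / voxel_size).astype(np.int64)
    _, counts = np.unique(voxels, axis=0, return_counts=True)
    return lambda_v * np.log1p(counts).sum()

def energy(centers, recon_loss, voxel_size=0.5, lambda_v=0.1):
    """E(Theta) = L(Theta) + sparsity prior; recon_loss stands in for L(Theta)."""
    return recon_loss + sparsity_prior(centers, voxel_size, lambda_v)
```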
To guide new Gaussian proposals, we compute a multi-view importance score \(s(p)\): \[s(p) = \sigma(\alpha O(p) + \beta \mathrm{SSIM}_{\mathrm{agg}}(p) + \gamma \mathrm{L1}_{\mathrm{agg}}(p))\] This score fuses opacity and aggregated photometric errors (L1, SSIM) to highlight regions needing refinement. Each proposed Gaussian \(i\) (targeting voxel \(v'\)) is then accepted with probability \(\rho(i)\), our 'logistic-voxel product' derived from MH principles: \[\rho(i) = \sigma(I(i)) \cdot \frac{1}{1+\lambda_v c_\Theta(v')}\] Here, \(I(i)\), derived from \(s(p)\), acts as a lightweight surrogate for photometric gain, while the second term is a density factor that directly incorporates our sparsity prior. By balancing fidelity against compactness, this probabilistic mechanism enables faster convergence and yields a high-quality Gaussian representation.
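The following Python sketch implements \(s(p)\) and the logistic-voxel product \(\rho(i)\) exactly as written above. The weights `alpha`, `beta`, `gamma` and the value of `lambda_v` are assumed placeholders, and since the mapping from \(s(p)\) to \(I(i)\) is not spelled out here, \(I(i)\) is treated as a given scalar.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def importance_score(opacity, ssim_agg, l1_agg, alpha=1.0, beta=1.0, gamma=1.0):
    """s(p) = sigma(alpha * O(p) + beta * SSIM_agg(p) + gamma * L1_agg(p))."""
    return sigmoid(alpha * opacity + beta * ssim_agg + gamma * l1_agg)

def acceptance_probability(importance, voxel_count, lambda_v=0.1):
    """rho(i) = sigma(I(i)) / (1 + lambda_v * c(v')): the fidelity term
    damped by the density factor from the sparsity prior."""
    return sigmoid(importance) / (1.0 + lambda_v * voxel_count)

def accept_proposal(importance, voxel_count, lambda_v=0.1, rng=None):
    """Bernoulli test: accept the proposed Gaussian with probability rho(i)."""
    rng = rng or np.random.default_rng()
    return rng.random() < acceptance_probability(importance, voxel_count, lambda_v)
```

Note how the density factor operates: a proposal targeting an already-crowded voxel (large \(c_\Theta(v')\)) is accepted less often, which is how the sparsity prior enters the sampler.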
Dotted vertical lines indicate when each method reaches 98% of its final PSNR, highlighting near-optimal convergence speed. While both methods ultimately attain comparable final PSNR, ours converges substantially faster.
@misc{kim2025metropolishastingssampling3dgaussian,
  title={Metropolis-Hastings Sampling for 3D Gaussian Reconstruction},
  author={Hyunjin Kim and Haebeom Jung and Jaesik Park},
  year={2025},
  eprint={2506.12945},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2506.12945},
}