Paper | March 2026

Drop-In Perceptual Optimization for 3D Gaussian Splatting

Authors: Ezgi Özyılkan†‡, Zhiqi Chen‡, Oren Rippel, Jona Ballé†, Kedar Tatwawadi

Although their output is ultimately consumed by human viewers, 3D Gaussian Splatting (3DGS) methods often rely on ad-hoc combinations of pixel-level losses, which results in blurry renderings. To address this, we systematically explore perceptual optimization strategies for 3DGS by searching over a diverse set of distortion losses. We conduct the first large-scale human subjective study on 3DGS, involving 39,320 pairwise ratings across several datasets and 3DGS frameworks. A regularized version of Wasserstein Distortion, which we call WD-R, emerges as the clear winner, excelling at recovering fine textures without incurring a higher splat count. Raters prefer WD-R more than 2.3× over the original 3DGS loss and 1.5× over the current best method, Perceptual-GS. WD-R also consistently achieves state-of-the-art LPIPS, DISTS, and FID scores across various datasets. It generalizes to recent frameworks such as Mip-Splatting and Scaffold-GS: replacing the original loss with WD-R consistently enhances perceptual quality within a similar resource budget (number of splats for Mip-Splatting, model size for Scaffold-GS), and the resulting reconstructions are preferred by human raters 1.8× and 3.6×, respectively. We also find that this carries over to the task of 3DGS scene compression, with ≈50% bitrate savings for comparable perceptual metric performance.

† New York University (Tandon School of Engineering)
‡ Equal contribution

Figure 1: 3DGS representation and compression frameworks optimized using 2D distortion and rate-distortion objectives, incorporating perceptual losses as part of the training framework.

Figure 2: Bayesian Elo scores for 3DGS representation methods across indoor scenes (Deep Blending, Mip-NeRF 360 indoor), outdoor scenes (Tanks & Temples, Mip-NeRF 360 outdoor, and BungeeNeRF), and all scenes combined. WD-R and WD achieve the highest scores in all settings (within the 95% confidence interval).
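
To relate the preference ratios quoted in the abstract to an Elo-style scale like the one in Figure 2, the following is a minimal sketch under two stated assumptions: that a ratio such as "2.3×" can be read as win-to-loss odds in pairwise comparisons, and that scores follow the conventional logistic Elo model with a scale of 400. Neither assumption is confirmed by the text above, so the resulting gaps are illustrative and need not match the values plotted in the figure. The function names are hypothetical.

```python
import math

# Assumption: the conventional logistic Elo model, in which a rating gap d
# corresponds to win odds of 10 ** (d / scale), with scale = 400.
ELO_SCALE = 400.0


def odds_to_win_probability(odds: float) -> float:
    """Convert win-to-loss odds (e.g. 2.3 means 2.3 wins per loss) to a win probability."""
    return odds / (1.0 + odds)


def odds_to_elo_gap(odds: float, scale: float = ELO_SCALE) -> float:
    """Rating gap implied by the given odds under the logistic Elo model."""
    return scale * math.log10(odds)


def elo_gap_to_win_probability(gap: float, scale: float = ELO_SCALE) -> float:
    """Expected win probability of the higher-rated method for a given rating gap."""
    return 1.0 / (1.0 + 10.0 ** (-gap / scale))


if __name__ == "__main__":
    # Ratios quoted in the abstract, read here as pairwise odds (an assumption).
    quoted = [
        ("WD-R vs. original 3DGS loss", 2.3),
        ("WD-R vs. Perceptual-GS", 1.5),
        ("WD-R vs. original loss, Mip-Splatting", 1.8),
        ("WD-R vs. original loss, Scaffold-GS", 3.6),
    ]
    for label, odds in quoted:
        gap = odds_to_elo_gap(odds)
        prob = odds_to_win_probability(odds)
        print(f"{label}: win probability {prob:.3f}, implied Elo gap {gap:.1f}")
```

Under these assumptions, odds of 2.3 correspond to the preferred method winning roughly 70% of pairwise comparisons. A Bayesian Elo fit additionally accounts for priors and for how many ratings each pair received, which this sketch ignores.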
