Large-Scale High-Quality 3D Gaussian Head Reconstruction from Multi-View Captures

nimda May 8, 2026

We propose HeadsUp, a scalable feedforward method for reconstructing high-quality 3D Gaussian heads from large multi-camera setups. Our method uses an efficient encoder-decoder architecture that compresses the input views into a compact latent representation. This latent representation is then decoded into a set of UV-parameterized 3D Gaussians anchored to a central head template. This UV parameterization decouples the number of 3D Gaussians from the number and resolution of the input images, enabling training with high-resolution input views. We train and evaluate our model on an internal dataset of over 10,000 subjects, an order of magnitude larger than existing multi-view human head datasets. HeadsUp achieves state-of-the-art reconstruction quality and generalizes to novel identities without test-time optimization. We analyze in detail how our model scales with the number of identities, the number of views, and model capacity, revealing practical insights into the trade-off between quality and compute. Finally, we highlight the potential of our latent space by demonstrating two downstream applications: generating novel 3D identities and animating 3D heads in mixed situations.
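The point that the number of 3D Gaussians does not depend on the number or resolution of the input images can be illustrated with a toy sketch. This is not the HeadsUp architecture: the functions `encode_views`, `decode_uv_gaussians`, and `make_views`, the latent size of 8, the 4x4 UV grid, and the flat "template" patch are all hypothetical stand-ins chosen for illustration. The sketch only shows the structural idea that a fixed-size latent feeds a decoder that emits one Gaussian per UV texel.

```python
import random


def encode_views(views, latent_dim=8):
    """Toy encoder: fold any number of views of any resolution into
    one fixed-length latent vector (hypothetical, not a learned model)."""
    latent = [0.0] * latent_dim
    count = 0
    for view in views:  # each view is a 2D list of grayscale pixels
        for row in view:
            for i, pixel in enumerate(row):
                latent[i % latent_dim] += pixel
                count += 1
    # Normalize by the total pixel count so the scale stays bounded.
    return [value / max(count, 1) for value in latent]


def decode_uv_gaussians(latent, uv_res=4):
    """Toy decoder: emit exactly one Gaussian per texel of a UV grid."""
    gaussians = []
    for v in range(uv_res):
        for u in range(uv_res):
            # Assumed template: a flat patch indexed by (u, v).
            anchor = (u / uv_res, v / uv_res, 0.0)
            offset = latent[(u + v) % len(latent)]
            gaussians.append({
                "uv": (u, v),
                "position": (anchor[0], anchor[1], anchor[2] + offset),
            })
    return gaussians


def make_views(n_views, height, width, seed=0):
    """Generate random grayscale images as stand-ins for camera views."""
    rng = random.Random(seed)
    return [[[rng.random() for _ in range(width)] for _ in range(height)]
            for _ in range(n_views)]


# 4 low-resolution views versus 16 higher-resolution views.
small = decode_uv_gaussians(encode_views(make_views(4, 8, 8)))
large = decode_uv_gaussians(encode_views(make_views(16, 32, 32)))
print(len(small), len(large))  # both equal uv_res * uv_res = 16
```

In this sketch the Gaussian count is set only by `uv_res`, so adding views or raising their resolution changes the cost of encoding but not the size of the output representation.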