4DTAM: Non-Rigid Tracking and Mapping via Dynamic Surface Gaussians

CVPR 2025
(Main Track and 4D Vision Workshop)

Hidenobu Matsuki, Gwangbin Bae, Andrew J. Davison

Dyson Robotics Laboratory, Imperial College London

Links: Paper · arXiv · Code · Dataset

4D reconstruction results produced by our method, including rendered appearance, motion, surface normals, depth, underlying Gaussian visualizations, and camera ego-motion.

Abstract

We propose the first 4D tracking and mapping method that jointly performs camera localization and non-rigid surface reconstruction via differentiable rendering. Our approach captures 4D scenes from an online stream of color images with depth measurements or predictions by jointly optimizing scene geometry, appearance, dynamics, and camera ego-motion. Although natural environments exhibit complex non-rigid motions, 4D-SLAM remains relatively underexplored due to its inherent challenges; even with 2.5D signals, the problem is ill-posed because of the high dimensionality of the optimization space. To overcome these challenges, we first introduce a SLAM method based on Gaussian surface primitives that leverages depth signals more effectively than 3D Gaussians, thereby achieving accurate surface reconstruction. To further model non-rigid deformations, we employ a warp field represented by a multi-layer perceptron (MLP) and introduce a novel camera pose estimation technique along with surface regularization terms that facilitate spatio-temporal reconstruction. In addition to these algorithmic challenges, a significant hurdle in 4D-SLAM research is the lack of reliable ground truth and evaluation protocols, primarily due to the difficulty of 4D capture with commodity sensors. To address this, we present a novel open synthetic dataset of everyday objects with diverse motions, leveraging large-scale object models and animation modeling. In summary, we open up modern 4D-SLAM research by introducing a novel method and evaluation protocols grounded in contemporary vision and rendering techniques.

Method Overview
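
As a rough illustration of the deformation model described in the abstract, the sketch below shows a time-conditioned warp-field MLP displacing canonical Gaussian primitive centres inside a toy optimization step. All names, dimensions, and the placeholder loss are illustrative assumptions for this sketch and do not reflect the released implementation.

import torch
import torch.nn as nn

class WarpFieldMLP(nn.Module):
    # Maps a canonical Gaussian centre plus a timestamp to a warped centre.
    def __init__(self, hidden_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + 1, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 3),  # per-primitive 3D displacement
        )

    def forward(self, xyz, t):
        # xyz: (N, 3) canonical centres, t: (N, 1) normalised timestamps
        return xyz + self.net(torch.cat([xyz, t], dim=-1))

# Toy joint-optimization step: warp canonical centres to the current frame;
# a real system would render the warped Gaussians and compare against the
# observed color/depth images instead of using the placeholder loss below.
canonical_xyz = torch.randn(1000, 3, requires_grad=True)  # hypothetical Gaussian centres
warp = WarpFieldMLP()
optimizer = torch.optim.Adam([canonical_xyz, *warp.parameters()], lr=1e-3)

t = torch.full((1000, 1), 0.5)         # timestamp of the current frame
deformed_xyz = warp(canonical_xyz, t)  # centres warped to time t
loss = deformed_xyz.pow(2).mean()      # placeholder for a rendering/depth loss
loss.backward()
optimizer.step()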

Sim4D: A Scalable 4D-SLAM Dataset via Web-Sourced Animated Meshes

The dataset is available at this link.

Live Reconstruction Process (8× speed)

Side-by-side comparison: Input · MonoGS · 4DTAM

Additional Results

Acknowledgement

Research presented in this paper was supported by Dyson Technology Ltd. The authors would like to thank members of the Dyson Robotics Lab for insightful feedback and discussions.

BibTeX

@inproceedings{4DTAM,
  title={4DTAM: Non-Rigid Tracking and Mapping via Dynamic Surface Gaussians},
  author={Matsuki, Hidenobu and Bae, Gwangbin and Davison, Andrew J.},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2025}
}