EMD: Explicit Motion Modeling for High-Quality Street Gaussian Splatting

Xiaobao Wei*,†, Qingpo Wuwu*,†, Zhongyu Zhao, Zhuangzhe Wu, Nan Huang, Ming Lu, Ningning Ma, Shanghang Zhang

Peking University, Autonomous Driving Division, NIO

*Equal contribution

Work done during an internship at NIO

Corresponding Author

Enhanced Scene Reconstruction with Motion Modeling

EMD handles the varying motion speeds found in street scenes, which improves reconstruction quality.

[Figure: result tabs for S3Gaussian Results and StreetGaussian Results. Panels: Ground Truth, S3Gaussian, S3Gaussian + Ours; Ground Truth, S3Gaussian Error, S3Gaussian + Ours Error.]

EMD: A Plug-and-Play Motion Modeling Framework

Our EMD framework introduces two key components: Motion-aware Feature Encoding for capturing dynamic characteristics and Dual-scale Deformation Framework for handling varying motion speeds. This architecture enables effective modeling of both fast global motions and slow local deformations in complex street scenes.
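To make the two-stage idea concrete, here is a minimal NumPy sketch of a coarse-then-fine deformation driven by per-Gaussian motion embeddings. It is only an illustration under stated assumptions: the page does not specify the network layout, so the two-layer perceptrons, the sinusoidal time encoding, the embedding size, and every function name below (`time_encoding`, `init_mlp`, `dual_scale_deform`) are hypothetical and not the authors' implementation. The weights are random; in practice they and the embeddings would be learned.

```python
import numpy as np

def time_encoding(t, num_freqs=4):
    """Sinusoidal encoding of a scalar timestamp t (assumed to lie in [0, 1])."""
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi
    return np.concatenate([np.sin(freqs * t), np.cos(freqs * t)])

def init_mlp(in_dim, hidden_dim, out_dim, rng, scale=0.1):
    """Random weights for a two-layer perceptron (illustrative, untrained)."""
    return {
        "w1": rng.normal(0.0, scale, (in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "w2": rng.normal(0.0, scale, (hidden_dim, out_dim)),
        "b2": np.zeros(out_dim),
    }

def mlp(x, params):
    """Two-layer perceptron with a ReLU hidden layer."""
    hidden = np.maximum(x @ params["w1"] + params["b1"], 0.0)
    return hidden @ params["w2"] + params["b2"]

def dual_scale_deform(means, motion_emb, t, coarse_net, fine_net):
    """Deform Gaussian centers at time t in two stages.

    means:      (N, 3) canonical Gaussian centers
    motion_emb: (N, D) per-Gaussian motion embeddings (learnable in practice)
    Returns the deformed centers plus the coarse and fine offsets.
    """
    n = means.shape[0]
    t_feat = np.tile(time_encoding(t), (n, 1))

    # Coarse stage: assumed to absorb fast, global motion.
    coarse_in = np.concatenate([means, motion_emb, t_feat], axis=1)
    coarse = mlp(coarse_in, coarse_net)

    # Fine stage: a residual on top of the coarse result for slow, local motion.
    fine_in = np.concatenate([means + coarse, motion_emb, t_feat], axis=1)
    fine = mlp(fine_in, fine_net)

    return means + coarse + fine, coarse, fine

# Usage with toy sizes (all values are placeholders).
rng = np.random.default_rng(0)
num_gaussians, emb_dim, num_freqs = 5, 8, 4
in_dim = 3 + emb_dim + 2 * num_freqs
means = rng.normal(size=(num_gaussians, 3))
motion_emb = rng.normal(size=(num_gaussians, emb_dim))
coarse_net = init_mlp(in_dim, 32, 3, rng)
fine_net = init_mlp(in_dim, 32, 3, rng)
deformed, coarse, fine = dual_scale_deform(means, motion_emb, 0.5, coarse_net, fine_net)
print(deformed.shape)
```

Because the embeddings are an extra per-Gaussian input rather than a change to the renderer, a sketch like this can sit in front of an existing baseline's rasterization step, which is one plausible reading of "plug-and-play".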

Abstract

Photorealistic reconstruction of street scenes is essential for developing real-world simulators in autonomous driving. While recent methods based on 3D/4D Gaussian Splatting (GS) have demonstrated promising results, they still encounter challenges in complex street scenes due to the unpredictable motion of dynamic objects. Current methods typically decompose street scenes into static and dynamic objects, learning the Gaussians in either a supervised manner (e.g., with 3D bounding boxes) or a self-supervised manner (e.g., without 3D bounding boxes). However, these approaches do not effectively model the motions of dynamic objects (e.g., the motion speed of pedestrians is clearly different from that of vehicles), resulting in suboptimal scene decomposition.

To address this, we propose Explicit Motion Decomposition (EMD), which models the motions of dynamic objects by introducing learnable motion embeddings to the Gaussians, enhancing the decomposition in street scenes. The proposed EMD is a plug-and-play approach applicable to various baseline methods. We also propose tailored training strategies to apply EMD to both supervised and self-supervised baselines. Through comprehensive experimentation, we illustrate the effectiveness of our approach with various established baselines.

Improved Scene Reconstruction

S3Gaussian
S3Gaussian + Ours

Comprehensive evaluation shows a PSNR improvement of +1.81 dB on full scenes and +2.81 dB in vehicle regions.
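For readers unfamiliar with region-restricted metrics, the following sketch shows how PSNR can be computed over a whole image or only over masked pixels such as vehicle regions. It assumes images are floating-point arrays in [0, 1] and that a boolean per-pixel mask is available; the function name `psnr` and the masking convention are illustrative assumptions, not the evaluation code behind the numbers on this page.

```python
import numpy as np

def psnr(pred, target, mask=None, max_val=1.0):
    """Peak signal-to-noise ratio in dB.

    pred, target: arrays of shape (H, W, 3) with values in [0, max_val]
    mask:         optional boolean array of shape (H, W); when given, only
                  the selected pixels contribute to the mean squared error
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    sq_err = (pred - target) ** 2
    if mask is not None:
        mask = np.asarray(mask, dtype=bool)
        if not mask.any():
            raise ValueError("mask selects no pixels")
        sq_err = sq_err[mask]  # keeps all channels of the selected pixels
    mse = sq_err.mean()
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy example: the error is concentrated in the top half of the image.
target = np.zeros((4, 4, 3))
pred = np.zeros((4, 4, 3))
pred[:2] = 0.1
region = np.zeros((4, 4), dtype=bool)
region[:2] = True

full_psnr = psnr(pred, target)
region_psnr = psnr(pred, target, mask=region)
```

In this toy case the masked score is lower than the full-image score because the mask isolates the pixels where the error lives, which is why dynamic-object regions are often reported separately from full scenes.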

Error Analysis & Visualization

S3Gaussian
S3Gaussian + Ours

High-Quality Street Scene Reconstruction

Ground Truth

StreetGaussian + Ours

Reconstruction Error

Results show that PSNR increases by 1.42 dB and SSIM improves by 1.1% for full-scene synthesis.

Ablation Study

Examining the contribution of each component in our EMD framework

Qualitative ablation study results across three camera views from the Waymo dataset.

BibTeX

@article{wei2024emd,
  author  = {Wei, Xiaobao and Wuwu, Qingpo and Zhao, Zhongyu and Wu, Zhuangzhe and Huang, Nan and Lu, Ming and Ma, Ningning and Zhang, Shanghang},
  title   = {EMD: Explicit Motion Modeling for High-Quality Street Gaussian Splatting},
  journal = {Under Review},
  year    = {2024},
}