DrivingGaussian: Composite Gaussian Splatting for Surrounding Dynamic Autonomous Driving Scenes


Xiaoyu Zhou1, Zhiwei Lin1, Xiaojun Shan1, Yongtao Wang1, Deqing Sun2, Ming-Hsuan Yang3

1Wangxuan Institute of Computer Technology, Peking University, 2Google Research, 3University of California, Merced

DrivingGaussian achieves photorealistic rendering for surrounding dynamic autonomous driving scenes. Naive approaches [14, 48] either produce unpleasant artifacts and blurring in the large-scale background or struggle to reconstruct dynamic objects and detailed scene geometry. DrivingGaussian introduces Composite Gaussian Splatting to efficiently represent the static background and multiple dynamic objects in complex surrounding driving scenes, enabling high-quality surround-view synthesis across multiple cameras and long-term dynamic scene reconstruction.

Abstract

We present DrivingGaussian, an efficient and effective framework for representing surrounding dynamic autonomous driving scenes. For complex scenes with moving objects, we first sequentially and progressively model the static background of the entire scene with incremental static 3D Gaussians. We then leverage a composite dynamic Gaussian graph to handle multiple moving objects, individually reconstructing each object and restoring their accurate positions and occlusion relationships within the scene. We further use a LiDAR prior for Gaussian Splatting to reconstruct scenes in greater detail and maintain panoramic consistency. DrivingGaussian outperforms existing methods in driving scene reconstruction and enables photorealistic surround-view synthesis with high fidelity and multi-camera consistency. The source code and trained models will be released.
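To make the composite representation concrete, here is a minimal sketch of how a static background and per-object dynamic Gaussians could be composed into a single splattable set at a given timestamp. All names here (StaticGaussians, DynamicNode, compose_scene) are hypothetical illustrations under our own simplifying assumptions, not the released API.

```python
# Hedged sketch: compose static background Gaussians with dynamic object
# nodes at time t. A standard splatting rasterizer would consume the output.
import numpy as np

class StaticGaussians:
    """Background Gaussians, already expressed in world coordinates."""
    def __init__(self, means, covs, colors, opacities):
        self.means = means          # (N, 3) centers
        self.covs = covs            # (N, 3, 3) covariances
        self.colors = colors        # (N, 3)
        self.opacities = opacities  # (N,)

class DynamicNode:
    """One moving object: Gaussians in a canonical frame plus a pose track."""
    def __init__(self, means, covs, colors, opacities, poses):
        self.means, self.covs = means, covs
        self.colors, self.opacities = colors, opacities
        self.poses = poses  # {timestamp: (R 3x3, t 3)} object-to-world poses

    def at_time(self, t):
        R, trans = self.poses[t]
        means = self.means @ R.T + trans  # rotate and translate centers
        covs = R @ self.covs @ R.T        # rotate covariances: R Sigma R^T
        return means, covs

def compose_scene(static, nodes, t):
    """Concatenate the background and all objects present at time t."""
    means, covs = [static.means], [static.covs]
    colors, opac = [static.colors], [static.opacities]
    for node in nodes:
        if t in node.poses:  # object only exists at its observed timestamps
            m, c = node.at_time(t)
            means.append(m); covs.append(c)
            colors.append(node.colors); opac.append(node.opacities)
    return (np.concatenate(means), np.concatenate(covs),
            np.concatenate(colors), np.concatenate(opac))

# Toy usage: a two-Gaussian background plus one moving object.
static = StaticGaussians(np.zeros((2, 3)), np.tile(np.eye(3), (2, 1, 1)),
                         np.ones((2, 3)), np.ones(2))
car = DynamicNode(np.zeros((1, 3)), np.tile(np.eye(3), (1, 1, 1)),
                  np.ones((1, 3)), np.ones(1),
                  {0: (np.eye(3), np.array([5.0, 0.0, 0.0]))})
means, covs, colors, opac = compose_scene(static, [car], t=0)
print(means.shape)  # (3, 3)
```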


Overall pipeline of our method.

Left: DrivingGaussian takes sequential data from multiple sensors, including multi-camera images and LiDAR. Middle: To represent large-scale dynamic driving scenes, we propose Composite Gaussian Splatting, which consists of two components: the first incrementally reconstructs the extensive static background, while the second constructs multiple dynamic objects with a Gaussian graph and dynamically integrates them into the scene. Right: DrivingGaussian demonstrates strong performance across multiple tasks and application scenarios.
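As a rough, runnable toy of the incremental idea behind the first component, the sketch below processes frames in temporally ordered bins and grows the background model bin by bin. The actual method optimizes 3D Gaussian parameters against the images in each bin while keeping previously reconstructed regions as a prior; here we merely accumulate points, purely for illustration.

```python
# Illustrative toy of incremental (binned) background reconstruction.
import numpy as np

def incremental_static_reconstruction(frame_points, bin_size=3):
    """frame_points: list of (N_i, 3) arrays, one per time step."""
    background = np.empty((0, 3))
    for start in range(0, len(frame_points), bin_size):
        bin_pts = np.concatenate(frame_points[start:start + bin_size])
        # Extend the model with observations from the current bin only;
        # earlier regions stay fixed as the ego vehicle moves forward.
        background = np.concatenate([background, bin_pts])
    return background

# Toy usage: ten frames of random "static" points.
frames = [np.random.rand(100, 3) for _ in range(10)]
print(incremental_static_reconstruction(frames).shape)  # (1000, 3)
```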


Qualitative comparison on dynamic reconstruction.

We demonstrate qualitative comparison results against our main competitors, EmerNeRF and 3D-GS, on dynamic reconstruction for 4D driving scenes from nuScenes. DrivingGaussian enables high-quality reconstruction of dynamic objects moving at high speed while maintaining temporal consistency.

Visualization comparison using different initialization methods on KITTI-360.
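The initialization comparison above relates to the LiDAR prior mentioned in the abstract: instead of sparse SfM points, Gaussian centers can be seeded from an aggregated LiDAR point cloud. Below is a minimal sketch assuming a simple voxel-grid downsampling; the voxel size and the lidar_init helper are our own illustrative assumptions, not the paper's exact recipe.

```python
# Hedged sketch of LiDAR-prior initialization for 3D Gaussians.
import numpy as np

def lidar_init(points, voxel=0.5):
    """Downsample an aggregated LiDAR cloud to one Gaussian center per
    occupied voxel, with an isotropic scale tied to the voxel size."""
    keys = np.floor(points / voxel).astype(np.int64)
    _, idx = np.unique(keys, axis=0, return_index=True)
    means = points[idx]                       # one center per voxel
    scales = np.full((len(means), 3), voxel)  # initial Gaussian extents
    return means, scales

pts = np.random.rand(10000, 3) * 50.0  # stand-in for fused LiDAR sweeps
means, scales = lidar_init(pts)
print(means.shape, scales.shape)
```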

Example of corner case simulation.

Corner case simulation using DrivingGaussian: A man walking on the road suddenly falls, and a car approaches ahead.


Visualization of surrounding multi-camera views in nuScenes dataset.

The surrounding views have small overlaps between adjacent cameras but large gaps across time.


Qualitative comparison on the nuScenes dataset.

We demonstrate qualitative comparison results with our main competitors NSG, EmerNeRF, and 3D-GS on driving scene reconstruction on nuScenes.


Qualitative comparison on the KITTI-360 dataset.

We demonstrate qualitative comparison results with our main competitors DNMP [8] and 3D-GS [4] on driving scene reconstruction on KITTI-360.


Qualitative comparison with different initialization methods for 3D Gaussians.

Rendering with or without Incremental Static 3D Gaussians (IS3G), and rendering with or without the Composite Dynamic Gaussian Graph (CDGG).

IS3G ensures good geometry and topological integrity for static backgrounds in large-scale driving scenes. CDGG enables the reconstruction of dynamic objects (e.g., vehicles, bicycles, and pedestrians) at arbitrary speeds in driving scenes.
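One way to picture how CDGG handles objects at arbitrary speeds is that each node in the Gaussian graph carries a pose track that can be queried at any timestamp. The sketch below linearly interpolates translations only, omitting rotations for brevity; the keyframe format and function name are our own assumptions.

```python
# Hedged sketch: query an object node's position at an arbitrary timestamp.
import numpy as np

def interpolate_pose(poses, t):
    """poses: sorted list of (timestamp, translation (3,)) keyframes.
    Linearly interpolate the object's position at query time t."""
    times = np.array([p[0] for p in poses])
    trans = np.array([p[1] for p in poses])
    t = np.clip(t, times[0], times[-1])       # clamp to the observed track
    i = np.searchsorted(times, t, side="right") - 1
    i = min(i, len(times) - 2)                # stay within the last segment
    w = (t - times[i]) / (times[i + 1] - times[i])
    return (1 - w) * trans[i] + w * trans[i + 1]

keyframes = [(0.0, np.zeros(3)), (1.0, np.array([10.0, 0.0, 0.0]))]
print(interpolate_pose(keyframes, 0.25))  # [2.5 0. 0.]
```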

3D-GS vs. Ours


Results


3D-GS vs. Ours

EmerNeRF vs. Ours


3D-GS vs. Ours

EmerNeRF vs. Ours





Copyright © VDIG 2023