
DiffusionDriveV2

Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving

Jialv Zou1, Shaoyu Chen3,†, Bencheng Liao2,1, Zhiyu Zheng4, Yuehao Song1,
Lefei Zhang4, Qian Zhang3, Wenyu Liu1, Xinggang Wang1,📧

1 School of EIC, Huazhong University of Science and Technology
2 Institute of Artificial Intelligence, Huazhong University of Science and Technology
3 Horizon Robotics
4 School of Computer Science, Wuhan University

📧 corresponding author | †Project Lead


News

  • Dec. 9th, 2025: We released our paper on arXiv, along with the initial version of the code and weights, documentation, and training/evaluation scripts.


Introduction

Generative diffusion models for end-to-end autonomous driving often suffer from mode collapse, tending to generate conservative and homogeneous behaviors. While DiffusionDrive employs predefined anchors representing different driving intentions to partition the action space and generate diverse trajectories, its reliance on imitation learning lacks sufficient constraints, resulting in a dilemma between diversity and consistent high quality. In this work, we propose DiffusionDriveV2, which leverages reinforcement learning to both constrain low-quality modes and explore for superior trajectories. This significantly enhances the overall output quality while preserving the inherent multimodality of its core Gaussian Mixture Model. First, we use scale-adaptive multiplicative noise, ideal for trajectory planning, to promote broad exploration. Second, we employ intra-anchor GRPO to manage advantage estimation among samples generated from a single anchor, and inter-anchor truncated GRPO to incorporate a global perspective across different anchors, preventing improper advantage comparisons between distinct intentions (e.g., turning vs. going straight), which can lead to further mode collapse. DiffusionDriveV2 achieves 91.2 PDMS on the NAVSIM v1 dataset and 85.5 EPDMS on the NAVSIM v2 dataset in closed-loop evaluation with an aligned ResNet-34 backbone, setting a new record. Further experiments validate that our approach resolves the dilemma between diversity and consistent high quality for truncated diffusion models, achieving the best trade-off.
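The advantage-estimation scheme described above can be illustrated with a minimal numerical sketch. This is not the paper's implementation: the function names, the group-wise normalization details, and the clipping threshold are all assumptions chosen for clarity. It shows the three ingredients in miniature: scale-adaptive multiplicative noise for exploration, intra-anchor GRPO-style advantage normalization within samples from one anchor, and inter-anchor truncated advantages that compare anchors globally while clipping the comparison so distinct intentions (e.g., turning vs. going straight) are not over-penalized.

```python
import numpy as np

def multiplicative_noise(traj, scale=0.05, rng=None):
    """Perturb a trajectory with noise proportional to waypoint magnitude,
    so exploration adapts to the scale of the motion (illustrative only)."""
    rng = rng or np.random.default_rng(0)
    return traj * (1.0 + scale * rng.standard_normal(traj.shape))

def intra_anchor_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: normalize rewards within the group of
    samples generated from a single anchor."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

def inter_anchor_truncated_advantages(anchor_rewards, clip=1.0, eps=1e-8):
    """Compare anchor-level mean rewards across anchors, then truncate the
    advantage so low-reward intentions are constrained but not crushed,
    preserving multimodality (clip value is an assumption)."""
    means = np.array([np.mean(r) for r in anchor_rewards])
    adv = (means - means.mean()) / (means.std() + eps)
    return np.clip(adv, -clip, clip)

# Toy usage: three anchors, each with a small group of sampled rewards.
per_anchor = [[0.9, 0.8], [0.1, 0.2], [0.5, 0.5]]
intra = intra_anchor_advantages(per_anchor[0])          # within-anchor ranking
inter = inter_anchor_truncated_advantages(per_anchor)   # clipped to [-1, 1]
```

A plausible training step would then weight each sample's policy-gradient term by the sum (or product) of its intra- and inter-anchor advantages; how the two are combined in DiffusionDriveV2 is specified in the paper, not here.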

Overall architecture of DiffusionDriveV2.

Qualitative Results on NAVSIM

  • Going straight behavior.
  • Turning left with diverse lane-changing behavior.
  • Complex driving scenarios with multiple potential solutions.

Getting Started

Contact

If you have any questions, please contact Jialv Zou via email ([email protected]).

Acknowledgement

DiffusionDriveV2 is greatly inspired by the following outstanding contributions to the open-source community: NAVSIM, DiffusionDrive, DPPO, and DeepSeek-R1.

Citation

If you find DiffusionDriveV2 useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entry.

@misc{zou2025diffusiondrivev2reinforcementlearningconstrainedtruncated,
      title={DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving}, 
      author={Jialv Zou and Shaoyu Chen and Bencheng Liao and Zhiyu Zheng and Yuehao Song and Lefei Zhang and Qian Zhang and Wenyu Liu and Xinggang Wang},
      year={2025},
      eprint={2512.07745},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2512.07745}, 
}
