
Awesome World Models for Autonomous Driving

A curated list of World Model papers for Autonomous Driving.

If you find any missing papers, feel free to create a pull request, open an issue, or email me / Qi Wang. Contributions in any form that make this list more comprehensive are welcome. 📣📣📣

If you find this repository useful, please consider giving us a star 🌟.

Feel free to share this list with others! 🥳🥳🥳

Workshop & Challenge

Papers

Original world model paper

  • Using Occupancy Grids for Mobile Robot Perception and Navigation [Paper]

Technical blog or video

  • Yann LeCun: A Path Towards Autonomous Machine Intelligence [Paper] [Video]
  • CVPR'23 WAD Keynote - Ashok Elluswamy, Tesla [Video]
  • Wayve Introducing GAIA-1: A Cutting-Edge Generative AI Model for Autonomy [blog]

    World models are the basis for the ability to predict what might happen next, which is fundamentally important for autonomous driving. They can act as a learned simulator, or a mental “what if” thought experiment for model-based reinforcement learning (RL) or planning. By incorporating world models into our driving models, we can enable them to understand human decisions better and ultimately generalise to more real-world situations.
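
To make the "learned simulator" idea in the quote above concrete, below is a minimal Python sketch of a world model used for "what if" planning: a model predicts the next state and reward, and a planner scores imagined action sequences entirely inside the model. Everything here (the `ToyWorldModel` class, `plan_actions`, the random linear dynamics) is a hypothetical illustration under simplifying assumptions, not the method of GAIA-1 or any paper listed below.

```python
# Minimal sketch: a world model as a learned simulator for planning.
# The dynamics here are a random toy stand-in for a trained model.
import numpy as np

rng = np.random.default_rng(0)

class ToyWorldModel:
    """Hypothetical stand-in for a learned model: s' = tanh(W_s s + W_a a), r = w · s'."""
    def __init__(self, state_dim=8, action_dim=2):
        self.W_s = rng.normal(scale=0.3, size=(state_dim, state_dim))
        self.W_a = rng.normal(scale=0.3, size=(state_dim, action_dim))
        self.w_r = rng.normal(size=state_dim)

    def step(self, state, action):
        """Predict the next state and a scalar reward from (state, action)."""
        next_state = np.tanh(self.W_s @ state + self.W_a @ action)
        reward = float(self.w_r @ next_state)
        return next_state, reward

def plan_actions(model, state, horizon=10, n_candidates=256, action_dim=2):
    """Random-shooting planner: imagine rollouts, keep the best first action."""
    best_return, best_action = -np.inf, None
    for _ in range(n_candidates):
        actions = rng.uniform(-1, 1, size=(horizon, action_dim))
        s, total = state.copy(), 0.0
        for a in actions:  # the "what if" rollout happens entirely in the model
            s, r = model.step(s, a)
            total += r
        if total > best_return:
            best_return, best_action = total, actions[0]
    return best_action, best_return

model = ToyWorldModel()
s0 = rng.normal(size=8)
action, ret = plan_actions(model, s0)
print(f"chosen first action: {action}, imagined return: {ret:.2f}")
```

Real systems replace the toy dynamics with a large learned video or latent-state model, and the random-shooting loop with gradient-based or CEM planners, or with a policy trained inside imagined rollouts (as in the Dreamer line of work below).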

Survey

  • A Survey on Multimodal Large Language Models for Autonomous Driving. WACVW 2024 [Paper] [Code]
  • Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond. arXiv 2024.5 [Paper] [Code]
  • World Models for Autonomous Driving: An Initial Survey. arXiv 2024.3 [Paper]

2024

  • [ViDAR] Visual Point Cloud Forecasting enables Scalable Autonomous Driving. CVPR 2024 [Paper] [Code]
  • [GenAD] Generalized Predictive Model for Autonomous Driving. CVPR 2024 [Paper] [Data]
  • [Cam4DOcc] Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications. CVPR 2024 [Paper] [Code]
  • [Drive-WM] Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving. CVPR 2024 [Paper] [Code]
  • [DriveWorld] DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving. CVPR 2024 [Paper]
  • [Panacea] Panacea: Panoramic and Controllable Video Generation for Autonomous Driving. CVPR 2024 [Paper] [Code]
  • [MagicDrive] MagicDrive: Street View Generation with Diverse 3D Geometry Control. ICLR 2024 [Paper] [Code]
  • [Copilot4D] Copilot4D: Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion. ICLR 2024 [Paper]
  • [SafeDreamer] SafeDreamer: Safe Reinforcement Learning with World Models. ICLR 2024 [Paper] [Code]
  • [DriveSim] Probing Multimodal LLMs as World Models for Driving. arXiv 2024.5 [Paper] [Code]
  • [RoboDreamer] RoboDreamer: Learning Compositional World Models for Robot Imagination. arXiv 2024.4 [Paper] [Code]
  • [LidarDM] LidarDM: Generative LiDAR Simulation in a Generated World. arXiv 2024.4 [Paper] [Code]
  • [3D-VLA] 3D-VLA: A 3D Vision-Language-Action Generative World Model. arXiv 2024.3 [Paper]
  • [DriveDreamer-2] DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation. arXiv 2024.3 [Paper] [Code]
  • [Think2Drive] Think2Drive: Efficient Reinforcement Learning by Thinking in Latent World Model for Quasi-Realistic Autonomous Driving. arXiv 2024.2 [Paper]

2023

  • [TrafficBots] TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction. ICRA 2023 [Paper] [Code]
  • [WoVoGen] WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation. arXiv 2023.12 [Paper] [Code]
  • [CTT] Categorical Traffic Transformer: Interpretable and Diverse Behavior Prediction with Tokenized Latent. arXiv 2023.11 [Paper]
  • [OccWorld] OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving. arXiv 2023.11 [Paper] [Code]
  • [MUVO] MUVO: A Multimodal Generative World Model for Autonomous Driving with Geometric Representations. arXiv 2023.11 [Paper]
  • [ADriver-I] ADriver-I: A General World Model for Autonomous Driving. arXiv 2023.11 [Paper]
  • [DrivingDiffusion] DrivingDiffusion: Layout-Guided multi-view driving scene video generation with latent diffusion model. arXiv 2023.10 [Paper] [Code]
  • [GAIA-1] GAIA-1: A Generative World Model for Autonomous Driving. arXiv 2023.9 [Paper]
  • [DriveDreamer] DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving. arXiv 2023.9 [Paper] [Code]
  • [UniWorld] UniWorld: Autonomous Driving Pre-training via World Models. arXiv 2023.8 [Paper] [Code]

2022

  • [MILE] Model-Based Imitation Learning for Urban Driving. NeurIPS 2022 [Paper] [Code]
  • [Iso-Dream] Iso-Dream: Isolating and Leveraging Noncontrollable Visual Dynamics in World Models. NeurIPS 2022 Spotlight [Paper] [Code]
  • [Symphony] Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation. ICRA 2022 [Paper]
  • Hierarchical Model-Based Imitation Learning for Planning in Autonomous Driving. IROS 2022 [Paper]
  • [SEM2] Enhance Sample Efficiency and Robustness of End-to-end Urban Autonomous Driving via Semantic Masked World Model. NeurIPS 2022 workshop [Paper]

Other World Model Papers

2024

  • [3D-VLA] 3D-VLA: A 3D Vision-Language-Action Generative World Model. ICML 2024 [Paper] [Code]
  • [Genie] Genie: Generative Interactive Environments. DeepMind [Paper] [Blog]
  • [Sora] Video generation models as world simulators. OpenAI [Technical report]
  • [IWM] Learning and Leveraging World Models in Visual Representation Learning. Meta AI [Paper]
  • [V-JEPA] V-JEPA: Video Joint Embedding Predictive Architecture. Meta AI [Blog] [Paper] [Code]
  • [Newton] Newton™ – a first-of-its-kind foundation model for understanding the physical world. Archetype AI [Blog]
  • [MAMBA] MAMBA: an Effective World Model Approach for Meta-Reinforcement Learning. ICLR 2024 [Paper] [Code]
  • [Compete and Compose] Compete and Compose: Learning Independent Mechanisms for Modular World Models. arXiv 2024.4 [Paper]
  • [MagicTime] MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators. arXiv 2024.4 [Paper] [Code]
  • [Dreaming of Many Worlds] Dreaming of Many Worlds: Learning Contextual World Models Aids Zero-Shot Generalization. arXiv 2024.3 [Paper] [Code]
  • [ManiGaussian] ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation. arXiv 2024.3 [Paper] [Code]
  • [LWM] World Model on Million-Length Video And Language With RingAttention. arXiv 2024.2 [Paper] [Code]
  • Planning with an Ensemble of World Models. OpenReview [Paper]
  • [WorldDreamer] WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens. arXiv 2024.1 [Paper] [Code]

2023

  • [IRIS] Transformers are Sample-Efficient World Models. ICLR 2023 Oral [Paper] [Torch Code]
  • [STORM] STORM: Efficient Stochastic Transformer based World Models for Reinforcement Learning. NeurIPS 2023 [Paper] [Torch Code]
  • [TWM] Transformer-based World Models Are Happy with 100k Interactions. ICLR 2023 [Paper] [Torch Code]
  • [Dynalang] Learning to Model the World with Language. arXiv 2023.8 [Paper] [JAX Code]
  • [CoWorld] Making Offline RL Online: Collaborative World Models for Offline Visual Reinforcement Learning. arXiv 2023.5 [Paper]
  • [DreamerV3] Mastering Diverse Domains through World Models. arXiv 2023.1 [Paper] [JAX Code] [Torch Code]

2022

  • [DreamerPro] DreamerPro: Reconstruction-Free Model-Based Reinforcement Learning with Prototypical Representations. ICML 2022 [Paper] [TF Code]
  • Deep Hierarchical Planning from Pixels. NeurIPS 2022 [Paper] [TF Code]

2021

  • [DreamerV2] Mastering Atari with Discrete World Models. ICLR 2021 [Paper] [TF Code]

2020

  • [DreamerV1] Dream to Control: Learning Behaviors by Latent Imagination. ICLR 2020 [Paper] [TF Code] [Torch Code]
  • [Plan2Explore] Planning to Explore via Self-Supervised World Models. ICML 2020 [Paper] [TF Code] [Torch Code]

2018

  • World Models. NeurIPS 2018 Oral [Paper]