Browse curated autonomous driving papers on end-to-end driving, BEV perception, 3D object detection, motion prediction, path planning, ADAS, Tesla FSD, Waymo, and self-driving foundation models.
2026-05-30
An autonomous driving research paper: Semantic-decoupled spatial partition guided point-supervised oriented object detection.
Engineering 5.0 · Research 7.0 · Business 5.0
2026-05-30
We introduce SkyShield, to the best of our knowledge the first front-view monocular semantic occupancy benchmark for urban UAV flight below 20 meters.
Engineering 5.0 · Research 7.0 · Business 5.0
2026-05-30
An autonomous driving research paper: Occlusion-aware multi-modal 3D object detection via multi-stage cross-modal fusion.
Engineering 5.0 · Research 7.0 · Business 5.0
2026-05-29
This paper proposes a coupled prediction-planning framework that deeply integrates intention-aware multi-agent prediction with gap-driven trajectory optimization.
Engineering 5.5 · Research 8.0 · Business 5.0
2026-05-29
Our approach extends two representative backbones: a radar-camera pipeline where radar substitutes for LiDAR, and a LiDAR-radar pipeline where radar complements LiDAR.
Engineering 5.0 · Research 7.0 · Business 5.0
2026-05-29
Robotic systems generate large volumes of multimodal sensor data, but converting ROS bag recordings into machine learning datasets is often handled by ad hoc sequential scripts, creating engineering overhead and slow iteration cycles.
Engineering 7.0 · Research 7.0 · Business 5.0
2026-05-29
We propose IDOL, an inverse-dynamics-guided future prediction framework for world-model-based end-to-end planning in latent BEV space, where inverse dynamics serves as the key bridge between future prediction and trajectory optimization.
Engineering 5.5 · Research 8.0 · Business 5.0
2026-05-29
We introduce TouchSafeBench, a physics-grounded benchmark for evaluating collision grounding in vision-language models (VLMs).
Engineering 5.5 · Research 7.0 · Business 6.0
2026-05-29
We introduce Neural Token Reconstruction (NTR), a representation learning framework that directly constrains the compact scene-token bottleneck in perception-free driving.
Engineering 6.0 · Research 8.0 · Business 6.0
2026-05-29
We present Grace-BEV, a lightweight, plug-and-play framework that enforces active reliability awareness during multi-modal fusion.
Engineering 5.5 · Research 7.0 · Business 5.0
2026-05-29
In this work, we introduce a structured multi-level visual perturbation framework to systematically analyze visual-behavior dependency in VLA-based driving models.
Engineering 5.0 · Research 7.0 · Business 5.0
2026-05-28
World models, internal simulators that learn the structure and dynamics of an environment, have emerged as a central paradigm in the pursuit of artificial general intelligence, enabling agents to predict, plan, and reason within learned representations.
Engineering 5.5 · Research 7.0 · Business 6.0
2026-05-28
Autonomous driving systems are commonly trained and evaluated within limited geographic regions, which hinders their scalability when deployed in new cities.
Engineering 5.5 · Research 7.0 · Business 6.0
2026-05-28
We propose ReasonLight, a reinforcement learning (RL) framework enhanced by multimodal foundation models for zero-shot traffic signal control (TSC).
Engineering 5.0 · Research 7.5 · Business 5.0
2026-05-28
We propose ACF4D, a novel temporal fusion framework designed for multi-view 3D object detection.
Engineering 5.5 · Research 8.0 · Business 5.0
2026-05-27
We propose a pose-aware BEV feature refinement method for post-fusion BEV representations.
Engineering 5.5 · Research 7.0 · Business 6.5
2026-05-27
We present DriveWAM, a driving world-action model that adapts a pretrained video diffusion transformer into an autoregressive video-action policy.
Engineering 5.5 · Research 7.5 · Business 5.0
2026-05-27
We present DRIFT, a spatiotemporal risk field governed by an advection-diffusion-reaction partial differential equation (PDE), with an optional telegrapher term.
Engineering 5.5 · Research 7.0 · Business 5.5
2026-05-27
To model vehicle-type-specific pedestrian crash avoidance behavior, we develop a Smooth-Mamba Deep Deterministic Policy Gradient framework, termed SMamba-DDPG, which integrates smooth action constraints with efficient temporal representation learning.
Engineering 5.5 · Research 8.0 · Business 6.5
2026-05-26
We propose SDEF-BEV, a novel spatial-aware dual-expert fusion network.
Engineering 5.5 · Research 8.0 · Business 5.0