OmniNAV: Omniscient Navigation via Unified LiDAR–Camera BEV Fusion for End-to-End Autonomous Driving
Accepted at IEEE IV 2026
Engineering 5.5 · Research 7.0 · Business 5.0
End-to-end driving models that learn a direct mapping from sensor inputs to control commands, including transformer-based, imitation-learning, and reinforcement-learning approaches.
Autonomous vehicles (AVs) are considerably less reliable in unstructured environments with deteriorated road surfaces, particularly potholes.
Engineering 5.5 · Research 7.0 · Business 5.0
Level-4+ autonomous driving systems (ADS) must run dozens of heterogeneous deep neural networks (DNNs) as end-to-end (E2E) pipelines under a strict latency constraint (<=100 ms), even as execution time varies by up to 3.3x.
Engineering 5.5 · Research 7.0 · Business 5.0
While end-to-end (E2E) autonomous driving has become the dominant research direction, production vehicles will continue to rely on modular multi-NN pipelines for a non-trivial transitional period.
Engineering 5.5 · Research 7.0 · Business 6.0
An autonomous driving research paper: This is The Way: Vision-Based End-to-End Planning for Autonomous Driving.
Engineering 5.5 · Research 7.0 · Business 5.0
We propose PLAN-S (PLANning with latent Style dynamics), a planner-facing bridge that addresses this compactness-controllability dilemma by decoding a style-conditioned, four-channel semantic cost map from the latent representation.
Engineering 6.0 · Research 7.0 · Business 5.0
We introduce Discrete-WAM, a unified latent vision-action world policy that represents future visual states and ego actions as aligned discrete tokens, enabling compositional causal reasoning across alternative futures.
Engineering 5.5 · Research 7.0 · Business 5.0
To overcome this limitation, we present D³-MoE (Dual Disentangled Diffusion Mixture-of-Experts), which disentangles trajectory modeling along two complementary axes.
Engineering 5.5 · Research 8.0 · Business 5.0
This paper proposes an integrated end-to-end framework combining a Cross-Modal Attention Fusion (CMAF) module, a Kalman-Graph Neural Network (K-GNN) dynamic obstacle predictor, and a two-layer Proximal Policy Optimization path planning architecture.
Engineering 6.0 · Research 7.0 · Business 6.0
We present StandardE2E, a framework that provides a single unified interface over E2E driving datasets.
Engineering 7.5 · Research 7.0 · Business 6.0
Steering-angle prediction for robots in complex scenarios is crucial to intelligent autonomous navigation.
Engineering 5.5 · Research 7.0 · Business 5.0
To validate this, we introduce the Action Diffusion Transformer (ADT), an anchor-free diffusion transformer trained with an MSE objective that natively models the multimodal distribution of plausible driving actions.
Engineering 5.5 · Research 8.0 · Business 5.0
In this paper, we propose PillarDETR, a novel end-to-end 3D object detection architecture that combines the efficiency of pillar-based LiDAR encoding with the representational power of modern 2D vision models.
Engineering 6.0 · Research 8.0 · Business 5.0
To address this challenge, we propose a novel collaborative (CO-) interaction-aware (-IN) MARL framework, named COIN.
Engineering 7.5 · Research 8.0 · Business 5.5
Recently, autonomous driving (AD) has become a popular technological frontier, mainly driven by the integration of Deep Learning (DL).
Engineering 6.0 · Research 7.0 · Business 6.5
To address this limitation, we propose IDOL, an inverse-dynamics-guided future prediction framework for world-model-based end-to-end planning in latent BEV space, where inverse dynamics serves as the key bridge between future prediction and trajectory optimization.
Engineering 5.5 · Research 8.0 · Business 5.0
To address this limitation, we introduce Neural Token Reconstruction (NTR), a representation learning framework to directly constrain the compact scene-token bottleneck in perception-free driving.
Engineering 6.0 · Research 8.0 · Business 6.0
We present DriveWAM, a driving world-action model that adapts a pretrained video diffusion transformer into an autoregressive video-action policy.
Engineering 5.5 · Research 7.5 · Business 5.0
Inspired by error-correction notebooks used in learning practice, we design a novel multi-level replay buffer mechanism.
Engineering 6.0 · Research 8.0 · Business 5.5
In this work, we propose an end-to-end spiking encoder-decoder network for object detection in bird's-eye-view representations of LiDAR point clouds, trained using surrogate gradient backpropagation.
Engineering 6.0 · Research 8.0 · Business 6.5