Autonomous driving paper index

SPSC: Sparse and Scalable Multi-Modal 3D Occupancy Prediction for Autonomous Driving

2026-03-14 · Proceedings of the AAAI Conference on Artificial Intelligence

autonomous driving · occupancy prediction · occupancy · lidar · nuscenes · prediction

One-line summary

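This paper presents SPSC, a framework that combines SParse representation and SCalable feature interaction to counter the cubic growth in data scale, limited sparsity, and insufficient feature interaction that affect high-resolution 3D occupancy prediction.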

Engineering notes

Without complex architectural designs, SPSC significantly outperforms SOTA approaches, achieving relative mIoU improvements of 12.0%, 11.0%, and 4.8% on the nuScenes-Occupancy dataset under the multi-modal, LiDAR, and camera settings, respectively.
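Because the reported gains are relative rather than absolute, it helps to see how the two relate. The sketch below is a minimal illustration; the function name `relative_gain` and the baseline and improved mIoU values are hypothetical and are not taken from the paper.

```python
def relative_gain(baseline_miou: float, new_miou: float) -> float:
    """Return the relative improvement of new_miou over baseline_miou, in percent."""
    if baseline_miou <= 0:
        raise ValueError("baseline mIoU must be positive")
    return (new_miou - baseline_miou) / baseline_miou * 100.0


# Hypothetical values for illustration only (NOT results from the paper):
# a baseline at 20.0 mIoU and a new model at 22.4 mIoU.
baseline = 20.0
improved = 22.4
print(f"absolute gain: {improved - baseline:.1f} mIoU points")
print(f"relative gain: {relative_gain(baseline, improved):.1f}%")
```

With these made-up numbers, a 2.4-point absolute gain corresponds to a 12% relative gain, so a relative figure should always be read alongside the baseline it is measured against.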

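Chinese explanation

Chinese explanation pending: this site prioritizes adding Chinese-language commentary for high-value papers on end-to-end autonomous driving, BEV perception, 3D object detection, trajectory prediction, path planning, LiDAR perception, and related topics.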

Original abstract

3D semantic occupancy prediction offers a nuanced representation of the surrounding environment, which is crucial for ensuring the safety of autonomous driving. However, fine-grained scene representations inevitably result in cubic growth in data scale, which imposes substantial demands on model architecture and computational complexity, especially in high-resolution scenarios. Existing approaches for handling high-resolution scenes typically obtain fine-grained features by grid sampling on low-resolution feature maps, resulting in limited sparsity and insufficient feature interaction. This paper presents a framework leveraging SParse representation and SCalable feature interaction to address the aforementioned challenges, called SPSC. Specifically, we maintain sparsity by progressively pruning unoccupied queries during the coarse-to-fine process, thereby reducing the scale of data that the model needs to handle. Subsequently, we introduce query serialization, which transforms queries into an ordered sequence while preserving their spatial structure. This enables fine-grained feature interaction while maintaining linear computational complexity and a larger receptive field. Without complex architectural designs, SPSC significantly outperforms SOTA approaches, achieving relative mIoU improvements of 12.0%, 11.0%, and 4.8% on the nuScenes-Occupancy dataset under the multi-modal, LiDAR, and camera settings, respectively.
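The two ideas in the abstract, pruning unoccupied queries during coarse-to-fine refinement and serializing the surviving queries into an ordered sequence, can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: the abstract does not say which ordering SPSC uses, so a Z-order (Morton) curve is assumed here as one common way to keep spatially close voxels close in a sequence. The `is_occupied` callback stands in for the model's learned occupancy prediction, and all function names are hypothetical.

```python
from typing import Callable, List, Tuple

Voxel = Tuple[int, int, int]


def morton_key(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the low `bits` bits of x, y, z into one Z-order key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key


def serialize(queries: List[Voxel]) -> List[Voxel]:
    """Order voxel queries along the Z-order curve (assumed ordering)."""
    return sorted(queries, key=lambda v: morton_key(*v))


def refine(queries: List[Voxel]) -> List[Voxel]:
    """Split every voxel into its 8 children at twice the resolution."""
    return [
        (2 * x + dx, 2 * y + dy, 2 * z + dz)
        for (x, y, z) in queries
        for dx in (0, 1)
        for dy in (0, 1)
        for dz in (0, 1)
    ]


def coarse_to_fine(
    coarse: List[Voxel],
    is_occupied: Callable[[Voxel, int], bool],
    levels: int,
) -> List[Voxel]:
    """Prune unoccupied queries at each level, refining only the survivors."""
    queries = [q for q in coarse if is_occupied(q, 0)]
    for level in range(1, levels):
        queries = [c for c in refine(queries) if is_occupied(c, level)]
    return serialize(queries)


# Toy scene: a single occupied voxel at (5, 2, 7) on the finest 8x8x8 grid.
LEVELS = 3
TARGET = (5, 2, 7)


def toy_occupancy(voxel: Voxel, level: int) -> bool:
    # Hypothetical stand-in for a learned occupancy head.
    shift = LEVELS - 1 - level
    return tuple(c >> shift for c in TARGET) == voxel


coarse_grid = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
print(coarse_to_fine(coarse_grid, toy_occupancy, LEVELS))
```

In this toy scene only one query survives each level, so the sketch evaluates 8 + 8 + 8 = 24 candidate voxels rather than all 512 cells of the dense 8x8x8 grid, which is the kind of saving progressive pruning is meant to give.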

Engineering value: 5.5
Research novelty: 8.0
Business relevance: 5.0
