Balancing cost and performance is crucial when choosing between high- and low-resolution point-cloud roadside sensors. Inspired by human multi-sensory integration, we propose a modular framework to assess whether reductions in spatial resolution can be compensated by informational richness when detecting traffic participants. Extensive experiments on the proposed framework show that fusing velocity-encoded radar with low-resolution LiDAR yields marked gains (14 AP for pedestrians and an overall improvement of 1.5 mAP across six categories) at lower cost than high-resolution LiDAR alone.
Additionally, roadside sensor placement locations, tilt angles, types, and configurations influence point-cloud density and distribution across the coverage area. We developed a simulation tool, built on integer programming, that automatically iterates over and compares multimodal sensor placement strategies against sensing coverage and cost jointly.
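As a rough, illustrative sketch of the underlying idea (not our actual tool), the snippet below casts the coverage-versus-cost trade-off as a small integer program using PuLP. The candidate sensor poses, coverage matrix, costs, and budget are made-up placeholders.

```python
# Minimal integer-programming sketch of multimodal sensor placement.
# Illustrative only: candidate poses, coverage matrix, costs and budget
# are placeholders, not the values used by our simulation tool.
import pulp

candidates = ["lidar16_poleA", "lidar32_poleA", "radar_poleB"]   # hypothetical sensor/pose options
cost = {"lidar16_poleA": 2.0, "lidar32_poleA": 4.0, "radar_poleB": 1.0}
cells = range(5)                                                 # discretized coverage grid cells
# covers[s][c] = 1 if candidate s observes cell c with sufficient point density
covers = {"lidar16_poleA": [1, 1, 0, 0, 0],
          "lidar32_poleA": [1, 1, 1, 1, 0],
          "radar_poleB":   [0, 0, 1, 1, 1]}
budget = 5.0

prob = pulp.LpProblem("sensor_placement", pulp.LpMaximize)
x = pulp.LpVariable.dicts("use", candidates, cat="Binary")       # 1 if sensor pose is deployed
y = pulp.LpVariable.dicts("covered", cells, cat="Binary")        # 1 if grid cell is covered

prob += pulp.lpSum(y[c] for c in cells)                          # maximize covered cells
prob += pulp.lpSum(cost[s] * x[s] for s in candidates) <= budget # stay within the cost budget
for c in cells:                                                  # a cell counts only if a chosen sensor sees it
    prob += y[c] <= pulp.lpSum(covers[s][c] * x[s] for s in candidates)

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print([s for s in candidates if x[s].value() == 1])
```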
We compared two LiDAR-only algorithms, PointPillars and CenterPoint, and four multimodal perception algorithms based on LiDAR-radar fusion: PointPillars_LR, L4DR, Bi-LRFusion, and LiRaFusion.
Model | Modality | Resolution | mAP | Car | Truck | Bus | G.C. | Ped. | Motor. |
---|---|---|---|---|---|---|---|---|---|
PointPillars | L | 2 × 16-beam | 89.1 | 98.9 | 91.3 | 58.9 | 98.8 | 89.1 | 97.4 |
PointPillars | L | 1 × 32-beam | 86.1 | 98.1 | 96.0 | 58.2 | 98.2 | 72.6 | 93.5 |
PointPillars | L | 1 × 64-beam | 91.5 | 98.5 | 90.2 | 91.8 | 98.9 | 75.4 | 94.1 |
CenterPoint | L | 2 × 16-beam | 94.6 | 97.8 | 92.2 | 97.8 | 95.5 | 88.9 | 95.6 |
CenterPoint | L | 1 × 32-beam | 91.6 | 96.7 | 93.3 | 97.8 | 95.5 | 74.4 | 92.2 |
CenterPoint | L | 1 × 64-beam | 92.5 | 96.7 | 93.3 | 97.8 | 95.8 | 78.9 | 92.2 |
PointPillars_LR | L+R | 2 × 16-beam | 98.1 | 98.9 | 98.8 | 100.0 | 98.9 | 94.3 | 97.8 |
PointPillars_LR | L+R | 1 × 32-beam | 95.1 | 97.8 | 99.8 | 100.0 | 99.9 | 78.6 | 94.3 |
PointPillars_LR | L+R | 1 × 64-beam | 95.9 | 97.8 | 99.9 | 100.0 | 98.9 | 84.1 | 94.4 |
L4DR | L+R | 2 × 16-beam | 96.3 | 98.9 | 94.8 | 99.0 | 98.9 | 88.4 | 97.7 |
L4DR | L+R | 1 × 32-beam | 93.2 | 98.7 | 97.3 | 99.0 | 99.1 | 71.0 | 94.4 |
L4DR | L+R | 1 × 64-beam | 93.8 | 98.8 | 97.6 | 99.0 | 99.0 | 73.7 | 94.4 |
Bi-LRFusion | L+R | 2 × 16-beam | 98.3 | 98.9 | 99.8 | 100.0 | 98.9 | 94.4 | 97.8 |
Bi-LRFusion | L+R | 1 × 32-beam | 95.2 | 97.8 | 99.9 | 100.0 | 100.0 | 80.0 | 93.3 |
Bi-LRFusion | L+R | 1 × 64-beam | 95.9 | 97.8 | 98.9 | 100.0 | 100.0 | 84.4 | 94.4 |
LiRaFusion | L+R | 2 × 16-beam | 97.4 | 98.3 | 97.6 | 98.9 | 98.6 | 94.4 | 96.8 |
LiRaFusion | L+R | 1 × 32-beam | 94.5 | 97.2 | 97.6 | 98.9 | 98.5 | 80.5 | 93.9 |
LiRaFusion | L+R | 1 × 64-beam | 95.6 | 97.9 | 98.5 | 98.9 | 98.9 | 85.4 | 94.1 |
Without a standardized placement procedure and published sensor poses, mAP and other metrics are measured in incomparable setups and cannot reveal true model quality. Emphasizing roadside placement algorithms gives the research community a unified, repeatable geometric benchmark: researchers can evaluate LiDAR-only, radar-only, and multimodal-fusion methods on the same coverage-versus-cost baseline, ensuring that reported gains reflect algorithmic improvements rather than differences in sensor setup.
Please refer to mmdetection3d for installation.
You can download our multi-modal dataset here and unzip all the zip files. We provide offline-processed annotation files; the prefix of the .pkl files is "nuscenes_mini_infos_train(val/test)".
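To sanity-check the download, the annotation files can be inspected directly with pickle. The path below is an example location, and the exact schema depends on the mmdetection3d version used to generate the files.

```python
# Quick inspection of a provided annotation file (path is an example;
# point it at wherever you unzipped the dataset).
import pickle

with open("data/nuscenes_mini_infos_train.pkl", "rb") as f:
    infos = pickle.load(f)

# Recent mmdetection3d releases store a dict with 'metainfo' and 'data_list';
# older ones use other layouts, so print the top-level structure first.
if isinstance(infos, dict):
    print(infos.keys())
else:
    print(type(infos), len(infos))
```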
To our knowledge, this is the first large-scale multi-resolution, multi-modal benchmark for roadside perception, allowing fair comparison of multi-modal detection and fusion methods. The data are organised in the same way as nuScenes, allowing immediate use with existing pipelines. An overview of the registered 3D box labels and their properties is given in the following table. Notably, golf carts are common in the real world yet hardly considered by any other perception dataset; we imported them into the CARLA simulation through 3D modeling and collected corresponding data for training and testing.
Class | Labels | Avg Length (m) | Avg Width (m) | Avg Height (m) |
---|---|---|---|---|
Car | 39,169 | 4.56 | 2.06 | 1.56 |
Truck | 3,076 | 8.89 | 3.24 | 3.85 |
Motorcycle | 32,969 | 2.19 | 0.81 | 1.57 |
Bus | 6,032 | 5.56 | 2.31 | 2.52 |
Pedestrian | 51,689 | 0.50 | 0.50 | 1.73 |
Golf Cart | 2,398 | 3.73 | 1.61 | 1.96 |
Total | 135,333 | –– | –– | –– |
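For orientation, these six categories correspond to a class list in the mmdetection3d-style configs. The spelling below is an assumption; check the provided config files for the exact strings expected by the dataset.

```python
# Illustrative six-class setup for an mmdetection3d-style config.
# The exact class strings must match those used in the provided configs
# and annotation .pkl files; the ones below are an assumption.
class_names = ['car', 'truck', 'bus', 'golf_cart', 'pedestrian', 'motorcycle']
metainfo = dict(classes=class_names)

# The dataset section of a config would then reference this metainfo, e.g.:
# train_dataloader = dict(dataset=dict(metainfo=metainfo, ...))
```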
To train the model, use the following example command:
# single-gpu training
python ./tools/train.py <root_path>/configs/L4DR_carla.py
The supported model configs are listed below; please change `data_root` in the config file to the directory where you saved the data on your machine (see the example after the list).

- bi_lrfusion_carla.py
- L4DR_carla.py
- lrfusion_carla.py
- yihong_carla.py
- pointpillars_hv_secfpn_1xb6_nus-3d-6class.py
- centerpoint_voxel0075_second_secfpn_1xb4-cyclic-20e_nus-6class.py
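As an example of the `data_root` change mentioned above (the path below is a placeholder):

```python
# Near the top of the chosen config (e.g. L4DR_carla.py), point data_root
# at the directory where you unzipped the dataset; this path is a placeholder.
data_root = '/data/roadside_multimodal/'
# The annotation files referenced by the config follow the prefix noted above,
# e.g. data_root + 'nuscenes_mini_infos_train.pkl'
```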
# single-gpu testing
python ./tools/test.py <CONFIG_FILE> <CHECKPOINT_FILE>
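Beyond the test script, mmdetection3d also exposes a Python API for quick single-frame checks. The sketch below is an assumption-laden example: the config, checkpoint, and point-cloud paths are placeholders, the LiDAR-only configs are the easiest to run this way (the fusion models need the dataset pipeline to supply radar as well), and the return type varies slightly across mmdetection3d versions.

```python
# Minimal single-frame inference sketch via mmdetection3d's Python API
# (config, checkpoint and point-cloud paths are placeholders).
from mmdet3d.apis import init_model, inference_detector

model = init_model('configs/pointpillars_hv_secfpn_1xb6_nus-3d-6class.py',
                   'work_dirs/pointpillars/latest.pth',        # hypothetical checkpoint path
                   device='cuda:0')
output = inference_detector(model, 'demo/sample_frame.bin')    # point cloud in .bin format
print(output)                                                  # predictions (format depends on mmdet3d version)
```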
Our repo uses code from a few open-source repositories. Without the efforts of these folks (and their willingness to release their implementations), our repo would not be possible. We thank these authors for their efforts!