In self-navigation problems for autonomous vehicles, the variability of environmental conditions, complex scenes with vehicles and pedestrians, and the high-dimensional or real-time nature of the tasks make semantic segmentation challenging. Sensor fusion can significantly improve performance. Thus, this work highlights a late-fusion concept for semantic segmentation tasks in such perception systems. It is based on two approaches for merging the information coming from two neural networks, one trained on camera data and one on LiDAR frames. The first approach fuses the probabilities while calculating partial conflicts and redistributing them. The second technique makes individual decisions per source and fuses them later, weighted using Shannon entropies. The two segmentation models are trained and evaluated on a particular KITTI semantic dataset. For the multi-class segmentation task, the two fusion techniques are compared and evaluated with illustrative examples. The intersection-over-union metric and the quality of decision are computed to assess the performance of each methodology.
This repository presents approaches for fusing two identical segmentation models. Both are convolutional neural networks inspired by a cross-fusion model. One instance of the architecture is trained on camera images, and the same architecture is used to learn features from dense-map LiDAR data.
The code will be uploaded once the work is recognized as representative and the writing advances in the publication procedure.
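In the meantime, a minimal, hypothetical sketch of the late-fusion setup can look as follows (PyTorch assumed; `TinySegNet`, its layers, and the input shapes are placeholders only, not the actual cross-fusion-inspired architecture of this work). It only illustrates the idea of two identical networks, one per modality, producing per-pixel class probabilities that are fused afterwards:

```python
# Hypothetical placeholder, not the repository's model: two identical networks,
# one for camera images and one for dense-map LiDAR frames, each producing
# per-pixel class probabilities that a late-fusion step combines afterwards.
import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    def __init__(self, in_channels: int, num_classes: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, num_classes, kernel_size=1),
        )

    def forward(self, x):
        # Softmax over the class dimension gives Bayesian masses per pixel.
        return torch.softmax(self.body(x), dim=1)

camera_net = TinySegNet(in_channels=3, num_classes=3)  # RGB camera images
lidar_net = TinySegNet(in_channels=3, num_classes=3)   # dense-map LiDAR frames (assumed 3-channel so both nets are identical)

cam_probs = camera_net(torch.rand(1, 3, 64, 64))    # shape (1, num_classes, H, W)
lidar_probs = lidar_net(torch.rand(1, 3, 64, 64))   # fused pixel-wise in a later step
```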
The second approach works by making decisions based on the Bayesian outputs of the architectures and then using entropies to check how consistent the information is. Suppose that, for the camera model, a pixel (i, j) has the following mass values for each class:
m1(R) = 0.80, m1(V) = 0.15, m1(B) = 0.05
In this situation, deciding that pixel (i, j) belongs to class R (from the camera model) is relevant, but not 100% certain because m1(R) < 1. Similarly, for a LiDAR frame, suppose the same pixel has the mass values:
m2(R) = 0.55, m2(V) = 0.25, m2(B) = 0.20
The decision is the same, pixel (i, j) = R, which again is relevant, but the decision is riskier because m2(R) is only just above 0.5. Instead of fusing the probabilities directly, another way is to fuse the decisions weighted by their quality, calculated from entropy.
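As a quick illustration (NumPy assumed; the class order R, V, B and the array values are taken from the example above), both Bayesian outputs lead to the same hard decision, even though the LiDAR one is less confident:

```python
import numpy as np

classes = ["R", "V", "B"]                 # frame of discernment for this example
m1 = np.array([0.80, 0.15, 0.05])         # camera masses for pixel (i, j)
m2 = np.array([0.55, 0.25, 0.20])         # LiDAR masses for the same pixel

print(classes[int(np.argmax(m1))])        # R, with a clear margin
print(classes[int(np.argmax(m2))])        # R, but m2(R) is only just above 0.5
```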
In the previous example, based on m1, the initial decision of the camera segmentation model is the (R) road class:
md1(R) = 1, md1(V) = 0, md1(B) = 0
This decision is then weighted accordingly. The weight of source 1 for this pixel is calculated from the quality measure as:
w1 = 1 - H(m1)/Hmax
where H(m1) is the Shannon entropy of m1, since m1 is Bayesian (a probability mass function), and Hmax is the maximum Shannon entropy, obtained for a uniform probability mass function. (In a more general, non-probabilistic context with non-Bayesian BBAs, the generalized entropy for belief functions defined in DezertEntropy could be used instead.)
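A small sketch of this quality weight is given below (NumPy assumed; the function name quality_weight is illustrative only). Note that the logarithm base cancels in H(m)/Hmax, so any base yields the same weight:

```python
import numpy as np

def quality_weight(m: np.ndarray, eps: float = 1e-12) -> float:
    """w = 1 - H(m)/Hmax for a Bayesian mass function m over the FoD."""
    m = np.asarray(m, dtype=float)
    h = -np.sum(m * np.log2(m + eps))      # Shannon entropy (bits)
    h_max = np.log2(len(m))                # entropy of the uniform distribution
    return float(1.0 - h / h_max)

w1 = quality_weight([0.80, 0.15, 0.05])    # camera source, roughly 0.44
w2 = quality_weight([0.55, 0.25, 0.20])    # LiDAR source, roughly 0.09
```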
Based on m2, the (R) class is decided as well. Therefore, the decision based on the LiDAR data is:
md2(R) = 1, md2(V) = 0, md2(B) = 0
with the weight of source 2 (LiDAR) provided by the quality:
w2 = 1 - H(m2)/Hmax
The decisions are fused by a simple weighted averaging rule as follows:
md(R) = (w1 / (w1 + w2)) * md1(R) + (w2 / (w1 + w2)) * md2(R)
md(V) = (w1 / (w1 + w2)) * md1(V) + (w2 / (w1 + w2)) * md2(V)
md(B) = (w1 / (w1 + w2)) * md1(B) + (w2 / (w1 + w2)) * md2(B)
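Putting the steps together, a minimal sketch of the weighted-decision fusion for this pixel could be (NumPy assumed; quality_weight and hard_decision are illustrative helpers, not code from this repository):

```python
import numpy as np

def quality_weight(m, eps=1e-12):
    # w = 1 - H(m)/Hmax with Shannon entropy H and Hmax = log(|Theta|).
    m = np.asarray(m, dtype=float)
    return float(1.0 - (-np.sum(m * np.log2(m + eps))) / np.log2(len(m)))

def hard_decision(m):
    # One-hot decision vector md obtained from the Bayesian masses (argmax).
    d = np.zeros(len(m))
    d[int(np.argmax(m))] = 1.0
    return d

m1 = np.array([0.80, 0.15, 0.05])   # camera
m2 = np.array([0.55, 0.25, 0.20])   # LiDAR

md1, md2 = hard_decision(m1), hard_decision(m2)
w1, w2 = quality_weight(m1), quality_weight(m2)   # ~0.44 and ~0.09

md = (w1 * md1 + w2 * md2) / (w1 + w2)            # weighted averaging rule
print(md)   # [1. 0. 0.] -> fused decision: class R
```

With these example masses, both hard decisions already agree on class R, so the fused decision remains R; the weighting becomes decisive when the two sources pick different classes.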
In this simple example, |Theta| = 3 since the frame of discernment (FoD) contains only three singletons. Here, w1 is greater than w2 because the entropy H(m1) is lower than H(m2). Consequently, the camera source shows greater confidence.