Uncertainty Estimation for Sound Source Localization with Deep Learning

This repository contains the Python implementation of the paper "Uncertainty Estimation for Sound Source Localization with Deep Learning".

Dataset

Source signals: LibriSpeech
Noise signals: Noise92X
The real-world dataset: LOCATA

The datasets listed above can be downloaded from this OneDrive link.

The data directory is structured as follows:

.
|---data
    |---LibriSpeech
        |---dev-clean
        |---test-clean
        |---train-clean-100
    |---NoiSig
    |---test
    |---train
    |---dev

Note: The data/ directory does not have to be inside your project; you can put it anywhere you like. Please remember to set the correct data path in config/tcrnn.yaml.
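Before setting the data path, it can help to confirm that the folder matches the tree above. The following is a minimal sketch, not part of this repository: the helper name missing_data_dirs is hypothetical, and the folder names are copied from the tree above.

```python
from pathlib import Path

# Sub-directories copied from the directory tree above, relative to the data root.
EXPECTED_DIRS = [
    "LibriSpeech/dev-clean",
    "LibriSpeech/test-clean",
    "LibriSpeech/train-clean-100",
    "NoiSig",
    "test",
    "train",
    "dev",
]


def missing_data_dirs(data_root):
    """Return the expected sub-directories that are absent under data_root.

    Hypothetical helper for illustration only; it is not provided by the repo.
    """
    root = Path(data_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

Calling missing_data_dirs with your data root returns an empty list when every listed folder is present, and the names of the missing folders otherwise.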

Get Started

Dependencies

We strongly recommend using VS Code and Docker for this project; it can save you a lot of time 😁! The related configuration is already included in .devcontainer. Details can be found in this Tutorial_for_Vscode&Dokcer.

The environment:

cuda: 11.8.0
cudnn: 8
python: 3.10
pytorch: 2.2.0
pytorch lightning: 2.2
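
If you are not using the dev container, the Python package versions listed above can be compared against your environment. This is a rough sketch under assumptions: the distribution names "torch" and "lightning" are guesses at how the packages are installed (PyTorch Lightning is also published as "pytorch-lightning"), and check_versions and version_matches are hypothetical helpers, not part of this repository.

```python
from importlib import metadata

# Versions taken from the list above. The distribution names are assumptions;
# change "lightning" to "pytorch-lightning" if that is what you installed.
EXPECTED = {"torch": "2.2.0", "lightning": "2.2"}


def version_matches(installed, wanted):
    """True if `installed` starts with the dotted components of `wanted`.

    A local build suffix such as "+cu118" is ignored.
    """
    core = installed.split("+")[0].split(".")
    want = wanted.split(".")
    return core[: len(want)] == want


def check_versions(expected, lookup=metadata.version):
    """Map each package name to (installed_version_or_None, matches)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = lookup(name)
        except metadata.PackageNotFoundError:
            installed = None
        ok = installed is not None and version_matches(installed, wanted)
        report[name] = (installed, ok)
    return report
```

Running check_versions(EXPECTED) reports, per package, the installed version (or None if it is not installed) and whether it agrees with the version listed above.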

Configurations

All related configuration files are stored in config/.

The data_simu.yaml is used to configure the data generation.
The tcrnn.yaml is used to configure the dataloader, model training & test.

You can change these values as needed.

Note: Do not forget to install gpuRIR and webrtcvad.

🚀 Quick Start

Inference

We provide a checkpoint to help you reproduce the results presented in the paper: ckpt download

Data Generation

Generate the training data:

python data_simu.py DATA_SIMU.TRAIN=True DATA_SIMU.TRAIN_NUM=10000

In the same way, you can generate the validation and test datasets by changing DATA_SIMU.TRAIN=True to DATA_SIMU.DEV=True or DATA_SIMU.TEST=True.
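
The three generation runs can also be scripted. The sketch below only assembles the command lines described above; build_command and generate are hypothetical helpers, and the only overrides used are the ones that appear in this README (DATA_SIMU.TRAIN, DATA_SIMU.DEV, DATA_SIMU.TEST and DATA_SIMU.TRAIN_NUM).

```python
import subprocess

# Override flags taken from this README, one per dataset split.
SPLIT_FLAGS = {
    "train": "DATA_SIMU.TRAIN=True",
    "dev": "DATA_SIMU.DEV=True",
    "test": "DATA_SIMU.TEST=True",
}


def build_command(split, extra_overrides=()):
    """Return the argument list for generating one split with data_simu.py."""
    if split not in SPLIT_FLAGS:
        raise ValueError(f"unknown split: {split!r}")
    return ["python", "data_simu.py", SPLIT_FLAGS[split], *extra_overrides]


def generate(splits=("train", "dev", "test"), dry_run=True):
    """Print each command; execute it as well when dry_run is False."""
    commands = []
    for split in splits:
        # TRAIN_NUM=10000 mirrors the training example above.
        extra = ("DATA_SIMU.TRAIN_NUM=10000",) if split == "train" else ()
        cmd = build_command(split, extra)
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
        commands.append(cmd)
    return commands
```

With dry_run=True (the default) nothing is executed, so the commands can be reviewed first; call generate(dry_run=False) from the project root to run them.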

Model Training

python main_crnn.py fit --config /workspaces/TCRNN/config/tcrnn.yaml

The --config argument should point to your config file.

Model Evaluation

Set ckpt_path in config/tcrnn.yaml to the path of the trained model weights.
Then test the model on multiple GPUs or a single GPU:

python main_crnn.py test --config /workspaces/TCRNN/config/tcrnn.yaml

To evaluate the model on a single GPU, change the value of devices from "0,1" to "0," in config/tcrnn.yaml.
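
The trailing comma in "0," matters: PyTorch Lightning reads a comma-separated devices string as a list of GPU indices, while a bare number is read as a device count. The function below is a simplified, hypothetical imitation of that rule for illustration; it is not Lightning's actual parser.

```python
def parse_devices(value):
    """Simplified illustration of how a devices string is interpreted.

    A string containing a comma becomes a list of GPU indices;
    a string without a comma becomes a device count.
    """
    value = value.strip()
    if "," in value:
        # Empty pieces (e.g. after a trailing comma) are skipped.
        return [int(part) for part in value.split(",") if part.strip()]
    return int(value)


print(parse_devices("0,1"))  # list of indices: GPUs 0 and 1
print(parse_devices("0,"))   # list of indices: GPU 0 only
print(parse_devices("2"))    # a count of devices, not GPU index 2
```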

🎓 Citation

If you find our work useful in your research, please consider citing:

@article{pi2025uncertainty,
  author={Pi, Rendong and Yu, Xiang},
  journal={IEEE Transactions on Instrumentation and Measurement}, 
  title={Uncertainty Estimation for Sound Source Localization With Deep Learning}, 
  year={2025},
  volume={74},
  number={},
  pages={1-12},
  doi={10.1109/TIM.2024.3522632}
}


@inproceedings{pi2024tssl,
  title={TSSL: Trusted Sound Source Localization},
  author={Pi, Rendong and Song, Yang and Li, Linfeng and Yu, Xiang and Cheng, Li},
  booktitle={INTER-NOISE and NOISE-CON Congress and Conference Proceedings},
  volume={270},
  number={11},
  pages={941--949},
  year={2024},
  organization={Institute of Noise Control Engineering}
}

Acknowledgements

This repository adapts and integrates code from the following works:

SRP-DNN
FN-SSL
Cross3D
