
This repo is for the paper "Uncertainty Estimation for Sound Source Localization".


Devin-Pi/uncertainty-estimation-for-ssl


Uncertainty Estimation for Sound Source Localization with Deep Learning

This repository contains the Python implementation of the paper "Uncertainty Estimation for Sound Source Localization with Deep Learning".

[Figure: architecture]

Dataset

The datasets can be downloaded from this OneDrive link.

The data directory is organized as follows:

.
|---data
    |---LibriSpeech
        |---dev-clean
        |---test-clean
        |---train-clean-100
    |---NoiSig
    |---test
    |---train
    |---dev

Note: The data/ directory does not have to be inside your project; you can put it wherever you like. Please remember to set the correct data path in config/tcrnn.yaml.
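Before generating data or training, it can help to confirm that the data root matches the tree above. The following is a minimal sketch, not part of the repository: the helper name `missing_data_dirs` and the placeholder root `"data"` are assumptions for illustration, and the list of sub-directories is copied from the tree shown above.

```python
from pathlib import Path

# Sub-directories listed in the tree above, relative to the data root.
EXPECTED_SUBDIRS = [
    "LibriSpeech/dev-clean",
    "LibriSpeech/test-clean",
    "LibriSpeech/train-clean-100",
    "NoiSig",
    "test",
    "train",
    "dev",
]


def missing_data_dirs(data_root):
    """Return the expected sub-directories that do not exist under data_root."""
    root = Path(data_root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


if __name__ == "__main__":
    # "data" is a placeholder; use the path you set in config/tcrnn.yaml.
    for sub in missing_data_dirs("data"):
        print(f"missing: {sub}")
```

An empty result means every directory from the tree is present under the chosen root.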

Get Started

Dependencies

We strongly recommend using VSCode and Docker for this project; it can save you a lot of time😁! The related configuration is already provided in .devcontainer. Detailed information can be found in this Tutorial_for_Vscode&Dokcer.

The environment:

  • cuda: 11.8.0
  • cudnn: 8
  • python: 3.10
  • pytorch: 2.2.0
  • pytorch lightning: 2.2
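If you are not using the dev container, a quick check of the installed Python packages against the versions above may save some debugging. This is a rough sketch under stated assumptions: the distribution names (`torch`, and either `lightning` or `pytorch-lightning`) are guesses at how the packages are installed, and the helpers `installed_version` and `version_matches` are hypothetical names, not part of this repository.

```python
from importlib import metadata

# Versions listed above. The distribution names are assumptions:
# PyTorch Lightning may be installed as "lightning" or "pytorch-lightning".
EXPECTED = {
    "torch": "2.2.0",
    "lightning": "2.2",
    "pytorch-lightning": "2.2",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_matches(installed, expected):
    """True if the installed version starts with the expected components."""
    if installed is None:
        return False
    # Drop local build tags such as "+cu118" before comparing.
    parts = installed.split("+")[0].split(".")
    wanted = expected.split(".")
    return parts[: len(wanted)] == wanted


if __name__ == "__main__":
    for name, expected in EXPECTED.items():
        found = installed_version(name)
        status = "ok" if version_matches(found, expected) else "check"
        print(f"{name}: expected {expected}, found {found} [{status}]")
```

Only one of the two Lightning distribution names is expected to be present, so a single "check" line for the other one is normal.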

Configurations

The related configurations are all saved in config/.

  • data_simu.yaml configures the data generation.
  • tcrnn.yaml configures the dataloader, model training, and testing.

You can change these values to suit your needs.

Note: Do not forget to install gpuRIR and webrtcvad.

🚀 Quick Start

  • Inference

We provide a checkpoint to help you reproduce the results reported in the paper: ckpt download

  • Data Generation

Generate the training data:

python data_simu.py DATA_SIMU.TRAIN=True DATA_SIMU.TRAIN_NUM=10000

In the same way, you can generate the validation and test datasets by replacing DATA_SIMU.TRAIN=True with DATA_SIMU.DEV=True or DATA_SIMU.TEST=True.
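To avoid retyping the command for each split, the three invocations can be assembled in a small script. This is only a sketch: `data_simu_command` is a hypothetical helper, and it uses only the overrides that appear above (`DATA_SIMU.TRAIN`, `DATA_SIMU.DEV`, `DATA_SIMU.TEST`, and `DATA_SIMU.TRAIN_NUM`). Sample counts for the dev and test splits are left to data_simu.yaml, since this README does not name overrides for them.

```python
import shlex


def data_simu_command(split, extra_overrides=()):
    """Build the data_simu.py command line for one split (TRAIN, DEV, or TEST)."""
    split = split.upper()
    if split not in {"TRAIN", "DEV", "TEST"}:
        raise ValueError(f"unknown split: {split}")
    return ["python", "data_simu.py", f"DATA_SIMU.{split}=True", *extra_overrides]


if __name__ == "__main__":
    commands = [
        data_simu_command("train", ["DATA_SIMU.TRAIN_NUM=10000"]),
        data_simu_command("dev"),
        data_simu_command("test"),
    ]
    for cmd in commands:
        # Print each command; pass it to subprocess.run(cmd, check=True)
        # from the repository root to actually execute it.
        print(shlex.join(cmd))
```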

  • Model Training

python main_crnn.py fit --config /workspaces/TCRNN/config/tcrnn.yaml

The --config argument should point to your own config file; the path above is only an example.

  • Model Evaluation
  1. Set ckpt_path in config/tcrnn.yaml to the path of the trained model weights.
  2. Test the model on multiple GPUs or on a single GPU:

python main_crnn.py test --config /workspaces/TCRNN/config/tcrnn.yaml

To evaluate on a single GPU, change the value of devices from "0,1" to "0," in config/tcrnn.yaml.
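The trailing comma in "0," is significant. As far as we understand PyTorch Lightning's convention, a devices string that contains a comma is read as a list of GPU indices, while a string without a comma is read as a number of devices. The sketch below is a simplified illustration of that rule, not Lightning's own code; `parse_devices` is a hypothetical name.

```python
def parse_devices(value):
    """Interpret a devices setting in the way described above (simplified)."""
    if isinstance(value, int):
        return value  # already a device count
    text = value.strip()
    if "," in text:
        # Comma present: treat the entries as explicit GPU indices.
        return [int(part) for part in text.split(",") if part.strip()]
    # No comma: treat the value as a number of devices.
    return int(text)


if __name__ == "__main__":
    for setting in ("0,1", "0,", "2"):
        print(repr(setting), "->", parse_devices(setting))
```

Under this reading, "0,1" selects GPUs 0 and 1, and "0," selects GPU 0 only.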

🎓 Citation

If you find our work useful in your research, please consider citing:

@article{pi2025uncertainty,
  author={Pi, Rendong and Yu, Xiang},
  journal={IEEE Transactions on Instrumentation and Measurement}, 
  title={Uncertainty Estimation for Sound Source Localization With Deep Learning}, 
  year={2025},
  volume={74},
  number={},
  pages={1-12},
  doi={10.1109/TIM.2024.3522632}
}


@inproceedings{pi2024tssl,
  title={TSSL: Trusted Sound Source Localization},
  author={Pi, Rendong and Song, Yang and Li, Linfeng and Yu, Xiang and Cheng, Li},
  booktitle={INTER-NOISE and NOISE-CON Congress and Conference Proceedings},
  volume={270},
  number={11},
  pages={941--949},
  year={2024},
  organization={Institute of Noise Control Engineering}
}

Acknowledgements

This repository adapts and integrates code from several excellent prior works.
