Uncertainty Estimation for Sound Source Localization with Deep Learning

This repository contains the Python implementation of the paper "Uncertainty Estimation for Sound Source Localization with Deep Learning".

[Figure: architecture]

Dataset

The datasets used in this project can be downloaded from this OneDrive link.

The data directory is structured as follows:

.
|---data
    |---LibriSpeech
        |---dev-clean
        |---test-clean
        |---train-clean-100
    |---NoiSig
    |---test
    |---train
    |---dev

Note: The data/ directory does not have to be inside your project; you can put it anywhere you like. Please remember to set the correct data path in config/tcrnn.yaml.
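Before setting the path in config/tcrnn.yaml, it can help to confirm that the directory matches the layout above. The following is a minimal sketch, not part of this repository: the helper name `missing_data_dirs` is hypothetical, and the list of expected sub-directories is simply copied from the tree shown above.

```python
from pathlib import Path

# Sub-directories expected under data/, copied from the tree shown above.
EXPECTED_DIRS = [
    "LibriSpeech/dev-clean",
    "LibriSpeech/test-clean",
    "LibriSpeech/train-clean-100",
    "NoiSig",
    "test",
    "train",
    "dev",
]


def missing_data_dirs(data_root):
    """Return the expected sub-directories that do not exist under data_root.

    Hypothetical helper for illustration only; it checks directory names,
    not their contents.
    """
    root = Path(data_root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]
```

Calling `missing_data_dirs("/path/to/data")` returns an empty list when every expected directory is present; otherwise it lists what is still missing.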

Get Started

Dependencies

We strongly recommend using VSCode and Docker for this project; it can save you a lot of time😁! The related configuration is already included in .devcontainer. Detailed information can be found in this Tutorial_for_Vscode&Dokcer.

The environment:

  • cuda: 11.8.0
  • cudnn: 8
  • python: 3.10
  • pytorch: 2.2.0
  • pytorch lightning: 2.2

Configurations

The related configurations are all saved in config/.

  • data_simu.yaml configures the data generation.
  • tcrnn.yaml configures the dataloader, model training, and testing.

You can change these values to suit your needs.

Note: Do not forget to install gpuRIR and webrtcvad.
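A quick way to confirm that these two packages are importable before running anything is sketched below. The helper `missing_modules` is a hypothetical name introduced here for illustration; it only checks that a module can be found, not that it works with your GPU or driver.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment.

    Hypothetical helper for illustration; it does not import the modules,
    so it will not detect a broken installation.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example: report which of the two extra dependencies still need installing.
print(missing_modules(["gpuRIR", "webrtcvad"]))
```

An empty list means both modules were found.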

🚀 Quick Start

  • Inference: We provide a checkpoint to help you reproduce the results reported in the paper. ckpt download

  • Data Generation

Generate the training data:

python data_simu.py DATA_SIMU.TRAIN=True DATA_SIMU.TRAIN_NUM=10000

In the same way, you can generate the validation and test datasets by replacing DATA_SIMU.TRAIN=True with DATA_SIMU.DEV=True or DATA_SIMU.TEST=True.
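If you generate all three splits regularly, the commands can be assembled programmatically. This is a minimal sketch under stated assumptions: the function `simu_command` is hypothetical, it only builds the argument list (it does not run anything), and the only overrides it knows about are the ones shown above. Any further override, such as a sample count for the other splits, would have to be checked against config/data_simu.yaml.

```python
def simu_command(split, extra_overrides=()):
    """Build the data_simu.py command line for one split.

    split: "train", "dev", or "test" (case-insensitive).
    extra_overrides: additional KEY=VALUE strings appended unchanged,
    e.g. "DATA_SIMU.TRAIN_NUM=10000" as in the example above.
    Hypothetical helper for illustration only.
    """
    name = split.upper()
    if name not in {"TRAIN", "DEV", "TEST"}:
        raise ValueError(f"unknown split: {split!r}")
    return ["python", "data_simu.py", f"DATA_SIMU.{name}=True", *extra_overrides]


# Print the commands instead of executing them, so they can be reviewed first.
print(" ".join(simu_command("train", ["DATA_SIMU.TRAIN_NUM=10000"])))
print(" ".join(simu_command("dev")))
print(" ".join(simu_command("test")))
```

The returned list can be passed to `subprocess.run` once you are happy with the printed commands.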

  • Model Training
python main_crnn.py fit --config /workspaces/TCRNN/config/tcrnn.yaml

The --config argument should point to the path of your config file.

  • Model Evaluation
  1. Set ckpt_path in config/tcrnn.yaml to the trained model weights.
  2. Test the model on multiple GPUs or a single GPU:
python main_crnn.py test --config /workspaces/TCRNN/config/tcrnn.yaml

To evaluate the model on a single GPU, change the value of devices from "0,1" to "0," in config/tcrnn.yaml.
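The trailing comma in "0," matters, assuming the devices value is passed through to the PyTorch Lightning Trainer: a string containing a comma is read as a list of GPU indices, while a string without one is read as a number of devices. The sketch below is a simplified illustration of that convention; `parse_devices` is a hypothetical function written for this README, not Lightning's actual API.

```python
def parse_devices(value):
    """Simplified illustration of how a devices string is commonly interpreted.

    "0,1" -> [0, 1]  (use GPUs with index 0 and 1)
    "0,"  -> [0]     (use only the GPU with index 0)
    "2"   -> 2       (use two devices, whichever are available)
    "-1"  -> -1      (conventionally: use all available devices)
    """
    value = value.strip()
    if value == "-1":
        return -1
    if "," in value:
        # Empty pieces (from a trailing comma) are skipped.
        return [int(part) for part in value.split(",") if part.strip()]
    return int(value)


print(parse_devices("0,1"))
print(parse_devices("0,"))
```

Under this reading, writing "0" without the comma would request zero devices rather than GPU 0, which is why the single-GPU setting is written as "0,".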

🎓 Citation

If you find our work useful in your research, please consider citing:

@article{pi2025uncertainty,
  author={Pi, Rendong and Yu, Xiang},
  journal={IEEE Transactions on Instrumentation and Measurement}, 
  title={Uncertainty Estimation for Sound Source Localization With Deep Learning}, 
  year={2025},
  volume={74},
  number={},
  pages={1-12},
  doi={10.1109/TIM.2024.3522632}
}


@inproceedings{pi2024tssl,
  title={TSSL: Trusted Sound Source Localization},
  author={Pi, Rendong and Song, Yang and Li, Linfeng and Yu, Xiang and Cheng, Li},
  booktitle={INTER-NOISE and NOISE-CON Congress and Conference Proceedings},
  volume={270},
  number={11},
  pages={941--949},
  year={2024},
  organization={Institute of Noise Control Engineering}
}

Acknowledgement

This repository adapts and integrates code from several wonderful works, listed as follows: