This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
ICASSP 2025
Welcome to the official repository for our ICASSP 2025 paper, "A2B: Neural Rendering of Ambisonic Recordings to Binaural." Here you'll find the implementation code, pre-trained models, and links to the A2B dataset discussed in the paper.
We are releasing over 33.5 hours of paired ambisonic-binaural recordings collected with 10th-, 4th-, and 2nd-order ambisonic microphones. We also provide the microphone array geometries, which are required for DSP-based baselines such as MagLS used in our paper.
- 10-A2B-R1: 635 GB (10.11 Hours)
- 10-A2B-R2: 648 GB (10.32 Hours)
- 4-A2B-MP: 72 GB (6.53 Hours)
- 2-A2B-MP: 32 GB (6.53 Hours)
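As a quick reference for working with these recordings, an N-th order ambisonic signal carries (N + 1)^2 channels (standard ACN ordering), so the 10th-, 4th-, and 2nd-order subsets carry 121, 25, and 9 input channels respectively. A small sketch using only the figures from the list above:

```python
# Channel count for an N-th order ambisonic recording: (N + 1)^2.
def ambisonic_channels(order: int) -> int:
    return (order + 1) ** 2

# Subsets as listed above: name -> (ambisonic order, hours).
subsets = {
    "10-A2B-R1": (10, 10.11),
    "10-A2B-R2": (10, 10.32),
    "4-A2B-MP": (4, 6.53),
    "2-A2B-MP": (2, 6.53),
}

for name, (order, hours) in subsets.items():
    print(name, ambisonic_channels(order), "channels,", hours, "hours")

total_hours = sum(hours for _, hours in subsets.values())
```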
The A2B Dataset is hosted on AWS S3. We recommend using the AWS command line interface (see AWS CLI installation instructions). No AWS credentials are needed.
To download the dataset run:
aws s3 cp --recursive --no-sign-request s3://fb-baas-f32eacb9-8abb-11eb-b2b8-4857dd089e15/A2B/ A2B/
or use sync to avoid re-transferring files that already exist locally:
aws s3 sync --no-sign-request s3://fb-baas-f32eacb9-8abb-11eb-b2b8-4857dd089e15/A2B/ A2B/
To download a specific subset of the dataset, for example "2-A2B-MP":
aws s3 cp --recursive --no-sign-request s3://fb-baas-f32eacb9-8abb-11eb-b2b8-4857dd089e15/A2B/A2B_Datasets/2-A2B-MP A2B/
We benchmarked our method on publicly available ambisonic-to-binaural datasets. Scripts to download each dataset are provided below:
For Urbansounds:
$ sh src/preprocessing/urbansounds/download.sh
For BTPAB:
$ sh src/preprocessing/bytedance/download.sh
The dataset-preparation script combines multiple recordings into a unified dataset suitable for training and validation. An example configuration can be found in "configs/data/debug.yaml".
Here is an example that uses the debug.yaml configuration. This step generates JSON configuration files that are then read by a PyTorch dataset loader.
$ python ./tools/prepare_dataset.py config_name="n2s_mk128_binaural" out_dir="exported_speakeasy_datasets/debug/"
Specify a configuration file via the config_name parameter.
$ config_name="a2b_model_bytedance_v10_1"
python ./tools/train.py config_name="models/${config_name}"
$ config_name="a2b_model_urbansounds_v2"
python ./tools/train.py config_name="models/${config_name}"
$ config_name="a2b_model_n2s_mk128_v1"
python ./tools/train.py config_name="models/${config_name}"
$ config_name="a2b_model_hearsay_mk128_v1"
python ./tools/train.py config_name="models/${config_name}"
$ config_name="n2s_mk128"
python inference/evaluations.py config_name=$config_name ckpt_path="pretrained_models/a2b_n2s/checkpoints/last.ckpt"
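Evaluation compares the model's binaural output against the recorded binaural reference. As a hedged sketch of one common signal-level metric (not necessarily the metrics reported in the paper), here is per-channel mean squared error between two stereo signals, written with plain Python lists for clarity:

```python
def channel_mse(pred: list[list[float]], ref: list[list[float]]) -> list[float]:
    """Mean squared error per channel between predicted and reference
    binaural signals, given as [channel][sample] lists of floats."""
    assert len(pred) == len(ref), "channel count mismatch"
    errors = []
    for p, r in zip(pred, ref):
        assert len(p) == len(r), "sample count mismatch"
        errors.append(sum((a - b) ** 2 for a, b in zip(p, r)) / len(p))
    return errors
```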
TODO
@inproceedings{gebru2025a2b,
title={A2B: Neural Rendering of Ambisonic Recordings to Binaural},
author={Gebru, Israel D and Keebler, Todd and Sandakly, Jake and Krenn, Steven and Markovi{\'c}, Dejan and Buffalini, Julia and Hassel, Samuel and Richard, Alexander},
booktitle={ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
pages={1--5},
year={2025},
organization={IEEE}
}